You are given a billion numbers but some are missing
This is interesting algorithm question. With my first attempt I have tried it with o(n**2) which needs improvement of course.
Do you know how to find complementary numbers in given array?
The question in detail
You are given a randomly listed billion numbers, from 1 to billion that is. And there are a couple of numbers missing from this list.
The task is to find those missing numbers from a billion list, and you are given a very limited memory resource – you can assume storage is not an issue.
How I proceed with this solution
What I tried to accomplish with this phase 1 solution is to use as minimal memory as possible but to use as maximum storage as possible.
First I run through the the numbers and sort them on another file. Here the assumption is, there is no storage problem.
Sorting in this case is file based. It would be appending the new number at the end, then it would swap that number up the list with its immediate number until it is no more the smallest number.
Then run through the sorted list and check if there is a gap. Like if the list goes 1,2,4,5,6,10 then the missing numbers are 3, 7,8,9.. just incrementally look for those..
here is the solution
Look at it and you can suggest the area of improvement or a total better algorithm that is at least nlogn.
package algorithm;
import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;
import java.io.RandomAccessFile;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;
/**
*
* Assumption: the numbers are read from file and they are sequential
*
* @author http://gullele.com
*
*/
public class MissingNumbers {
//source file
private static String FILE_PATH = "/tmp/random-thousands.txt";
//final sorted file
private static String FILE_SORTED_PATH = "/tmp/sorted-thousands.txt";
public static void main(String str[]) {
PrintWriter writer = null;
try {
MissingNumbers missingNumbers = new MissingNumbers();
writer = new PrintWriter(FILE_SORTED_PATH);
writer.write("");
writer.close();
writer = new PrintWriter(FILE_PATH);
writer.write("");
missingNumbers.createRecords();
missingNumbers.createSortedFileList();
List numbersLost = missingNumbers.findMissing();
for (Integer numbers:numbersLost) {
System.out.println(numbers);
}
} catch (Exception ex) {
System.out.println(ex.getMessage());
}
writer.close();
}
/**
* find the missing numbers from the billion numbers.
*
* The file would be provided as random populated file. First the file would be
* sorted on the file basis and then it would be read from top to bottom and checked for
* missing number.
* @return
*/
public void createSortedFileList() {
try {
FileReader randomReader = new FileReader(FILE_PATH);
BufferedReader bufferedRandomReader = new BufferedReader(randomReader);
RandomAccessFile sortedAccess = new RandomAccessFile(FILE_SORTED_PATH, "rw");
String currentRandom = null;
String currentSorted = null;
Long lastPointer = null;
while ((currentRandom = bufferedRandomReader.readLine()) != null) {
int currentSortedInt = 0;
//set the file pointer
lastPointer = sortedAccess.length();
sortedAccess.seek(lastPointer);
/*
* Write the numbers in such a way that the length of the written numbers
* is in equal size to make the calculation much easier since we are readin
* by size of the numbers.
*/
sortedAccess.writeBytes(String.format("%04d\n", Integer.parseInt(currentRandom))); //append it at the end
lastPointer = sortedAccess.length()-("0000\n".length());
sortedAccess.seek(lastPointer);
while ((currentSorted = sortedAccess.readLine()) != null && lastPointer >= 5) {
currentSortedInt = Integer.parseInt(currentSorted);
lastPointer -= "0000\n".length();
sortedAccess.seek(lastPointer);
String rawUpper = sortedAccess.readLine();
int upperSorted = Integer.parseInt(rawUpper);
if (currentSortedInt < upperSorted) {
sortedAccess.seek(lastPointer);
sortedAccess.writeBytes(currentSorted+"\n");
lastPointer += "0000\n".length();
sortedAccess.seek(lastPointer);
sortedAccess.writeBytes(rawUpper+"\n");
lastPointer -= "0000\n".length();
sortedAccess.seek(lastPointer);
} else {
break;
}
}
}
//release the resource
bufferedRandomReader.close();
sortedAccess.close();
} catch (FileNotFoundException foe) {
System.out.println(foe.getMessage());
} catch (IOException io) {
System.out.println(io.getMessage());
} catch (Exception ex) {
System.out.println(ex.getMessage());
}
}
/**
* At this level there is a file on FILE_SORTED_PATH which is sorted in ascending.
* go through the file and search for the missing ones. The missing ones can be
* found when the continuity is broken
* @return
*/
public List findMissing() {
List missingNumbers = null;
try {
FileReader fileReader = new FileReader(FILE_SORTED_PATH);
BufferedReader bufferedReader = new BufferedReader(fileReader);
missingNumbers = new ArrayList();
String line = null;
int prevNumber = Integer.parseInt(bufferedReader.readLine());
while ((line = bufferedReader.readLine()) != null) {
int currentNumber = Integer.parseInt(line);
if (currentNumber - prevNumber != 1) {
missingNumbers.add(currentNumber-1);//the missing number is less by 1
}
prevNumber = currentNumber;
}
fileReader.close();
} catch (IOException ioe) {
//log it
}
return missingNumbers;
}
/**
* Creates 1000 that are random and put those in /tmp/random-thousands.txt
*/
public void createRecords() {
//get 20 random numbers that would be considered as missing
Map<Integer, Integer> randomHolder = new HashMap<Integer, Integer>();
Random random = new Random();
while (randomHolder.size() <= 20) {
int rand = random.nextInt(1000)+1;
if (!randomHolder.containsKey(rand)) {
randomHolder.put(rand, rand);
}
}
//now we have random numbers get all the numbers
int[] temp = new int[980];//make it to hold 980 elements, assume 20 is missing
//fill the numbers
int size = 1;
int index = 0;
while (index < 980) {
if (!randomHolder.containsKey(size)) {
temp[index++] = size;
}
size++;
}
//finally shuffle the numbers.
for (int i=0; i
See how you would solve these known algorithm problems
Flatten nested javascript arrayfind longest word in the sentence
Array reversal in Recurrsion
Find longest palindrom from sequence of characters
testing k complementary pairs algorithm with junit
Implementing tokenizer and adding tokens to the linked list
Get maximum occurring character
Find the first occurence of number in the sorted array
Check if two strings are anagrams or not
String Ordered Permutation Algorithm Problem