Finding missing numbers from billion sequential number list file You are given a billion numbers but some are missing

This is interesting algorithm question. With my first attempt I have tried it with o(n**2) which needs improvement of course.

Do you know how to find complementary numbers in given array?

The question in detail

You are given a randomly listed billion numbers, from 1 to billion that is. And there are a couple of numbers missing from this list.

The task is to find those missing numbers from a billion list, and you are given a very limited memory resource – you can assume storage is not an issue.

How I proceed with this solution

What I tried to accomplish with this phase 1 solution is to use as minimal memory as possible but to use as maximum storage as possible.

First I run through the the numbers and sort them on another file. Here the assumption is, there is no storage problem.

Sorting in this case is file based. It would be appending the new number at the end, then it would swap that number up the list with its immediate number until it is no more the smallest number.

Then run through the sorted list and check if there is a gap. Like if the list goes 1,2,4,5,6,10 then the missing numbers are 3, 7,8,9.. just incrementally look for those..

here is the solution

Look at it and you can suggest the area of improvement or a total better algorithm that is at least nlogn.

``````
package algorithm;

import java.io.FileNotFoundException;
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;
import java.io.RandomAccessFile;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

/**
*
* Assumption: the numbers are read from file and they are sequential
*
* @author http://gullele.com
*
*/
public class MissingNumbers {

//source file
private static String FILE_PATH = "/tmp/random-thousands.txt";
//final sorted file
private static String FILE_SORTED_PATH = "/tmp/sorted-thousands.txt";

public static void main(String str[]) {
PrintWriter writer = null;
try {
MissingNumbers missingNumbers = new MissingNumbers();
writer = new PrintWriter(FILE_SORTED_PATH);
writer.write("");
writer.close();
writer = new PrintWriter(FILE_PATH);
writer.write("");
missingNumbers.createRecords();
missingNumbers.createSortedFileList();
List numbersLost = missingNumbers.findMissing();

for (Integer numbers:numbersLost) {
System.out.println(numbers);
}
} catch (Exception ex) {
System.out.println(ex.getMessage());
}
writer.close();
}

/**
* find the missing numbers from the billion numbers.
*
* The file would be provided as random populated file. First the file would be
* sorted on the file basis and then it would be read from top to bottom and checked for
* missing number.
* @return
*/
public void createSortedFileList() {
try {

RandomAccessFile sortedAccess = new RandomAccessFile(FILE_SORTED_PATH, "rw");

String currentRandom = null;
String currentSorted = null;
Long lastPointer = null;

int currentSortedInt = 0;

//set the file pointer
lastPointer = sortedAccess.length();
sortedAccess.seek(lastPointer);
/*
* Write the numbers in such a way that the length of the written numbers
* is in equal size to make the calculation much easier since we are readin
* by size of the numbers.
*/
sortedAccess.writeBytes(String.format("%04d\n", Integer.parseInt(currentRandom))); //append it at the end

lastPointer = sortedAccess.length()-("0000\n".length());
sortedAccess.seek(lastPointer);
while ((currentSorted = sortedAccess.readLine()) != null && lastPointer >= 5) {
currentSortedInt = Integer.parseInt(currentSorted);

lastPointer -= "0000\n".length();
sortedAccess.seek(lastPointer);
int upperSorted = Integer.parseInt(rawUpper);

if (currentSortedInt < upperSorted) {
sortedAccess.seek(lastPointer);
sortedAccess.writeBytes(currentSorted+"\n");
lastPointer += "0000\n".length();
sortedAccess.seek(lastPointer);
sortedAccess.writeBytes(rawUpper+"\n");
lastPointer -= "0000\n".length();
sortedAccess.seek(lastPointer);
} else {
break;
}
}
}

//release the resource
sortedAccess.close();

} catch (FileNotFoundException foe) {
System.out.println(foe.getMessage());
} catch (IOException io) {
System.out.println(io.getMessage());
} catch (Exception ex) {
System.out.println(ex.getMessage());
}

}

/**
* At this level there is a file on FILE_SORTED_PATH which is sorted in ascending.
* go through the file and search for the missing ones. The missing ones can be
* found when the continuity is broken
* @return
*/
public List findMissing() {

List missingNumbers = null;
try {
missingNumbers = new ArrayList();

String line = null;

int currentNumber = Integer.parseInt(line);
if (currentNumber - prevNumber != 1) {
missingNumbers.add(currentNumber-1);//the missing number is less by 1
}
prevNumber = currentNumber;
}
} catch (IOException ioe) {
//log it
}

return missingNumbers;
}

/**
* Creates 1000 that are random and put those in /tmp/random-thousands.txt
*/
public void createRecords() {

//get 20 random numbers that would be considered as missing
Map<Integer, Integer> randomHolder = new HashMap<Integer, Integer>();
Random random = new Random();
while (randomHolder.size() <= 20) {
int rand = random.nextInt(1000)+1;
if (!randomHolder.containsKey(rand)) {
randomHolder.put(rand, rand);
}
}

//now we have random numbers get all the numbers
int[] temp = new int;//make it to hold 980 elements, assume 20 is missing
//fill the numbers
int size = 1;
int index = 0;
while (index < 980) {
if (!randomHolder.containsKey(size)) {
temp[index++] = size;
}
size++;
}

//finally shuffle the numbers.
for (int i=0; i
``````

See how you would solve these known algorithm problems

Kadane’s algorithm in C – Dynamic Programming

Find K Complementary numbers from array Java implementation

String Ordered Permutation Algorithm Problem

Java solution for checking anagram strings – tell if phrases are anagrams

Find the pairs that makes K Complementary in the given array java solution

Check if two strings are anagrams or not

Find longest palindrom from sequence of characters