Array: Find the number with odd number of occurrences

Problem: You are given an array containing positive integers. All the integers occur even number of times except one. Find this special integer.

Solution: A naive approach would be to loop over the elements of the given array and keep a count of each integer in a hash table or another counter array. But, this quickly becomes unfeasible as the range of integers could be 2^31 (one less). This is an O(n) solution that takes at memory O(range(n)).

The next approach is to sort the array and then loop on it counting occurrences of the integers. When there is a change in the integer we check its count to see if its odd or even. If its odd we have found our special integer. This is an O(n log n) solution that uses constant memory.

Lets try to see if we can use the property that there is one number that occurs odd number of times and every other number occurs even number of times. Since an even number is divisible by 2, you could think of the even occurrences as being present in pairs. The integer with the odd number of occurrences will have 0 or more pairs and one single number. So, if we could some how get rid of all the pairs then all we'd be left with is the single number. Now, what gets rid of pairs? Hint: think of an operator.

XOR will do the trick. Its gives you O(n) solution with no extra memory.

Ex: 3,5,3,2,2
 011  -- 3
^101  -- 5
^011  -- 3
^010  -- 2
^010  -- 2
 101  -- 5 (special one)
// will return the number with odd number of occurrences
// will return 0 if all numbers occur even number of times
int GetSpecialOne(int[] array, int length)
   int specialOne = array[0];
   for(int i=1; i < length; i++)
      specialOne ^= array[i];
   return specialOne;

Fibonacci Sequence Dynamic Programming

Problem: Write the code for nth Fibonacci number.

Solution: There are basically two approaches to solving the Fibonacci problem. Lets looks at the definition of Fibonacci series first.

The Fibonacci series is defined as follows
F(n) = 0 for n = 0
       1 for n = 1
       F(n-1) + F(n-2) for n > 1
If we translate that to the C language we get:
int RecursiveFibonacci(int n)
    if(n == 0) return 0;
    if(n == 1) return 1;
    return RecursiveFibonacci(n-1) + RecursiveFibonacci(n-2);

Now, whats the running time of the above program. It's horrible!! It's exponential. Why? We have a lot of duplicate work being done on each recursive call.

Typically when you are doing duplicate work, Dynamic Programming can come to rescue. Now, I don't want to go into too much detail on dynamic programming, but you have 2 approaches to Dynamic Programming. One is the bottom up approach and the second is called the top down approach. Calculating F(n) given F(n-1) would be a bottom up approach and calculating F(n) by recursively calling F(n-1) and F(n-2) would be top down. For a thorough treatment of dynamic programming see a Algorithms text or this wikipedia entry.

We can improve the above RecursiveFibonacci function by caching results of previous calculations. Doing this greatly reduces the number of recursive calls and improves the program efficiency. This is an example of top down dynamic programming.
int RecursiveFibonacci(int n)
    if(n == 0) return 0;
    if(n == 1) return 1;
    if(cache[n] == null)
      cache[n] = RecursiveFibonacci(n-1) + RecursiveFibonacci(n-2);
    return cache[n];

Now lets look at the bottom up dynamic programming approach. For a given n we would need to have F(n-1), F(n-2), F(n-3) ... F(0) already solved. In this case we'll start at the bottom and work our way up.
int GetFibonacci(int n)
    if(n == 0) return 0;
    if(n == 1) return 1;
    last1 = 0;
    last2 = 1;   
    ans = 0;

    for(int i=0; i < n-1; i++)
       ans = last1 + last2;
       last1 = last2;
       last2 = ans; 

    return ans;

We could have used a cache like we used in the top down approach, but all we really need is the last 2 results. We can get away with just using 2 variables. This is also an iterative solution which is better compared to the recursive one because of the absence of the stack overhead involved in recursive calls.

Store a stream of numbers along with the counts

Problem: Given a stream of numbers, suggest a data structure to store the numbers in the stream along with their number of occurrences.

Solution: Before solving any problem make sure you lay down all the assumptions you are making and validate them with the interviewer.

Assumptions here:
  • The numbers are 32 bit integers.
  • The size of the set (cardinality) is very small compared to the possible number of integers (lets say 100 numbers)
  • Design for faster lookup
Since the size of the set is small, a data structure like an array of integers 2^32 in size will be a waste of space, even though the insert and lookup is going to be O(1). A hashtable would have a similar issue. A linked list would solve the memory issue by storing only the numbers that were present in the set but the look up suffers ( O(n) ). We would like the look up to be better than O(n).

So we've looked at an array, linked list, hashtable and none seem to solve it. Linked list looked promising but look up was O(n). Can we improve the lookup in a linked list? First of all, whats better than O(n)? O(log n), right? What data structure does that remind you of? A Binary Search Tree!!

Lets explore if we can utilize a Binary Search Tree here. So, if we started inserting numbers in a Binary Search Tree, each insertion would cost us O(log n). There are n number of insertions, so inserting all the numbers is going to be O(n log n). When we insert each number we also store the count at the node. So if a number is inserted for the first time we create a node in the tree and store the number itself and also the count (1 for the 1st occurrence). Any subsequent occurrence will not create a new node. Instead, we'll find the corresponding node and just update the count. So by the time we are done processing all the numbers we'll have a nice binary search tree with all the numbers and their occurrences.

Now the comes the look up part. Look up in a Binary Search Tree is O(log n). Any time we are presented a number and asked for the number of occurrences we can look up (or not ) the number and get its count.

If we were to design for faster insert we'd be better off using an unsorted linked list. Insert would have been O(1) (just insert at the end) and look up would be O(n). So for a small n this wouldn't be terribly bad.

Note: There is another data structure called the skip list which is a layered linked list type data structure which has O(log n) look up and insert. The linked list in the skip list is a sorted list and that would make our insert O(n log n).

Fly plane around the world half fuel puzzle

There is an air force base on an island. You are given a plane to fly around the world without landing (except at the island where you started). The island lies on the equator and you will be flying around the equator, essentially a circle. The plane can carry enough fuel for half the journey. You can take help from other planes that can transfer fuel to your plane instantaneously while in flight. All planes fly at the same speed and carry the same amount of fuel, which again is just enough for a journey half way around the world. Find the minimum number of planes required so that one plane can circumnavigate the globe and all the refueling planes return to the island safely.
Credits: Martin Gardner

Microsoft Developer Interview Questions

---------------------------- Interview 1 ---------------------------------
- Why is a database useful? Why not use a file instead?
- What are stored procedures and what are the benefits?
- Write a program to get the nth Fibonacci number
- Discussion about patterns and practices
- What is database normalization? When would you use it and when would you not? What data can be left denormalized in a database? What are the benefits or drawbacks of normalization?
- Discussion on request processing and page life cycle
- Discussion on view state and session state management options. What is preferred?
- Write a program to merge two arrays while removing duplicate elements
- Write a program to find acronyms ("Interview Question" becomes "I.Q.")
- Why did you choose the C language for the above?
- What do you do at work when you don't have work?
- What was the most challenging technical problem you solved recently?
- Design discussion on web services on how to build scalable web services
- Discussion about concepts of threading and what are the threading options available in .NET
- Why does the UI get unresponsive when running a long operation on a single thread?
- Coding Question: You are given a sorted array but the elements are shifted to the left or right by 'i' number of places. Write code to find if a given number is present in this array.
- Coding Question: Write a program to merge two sorted arrays. First array has extra space at the end to accommodate the second array.
- Why didn't you apply to Microsoft yet?
- Why should I hire you?
------------------------- Interview 2 ---------------------------------
- How do you resolve view-state encryption problems on server farm?
- How many indexes for a query from a web page having 10 search fields?
- How do you debug a performance issue with a web page?
- What are the pros and cons of ADO.NET Transactions vs Transactions within a stored procedure.
- When do you call your code complete?
- What test cases do you put in your unit tests?
- What is a star schema in the data warehousing world?
- Write code to print a 2 dimensional matrix in spiral sequence
- Write code to remove characters present in string S1 from string S2.
- Discussion about Storing log file (10 Meg) in database vs on the file system. What are pro and cons of each approach?
- What steps are necessary to protect payment or user data in an e-commerce application?
- Write a hashing function for a hashtable that can contain 1 million rows.
- Design Question: Given a tree control in a Windows User Interface. How would you read information from a database and display it in the tree control. The tree has arbitrary levels and each level can contain arbitrary number of items in it.
- Design a SQL table to store the folder and file system hierarchy. This table has to also support search for the file/folder name.

Merge 2 Sorted Arrays (one has empty slots)

Question: There are two sorted arrays A1 and A2. Array A1 is full where as array A2 is partially empty and number of empty slots are just enough to accommodate all elements of A1. Write a program to merge the two sorted arrays to fill the array A2. You cannot use any additional memory and expected run time is O(n).

Solution: The trick to solving this problem is to start filling the destination array from the back with the largest elements. You will end up with a merged and sorted destination array.

Code (C#):
// A1 and A2 are two sorted arrays. 
// A2 is not completely full (has empty slots at the end and are exactly the 
// size of A1)
// the goal is to merge the two arrays in a sorted fashion

void Merge(int[] A1, int[] A2)
   int count = FindCount(A2); // get the count of full slots
   int i = A1.Length - 1;
   int j = count - 1;
   int k = A2.Length - 1;

      if(A1[i] > A2[j] || j < 0)
         A2[k] =A1[i];
         A2[k] = A2[j];

Largest Sum Sub-Sequence Integer Array

Problem: Given an array on n integers, find a contiguous sub-sequence / subarray with largest sum. Example: For an array of following elements { -1, 2, -4, 1, 3, -2 }, the sub-sequence with largest sum is 1,3 Solution:
 static int Find_Max_Sum_Sub_Sequence(int[] array)
            //historical max and corresponding indexes
            int maxSoFar = 0;
            int maxStart = 0;
            int maxEnd = 0;

            //running max value and the start index of the sub-sequence
            int curMax = 0;
            int curMaxStart = 0;

            //this is to handle the special case where all numbers are negative
            bool allNegatives = true;
            int largest = int.MinValue;

            for (int i = 0; i < array.Length; i++)
                largest = largest > array[i] ? largest : array[i];

                curMax = curMax + array[i];
                if (curMax > maxSoFar)
                    maxSoFar = curMax;
                    maxEnd = i;
                    maxStart = curMaxStart;
                    allNegatives = false;
                else if (curMax <= 0)
                    curMax = 0;
                    curMaxStart = i + 1;

            if (allNegatives)
                return largest;
            return maxSoFar;