Technical Interview Questions, Answers, and Tips: 2009

How and why to prevent class inheritance in C#/.NET

Question: In .NET/C#, how does one prevent a class from being inherited by another class? In other words can the inheritance of class be blocked? Also, what is the reason one might want to block the inheritance chain?

The sealed keyword/modified in .NET can be used to block derivation from a class. An attempt to inherit from a class marked as sealed will result in a compile time error. Also, a sealed class cannot be an abstract class, since abstract classes are meant to be inherited and cannot be instantiated as is. In C#, structs are implicitly sealed hence they cannot be inherited. You do not need to use the sealed keyword on structs.

Usage:

public sealed class MySealedClass
{
  ...
}

The code below will not compile.

public class MyDerivedClass : MySealedClass
{
  ...
}

As to why one might want to block inheritance there are 2 main reasons. One is that the programmer does not want a class to be inherited as it might be the last class in the inheritance chain. Second is for runtime performace gains. Making a class sealed tells the compiler that there are no virtual functions for the class that will be overridden and the compiler can optimize the method invocations by transforming virtual function invocations into non-virtual invocations. This saves the time for lookup in the vtable.

Difference Between calloc, malloc, and realloc

Question: What is the difference between malloc(), calloc(), and realloc()

malloc, calloc, and realloc are functions used for memory allocation in C/C++ languages. There are some fundamental differences on how the above functions work.
realloc()
First of all realloc() is actually a reallocation function. It is used to resize a previously allocated (using malloc(), calloc(), or realloc()) block of memory to the desired size. Depending on whether the new size if less or more than the original size the block may be moved to new location.

Usage and example of realloc()

Function Prototype for realloc():
void *realloc(void *pointer, size_t size);

malloc() vs calloc()

There two basic difference between malloc() and calloc() functions:
1. malloc() allocates memory in bytes. So the programmer specifies how many bytes of memory malloc should allocate and malloc will allocate that many bytes (if possible) and return the address of the newly allocated chunk of memory.

Function prototype for malloc():

void *malloc(size_t size); //size is the number of bytes to allocate

On the other hand calloc() allocates a chunk of memory specified by a block/element size and the number of blocks/elements.
Function prototype for calloc():

void *calloc(size_t nelements, size_t elementSize);

2. malloc() does not initialize memory after it allocates it. It just returns the pointer back to the calling code and the calling code is responsible for initialization or resetting of the memory, most probably by using the memset() function. On the other hand calloc() initializes the allocated memory to 0. calloc() is obviously slower than malloc() since it has the overhead of initialization, so it may not be the best way to allocate memory if you don't care about initializing the allocated memory to 0.

Side note: Don't forget to free your allocated memory when you are done with it using the free() function.

Amazon.com Widgets

Must Read Books for Technical/Programming Interview Preparation

Coding, Program Design and Interview Questions:

Programming Interviews Exposed: Secrets to Landing Your Next Job ~ John Mongan, Noah Suojanen, and Eric Giguère
Programming Pearls (2nd Edition) ~ Jon Bentley
Expert C Programming: Deep C Secrets ~ Peter van der Linden
Puzzles for Programmers and Pros ~ Dennis Shasha
More Programming Pearls: Confessions of a Coder ~ Jon Bentley
Programming Challenges ~ Steven S. Skiena, Miquek Revilla

Data-Structures and Algorithms:

Algorithms in C, Parts 1-5 (Bundle): Fundamentals, Data Structures, Sorting, Searching, and Graph Algorithms (3rd Edition) ~ Robert Sedgewick
Head First Object-Oriented Analysis and Design ~ Brett D. McLaughlin, Gary Pollice, and Dave West

Design:

Head First Design Patterns ~ Elisabeth Freeman, Eric Freeman, Bert Bates, and Kathy Sierra
Head First Object-Oriented Analysis and Design ~ Brett D. McLaughlin, Gary Pollice, and Dave West

Software:

Writing Secure Code, Second Edition ~ Michael Howard and David LeBlanc
Code Complete: A Practical Handbook of Software Construction ~ Steve McConnel

Puzzles:

How would you move mount fuji? ~ William Poundstone
Aha! Insight ~ Martin Gardner

Implement Two stacks using one array

Problem: How to implement/code two stacks using one array.

Solution: The obvious solution is to have the two stacks at the two ends of the array. The stacks will grow in opposite direction. When the two stacks collide and there is no more room in the array the stacks will over flow. This problem is probably one of the easier problems and targeted towards exercising your array index manipulation skills. See the figure below for a visual representation of the two stacks in the array.

Code (C#):

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace Array2Stacks
{
    /// 
    /// DualStack class implements 2 stacks in a single array.
    /// Stack A starts on the right and increases towards the left
    /// Stack B starts on the left and increases towards the right
    /// When both stacks meet there is no more room for adding items
    /// The cool thing about the dual stack is that both stacks share the
    /// same array with built in collision detection. If one stack was empty
    /// the other stack can potentially use the entire array.
    /// 
    public class DualStack
    {
        int size = 50; //default array size
        int stackATop = -1;
        int stackBTop = 50;

        public DualStack(int size)
        {
            this.size = size;
            array = new object[this.size];
            stackBTop = size;
        }

        //one array to hold the two stacks
        object[] array = null;
 
        //pushes an item on stack A
        public bool PushStackA(object obj)
        {
            if (stackATop < stackBTop - 1) //we have at least one empty slot in the array
            {
                array[++stackATop] = obj;
                return true;
            }
            else
            {
                //return false if stack is full
                return false;   
            }
        }

        //pops an item from stack A
        public object PopStackA()
        {
            if (stackATop >= 0 && stackATop < stackBTop)
            {
                //take the element at the top of the stack
                object obj = array[stackATop]; 
                stackATop--;//decrement stack top pointer
                return obj;
            }
            return null;
        }

        /// pushes an item on stack B
        public bool PushStackB(object obj)
        {
            if (stackBTop > stackATop + 1) //we have at least one empty slot between the stacks
            {
                array[--stackBTop] = obj;
                return true;
            }
            else
            {
                //stack is full
                return false;
            }
        }

        public object PopStackB()
        {

            if (stackBTop < this.size && stackBTop > stackATop) //stack is not empty
            {
                object obj = array[stackBTop];
                stackBTop++;
                return obj;
            }
            else
            {   //stack is empty
                return null;
            }
        }
    }
}

Determine and display all anagrams in a string array

Question: Given an array of strings (string[] inputStrings). Devise a way and write the code to determine and display all the anagrams in the array.

Solution: With any interview question one should get the specifics of the problem and the input data before designing the solution. In this particular case, we should ask about the approximate number of strings in the array, average length of each string, memory constraints, CPU usage constraints, etc. Each factor can influence the design and selection of algorithm and the data structures used.

The solution I am presenting here optimizes for speed at the cost of extra memory. The basic idea is to come up with simple hash or signature for each string in the input string array. This signature is nothing but the characters in the string sorted alphabetically. So, the signature for "cat" will be "act" and for "tac" it will be "act" too. As you can see the signature is the same for 2 strings that are anagrams. We can use this property to identify and collect the anagrams in the inputString array together. We can use a hash table for fast insert and retrieve these anagrams, but will cost us extra memory.

Note: There are other solutions that don't use a hashtable, but involves CPU cost is higher and involves sorting of the signature. I will post this alternative solution soon. If you have a better method, please write a comment.

Pseudo-Code:

Loop for each string (S) in the array:

Get signature (K) for S (sort the characters of S to get its signature (K))
Insert S in hash table with signature K as the key and append S to the array list contained in the value of the hash table key.
If K already exists, don't add a new key (since it will cause collision), but just append S to the end of the list in that hash table slot.
Goto step 1 with next available string.
In the end just print all the keys in the hash table along with the list of strings in each slot.

Code (C#): I am avoiding use of generics here for convenience of users of other languages.

using System;
using System.Text;
using System.Collections;

namespace anagram_problem
{
   class Program
   {
       static void Main(string[] args)
       {
           string[] inputStrings = new string[] { "cat", "tac", "army", "pam", "act", "map", "mary", "self" };
           PrintAnagrams(inputStrings);
       }

       //
       // given a set of strings group all the strings that are anagrams
       // "cat", "tac", "army", "pam", "act", "map", "mary", "self"
       // "act": "act", "cat", "tac"
       // "army": "army", "mary"
       // "map": "map", "pam"
       // "self": "self"
       
       static void PrintAnagrams(string[] inputStrings)
       {
           Hashtable hashTable = new Hashtable();
           foreach (string str in inputStrings)
           {
               string signature = GetAnagramSignature(str);
               if (!hashTable.ContainsKey(signature))
               {
                   // this will be used to store our anagrams for this particular signature
                   ArrayList newStringList = new ArrayList();
                   newStringList.Add(str);

                   hashTable.Add(signature, newStringList);
               }
               else
               {
                   // we have already encountered this signature so we just
                   // need to append the current str to the list
                   ArrayList list = (ArrayList)hashTable[signature];
                   list.Add(str);
               }
           }

           // now we print all the anagrams for each hash key
           foreach (string key in hashTable.Keys)
           {
               Console.Write("{0}: ", key);
             
               ArrayList list = (ArrayList)hashTable[key];
               foreach (string str in list)
               {
                   Console.Write("{0},", str);
               }
               Console.Write(Environment.NewLine);
           }
       }

       //
       // this method will sort the characters in the input string alphabetically
       // this sorted string will be out hash key
       //
       static string GetAnagramSignature(string input)
       {
           char[] charArray = input.ToLower().ToCharArray();
           Array.Sort<char>(charArray, new Comparison<char>(CompareCharacters));
           return new string(charArray);
       }

       static int CompareCharacters(char a, char b)
       {
           if (a > b)
               return 1;
           else if (b > a)
               return -1;
           else
               return 0;
       }
   }
}

Print 2D Array Matrix Spiral Order

Question: Given a 2D array / matrix of integers. Write a program to print the elements in spiral order. Consider a matrix as show in the diagram to the right. The desired output of the program should be as: 1,2,3,4,8,12,16,20,19,18,17,13,9,5,6, 7,11,15,14,10.

Solution: There are several ways to solve this problem, but I am mentioning a method that is intuitive to understand and easy to implement. The idea is to consider the matrix similar to onion which can be peeled layer after layer. We can use the same approach to print the outer layer of the matrix and keep doing it recursively on a smaller matrix (with 1 less row and 1 less column).

Refer to the image below for a visual explanation. We start by printing the top-right layer of the matrix by calling the print_layer_top_right. It will print 1,2,3,4,8,12,16,20. The print_layer_top_right method then calls the print_layer_bottom_left method which will print 19,18,17,13,9,5. If you observe the size of the target matrix is reducing after each call. Each method basically calls the other method and passes the matrix indexes for the reduced matrix. Both methods take 4 index parameters which represent the target matrix. When the target matrix size is such that there is only one layer left the recursion terminates and by this time the code has printed all the numbers in the full matrix.

Code (C language):

#include <stdio.h>
void print_layer_top_right(int a[][4], int x1, int y1, int x2, int y2);
void print_layer_bottom_left(int a[][4], int x1, int y1, int x2, int y2);

int main(void)
{
       int a[5][4] = {
                       {1,2,3,4},
                       {5,6,7,8},
                       {9,10,11,12},
                       {13,14,15,16},
                       {17,18,19,20}
                   };

       print_layer_top_right(a,0,0,3,4);
}

//
// prints the top and right shells of the matrix
//
void print_layer_top_right(int a[][4], int x1, int y1, int x2, int y2)
{
      int i = 0, j = 0;

      // print the row
      for(i = x1; i<=x2; i++)
      {
         printf("%d,", a[y1][i]);
      }
 
      //print the column         
      for(j = y1 + 1; j <= y2; j++)         
      {
         printf("%d,", a[j][x2]);
      }

      // see if  we have more cells left         
      if(x2-x1 > 0)
      {
          // 'recursively' call the function to print the bottom-left layer
          print_layer_bottom_left(a, x1, y1 + 1, x2-1, y2);
      }
}

//
// prints the bottom and left shells of the matrix
//
void print_layer_bottom_left(int a[][4], int x1, int y1, int x2, int y2)
{
       int i = 0, j = 0;

       //print the row of the matrix in reverse
       for(i = x2; i>=x1; i--)
       {
               printf("%d,", a[y2][i]);
       }

       // print the last column of the matrix in reverse
       for(j = y2 - 1; j >= y1; j--)
       {
               printf("%d,", a[j][x1]);
       }

       if(x2-x1 > 0)
       {
               // 'recursively' call the function to print the top-right layer
               print_layer_top_right(a, x1+1, y1, x2, y2-1);
       }
}

DNA Sequence Pattern Match/ String Search Problem

Question: The DNA double helix consists of a long strand of four bases: adenine (abbreviated A), cytosine (C), guanine (G) and thymine (T). Thus, it can be represented as a long string containing the characters A, C, G, and T. The field of bio-informatics involves storing and searching of DNA sequence data. What is a good data structure to facilitate storage and string match/search operation on this kind of data? Write a class that stores the DNA sequence and implement a method which takes another DNA sub-sequence and returns the position of this sub-sequence in the first DNA sequence.

Solution: There are several linear time string matching algorithms. See http://en.wikipedia.org/wiki/String_searching_algorithm for a list of such algorithms.
Often a data structure called Suffix Tree or PAT tree is used in DNA sequencing applications in the field of Bio-informatics. A Suffix tree data structure facilitates string match/search in O(m) complexity, where m is the length of the sub-string. It takes an initial O(n) time required to build the suffix tree.
One concern with suffix tree is the high amount of space/memory needed. But, with a small alphabet ( only 4 characters in the DNA search problem ) , its not too bad. More over, there are some advanced techniques that can be used to further reduce the storage requirements.
A thorough treatment of suffix trees can be found here.
I still have some bugs in my class implementation, so I will post the code in the next couple of days.

Database Isolation Levels in Sql Server and ADO.NET

Isolation level refers to I in the ACID properties for a Transaction. ACID stands for Atomicity, Consistency, Isolation, Durability. Isolation level refers to sensitivity of a database transaction to other changes in other transactions. You can set the isolation level for a transaction using the following command in T-SQL:

SET TRANSACTION ISOLATION LEVEL <level name>

<level name> can be one of the following:

READ COMMITTED
READ UNCOMMITTED
REPEATABLE READ
SERIALIZABLE
SNAPSHOT

Read Uncommited

No shared locks acquired
No exclusive locks are honored
Dirty Reads are possible
Good for Performace

Read Committed

Shared locks are held while data is read
Avoids dirty reads
Data can be changed after the shared lock is released but before the transaction completes. This can cause non-repeatable reads and phantom rows
Default Isolation level used by Sql Server

SnapShot

A copy of the data accessed by the transaction is maintained.
Reduces blocking since other transactions can still be modifying the original data.
Supported on SQL Server 2005 and beyond, but needs to be enabled manually.

Repeatable Read

Shared locks are placed on all data that is used in a query.
Prevents non-repeatable reads
Phantom rows can still occur

Serialize

A range lock is placed on the data used. This prevents other transactions from updating or inserting data in that range.
Removes phantom rows issue.
Extremely bad for concurrency.

Database Isolation Levels Compared:

Isolation Level	Dirty Reads	Repeatable Reads	Phantom Rows	Concurrency
Read Uncommitted	Yes	No	Yes	Best
Read Committed	No	No	Yes	Good
Snapshot	No	Yes	No	Good
Repeatable Read	No	Yes	Yes	Poor
Serializable	No	Yes	No	Very Poor

Database Normalization Basics

First Normal Form (1NF):

Identify related groups of data (ex: orders, customers, products) and create separate tables for each group.
All tables should have a Primary Key identifying an individual row in a table.
No two columns in a table should have the same information and each column can contain only one attribute.

In essence 1st normal form is about removing redundant data and creating tables for related groups of data and creating a primary key.

Second Normal Form (2NF):

The database is required to be in the 1NF form.
Identify data that applies to multiple rows in a table and extract it to a separate table(s)
Create Foreign Key relationships between the new tables and the original table.

Third Normal Form (3NF):

The database is required to be in the 2NF form.
A table should not contain columns that are not dependent on the Primary Key. For example, an Order table may contain a CustomerID column to denote the customer who placed the order, but may not contain the customers date of birth because its not related in any sense to the order.

Fourth Normal Form (4NF):

The database is required to be in the 3NF form.
A table should not contain 2 or more multi-valued facts about an entity.

For example, consider a table as shown below:

-------------------------------------
| EmployeeId  | Skill    | Language |
-------------------------------------
|  1          | Type     | German   |
-------------------------------------
|  1          | Cook     | Greek    |
-------------------------------------
|  1          |          | Spanish  |
-------------------------------------

This table violates the 4NF form since it has 2 multi-valued facts (Skill and Language) in the same table. Instead this table should be split into two tables that individually satisfy the 4NF.

-----------------------          -----------------------------
| EmployeeId | Skill  |          | EmployeeId   | Langugage  |
-----------------------          -----------------------------
|  1         | Type   |          |    1         | German     |
-----------------------          -----------------------------
|  1         | Cook   |          |    1         | Greek      |
----------------------           -----------------------------
                                 |    1         | Spanish    |
                                 -----------------------------

References: http://www.bkent.net/Doc/simple5.htm

SQL Server Index Fragmentation and Resolution

What is Index Fragmentation and how do you resolve it in SQL Server?
Answer: I came across and excellent article on index fragmentation and 4 ways to resolve it on Sql-Server-Performace.com which describes it in great detail.

Find an Item in a Sorted Array with Shifted Elements

Problem: You are given a sorted array with shifted elements. Elements can be shifted to the left or right by 'i' number of places. The sign of 'i' denotes the direction of the shift. For positive 'i' direction of shift is right and left for negative 'i'.

For example, consider the sorted array 2, 3, 4, 8, 10, 11. A shift of 3 places to the right would be denoted by i=2 and the shifted array would look like this: 10, 11, 2, 3, 4, 8,

For i=-2, the shifted array would look like: 4, 8, 10, 11, 2, 3.

Write code to find if a given number is present in this array.

Solution: The brute force method to search all elements in the array would yield an O(n) solution, so obviously that's not the best approach. We are not leveraging the sorted nature of the array in this case.

Now, how can we leverage the sorted nature of the array? Let assume that 'i' was 0. In that case the array would be sorted and not shifted at all (0 shift). Whats the fastest way to search in a sorted array? Binary Search! We can split the array in 2 halves and do a recursive search in one of the halves until we find the number we are looking for ( or not, if its not in the array ). This approach has a running time of O(log n), which is obviously better than n.

But, the fact that the array is shifted by 'i' number of elements complicates things a little bit. Now, instead of splitting the array in equal halves, we split the array at the shift index and do a recursive binary search. There are issues we need to tackle when the shift is greater than the length of the array or if the shift is negative. I guess the code below will make much more sense than my description of the solution.

Code: We will assume that we are provided with a method below that does binary search for us and won't bother implementing it here.

// myArray is the input array
// startIndex and endIndex are the indexes in the 
// array where the binary search starts and ends
// The method returns the index of the searchVal 
// if found in the array, else it returns -1

int BinarySearch(int[] myArray, int startIndex, int endIndex, int searchVal);


// this method will return the index of the searchVal 
// if found, else it return -1
int SearchElement(int[] myArray, int shift, int searchVal)
{
   // to take care of scenarios where the shift is more 
   // than the length of the array
   shift = shift % myArray.Length; 
   
   // -ve shift can be seen as positive shift equal to 
   // the length of the array - ( -ve shift) 
   if (shift < 0)
       shift = myArray.Length + shift;

   if(myArray[shift] <= searchVal &&  
      myArray[myArray.Length - 1] >= searchVal)
   {
      return BinarySearch(myArray, shift, myArray.Length - 1, searchVal);
   }
   else if(myArray[0] <= searchVal && 
           myArray[shift - 1] >= searchVal)
   {
      return BinarySearch(myArray, 0, shift-1, searchVal);
   }
   return -1;
}

Database SQL Interview Questions

Write a SQL Query to find first day of month?
Why there is a performance difference between two similar queries that uses UNION and UNION ALL?
How to choose between a Clustered Index and a Non-Clustered Index?
How you can minimize deadlock situations in a database server?
When you should use low fill factor?
Explain First, Second, and Third database normalization form with examples?
What are the benefits of using stored procedures?
What is the difference between an explicit and an implicit lock
Discuss about lock granularity
Discuss the ACID properties for a transaction
How do TRUNCATE and DELETE work?
Contrast UNIQUE and PRIMARY KEY constraints
Do you know what log shipping is?
Talk about different kinds of joins and how does each work?
What does a NOLOCK hint do?
What is Index Fragmentation and how do you resolve it in SQL Server?
What is lock escalation in SQL Server and how does it affect blocking?
How can you resolve blocking problems caused by lock escalation in SQL Server? http://support.microsoft.com/kb/323630
What are the pros and cons of creating an index on a table?
What is the max length of an index key?

Array: Find the number with odd number of occurrences

Problem: You are given an array containing positive integers. All the integers occur even number of times except one. Find this special integer.

Solution: A naive approach would be to loop over the elements of the given array and keep a count of each integer in a hash table or another counter array. But, this quickly becomes unfeasible as the range of integers could be 2^31 (one less). This is an O(n) solution that takes at memory O(range(n)).

The next approach is to sort the array and then loop on it counting occurrences of the integers. When there is a change in the integer we check its count to see if its odd or even. If its odd we have found our special integer. This is an O(n log n) solution that uses constant memory.

Lets try to see if we can use the property that there is one number that occurs odd number of times and every other number occurs even number of times. Since an even number is divisible by 2, you could think of the even occurrences as being present in pairs. The integer with the odd number of occurrences will have 0 or more pairs and one single number. So, if we could some how get rid of all the pairs then all we'd be left with is the single number. Now, what gets rid of pairs? Hint: think of an operator.

XOR will do the trick. Its gives you O(n) solution with no extra memory.

Ex: 3,5,3,2,2

 011  -- 3
^101  -- 5
----------
 110
^011  -- 3
----------
 101
^010  -- 2
----------
 111
^010  -- 2
----------
 101  -- 5 (special one)

Code:

//
// will return the number with odd number of occurrences
// will return 0 if all numbers occur even number of times
//
int GetSpecialOne(int[] array, int length)
{
   int specialOne = array[0];
   
   for(int i=1; i < length; i++)
   {
      specialOne ^= array[i];
   }
   return specialOne;
}

Fibonacci Sequence Dynamic Programming

Problem: Write the code for nth Fibonacci number.

Solution: There are basically two approaches to solving the Fibonacci problem. Lets looks at the definition of Fibonacci series first.

The Fibonacci series is defined as follows

F(n) = 0 for n = 0
       1 for n = 1
       F(n-1) + F(n-2) for n > 1

If we translate that to the C language we get:

int RecursiveFibonacci(int n)
{
    if(n == 0) return 0;
    if(n == 1) return 1;
    return RecursiveFibonacci(n-1) + RecursiveFibonacci(n-2);
}

Now, whats the running time of the above program. It's horrible!! It's exponential. Why? We have a lot of duplicate work being done on each recursive call.

Typically when you are doing duplicate work, Dynamic Programming can come to rescue. Now, I don't want to go into too much detail on dynamic programming, but you have 2 approaches to Dynamic Programming. One is the bottom up approach and the second is called the top down approach. Calculating F(n) given F(n-1) would be a bottom up approach and calculating F(n) by recursively calling F(n-1) and F(n-2) would be top down. For a thorough treatment of dynamic programming see a Algorithms text or this wikipedia entry.

We can improve the above RecursiveFibonacci function by caching results of previous calculations. Doing this greatly reduces the number of recursive calls and improves the program efficiency. This is an example of top down dynamic programming.

int RecursiveFibonacci(int n)
{
    if(n == 0) return 0;
    if(n == 1) return 1;
   
    if(cache[n] == null)
      cache[n] = RecursiveFibonacci(n-1) + RecursiveFibonacci(n-2);
    return cache[n];
}

Now lets look at the bottom up dynamic programming approach. For a given n we would need to have F(n-1), F(n-2), F(n-3) ... F(0) already solved. In this case we'll start at the bottom and work our way up.

int GetFibonacci(int n)
{
    if(n == 0) return 0;
    if(n == 1) return 1;
   
    last1 = 0;
    last2 = 1;   
    ans = 0;

    for(int i=0; i < n-1; i++)
    {
       ans = last1 + last2;
       last1 = last2;
       last2 = ans; 
    }

    return ans;
}

We could have used a cache like we used in the top down approach, but all we really need is the last 2 results. We can get away with just using 2 variables. This is also an iterative solution which is better compared to the recursive one because of the absence of the stack overhead involved in recursive calls.

Store a stream of numbers along with the counts

Problem: Given a stream of numbers, suggest a data structure to store the numbers in the stream along with their number of occurrences.

Solution: Before solving any problem make sure you lay down all the assumptions you are making and validate them with the interviewer.

Assumptions here:

The numbers are 32 bit integers.
The size of the set (cardinality) is very small compared to the possible number of integers (lets say 100 numbers)
Design for faster lookup

Since the size of the set is small, a data structure like an array of integers 2^32 in size will be a waste of space, even though the insert and lookup is going to be O(1). A hashtable would have a similar issue. A linked list would solve the memory issue by storing only the numbers that were present in the set but the look up suffers ( O(n) ). We would like the look up to be better than O(n).

So we've looked at an array, linked list, hashtable and none seem to solve it. Linked list looked promising but look up was O(n). Can we improve the lookup in a linked list? First of all, whats better than O(n)? O(log n), right? What data structure does that remind you of? A Binary Search Tree!!

Lets explore if we can utilize a Binary Search Tree here. So, if we started inserting numbers in a Binary Search Tree, each insertion would cost us O(log n). There are n number of insertions, so inserting all the numbers is going to be O(n log n). When we insert each number we also store the count at the node. So if a number is inserted for the first time we create a node in the tree and store the number itself and also the count (1 for the 1st occurrence). Any subsequent occurrence will not create a new node. Instead, we'll find the corresponding node and just update the count. So by the time we are done processing all the numbers we'll have a nice binary search tree with all the numbers and their occurrences.

Now the comes the look up part. Look up in a Binary Search Tree is O(log n). Any time we are presented a number and asked for the number of occurrences we can look up (or not ) the number and get its count.

If we were to design for faster insert we'd be better off using an unsorted linked list. Insert would have been O(1) (just insert at the end) and look up would be O(n). So for a small n this wouldn't be terribly bad.

Note: There is another data structure called the skip list which is a layered linked list type data structure which has O(log n) look up and insert. The linked list in the skip list is a sorted list and that would make our insert O(n log n).

Fly plane around the world half fuel puzzle

There is an air force base on an island. You are given a plane to fly around the world without landing (except at the island where you started). The island lies on the equator and you will be flying around the equator, essentially a circle. The plane can carry enough fuel for half the journey. You can take help from other planes that can transfer fuel to your plane instantaneously while in flight. All planes fly at the same speed and carry the same amount of fuel, which again is just enough for a journey half way around the world. Find the minimum number of planes required so that one plane can circumnavigate the globe and all the refueling planes return to the island safely.
Credits: Martin Gardner

Microsoft Developer Interview Questions

---------------------------- Interview 1 ---------------------------------
- Why is a database useful? Why not use a file instead?
- What are stored procedures and what are the benefits?
- Write a program to get the nth Fibonacci number
- Discussion about patterns and practices
- What is database normalization? When would you use it and when would you not? What data can be left denormalized in a database? What are the benefits or drawbacks of normalization?
- Discussion on asp.net request processing and page life cycle
- Discussion on asp.net view state and session state management options. What is preferred?
- Write a program to merge two arrays while removing duplicate elements
- Write a program to find acronyms ("Interview Question" becomes "I.Q.")
- Why did you choose the C language for the above?
- What do you do at work when you don't have work?
- What was the most challenging technical problem you solved recently?
- Design discussion on web services on how to build scalable web services
- Discussion about concepts of threading and what are the threading options available in .NET
- Why does the UI get unresponsive when running a long operation on a single thread?
- Coding Question: You are given a sorted array but the elements are shifted to the left or right by 'i' number of places. Write code to find if a given number is present in this array.
- Coding Question: Write a program to merge two sorted arrays. First array has extra space at the end to accommodate the second array.
- Why didn't you apply to Microsoft yet?
- Why should I hire you?
------------------------- Interview 2 ---------------------------------
- How do you resolve Asp.net view-state encryption problems on server farm?
- How many indexes for a query from a web page having 10 search fields?
- How do you debug a performance issue with a web page?
- What are the pros and cons of ADO.NET Transactions vs Transactions within a stored procedure.
- When do you call your code complete?
- What test cases do you put in your unit tests?
- What is a star schema in the data warehousing world?
- Write code to print a 2 dimensional matrix in spiral sequence
- Write code to remove characters present in string S1 from string S2.
- Discussion about Storing log file (10 Meg) in database vs on the file system. What are pro and cons of each approach?
- What steps are necessary to protect payment or user data in an e-commerce application?
- Write a hashing function for a hashtable that can contain 1 million rows.
- Design Question: Given a tree control in a Windows User Interface. How would you read information from a database and display it in the tree control. The tree has arbitrary levels and each level can contain arbitrary number of items in it.
- Design a SQL table to store the folder and file system hierarchy. This table has to also support search for the file/folder name.

Merge 2 Sorted Arrays (one has empty slots)

Question: There are two sorted arrays A1 and A2. Array A1 is full where as array A2 is partially empty and number of empty slots are just enough to accommodate all elements of A1. Write a program to merge the two sorted arrays to fill the array A2. You cannot use any additional memory and expected run time is O(n).

Solution: The trick to solving this problem is to start filling the destination array from the back with the largest elements. You will end up with a merged and sorted destination array.

Code (C#):

// A1 and A2 are two sorted arrays. 
// A2 is not completely full (has empty slots at the end and are exactly the 
// size of A1)
// the goal is to merge the two arrays in a sorted fashion

void Merge(int[] A1, int[] A2)
{
   int count = FindCount(A2); // get the count of full slots
   int i = A1.Length - 1;
   int j = count - 1;
   int k = A2.Length - 1;

   for(;k>=0;k--)
   {
      if(A1[i] > A2[j] || j < 0)
      {
         A2[k] =A1[i];
         i--;
      }
      else
      {
         A2[k] = A2[j];
         j--;
      }
   }
}

Largest Sum Sub-Sequence Integer Array

Problem: Given an array on n integers, find a contiguous sub-sequence / subarray with largest sum. Example: For an array of following elements { -1, 2, -4, 1, 3, -2 }, the sub-sequence with largest sum is 1,3 Solution:

 static int Find_Max_Sum_Sub_Sequence(int[] array)
        {
            //historical max and corresponding indexes
            int maxSoFar = 0;
            int maxStart = 0;
            int maxEnd = 0;

            //running max value and the start index of the sub-sequence
            int curMax = 0;
            int curMaxStart = 0;

            //this is to handle the special case where all numbers are negative
            bool allNegatives = true;
            int largest = int.MinValue;

            for (int i = 0; i < array.Length; i++)
            {
                largest = largest > array[i] ? largest : array[i];

                curMax = curMax + array[i];
                if (curMax > maxSoFar)
                {
                    maxSoFar = curMax;
                    maxEnd = i;
                    maxStart = curMaxStart;
                    allNegatives = false;
                }
                else if (curMax <= 0)
                {
                    curMax = 0;
                    curMaxStart = i + 1;
                }
            }

            if (allNegatives)
            {
                return largest;
            }
            return maxSoFar;
        }