Solution: With any interview question one should get the specifics of the problem and the input data before designing the solution. In this particular case, we should ask about the approximate number of strings in the array, average length of each string, memory constraints, CPU usage constraints, etc. Each factor can influence the design and selection of algorithm and the data structures used.
The solution I am presenting here optimizes for speed at the cost of extra memory. The basic idea is to come up with simple hash or signature for each string in the input string array. This signature is nothing but the characters in the string sorted alphabetically. So, the signature for "cat" will be "act" and for "tac" it will be "act" too. As you can see the signature is the same for 2 strings that are anagrams. We can use this property to identify and collect the anagrams in the inputString array together. We can use a hash table for fast insert and retrieve these anagrams, but will cost us extra memory.
Note: There are other solutions that don't use a hashtable, but involves CPU cost is higher and involves sorting of the signature. I will post this alternative solution soon. If you have a better method, please write a comment.
Pseudo-Code:
Loop for each string (S) in the array:
- Get signature (K) for S (sort the characters of S to get its signature (K))
- Insert S in hash table with signature K as the key and append S to the array list contained in the value of the hash table key.
- If K already exists, don't add a new key (since it will cause collision), but just append S to the end of the list in that hash table slot.
- Goto step 1 with next available string.
- In the end just print all the keys in the hash table along with the list of strings in each slot.
Code (C#): I am avoiding use of generics here for convenience of users of other languages.
using System; using System.Text; using System.Collections; namespace anagram_problem { class Program { static void Main(string[] args) { string[] inputStrings = new string[] { "cat", "tac", "army", "pam", "act", "map", "mary", "self" }; PrintAnagrams(inputStrings); } // // given a set of strings group all the strings that are anagrams // "cat", "tac", "army", "pam", "act", "map", "mary", "self" // "act": "act", "cat", "tac" // "army": "army", "mary" // "map": "map", "pam" // "self": "self" static void PrintAnagrams(string[] inputStrings) { Hashtable hashTable = new Hashtable(); foreach (string str in inputStrings) { string signature = GetAnagramSignature(str); if (!hashTable.ContainsKey(signature)) { // this will be used to store our anagrams for this particular signature ArrayList newStringList = new ArrayList(); newStringList.Add(str); hashTable.Add(signature, newStringList); } else { // we have already encountered this signature so we just // need to append the current str to the list ArrayList list = (ArrayList)hashTable[signature]; list.Add(str); } } // now we print all the anagrams for each hash key foreach (string key in hashTable.Keys) { Console.Write("{0}: ", key); ArrayList list = (ArrayList)hashTable[key]; foreach (string str in list) { Console.Write("{0},", str); } Console.Write(Environment.NewLine); } } // // this method will sort the characters in the input string alphabetically // this sorted string will be out hash key // static string GetAnagramSignature(string input) { char[] charArray = input.ToLower().ToCharArray(); Array.Sort<char>(charArray, new Comparison<char>(CompareCharacters)); return new string(charArray); } static int CompareCharacters(char a, char b) { if (a > b) return 1; else if (b > a) return -1; else return 0; } } }