gist:4430657

Explain what merge sort is and what to consider when implementing it.

MergeSort is a recursive sorting algorithm that uses O(n log n) comparisons in the worst case. To sort an array of n elements, we perform the following three steps in sequence:

Divide the unsorted list into two sublists of about half the size
Sort each of the two sublists
Merge the two sorted sublists back into one sorted list.

There are two merge sort implementations: top-down (uses recursion) and bottom-up. Last one is more efficient and popular.

Explain how Santa Claus might use merge sort to help with Christmas (extra points for mental gymnastics).

Perhaps each elf brings sorted stack of letters, and then Santa merge it...

Write a function that merges an array of already sorted arrays, producing one large, still sorted array

def merge(left, right):
    n, m = 0, 0
    result = []
    while n < len(left) and m < len(right):
        if left[n] < right[m]:
            result.append(left[n])
            n += 1
        else:
            result.append(right[m])
            m += 1
    result += left[n:]
    result += right[m:]
    return result

def sort(data):
    if len(data)<=1:
        return data[0]
    middle = len(data)//2
    left = data[:middle]
    right = data[middle:]
    result = merge(sort(left), sort(right))
    return result

What is the runtime complexity of your solution?

Let:
n -- elements in array;
k -- number of sorted arrays.

Suppose number of sorted arrays is power of 2, each sorted array has max n elements. Thus we can build recursion tree:

a(n)   a(n)   a(n)   a(n)   a(n)   a(n)    a(n)    a(n)  | list of sorted arrays
\        /    \        /    \        /      \        /   | merge(n)
   a(2n)         a(2n)         a(2n)          a(2n)      |
        \        /                 \        /            | merge(n)
           a(4n)                      a(4n)              |
               \                     /                   | merge(n)
                     a_sorted(k n)                       V

Each merge operation requires traversal of all n elements. Number of merge operations equals to height of recursion tree -- lg(k). So answer is O(n lg(k)). (In this example k=8)

How about space?

Merge sort usually requires additional n elements space.

Do you think there could be faster or more space efficient solutions?

avoid recursive function calls
allocate additional space only once

jettify/gist:4430657