Parallel sorting in Erlang
Parallel map makes light work of a massively parallel merge sort:
-module(mergesort). -export([msort/1]). %% terminal cases for list of less than 2 entries msort([]) -> []; msort([A]) -> [A]; % split list into 2 & mergesort them msort(L) -> Mid = trunc(length(L) / 2), {A, B} = lists:split(Mid, L), lists:umerge( pmap(fun(I) -> msort(I) end, [A, B]) ).This gets really interesting in certain cases eg. where acquisition of the data is expensive but randomly accessible.
Instead of passing in a simple list, our sort function could accept a future that would promise to return any requested linear subset of the data. Assuming we know the total length of the source list, no data is needed until we reach the merge phase at the bottom of the recursion stack, and incrementally more data is required as we bubble back up to the final result.
In this example we are doing our network retrieval in parallel with our sorting (which itself is happening in parallel). If the remote data comes from multiple hosts we could parallelise the retrieval by using multiple port instances.
03:58 PM, 03 Dec 2007 by Mark Aufflick Permalink | Short Link







