- #1
- 8,889
- 650
Moderator note: This was split from https://www.physicsforums.com/showthread.php?p=1409789#post1409789
Perhaps this response should go into it's own thread regarding sorting ...
A merge sort is ideal for external storage, since multiple elements can be read or written with a single I/O. With 4 external storage devices, a radix 2 merge sort involves all sequential I/O, and can be done on tape drives (as was done on early systems). 3 is the bare minimum for a tape sort, but it doubles the number of passes as the output tape has to be re-read to split it back up to the other two tape drives for each pass.
Optimizations for external storage include taking advatange of hardware that can handle scattered data with a single I/O operation, meaing that memory is randomly accessed during a single write operation (no point in doing this for reads). In spite the obviousness of this, since this is how any bus mastering I/O operation with mapped memory has to be done (since the logically sequential mapped segments are physically scattered), some Japanese company managed to get a patent on this, but I don't know if the patent is still valid.
Perhaps this response should go into it's own thread regarding sorting ...
yes.Better than quicksort?a merge sort is the quickest
I think you mean sorts that use very little storage overhead. A merge sort typically doubles the required storage space.comparason sort
Radix sort is a type of merge sort. A good merge sort will handle variable size groups, which are created in the initial "scan" by forming groups out of any increasing or optionally decreasing sequence of elements. Groups with decreasing sequences are "reversed" back to increasing sequences when written to external storage, or during the first actual merge pass. Each merge pass merges pairs (or triplets if radix 3 sort, quadlets, if radix 4 sort, ...) of groups, reducing the number of groups by 1/2 (or 1/3 or 1/4 ...) and increasing the size of the groups. The final pass occurs when the output will be creating a single group, usually outputting in desired format to external storage if required.I'd imagine a radix sort ought to beat any comparison-based sort.
A merge sort is ideal for external storage, since multiple elements can be read or written with a single I/O. With 4 external storage devices, a radix 2 merge sort involves all sequential I/O, and can be done on tape drives (as was done on early systems). 3 is the bare minimum for a tape sort, but it doubles the number of passes as the output tape has to be re-read to split it back up to the other two tape drives for each pass.
Optimizations for external storage include taking advatange of hardware that can handle scattered data with a single I/O operation, meaing that memory is randomly accessed during a single write operation (no point in doing this for reads). In spite the obviousness of this, since this is how any bus mastering I/O operation with mapped memory has to be done (since the logically sequential mapped segments are physically scattered), some Japanese company managed to get a patent on this, but I don't know if the patent is still valid.
Last edited by a moderator: