# Improving sorts (Optional) - Pascal programming

The DistantBubbleSort algorithm that was introduced major improvement on BubbleSort. Instead of comparing adjacent values, values that are situated at a given distance apart are compared. The idea is that when two distant values are swapped, this possibly saves doing the same thing by several individual swaps. The algorithm for doing this is not very different from the BubbleSort algorithm. If this DistantBubbleSort algorithm is applied repeatedly with ever decreasing values for the distance apart, the whole data array will eventually be sorted. This method of using DistantBubbleSort in this way was first proposed by David Shell and is generally known as “Shell Sort”. A Pascal implementation of this algorithm is contained in the procedure ShellSort shown in Figure.

Procedure ShellSort

In this procedure, the initial value for Distance is half the number of data items in the list, and this is halved at every iteration. A thorough analysis of the improvement obtained by this technique is well beyond the scope of this . Look at the Distant BubbleSort procedure and compare it with the BubbleSort procedure of Figure. There are differences, and the amount of detail is larger in Distant BubbleSort, but on the whole the algorithms are also very similar.

Recursive Sort: Merge Sort

As an example of a recursive procedure, Merge Sort. Here, sorting is achieved by splitting the array into two halves, sorting each half, and then merging the two sorted halves. Of course, each of the two halves is sorted using the same process, and this is repeated until each half has only a single element (base case). The merging of the two subarrays is done simply with an index “sliding down” each subarray. For each value of the index the two indexed values are compared, and their maximum is copied into the array Result. The index of the copied value is incremented and the process is continued until one of the two subarrays has been entirely copied. The remaining elements in the other subarray are then copied in the resulting array.

Procedure MergeSort

The Pascal implementation of this algorithm is shown in Figure which shows procedure MergeSort, which contains Merge as an internal procedure. The fact that it is a recursive procedure requires no special indication in the Pascal program.

Another Merge Sort: the Von Neumann Sort

Let’s now look at another sorting algorithm based upon the idea of merging but, this time, without recursion. This will allow us to introduce a couple of new features of Pascal. This algorithm was originally developed on some ideas by John von Neumann, the inventor of the stored program concept, and so we will name it Von Neumann Sort.

The Von Neumann Sort makes use of an auxiliary storage array of the same size as the original data array. During each major iteration, the data are copied from one data area, the “source”, to the other, the “target”, the direction of copy reversing between iterations. Thus, during the first iteration, the data are copied from the original data area, the “source”, to the auxiliary area, which is the “target”. During the second iteration, the roles of the two areas are reversed and the auxiliary area is the source, and the original data area is the target.

We can imagine each data array to be laid out as a row of cells from left to right with element 1 at the left and element n at the right. In the source area, the data are treated as two ascending sequences one starting at the left and going to the right and the other starting at the right and going to the left. The diagram of Figure represents the state at the beginning of an example sort of ten digits.

Start of a Von Neumann sort The two arrows in the source area mark the beginnings of the data sequences to be merged, and the arrow in the target area marks the cell into which the first element will be copied. Merging continues until the two next available data elements in the source are both less than the last copied element in the target. The four diagrams of Figure show successive steps in the merging process up to the end of the first sequence.

The Von Neumann sort The building of the first sequence stops because the next two available items in the source area, 4 and 5, are both less than the last item of the sequence that has been built in the target area, Although the data is really copied, for clarity, the diagrams show it as being moved. At this point, the next available elements are treated as the beginnings of new sequences and merging continues, forming a sequenc that is built at the opposite end of the target area. In the last diagram of Figure, the arrow in the target area has been moved to show where the next sequence will be put. This merging continues until the end of the next sequence is reached, as shown I the three diagrams of Figure. As with the first sequence, when no more merging is possible, a new sequence is started at the other end of the target area as shown by the arrow.

Von Neumann sort Sequences are formed in the target area in this way until all the data have been copied, at which point the source and target designations are interchanged. This is shown in the thre diagrams of Figure.

End of Von Neumann sort In the example, four sequences, (0, 3, 9, 7), (4, 5, 9), (2, 8), and (1) were constructed in the target area. The process continues by merging these four sequences to form two sequences in the newly designated target area. The situation at the end of the second iteration is as shown in Figure 9.14 with two sequences (0, 3, 4, 5, 6, 7, 9), and (1, 2, 8).

Result of first merge These remaining two sequences are merged in the third iteration to produce the last sequence of Figure.

End of Von Neumann sort Since there is only one sequence, the sort is complete. However, since the sorted data are in the auxiliary data array, they must be copied back into the original data array. The sorting process will thus repeatedly copy sequences of data from the source area to the target area until only one sequence was formed in the target, and therefore the data is sorted. Finally, the data are copied back into the original data area if necessary. Thus, we can sketch the algorithm in pseudocode as:

Von Neumann SortRepeatMerge SequencesUntil only one sequence copiedIf sorted data is in auxiliary storeCopy data into original data arrayEnd Von Neumann Sort

In the merging of data to form sequences, each iteration constructs a single sequence in the target area, and looping continues until the end of the source data is reached.

Merge SequencesRepeatMerge One SequenceUntil data all copied from source to targetCount sequencesEnd Merge Sequences

In the process to generate one sequence, each iteration copies a data element and the loop terminates when the source data elements cannot be part of the current sequence.

Merge One SequenceWhile data being copied forms a sequenceCopy element from source to targetEnd Merge One Sequence

When we implement this in Pascal, we’ll naturally make three separate procedures: VonNeumannSort, MergeSequences, and MergeOneSequence, as shown in the complete program of Figure.

Program Von Neumann Sort

In order to make it easy to interchange source and target designations, pointers are used, even though the two data areas are not dynamic variables created with the procedure New. We define a data type:

TYPE DataArrayPointer = ^DataArrayType; and then declare two pointers: VAR Source, Target: DataArrayPointer;

We now need to make these two pointers reference the source data array and the auxiliary data array. There is a standard Pascal function Addr, which takes an ordinary variable as a parameter and returns its memory location (address), which can then be assigned to a pointer.