Executive Summary
This report presents a plan for identifying which sorting algorithm each of four given sort programs implements, without inspecting the source code. By analysing the candidate algorithms, their distinguishing features have been identified.
Analysis
1. Oblivious Bubble Sort
This algorithm's average, best, and worst case performances are all O(n²), so it is rarely used to sort large, unordered data sets. Bubble sort can be used to sort a small number of items, where its asymptotic inefficiency is not a high penalty.
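Since the report names no implementation language, the sketches below use Python for illustration; function names are assumptions. The oblivious variant always makes the full n−1 passes, with no early exit:

```python
def bubble_sort(a):
    """Oblivious bubble sort: a fixed number of passes regardless of
    input order, so best, average, and worst cases are all O(n^2)."""
    a = list(a)
    n = len(a)
    for _ in range(n - 1):          # always n-1 passes, never exits early
        for i in range(n - 1):
            if a[i] > a[i + 1]:     # swap adjacent out-of-order pairs
                a[i], a[i + 1] = a[i + 1], a[i]
    return a
```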
2. Bubble Sort With Early Exit
Bubble sort with early exit can be used efficiently on a list of any length that is nearly sorted (that is, the elements are not significantly out of place). In some cases bubble sort's exchanges will put the elements in order on the first pass, and the second pass will find all elements in order, so the sort takes only about 2n comparisons; this is the best case, with time complexity O(n). However, the worst case is the same as for oblivious bubble sort, O(n²), and the average case complexity is also O(n²).
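A minimal Python sketch of the early-exit variant: a pass with no swaps proves the list is sorted, giving the O(n) best case on sorted input.

```python
def bubble_sort_early_exit(a):
    """Bubble sort that stops as soon as a pass makes no swaps."""
    a = list(a)
    n = len(a)
    for end in range(n - 1, 0, -1):  # each pass places the max at a[end]
        swapped = False
        for i in range(end):
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
                swapped = True
        if not swapped:              # no swaps: already sorted, exit early
            break
    return a
```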
3. Vanilla Insertion Sort
Vanilla insertion sort is the standard insertion sort: a simple sorting algorithm that is relatively efficient for small lists and mostly sorted lists, and is often used as part of more sophisticated algorithms. Its time complexity reaches O(n) when the input list is already in order, which is the best case. Its average and worst case time complexities are both O(n²), the same as the modified bubble sort.
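An illustrative Python sketch: each element is shifted left past larger elements, so a sorted input needs only one comparison per element (the O(n) best case).

```python
def insertion_sort(a):
    """Standard insertion sort: O(n) on sorted input, O(n^2) otherwise."""
    a = list(a)
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0 and a[j] > key:  # shift larger elements one slot right
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key                # drop key into its gap
    return a
```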
4. Insertion Sort On List
Insertion sort on a list is almost the same as vanilla insertion sort; the only difference is that the elements are stored in a linked list rather than an array. Its time complexity is unchanged from vanilla insertion sort.
5. Insertion Sort With Binary Search
Insertion sort with binary search uses a binary search to find the insertion gap, which reduces the number of comparisons, although shifting elements into place still takes linear time. Its best case time complexity is O(n log n), while the average and worst case time complexities remain O(n²).
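A sketch using Python's standard bisect module to locate the gap; note that the slice shift after the search is still O(i) per insertion, which is why the average and worst cases stay quadratic.

```python
import bisect

def binary_insertion_sort(a):
    """Insertion sort using binary search for the insertion point."""
    a = list(a)
    for i in range(1, len(a)):
        key = a[i]
        # O(log i) comparisons to find the gap among the sorted prefix
        pos = bisect.bisect_right(a, key, 0, i)
        a[pos + 1:i + 1] = a[pos:i]   # shifting is still O(i) moves
        a[pos] = key
    return a
```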
6. Vanilla Selection Sort
Vanilla selection sort is an in-place comparison sort. It has O(n²) complexity, making it inefficient on large lists, and it generally performs worse than the similar insertion sort. Selection sort is noted for its simplicity, and it also has performance advantages over more complicated algorithms in certain situations, for example when writes are expensive, since it makes at most n swaps. Its best, worst, and average time complexities are all O(n²).
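A minimal Python sketch; the full scan of the unsorted tail happens on every pass regardless of input order, which is why all cases are O(n²).

```python
def selection_sort(a):
    """Selection sort: always O(n^2) comparisons, at most n-1 swaps."""
    a = list(a)
    n = len(a)
    for i in range(n - 1):
        m = i
        for j in range(i + 1, n):   # scan the unsorted tail for the minimum
            if a[j] < a[m]:
                m = j
        a[i], a[m] = a[m], a[i]     # one swap per pass
    return a
```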
7. Quadratic Selection Sort
Quadratic selection sort divides the list into roughly √n groups of √n elements and caches the minimum of each group, so each selection costs O(√n) rather than O(n). Its time complexity is O(n√n) in all cases.
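One possible Python sketch of this scheme (the exact grouping details are an assumption): each of the n selections scans the ~√n cached minima and refreshes one group's minimum, both O(√n) steps.

```python
import math

def quadratic_selection_sort(a):
    """Selection sort over ~sqrt(n) groups with cached group minima."""
    size = max(1, math.isqrt(len(a)))
    groups = [list(a[i:i + size]) for i in range(0, len(a), size)]
    minima = [min(g) for g in groups]        # cached minimum per group
    out = []
    for _ in range(len(a)):
        # O(sqrt n): scan the cached minima for the overall minimum
        gi = min(range(len(groups)), key=lambda i: minima[i])
        g = groups[gi]
        out.append(minima[gi])
        g.remove(minima[gi])                 # O(sqrt n) removal
        minima[gi] = min(g) if g else float('inf')  # refresh the cache
    return out
```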
8. Merge Sort
Merge sort takes advantage of the ease of merging already sorted lists into a new sorted list. It starts by comparing every two elements and swapping them if the first should come after the second. It then merges each of the resulting lists of two into lists of four, then merges those lists of four, and so on, until at last two lists are merged into the final sorted list. As a result, its running time is O(n log n) in all cases.
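A recursive Python sketch (top-down rather than the bottom-up pairing described above, but with the same O(n log n) behaviour); using `<=` in the merge keeps the sort stable.

```python
def merge_sort(a):
    """Top-down merge sort: O(n log n) in all cases, stable."""
    if len(a) <= 1:
        return list(a)
    mid = len(a) // 2
    left, right = merge_sort(a[:mid]), merge_sort(a[mid:])
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:      # <= preserves order of equal keys
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    return out + left[i:] + right[j:]  # append whichever half remains
```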
9. Vanilla Quick Sort
Vanilla quick sort is a divide and conquer algorithm that relies on a partition operation: to partition an array, an element called a pivot is selected, all elements smaller than the pivot are moved before it, and all greater elements are moved after it. The lesser and greater sublists are then recursively sorted. Its best and average time complexities are O(n log n), and its worst case time complexity is O(n²).
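A compact (non-in-place) Python sketch with a first-element pivot; on already sorted input every partition is maximally unbalanced, which is the O(n²) worst case.

```python
def quick_sort(a):
    """Quicksort with first-element pivot: O(n^2) on sorted input."""
    if len(a) <= 1:
        return list(a)
    pivot = a[0]                         # naive pivot choice
    less = [x for x in a[1:] if x < pivot]
    greater = [x for x in a[1:] if x >= pivot]
    return quick_sort(less) + [pivot] + quick_sort(greater)
```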
10. Quick Sort Median of Three
Quicksort median of three follows the same recursive scheme as vanilla quicksort; the only difference is that the pivot is chosen as the median of the first, middle, and last elements of the array. Its asymptotic complexities are the same: best and average O(n log n), worst O(n²). In practice, however, the median-of-three pivot avoids the worst case on already sorted or reversed input and improves the constant factor on average.
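A sketch of the pivot rule; only the pivot selection differs from the vanilla version, and the three-way split guarantees progress even when all elements are equal.

```python
def median3_quick_sort(a):
    """Quicksort choosing the pivot as the median of first/middle/last."""
    if len(a) <= 1:
        return list(a)
    # median of the first, middle, and last elements
    pivot = sorted([a[0], a[len(a) // 2], a[-1]])[1]
    less    = [x for x in a if x < pivot]
    equal   = [x for x in a if x == pivot]   # nonempty, so recursion shrinks
    greater = [x for x in a if x > pivot]
    return median3_quick_sort(less) + equal + median3_quick_sort(greater)
```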
11. Randomised Quick Sort
Randomised quick sort is the same as vanilla quick sort, except that the pivot is chosen at random from the middle half of the array. Its time complexity is the same as vanilla quick sort, although a random pivot makes the O(n²) worst case very unlikely on any particular input.
12. Shell Sort Times 3 plus 1
In this case, the insertion sort uses gap sizes 1, 4, 13, 40, 121, … (each gap is three times the previous plus one), starting from the largest gap that fits the list and working its way down to 1. Its worst case time complexity is O(n√n), and its best case is O(n log n).
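A Python sketch of shell sort with the 3x+1 gap sequence; each pass is a gapped insertion sort, and the final pass with gap 1 is an ordinary insertion sort on a nearly sorted list.

```python
def shell_sort(a):
    """Shell sort with Knuth's 3x+1 gap sequence: 1, 4, 13, 40, 121, ..."""
    a = list(a)
    n = len(a)
    gap = 1
    while gap < n // 3:
        gap = gap * 3 + 1            # grow to the largest gap that fits
    while gap >= 1:
        for i in range(gap, n):      # insertion sort over gap-strided slices
            key = a[i]
            j = i
            while j >= gap and a[j - gap] > key:
                a[j] = a[j - gap]
                j -= gap
            a[j] = key
        gap //= 3                    # 13 -> 4 -> 1 -> stop
    return a
```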
13. Shell Sort Times 4
Shell sort times 4 works the same way as shell sort times 3 plus 1, except that its gap increments grow by a factor of 4 each time (1, 4, 16, 64, …). Because all of these gaps are powers of the same number, some pairs of positions are never compared until the final gap-1 pass, and the worst case time complexity rises to O(n²).
14. Bogo Sort
Bogo sort is a notoriously bad sorting algorithm: it can take an extremely long time to sort unsorted data but no time at all on already sorted data. Its average time complexity is O(n·n!), and the worst case is unbounded, since there is no guarantee it ever terminates. When the input is already sorted, the best case time complexity is O(n).
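A Python sketch: shuffle until sorted. A sorted input passes the initial check in O(n), while an unsorted one is shuffled an unbounded number of times.

```python
import random

def is_sorted(a):
    """True if a is in non-decreasing order."""
    return all(a[i] <= a[i + 1] for i in range(len(a) - 1))

def bogo_sort(a):
    """Bogo sort: repeatedly shuffle until the list happens to be sorted."""
    a = list(a)
    while not is_sorted(a):          # unbounded worst case
        random.shuffle(a)
    return a
```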
15. Radix Sort MSD First
Radix sort MSD first sorts by the most significant digit, distributing the items into buckets by that digit before moving on to the next position. Its time complexity is O(kn), where k is the number of characters each word contains.
16. Radix Sort LSD First
Opposite to radix sort MSD first, this version starts sorting from the last digit and works towards the first. Its time complexity is the same as radix sort MSD first, O(kn).
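An LSD sketch for fixed-width lowercase strings (the fixed width and the 26-letter alphabet are simplifying assumptions); k stable bucketing passes, one per character position, give the O(kn) bound.

```python
def radix_sort_lsd(words, width):
    """LSD radix sort for fixed-width lowercase ASCII strings."""
    for pos in range(width - 1, -1, -1):        # least significant char first
        buckets = [[] for _ in range(26)]
        for w in words:
            buckets[ord(w[pos]) - ord('a')].append(w)  # stable bucketing
        words = [w for b in buckets for w in b]        # concatenate buckets
    return words
```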
17. Bucket Sort
Bucket sort is a divide and conquer sorting algorithm that generalizes counting sort by partitioning an array into a finite number of buckets. Each bucket is then sorted individually, either using a different sorting algorithm or by recursively applying the bucket sorting algorithm. Its best case time complexity is O(n + k), and its worst case is O(n²).
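A sketch for floats uniformly distributed in [0, 1) (an assumption that makes the bucket index a simple multiplication); the worst case arises when all items land in one bucket.

```python
def bucket_sort(a, n_buckets=10):
    """Bucket sort for floats in [0, 1): distribute, then sort each bucket."""
    buckets = [[] for _ in range(n_buckets)]
    for x in a:
        buckets[int(x * n_buckets)].append(x)  # bucket index from the value
    out = []
    for b in buckets:
        out.extend(sorted(b))       # sort each bucket individually
    return out
```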
18. Counting Sort
Counting sort is applicable when each input is known to belong to a particular finite set. Its time complexity is O(n + k), where k is the size of the set of possible key values.
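A sketch for non-negative integers below k: one O(n) counting pass plus one O(k) output pass, with no comparisons at all.

```python
def counting_sort(a, k):
    """Counting sort for integers in the range [0, k)."""
    counts = [0] * k
    for x in a:
        counts[x] += 1              # O(n) tally pass
    out = []
    for value, count in enumerate(counts):
        out.extend([value] * count) # O(k) output pass
    return out
```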
Methodology
Test1_complexity:
In this test, inputs of different sizes (e.g. from 1000 to 7000 strings, increasing by 1000) with a fixed string length are used to test the algorithms. There are three specific subtests for each size: random strings, ordered strings, and reversed strings. Collecting data from these tests helps estimate each algorithm's time complexity and allows the sorts to be grouped into the following types:
A. All three subtests show quadratic complexity.
B. All three subtests show complexity that is neither quadratic nor linear.
C. All three subtests show linear complexity.
D. The random and reversed subtests show quadratic complexity, and the ordered subtest shows linear complexity.
E. The random and reversed subtests show quadratic complexity, and the ordered subtest shows complexity that is neither quadratic nor linear.
F. Infinite complexity (a long enough random input will fail to terminate consistently).
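The grouping could be driven by a small timing harness; the sketch below (the estimate_exponent helper is an assumption, with Python's built-in sorted as a stand-in sorter) fits the growth exponent b in t ≈ c·nᵇ from timings at the smallest and largest sizes, so b ≈ 1 suggests linear and b ≈ 2 quadratic behaviour.

```python
import math
import random
import time

def estimate_exponent(sorter, sizes):
    """Estimate the growth exponent b in t ~ n^b from a log-log slope
    between the smallest and largest input sizes (rough, timing-noisy)."""
    times = []
    for n in sizes:
        data = [random.random() for _ in range(n)]
        start = time.perf_counter()
        sorter(data)
        times.append(time.perf_counter() - start)
    return (math.log(times[-1] / times[0])
            / math.log(sizes[-1] / sizes[0]))
```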
Test2_stability:
An input with many duplicate items (the pigeonhole principle can be used to decide how many randomly generated strings are needed to guarantee duplicates) is supplied, and the output is compared with the input using the diff command. This is repeated 10 times, to rule out unstable algorithms that might happen to produce stable-looking results on a single run.
Stable – if the output and the input are equal in every run, the algorithm is stable.
Unstable – if the output and the input are not equal in at least one run, the algorithm is unstable.
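The stability check above can be sketched as follows (the is_stable helper and its parameters are assumptions): each duplicate key is tagged with its original index, the sorter is asked to compare on the key alone, and equal keys must keep their original relative order in the output.

```python
import random

def is_stable(sorter, n_items=100, n_keys=10):
    """Probe a sorter for stability using index-tagged duplicate keys.
    With 100 items over 10 keys, the pigeonhole principle guarantees
    duplicates, so an order-reversing sorter is always detected."""
    items = [(random.randrange(n_keys), i) for i in range(n_items)]
    out = sorter(items)   # sorter must compare on the key (first field) only
    for (k1, i1), (k2, i2) in zip(out, out[1:]):
        if k1 == k2 and i1 > i2:   # equal keys out of original order
            return False
    return True
```

For example, Python's built-in sorted with `key=lambda p: p[0]` passes this check, while a sorter that breaks ties in reverse index order fails it.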