The Engineering Report for the Task of Sorting Algorithm Detective
Analysis Conclusion:
Sort1: Merge Sort
Sort2: Bucket Sort
Sort3: Bubble Sort
Sort4: Quick Sort
Instruction-Analysis test data preparation
The test data generated by a java program which we designed for week3 task. The best case data is a file which is ascending order string, the worst case data is the file including a descending order, in addition, the random test data is the file generated by that java codes.
Because some linear sorting algorithms will be affected by the length of string, thus every file will be separated by 5 different editions, whose string's lengths are 1, 5, 10 or 50.
Analysis Detail
Step-1: Time Complexity Test
Sort1:
| SORT-1S | Case | 10000 | 20000 | 30000 | 40000 | 50000 | 60000 | 70000 | 80000 | 90000 | 100000 | 200000 | 300000 | 400000 | 500000 | 600000 | 700000 | 800000 | 900000 | 1000000 |
| L-1 | Best | 0.03 | 0.04 | 0.06 | 0.08 | 0.09 | 0.1 | 0.12 | 0.14 | 0.15 | 0.28 | 0.44 | 0.6 | 0.76 | 1.14 | 1.61 | 1.48 | 1.66 | 1.9 | 2.32 |
| Worst | 0.03 | 0.06 | 0.06 | 0.09 | 0.09 | 0.11 | 0.13 | 0.14 | 0.16 | 0.18 | 0.45 | 0.6 | 0.76 | 1.38 | 1.21 | 1.92 | 1.84 | 1.89 | 2.36 | |
| Random | 0.02 | 0.04 | 0.06 | 0.07 | 0.09 | 0.1 | 0.12 | 0.14 | 0.15 | 0.16 | 0.33 | 0.49 | 0.65 | 0.82 | 0.98 | 1.14 | 1.32 | 1.47 | 1.63 | |
| L-5 | Best | 0.05 | 0.07 | 0.11 | 0.24 | 0.28 | 0.3 | 0.34 | 0.58 | 0.4 | 0.44 | 1.13 | 1.34 | 1.97 | 1.93 | 2.94 | 3.33 | 3.87 | 3.86 | 4.43 |
| Worst | 0.04 | 0.07 | 0.1 | 0.13 | 0.15 | 0.18 | 0.21 | 0.24 | 0.27 | 0.3 | 0.57 | 0.84 | 1.1 | 1.37 | 1.63 | 1.89 | 2.15 | 2.41 | 2.66 | |
| Random | 0.05 | 0.09 | 0.12 | 0.26 | 0.29 | 0.32 | 0.36 | 0.39 | 0.52 | 0.71 | 0.84 | 1.31 | 1.8 | 3.25 | 2.7 | 3.21 | 3.83 | 3.94 | 5.86 | |
| L-10 | Best | 0.06 | 0.12 | 0.16 | 0.21 | 0.27 | 0.32 | 0.37 | 0.41 | 0.47 | 0.51 | 1.01 | 1050 | 2.01 | 2.49 | 2.98 | 3.46 | 3.95 | 4.43 | 4.92 |
| Worst | 0.09 | 0.18 | 0.31 | 0.42 | 0.43 | 0.74 | 0.72 | 0.78 | 1.12 | 0.95 | 1.75 | 2.41 | 2.98 | 2.99 | 5.18 | 6.32 | 9.99 | 8.34 | 8.7 | |
| Random | 0.07 | 0.12 | 0.17 | 0.22 | 0.28 | 0.33 | 0.39 | 0.44 | 0.49 | 0.54 | 1.08 | 1.62 | 2.15 | 2.69 | 3.21 | 3.75 | 4.28 | 4.82 | 5.36 | |
| L-50 | Best | 0.24 | 0.48 | 0.69 | 0.92 | 1.14 | 1.37 | 1.6 | 1.82 | 2.05 | 2.27 | 4.53 | 6.79 | 9.04 | - | - | - | - | - | - |
| Worst | 0.12 | 0.47 | 0.69 | 0.92 | 1.14 | 1.38 | 1.6 | 1.82 | 2.05 | 2.28 | 4.52 | 6.79 | 9.01 | 11.29 | 13.5 | - | - | - | - | |
| Random | 0.24 | 0.47 | 0.7 | 0.93 | 1.16 | 1.38 | 1.61 | 1.85 | 2.07 | 2.3 | 4.59 | 6.87 | 9.18 | 11.47 | 13.7 | - | - | - | - |
The time complexity for every circumstance of sort1 is nlog(n).
The result of test 1 is as same as the original file, thus the sort1 is a stable sorting algorithm.
According to the above analysis sort1 is a merge sort.
Sort2: Bucket Sort
| SORT-2S | Case | 200 | 400 | 600 | 800 | 1000 | 2000 | 3000 | 4000 | 5000 | 6000 | 7000 | 8000 | 9000 | 10000 | 20000 | 30000 | 40000 | 50000 | 60000 | 70000 | 80000 | 90000 | 100000 | 200000 | 300000 | 400000 | 500000 | 600000 | 700000 | 800000 | 900000 | 1000000 |
| L-1 | Best | 0.02 | 0.03 | 0.03 | 0.05 | 0.06 | 0.11 | 0.18 | 0.24 | 0.31 | 0.37 | 0.44 | 0.5 | 0.58 | 0.64 | 1.37 | 2.13 | 2.93 | 3.72 | 4.53 | 5.36 | 6.21 | 7.07 | 8.02 | 16.88 | 26.27 | 35.51 | 45.22 | 55.19 | 65.6 | 75.16 | 85.06 | 95.27 |
| Worst | 0.02 | 0.02 | 0.03 | 0.04 | 0.05 | 0.12 | 0.17 | 0.23 | 0.3 | 0.37 | 0.43 | 0.5 | 0.57 | 0.64 | 1.37 | 2.13 | 2.92 | 3.72 | 4.53 | 5.35 | 6.21 | 7.07 | 7.91 | 16.76 | 25.96 | 35.31 | 44.93 | 54.7 | 64.75 | 74.66 | 84.6 | 94.7 | |
| Random | 0.01 | 0.02 | 0.04 | 0.05 | 0.06 | 0.11 | 0.18 | 0.24 | 0.31 | 0.37 | 0.44 | 0.5 | 0.61 | 0.65 | 1.39 | 2.13 | 2.93 | 3.73 | 4.54 | 5.37 | 6.23 | 7.08 | 8.13 | 17.21 | 26.2 | 35.73 | 45.2 | 55.13 | 65.53 | 75.26 | 85.62 | 95.83 | |
| L-5 | Best | 0.01 | 0.02 | 0.03 | 0.04 | 0.05 | 0.11 | 0.18 | 0.23 | 0.3 | 0.37 | 0.44 | 0.5 | 0.58 | 0.64 | 1.38 | 2.14 | 2.94 | 3.77 | 4.56 | 5.4 | 6.27 | 7.14 | 8 | 16.94 | 26.27 | 35.86 | 45.56 | 55.48 | 65.58 | 75.65 | 85.75 | 95.79 |
| Worst | 0.13 | 0.03 | 0.04 | 0.05 | 0.05 | 0.12 | 0.17 | 0.24 | 0.3 | 0.37 | 0.44 | 0.51 | 0.58 | 0.64 | 1.38 | 2.15 | 2.95 | 3.76 | 4.57 | 5.41 | 6.27 | 7.13 | 8 | 16.97 | 26.3 | 35.9 | 45.39 | 55.5 | 65.67 | 75.81 | 85.91 | 96.1 | |
| Random | 0.14 | 0.02 | 0.03 | 0.05 | 0.05 | 0.12 | 0.17 | 0.24 | 0.31 | 0.37 | 0.44 | 0.51 | 0.58 | 0.65 | 1.39 | 2.17 | 2.98 | 3.81 | 4.64 | 5.48 | 6.36 | 7.24 | 8.11 | 17.22 | 26.73 | 36.5 | 46.38 | 56.51 | 66.9 | 77.23 | 87.58 | 97.96 | |
| L-10 | Best | 0.02 | 0.02 | 0.04 | 0.05 | 0.06 | 0.11 | 0.17 | 0.23 | 0.31 | 0.37 | 0.43 | 0.5 | 0.57 | 0.64 | 1.38 | 2.14 | 2.94 | 3.75 | 4.57 | 5.4 | 6.27 | 7.14 | 7.99 | 16.95 | 26.29 | 35.84 | 45.47 | 55.44 | 65.61 | 75.72 | 85.86 | 95.87 |
| Worst | 0.13 | 0.03 | 0.04 | 0.05 | 0.05 | 0.11 | 0.17 | 0.24 | 0.3 | 0.37 | 0.44 | 0.5 | 0.58 | 0.64 | 1.38 | 2.15 | 2.96 | 3.81 | 4.59 | 5.45 | 6.33 | 7.15 | 8.01 | 16.98 | 26.31 | 35.92 | 45.62 | 55.53 | 65.69 | 75.84 | 85.98 | 96.16 | |
| Random | 0.13 | 0.03 | 0.04 | 0.05 | 0.06 | 0.11 | 0.17 | 0.24 | 0.3 | 0.37 | 0.44 | 0.5 | 0.58 | 0.65 | 1.39 | 2.17 | 2.99 | 3.81 | 4.64 | 5.48 | 6.36 | 7.24 | 8.11 | 17.23 | 26.76 | 36.51 | 46.41 | 56.54 | 66.92 | 77.26 | 87.63 | 98.1 | |
| L-50 | Best | 0.02 | 0.02 | 0.04 | 0.05 | 0.05 | 0.11 | 0.18 | 0.24 | 0.31 | 0.37 | 0.44 | 0.5 | 0.58 | 0.65 | 1.39 | 2.15 | 2.96 | 3.77 | 4.58 | 5.43 | 6.29 | 7.16 | 8.03 | 17.02 | 26.39 | 36.11 | - | - | - | - | - | - |
| Worst | 0.02 | 0.02 | 0.04 | 0.05 | 0.05 | 0.12 | 0.18 | 0.24 | 0.31 | 0.37 | 0.44 | 0.5 | 0.58 | 0.65 | 1.39 | 2.15 | 2.96 | 3.77 | 4.58 | 5.42 | 6.3 | 7.17 | 8.03 | 17.02 | 26.38 | 36.01 | 45.64 | 55.69 | 65.84 | 76.06 | 86.38 | 96.17 | |
| Random | 0.13 | 0.03 | 0.04 | 0.05 | 0.06 | 0.11 | 0.17 | 0.24 | 0.31 | 0.37 | 0.44 | 0.51 | 0.58 | 0.66 | 1.4 | 2.18 | 3 | 3.82 | 4.66 | 5.51 | 6.39 | 7.27 | 8.15 | 17.33 | 26.91 | 36.73 | 46.67 | 56.87 | 67.3 | 77.7 | 88.13 | 98.57 |

According to the curve of running time and input size, the time complexity is O(n) linear, in addition, the performance of sort 2 cannot be influenced by the length of string . Thus sort2 is in the group 4, which was mentioned in phase 1 report. Furthermore, after the stability test ,the sort2's result is as same as the original file, which demonstrate that the sort 2 is in group 4.
In order to distinguished the bucket sort and the counting sort. A special test has been involved into this task. According to the principle concept of counting sort, if there is a string extremely longer than other input
the running time will be changed obviously. However, in this case, sort2 does not have this feather. According to above analysis, the sort2 is a bucket sort.
Sort3:Bubble-sort
| SORT-3S | Case | 200 | 400 | 600 | 800 | 1000 | 2000 | 3000 | 4000 | 5000 | 6000 | 7000 | 8000 | 9000 | 10000 | 20000 |
| L-1 | Best | 0.01 | 0.03 | 0.05 | 0.08 | 0.11 | 0.45 | 0.96 | 1.73 | 2.66 | 3.83 | 5.18 | 6.89 | 8.62 | 10.65 | 42.61 |
| Worst | 0.07 | 0.26 | 0.6 | 1.04 | 1.63 | 6.5 | 14.63 | 26.01 | 40.74 | 58.53 | 79.62 | 103.33 | 131.63 | 162.56 | 652.64 | |
| Random | 0.04 | 0.15 | 0.31 | 0.56 | 0.87 | 3.44 | 7.85 | 13.84 | 21.7 | 31.29 | 45.54 | 55.41 | 70.29 | 87.4 | 346.21 | |
| L-5 | Best | 0.01 | 0.03 | 0.06 | 0.11 | 0.17 | 0.73 | 1.77 | 3.23 | 5.21 | 7.73 | 10.72 | 14.16 | 17.9 | 22.72 | 99.7 |
| Worst | 0.19 | 0.28 | 0.62 | 1.08 | 1.69 | 6.74 | 15.24 | 27.04 | 42.09 | 60.88 | 83.12 | 107.83 | 137.87 | 169.59 | 676.66 | |
| Random | 0.16 | 0.15 | 0.33 | 0.58 | 0.94 | 3.74 | 8.3 | 14.95 | 23.65 | 34.66 | 46.16 | 60.73 | 77.15 | 94.96 | 384.13 |

According to the test data in the above table, the time complexity of Sort2 is (n^2). For instance, the input increased 10 times the running time will become 100 times.
In order to test the affect by length of string, the length 5 file is involved in. And the data demonstrated that sort3 cannot be influenced by the length of string.
Furthermore, after the stability test, there are no difference between the original file and the sort3's result. thus the sort 3 is a stable sorting algorithm.
Based on the phase 1 report, the sorting algorithm whose best,worst,average case are O(n^2) and is a stable sorting algorithm, the sort3 is bubble sort.
Sort4: Quick Sort
| SORT-4s | Case | 7000 | 8000 | 9000 | 10000 | 20000 | 30000 | 40000 | 50000 | 60000 | 70000 | 80000 | 90000 | 100000 |
| L-1 | Best | 0.1 | 0.12 | 0.14 | 0.15 | 0.33 | 0.52 | 0.73 | 0.91 | 1.13 | 1.35 | 1.59 | 1.79 | 2 |
| Worst | 0.1 | 0.12 | 0.13 | 0.15 | 0.33 | 0.5 | 0.7 | 0.9 | 1.08 | 1.3 | 1.53 | 1.75 | 1.95 | |
| Random | 0.11 | 0.13 | 0.15 | 0.17 | 0.34 | 0.54 | 0.75 | 0.95 | 1.16 | 1.39 | 1.59 | 1.83 | 2.05 | |
| L-5 | Best | 0.69 | 0.89 | 1.12 | 1.37 | 5.35 | 11.95 | 21.17 | 32.98 | 47.46 | 64.51 | 85.15 | 106.41 | 131.19 |
| Worst | 0.68 | 0.89 | 1.11 | 1.37 | 5.35 | 11.95 | 21.17 | 32.98 | 47.47 | 64.42 | 83.99 | 106.23 | 131.3 | |
| Random | 0.09 | 0.1 | 0.12 | 0.14 | 0.28 | 0.44 | 0.6 | 0.77 | 0.94 | 1.12 | 1.3 | 1.48 | 1.67 | |
| L-10 | Best | 0.69 | 0.9 | 1.12 | 1.38 | 5.39 | 12 | 21.22 | 33.12 | 47.66 | 64.67 | 85.56 | 106.84 | 131.87 |
| Worst | 0.69 | 0.89 | 1.12 | 1.38 | 5.37 | 11.98 | 21.25 | 33.11 | 47.67 | 64.76 | 84.84 | 107.34 | 132.95 | |
| Random | 0.09 | 0.11 | 0.13 | 0.14 | 0.29 | 0.44 | 0.61 | 0.78 | 0.95 | 1.14 | 1.32 | 1.51 | 1.69 | |
| L-50 | Best | 0.74 | 0.95 | 1.19 | 1.45 | 5.53 | 12.22 | 21.54 | 33.51 | 48.16 | 65.38 | 85.32 | 107.93 | 133.13 |
| Worst | 0.74 | 0.95 | 1.19 | 1.45 | 5.53 | 12.23 | 21.56 | 33.53 | 48.2 | 65.39 | 85.39 | 107.95 | 133.18 | |
| Random | 0.1 | 0.12 | 0.14 | 0.15 | 0.31 | 0.49 | 0.67 | 0.85 | 1.05 | 1.24 | 1.44 | 1.65 | 1.84 |

The worst case of quick sort is all item are in order or reorder. and the random case is better than ascending or descending case.
In addition, the ascending case is worse than random case obviously, when the string length grater than 1, thus, it is possible that the sort 4 is quick sorting randomly, in that case, the time complexity of random and ascending case should be similar. In addition, according to the curve of running time and input size, the time complexity of sort4 is n(logn) in random case, however, in best and worst case the time complexity is O(n^2).
In the stability test, sort4's result is different from the original file. thus it is not stable.
The stability test result of sort 4:

according to the above analysis, the sort 4 is a Quick sort.
Step-2: Stability Test
The concept of stability of a sorting algorithm is that a "stable" sorting algorithm keeps the items with the same sorting key in order. Thus I designed the test data as appendix-1, there are 10 string which include 10 letters, in every string the capital letter is at different positions. If the answer sorted by the tested sorting algorithm should as same as the original file.
After the stability test, sort(1-3) are stable, however sort4 is unstable.
Appendix:
Stability Test Data

Becuase the strings in this file is in asscending order, thus if an algorithm is stable, the position of every item should not be changed.