Sorting algorithms
Working with Data Structures and manipulating data.
Sorting Integers
Sorting requires data.
Here is a quick way to get a list of integers. Quickly review the data and make a 10, 100, and 1000 integers. See if you can make a reusable function.
import random
numbers = []
for i in range(10):
numbers.append(random.randint(0,100))
print("Random List")
print(numbers)
Random List
[31, 6, 76, 68, 33, 82, 39, 14, 11, 53]
Warm Up
Discuss with a partner… What are some strategies you would use to sort a list of integers? (Don’t worry about writing code for now). Just to mental math on how you would sort.
- [100, 10, 50, 0, 70, 40, 20, 30, 90, 80, 60]
Explore
Get into pairs
We will be focusing on 4 algorithms today.
In your groups each pair will choose to be experts on a sorting algorithm. Bubble, Merge, Selection, and Insertion. Take about 5 minutes to read about your algorithm (Geek for Geeks linked below) and be ready to explain it to your other group members.
Practice
Prepare for group practice…
- Use or download app to your phone to “Display Text” largely.
- Pick a random number between 1 to 100 and load on your phone. Keep it hidden.
- Find other students that have been working on your expertise Algoritmm.
- Stand in a line randomly, so Teacher can see you.
- Reveal your numbers.
- [75, 17, 46, 80, 67, 45, 69, 79, 40, 0]
How would you sort this list with your expertise, rehearse and then present…
- Bubble Sort
- Selection Sort
- Merge Sort
- Insertion Sort
Explain.
Sorting Words
Obtaining words using nltk allows strings to work the same way as integers. Using your expertise build and sort random words of size 10, 100, 1000.
# !pip install nltk
import nltk
import random
from nltk.corpus import words
nltk.download('words') # Download the word list (only required once)
english_words = words.words()
def new_words():
# You can now use the 'english_words' list in your code
random_words = []
for i in range(10):
random_words.append(english_words[random.randint(0,len(english_words))])
return random_words
print("Random List")
print(new_words())
Random List
['transindividual', 'kerystics', 'mathematicals', 'nonsyndicate', 'peninvariant', 'diacranterian', 'vintager', 'arciferous', 'gittith', 'ichthyoid']
[nltk_data] Downloading package words to
[nltk_data] /Users/johnmortensen/nltk_data...
[nltk_data] Package words is already up-to-date!
Bubble Sort Solution
Teacher provided code.
words = new_words()
print(words)
def bubbleSort(list):
n = len(list) - 1 # list are indexed 0 to n-1, len is n
# Traverse through list with i index
for i in range(n):
swapped = False # optimize code, so it exits if now swaps on inner loop
# Inner traversal using j index
for j in range(n-i): # n-i as positions on right are in order in bubble
# Swap if the element KeyN is greater KeyN1
keyN = list[j]
keyN1 = list[j+1]
if keyN > keyN1:
swapped = True
list[j], list[j + 1] = list[j + 1], list[j] # single line swap
if not swapped: # if no swaps on inner pass, list is sorted
return # exit function
bubbleSort(words)
print(words)
['ministerial', 'nisnas', 'bardism', 'nondenial', 'scalma', 'polyemia', 'encephalomalaxis', 'unoccupied', 'improvableness', 'technicological']
['bardism', 'encephalomalaxis', 'improvableness', 'ministerial', 'nisnas', 'nondenial', 'polyemia', 'scalma', 'technicological', 'unoccupied']
Selection Sort Solution
Teacher provided code. Does it work?
words = new_words()
print(words)
def selectionSort(list):
n = len(list) # length is n
# List is traversed from index 0 to n-1, n elements
for i in range(n):
smallI = i # small index is captured
smallV = list[i]
# Inner traversal looks at elements after i
for j in range(i+1, n):
# Save reference if less
keyV = list[j]
if keyV < smallV:
smallI = j # small index is replaced
smallV = keyV
# swap smallest to current i positon, sorting left to right
list[i], list[smallI] = list[smallI], list[i] # single line swap
selectionSort(words)
print(words)
['inexhaustibleness', 'expropriator', 'chacte', 'heteroploidy', 'rougeot', 'andantino', 'silentiary', 'manfully', 'cynically', 'envelope']
['andantino', 'chacte', 'cynically', 'envelope', 'expropriator', 'heteroploidy', 'inexhaustibleness', 'manfully', 'rougeot', 'silentiary']
Discuss
Answer the following with your group. Select the algorithm you believe is best for small lists and big lists, explain. On each sort discuss algorithmic complixity in Big(O) notation.
- When should you use each algorithm? What makes an algorithm the right choice?
- Given the following lists…
- [0, 2, 6, 4, 8, 10]
- [Elephant, Banana, Cat, Dog, Apple]
- [29, 13, 83, 47, 32, 78, 100, 60, 65, 15, 24, 9, 40, 68, 53, 8, 90, 58, 39, 32, 34, 91, 74, 94, 49, 87, 34, 87, 23, 17, 27, 2, 38, 58, 84, 15, 9, 46, 74, 40, 44, 8, 55, 28, 81, 92, 81, 88, 53, 38, 19, 21, 9, 54, 21, 67, 3, 41, 3, 74, 13, 71, 70, 45, 5, 36, 80, 64, 97, 86, 73, 74, 94, 79, 49, 32, 20, 68, 64, 69, 1, 77, 31, 56, 100, 80, 48, 75, 85, 93, 67, 57, 26, 56, 43, 53, 59, 28, 67, 50]
Adapting Sort to Dictionary
Provided below is a Bubble Sort Algorithm sorting a list of dictionaries based off of selected key.
"""
* Creator: Nighthawk Coding Society
Bubble Sort of a List of Dictionaries with optimizations, every key in row 0 is used to sort and resort list.
"""
# bubble sorts a list of dictionaries, base off of provided key
def bubbleSort(list, key):
n = len(list) - 1 # list are indexed 0 to n-1, len is n
# Traverse through list with i index
for i in range(n):
swapped = False # optimize code, so it exits if now swaps on inner loop
# Inner traversal using j index
for j in range(n-i): # n-i as positions on right are in order in bubble
# Swap if the element KeyN is greater KeyN1
keyN = list[j].get(key)
keyN1 = list[j+1].get(key)
if keyN > keyN1:
swapped = True
list[j], list[j + 1] = list[j + 1], list[j] # single line swap
if not swapped: # if no swaps on inner pass, list is sorted
return # exit function
if __name__ == "__main__":
# list/dictionary sample
list_of_people = [
{"name": "Risa", "age": 18, "city": "New York"},
{"name": "John", "age": 33, "city": "Eugene"},
{"name": "Shekar", "age": 18, "city": "San Francisco"},
{"name": "Ryan", "age": 21, "city": "Los Angeles"}
]
# assuming uniform keys, pick 1st row as source of keys
key_row = list_of_people[0]
# print list as defined
print("Original")
print(list_of_people)
for key in key_row: # finds each key in the row
print(key)
bubbleSort(list_of_people, key) # sort list of people
print(list_of_people)
Original
[{'name': 'Risa', 'age': 18, 'city': 'New York'}, {'name': 'John', 'age': 33, 'city': 'Eugene'}, {'name': 'Shekar', 'age': 18, 'city': 'San Francisco'}, {'name': 'Ryan', 'age': 21, 'city': 'Los Angeles'}]
name
[{'name': 'John', 'age': 33, 'city': 'Eugene'}, {'name': 'Risa', 'age': 18, 'city': 'New York'}, {'name': 'Ryan', 'age': 21, 'city': 'Los Angeles'}, {'name': 'Shekar', 'age': 18, 'city': 'San Francisco'}]
age
[{'name': 'Risa', 'age': 18, 'city': 'New York'}, {'name': 'Shekar', 'age': 18, 'city': 'San Francisco'}, {'name': 'Ryan', 'age': 21, 'city': 'Los Angeles'}, {'name': 'John', 'age': 33, 'city': 'Eugene'}]
city
[{'name': 'John', 'age': 33, 'city': 'Eugene'}, {'name': 'Ryan', 'age': 21, 'city': 'Los Angeles'}, {'name': 'Risa', 'age': 18, 'city': 'New York'}, {'name': 'Shekar', 'age': 18, 'city': 'San Francisco'}]
Search
In all the resulting sorts, you are now in a better situation for Searching data.
The best way to think about searching list is to think about guessing a number between 1 and 100. Which of these solutions would you use in the Numbers game? Is there any requirements required to use one or the other technique?