Outline
- Data Structure
- L3.1 Introduction
- L3.2 Lists and Membership Operators
- L3.3 Quiz: Lists and Membership Operators
- L3.6 List Methods
- L3.8 Check for Understanding: Lists
- L3.9 Tuples
- L3.10 Quiz: Tuples
- L3.11 Sets
- L3.13 Dictionaries and Identity Operators
- L3.14 Quiz: Dictionaries and Identity Operators
- L3-18. Check for Understanding: Data Structures
- L3-19. Compound Data Structures
- L3-20. Quiz: Compound Data Structures
- L3-22. Practice Questions
- L3-24. Conclusion
- Vocabulary
- int
- float
- boolean
- string
- group and order these data types
- list
- tuple
- set
- dictionary
- compound data structure
- membership
- identity
- Data Structure: containers that organize and group data types, which contain other data types and even other containers
- a data type for mutable ordered sequences of elements
- index start from 0
- how far the element is from the beginning of the list
- the last element index is -1
months = ['January', 'February', 'March', 'April', 'May', 'June', 'July', 'August', 'September', 'October', 'November', 'December']
print(months[0]) # January
print(months[1]) # February
print(months[7]) # August
print(months[-1]) # December
print(months[-0]) # January
print(months[25]) # IndexError: list index out of range
- lists can contain any mix and match of the data types
list_of_random_things = [1, 3,4, 'a string', True]
- slicing: using indices to slice off parts of an object
- lower bound is inclusive; upper bound is exclusive
months = ['January', 'February', 'March', 'April', 'May', 'June', 'July', 'August', 'September', 'October', 'November', 'December']
q3 = month[6:9] # ['July', 'August', 'September']
first_half = month[:6] # ['January', 'February', 'March', 'April', 'May', 'June']
second_half = month[6:] # ['July', 'August', 'September', 'October', 'November', 'December']
- lists are most similar to strings
- both types support
len
function, indexing, and slicing
- both types support
len(str)
contains empty spaces
greeting = "Hello there"
months = ['January', 'February', 'March', 'April', 'May', 'June', 'July', 'August', 'September', 'October', 'November', 'December']
print(len(greeting), len(months)) # 11 12
print(greeting[6:9], months[6:9]) # 'the' ['July', 'August', 'September']
in
: evaluates if object on left side is included in object on right sidenot in
: evaluates if object on left side is not included in object on right side- support both string and list type
greeting = "Hello there"
print('her' in greeting, 'her' not in greeting)
# True False
months = ['January', 'February', 'March', 'April', 'May', 'June', 'July', 'August', 'September', 'October', 'November', 'December']
print('Sunday' in months, 'Sunday' not in months)
# False True
Features | Strings | Lists |
---|---|---|
Support indexing | O | O |
Support slicing | O | O |
Support len() |
O | O |
Elements | sequences of letters | any type of objects |
Mutability | X | O |
Order | O | O |
- whether an object can be modified after it has been created
- lists are mutable, strings are immutable
- immutable can also be called hashable
- whether the order of elements in an object matters
- and whether this can be used to access elements
- both lists and strings are ordered
- Data type selection is largely dependent on properties, and what you can easily do with it!
- Select the three most recent dates from this list using list slicing notation. Hint: negative indexes work in slices!
eclipse_dates = ['June 21, 2001', 'December 4, 2002', 'November 23, 2003',
'March 29, 2006', 'August 1, 2008', 'July 22, 2009',
'July 11, 2010', 'November 13, 2012', 'March 20, 2015',
'March 9, 2016']
# TODO: Modify this line so it prints the last three elements of the list
print(eclipse_dates[:-3]) # X
print(eclipse_dates[-3:]) # O, start from index of -3
- due to string is immutable, so it's call by value
name = "Jim"
student = name
name = "Tim"
print(name) # Tim
print(student) # Jim
-
len()
: how many elements in the list -
max()
: greatest elements in the list- the maximum elements in the list of number is the largest number
- the maximum elements in the list of strings is the last element when we sort alphabetically
python_varieties = ['Burmese Python', 'African Rock Python', 'Ball Python', 'Reticulated Python', 'Angolan Python'] print(max(python_varieties)) # Reticulated Python
- undefined for lists that contain elements form different incomparable types
max([42, 'African Swallow']) # TypeError: '>' not supported between instances of 'str' and 'int'
-
min()
: return the smallest elements in a list -
sorted()
: return a copy of a list in order from smallest to largest leaving the original list unchangedsizes = [15, 6, 89, 34, 65, 35] print(sorted(sizes)) # [6, 15, 34, 35, 65, 89] print(sorted(sizes, reverse=True)) # [89, 65, 35, 34, 15, 6]
-
join()
: takes a list as an argument and returns a string consisting of the list elements joined by separator stringnauticla_directions = "\n".join(["fore", "aft", "starboard", "port"]) print(nauticla_directions) ''' fore aft starboard port '''
-
we can change the separator to
-
(hyphen)names = ["Garcia", "O'Kelly", "Davis"] print("-".join(names)) # Garcia-O'Kelly-Davis
-
Although forgetting to add period will not trigger an error, but will give an unexpected results
- this happens bacause of Python's default string literal appending
names = ["Garcia", "O'Kelly", "Davis"] print("-" join(names)) # GarciaO'KellyDavis
-
join()
will trigger an error, when we join anything other than stringstuff = ["thing", 42, "nope"] print(" and " .join(stuff)) # TypeError: sequence item 1: expected str instance, int found
-
append()
: add the element to the end of the listspython_varieties = ['Burmese Python', 'African Rock Python', 'Ball Python', 'Reticulated Python', 'Angolan Python'] python_varieties.append('Blood Python') print(python_varieties) # ['Burmese Python', 'African Rock Python', 'Ball Python', 'Reticulated Python', 'Angolan Python', 'Blood Python']
-
Q1: Which of the following statements about data types and data structures are true? Select all that apply
answer option reason (O) Data structures are containers that can include different data types. Data structures are containers that organize and group data types together in different ways. For example, some of the elements that a list can contain are integers, strings, and even other lists! Data structures can only hold data of the same data type. (O) A list is an example of a data structure. All data types are data structures. (O) All data structures are data types. A data type is just a type that classifies data. This can include primitive (basic) data types like integers, booleans, and strings, as well as data structures, such as lists.
-
Tuple: data structure in Python that is an immutable ordered sequence of elements, often used to store related pieces of information
AngkorWat = (13.4125, 103.866667) print(type(AngkorWat)) # <class 'tuple'> print("Angkor Wat is at latitude: {}".format(AngkorWat[0])) # Angkor Wat is at latitude: 13.4125 print("Angkor Wat is at longitude: {}".format(AngkorWat[1])) # Angkor Wat is at longitude: 103.866667
- tuple is immutable, we cannot add or remove items from tuple for sort them in place
- Why do we have tuples if they're lists with fewer features?
- Tuples are useful when we have two or more values that are so closely related
tuple = (1, 3, 5, 7, 9)
tuple[2] # 5
tuple[1:3] #(3, 5)
dimensions = 52, 40, 100
length, width, height = dimensions # tuple unpacking
print("The dimensions are {}x{}x{}".format(length, width, height)) # The dimensions are 52x40x100
- tuple can also be used to assign multiple variables in a compact way
- the parenthese
()
are optional when making tuple, and programmer frequently omit it if parentheses don't clarify the code. - tuple unpacking: assign the information from a tuple into mutiple variables
length, width, height = 52, 40, 100
print("The dimensions are {}x{}x{}".format(length, width, height)) # The dimensions are 52x40x100
- if we don't need to use
dimensions
directly, we can shorten the code
Q2.
tuple_a = 1, 2
tuple_b = (1, 2)
print(tuple_a == tuple_b) # True
-
Set: a data type for mutable unordered collections of unique elements
-
it's useful to remove the duplicates elements
countries = ['angola', 'Maldives', 'India', 'United Stated', 'India', 'Denmark', 'Sweden', 'Ghana', ] # 777 more countries not displayed print(len(countries)) # 785 print(countries[:5]) # ['angola', 'Maldives', 'India', 'United Stated', 'India']
-
we can create set like this,
set(listName)
:countries = ['angola', 'Maldives', 'India', 'United Stated', 'India', 'Denmark', 'Sweden', 'Ghana', ] # 777 more countries not displayed country_set = set(countries) print(len(country_set)) # 196
-
set also support the
in
-operator the same way lists docountries = ['angola', 'Maldives', 'India', 'United Stated', 'India', 'Denmark', 'Sweden', 'Ghana', ] # 777 more countries not displayed country_set = set(countries) print('India' in countries) # True print('India' in country_set) # True
-
we can add elements to sets where we don't use the
append
method, we useadd
alternativelycountries = ['angola', 'Maldives', 'India', 'United Stated', 'India', 'Denmark', 'Sweden', 'Ghana', ] # 777 more countries not displayed country_set = set(countries) country_set.add("Italy") # Italy is added
-
Set also have a
pop()
method just like lists, it will remove a random element -
set is unordered, so there's no "last element"
-
It also support union, intersection and difference mathematical set methods
- it's much faster than such operators with other containers
- set is a simple data structure, they have one main use: collecting unique elements
- dictionary is more flexible
-
Dictionary: a data type for mutable objects that store mappings of unique keys to values(unordered)
- dictionary stores pairs of elements, keys and values
elements = {'hydrogen': 1, 'helium': 2, 'carbon': 6} print(elements['carbon']) # 6
-
we're adding lithium and giving it a value of three
elements = {'hydrogen': 1, 'helium': 2, 'carbon': 6} elements['lithium'] = 3 print(elements) # {'hydrogen': 1, 'helium': 2, 'carbon': 6, 'lithium': 3}
-
dictionary keys are similar to list indices, but dictionary can have keys of any immutable type, not just integers
- it's not necessary for every key to have the same type
-
dictionary also support
in
operatorelements = {'hydrogen': 1, 'helium': 2, 'carbon': 6} print('mithril' in elements) # False
-
get()
: look up a value in a dictionary- it's recommended to use
get()
rather thandictionary[key]
, because the later will product an error, that can crash program
elements = {'hydrogen': 1, 'helium': 2, 'carbon': 6} print(elements.get('hydrogen')) # 1 print(elements.get('delithium')) # None print(elements['delithium']) # KeyError: 'delithium'
- it's recommended to use
-
keys()
andvalues
: return a list of all the keys/values in the dictionary- look at the last line, it's sorted by the values(county name), not by the keys(rank)
jp_population_rank = {'3': 'osaka', '1': 'tokyo', '2': 'kanagawa'}
print(jp_population_rank.keys()) # dict_keys(['3', '1', '2'])
print(jp_population_rank.values()) # dict_values(['osaka', 'tokyo', 'kanagawa'])
print(sorted(jp_population_rank.keys())) # ['1', '2', '3']
print(sorted(jp_population_rank.values())) # ['kanagawa', 'osaka', 'tokyo']
-
we can check if a key return none with
is
operator, or the oppositeis not
operatorelements = {'hydrogen': 1, 'helium': 2, 'carbon': 6} x = elements.get('delithium') # None is_null = x is None not_null = x is not None print(is_null) # True print(not_null) # False
-
Q2: Which of these could be used as the key for a dictionary? (Choose all that apply.) Hint: Dictionary keys must be immutable, that is, they must be of a type that is not modifiable.
answer option reason (O) str list lists can be changed by adding and removing elements, they are mutable. (O) int (O) float
-
get()
returnsNone
(or a default value) if the key isn't found- In the last example, we specify the string "There's no such lement" instead of the default
None
elements.get('dilithium') # None elements['dilithium'] # KeyError: 'dilithium' elements.get('kryptonite', 'There\'s no such lement!') # "There's no such element!"
- In the last example, we specify the string "There's no such lement" instead of the default
- equality:
==
- identity:
is
a = [1, 2, 3]
b = a
c = [1, 2, 3]
print(a == b) # True
print(a is b) # True
print(a == c) # True
print(a is c) # False
expression | output | reason |
---|---|---|
a == b | True | List a and list b are equal(== ) and identical(is ) |
a is b | True | as above |
a == c | True | List c is equal(== ) to a (and b) since they have same concpet |
a is c | False | List b and list c are point to two different objects (i.e., they're not identical object) |
-
Q1: Which of the following statements about tuples are true? Select all that apply.
answer option reason A tuple is a mutable data structure. (O) A tuple is an ordered data structure. (O) A tuple can be indexed and sliced like a list. Tuple Support Indexing and Slicing A tuple is defined by listing a sequence of elements separated by commas and contained within curly braces: {}
not {}
, is()
-
Q2: Which of the following statements about sets are true? Select all that apply.
answer option reason (O) A set is a mutable data structure. we can modify the elements in a set with methods like add
andpop
.A set is an ordered data structure. A set is an unordered data structure, so you can't index and slice elements like a list; there is no sequence of positions to index with! A set can be indexed and sliced like a list. as above (O) A set does not contain duplicate elements. -
Q3: Is the following statement true or false? A set is the only data structure defined with curly braces:
{}
answer option reason True (O) False Set isn't the only data structure defined with curly braces, {}
; dictionaries does as well
a = {}
print(type(a)) # <class 'dict'>
b = set()
print(type(b)) # <class 'set'>
c = dict()
print(type(c)) # <class 'dict'>
- If you define a variable with an empty set of curly braces like this:
a = {}
, Python will assign an empty dictionary to that variable. - You can always use
set()
anddict()
to define empty sets and dictionaries as well.
# invalid dictionary - this should break
room_numbers = {
['Freddie', 'Jen']: 403,
['Ned', 'Keith']: 391,
['Kristin', 'Jazzmyne']: 411,
['Eugene', 'Zach']: 395
}
# TypeError: unhashable type: 'list'
-
Q6: What's wrong with the code above?
answer option reason The dictionary can not use a container for its keys (O) The dictionary is using a mutable datatype for its keys The lists used in the code above are NOT immutable, and thus cannot be hashed There are too many values in each dictionary key -
In Python, any immutable object (such as an integer, boolean, string, tuple) is hashable
- meaning its value does not change during its lifetime.
- This allows Python to create a unique hash value to identify it
- which can be used by dictionaries to track unique keys and sets to track unique values.
-
The code above would be fixed with a set data structure
# invalid dictionary - this should break
room_numbers = {
('Freddie', 'Jen'): 403,
('Ned', 'Keith'): 391,
('Kristin', 'Jazzmyne'): 411,
('Eugene', 'Zach'): 395
}
print(room_numbers[('Eugene', 'Zach')]) # 395
- We can include containers in other containers to create compound data structures
-
Q3: Check the attributes of a collection for which using a Python set would be appropriate.
answer option reason (O) Order in which items appear can be inconsistent Sets are not ordered, so the order in which items appear can be inconsistent You can have the same entry multiple times (O) Mutable (you can change it) Like dictionaries and lists, sets are mutable. (O) Add items with .add()
Sortable You cannot have the same item twice and you cannot sort sets. (unordered -> unsortable) -
Q4: Check the attributes of a collection for which using a Python dictionary would be appropriate.
answer option reason (O) Each item contains two parts Add items with .append()
not .append()
, is.add()
(O) Order in which items appear can be inconsistent Sortable unordered -> unsortable (O) Can be nested
- Q: What is the first key in dictionary?
- A: There's no first key.
Data Structure | Ordered | Mutable | Constructor | Example |
---|---|---|---|---|
List | Yes | Yes | [ ] or list() |
[5.7, 4, 'yes', 5.7] |
Tuple | Yes | No | ( ) or tuple() |
(5.7, 4, 'yes', 5.7) |
Set | No | Yes | {} * or set() |
{5.7, 4, 'yes'} |
Dictionary | No | No** | { } or dict() |
{'Jun': 75, 'Jul': 89} |
-
* You can use curly braces to define a set like this:
{1, 2, 3}
. However, if you leave the curly braces empty like this:{}
Python will instead create an empty dictionary. So to create an empty set, useset()
.type({1, 2, 3}) # <class 'set'> type({}) # <class 'dict'>
-
** A dictionary itself is mutable, but each of its individual keys must be immutable.
- ticker (n.) 自動收報器
- reticulate (adj.) 網狀的
- aft (n.) 船尾
- YTD (abbr.) = Year To Date 到現在為止的一年
- dice (v.) 切成方塊