Data Structures 2
Sequences
Lists, tuples and strings are examples of sequences, but what are sequences and what is so special about them?
The major features are membership tests (i.e. the 'in' and 'not in' expressions) and indexing operations, which allow us to fetch a particular item in the sequence directly.
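As a minimal sketch of membership tests, the snippet below applies 'in' and 'not in' to a list, a tuple and a string. The variable names are illustrative and do not come from the examples in this chapter.

```python
# Membership tests work the same way on lists, tuples and strings.
# All names below are illustrative.
fruits = ['apple', 'mango', 'carrot', 'banana']   # a list
point = (3, 4)                                    # a tuple
word = 'banana'                                   # a string

has_mango = 'mango' in fruits    # 'mango' is an item of the list
has_five = 5 in point            # 5 is not an item of the tuple
lacks_z = 'z' not in word        # 'z' does not occur in the string
has_nan = 'nan' in word          # for strings, 'in' also matches substrings

print(has_mango, has_five, lacks_z, has_nan)
```

Note that for strings the test looks for a substring, while for lists and tuples it looks for a whole item.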
The three types of sequences mentioned above (lists, tuples and strings) also have a slicing operation, which allows us to retrieve a slice of the sequence, i.e. a part of the sequence.
Example (save as 'ds_seq.py'):
shoplist = ['apple', 'mango', 'carrot', 'banana']
# Indexing (subscription) operation
print('Item 0 is', shoplist[0])
print('Item 1 is', shoplist[1])
print('Item 2 is', shoplist[2])
print('Item 3 is', shoplist[3])
print('Item -1 is', shoplist[-1])
print('Item -2 is', shoplist[-2])
# Slicing operation
print('Item 1 to 3 is', shoplist[1:3])
print('Item 2 to end is', shoplist[2:])
print('Item 1 to -1 is', shoplist[1:-1])
print('Item start to end is', shoplist[:])
Output:
Item 0 is apple
Item 1 is mango
Item 2 is carrot
Item 3 is banana
Item -1 is banana
Item -2 is carrot
Item 1 to 3 is ['mango', 'carrot']
Item 2 to end is ['carrot', 'banana']
Item 1 to -1 is ['mango', 'carrot']
Item start to end is ['apple', 'mango', 'carrot', 'banana']
How It Works
First, we see how to use indexes to get individual items of a sequence. This is also referred to as the subscription operation. Whenever you specify a number to a sequence within square brackets as shown above, Python will fetch you the item corresponding to that position in the sequence. Remember that Python starts counting numbers from 0. Hence, 'shoplist[0]' fetches the first item and 'shoplist[3]' fetches the fourth item in the 'shoplist' sequence.
The index can also be a negative number, in which case, the position is calculated from the end of the sequence. Therefore, 'shoplist[-1]' refers to the last item in the sequence and 'shoplist[-2]' fetches the second last item in the sequence.
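The relationship between positive and negative indexes can be sketched as follows. This is only an illustration; the helper names ('n', 'out_of_range' and so on) are assumptions introduced here, not part of 'ds_seq.py'.

```python
# Positive and negative indexes on a list; helper names are illustrative.
shoplist = ['apple', 'mango', 'carrot', 'banana']

n = len(shoplist)
# A negative index -k refers to the same item as index len(seq) - k.
last_item = shoplist[-1]
same_as_last = shoplist[n - 1]
second_last = shoplist[-2]

# An index outside the valid range raises IndexError.
try:
    shoplist[4]
    out_of_range = False
except IndexError:
    out_of_range = True

print(last_item, same_as_last, second_last, out_of_range)
```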
The slicing operation is used by specifying the name of the sequence followed by an optional pair of numbers separated by a colon within square brackets. Note that this is very similar to the indexing operation you have been using till now. Remember the numbers are optional but the colon isn't.
The first number (before the colon) in the slicing operation refers to the position from where the slice starts and the second number (after the colon) indicates where the slice will stop at. If the first number is not specified, Python will start at the beginning of the sequence. If the second number is left out, Python will stop at the end of the sequence. Note that the slice returned starts at the start position and will end just before the end position i.e. the start position is included but the end position is excluded from the sequence slice.
Thus, 'shoplist[1:3]' returns a slice of the sequence starting at position 1, includes position 2 but stops at position 3 and therefore a slice of two items is returned. Similarly, 'shoplist[:]' returns a copy of the whole sequence.
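The inclusive start and exclusive end described above can be checked with a short sketch. It also shows that a full slice of a list produces a new list object with the same contents. The names 'middle', 'whole' and the flags are illustrative.

```python
shoplist = ['apple', 'mango', 'carrot', 'banana']

middle = shoplist[1:3]       # positions 1 and 2; position 3 is excluded
# With in-range positions, the length of the slice is end - start.
middle_len = len(middle)

whole = shoplist[:]          # both numbers omitted: a copy of the whole list
is_equal = whole == shoplist         # the contents are the same
is_same_object = whole is shoplist   # but it is a different list object

print(middle, middle_len, is_equal, is_same_object)
```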
You can also do slicing with negative positions. Negative numbers are used for positions from the end of the sequence. For example, 'shoplist[:-1]' will return a slice of the sequence which excludes the last item of the sequence but contains everything else.
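A few more slices with negative positions, as a rough sketch. The variable names are illustrative, and the string 'Swaroop' is borrowed from a later example in this chapter.

```python
shoplist = ['apple', 'mango', 'carrot', 'banana']

all_but_last = shoplist[:-1]   # everything except the last item
last_two = shoplist[-2:]       # the final two items
inner = shoplist[1:-1]         # drop the first and the last item

# The same expression works on a string.
name = 'Swaroop'
trimmed = name[1:-1]           # drop the first and the last character

print(all_but_last, last_two, inner, trimmed)
```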
You can also provide a third argument for the slice, which is the step for the slicing (by default, the step size is 1):
>>> shoplist = ['apple', 'mango', 'carrot', 'banana']
>>> shoplist[::1]
['apple', 'mango', 'carrot', 'banana']
>>> shoplist[::2]
['apple', 'carrot']
>>> shoplist[::3]
['apple', 'banana']
>>> shoplist[::-1]
['banana', 'carrot', 'mango', 'apple']
Notice that when the step is 2, we get the items at positions 0, 2, ... When the step size is 3, we get the items at positions 0, 3, etc.
Try various combinations of such slice specifications using the Python interpreter interactively i.e. the prompt so that you can see the results immediately. The great thing about sequences is that you can access tuples, lists and strings all in the same way!
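To illustrate that lists, tuples and strings are accessed in the same way, the sketch below runs identical indexing and slicing expressions over one sequence of each type. The names 'shop_list', 'shop_tuple', 'word' and 'results' are assumptions made for this example.

```python
shop_list = ['apple', 'mango', 'carrot', 'banana']
shop_tuple = ('apple', 'mango', 'carrot', 'banana')
word = 'banana'

results = []
for seq in (shop_list, shop_tuple, word):
    first = seq[0]            # indexing
    every_other = seq[::2]    # slicing with a step of 2
    backwards = seq[::-1]     # slicing with a step of -1
    results.append((first, every_other, backwards))
    print(type(seq).__name__, first, every_other, backwards)
```

Each slice has the same type as the sequence it was taken from: slicing a list gives a list, slicing a tuple gives a tuple, and slicing a string gives a string.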
Set
Sets are unordered collections of simple objects. These are used when the existence of an object in a collection is more important than the order or how many times it occurs.
>>> bri = set(['brazil', 'russia', 'india'])
>>> 'india' in bri
True
>>> 'usa' in bri
False
>>> bric = bri.copy()
>>> bric.add('china')
>>> bric.issuperset(bri)
True
>>> bri.remove('russia')
>>> bri & bric # OR bri.intersection(bric)
{'brazil', 'india'}
How It Works
If you remember basic set theory mathematics from school, then this example is fairly self-explanatory. But if not, you can google "set theory" and "Venn diagram" to better understand our use of sets in Python.
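For readers who prefer code to Venn diagrams, here is a small sketch of a few more set operations. The 'visits' set and the result names are illustrative; 'bri' and 'bric' mirror the interactive session above.

```python
# Duplicates are discarded when a set is built.
visits = set(['india', 'brazil', 'india', 'india'])
unique_count = len(visits)

bri = set(['brazil', 'russia', 'india'])
bric = bri | set(['china'])      # union, same as bri.union(['china'])
only_in_bric = bric - bri        # difference, same as bric.difference(bri)
is_subset = bri.issubset(bric)   # every item of bri is also in bric

print(unique_count, only_in_bric, is_subset)
```

Because sets are unordered, the order in which Python prints the items of a set may differ from the order in which they were added.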
References
When you create an object and assign it to a variable, the variable only refers to the object and does not represent the object itself! That is, the variable name points to that part of your computer's memory where the object is stored. This is called binding the name to the object.
Example (save as 'ds_reference.py'):
print('Simple Assignment')
shoplist = ['apple', 'mango', 'carrot', 'banana']
mylist = shoplist  # mylist is just another name pointing to the same object!
del shoplist[0]  # I purchased the first item, so I remove it from the list
print('shoplist is', shoplist)
print('mylist is', mylist)
print('Copy by making a full slice')
mylist = shoplist[:] # make a copy by doing a full slice
del mylist[0] # remove first item
print('shoplist is', shoplist)
print('mylist is', mylist)
Output:
Simple Assignment
shoplist is ['mango', 'carrot', 'banana']
mylist is ['mango', 'carrot', 'banana']
Copy by making a full slice
shoplist is ['mango', 'carrot', 'banana']
mylist is ['carrot', 'banana']
How It Works
Most of the explanation is available in the comments.
Remember that if you want to make a copy of a list or a similar sequence or complex object (not a simple object such as an integer), then you have to use the slicing operation to make a copy. If you just assign the variable name to another name, both of them will refer to the same object, and this could cause trouble if you are not careful.
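One way to see the difference between a second name and a copy is the 'is' operator, which tests whether two names refer to the same object. The sketch below is illustrative; 'alias' and 'copied' are names chosen for this example.

```python
shoplist = ['apple', 'mango', 'carrot', 'banana']

alias = shoplist        # a second name bound to the same list object
copied = shoplist[:]    # a new list object holding the same items

alias_is_same = alias is shoplist    # both names refer to one object
copy_is_same = copied is shoplist    # the copy is a separate object

shoplist.append('rice')
# The change is visible through the alias but not through the copy.
alias_len = len(alias)
copied_len = len(copied)

print(alias_is_same, copy_is_same, alias_len, copied_len)
```

The full slice makes a shallow copy: the new list is a separate object, but any mutable items stored inside it are still shared with the original list.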
More About Strings
We have already discussed strings in detail earlier. What more can there be to know? Well, did you know that strings are also objects and have methods which do everything from checking part of a string to stripping spaces? In fact, you've already been using a string method... the 'format' method!
The strings that you use in programs are all objects of the class 'str'. Some useful methods of this class are demonstrated in the next example. For a complete list of such methods, see 'help(str)'.
Example (save as 'ds_str_methods.py'):
name = 'Swaroop'
if name.startswith('Swa'):
    print('Yes, the string starts with "Swa"')
if 'a' in name:
    print('Yes, it contains the string "a"')
if name.find('war') != -1:
    print('Yes, it contains the string "war"')
delimiter = '_*_'
mylist = ['Brazil', 'Russia', 'India', 'China']
print(delimiter.join(mylist))
Output:
Yes, the string starts with "Swa"
Yes, it contains the string "a"
Yes, it contains the string "war"
Brazil_*_Russia_*_India_*_China
How It Works
Here, we see a lot of the string methods in action. The 'startswith' method is used to find out whether the string starts with the given string. The 'in' operator is used to check if a given string is a part of the string.
The 'find' method is used to locate the position of the given substring within the string; 'find' returns -1 if it is unsuccessful in finding the substring. The 'str' class also has a neat method to 'join' the items of a sequence with the string acting as a delimiter between each item of the sequence and returns a bigger string generated from this.
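The return values of these methods can be inspected directly, as in the sketch below. The variable names are illustrative, and 'split' is an additional 'str' method used here only to show the round trip with 'join'.

```python
name = 'Swaroop'

position = name.find('war')       # index at which 'war' begins
missing = name.find('xyz')        # -1 because 'xyz' is not in the string
starts = name.startswith('Swa')

# join accepts any sequence of strings, for example a tuple.
joined = ', '.join(('Brazil', 'Russia', 'India', 'China'))
# split with the same delimiter recovers the items as a list.
parts = joined.split(', ')

print(position, missing, starts, joined, parts)
```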
Summary
We have explored the various built-in data structures of Python in detail. These data structures will be essential for writing programs of reasonable size.
Now that we have a lot of the basics of Python in place, we will next see how to design and write a real-world Python program.