Data Structures 1

Data structures are basically just that - they are structures which can hold some data together. In other words, they are used to store a collection of related data.

There are four built-in data structures in Python - list, tuple, dictionary and set. We will see how to use each of them and how they make life easier for us.

Let's categorize these data structures based on their properties:

Lists are ordered, mutable collections that can contain duplicate elements
Tuples are ordered, immutable collections that can contain duplicate elements
Dictionaries are collections of key-value pairs with unique keys (insertion order is preserved since Python 3.7)
Sets are unordered collections of unique elements
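The four categories above can be sketched side by side. This is only an illustration; the variable names and values are made up and do not come from the examples later in this chapter.

```python
# Illustrative values only (hypothetical names).
fruits = ['apple', 'mango', 'apple']   # list: ordered, mutable, duplicates allowed
point = (3, 4, 3)                      # tuple: ordered, immutable, duplicates allowed
ages = {'alice': 30, 'bob': 25}        # dictionary: key-value pairs with unique keys
unique_fruits = set(fruits)            # set: duplicate 'apple' is dropped

fruits.append('banana')                # a list can be changed in place

for collection in (fruits, point, ages, unique_fruits):
    # Print the type name and the number of items in each collection
    print(type(collection).__name__, len(collection))
```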

List

A 'list' is a data structure that holds an ordered collection of items i.e. you can store a sequence of items in a list. This is easy to imagine if you can think of a shopping list where you have a list of items to buy, except that you probably have each item on a separate line in your shopping list whereas in Python you put commas in between them.

The list of items should be enclosed in square brackets so that Python understands that you are specifying a list. Once you have created a list, you can add, remove or search for items in the list. Since we can add and remove items, we say that a list is a mutable data type i.e. this type can be altered.

Which of these correctly creates a list in Python?

nums = (1, 2, 3)
nums = [1, 2, 3]
nums = {1, 2, 3}
nums = '1, 2, 3'

Quick Introduction To Objects And Classes

Although I've been generally delaying the discussion of objects and classes till now, a little explanation is needed right now so that you can understand lists better. We will explore this topic in detail in a later chapter.

A list is an example of the usage of objects and classes. When we assign a value to a variable 'i', say the integer '5', you can think of it as creating an object (i.e. instance) 'i' of class (i.e. type) 'int'. In fact, you can read 'help(int)' to understand this better.

A class can also have methods i.e. functions defined for use with respect to that class only. You can use these pieces of functionality only when you have an object of that class. For example, Python provides an 'append' method for the 'list' class which allows you to add an item to the end of the list. For example, 'mylist.append('an item')' will add that string to the list 'mylist'. Note the use of dotted notation for accessing methods of the objects.

A class can also have fields which are nothing but variables defined for use with respect to that class only. You can use these variables/names only when you have an object of that class. Fields are also accessed by the dotted notation, for example, 'mylist.field'.
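A small sketch of the ideas above, assuming nothing beyond the built-in 'int' and 'list' classes. The name 'mylist' is the one used in the text; everything else is illustrative.

```python
i = 5
print(type(i))             # the class (type) of the object that i refers to
print(isinstance(i, int))  # True: i is an instance of int

mylist = []                # an empty list object
mylist.append('an item')   # dotted notation: call the append method of this list
print(mylist)
```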

Example (save as 'ds_using_list.py'):

shoplist = ['apple', 'mango', 'carrot', 'banana']

print('I have', len(shoplist), 'items to purchase.')

print('These items are:', end=' ')
for item in shoplist:
    print(item, end=' ')

print('\nI also have to buy rice.')
shoplist.append('rice')
print('My shopping list is now', shoplist)

print('I will sort my list now')
shoplist.sort()
print('Sorted shopping list is', shoplist)

print('The first item I will buy is', shoplist[0])
olditem = shoplist[0]
del shoplist[0]
print('I bought the', olditem)
print('My shopping list is now', shoplist)

Output:

I have 4 items to purchase.
These items are: apple mango carrot banana 
I also have to buy rice.
My shopping list is now ['apple', 'mango', 'carrot', 'banana', 'rice']
I will sort my list now
Sorted shopping list is ['apple', 'banana', 'carrot', 'mango', 'rice']
The first item I will buy is apple
I bought the apple
My shopping list is now ['banana', 'carrot', 'mango', 'rice']

How It Works

The variable 'shoplist' is a shopping list for someone who is going to the market. In 'shoplist', we only store strings of the names of the items to buy but you can add any kind of object to a list including numbers and even other lists.

We have also used the 'for..in' loop to iterate through the items of the list. By now, you must have realised that a list is also a sequence. The speciality of sequences will be discussed in a later section.

Notice the use of the 'end' parameter in the call to the 'print' function to indicate that we want to end the output with a space instead of the usual line break.

Next, we add an item to the list using the 'append' method of the list object, as discussed earlier. Then, we check that the item has indeed been added by passing the list to the 'print' function, which prints its contents neatly.

Then, we sort the list by using the 'sort' method of the list. It is important to understand that this method affects the list itself and does not return a modified list - this is different from the way strings work. This is what we mean by saying that lists are mutable and that strings are immutable.
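The difference between the two behaviours can be seen in a short sketch. The values are illustrative, and 'upper' is used here only as one example of a string method that returns a new string.

```python
shoplist = ['mango', 'apple', 'carrot']
result = shoplist.sort()     # sorts the list in place
print(result)                # None: sort() does not return the sorted list
print(shoplist)              # the original list object is now sorted

name = 'mango'
upper_name = name.upper()    # string methods return a new string instead
print(name)                  # the original string is unchanged
print(upper_name)
```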

Next, when we finish buying an item in the market, we want to remove it from the list. We achieve this by using the 'del' statement. Here, we mention which item of the list we want to remove and the 'del' statement removes it from the list for us. We specify that we want to remove the first item from the list and hence we use 'del shoplist[0]' (remember that Python starts counting from 0).

Test your understanding of list operations:

The sort() method modifies the original list
Lists can contain different types of objects
del shoplist[0] removes the last item
List indexing starts from 0

If you want to know all the methods defined by the list object, see 'help(list)' for details.

Tuple

Tuples are used to hold together multiple objects. Think of them as similar to lists, but without the extensive functionality that the list class gives you. One major feature of tuples is that they are immutable like strings i.e. you cannot modify tuples.

Tuples are defined by specifying items separated by commas within an optional pair of parentheses.

Tuples are usually used in cases where a statement or a user-defined function can safely assume that the collection of values (i.e. the tuple of values used) will not change.
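As a minimal sketch of the two points above (optional parentheses and immutability), using illustrative values:

```python
zoo = 'python', 'elephant', 'penguin'          # parentheses are optional
same_zoo = ('python', 'elephant', 'penguin')   # the same tuple, written with parentheses
print(zoo == same_zoo)

try:
    zoo[0] = 'monkey'                          # tuples do not support item assignment
except TypeError as error:
    assignment_failed = True
    print('Cannot modify a tuple:', error)
else:
    assignment_failed = False
```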

Example (save as 'ds_using_tuple.py'):

zoo = ('python', 'elephant', 'penguin')
print('Number of animals in the zoo is', len(zoo))

new_zoo = 'monkey', 'camel', zoo
print('Number of cages in the new zoo is', len(new_zoo))
print('All animals in new zoo are', new_zoo)
print('Animals brought from old zoo are', new_zoo[2])
print('Last animal brought from old zoo is', new_zoo[2][2])

Output:

Number of animals in the zoo is 3
Number of cages in the new zoo is 3
All animals in new zoo are ('monkey', 'camel', ('python', 'elephant', 'penguin'))
Animals brought from old zoo are ('python', 'elephant', 'penguin')
Last animal brought from old zoo is penguin

How It Works

The variable 'zoo' refers to a tuple of items. We see that the 'len' function can be used to get the length of the tuple. This also indicates that a tuple is a sequence as well.

We are now shifting these animals to a new zoo since the old zoo is being closed. Therefore, the 'new_zoo' tuple contains some animals which are already there along with the animals brought over from the old zoo. Back to reality, note that a tuple within a tuple does not lose its identity.

We can access the items in the tuple by specifying the item's position within a pair of square brackets just like we did for lists. This is called the indexing operator. We access the third item in 'new_zoo' by specifying 'new_zoo[2]' and we access the third item within the third item in the 'new_zoo' tuple by specifying 'new_zoo[2][2]'. This is pretty simple once you've understood the idiom.

Note for Perl programmers: A list within a list does not lose its identity i.e. lists are not flattened as in Perl. The same applies to a tuple within a tuple, or a tuple within a list, or a list within a tuple, etc. As far as Python is concerned, they are just objects stored using another object, that's all.
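A quick sketch of what "not flattened" means in practice, using made-up values:

```python
inner = [1, 2]
outer = [0, inner, 3]        # a list stored inside another list
print(len(outer))            # 3, not 4: the inner list counts as a single item
print(outer[1])              # the inner list is still a list
print(outer[1] is inner)     # True: it is the very same object

mixed = ('a', [1, 2])        # a list stored inside a tuple
print(len(mixed))            # 2
```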

Dictionary

A dictionary is like an address book where you can find the address or contact details of a person by knowing only their name, i.e. we associate keys (names) with values (details). Note that the key must be unique, just as you cannot find the correct information if two persons have exactly the same name.

Test your understanding of dictionary concepts:

Dictionary keys can be duplicated
Dictionary keys must be unique
Dictionary values must be unique
Dictionaries are ordered collections

Note that you can use only immutable objects (like strings) for the keys of a dictionary, but you can use either immutable or mutable objects for the values. More precisely, keys must be hashable, which in practice means you should use only simple objects such as strings, numbers, or tuples of these for keys.
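The rule about keys and values can be sketched as follows. The dictionary name 'contacts' and the second address are hypothetical and used only for illustration.

```python
contacts = {
    'Swaroop': ['swaroop@swaroopch.com'],   # string key, list value: mutable values are fine
    ('Larry', 'Wall'): 'larry@wall.org',    # a tuple of strings can also be a key
}
contacts['Swaroop'].append('second@example.com')  # hypothetical address

try:
    contacts[['Larry', 'Wall']] = 'larry@wall.org'  # a list cannot be a key
except TypeError as error:
    list_key_rejected = True
    print('Lists cannot be keys:', error)
else:
    list_key_rejected = False
```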

Pairs of keys and values are specified in a dictionary by using the notation 'd = {key1 : value1, key2 : value2 }'. Notice that the key-value pairs are separated by a colon and the pairs are separated themselves by commas and all this is enclosed in a pair of curly braces.

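Remember that key-value pairs in a dictionary are not sorted in any manner. Since Python 3.7, a dictionary keeps its pairs in the order in which they were inserted, but if you want any other order (for example, alphabetical by key), then you will have to sort them yourself before using them. The dictionaries that you will be using are instances/objects of the 'dict' class.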
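If a particular order is needed, one option is to sort the keys with the built-in 'sorted' function before looping. This is a sketch, not the only way to do it; the addresses are the ones used in the example below.

```python
ab = {
    'Swaroop': 'swaroop@swaroopch.com',
    'Larry': 'larry@wall.org',
    'Matsumoto': 'matz@ruby-lang.org',
}

sorted_names = sorted(ab)          # iterating over a dict yields its keys
for name in sorted_names:
    print(name, ab[name])          # visits the contacts in alphabetical order
```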

Example (save as 'ds_using_dict.py'):

ab = {
    'Swaroop': 'swaroop@swaroopch.com',
    'Larry': 'larry@wall.org',
    'Matsumoto': 'matz@ruby-lang.org',
    'Spammer': 'spammer@hotmail.com'
}

print("Swaroop's address is", ab['Swaroop'])

del ab['Spammer']

print('\nThere are {} contacts in the address-book\n'.format(len(ab)))

for name, address in ab.items():
    print('Contact {} at {}'.format(name, address))

ab['Guido'] = 'guido@python.org'

if 'Guido' in ab:
    print("\nGuido's address is", ab['Guido'])

Output:

Swaroop's address is swaroop@swaroopch.com

There are 3 contacts in the address-book

Contact Swaroop at swaroop@swaroopch.com
Contact Larry at larry@wall.org
Contact Matsumoto at matz@ruby-lang.org

Guido's address is guido@python.org

How It Works

We create the dictionary 'ab' using the notation already discussed. We then access key-value pairs by specifying the key using the indexing operator as discussed in the context of lists and tuples. Observe the simple syntax.

We can delete key-value pairs using our old friend - the 'del' statement. We simply specify the dictionary and the indexing operator for the key to be removed and pass it to the 'del' statement. There is no need to know the value corresponding to the key for this operation.

Next, we access each key-value pair of the dictionary using the 'items' method of the dictionary, which in Python 3 returns a view of tuples where each tuple contains a pair of items - the key followed by the value. We retrieve this pair and assign it to the variables 'name' and 'address' respectively for each pair using the 'for..in' loop and then print these values in the for-block.
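To see the pairs that 'items' produces, the result can be converted with 'list'. A short sketch with two of the contacts from the example:

```python
ab = {'Larry': 'larry@wall.org', 'Matsumoto': 'matz@ruby-lang.org'}

pairs = list(ab.items())     # materialise the (key, value) pairs as a list of tuples
print(pairs)

name, address = pairs[0]     # each pair unpacks into two variables
print(name, address)
```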

We can add new key-value pairs by simply using the indexing operator to access a key and assign that value, as we have done for Guido in the above case.

We can check whether a key exists in the dictionary using the 'in' operator.
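Note that 'in' looks at the keys of a dictionary, not at its values. A minimal sketch:

```python
ab = {'Guido': 'guido@python.org'}

has_guido = 'Guido' in ab                          # True: 'Guido' is a key
has_address = 'guido@python.org' in ab             # False: values are not searched
in_values = 'guido@python.org' in ab.values()      # True: search the values explicitly
has_larry = 'Larry' in ab                          # False: no such key

print(has_guido, has_address, in_values, has_larry)
```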

For the list of methods of the 'dict' class, see 'help(dict)'.