### Navigation Reminder

- **Grey cells** are **code cells**. Click inside them and type to edit.
- **Run**  code cells by pressing $ \triangleright $  in the toolbar above, or press ``` shift + enter```.
-  **Stop** a running process by clicking &#9634; in the toolbar above.
- You can **add new cells** by clicking to the left of a cell and pressing ```A``` (for above), or ```B``` (for below). 
- **Delete cells** by pressing ```X```.
- Run all code cells that import objects (such as the one below) to ensure that you can follow exercises and examples.
- Feel free to edit and experiment - you will not corrupt the original files.

# Lesson 04: Further Data Structures - Collections

As seen in the previous Basic Data Types Lessons, strings, integers and floats are object types that contain single values. You can store these as variables individually by assigning them a name. 

You can store multiple values in more complex structures called **collections**. In this lesson, we will focus on **lists** and **dictionaries**, and give a few notes on **tuples** and **sets**.

---
Questions and exercises are distributed throughout this lesson. Please run the code cell below to import them before starting the lesson. The code will not produce any visible output, but exercises and questions will be loaded for later use.

In [None]:
from QuestionsCollections import E1,E2,E3,E4,E5, question,solution

---
## Lesson Goals

- Understand the difference between lists, dictionaries, tuples, and sets
- Understand usefulness of each compound data type for projects
- Explore key list and dictionary methods
- Use sets to create lists of unique values.

**Key Concepts:** collection, list, dictionary, tuple, set

# Lists, Dictionaries, Tuples and Sets

Python provides several built-in options for storing multiple pieces of data. We'll look at these different options for storing several pieces of data and their advantages and uses.

|Structure| Description | Notation|
|---|---|---|
|**List**| A changeable, ordered collection. Elements are accessed by position. | []|
|**Dictionary**| A changeable collection of key-value pairs. Elements are accessed by name (key), not by position. | {k:v}|
|**Tuple**| An unchangeable, ordered collection (like an unchangeable list).|()|
|**Set**|An unordered collection of unique values.|{a, b} or set()|


---
# Lists

>**Lists** are ordered, mutable (changeable) collections of values.

We have seen lists before, and have been using them in our code. Lists are **enclosed by square brackets**, and each item is **separated by a comma**. 

```python
my_list = ['value A', 'value B', 1, 'value D']
```

Values in lists can be repeated, and one list can hold several data types (above, for example, we included an integer as well as text values). 

In fact, any Python object can be put into a list. We can even nest lists to make 'lists of lists', as well as include other collections inside a list.
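
As a small illustration of mixing types and nesting, here is a minimal sketch. The names `mixed` and `grid` are made up for this example and are not used elsewhere in the lesson.

```python
# A list can hold several data types at once.
mixed = ['value A', 1, 2.5, True]

# A "list of lists": each inner list is one row of a small table.
grid = [[1, 2, 3], [4, 5, 6]]

first_row = grid[0]        # the inner list [1, 2, 3]
single_value = grid[1][2]  # second row, then third item -> 6

print(first_row)
print(single_value)
```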

# Creating Lists

To create a list, you can open square brackets and specify values separated by commas, assigning it a variable, as seen above.

**Exercise 1**: Create a list with 4 string elements representing donut flavors: plain-glazed, chocolate, strawberry and lemon. Give it the name 'donuts'.

In [None]:
solution(E1)

Alternatively, we can create blank lists to be populated later. These can be made using **empty square brackets** or with the built-in function **list()**:

```python
list_name = []
```
or

```python
list_name = list()
```

We'll practice this later.

# Characteristics of Lists: Order

Lists are **ordered**, meaning that each element can be accessed by its place in the list. 

**Indexing** and **slicing** lists works in the way we saw with strings. We can access an item in a list by giving the list's name and the item's **integer index** in brackets. Like strings, lists are indexed by position, starting with **position 0** for the first item.

```python
list_name[0]
```

will return the first item in a list.

Please note that trying to access a position that doesn't exist will raise an `IndexError`, with the message "list index out of range".

You can also slice lists. 

**list_name[first position: last position+1]**

As with strings, this returns the list elements starting at the first position, up to but not including the last position given. That is, the end of the slice should be one more than the index of the last element you want. You can omit the first or last position to start from the beginning or continue to the end, respectively.
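
The indexing and slicing rules above can be sketched as follows. The list `colors` is a hypothetical example, chosen so that it does not give away the exercise below.

```python
# Hypothetical example list (not part of the lesson's exercises).
colors = ['red', 'orange', 'yellow', 'green', 'blue']

first = colors[0]        # 'red': positions start at 0
middle = colors[1:3]     # ['orange', 'yellow']: position 3 is not included
from_start = colors[:2]  # ['red', 'orange']: first position omitted
to_end = colors[3:]      # ['green', 'blue']: last position omitted

# Accessing a position that does not exist raises an IndexError.
try:
    colors[10]
except IndexError as error:
    message = str(error)

print(middle)
print(message)
```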

**Exercise 2** Retrieve the first three elements of our donut flavors list.

In [None]:
solution(E2)

# Characteristics of Lists: Mutability

Lists are **mutable**, meaning that existing elements can be changed. To do so, specify the element's position and assign it a new value.

In our example, say our donut shop has replaced 'chocolate' with 'chocolate sea salt'. 'chocolate' was the second item in the list, meaning that it has the index 1.

In the code below, we can say: 

In [None]:
donuts[1]= 'chocolate sea salt'

If we print our list, we will see that the item has been updated.

In [None]:
print(donuts)

**Exercise 3:** Try it yourself: change the item 'strawberry' to 'blueberry'.

In [None]:
solution(E3)

---
# Manipulating Lists: Functions, Operators and Methods

## Python Built-in functions and lists

Some of Python's built-in functions work well for lists.

|Function| Action|
|---|---|
|**len(list)**|Returns number of items in a list|
|**sum(list)**|Adds elements in list if these are all numerical|

The membership operators **in** and **not in** also work with lists, and will check whether a list contains a certain item or not. So for instance, we can check for the presence of 'plain-glazed' in our list of donut flavors, noting that this approach will only match a string exactly and is case-sensitive.

In [None]:
'plain-glazed' in donuts
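
For a self-contained illustration of these functions and operators, here is a short sketch. The lists `order_sizes` and `flavors` are invented for this example.

```python
# Hypothetical data: number of donuts in each order, and a few flavors.
order_sizes = [6, 12, 1, 6]
flavors = ['plain-glazed', 'chocolate', 'lemon']

n_orders = len(order_sizes)             # number of items in the list
total_donuts = sum(order_sizes)         # works because all items are numbers

has_lemon = 'lemon' in flavors          # exact match
has_capital_lemon = 'Lemon' in flavors  # case-sensitive, so no match
no_vanilla = 'vanilla' not in flavors

print(n_orders, total_donuts, has_lemon, has_capital_lemon, no_vanilla)
```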

## List Methods

Like strings, lists possess a variety of [useful methods](https://docs.python.org/3/tutorial/datastructures.html) that allow us to manipulate their contents. Unlike with strings, many of these methods act in-place, meaning that we do not need to reassign the variable to preserve any changes (but you should always check the documentation to be sure).

|Method|Action|
|--|--|
|**list.append()**|Adds a value to the end of the list|
|**list.extend()**|Adds several values from an iterable object (such as a list) to the end of the list|
|**list.insert()**|Adds a value at a given position in the list|
|**list.remove('value')**|Removes the first item with the content 'value' from the list|
|**list.reverse()**|Reverses the order of the list|
|**list.sort()**|Sorts the list based on its values|
|**list.count('value')**|Counts the number of times 'value' is in the list|
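
To see these methods in action, here is a minimal sketch using a hypothetical `toppings` list (not one of the lesson's lists). It also shows what "in-place" means: the method changes the list itself and returns `None`.

```python
# Hypothetical example list.
toppings = ['sprinkles', 'nuts']

toppings.append('coconut')          # add one value to the end
toppings.extend(['icing', 'nuts'])  # add every value from another list
toppings.insert(0, 'caramel')       # add a value at position 0

nut_count = toppings.count('nuts')  # 'nuts' now appears twice
toppings.remove('nuts')             # removes only the first 'nuts'

toppings.sort()                     # alphabetical order, in place
toppings.reverse()                  # reverse that order, in place

# In-place methods return None, so do not reassign their result.
result = toppings.sort()

print(toppings)
print(result)
```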

**Exercise 4:** We are in the process of brainstorming more flavors for donuts. So far, ideas include: apple cider, boston cream, lemon and raspberry. In the empty cell below write a script to:

1. Create a new list called 'donuts2' with these elements.
2. Add these items to the end of our first list with one of the methods above. Look at the documentation if necessary.
3. Calculate the length of the list.
4. We only need 7 flavors - drop the last item of the list.
5. We might have repeated the value 'lemon'. Print the contents of the list to see what we have. 
6. Drop the first value of 'lemon'
7. Re-add the value 'raspberry' to have 7 flavors.

The final output should be a list called donuts with 7 flavors: 'plain-glazed', 'chocolate sea salt', 'strawberry shortcake', 'apple cider', 'boston cream', 'lemon', and 'raspberry'.

(If you would like to work in steps, you can add new cells to the notebook. Do this by clicking on the left of a cell and pressing A or B to create a cell above or below.)

In [None]:
solution(E4)

---
# Dictionaries

Dictionaries are mutable collections of **key-value pairs**. Unlike lists, they do not index values by position. Instead, they index values by their key, a unique name you assign them. You can fetch values by calling their key. (Since Python 3.7, dictionaries remember the order in which keys were inserted, but you still cannot access a value by its position.)

Dictionaries function a bit like a bag of values with tags (keys). Python looks up a key efficiently to find its value, without checking the items one by one. You might also liken each key-value pair to an entry in a (printed) dictionary: you search for a word (key) to find its entry (value). Unlike a printed dictionary, however, the items are not sorted alphabetically; they are kept in the order in which they were added.

```python
dict_name = {'key1': value1,'key2': value2, 'key3': value3 }
```
Dictionaries are denoted by **curly brackets {}**. Each **key-value pair** is written as key, colon, value. Key-value pairs are separated by commas.

**Keys** must be unique, and can only be immutable objects (such as strings or numbers, but not lists). Values can be repeated and can be any sort of object (including collections, such as lists or other dictionaries). So, for instance, you could have person names as keys, and then a series of attributes as their values, stored as a list or dictionary. In the example below, we store information on artists as dictionaries within a dictionary.

In [None]:
artists = {'frida kahlo': {'birth': 1907, 'death': 1954, 'nationality': 'Mexican'},
           'pablo picasso': {'birth': 1881, 'death': 1973, 'nationality': 'Spanish'}}

In [None]:
artists

# Accessing Information in Dictionaries

We can access a value in a dictionary by indexing, not by position, but by key:

```python
dict_name['key1']
```

The above will return the value stored for key1. 

So for instance, if we call the key 'frida kahlo' in our artists dictionary, we obtain the dictionary of values stored with it:

In [None]:
artists['frida kahlo']

To continue with this somewhat complex example, the value associated with this key is itself a dictionary. We could key into it again to find one of the elements within it, for instance, Frida Kahlo's birth year:

In [None]:
artists['frida kahlo']['birth']

Looking for a key that does not exist will raise a `KeyError`.

In [None]:
artists['remedios varo']

Additionally, the **in** and **not in** operators also work with dictionary keys, a useful approach to help us determine if a key exists.

In [None]:
'frida kahlo' in artists

# Creating Dictionaries

You can specify a dictionary by listing key-value pairs in the format above and assigning it to a variable. 

You can create an empty dictionary by using curly braces {}:

```python
dict_name = {}
```

Alternatively, you can use the built-in dict() function.

```python
dict_name = dict()
```

Let's create an empty price dictionary to store prices for our donuts.

In [None]:
prices = {}

# Creating New Key-Value Pairs

Indexing (using keys, or names, not position) can be used to define new key-value pairs, by assigning a key that does not exist to a new value.

```python
dict_name['newkey']='newvalue'
```
Please keep in mind that such an assignment using an existing key would overwrite its value.
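
A minimal sketch of both behaviours, using a hypothetical `stock` dictionary that is not part of the lesson's data:

```python
# Hypothetical example: how many of each item a bakery has in stock.
stock = {}

stock['bagels'] = 12   # 'bagels' does not exist yet: a new pair is created
stock['muffins'] = 8   # another new pair

stock['bagels'] = 10   # 'bagels' already exists: its value is overwritten

print(stock)
```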

For instance, we could use a loop (Lesson 06) to go through all of the items in our list and use each element as a dictionary key, giving it a value of 1.00.

In [None]:
for donut in donuts:
    prices[donut]=1.00

If you print your dictionary, you should now have a key-value pair for each entry in your list.

In [None]:
print(prices)

**Exercise 5** Give a new value of 2.00 to the price for any of our donuts containing chocolate (chocolate sea salt and boston cream). You do not need to use loops, just reassign them individually by calling their keys. 

In [None]:
solution(E5)

# Other Useful Dictionary Methods & Functions

|Method| Action|
|---|---|
|len(d)|Return the number of items in the dictionary d.|
|del d[key]|Remove d[key] from d. Raises a KeyError if key is not in the dictionary.|
|d.clear()|Remove all items from the dictionary.|
|d.pop(key[, default])|If key is in the dictionary, remove it and return its value, else return default. If default is not given and key is not in the dictionary, a KeyError is raised.|
|d.update([other])|Update the dictionary with the key/value pairs from another collection, overwriting existing keys. Return None.|
|d.items() d.keys() d.values()|These methods respectively return a view of the key-value pairs, keys or values in the dictionary (wrap the result in list() if you need an actual list). These are useful for iterating through the dictionary with loops (Lesson 06).|
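
The table above can be tried out with a short sketch. The dictionary `menu` and its prices are invented for illustration.

```python
# Hypothetical example dictionary of items and prices.
menu = {'bagel': 1.50, 'muffin': 2.25, 'scone': 2.00}

n_items = len(menu)                    # number of key-value pairs

removed = menu.pop('scone')            # removes 'scone' and returns its value
missing = menu.pop('croissant', None)  # key absent: returns the default, no KeyError
del menu['muffin']                     # removes 'muffin' without returning anything

# update() overwrites existing keys and adds new ones.
menu.update({'bagel': 1.75, 'cookie': 1.00})

# keys(), values() and items() return views; list() turns them into lists.
keys = list(menu.keys())
values = list(menu.values())
pairs = list(menu.items())

print(keys, values, pairs)

menu.clear()                           # the dictionary is now empty
```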

---
# A Note on Tuples

Tuples are ordered, immutable collections of values. Like lists, their elements are accessed by position. The key difference from lists is that they are **immutable**, meaning that once created, they cannot be modified.

```python
my_tuple = ('value A', 'value B', 1, 'value D')
```
- Items in tuples are separated by commas.
- Tuples are conventionally delimited by parentheses (although they can technically be written without them).
- Any Python object can be put into a tuple.
- Different datatypes can coexist as values.
- Values can be repeated.

You can create them by specifying a comma-separated list of values in parentheses (note that a tuple with a single value still needs a comma after the value) or by calling the function ```tuple()```.

I introduce tuples now because you might run into them in the future, but they are rarely necessary for a beginner. They can come in handy when:
- You want to write a function that returns multiple values
- You want to use multiple pieces of information as a key in a dictionary. 
    - Because lists are mutable, they cannot be used as dictionary keys, as they could cause errors if their values are updated. 
    - Tuples are immutable, so they do not cause such errors.
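
The points above can be sketched briefly. The function `min_and_max` and the dictionary `opening_hours` are hypothetical names made up for this example.

```python
# A single-value tuple needs a trailing comma; without it you get a string.
single = ('only value',)
not_a_tuple = ('only value')

# A function that returns two values actually returns one tuple.
def min_and_max(values):
    """Return the smallest and largest value as a tuple."""
    return min(values), max(values)

lowest, highest = min_and_max([3, 1, 4])  # unpack the tuple into two names

# A tuple can combine several pieces of information into one dictionary key.
opening_hours = {('monday', 'morning'): '7:00',
                 ('monday', 'evening'): '17:00'}
hours = opening_hours[('monday', 'morning')]

# A list cannot be used as a key: Python raises a TypeError.
try:
    opening_hours[['monday', 'morning']] = '7:00'
except TypeError:
    list_key_allowed = False

print(type(single), type(not_a_tuple), lowest, highest, hours)
```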

----
# A Note on Sets

Another useful built-in data type is the set. Sets are unordered collections of **unique** values. They provide an easy way to look at distinct elements within lists.

In [None]:
donuts2 = ['plain-glazed', 'chocolate', 'plain-glazed']

In [None]:
uniquedonuts2 = set(donuts2)

In [None]:
uniquedonuts2
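
If you need a list of unique values rather than a set, you can convert the set back. The sketch below is self-contained and uses the built-in `sorted()` function, which is not covered elsewhere in this lesson, to get a predictable order. It also shows that an empty set must be created with `set()`, because `{}` creates an empty dictionary.

```python
flavors = ['plain-glazed', 'chocolate', 'plain-glazed']

unique_flavors = set(flavors)          # duplicates are dropped
unique_list = sorted(unique_flavors)   # back to a list, in alphabetical order

empty_set = set()    # an empty set
empty_dict = {}      # empty curly braces create a dictionary, not a set

print(unique_list)
print(type(empty_set), type(empty_dict))
```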

Much like with sets in mathematics, you can look at the contents of sets and compare them with some useful operators.

|Syntax | Name ||Action|
|--|--|--|--|
|set1 \| set2 | Union |<img src="Other_files/SetUnion.png" width='200'>| Return a new set with elements from the set and all others.|
|```set1 & set2```| Intersection|<img src="Other_files/SetInter.png" width='200' >| Return a new set with elements common to the set and all others.|
|```set1 - set2```| Difference|<img src="Other_files/SetDiff.png" width='200'>| Return a new set with elements in the set that are not in the others.|
|```set1 ^ set2``` |Symmetric Difference |<img src="Other_files/SetSymDiff.png" width='200'>|Return a new set with elements in either the set or the other, but not both.|

These operators are helpful when comparing two lists. With our donuts and donuts2 lists, we can use them to find differences, common values, etc. 

We can use examples to illustrate these operations. 

In [None]:
# Let's transform our two lists of donuts into sets. 
# This will remove duplicates from our lists.

uniquedonuts1 = set(donuts)
uniquedonuts2 = set(donuts2)

In [None]:
# | combines the two sets, returning the union between sets.

uniquedonuts1 | uniquedonuts2 

In [None]:
# & returns the elements that are present in both sets (the intersection of sets).

uniquedonuts1 & uniquedonuts2 

In [None]:
# - returns a set with the elements that are in the first set, but not shared with the other (the difference).

uniquedonuts1 - uniquedonuts2 

In [None]:
# And ^ returns the elements that are in either set, but not both (the symmetric difference).

uniquedonuts1^uniquedonuts2

---
# Lesson Summary

- Lists, dictionaries, tuples, and sets are all compound data types or collections, that store multiple values.
- Lists are mutable, ordered collections of items. Each item can be accessed by its position in a list.
- Dictionaries are mutable collections of items, structured as key-value pairs. Each value can be accessed by its unique key rather than by position.
- Sets can be used to find unique values in lists and compare the contents of lists.

We will get practice with choosing between collection types in lesson 06B, "Putting it All Together".

<div style="text-align:center">    
  <a href="04%20Basic%20Data%20Types II%20-%20Strings.ipynb">Previous Lesson: Basic Data Types II: Strings</a>|
   <a href="06%20Conditionals.ipynb">Next Lesson: Conditionals </a>
</div>