Dealing with multiple dimensions is difficult, and the difficulty is compounded when working with data. This blog post is a guide to the relationship between dimensions, Python lists, and Numpy arrays, with some hints and tricks for interpreting data in multiple dimensions. We provide an overview of Python lists and Numpy arrays, clarify some of the terminology, and give some helpful analogies for dealing with higher-dimensional data.
Before you create a deep neural network in TensorFlow, build a regression model, predict the price of a car, or visualize terabytes of data, you’re going to have to learn Python and deal with multidimensional data. This blog post expands on our introductory course on Python for Data Science; it will help you deal with nested lists in Python and give you some ideas about numpy arrays.
Nesting involves placing one or more Python lists inside another Python list. You can apply it to other data structures in Python, but we will stick to lists. Nesting is a useful feature in Python, but the indexing conventions can get a little confusing, so let’s clarify the process, expanding on our courses on Applied Data Science with Python. We will review the concepts of nesting lists to create 1, 2, 3 and 4-dimensional lists, then convert them to numpy arrays.
Lists are a useful datatype in Python; a list can be written as comma-separated values between square brackets. You can change the size of a Python list after you create it, and a list can contain integers, strings, floats, Python functions and much more. Indexing a one-dimensional (1-D) list in Python is straightforward: each index corresponds to an individual element of the list. Python’s list convention is shown in figure 1, where each item is accessed using the name of the list followed by an index in square brackets. For example, the first element is obtained with A[0]; in figure 1 this means that the zeroth element of the list contains the string “0”. Similarly, another element of A holds the integer 4. For the rest of this blog, we will stick with integer values and lists of uniform size, as you will see in many data science applications.
Figure 1: Indexing Conventions for a list “A”
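As a minimal sketch of these indexing conventions, the snippet below builds a small list and reads from it. The values are hypothetical and are not the contents of list “A” in figure 1; they only illustrate that a list can mix types, starts at index zero, and can grow after creation.

```python
# Hypothetical list mixing strings, integers and a float (not the list from figure 1).
A = ["0", 1, "two", 3.0, 4]

first = A[0]     # indexing starts at zero, so this is the string "0"
last = A[-1]     # negative indices count from the end of the list
A.append(5)      # unlike a numpy array, a list can grow after it is created

print(first, last, len(A))
```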
Lists are useful, but for numerical operations such as the ones you will use in data science, Python has many useful libraries; one of the most commonly used is numpy.
From Lists to 1-D Numpy Arrays
Numpy is a fast Python library for performing mathematical operations. The class “ndarray” is key to this framework; we will refer to objects of this class as numpy arrays. Some key differences from lists: numpy arrays have a fixed size, and they are homogeneous, i.e. every element has the same type, for example all floats or all strings. You can easily convert a list to a numpy array; for example, if you would like to perform vector operations, you can cast a list to a numpy array. In example 1 we import numpy, then cast two lists to numpy arrays:
import numpy as np
u=np.array([1,0])
v=np.array([0,1])
Example 1: casting the lists [1,0] and [0,1] to numpy arrays u and v.
If you check the type of u or v (type(v)) you will get “numpy.ndarray”. Although u and v point in a 2-D space, their dimension is one; you can verify this using the data attribute “ndim”. For example, v.ndim will output 1. In numpy, a dimension or axis is better understood in the context of nesting, which will be discussed in the next section. Note that sometimes the data attribute “shape” is also referred to as the dimension of the numpy array.
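The attributes mentioned above can be inspected directly. The following is a small sketch using the same u and v as in example 1; it shows the difference between “ndim” (the number of axes, or levels of nesting) and “shape” (the length along each axis).

```python
import numpy as np

u = np.array([1, 0])
v = np.array([0, 1])

print(type(v))   # the ndarray class
print(v.ndim)    # number of axes: one level of nesting
print(v.shape)   # length along each axis: a single axis with two entries
print(v.dtype)   # every element shares one type (an integer type here)
```

Here “ndim” is 1 even though the vector has two components, because the array was built from a single, un-nested list.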
The numpy array has many useful properties, for example vector addition. We can add the two arrays as follows:
z=u+v
z:array([1,1])
Example 2: add numpy arrays u and v to form a new numpy array z.
The term “z:array([1,1])” means that the variable z contains the array [1,1]. The actual vector operation is shown in figure 2, where each component of the vector has a different color.
Figure 2: Example of vector addition
Numpy arrays also follow similar conventions for vector scalar multiplication, for example, if you multiply a numpy array by an integer or float:
y=np.array([1,2])
y=2*y
y:array([2,4])
Example 3.1: multiplying the numpy array y by the scalar 2.
The equivalent vector operation is shown in figure 3:
Figure 3: Scalar multiplication of a vector, as in example 3.1
Like lists, you can access the elements by index; for example, you can access the first element of the numpy array u with u[0], which is 1. Many operations on numpy arrays differ from the corresponding vector operations. For example, in numpy, multiplication with * corresponds not to the dot product or matrix multiplication but to element-wise multiplication, like the Hadamard product. We can multiply two numpy arrays as follows:
u=np.array([1,2])
v=np.array([3,2])
z=u*v
z:array([3,4])
Example 3.2: multiplying two numpy arrays u and v
The equivalent operation is shown in figure 4:
Figure 4: multiplication of two numpy arrays expressed as a Hadamard product.
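To make the difference between element-wise multiplication and the dot product concrete, here is a short sketch using the arrays u = [1, 2] and v = [3, 2]. The variable names hadamard and dot are chosen for illustration only.

```python
import numpy as np

u = np.array([1, 2])
v = np.array([3, 2])

hadamard = u * v      # element-wise: [1*3, 2*2]
dot = np.dot(u, v)    # dot product: 1*3 + 2*2

print(hadamard)  # array with the element-wise products 3 and 4
print(dot)       # a single number, 7
```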
Nesting lists is where things get interesting, and a little confusing; this 2-D representation is important, as tables in databases, matrices, and grayscale images follow this convention. When each of the nested lists is the same size, we can view the result as a 2-D rectangular table, as shown in figure 5. The Python list “A” has three lists nested within it; each nested list is shown in a different color. Each nested list is a different row in the rectangular table, and each column represents a separate element in the list. In this case, we set the elements of the list to correspond to their row and column numbers respectively.
Figure 5: The nested list “A” represented as a table
In Python, to access an element of a nested list we use two brackets: the first index corresponds to the row number and the second index corresponds to the column. This indexing convention is shown in figure 6; the top part of the figure corresponds to the nested list, and the bottom part corresponds to the rectangular representation.
Figure 6: Index conventions for list “A” also represented as a table
Let’s see some examples in figure 7; each of its three examples shows the syntax to access a different element of A.
Figure 7: Example of indexing elements of a list.
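The two-bracket convention can be sketched as follows. The values are an assumption made for illustration: each element encodes its own row and column as a two-digit number, which may differ from the values in the figures.

```python
# Hypothetical 3x3 nested list; element "rc" sits in row r and column c (counting from 1).
A = [[11, 12, 13],
     [21, 22, 23],
     [31, 32, 33]]

row = A[1]                        # the second nested list, i.e. the second row
element = A[1][2]                 # second row, third column
column = [r[0] for r in A]        # lists have no column slicing, so we loop over rows

print(row, element, column)
```

Note that extracting a column needs a loop or comprehension with plain lists; this is one of the conveniences that numpy arrays add.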
We can also view the nesting as a tree, as we did in Python for Data Science, as shown in figure 8. The first index corresponds to the first level of the tree, and the second index corresponds to the second level.
Figure 8: The nested list “A” viewed as a tree
It turns out we can cast a nested list into a 2-D array with the same index conventions. For example, we can convert the following nested list into a 2-D array:
V=np.array([[1,0,0],[0,1,0],[0,0,1]])
Example 4: creating a 2-D array, i.e. an array with two axes
The convention for indexing is exactly the same, and we can represent the array in table form as in figure 5. In numpy the dimension of this array is 2; this may be confusing, since its three columns are linearly independent vectors that span a 3-D space. In numpy, the dimension can be seen as the number of levels of nesting. 2-D arrays share some properties with matrices, like scalar multiplication and addition. For example, adding two 2-D numpy arrays corresponds to matrix addition.
X=np.array([[1,0],[0,1]])
Y=np.array([[2,1],[1,2]])
Z=X+Y
Z:array([[3,1],[1,3]])
Example 5.1: the result of adding two numpy arrays
The resulting operation corresponds to matrix addition as shown in figure 9:
Figure 9: An example of matrix addition.
Similarly, multiplication of two arrays corresponds to an element-wise product:
X=np.array([[1,0],[0,1]])
Y=np.array([[2,1],[1,2]])
Z=X*Y
Z:array([[2,0],[0,2]])
Example 5.2: the result of multiplying numpy arrays
Or Hadamard product:
Figure 10: An example of the Hadamard product.
To perform standard matrix multiplication you would use np.dot(X,Y). In the next section, we will review some strategies to help you navigate your way through arrays in higher dimensions.
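The sketch below contrasts the element-wise product with matrix multiplication for the same X and Y used in examples 5.1 and 5.2. The @ operator is shown as an alternative spelling of np.dot for 2-D arrays; the variable names are illustrative.

```python
import numpy as np

X = np.array([[1, 0], [0, 1]])
Y = np.array([[2, 1], [1, 2]])

elementwise = X * Y       # Hadamard product: multiplies matching entries
matmul = np.dot(X, Y)     # matrix product: rows of X times columns of Y
matmul_op = X @ Y         # same matrix product written with the @ operator

print(elementwise)
print(matmul)
print(X.ndim, X.shape)    # two axes, each of length two
```

Because X is the identity matrix, the matrix product simply returns Y, while the element-wise product keeps only the diagonal of Y.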
We can nest three lists, each of which in turn has nested lists that have their own nested lists, as shown in figure 11. List “A” contains three nested lists, each color-coded. You can access the first, second and third list using A[0], A[1] and A[2] respectively. Each of these lists contains three nested lists. We can represent these nested lists as rectangular tables, as shown in figure 11. The indexing conventions apply to these lists as well; we just add a third bracket. This is also demonstrated at the bottom of figure 11, where the three rectangular tables contain the syntax to access the values shown in the tables above.
Figure 11: A list with three nested lists, each of which has three nested lists.
Figure 12 shows an example of accessing the element at index A[0][2][1], which contains the value 132. The first index, A[0], selects a list that contains three lists, which can be represented as a rectangular table. We use the second index, i.e. A[0][2], to access the last list contained in A[0]. In the table representation, this corresponds to the last row of the table. The list A[0][2] corresponds to the list [131,132,133]. As we are interested in accessing the second element, we simply append the index [1]; therefore the final result is A[0][2][1].
Figure 12: Visualization of obtaining A[0][2][1]
A helpful analogy is to think of finding a room in an apartment building on a street, as shown in Figure 13. The first index of the list represents the address on the road; in Figure 13 this is shown as depth. The second index of the list represents the floor where the room is situated, depicted by the vertical direction in Figure 13. To keep consistent with our table representation, the lower levels have a larger index. Finally, the last index of the list corresponds to the room number on a particular floor, represented by the horizontal arrow.
Figure 13: Street analogy for list indexing
For example, in figure 14 the element A[2][2][1] corresponds to building 2, on the first floor, with the room in the middle; the actual element is 332.
Figure 14: Example of list indexing using the street analogy
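The three-bracket indexing can be tried out with a small sketch. It assumes, as the values 132 and 332 in the text suggest, that each element encodes its own position as three digits (building, floor, room, each counted from 1); the list comprehension used to build A is an illustrative shortcut, not the construction used for the figures.

```python
import numpy as np

# Assumed convention: value = 100*(building+1) + 10*(floor+1) + (room+1).
A = [[[100 * (i + 1) + 10 * (j + 1) + (k + 1) for k in range(3)]
      for j in range(3)]
     for i in range(3)]

table = A[0]           # first "building": a 3x3 table
last_row = A[0][2]     # last row of that table
value = A[0][2][1]     # second element of that row
street = A[2][2][1]    # third building, last row, middle room

arr = np.array(A)      # the same data as a 3-D numpy array
print(last_row, value, street, arr.shape, arr[0, 2, 1])
```

With a numpy array the three indices can also be written inside a single pair of brackets, as in arr[0, 2, 1].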
3D Numpy Arrays
The mathematical operations for 3D numpy arrays follow similar conventions, i.e. element-wise addition and multiplication, as shown in figure 15 and figure 16. In the figures, the first index or dimension of X and Y corresponds to an element in the outer square brackets, but instead of a number, each element is a rectangular array. When we add or multiply X and Y, each element is added or multiplied independently. More precisely, each 2D array in X, represented as a table, is added to or multiplied with the corresponding 2D array in Y, as shown on the left; within those arrays, the same conventions as for 2D numpy addition and multiplication are followed.
Figure 15: Add two 3D numpy arrays X and Y.
Figure 16: Multiplying two 3D numpy arrays X and Y.
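A short sketch of these element-wise operations on 3D arrays follows. The arrays are hypothetical and are not the ones drawn in figures 15 and 16; they are built with np.arange and np.full purely for convenience.

```python
import numpy as np

# Hypothetical 3D arrays: two 2x2 tables stacked along the first axis.
X = np.arange(8).reshape(2, 2, 2)     # values 0..7
Y = np.full((2, 2, 2), 2)             # every entry is 2

total = X + Y       # each entry of X is added to the matching entry of Y
product = X * Y     # each entry of X is multiplied by the matching entry of Y

# Operating on the whole array is the same as operating table by table.
print(np.array_equal(total[0], X[0] + Y[0]))
print(np.array_equal(product[1], X[1] * Y[1]))
```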
Beyond 3D Lists
Adding another layer of nesting gets a little confusing. You can’t really visualize it, as it can be seen as a 4-dimensional problem, but let’s try to wrap our heads around it. Examining figure 17, we see that list “A” has three lists, each of which contains two lists, which in turn contain two lists nested in them. Let’s go through the process of accessing the element that contains 3122. The third element, A[2], contains two lists, each of which contains two lists; in figure 17 we use depth to distinguish them. We can access the second of these lists using the second index. This can be viewed as a table; from this point we follow the table conventions of the previous example, as illustrated in figure 17.
Figure 17: Example of an element in a list, with a list, within a list nested in list “A”
We can also use the apartment analogy, as shown in figure 18; this time the new list index is represented by the street name, e.g. 1st Street and 2nd Street. As before, the second list index represents the address, the third list index represents the floor number and the fourth index represents the apartment number. The analogy is summarized in Figure 18. For example, directions to an element of A would be: 2nd Street, Building 1, Floor 0, Room 0.
Figure 18: Street analogy for figure 17
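The four-level case can be sketched in the same way. The list below uses the nesting described in the text (three lists, each holding two lists of two lists) with an assumed innermost length of two; its values are simply 0 to 23 and do not reproduce the values in figure 17.

```python
import numpy as np

# Hypothetical 4-level nested list with shape 3 x 2 x 2 x 2, filled with 0..23.
A = np.arange(24).reshape(3, 2, 2, 2).tolist()

# Street analogy: street 2, building 1, floor 0, room 0.
element = A[2][1][0][0]

arr = np.array(A)     # converting back gives a 4-D numpy array
print(element, arr.ndim, arr.shape, arr[2, 1, 0, 0])
```

Each extra level of nesting adds one bracket for lists and one axis for the numpy array, so “ndim” here is 4.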
We see that you can store multiple dimensions of data as a Python list. Similarly, a Numpy array is a more widely used method to store and process data. In both cases, you can access each element of the list using square brackets. Although Numpy arrays behave like vectors and matrices, there are some subtle differences in many of the operations and terminology. Finally, when navigating your way through higher dimensions it’s helpful to use analogies.