A NumPy array is a data object from the Python library NumPy, which is used to store objects of a data type. Since it is programmed much closer to memory than comparable Python data objects, it can store data sets more efficiently and thus also be processed faster.
What is NumPy?
Python offers a variety of data structures that can be used to store data without additional libraries. However, these structures, such as Python lists, are only very poorly suited for mathematical operations. Adding two lists of numbers element by element can quickly be detrimental to performance when dealing with large amounts of data.
For this reason, NumPy was developed, as it offers the possibility to perform numerical operations quickly and efficiently. Especially important are calculations from the field of linear algebra, such as matrix multiplications.
How to define a NumPy Array?
As the name suggests, the array is part of the NumPy library. So you have to import it before you can use the array. This is then simply created by inserting the elements in square brackets. Here the order of the elements plays a role:
What are the properties of an Array?
The array is a collection of elements that must all be of the same data type. Mostly numbers are stored in it, but strings can be stored just as well. Only a mixture of different data types in an array is not possible.
To describe the structure of arrays, three properties are essential, which are often confused:
- Number of dimensions: The dimension in an array indicates how many indexes are needed to query a specific element of the array. Each of the dimensions can be used to store related information. For example, if you want to analyze a company’s sales figures by time, you can use one dimension for sales on different days and another dimension for sales in a month.
- Shape: The shape specifies the size of all dimensions that the array contains. For example, if you have a three-dimensional array, you will get back a tuple of length three as a result. The first element of the tuple indicates the number of elements in the first dimension, the second the number of elements in the second dimension, and so on.
- Size: Finally, the size of the array indicates how many numbers or elements are stored in the array in total. Specifically, this is the product of the individual elements returned by the Shape.
We now define a two-dimensional NumPy array, which has three elements in each of the two dimensions:
How are multidimensional arrays constructed?
In the application one dimension is often not sufficient to write facts completely. NumPy arrays are also suitable for storing multi-dimensional objects. An example for this is a table, which consists of two dimensions, namely the rows and the columns. This can also be defined relatively easily by specifying the rows as a list of lists:
Similarly, other dimensions can be added. For example, we can create an object that simply contains the table from the previous example twice. Thus we get a three-dimensional object:
How to get single elements from an array?
Since NumPy arrays are very similar in structure to Python lists, it is not surprising that arrays also work with so-called indices to access elements. This means that you can query individual elements by their position. The counting starts at 0 and goes upwards.
In the same way, you can also work with negative indices and thus run through the array from the back. In contrast to the positive indices, however, this starts at -1, which stands for the last element of the array. The -2 is then correspondingly the index of the penultimate element of the array:
In multidimensional arrays, a single index is not sufficient to query individual elements. In general, to be able to query an element and not get a list of elements, you must specify an index for each dimension. In our multidimensional array, we query the second element from the first dimension:
What is the difference between a Python list and a NumPy array?
At this point in the article, you might think that NumPy arrays are simply an alternative to Python lists, which even have the disadvantage of only being able to store data of a single data type, whereas lists also store a mixture of strings and numbers. However, there must be reasons why the developers of NumPy have decided to introduce a new data element with the array.
The main advantage of NumPy arrays over Python Lists is the memory efficiency and associated speed of reads and writes. In many applications this may not really matter, however, when dealing with millions or even billions of elements, it can offer significant time savings. Thus, arrays are often used in complex applications to develop high-performance systems.
This is what you should take with you
- The NumPy array is a data element from the NumPy library that can store multidimensional arrays of information.
- Compared to a Python list, it can only store elements of the same data type. However, in return, it is much more memory efficient and thus more powerful.
- The basic properties of NumPy arrays are the number of dimensions, the shape, and the size.
Thanks to Deepnote for sponsoring this article! Deepnote offers me the possibility to embed Python code easily and quickly on this website and also to host the related notebooks in the cloud.
Other Articles on the Topic of NumPy Arrays
You can find the documentation of an array here.