A NumPy array is a data object from the Python library NumPy, which is used to store objects of a data type. Since it is programmed much closer to memory than comparable Python data objects, it can store data sets more efficiently and thus also be processed faster.
What is NumPy?
Python offers a variety of data structures that can be used to store data without additional libraries. However, these structures, such as Python lists, are only very poorly suited for mathematical operations. Adding two lists of numbers element by element can quickly be detrimental to performance when dealing with large amounts of data.
For this reason, NumPy was developed, as it offers the possibility to perform numerical operations quickly and efficiently. Especially important are calculations from the field of linear algebra, such as matrix multiplications.
How to define a NumPy Array?
As the name suggests, the array is part of the NumPy library. So you have to import it before you can use the array. This is then simply created by inserting the elements in square brackets. Here the order of the elements plays a role:
What are the properties of an Array?
The array is a collection of elements that must all be of the same data type. Mostly numbers are stored in it, but strings can be stored just as well. Only a mixture of different data types in an array is not possible.
To describe the structure of arrays, three properties are essential, which are often confused:
- Number of dimensions: The dimension in an array indicates how many indexes are needed to query a specific element of the array. Each of the dimensions can be used to store related information. For example, if you want to analyze a company’s sales figures by time, you can use one dimension for sales on different days and another dimension for sales in a month.
- Shape: The shape specifies the size of all dimensions that the array contains. For example, if you have a three-dimensional array, you will get back a tuple of length three as a result. The first element of the tuple indicates the number of elements in the first dimension, the second the number of elements in the second dimension, and so on.
- Size: Finally, the size of the array indicates how many numbers or elements are stored in the array in total. Specifically, this is the product of the individual elements returned by the Shape.
We now define a two-dimensional NumPy array, which has three elements in each of the two dimensions:
How are multidimensional arrays constructed?
In the application one dimension is often not sufficient to write facts completely. NumPy arrays are also suitable for storing multi-dimensional objects. An example for this is a table, which consists of two dimensions, namely the rows and the columns. This can also be defined relatively easily by specifying the rows as a list of lists:
Similarly, other dimensions can be added. For example, we can create an object that simply contains the table from the previous example twice. Thus we get a three-dimensional object:
How to get single elements from an array?
Since NumPy arrays are very similar in structure to Python lists, it is not surprising that arrays also work with so-called indices to access elements. This means that you can query individual elements by their position. The counting starts at 0 and goes upwards.
In the same way, you can also work with negative indices and thus run through the array from the back. In contrast to the positive indices, however, this starts at -1, which stands for the last element of the array. The -2 is then correspondingly the index of the penultimate element of the array:
In multidimensional arrays, a single index is not sufficient to query individual elements. In general, to be able to query an element and not get a list of elements, you must specify an index for each dimension. In our multidimensional array, we query the second element from the first dimension:
What is the difference between a Python list and a NumPy array?
At this point in the article, you might think that NumPy arrays are simply an alternative to Python lists, which even have the disadvantage of only being able to store data of a single data type, whereas lists also store a mixture of strings and numbers. However, there must be reasons why the developers of NumPy have decided to introduce a new data element with the array.
The main advantage of NumPy arrays over Python Lists is the memory efficiency and associated speed of reads and writes. In many applications this may not really matter, however, when dealing with millions or even billions of elements, it can offer significant time savings. Thus, arrays are often used in complex applications to develop high-performance systems.
This is what you should take with you
- The NumPy array is a data element from the NumPy library that can store multidimensional arrays of information.
- Compared to a Python list, it can only store elements of the same data type. However, in return, it is much more memory efficient and thus more powerful.
- The basic properties of NumPy arrays are the number of dimensions, the shape, and the size.
Thanks to Deepnote for sponsoring this article! Deepnote offers me the possibility to embed Python code easily and quickly on this website and also to host the related notebooks in the cloud.
What are Conditional Statements in Python?
Learn how to use conditional statements in Python. Understand if-else, nested if, and elif statements for efficient programming.
What is XOR?
Explore XOR: The Exclusive OR operator's role in logic, encryption, math, AI, and technology.
How can you do Python Exception Handling?
Unlocking the Art of Python Exception Handling: Best Practices, Tips, and Key Differences Between Python 2 and Python 3.
What are Python Modules?
Explore Python modules: understand their role, enhance functionality, and streamline coding in diverse applications.
What are Python Comparison Operators?
Master Python comparison operators for precise logic and decision-making in programming.
What are Python Inputs and Outputs?
Master Python Inputs and Outputs: Explore inputs, outputs, and file handling in Python programming efficiently.
Other Articles on the Topic of NumPy Arrays
You can find the documentation of an array here.
Niklas Lang
I have been working as a machine learning engineer and software developer since 2020 and am passionate about the world of data, algorithms and software development. In addition to my work in the field, I teach at several German universities, including the IU International University of Applied Sciences and the Baden-Württemberg Cooperative State University, in the fields of data science, mathematics and business analytics.
My goal is to present complex topics such as statistics and machine learning in a way that makes them not only understandable, but also exciting and tangible. I combine practical experience from industry with sound theoretical foundations to prepare my students in the best possible way for the challenges of the data world.