NumPy Tutorial For Beginners

What is NumPy?

NumPy is an open source Python library that is used in almost every field of science and engineering. Its core is a powerful N-dimensional array object, that is, an array with any number of dimensions. In NumPy, dimensions are also called axes. An array with a single dimension is known as a vector, while a matrix is an array with two dimensions. For arrays with three or more dimensions, the term tensor is also commonly used.

An array is a collection of "items" of the same type. Each item can be accessed by its index, and every item takes up a block of memory of the same size.

[Figure: 1-D array]

[Figure: 2-D array]

Differences between NumPy array and the standard Python list

The Python core library provides a list object that is similar to an array, but there are some differences between a NumPy array and a Python list:

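NumPy arrays have a fixed size once created, and changing the size creates a new array. Python lists can grow dynamically.
All elements in a NumPy array must have the same data type, whereas a Python list can hold elements of any type.
NumPy arrays are usually much faster than lists for numerical work on large data, because the elements are stored contiguously and operations run in compiled code.
NumPy provides optimized functions that operate on whole arrays, such as built-in linear algebra operations.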
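The differences listed above can be seen in a short sketch. The variable names are illustrative only; the behaviour shown is that of the `+` operator and of type conversion in np.array().

```python
import numpy as np

py_list = [1, 2, 3]
np_arr = np.array([1, 2, 3])

# `+` concatenates lists but adds arrays element by element.
list_result = py_list + py_list
arr_result = np_arr + np_arr
print(list_result)   # [1, 2, 3, 1, 2, 3]
print(arr_result)    # [2 4 6]

# A list may mix types; an array converts everything to one common dtype.
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)   # float64
```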

Installing NumPy

Python comes with a built-in package manager, pip, which can install, upgrade, or uninstall packages published on the Python Package Index (PyPI).

You can install the NumPy package from the command line (for example, cmd on Windows) by entering:

python -m pip install --user numpy

How to import NumPy

In order to start using NumPy and all of the functions available in NumPy, you’ll need to import it. This can be easily done with this import statement:

import numpy

By convention, numpy is usually imported under the shorter alias np:

import numpy as np

Creating NumPy Arrays

There are several ways to create a NumPy array. This tutorial uses the np.array(), np.arange(), np.zeros(), and np.ones() functions.

np.array() method

In general, numerical data arranged in lists and tuples in Python can be converted to NumPy arrays through the use of the array() function.

Creating a 1-dimensional array

>>> import numpy as np
>>> a = np.array([10,20,30])
>>> print(a)
[10 20 30]

Creating a 2-dimensional array

>>> x = np.array([[10,20,30],[40,50,60]])
>>> print(x)
[[10 20 30]
 [40 50 60]]
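Tuples work the same way as lists. The following sketch (variable names are illustrative) converts a tuple and a list of tuples, then checks the number of dimensions with the ndim attribute.

```python
import numpy as np

from_tuple = np.array((1.5, 2.5, 3.5))      # tuple -> 1-D array
from_nested = np.array([(1, 2), (3, 4)])    # list of tuples -> 2-D array

print(from_tuple.ndim)    # 1
print(from_nested.ndim)   # 2
```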

np.arange() method

np.arange() creates an array of regularly spaced values. Like Python's range(), it accepts a start, a stop (which is excluded), and a step. A few examples are given here:

>>> a = np.arange(5)
>>> print(a)
[0 1 2 3 4]

>>> a = np.arange(3.0)
>>> print(a)
[0. 1. 2.]

>>> a = np.arange(3,8)
>>> print(a)
[3 4 5 6 7]

>>> a = np.arange(1,10,2)
>>> print(a)
[1 3 5 7 9]
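The step can also be negative or fractional. The sketch below shows both, and adds np.linspace() as an alternative (not covered in this tutorial) for cases where you want to fix the number of points rather than the step.

```python
import numpy as np

# A negative step counts down; the stop value is still excluded.
down = np.arange(10, 0, -3)         # values 10, 7, 4, 1

# A fractional step produces a float array.
quarters = np.arange(0, 1, 0.25)    # values 0.0, 0.25, 0.5, 0.75

# np.linspace takes the number of points and includes the end value.
points = np.linspace(0, 1, 5)       # values 0.0, 0.25, 0.5, 0.75, 1.0

print(down)
print(quarters)
print(points)
```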

np.zeros() method

np.zeros() creates an array filled with zeros. The default data type is float. Pass an integer for a 1-D array, or a tuple giving the shape for more dimensions.

>>> a = np.zeros(3)
>>> print(a)
[0. 0. 0.]

>>> x = np.zeros((2, 3))
>>> print(x)
[[0. 0. 0.]
 [0. 0. 0.]]

np.ones() method

np.ones() creates an array filled with ones. The default data type is float.

>>> a = np.ones(4)
>>> print(a)
[1. 1. 1. 1.]

>>> x = np.ones((2,3))
>>> print(x)
[[1. 1. 1.]
 [1. 1. 1.]]

Specifying your data type

While np.zeros() and np.ones() default to floating point (np.float64), you can explicitly specify the data type you want with the dtype keyword.

>>> a = np.ones(2, dtype = np.int64)
>>> print(a)
[1 1]
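The sketch below illustrates how the data type is chosen when you do not specify one, and how to convert an existing array. The variable names are illustrative; note that the default integer type depends on the platform.

```python
import numpy as np

# np.array() infers the dtype from the data.
ints = np.array([1, 2, 3])
floats = np.array([1.0, 2.0, 3.0])
print(ints.dtype)     # platform default integer, e.g. int64
print(floats.dtype)   # float64

# The dtype keyword works with the other creation functions too.
z = np.zeros(3, dtype=np.int32)

# astype() returns a converted copy; float -> int drops the fractional part.
truncated = np.array([1.7, 2.2, 3.9]).astype(np.int64)
print(truncated)      # [1 2 3]
```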

You can read more about NumPy data types in the NumPy documentation.

Array Indexing

You can access an element of an array using its index. Indices start at 0.

[Figure: 1-D array index]

>>> a = np.array([10, 20, 30, 40])
>>> print(a[0])
10
>>> total = a[1] + a[3]    # avoid naming a variable sum, which hides the built-in
>>> print(total)
60

[Figure: 2-D array index]

>>> b = np.array([[10, 20, 30] , [40, 50, 60]])
>>> print(b[0][1])
20

>>> total = b[1][0] + b[1][1] + b[1][2]
>>> print(total)
150
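NumPy also accepts a comma-separated index inside a single pair of brackets, which is the more common way to index a 2-D array. A minimal sketch, reusing the array b from above:

```python
import numpy as np

b = np.array([[10, 20, 30], [40, 50, 60]])

# b[row, column] selects the same element as b[row][column].
print(b[0, 1])     # 20

# Negative indices count from the end of each axis.
print(b[-1, -1])   # 60

# A single index on a 2-D array returns a whole row.
print(b[1])        # [40 50 60]
```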

Iteration over NumPy arrays

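You can iterate over the elements of an array using a for loop. Iterating over a 2-D array yields one row at a time, so a nested loop is needed to reach each element. The following code demonstrates this: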

Program (iterate_array.py)

import numpy as np

a = np.array([10,20,32,45])
for element in a:
    print(element, end = ' ')

print()

b = np.array([[18.5, 19.3], [20.1, 21.0],  [23.7, 24.9]])
for row in b:
    for element in row:
        print(element, end = ' ')
    print()

Output

10 20 32 45 
18.5 19.3 
20.1 21.0 
23.7 24.9
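If you also need the position of each element, or want to avoid the nested loop, NumPy offers np.ndenumerate() and the flat attribute. These are not used elsewhere in this tutorial; the sketch below is one possible way to apply them.

```python
import numpy as np

b = np.array([[18.5, 19.3], [20.1, 21.0]])

# np.ndenumerate yields (index, value) pairs for every element.
pairs = list(np.ndenumerate(b))
for index, value in pairs:
    print(index, value)

# b.flat iterates over all elements as if the array were 1-D.
flat_values = [float(v) for v in b.flat]
print(flat_values)
```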

Concatenating Arrays

You can use the np.concatenate() function to join a sequence of arrays along an existing axis.

For example, suppose you have two arrays a and b:

>>> a = np.array([1, 2, 3, 4])
>>> b = np.array([5, 6, 7, 8])

You can concatenate them with np.concatenate().

>>> c = np.concatenate((a,b))
>>> print(c)
[1 2 3 4 5 6 7 8]

Or, if you start with these arrays:

>>> x = np.array([[1, 2], [3, 4]])
>>> y = np.array([[5, 6]])
>>> z = np.concatenate((x, y), axis=0)
>>> print(z)
[[1 2]
 [3 4]
 [5 6]]

>>> z = np.concatenate((x, y.T), axis=1)
>>> print(z)
[[1 2 5]
 [3 4 6]]

>>> z = np.concatenate((x, y), axis=None)   # axis=None flattens the arrays first
>>> print(z)
[1 2 3 4 5 6]
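For 2-D arrays, np.vstack() and np.hstack() are shorthand for the two most common cases. They are not part of the examples above; this sketch simply checks that they give the same results as np.concatenate() with axis=0 and axis=1.

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])
y = np.array([[5, 6]])

v = np.vstack((x, y))      # stack rows: same as axis=0 for 2-D arrays
h = np.hstack((x, y.T))    # stack columns: same as axis=1 for 2-D arrays

print(np.array_equal(v, np.concatenate((x, y), axis=0)))     # True
print(np.array_equal(h, np.concatenate((x, y.T), axis=1)))   # True
```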

Splitting Arrays

You can use the np.split() function to split an array into multiple sub-arrays. When the second argument is an integer, the array is divided into that many equal parts.

>>> arr = np.arange(9)
>>> a,b,c = np.split(arr,3)
>>> print(a)
[0 1 2]
>>> print(b)
[3 4 5]
>>> print(c)
[6 7 8]
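np.split() can also cut at specific positions, and np.array_split() handles sizes that do not divide evenly. A short sketch, with illustrative variable names:

```python
import numpy as np

arr = np.arange(9)

# A list of indices gives the cut points: arr[:2], arr[2:5], arr[5:].
first, middle, last = np.split(arr, [2, 5])
print(first, middle, last)   # [0 1] [2 3 4] [5 6 7 8]

# np.split raises ValueError if an equal division is impossible;
# np.array_split allows pieces of unequal length instead.
pieces = np.array_split(arr, 4)
print([len(p) for p in pieces])   # [3, 2, 2, 2]
```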

Sorting Arrays

The np.sort() function returns a sorted copy of an array in ascending order. The original array is left unchanged.

>>> a = np.array([2, 1, 5, 3, 7, 4, 6, 8])
>>> b = np.sort(a)
>>> print(b)
[1 2 3 4 5 6 7 8]

>>> x = np.array([[10,4],[3,1]])
>>> y = np.sort(x)                # sort along the last axis
>>> print(y)
[[ 4 10]
 [ 1  3]]

>>> y = np.sort(x, axis=None)     # sort the flattened array
>>> print(y)
[ 1  3  4 10]

>>> y = np.sort(x, axis=0)        # sort along the first axis
>>> print(y)
[[ 3  1]
 [10  4]]
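np.sort() always sorts in ascending order. The sketch below shows one common way to get descending order, and uses np.argsort() (not covered above) to obtain the indices that would sort the array.

```python
import numpy as np

a = np.array([2, 1, 5, 3])

# Reverse the ascending result to get descending order.
descending = np.sort(a)[::-1]
print(descending)   # [5 3 2 1]

# np.argsort returns the indices that would sort the array.
order = np.argsort(a)
print(order)        # [1 0 3 2]
print(a[order])     # [1 2 3 5]

# Note: a.sort() sorts in place and returns None, unlike np.sort(a).
```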

Shape and Size of an array

The shape of the array is a tuple of integers giving the size of the array along each dimension. For example, the shape of a 2-dimensional array of size 2 x 3 is (2,3).

The size of an array is the total number of elements of the array. This is equal to the product of the elements of shape.

>>> x = np.array([[10,4,9],[3,1,8]])
>>> print(x.shape)
(2, 3)

>>> print(x.size)
6
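The relationship between shape and size can be checked directly. This sketch also shows the ndim attribute and len(), which are related but not the same thing.

```python
import numpy as np

x = np.array([[10, 4, 9], [3, 1, 8]])

print(x.ndim)    # 2 (number of axes, i.e. the length of x.shape)
print(len(x))    # 2 (length of the first axis only)

# size is the product of the entries of shape.
rows, cols = x.shape
print(x.size == rows * cols)   # True
```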

Reshaping an array

The arr.reshape() method returns an array containing the same data with a new shape. For example, if you have a 1-D array of 12 elements and want to change it into a 2-D array of shape (3, 4), you can use reshape() like this:

>>> arr = np.arange(12)
>>> print(arr)
[ 0  1  2  3  4  5  6  7  8  9 10 11]

>>> b = arr.reshape(3, 4)
>>> print(b)
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
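Two details of reshape() are worth knowing: one dimension may be given as -1 so that NumPy infers it, and the new shape must hold exactly the same number of elements. A minimal sketch:

```python
import numpy as np

arr = np.arange(12)

# -1 asks NumPy to infer that dimension from the total size.
b = arr.reshape(3, -1)
print(b.shape)    # (3, 4)

# 12 elements cannot fill a 5 x 3 array, so this raises ValueError.
try:
    arr.reshape(5, 3)
    reshape_failed = False
except ValueError as err:
    reshape_failed = True
    print("cannot reshape:", err)
```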

Array Slicing

You can index and slice NumPy arrays in the same ways you can slice Python lists.

A slice is a span of items taken from an array. To get a slice of an array, you write an expression in the following general format:

arr[start : end]

Here, start is the index of the first element in the slice, and end is the index at which the slice stops. The element at end is not included.

>>> a = np.array([10, 20, 30, 40, 50, 60])
>>> print(a[1:3])
[20 30]

If you omit the first index, the slice starts at the beginning.

>>> print(a[:4])
[10 20 30 40]

If you omit the second, the slice goes to the end.

>>> print(a[3:])
[40 50 60]

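So, if you omit both, the slice spans the whole array. Note that, unlike slicing a Python list, slicing a NumPy array returns a view of the same data rather than a copy.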

>>> print(a[:])
[10 20 30 40 50 60]
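Slices also accept an optional step, and a basic slice of a NumPy array shares memory with the original array. The sketch below demonstrates both points; use the copy() method when you need independent data.

```python
import numpy as np

a = np.array([10, 20, 30, 40, 50, 60])

# arr[start:end:step] takes every step-th element.
print(a[::2])     # [10 30 50]
print(a[::-1])    # [60 50 40 30 20 10]

# A slice is a view: writing to it changes the original array.
view = a[1:3]
view[0] = 99
print(a)          # [10 99 30 40 50 60]

# copy() gives independent data, so the original stays as it is.
independent = a[1:3].copy()
independent[0] = -1
print(a[1])       # 99
```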

Boolean arrays as Indices

To select values from a NumPy array, you can use a Boolean array as the index. In the simplest case, the Boolean array has the same shape as the array being indexed.

Let’s create an array to be indexed.

>>> a = np.array([10, 20, 30, 40, 50, 60])

Now, create a Boolean array of the same shape.

>>> x = np.array([True, True, False, True, False, True])

Indexing a with x selects the elements of a at the positions where x is True.

>>> b = a[x]
>>> print(b)
[10 20 40 60]

If you want to select values from your array that fulfill certain conditions, it’s straightforward with NumPy.

For example, if you start with this array:

>>> a = np.array([[10 , 20, 30, 40], [50, 60, 70, 80], [90, 100, 110, 120]])

A comparison such as a >= 50 produces a Boolean array of the same shape as a. Use it as an index to select all values that are greater than or equal to 50. The result is always a 1-D array.

>>> x = (a >= 50)
>>> b = a[x]
>>> print(b)
[ 50  60  70  80  90 100 110 120]

You can also combine conditions with the & (and) and | (or) operators. Each condition must be wrapped in its own parentheses:

>>> x = (a > 20) & (a < 110)
>>> b = a[x]
>>> print(b)
[ 30  40  50  60  70  80  90 100]
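The | operator works the same way as &, and ~ negates a condition. A Boolean index can also be used on the left-hand side of an assignment. The sketch below uses illustrative variable names and works on a copy so that a itself is not modified.

```python
import numpy as np

a = np.array([[10, 20, 30, 40], [50, 60, 70, 80], [90, 100, 110, 120]])

# | selects elements that satisfy at least one condition.
either = a[(a < 30) | (a > 100)]
print(either)       # [ 10  20 110 120]

# ~ inverts a Boolean array.
not_small = a[~(a < 100)]
print(not_small)    # [100 110 120]

# Assigning through a Boolean index changes only the selected elements.
capped = a.copy()
capped[capped > 100] = 100
print(capped.max())   # 100
```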

Arithmetic Operations on Array

Addition

>>> a = np.array([10,20,30])
>>> b = np.array([1,2,3])
>>> c = a + b
>>> print(c)
[11 22 33]

Subtraction

>>> a = np.array([10,20,30])
>>> b = np.array([1,2,3])
>>> c = a - b
>>> print(c)
[ 9 18 27]

Division

>>> a = np.array([10,20,30])
>>> b = np.array([1,2,3])
>>> c = a / b
>>> print(c)
[10. 10. 10.]

Multiplication

>>> a = np.array([10,20,30])
>>> b = np.array([1,2,3])
>>> c = a * b
>>> print(c)
[10 40 90]

Multiplication by a Scalar

>>> a = np.array([10,20,30])
>>> c = a * 5
>>> print(c)
[ 50 100 150]
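All of the operations above work element by element. Multiplying by a scalar is the simplest case of what NumPy calls broadcasting: a smaller operand is stretched to match the shape of the larger one. The sketch below adds a 1-D array to every row of a 2-D array; the variable names are illustrative.

```python
import numpy as np

m = np.array([[1, 2, 3], [4, 5, 6]])   # shape (2, 3)
row = np.array([10, 20, 30])           # shape (3,)

# row is added to each row of m.
shifted = m + row
print(shifted)
# [[11 22 33]
#  [14 25 36]]

# Shapes that cannot be matched raise ValueError.
try:
    m + np.array([1, 2])
    broadcast_failed = False
except ValueError as err:
    broadcast_failed = True
    print("shapes do not broadcast:", err)
```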

More Useful Array Operations

max() Method

Return the maximum of an array or maximum along an axis.

One Dimensional Array

>>> a = np.array([10, 20, 30, 40])
>>> b = a.max()
>>> print(b)
40

Two Dimensional Array

>>> a = np.array([[0, 1, 6], [2, 4, 1]])
>>> b = np.max(a)
>>> print(b)
6

>>> b = np.max(a, axis=0)   # max of each column
>>> print(b)
[2 4 6]

>>> b = np.max(a, axis=1)   # max of each row
>>> print(b)
[6 4]
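When you need the position of the maximum rather than its value, np.argmax() can be used. It is not covered above; this sketch reuses the same 2-D array and converts the flat position back to a row and column with np.unravel_index().

```python
import numpy as np

a = np.array([[0, 1, 6], [2, 4, 1]])

# Without an axis, argmax returns an index into the flattened array.
flat_pos = np.argmax(a)
print(flat_pos)     # 2

# Convert the flat index back to (row, column).
row, col = np.unravel_index(flat_pos, a.shape)
print(row, col)     # 0 2

# With axis=0, argmax gives the row index of the maximum in each column.
col_pos = np.argmax(a, axis=0)
print(col_pos)      # [1 1 0]
```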

min() Method

Return the minimum of an array or minimum along an axis.

One Dimensional Array

>>> a = np.array([10, 20, 30, 40])
>>> b = a.min()
>>> print(b)
10

Two Dimensional Array

>>> a = np.array([[0, 1, 6], [2, 4, 1]])
>>> b = np.min(a)
>>> print(b)
0

>>> b = np.min(a, axis=0)   # min of each column
>>> print(b)
[0 1 1]

>>> b = np.min(a, axis=1)   # min of each row
>>> print(b)
[0 1]

sum() Method

Sum of array elements over a given axis.

One Dimensional Array

>>> a = np.array([10, 20, 30, 40])
>>> b = a.sum()
>>> print(b)
100

Two Dimensional Array

>>> a = np.array([[0, 1, 6], [2, 4, 1]])
>>> b = np.sum(a)
>>> print(b)
14

>>> b = np.sum(a, axis=0)   # sum of each column
>>> print(b)
[2 5 7]

>>> b = np.sum(a, axis=1)   # sum of each row
>>> print(b)
[7 7]

prod() Method

Return the product of array elements over a given axis.

One Dimensional Array

>>> a = np.array([10, 20, 30, 40])
>>> b = a.prod()
>>> print(b)
240000

Two Dimensional Array

>>> a = np.array([[0, 1, 6], [2, 4, 1]])
>>> b = np.prod(a)
>>> print(b)
0

>>> b = np.prod(a, axis=0)   # product of each column
>>> print(b)
[0 4 6]

>>> b = np.prod(a, axis=1)   # product of each row
>>> print(b)
[0 8]

mean() Method

Compute the arithmetic mean along the specified axis.

Returns the average of the array elements. The average is taken over the flattened array by default, otherwise over the specified axis.

One Dimensional Array

>>> a = np.array([9.2, 10.7, 6.8, 9, 3.4, 5.7, 5.7])
>>> b = np.mean(a)
>>> print(b)
7.214285714285715

Two Dimensional Array

>>> a = np.array([[9.2, 10.7, 6.8],[ 9, 3.4, 5.7]])
>>> b = np.mean(a)
>>> print(b)
7.466666666666668

>>> b = np.mean(a, axis=0)   # mean of each column
>>> print(b)
[9.1  7.05 6.25]

>>> b = np.mean(a, axis=1)   # mean of each row
>>> print(b)
[8.9        6.03333333]

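median() Method

Compute the median along the specified axis. The median is computed for the flattened array by default, otherwise over the specified axis.

One Dimensional Array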

>>> a = np.array([9.2, 10.7, 6.8, 9, 3.4, 5.7, 5.7])
>>> b = np.median(a)
>>> print(b)
6.8

Two Dimensional Array

>>> a = np.array([[9.2, 10.7, 6.8],[ 9, 3.4, 5.7]])
>>> b = np.median(a)
>>> print(b)
7.9

>>> b = np.median(a, axis=0)   # median of each column
>>> print(b)
[9.1  7.05 6.25]

>>> b = np.median(a, axis=1)   # median of each row
>>> print(b)
[9.2 5.7]
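The mean and the median can differ a lot when the data contains an outlier. The following sketch uses made-up values to show that a single large value pulls the mean up while the median stays put.

```python
import numpy as np

# Illustrative data with one outlier.
data = np.array([1, 2, 3, 4, 100])

mean_value = np.mean(data)
median_value = np.median(data)
print(mean_value)     # 22.0
print(median_value)   # 3.0
```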

std() Method

Compute the standard deviation, a measure of the spread of a distribution, along the specified axis. The standard deviation is computed for the flattened array by default, otherwise over the specified axis.

One Dimensional Array

>>> a = np.array([9.2, 10.7, 6.8, 9, 3.4, 5.7, 5.7])
>>> b = np.std(a)
>>> print(b)
2.3479039718916295

Two Dimensional Array

>>> a = np.array([[9.2, 10.7, 6.8],[ 9, 3.4, 5.7]])
>>> b = np.std(a)
>>> print(b)
2.4465395062323347

>>> b = np.std(a, axis=0)   # std of each column
>>> print(b)
[0.1  3.65 0.55]

>>> b = np.std(a, axis=1)   # std of each row
>>> print(b)
[1.60623784 2.29830856]

var() Method

Compute the variance, a measure of the spread of a distribution, along the specified axis. The variance is computed for the flattened array by default, otherwise over the specified axis.

One Dimensional Array

>>> a = np.array([9.2, 10.7, 6.8, 9, 3.4, 5.7, 5.7])
>>> b = np.var(a)
>>> print(b)
5.512653061224489

Two Dimensional Array

>>> a = np.array([[9.2, 10.7, 6.8],[ 9, 3.4, 5.7]])
>>> b = np.var(a)
>>> print(b)
5.985555555555556

>>> b = np.var(a, axis=0)   # var of each column
>>> print(b)
[1.00000e-02 1.33225e+01 3.02500e-01]

>>> b = np.var(a, axis=1)   # var of each row
>>> print(b)
[2.58       5.28222222]
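By default, np.std() and np.var() divide by the number of elements N (the population formulas). Passing ddof=1 divides by N - 1 instead, which gives the sample variance. The sketch below checks these relationships numerically rather than quoting exact values.

```python
import numpy as np

a = np.array([9.2, 10.7, 6.8, 9, 3.4, 5.7, 5.7])
n = len(a)

population_var = np.var(a)        # divides by N
sample_var = np.var(a, ddof=1)    # divides by N - 1

# The standard deviation is the square root of the variance.
print(np.isclose(np.std(a) ** 2, population_var))               # True

# The sample variance is larger by a factor of N / (N - 1).
print(np.isclose(sample_var, population_var * n / (n - 1)))     # True
```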
