If you are a Python programmer looking to learn about statistical programming, you have come to the right place. Here, we take a look at the numpy.mean() and numpy.average() functions of Python’s NumPy library.
NumPy is the fundamental package for scientific computing with Python. It contains among other things:
- a powerful N-dimensional array object
- tools for integrating C/C++ and Fortran code
- sophisticated (broadcasting) functions
- useful linear algebra, Fourier transform, and random number capabilities
We can also use NumPy as an efficient multi-dimensional container of generic data. One has the freedom to define arbitrary data-types. This allows NumPy to seamlessly and speedily integrate with a wide variety of databases.
A NumPy array is a powerful N-dimensional array object organized in rows and columns. We can initialize NumPy arrays from nested Python lists and access their elements. Before performing these NumPy operations, the first question is how to install NumPy.
To install NumPy, go to your command prompt and type "pip install numpy".
Import: You can then import the package with: import numpy as np
Single-dimensional Numpy Array:
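As a minimal sketch of what such an array looks like, assuming NumPy is installed and imported as np: the variable names and values below are illustrative only. The second array shows the nested-list case for comparison.

```python
import numpy as np

# Build a one-dimensional array from a plain Python list (illustrative values).
arr = np.array([1, 2, 3])
print(arr.ndim)    # number of dimensions
print(arr.shape)   # length along each dimension
print(arr[0])      # first element

# A nested list gives a two-dimensional array of rows and columns.
grid = np.array([[1, 2, 3], [4, 5, 6]])
print(grid.shape)  # (rows, columns)
print(grid[1, 2])  # element in the second row, third column
```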
Moving forward with this Python NumPy tutorial, let’s look at two other useful NumPy functions: mean and average.
np.mean always computes an arithmetic mean, and has some additional options for input and output (e.g. what datatypes to use, where to place the result).
np.average can compute a weighted average if we supply it with the parameter weights.
mean(a, axis=None, dtype=None, out=None, keepdims=<no value>)
- It computes the arithmetic mean along the specified axis and returns the average of the array elements. We take the average over the flattened array by default, otherwise over the specified axis.
- a: array_like
Array containing the numbers whose mean is desired. If a is not an array, a conversion is attempted.
- axis: None or int or tuple of ints, optional
Axis or axes along which we compute the means. The default is to compute the mean of the flattened array.
If this is a tuple of ints, a mean is performed over multiple axes, instead of a single axis or all the axes as before.
- dtype: data-type, optional
Type to use in computing the mean. For integer inputs, the default is float64; for floating point inputs, it is the same as the input dtype.
- out: ndarray, optional
Alternate output array in which to place the result. The default is None; if provided, it must have the same shape as the expected output, but the type will be cast if necessary.
- keepdims: bool, optional
If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the input array.
If the default value is passed, then keepdims will not be passed through to the mean method of sub-classes of ndarray; however, any non-default value will be. If the sub-class’ method does not implement keepdims, an exception will be raised.
- Returns m: ndarray (see the dtype parameter above)
If out=None, returns a new array containing the mean values, otherwise a reference to the output array is returned.
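The dtype and out parameters described above can be sketched as follows. This is an illustration under assumed inputs: the names counts, small, result and ref are hypothetical, not from the NumPy documentation.

```python
import numpy as np

# Illustrative integer data; integer input is averaged in float64 by default.
counts = np.array([[1, 2], [3, 4]])
print(np.mean(counts).dtype)                   # float64

# float32 input stays float32 unless dtype asks for something wider.
small = np.ones((2, 3), dtype=np.float32)
print(np.mean(small).dtype)                    # float32
print(np.mean(small, dtype=np.float64).dtype)  # float64

# out must match the shape of the result: one value per column here.
result = np.empty(2)
ref = np.mean(counts, axis=0, out=result)
print(result)         # [2. 3.]
print(ref is result)  # True: a reference to the output array is returned
```

Passing dtype=np.float64 for float32 input trades some memory for a more accurate sum, which can matter for large arrays.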
Here are some examples:
>>> a = np.array([[1, 2], [3, 4]])
>>> np.mean(a)
2.5
>>> np.mean(a, axis=0)
array([2., 3.]) # array([(1+3)/2, (2+4)/2])
>>> np.mean(a, axis=1)
array([1.5, 3.5]) # array([(1+2)/2, (3+4)/2])
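The tuple form of axis and the keepdims option are easier to see on a three-dimensional array. The following is a small sketch with an assumed array named cube, chosen only for illustration.

```python
import numpy as np

# Illustrative 3-D array with shape (2, 3, 4) holding the values 0..23.
cube = np.arange(24).reshape(2, 3, 4)

# Average over axes 0 and 1 at once, leaving one mean per position on the last axis.
per_last = np.mean(cube, axis=(0, 1))
print(per_last)         # [10. 11. 12. 13.]

# keepdims=True keeps the reduced axis in the result with size one...
row_means = np.mean(cube, axis=2, keepdims=True)
print(row_means.shape)  # (2, 3, 1)

# ...so the result broadcasts against the input, e.g. to center each row.
centered = cube - row_means
print(centered.shape)   # (2, 3, 4)
```

Without keepdims=True, row_means would have shape (2, 3) and the subtraction would fail to broadcast against the (2, 3, 4) input.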
average(a, axis=None, weights=None, returned=False)
- Computes the weighted average along the specified axis.
- a : array_like
Array containing the data to be averaged. If a is not an array, a conversion is attempted.
- axis : None or int or tuple of ints, optional
Axis or axes along which to average a. The default, axis=None, will average over all of the elements of the input array. If the axis is negative it counts from the last to the first axis.
If the axis is a tuple of ints, averaging is performed on all of the axes specified in the tuple instead of a single axis or all the axes as before.
- weights: array_like, optional
An array of weights associated with the values in a. Each value in a contributes to the average according to its associated weight. The weights array can either be 1-D (in which case its length must be the size of a along the given axis) or of the same shape as a. If weights=None, then all data in a are assumed to have a weight equal to one.
- returned: bool, optional
Default is False. If True, the tuple (average, sum_of_weights) is returned, otherwise, only the average is returned. If weights=None, sum_of_weights is equivalent to the number of elements over which the average is taken.
- Returns average, [sum_of_weights]: array_type or double
Return the average along the specified axis. When returned is True, return a tuple with the average as the first element and the sum of the weights as the second element. The return type is float if a is of integer type; otherwise, it is of the same type as a. sum_of_weights is of the same type as average.
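- Raises ZeroDivisionError when all weights along the axis are zero. See numpy.ma.average for a version robust to this type of error.
- Raises an error (TypeError or ValueError, depending on the NumPy version and on whether axis is given) when the length of 1D weights is not the same as the size of a along the axis.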
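To make the weights, returned, and error behaviour concrete, here is a hedged sketch. The array data and the flag names are illustrative assumptions. The second try block catches both TypeError and ValueError because which of the two is raised for mismatched weights can differ between NumPy versions.

```python
import numpy as np

# Illustrative data: 3 rows, 2 columns -> [[0, 1], [2, 3], [4, 5]].
data = np.arange(6).reshape(3, 2)

# 1-D weights along axis=1 need length 2, the size of data along that axis.
avg, weight_sum = np.average(data, axis=1, weights=[0.25, 0.75], returned=True)
print(avg)         # [0.75 2.75 4.75]
print(weight_sum)  # [1. 1. 1.]

# Weights that sum to zero along the axis raise ZeroDivisionError.
try:
    np.average(data, axis=1, weights=[0, 0])
    zero_weights_failed = False
except ZeroDivisionError:
    zero_weights_failed = True
print(zero_weights_failed)

# Three weights cannot be matched to an axis of size two.
try:
    np.average(data, axis=1, weights=[1, 2, 3])
    mismatch_failed = False
except (TypeError, ValueError):
    mismatch_failed = True
print(mismatch_failed)
```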
However, the main difference between np.mean() and np.average() is that numpy.average can compute a weighted average when given the weights parameter, whereas numpy.mean always computes the plain arithmetic mean.
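A short side-by-side sketch of that difference, using assumed values for scores and weights (they are not taken from the NumPy documentation):

```python
import numpy as np

# Illustrative scores and weights (assumed values).
scores = np.array([1, 2, 3, 4])
weights = np.array([4, 3, 2, 1])

plain = np.mean(scores)                         # (1 + 2 + 3 + 4) / 4
unweighted = np.average(scores)                 # no weights: same as the mean
weighted = np.average(scores, weights=weights)  # (1*4 + 2*3 + 3*2 + 4*1) / 10

print(plain)       # 2.5
print(unweighted)  # 2.5
print(weighted)    # 2.0
```

Without weights the two functions agree. With weights, the larger weights on the smaller scores pull the result down from 2.5 to 2.0.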
So, this was a brief introduction to two NumPy functions, numpy.mean() and numpy.average(). This brings us to the end of this tutorial, and the difference between these two functions should now be clear.