What is percentile in Python?

The NumPy percentile() function in Python computes the nth percentile of the array elements along the specified axis. In statistics, a percentile is the value below which a given percentage of the values in a data set fall.

In this article, I will explain the syntax of NumPy percentile() and how to use this function to compute percentiles of 1-D and 2-D arrays.

If you are in a hurry, below are some quick examples of how to use NumPy percentile() function.


# Below are the quick examples
import numpy as np

# Example 1: Create a 1-D array
arr = np.array([2, 3, 5, 8, 9, 4])
# Get the 50th percentile of 1-D array
arr2 = np.percentile(arr, 50)

# Example 2: Get the 75th percentile of 1-D array  
arr2 = np.percentile(arr, 75)

# Example 3: Create 2-D array
arr = np.array([[6, 8, 4],[ 9, 5, 7]])
# Get the 50th percentile of 2-D array               
arr2 = np.percentile(arr, 50)

# Example 4: Get the percentile along the axis = 0               
arr2 = np.percentile(arr, 75, axis=0)

# Example 5: Get the percentile along the axis = 1                 
arr2 = np.percentile(arr, 75, axis=1)

# Example 6: Get the percentile of an array with axis=1 and keepdims=True
arr2 = np.percentile(arr, 75, axis=1, keepdims=True)

2. Syntax of NumPy percentile()

Following is the syntax of the numpy.percentile() function.


# Syntax of numpy.percentile() 
numpy.percentile(a, q, axis=None, out=None, overwrite_input=False, method='linear', keepdims=False)

2.1 Parameters of percentile()

The percentile() function accepts the following parameters.

  • a – array_like. The input array, or an object that can be converted to an array.
  • q – array_like of float. The percentile or sequence of percentiles to compute; each value must be between 0 and 100 inclusive.
  • axis – The axis or axes along which the percentile is computed. By default (axis=None), the percentile is computed over the flattened array. For a 2-D array, axis=0 gives one result per column and axis=1 gives one result per row.
  • out – An alternative output array in which to place the result. It must have the same shape as the expected output.
  • overwrite_input – If True, the input array may be modified by intermediate calculations to save memory. Its contents after the call are undefined.
  • method – The method used to estimate the percentile when it falls between two data points. The default is 'linear'. This parameter was named interpolation before NumPy 1.22.
  • keepdims – If True, the reduced axes are kept in the result as dimensions of size one, so the result broadcasts correctly against the input array.
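
To make the second argument more concrete, here is a minimal sketch that passes a sequence of percentiles in a single call and then shows what happens when a requested percentile is outside the 0 to 100 range. The variable names quartiles and error_message are my own and are used only for illustration.

```python
import numpy as np

arr = np.array([2, 3, 5, 8, 9, 4])

# A sequence of percentiles returns one value per requested percentile
quartiles = np.percentile(arr, [25, 50, 75])
print(quartiles.tolist())

# Output
# [3.25, 4.5, 7.25]

# A percentile outside the 0-100 range raises a ValueError
try:
    np.percentile(arr, 150)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)
```

Requesting several percentiles in one call is usually more convenient than calling the function once per percentile.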

2.2 Return Value of percentile()

It returns a scalar when a single percentile is requested with axis=None. Otherwise, it returns an array of percentile values computed along the specified axis.
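
The following sketch illustrates how the shape of the return value changes with the arguments. It assumes the same 2-D array used later in this article; the names single, per_column, and multi are chosen only for this example.

```python
import numpy as np

arr = np.array([[6, 8, 4], [9, 5, 7]])

# One percentile, no axis: a single scalar value
single = np.percentile(arr, 50)
print(np.ndim(single))   # 0

# One percentile along axis=0: one value per column
per_column = np.percentile(arr, 50, axis=0)
print(per_column.shape)  # (3,)

# Several percentiles along axis=1: the first dimension indexes the percentiles
multi = np.percentile(arr, [25, 50, 75], axis=1)
print(multi.shape)       # (3, 2)
```

When more than one percentile is requested, the first axis of the result corresponds to the percentiles, and the remaining axes are what is left of the input after the reduction.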

3. Usage of NumPy percentile() Function

In statistics, a percentile is a term that describes how a score compares to other scores from the same set. While there is no universal definition of percentile, it is commonly expressed as the percentage of values in a set of data scores that fall below a given value. The general rule is that if a value is in the nth percentile, it is greater than n percent of the values in the set.

For example, if a student scores in the 90th percentile, then 90% of the students scored lower than that student and 10% scored higher.
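
The student example can be checked with a short sketch. The scores here are hypothetical (the integers 1 through 100), and the names p90, below, and above are my own. The idea is to compute the 90th percentile and then measure what fraction of scores fall on each side of it.

```python
import numpy as np

# Hypothetical exam scores: the integers 1 through 100
scores = np.arange(1, 101)

# Value at the 90th percentile
p90 = np.percentile(scores, 90)

# Fraction of scores strictly below and strictly above that value
below = (scores < p90).mean()
above = (scores > p90).mean()
print(below, above)

# Output
# 0.9 0.1
```

With small or tied data sets the fractions will not always come out this cleanly, because the percentile value is interpolated between data points.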

Let’s compute the percentile value of a one-dimensional array using the numpy.percentile() function.


import numpy as np
# Create a 1-D array
arr = np.array([2, 3, 5, 8, 9, 4])

# Get the 50th percentile of 1-D array
arr2 = np.percentile(arr, 50)
print(arr2)

# Output
# 4.5

# Get the 75th percentile of 1-D array  
arr2 = np.percentile(arr, 75)
print(arr2)

# Output
# 7.25
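
To see where 4.5 and 7.25 come from, here is a rough sketch of the linear interpolation that NumPy uses by default. The helper linear_percentile is hypothetical and written only for illustration; it is not part of NumPy and it ignores edge cases such as empty input.

```python
import math
import numpy as np

def linear_percentile(values, q):
    """Hypothetical helper: percentile by linear interpolation between ranks."""
    data = sorted(values)
    # Fractional position of the percentile in the sorted data
    rank = (q / 100) * (len(data) - 1)
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    # Interpolate between the two neighbouring data points
    return data[lower] + fraction * (data[upper] - data[lower])

values = [2, 3, 5, 8, 9, 4]
manual_50 = linear_percentile(values, 50)
manual_75 = linear_percentile(values, 75)
print(manual_50, manual_75)

# Output
# 4.5 7.25
```

For the 75th percentile, the sorted data is [2, 3, 4, 5, 8, 9] and the rank is 0.75 * 5 = 3.75, so the result lies three quarters of the way from 5 to 8, which is 7.25.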

4. Get the Percentile Value of 2-D Array

Let’s take a 2-dimensional array and compute the percentile value using the numpy.percentile() function. When no axis is given, the percentile is computed over the flattened array. For example,


# Create 2-D array
arr = np.array([[6, 8, 4],[ 9, 5, 7]])
# Get the 50th percentile of 2-D array               
arr2 = np.percentile(arr, 50)
print(arr2)

# Output
# 6.5
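
As a quick check of the flattening behavior, the sketch below compares the result with the percentile of the explicitly flattened array and with np.median(), since the 50th percentile is the median. The variable names are mine.

```python
import numpy as np

arr = np.array([[6, 8, 4], [9, 5, 7]])

# Percentile of the 2-D array with no axis given
whole = np.percentile(arr, 50)

# Percentile of the same data flattened to 1-D
flat = np.percentile(arr.ravel(), 50)

# The 50th percentile is the median
median = np.median(arr)
print(whole, flat, median)

# Output
# 6.5 6.5 6.5
```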

5. Get the Percentile along the Axis

We can also compute the percentile along an axis. For example, if we set axis=0, the percentile is calculated for each column, and if we set axis=1, it is calculated for each row.


# Get the percentile along the axis = 0               
arr2 = np.percentile(arr, 75, axis=0)
print(arr2)

# Output
# [8.25 7.25 6.25]

# Get the percentile along the axis = 1                 
arr2 = np.percentile(arr, 75, axis=1)
print(arr2)

# Output
# [7. 8.]
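
One way to convince yourself of what the axis argument does is to compute the same values by slicing out each column and each row by hand. This is only an illustrative sketch; in practice the axis argument is the simpler choice.

```python
import numpy as np

arr = np.array([[6, 8, 4], [9, 5, 7]])

# Equivalent of axis=0: take each column in turn
by_column = [float(np.percentile(arr[:, j], 75)) for j in range(arr.shape[1])]
print(by_column)

# Output
# [8.25, 7.25, 6.25]

# Equivalent of axis=1: take each row in turn
by_row = [float(np.percentile(arr[i, :], 75)) for i in range(arr.shape[0])]
print(by_row)

# Output
# [7.0, 8.0]
```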

6. Use axis=1 and keepdims=True

We can also compute the percentile along a specified axis with keepdims=True. The keepdims argument keeps the reduced axis in the result as a dimension of size one, so the result has the same number of dimensions as the input.


# Get the percentile of an array with axis=1 and keepdims=True
arr2 = np.percentile(arr, 75, axis=1, keepdims=True)
print(arr2)

# Output
# [[7.]
#  [8.]]
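
A common reason to keep the dimensions is broadcasting. As a small sketch, the example below subtracts each row's median from that row; the names row_median and centered are my own.

```python
import numpy as np

arr = np.array([[6, 8, 4], [9, 5, 7]])

# Median of each row, kept as a column of shape (2, 1)
row_median = np.percentile(arr, 50, axis=1, keepdims=True)
print(row_median.shape)

# Output
# (2, 1)

# The (2, 1) result broadcasts against the (2, 3) input
centered = arr - row_median
print(centered.tolist())

# Output
# [[0.0, 2.0, -2.0], [2.0, -2.0, 0.0]]
```

Without keepdims=True the result would have shape (2,), which does not broadcast against a (2, 3) array.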

7. Conclusion

In this article, I have explained how to use the NumPy percentile() function to get percentile values for 1-dimensional and 2-dimensional arrays, along with parameters such as axis and keepdims.

Happy Learning!!


What does a percentile mean?

A percentile is a value on a scale of 100 that indicates the percent of a distribution that is equal to or below it. For example, a score in the 95th percentile is equal to or above 95% of the scores in the distribution.

What is the percentile of a data set?

In statistics, a percentile is a term that describes how a score compares to other scores from the same set. While there is no universal definition of percentile, it is commonly expressed as the percentage of values in a set of data scores that fall below a given value.

What is percentile in programming?

A percentile is a number below which a certain percentage of scores fall. For example, if a student's percentile in an examination is 75, the student scored higher than 75% of the students who took the test.

What is percentile used for?

A percentile is a term used in statistics to express how a score compares to other scores in the same set. While there is technically no standard definition of percentile, it's typically communicated as the percentage of values that fall below a particular value in a set of data scores.