All examples can be viewed in this sample Jupyter notebook.
You need to have the matplotlib module installed for this!
Versions used: Pandas 1.x, matplotlib 3.0.x
Sample data for examples
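    import pandas as pd

    df = pd.DataFrame({
        'name': ['john', 'mary', 'peter', 'jeff', 'bill', 'lisa', 'jose'],
        'age': [23, 78, 22, 19, 45, 33, 20],
        'gender': ['M', 'F', 'M', 'M', 'M', 'F', 'M'],
        'state': ['california', 'dc', 'california', 'dc', 'california', 'texas', 'texas'],
        'num_children': [2, 0, 0, 3, 2, 1, 4],
        'num_pets': [5, 1, 0, 5, 2, 2, 3]
    })

This is what our sample dataset looks like:

        name  age gender       state  num_children  num_pets
    0   john   23      M  california             2         5
    1   mary   78      F          dc             0         1
    2  peter   22      M  california             0         0
    3   jeff   19      M          dc             3         5
    4   bill   45      M  california             2         2
    5   lisa   33      F       texas             1         2
    6   jose   20      M       texas             4         3

Pandas has tight integration with matplotlib.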
You can plot data directly from your DataFrame using the plot() method:
Scatter plot of two columns
    import matplotlib.pyplot as plt
    import pandas as pd

    # a scatter plot comparing num_children and num_pets
    df.plot(kind='scatter', x='num_children', y='num_pets', color='red')
    plt.show()

Looks like we have a trend.

Bar plot of column values
    import matplotlib.pyplot as plt
    import pandas as pd

    # a simple bar plot: one bar per name, bar height given by age
    df.plot(kind='bar', x='name', y='age')
    plt.show()

'kind' takes arguments such as 'bar', 'barh' (horizontal bars), etc.

Line plot, multiple columns
Just reuse the Axes object.
    import matplotlib.pyplot as plt
    import pandas as pd

    # gca stands for 'get current axes'
    ax = plt.gca()

    # both calls draw on the same Axes, so the two lines share one plot
    df.plot(kind='line', x='name', y='num_children', ax=ax)
    df.plot(kind='line', x='name', y='num_pets', color='red', ax=ax)
    plt.show()

Reuse an Axes object to plot multiple lines.
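As an alternative to reusing the Axes object, a list of column names can be passed to y, and pandas then draws one line per column in a single call. This is only a sketch; it rebuilds the relevant columns of the sample data so that it runs on its own.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Subset of the sample data from this post (only the columns used here)
df = pd.DataFrame({
    'name': ['john', 'mary', 'peter', 'jeff', 'bill', 'lisa', 'jose'],
    'num_children': [2, 0, 0, 3, 2, 1, 4],
    'num_pets': [5, 1, 0, 5, 2, 2, 3],
})

# Passing a list to y draws one line per column on the same Axes
ax = df.plot(kind='line', x='name', y=['num_children', 'num_pets'])

# Finish with plt.show() or plt.savefig(...) to render the figure
```

Reusing the Axes, as in the example above, remains the more flexible option when each line needs its own styling or comes from a different DataFrame.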
Save plot to file
Instead of calling plt.show(), call plt.savefig('output.png'):
    import matplotlib.pyplot as plt
    import pandas as pd

    df.plot(kind='bar', x='name', y='age')

    # the plot gets saved to 'output.png'
    plt.savefig('output.png')

Bar plot with group by
    import matplotlib.pyplot as plt
    import pandas as pd

    df.groupby('state')['name'].nunique().plot(kind='bar')
    plt.show()

Number of unique names per state.

Stacked bar plot with group by
Example: plot count by category as a stacked column:

- create a dummy variable and do a two-level group-by based on it
- fix the x axis label and the legend

(Figure caption, truncated: "... 2 for DC and texas")

Note how the legend follows the same order as the actual column. This makes your plot easier to read.
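The code for this example did not survive in the text above, so the following is only a sketch of how the two steps could look, not necessarily the original code. The column name `dummy` and the choice to reverse the legend entries are assumptions made for illustration.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Subset of the sample data from this post (only the columns used here)
df = pd.DataFrame({
    'name': ['john', 'mary', 'peter', 'jeff', 'bill', 'lisa', 'jose'],
    'state': ['california', 'dc', 'california', 'dc',
              'california', 'texas', 'texas'],
})

# Step 1: add a constant 'dummy' column (hypothetical name) so every row
# lands in the same bar, then group by (dummy, state) and count rows.
counts = (
    df.assign(dummy=1)
      .groupby(['dummy', 'state'])
      .size()
      .unstack()          # one column per state, a single row (dummy == 1)
)

ax = counts.plot(kind='bar', stacked=True)

# Step 2: fix the x axis (it would otherwise be labelled 'dummy')
ax.set_xlabel('state')
ax.set_xticks([])

# Reverse the legend entries so they read top-to-bottom in the same
# order as the stacked segments of the column.
handles, labels = ax.get_legend_handles_labels()
ax.legend(handles[::-1], labels[::-1], loc='lower right')

# Finish with plt.show() or plt.savefig(...) to render the figure
```

The dummy column exists only to force all states into one stacked column; a plain `groupby('state').size()` would instead produce one separate bar per state.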
Stacked bar plot with group by, normalized to 100%
A plot where the columns sum up to 100%.
Similar to the example above but:

- normalize the values by dividing by the total amounts
- use percentage tick labels for the y axis
Example: Plot percentage count of records by state
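The code for this example is also missing from the text, so here is a hedged sketch of the two points listed above. It assumes the same dummy-variable approach as the previous section (the name `dummy` is hypothetical) and uses matplotlib's `PercentFormatter` for the tick labels.

```python
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.ticker as mtick

# Subset of the sample data from this post (only the columns used here)
df = pd.DataFrame({
    'name': ['john', 'mary', 'peter', 'jeff', 'bill', 'lisa', 'jose'],
    'state': ['california', 'dc', 'california', 'dc',
              'california', 'texas', 'texas'],
})

# Record count per state, laid out as a single row (dummy == 1)
counts = (
    df.assign(dummy=1)
      .groupby(['dummy', 'state'])
      .size()
      .unstack()
)

# Normalize: divide each value by its row total, then scale to 0-100
percentages = counts.div(counts.sum(axis=1), axis=0) * 100

ax = percentages.plot(kind='bar', stacked=True)

# Show y ticks as percentages (PercentFormatter assumes 100 == 100%)
ax.yaxis.set_major_formatter(mtick.PercentFormatter())

ax.set_xlabel('state')
ax.set_xticks([])

# Finish with plt.show() or plt.savefig(...) to render the figure
```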
Stacked bar plot, two-level group by
Just do a normal groupby() and call unstack():
    import matplotlib.pyplot as plt
    import pandas as pd

    df.groupby(['state', 'gender']).size().unstack().plot(kind='bar', stacked=True)
    plt.show()

Stacked bar chart showing the number of people per state, split into males and females.

Another example: count the people by gender, splitting by state:

    import matplotlib.pyplot as plt
    import pandas as pd

    df.groupby(['gender', 'state']).size().unstack().plot(kind='bar', stacked=True)
    plt.show()

Now grouped by 'gender', split by 'state'.

Stacked bar plot with two-level group by, normalized to 100%
Sometimes you are only interested in the distributions, not the raw amounts:

    import matplotlib.ticker as mtick
    import matplotlib.pyplot as plt

    # count records per (gender, state), then express each count as a
    # percentage of its gender total; transform() keeps the original
    # (gender, state) index, so unstack() works the same across pandas versions
    df.groupby(['gender', 'state']).size().groupby(level=0).transform(
        lambda x: 100 * x / x.sum()
    ).unstack().plot(kind='bar', stacked=True)

    # show y ticks as percentages
    plt.gca().yaxis.set_major_formatter(mtick.PercentFormatter())
    plt.show()

Record count grouped by gender and state, with normalized columns so that each sums up to 100%.
How do you plot a line graph from a DataFrame in Python?
To draw a line plot with datetime values on the x axis, follow these steps:
Step 1: Check that the datetime values are in the correct format. They should be pandas datetime objects. ...
Step 2: Make the datetime values the index of the DataFrame. ...
Step 3: Create the line plot.
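The three steps above can be sketched as follows. The `sales` DataFrame and its `date` and `units` columns are made-up illustration data, not part of the sample dataset used in this post.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical data: dates arrive as plain strings
sales = pd.DataFrame({
    'date': ['2021-01-01', '2021-01-02', '2021-01-03', '2021-01-04'],
    'units': [10, 12, 9, 15],
})

# Step 1: convert the strings to pandas datetime objects
sales['date'] = pd.to_datetime(sales['date'])

# Step 2: make the datetime column the index of the DataFrame
sales = sales.set_index('date')

# Step 3: create the line plot; the DatetimeIndex becomes the x axis
ax = sales['units'].plot(kind='line')

# Finish with plt.show() or plt.savefig(...) to render the figure
```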
How do you create a line graph from a DataFrame?
A line chart is one of the plot types pandas can draw from a DataFrame. Calling the line() method on the DataFrame's plot accessor draws a line chart. If no column is specified for the x axis, the method uses the DataFrame's index, which by default is 0, 1, 2, 3 and so on.
How do you do a line plot in Pandas DataFrame?
To generate a line plot with pandas, we typically create a DataFrame with the dataset to be plotted. Then the plot.line() method is called on the DataFrame, with the x argument set to the column whose values should appear on the x axis.
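As a minimal sketch of the two answers above, the following draws the same column twice with plot.line(): once without an x column, so the default index is used, and once with x set explicitly. The side-by-side layout and the variable names `ax_index` and `ax_named` are illustrative choices.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Subset of the sample data from this post (only the columns used here)
df = pd.DataFrame({
    'name': ['john', 'mary', 'peter', 'jeff', 'bill', 'lisa', 'jose'],
    'age': [23, 78, 22, 19, 45, 33, 20],
})

fig, (ax_index, ax_named) = plt.subplots(1, 2, figsize=(10, 4))

# x omitted: the DataFrame index (0, 1, 2, ...) is used for the x axis
df.plot.line(y='age', ax=ax_index)

# x given: the 'name' column labels the x axis instead
df.plot.line(x='name', y='age', ax=ax_named)

# Finish with plt.show() or plt.savefig(...) to render the figure
```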