9 tricks to master Pandas drop() and speed up your data analysis
Photo by Bernard Hermant on Unsplash
Deleting one or many rows or columns from a Pandas DataFrame can be achieved in multiple ways. Among them, the most common one is the drop() method. In this article, you'll learn 9 Pandas drop() tricks for deleting rows and columns.
1. Delete a single row
By default, Pandas drop() deletes rows by index label. For example, to delete the row with index label 1:

df.drop(1)

It's equivalent to df.drop(labels=1). Note that the argument axis defaults to 0 (rows), so it does not need to be specified here. Alternatively, a more intuitive way to delete a row from a DataFrame is to use the index argument, for example df.drop(index=1).
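As a minimal sketch of the single-row case, the snippet below builds a small made-up DataFrame and deletes the row labelled 1 in three ways. Only the column names math and physics come from this article; the name column and all values are assumptions for illustration.

```python
import pandas as pd

# Hypothetical sample data; the 'name' column and all values are made up.
df = pd.DataFrame({
    "name": ["Tom", "Jack", "Lucy", "Bob"],
    "math": [60, 75, 82, 91],
    "physics": [66, 70, 88, 79],
})

positional = df.drop(1)                 # label passed as the first argument
by_labels = df.drop(labels=1, axis=0)   # same call, spelled out
by_index = df.drop(index=1)             # the index argument implies rows

# drop() returns a new DataFrame, so df itself still has four rows.
print(by_index)
print(len(df))
```

All three calls produce the same result, so which one to use is mostly a readability choice.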
2. Delete multiple rows
Pandas drop() also accepts a list of index labels to delete multiple rows at once:

df.drop([1,2])

Similarly, a more intuitive way to delete multiple rows is to pass a list to the index argument, for example df.drop(index=[1,2]).

3. Delete rows based on row position and custom range
The DataFrame index values may not be in ascending order; sometimes they can be other values entirely, for example datetime or string labels. For these cases, we can delete rows based on their row position. For instance, to delete the 2nd row, we can call df.drop(index=df.index[1]).

To delete the last row, we can use a shortcut such as df.drop(index=df.index[-1]).

We can also use the slice technique on df.index to select a range of rows to delete.
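A short sketch of deleting several rows by label, by position, and by slice. The DataFrame, its string index labels, and the particular slices are assumptions chosen for illustration, not the article's original examples.

```python
import pandas as pd

# Hypothetical data with string index labels (not from the article).
df = pd.DataFrame(
    {"math": [60, 75, 82, 91], "physics": [66, 70, 88, 79]},
    index=["a", "b", "c", "d"],
)

multiple = df.drop(index=["b", "c"])          # several labels at once
second_removed = df.drop(index=df.index[1])   # 2nd row, looked up by position
last_removed = df.drop(index=df.index[-1])    # last row, looked up by position
first_only = df.drop(index=df.index[1:])      # slice: every row after the 1st
every_other = df.drop(index=df.index[::2])    # slice: 1st, 3rd, ... rows

print(every_other)
```

Note that df.index[...] only translates positions back into labels, and drop() then works on those labels. If the index contains duplicate labels, every row carrying a selected label is dropped.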
4. Delete a single column
Similar to deleting rows, Pandas drop() can delete a column by passing its name together with axis=1:

df.drop('math', axis=1)

A more intuitive way to delete a column from a DataFrame is to use the columns argument, for example df.drop(columns='math').
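The two spellings for deleting one column can be compared in a small sketch. The data is made up; only the math and physics column names appear in the article.

```python
import pandas as pd

# Hypothetical sample data; the 'name' column and all values are made up.
df = pd.DataFrame({
    "name": ["Tom", "Jack"],
    "math": [60, 75],
    "physics": [66, 70],
})

by_axis = df.drop("math", axis=1)      # axis=1 tells drop() to look at columns
by_columns = df.drop(columns="math")   # the columns argument needs no axis

print(by_columns.columns.tolist())
```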
5. Delete multiple columns
Similarly, we can pass a list to delete multiple columns:

df.drop(['math', 'physics'], axis=1)

A more intuitive way to delete multiple columns is to pass a list to the columns argument, for example df.drop(columns=['math', 'physics']).

6. Delete columns based on column position and custom range
We can delete a column based on its column position. For instance, to delete the 2nd column, we can call df.drop(columns=df.columns[1]).

To delete the last column, we can use a shortcut such as df.drop(columns=df.columns[-1]).

Similarly, we can also use the slice technique on df.columns to select a range of columns to delete.
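The position-based and slice-based column deletions might look like the sketch below. The DataFrame and the chosen slices are assumptions for illustration; the name and chemistry columns are hypothetical.

```python
import pandas as pd

# Hypothetical sample data; 'name' and 'chemistry' are made-up columns.
df = pd.DataFrame({
    "name": ["Tom", "Jack"],
    "math": [60, 75],
    "physics": [66, 70],
    "chemistry": [71, 80],
})

second_removed = df.drop(columns=df.columns[1])     # 2nd column ('math')
last_removed = df.drop(columns=df.columns[-1])      # last column ('chemistry')
first_only = df.drop(columns=df.columns[1:])        # slice: every column after the 1st
middle_removed = df.drop(columns=df.columns[1:3])   # slice: 2nd and 3rd columns

print(middle_removed.columns.tolist())
```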
7. Working with MultiIndex
A MultiIndex (also known as a hierarchical index) DataFrame allows us to have multiple columns acting as a row identifier and multiple rows acting as a header identifier.

When calling Pandas drop() on a MultiIndex DataFrame, a label is matched against the first level by default, so passing 'Oxford' deletes all Oxford rows. To specify a level to be removed, we can set the level argument; for example, with level=1 we can remove all 2019-07-04 rows at level 1. In some cases, we would like to delete a specific index or column combination. To do that, we can pass a tuple to the index or columns argument, for example to drop the index combination 'Oxford' and '2019-07-04'.

8. Do operation in place with inplace=True
By default, the Pandas drop() returns a modified copy and leaves the original DataFrame unchanged. Setting inplace=True applies the deletion to the DataFrame itself; in that case drop() returns None.

9. Suppress error with errors='ignore'
You may notice that the Pandas drop() raises a KeyError when a given label is not found. Setting errors='ignore' suppresses the error, and only the labels that exist are dropped.

Conclusion
In this article, we have covered 9 use cases about deleting rows and columns using the Pandas drop() method.

Thanks for reading. Please check out the Notebook for the source code and stay tuned if you are interested in the practical aspect of machine learning. More tutorials are available from the Github Repo.

How do I remove rows from a DataFrame in Python?
To drop a row or column in a DataFrame, use the drop() method available on the DataFrame. You can read more about the drop() method in the Pandas documentation. Rows are labelled using the index number starting with 0, by default. Columns are labelled using names.
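Index labels are not always the default 0-based integers. For the MultiIndex cases from section 7, here is a minimal sketch that assumes a made-up two-level index of city and date. Only 'Oxford' and '2019-07-04' come from the article; 'Cambridge', the second date, the level names, and the temp column are hypothetical.

```python
import pandas as pd

# Hypothetical two-level row index: (city, date).
index = pd.MultiIndex.from_product(
    [["Cambridge", "Oxford"], ["2019-07-04", "2019-07-05"]],
    names=["city", "date"],
)
df = pd.DataFrame({"temp": [20, 22, 19, 21]}, index=index)

no_oxford = df.drop(index="Oxford")                  # matched on the first level
no_july_4 = df.drop(index="2019-07-04", level=1)     # matched on level 1 only
no_combo = df.drop(index=("Oxford", "2019-07-04"))   # one exact (city, date) pair

print(no_combo)
```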
How do you remove unwanted rows in Python?
To delete a row from a DataFrame, use the drop() method and set the index label as the parameter.
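When the index label passed to drop() does not exist, Pandas raises a KeyError unless errors='ignore' is set. A small sketch with made-up data, where label 5 is deliberately missing:

```python
import pandas as pd

# Hypothetical data with the default index labels 0 and 1.
df = pd.DataFrame({"math": [60, 75], "physics": [66, 70]})

try:
    df.drop(index=5)              # label 5 does not exist
except KeyError as exc:
    print("KeyError:", exc)

unchanged = df.drop(index=5, errors="ignore")      # missing label is skipped
partial = df.drop(index=[0, 5], errors="ignore")   # label 0 is still dropped

print(partial)
```

Suppressing the error is convenient in pipelines where a label may or may not be present, but it can also hide typos in label names.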
How do you delete columns and rows in Python?
The drop function allows the removal of rows and columns from your DataFrame, and once you've used it a few times, you'll have no issues. The Pandas “drop” function is used to delete columns or rows from a Pandas DataFrame.
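One detail worth seeing in code is that drop() returns a new DataFrame unless inplace=True is passed. A minimal sketch with made-up data:

```python
import pandas as pd

# Hypothetical data with the default index labels 0, 1, 2.
df = pd.DataFrame({"math": [60, 75, 82], "physics": [66, 70, 88]})

result = df.drop(index=0)                   # new DataFrame; df keeps all 3 rows
rows_before = len(df)

returned = df.drop(index=0, inplace=True)   # modifies df itself, returns None

print(rows_before, len(df), returned)
```

Reassigning the result (df = df.drop(...)) has the same effect as inplace=True for most purposes and keeps the call chainable.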
How do I delete multiple rows in a DataFrame in Python?
To delete rows and columns from DataFrames, Pandas uses the “drop” function. To delete a column, or multiple columns, pass the name of the column(s) and specify the “axis” as 1. Alternatively, the “columns” parameter has been added in Pandas, which removes the need for “axis”.
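The two ways of deleting multiple columns described above can be sketched as follows. The data is made up; only the math and physics column names come from the article.

```python
import pandas as pd

# Hypothetical sample data; the 'name' column and all values are made up.
df = pd.DataFrame({
    "name": ["Tom", "Jack"],
    "math": [60, 75],
    "physics": [66, 70],
})

with_axis = df.drop(["math", "physics"], axis=1)      # names plus axis=1
with_columns = df.drop(columns=["math", "physics"])   # columns parameter, no axis

print(with_columns.columns.tolist())
```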