9 tricks to master Pandas drop() and speed up your data analysis
Photo by Bernard Hermant on Unsplash
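Deleting one or many rows or columns from a Pandas DataFrame can be achieved in multiple ways. Among them, the most common is the drop() method. In this article, you'll learn nine Pandas drop() tricks, from deleting a single row to suppressing errors for missing labels.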

1. Delete a single row
By default, Pandas drop() removes rows based on their index labels. For example, to delete the row whose index label is 1:
df.drop(1)
# It's equivalent to
df.drop(labels=1)
Note that the argument axis must be 0 to delete rows. It defaults to 0, so it can be omitted here. Alternatively, a more intuitive way to delete a row from a DataFrame is to use the index argument:
# A more intuitive way
df.drop(index=1)
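As a minimal sketch of the three spellings above, the snippet below builds a small DataFrame and checks that they give the same result. The names and scores are made up for illustration; only the math and physics column names come from the article.

```python
import pandas as pd

# Hypothetical sample data; only the column names 'math' and 'physics'
# appear in the article, the values are invented.
df = pd.DataFrame(
    {
        "name": ["Tom", "James", "Emily", "Ann"],
        "math": [66, 57, 75, 44],
        "physics": [80, 70, 90, 60],
    }
)

positional = df.drop(1)                # label passed positionally, axis defaults to 0
by_labels = df.drop(labels=1, axis=0)  # same call with both arguments spelled out
by_index = df.drop(index=1)            # the index keyword implies rows

# All three return a new DataFrame without the row labelled 1.
print(by_index.index.tolist())
```

Each call returns a new DataFrame, so df itself still has all four rows afterwards.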

2. Delete multiple rows
Pandas drop() also accepts a list of labels to delete multiple rows:
df.drop([1,2])
# It's equivalent to
df.drop(labels=[1,2])
Similarly, a more intuitive way to delete multiple rows is to pass a list to the index argument:
# A more intuitive way
df.drop(index=[1,2])

3. Delete rows based on row position and custom range
The DataFrame index values are not always ascending integers. They can be other values, for example, datetime or string labels. In these cases, we can delete rows based on their row position. For instance, to delete the 2nd row, we can call:
df.drop(index=df.index[1])
To delete the last row, we can use a shortcut such as:
df.drop(index=df.index[-1])
We can also use the slice technique to select a range of rows by slicing df.index before passing it to the index argument.
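A minimal sketch of position-based row deletion, including slices, might look like the following. The string index labels and the scores are assumptions chosen for illustration, and the particular slices are just examples of what df.index accepts.

```python
import pandas as pd

# Hypothetical data with string labels, so positions and labels differ.
df = pd.DataFrame(
    {"math": [66, 57, 75, 44, 31]},
    index=["a", "b", "c", "d", "e"],
)

second_removed = df.drop(index=df.index[1])        # 2nd row (label 'b')
last_removed = df.drop(index=df.index[-1])         # last row (label 'e')
first_two_removed = df.drop(index=df.index[:2])    # rows at positions 0 and 1
every_other_removed = df.drop(index=df.index[::2]) # rows at positions 0, 2, 4

print(every_other_removed.index.tolist())
```

Because df.index[...] only translates positions into labels, this approach assumes the labels are unique. If a label is duplicated, drop() removes every row that carries it.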

4. Delete a single column
Similar to deleting rows, Pandas drop() deletes a column when we pass its label and set axis=1:
df.drop('math', axis=1)
# It's equivalent to
df.drop(labels='math', axis=1)
A more intuitive way to delete a column from a DataFrame is to use the columns argument:
# A more intuitive way
df.drop(columns='math')

5. Delete multiple columns
Similarly, we can pass a list to delete multiple columns:
df.drop(['math', 'physics'], axis=1)
# It's equivalent to
df.drop(labels=['math', 'physics'], axis=1)
A more intuitive way to delete multiple columns is to pass a list to the columns argument:
# A more intuitive way
df.drop(columns=['math', 'physics'])

6. Delete columns based on column position and custom range
We can delete a column based on its column position. For instance, to delete the 2nd column, we can call:
df.drop(columns=df.columns[1])
To delete the last column, we can use a shortcut such as:
df.drop(columns=df.columns[-1])
Similarly, we can also use the slice technique to select a range of columns by slicing df.columns before passing it to the columns argument.
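The column cases from sections 4 to 6 can be sketched together as below. The math and physics columns follow the article; the name and chemistry columns, and all values, are assumptions added so that the slice has something to select.

```python
import pandas as pd

# Hypothetical data; 'name' and 'chemistry' are illustrative additions.
df = pd.DataFrame(
    {
        "name": ["Tom", "James", "Emily"],
        "math": [66, 57, 75],
        "physics": [80, 70, 90],
        "chemistry": [71, 62, 88],
    }
)

one_axis = df.drop("math", axis=1)                 # single column via axis=1
one_kw = df.drop(columns="math")                   # same result via columns=
two = df.drop(columns=["math", "physics"])         # several columns at once
second = df.drop(columns=df.columns[1])            # 2nd column by position
last = df.drop(columns=df.columns[-1])             # last column by position
middle = df.drop(columns=df.columns[1:3])          # positions 1 and 2

print(middle.columns.tolist())
```

As with rows, slicing df.columns only converts positions to labels, so it assumes the column names are unique.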

7. Working with MultiIndex
A MultiIndex (also known as a hierarchical index) DataFrame allows us to have multiple columns acting as a row identifier and multiple rows acting as a header identifier.
When calling Pandas drop() on a MultiIndex DataFrame, a single label is matched against the outermost level (level 0) by default, so dropping 'Oxford' deletes all Oxford rows. To specify a level to be removed, we can set the level argument, for example to remove all 2019-07-04 rows at level 1. In some cases, we would like to delete a specific index or column combination. To do that, we can pass a tuple to the index or columns argument, for example to drop the index combination 'Oxford' and '2019-07-04'.

8. Do operation in place with inplace=True
By default, the Pandas drop() returns a new DataFrame and leaves the original unchanged. Setting inplace=True modifies the original DataFrame instead, and the call returns None.

9. Suppress error with errors='ignore'
You may notice that the Pandas drop() raises a KeyError when a given label does not exist. Setting errors='ignore' suppresses the error, so only the labels that exist are dropped.

Conclusion
In this article, we have covered 9 use cases about deleting rows and columns using the Pandas drop() method.
Thanks for reading. Please check out the Notebook for the source code and stay tuned if you are interested in the practical aspect of machine learning. More tutorials are available from the Github Repo.

How do I remove rows from a DataFrame?
To drop a row or column in a DataFrame, use the drop() method available on the DataFrame. You can read more about drop() in the Pandas documentation. By default, rows are labelled with index numbers starting at 0, and columns are labelled by name.
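To make the three MultiIndex cases from section 7 concrete, here is a minimal sketch. The 'Oxford' and '2019-07-04' labels come from the article; the second city, the second date, the level names, and the temp column are assumptions made up for illustration.

```python
import pandas as pd

# Hypothetical two-level row index: (city, date).
index = pd.MultiIndex.from_product(
    [["Cambridge", "Oxford"], ["2019-07-04", "2019-07-05"]],
    names=["city", "date"],
)
df = pd.DataFrame({"temp": [21, 23, 20, 22]}, index=index)

# A plain label is matched against level 0: delete all Oxford rows.
no_oxford = df.drop(index="Oxford")

# level=1 matches the label against the second level instead.
no_july_4 = df.drop(index="2019-07-04", level=1)

# A tuple names one specific combination across both levels.
no_combo = df.drop(index=("Oxford", "2019-07-04"))

print(no_combo.index.tolist())
```

The same pattern should apply to a MultiIndex on the columns by using the columns argument in place of index.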

How do I delete multiple rows in a DataFrame in Python?
To delete rows and columns from DataFrames, Pandas uses the drop() method. To delete multiple rows, pass a list of index labels. To delete a column, or multiple columns, pass the name of the column(s) and specify axis=1. Alternatively, the columns parameter removes the need for axis.

How do I delete 10 rows in pandas?
Delete the top N rows of a DataFrame using drop() by passing the index labels of the first N rows, for example df.index[:10] for the first 10. drop() can also delete rows based on column values (a condition) when it is given the index labels of the matching rows. Use the axis parameter to specify which axis to delete from. By default axis=0, meaning rows are deleted. Use axis=1 or the columns parameter to delete columns.
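Both ideas can be sketched in a few lines: take the first N index labels, or take the index labels of the rows matching a condition, and hand them to drop(). The data, the value of n, and the threshold of 60 are assumptions for illustration.

```python
import pandas as pd

# Hypothetical scores.
df = pd.DataFrame(
    {"name": ["Tom", "James", "Emily", "Ann"], "math": [66, 57, 75, 44]}
)

# Delete the top N rows by passing their index labels.
n = 2
without_top_n = df.drop(index=df.index[:n])

# Delete rows by condition: select the matching rows, then drop their labels.
low_score_labels = df[df["math"] < 60].index
passed = df.drop(index=low_score_labels)

print(passed["name"].tolist())
```

For a pure condition, boolean filtering such as df[df["math"] >= 60] gives the same rows without calling drop() at all, so which form to use is mostly a matter of readability.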

How do I drop multiple rows in a DataFrame?
Delete multiple rows by index position in a DataFrame: df.drop() accepts only index labels, so to delete rows by position we need to build a list of index labels from the positions and then pass it to drop(). As the default value of inplace is False, the contents of the original DataFrame (dfObj) will not be modified.
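The sketch below shows the position-to-label step together with the inplace and errors parameters from sections 8 and 9. The DataFrame, its labels, and the missing label 'zzz' are hypothetical.

```python
import pandas as pd

# Hypothetical data with string labels.
df = pd.DataFrame({"math": [66, 57, 75]}, index=["a", "b", "c"])

# Translate positions 0 and 2 into labels, then drop those labels.
labels = df.index[[0, 2]]
result = df.drop(index=labels)      # inplace defaults to False
unchanged_length = len(df)          # the original still has 3 rows

# inplace=True modifies the DataFrame itself and returns None.
working = df.copy()
returned = working.drop(index="a", inplace=True)

# A missing label raises KeyError by default.
try:
    df.drop(index="zzz")
    raised = False
except KeyError:
    raised = True

# errors='ignore' drops the labels that exist and skips the rest.
ignored = df.drop(index=["a", "zzz"], errors="ignore")

print(result.index.tolist(), returned, raised, ignored.index.tolist())
```

Assigning the result of a call with inplace=True back to a variable is a common slip, since that variable ends up holding None rather than a DataFrame.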