How do you filter duplicates from a list in Python?

Python – Ways to remove duplicates from list

This article covers one common list operation: building a list of unique elements from a list that may contain duplicates. Removing duplicates has many applications, so it is a useful technique to know.

Method 1 : Naive method
In the naive method, we traverse the list, append the first occurrence of each element to a new list, and ignore every later occurrence of that element.




# Python 3 code to demonstrate removing duplicates from a list
# using the naive method

# initializing list
test_list = [1, 3, 5, 6, 3, 5, 6, 1]
print("The original list is : " + str(test_list))

# keep only the first occurrence of each element
res = []
for i in test_list:
    if i not in res:
        res.append(i)

# printing list after removal
print("The list after removing duplicates : " + str(res))

Output :

The original list is : [1, 3, 5, 6, 3, 5, 6, 1]
The list after removing duplicates : [1, 3, 5, 6]

Method 2 : Using list comprehension
This method works the same way as the one above; it is a one-line shorthand for the longer loop, written as a list comprehension.




# Python 3 code to demonstrate removing duplicates from a list
# using list comprehension

# initializing list
test_list = [1, 3, 5, 6, 3, 5, 6, 1]
print("The original list is : " + str(test_list))

# append each element to res only if it is not already there
res = []
[res.append(x) for x in test_list if x not in res]

# printing list after removal
print("The list after removing duplicates : " + str(res))

Output :



The original list is : [1, 3, 5, 6, 3, 5, 6, 1]
The list after removing duplicates : [1, 3, 5, 6]
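One caveat with this one-liner: the comprehension is evaluated only for the side effect of res.append(), so it also builds a throwaway list of None values. The following minimal sketch (the name leftover is only illustrative) makes that by-product visible:

```python
test_list = [1, 3, 5, 6, 3, 5, 6, 1]

res = []
# list.append() returns None, so the comprehension itself evaluates
# to a list with one None per element that was appended.
leftover = [res.append(x) for x in test_list if x not in res]

print(res)       # [1, 3, 5, 6]
print(leftover)  # [None, None, None, None]
```

For that reason, the explicit loop from Method 1 is often considered the clearer way to express the same logic.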

Method 3 : Using set()
This is the most popular way to remove duplicates from a list. Its main drawback is that the original ordering of the elements is lost, because sets are unordered. It also requires the elements to be hashable.




# Python 3 code to demonstrate removing duplicates from a list
# using set()

# initializing list
test_list = [1, 5, 3, 6, 3, 5, 6, 1]
print("The original list is : " + str(test_list))

# convert to a set to drop duplicates, then back to a list
test_list = list(set(test_list))

# printing list after removal (original ordering is not preserved)
print("The list after removing duplicates : " + str(test_list))

Output :

The original list is : [1, 5, 3, 6, 3, 5, 6, 1]
The list after removing duplicates : [1, 3, 5, 6]
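Because a set carries no ordering guarantee, the output above should not be relied on to come back sorted or in the original order. If the set() approach is still preferred, one possible way to restore first-occurrence order is to sort the unique values by their first index in the original list. This is a sketch rather than part of the method above, and since list.index scans the list, it becomes slow for long lists.

```python
test_list = [1, 5, 3, 6, 3, 5, 6, 1]

# set() removes duplicates; sorting by first index in the original
# list puts the survivors back in first-occurrence order.
res = sorted(set(test_list), key=test_list.index)

print(res)  # [1, 5, 3, 6]
```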

Method 4 : Using list comprehension + enumerate()
A list comprehension combined with enumerate() can also achieve this task. It keeps an element only if it does not already appear earlier in the list (in the slice before its index), so the original ordering is preserved.




# Python 3 code to demonstrate removing duplicates from a list
# using list comprehension + enumerate()

# initializing list
test_list = [1, 5, 3, 6, 3, 5, 6, 1]
print("The original list is : " + str(test_list))

# keep element i at index n only if it does not occur before index n
res = [i for n, i in enumerate(test_list) if i not in test_list[:n]]

# printing list after removal
print("The list after removing duplicates : " + str(res))

Output :

The original list is : [1, 5, 3, 6, 3, 5, 6, 1]
The list after removing duplicates : [1, 5, 3, 6]

Method 5 : Using collections.OrderedDict.fromkeys()
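This is one of the fastest order-preserving ways to achieve the task. OrderedDict.fromkeys() builds an ordered dictionary whose keys are the list elements; because dictionary keys are unique, duplicates collapse, and converting the keys back to a list gives the result in first-occurrence order. The elements must be hashable, and the method works well for strings too.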




# Python 3 code to demonstrate removing duplicates from a list
# using collections.OrderedDict.fromkeys()

from collections import OrderedDict

# initializing list
test_list = [1, 5, 3, 6, 3, 5, 6, 1]
print("The original list is : " + str(test_list))

# dictionary keys are unique and keep first-occurrence order
res = list(OrderedDict.fromkeys(test_list))

# printing list after removal
print("The list after removing duplicates : " + str(res))

Output :

The original list is : [1, 5, 3, 6, 3, 5, 6, 1]
The list after removing duplicates : [1, 5, 3, 6]
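As a hedged aside: since Python 3.7 the built-in dict also preserves insertion order, so dict.fromkeys() can stand in for OrderedDict.fromkeys() on those versions. The sketch below assumes Python 3.7 or later and also illustrates the string case; the sample values are made up for illustration.

```python
# Assumes Python 3.7+, where plain dicts preserve insertion order.
words = ["apple", "pear", "apple", "fig", "pear"]
unique_words = list(dict.fromkeys(words))
print(unique_words)  # ['apple', 'pear', 'fig']

# A string is an iterable of characters, so the same call
# deduplicates its characters in first-occurrence order.
unique_chars = "".join(dict.fromkeys("banana"))
print(unique_chars)  # ban
```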


How to Remove Duplicates From a Python List



Learn how to remove duplicates from a List in Python.


Example

Remove any duplicates from a List:

mylist = ["a", "b", "a", "c", "c"]
mylist = list(dict.fromkeys(mylist))
print(mylist)


Remove duplicates from list using Set

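To remove the duplicates from a list, you can use the built-in set() function. A set holds only distinct elements, so converting a list to a set discards the duplicates. Keep in mind that a set is unordered, so the original order of the list is not guaranteed to survive.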

We have a list: [1,1,2,3,2,2,4,5,6,2,1]. It has many duplicates, which we need to remove to get back only the distinct elements. The list is passed to the built-in set() function, and the result is converted back with list() for display, as shown in the example below.

The output contains only the distinct elements, with all duplicates eliminated.

my_list = [1,1,2,3,2,2,4,5,6,2,1]
my_final_list = set(my_list)
print(list(my_final_list))

Output:

[1, 2, 3, 4, 5, 6]

Remove Duplicates from a list using the Temporary List

To remove duplicates from a given list, you can use an empty temporary list. First, loop through the list that has duplicates and add each item to the temporary list only if it is not already there. Then assign the temporary list back to the main list.

Here is a working example using a temporary list.

my_list = [1, 2, 3, 1, 2, 4, 5, 4, 6, 2]
print("List Before ", my_list)

temp_list = []
for i in my_list:
    if i not in temp_list:
        temp_list.append(i)

my_list = temp_list
print("List After removing duplicates ", my_list)

Output:

List Before  [1, 2, 3, 1, 2, 4, 5, 4, 6, 2]
List After removing duplicates  [1, 2, 3, 4, 5, 6]

Python: 5 Ways to Remove Duplicates from List


In this article, we will learn what a list is in Python. Because a Python list is a collection of elements that may contain duplicates, it is sometimes necessary to reduce it to its unique elements. Here, we study several ways to remove duplicates from a list in Python. So, let's get started!

Removing Duplicate Items

Most of these answers only remove duplicate items which are hashable, but the question doesn't say the items are hashable, so I'll offer some solutions which don't require hashable items.

collections.Counter is a powerful tool in the standard library which could be perfect for this. There's only one other solution which even has Counter in it. However, that solution is also limited to hashable keys.

To allow unhashable keys in Counter, I made a Container class, which will try to use the object's default hash function, but if that fails, it will fall back to the object's identity (id). It also defines an __eq__ and a __hash__ method. This should be enough to allow unhashable items in our solution: unhashable objects will be treated as if they were hashable. However, because the fallback hash uses identity, two equal objects that are both unhashable won't be recognized as duplicates of each other. I suggest you override this and change it to use the hash of an equivalent immutable type (like using hash(tuple(my_list)) if my_list is a list).

I made two solutions. One does not try to keep the order of the items; the other keeps it, using a subclass of both OrderedDict and Counter named 'OrderedCounter'. Now, here are the functions:

from collections import OrderedDict, Counter


class Container:
    # Wraps any object so it can be used as a dictionary key.
    def __init__(self, obj):
        self.obj = obj

    def __eq__(self, obj):
        return self.obj == obj

    def __hash__(self):
        try:
            return hash(self.obj)
        except TypeError:
            # Unhashable object: fall back to its identity.
            return id(self.obj)


class OrderedCounter(Counter, OrderedDict):
    'Counter that remembers the order elements are first encountered'

    def __repr__(self):
        return '%s(%r)' % (self.__class__.__name__, OrderedDict(self))

    def __reduce__(self):
        return self.__class__, (OrderedDict(self),)


def remd(sequence):
    cnt = Counter()
    for x in sequence:
        cnt[Container(x)] += 1
    return [item.obj for item in cnt]


def oremd(sequence):
    cnt = OrderedCounter()
    for x in sequence:
        cnt[Container(x)] += 1
    return [item.obj for item in cnt]

remd is the non-ordered version, while oremd keeps the order in which items first appear. You can clearly tell which one is faster, but I'll explain anyway: the non-ordered version is slightly faster, since it doesn't have to track the order of the items.
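As a simpler alternative to the Container approach, the suggestion above about hashing an equivalent immutable type can be sketched directly for a list of lists. This is a minimal sketch, not the code from this answer: the function name unique_rows is hypothetical, and it assumes every inner list contains only hashable values.

```python
def unique_rows(rows):
    """Return rows with duplicates removed, keeping first occurrences."""
    seen = set()
    result = []
    for row in rows:
        key = tuple(row)  # a tuple is hashable when all its items are
        if key not in seen:
            seen.add(key)
            result.append(row)
    return result


data = [[1, 2], [3, 4], [1, 2], [5], [3, 4]]
print(unique_rows(data))  # [[1, 2], [3, 4], [5]]
```

Unlike the identity-based fallback, two equal but distinct inner lists are treated as duplicates here.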

Now, I also wanted to show the speed comparisons of each answer. So, I'll do that now.

Remove/extract duplicate elements from list in Python

Posted: 2020-12-08 / Tags: Python, List


This article describes how to generate a new list in Python by removing and extracting duplicate elements from a list. Note that removing duplicate elements is equivalent to extracting only unique elements.

  • Remove duplicate elements (Extract unique elements) from a list
    • Do not keep the order of the original list: set()
    • Keep the order of the original list: dict.fromkeys(), sorted()
    • For a two-dimensional list (list of lists)
  • Extract duplicate elements from a list
    • Do not keep the order of the original list
    • Keep the order of the original list
    • For a two-dimensional list (list of lists)

The same idea can be applied to tuples instead of lists.
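For the "extract duplicate elements" case in the outline above, one common approach is collections.Counter. The sketch below is an illustration rather than this article's own code; the variable names are arbitrary, and it relies on Counter keeping first-insertion order, which holds on Python 3.7 and later.

```python
from collections import Counter

values = [3, 3, 2, 1, 5, 1, 4, 2, 3]

counts = Counter(values)

# Elements that occur more than once, in first-occurrence order
# (Counter is a dict subclass, so it keeps insertion order on 3.7+).
duplicates = [value for value, n in counts.items() if n > 1]
print(duplicates)  # [3, 2, 1]

# The same idea works for a tuple.
t = (3, 3, 2, 1, 5, 1, 4, 2, 3)
tuple_duplicates = [value for value, n in Counter(t).items() if n > 1]
print(tuple_duplicates)  # [3, 2, 1]
```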

See the following article for how to check if lists or tuples have duplicate elements.

  • Check if the list contains duplicate elements in Python


Remove Duplicates From List in Python


Created: February-06, 2021 | Updated: February-21, 2021

A list in Python is a data structure that stores data in a particular order. A list can hold values of multiple types, e.g. int, float, string, or another list. Lists are mutable, which means their values can be changed after creation. A list is written with square brackets [].

myList = [2, 1, 2, 3, 0, 6, 7, 6, 4, 8]
print(myList)

Output:

[2, 1, 2, 3, 0, 6, 7, 6, 4, 8]

You can remove duplicate elements from the above list using a for loop as shown below.

myList = [2, 1, 2, 3, 0, 6, 7, 6, 4, 8]
resultantList = []

for element in myList:
    if element not in resultantList:
        resultantList.append(element)

print(resultantList)

Output:

[2, 1, 3, 0, 6, 7, 4, 8]

If you don’t want to write this much code, there are two popular shorter ways to remove duplicate elements from a list in Python.

  1. If you don’t want to maintain the order of the elements inside a list after removing the duplicate elements, then you can use a Set data structure.
  2. If you want to maintain the order of the elements inside a list after removing duplicate elements, then you can use something called OrderedDict.

Methods to Remove Duplicate Elements from List – Python

1. Using iteration

To remove duplicate elements from List in Python, we can manually iterate through the list and add an element to the new list if it is not present. Otherwise, we skip that element.

The code is shown below:

a = [2, 3, 3, 2, 5, 4, 4, 6]
b = []

for i in a:
    # Add to the new list
    # only if not present
    if i not in b:
        b.append(i)

print(b)

Output

[2, 3, 5, 4, 6]

The same code can be written using List Comprehension to reduce the number of lines of code, although it is essentially the same as before.

a = [2, 3, 4, 2, 5, 4, 4, 6]
b = []
[b.append(i) for i in a if i not in b]
print(b)

The problem with this approach is that it is a bit slow: for every element of the original list, the test "i not in b" scans the new list, so the running time grows quadratically with the size of the list.

This is computationally expensive, and we have other methods to deal with this issue. You should use this only if the list size is not very large. Otherwise, refer to the other methods.
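One such faster method, sketched here under the assumption that all elements are hashable, tracks the values already seen in a set, so each membership test takes roughly constant time instead of scanning the new list. The function name dedupe is hypothetical.

```python
def dedupe(items):
    """Yield each item once, in first-occurrence order (items must be hashable)."""
    seen = set()
    for item in items:
        if item not in seen:  # average O(1) lookup in a set
            seen.add(item)
            yield item


a = [2, 3, 3, 2, 5, 4, 4, 6]
b = list(dedupe(a))
print(b)  # [2, 3, 5, 4, 6]
```

Writing it as a generator means it also works on iterables that are not lists, and the caller decides whether to materialize the result.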