Python - Remove elements of a list that are repeated fewer than k times


Introduction

Data processing often involves filtering out unnecessary data and keeping only what is needed for further operations. One common filtering task is to remove the elements of a list that occur fewer than a certain number of times, denoted ‘k’. This article explains the problem and walks through two ways to solve it in Python.

Remove elements of a list that are repeated fewer than k times

Definition

The problem is to remove from a given list every element whose frequency is less than ‘k’, where ‘k’ is a positive integer. Given a list of ‘n’ elements, we remove all elements that occur fewer than ‘k’ times and keep the rest, duplicates included, in their original order.
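To make the definition concrete, here is a minimal sketch that applies it literally with list.count. The function name keep_frequent and the sample data are illustrative and do not appear in the examples that follow. Because list.count rescans the list for every element, this version is quadratic in the list length and is meant only to illustrate the rule.

```python
def keep_frequent(items, k):
    """Return the elements of items that occur at least k times, in original order."""
    # list.count scans the whole list for each element, so this is O(n^2).
    return [x for x in items if items.count(x) >= k]


sample = ["a", "b", "a", "c", "a", "b"]
print(keep_frequent(sample, 2))  # ['a', 'b', 'a', 'a', 'b']
```

Here "a" occurs three times and "b" twice, so both survive with k = 2, while "c" occurs once and is dropped.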

Algorithm

  • Step 1: Create an empty dictionary or Counter object to store the frequency of each element.

  • Step 2: Iterate through the list and, for each element, update its frequency in the dictionary or Counter object.

  • Step 3: Use a list comprehension to build a new list containing only those elements that occur at least k times.

  • Step 4: Return the new list.

  • Step 5: Done.
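The steps can be sketched in a few lines. This is a minimal illustration, not one of the two approaches shown later: the name filter_by_frequency is hypothetical, dict.get is used as a compact way to update counts, and elements are kept when they occur at least k times, which matches the behavior of the examples in this article.

```python
def filter_by_frequency(items, k):
    # Steps 1-2: build the frequency table in a single pass.
    frequency = {}
    for item in items:
        # dict.get returns 0 for an element seen for the first time.
        frequency[item] = frequency.get(item, 0) + 1

    # Step 3: keep only the elements that occur at least k times.
    kept = [item for item in items if frequency[item] >= k]

    # Step 4: return the new list; the input list is left untouched.
    return kept


print(filter_by_frequency([7, 7, 9, 7, 9, 5], 2))  # [7, 7, 9, 7, 9]
```

Counting first and filtering second means the list is traversed twice, so the work grows linearly with the number of elements.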

Approach

  • Approach 1: Using a dictionary and list comprehension

  • Approach 2: Using Counter from the collections module

Approach 1: Using a dictionary and list comprehension

Example

def remove(original_list, k):
    # Count how many times each element occurs
    element_occurence = {}
    for element in original_list:
        if element in element_occurence:
            element_occurence[element] += 1
        else:
            element_occurence[element] = 1
    # Keep only the elements that occur at least k times
    new_list = [element for element in original_list
                if element_occurence[element] >= k]
    return new_list

i_list = [1, 0, 1, 1, 2, 3, 2, 2, 3, 3, 4, 5, 4, 4, 4, 5]
k_value = 3
result = remove(i_list, k_value)
print("Output list:", result)

Output

Output list: [1, 1, 1, 2, 3, 2, 2, 3, 3, 4, 4, 4, 4]

Explanation

  • We create a list named i_list and fill it with values, some of which are repeated.

  • We set the threshold ‘k’ in a variable named k_value and initialize it to three.

  • We define a function named ‘remove’ that takes two arguments: the first is the original_list and the second is the value of k.

  • Inside the function, we first create an empty dictionary to count the occurrences of each element.

  • A ‘for’ loop iterates over each element of the input list.

  • Inside the loop, we check whether the element is already in the dictionary. If it is, we increment its count by one; otherwise, we add the element to the dictionary with a count of one.

  • Once the loop has finished, the dictionary named ‘element_occurence’ holds the number of occurrences of each element.

  • Next, we create new_list using a list comprehension that includes only those elements whose occurrence count is greater than or equal to ‘k’.

  • new_list is then returned, and finally we print the output list.

  • The output list contains only those elements that occur at least three times in original_list. Elements that occur fewer than three times (here 0 and 5) are removed.
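One consequence of returning a new list is that the input list itself is not modified. The sketch below illustrates this with a condensed version of the function (using dict.get instead of the if/else, which is a stylistic assumption rather than the code shown above) and then shows one possible way to update the original list object in place through slice assignment, should that be needed.

```python
def remove(original_list, k):
    element_occurence = {}
    for element in original_list:
        element_occurence[element] = element_occurence.get(element, 0) + 1
    return [element for element in original_list
            if element_occurence[element] >= k]


i_list = [1, 0, 1, 1, 2, 3, 2, 2, 3, 3, 4, 5, 4, 4, 4, 5]
result = remove(i_list, 3)

# The original list still has all 16 elements; only result is filtered.
print(len(i_list), len(result))  # 16 13

# Slice assignment replaces the contents of the same list object.
i_list[:] = result
print(i_list)  # [1, 1, 1, 2, 3, 2, 2, 3, 3, 4, 4, 4, 4]
```

Returning a new list is usually the safer default, because other code holding a reference to the input is unaffected unless the in-place update is requested explicitly.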

Approach 2: Using Counter from the collections module

Example

from collections import Counter

def remove(original_list, k):
    # Counter tallies the occurrences of every element in one call
    element_counter = Counter(original_list)
    # Keep only the elements that occur at least k times
    new_list = [element for element in original_list
                if element_counter[element] >= k]
    return new_list

i_list = [8, 8, 6, 6, 8, 8, 4, 6, 4, 4, 33, 33, 1, 2, 1, 2, 2, 0]
k_value = 3
result = remove(i_list, k_value)
print("Output list:", result)

Output

Output list: [8, 8, 6, 6, 8, 8, 4, 6, 4, 4, 2, 2, 2]

Explanation

  • First, we create a list containing several repeated elements, so that the difference between the input list and the output list is easy to see.

  • We initialize the value of ‘k’ to three.

  • To count the frequency of each element in the list, we import the Counter class from the collections module.

  • We define a function named ‘remove’ that takes two arguments. The first argument is the original_list and the second is the value of k.

  • To count the occurrences of each element, we create a Counter object named ‘element_counter’. We pass original_list as the argument, and Counter automatically counts the frequency of each element.

  • Next, we create a new list named ‘new_list’ using a list comprehension. It iterates over original_list, looks up each element's frequency in ‘element_counter’, and keeps only those elements whose count is greater than or equal to ‘k’.

  • Finally, new_list is returned and stored in result, and we print the output list.

  • The output list contains the elements that occur at least three times in the input list. Elements that occur fewer than ‘k’ times (here 33, 1, and 0) are removed.

  • Using the Counter class from the collections module replaces the manual counting loop with a single call, which makes the code more concise and readable.
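To see what the Counter object holds, the short sketch below inspects one built from the same input list. The variable frequent_values is an illustrative name, not part of the example above. Counter is a dict subclass, so it supports items(), and looking up a key that was never counted returns 0 instead of raising KeyError.

```python
from collections import Counter

element_counter = Counter([8, 8, 6, 6, 8, 8, 4, 6, 4, 4, 33, 33, 1, 2, 1, 2, 2, 0])

print(element_counter[8])              # 4
print(element_counter[99])             # 0 (missing keys count as zero)
print(element_counter.most_common(1))  # [(8, 4)]

# The distinct values that meet the threshold k = 3
frequent_values = {value for value, count in element_counter.items() if count >= 3}
print(sorted(frequent_values))         # [2, 4, 6, 8]
```

The set of qualifying values is exactly the set of distinct elements in the output list of Approach 2.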

Conclusion

A common preprocessing task in data analysis is removing the elements of a list that occur fewer than k times. In this article we looked at the problem definition, a simple five-step algorithm, and two approaches with runnable code examples and their outputs.

We can count the occurrences of elements in a list with a plain dictionary or with the Counter class from the collections module. Using these frequencies, we build a new list that contains only the elements whose frequency is greater than or equal to the chosen minimum, k.

Whether you pick the first approach, which uses a dictionary, or the second, which uses the Counter class, both remove the elements that occur fewer than k times from a given list and leave a filtered list ready for further analysis.

Updated on: 09-Oct-2023
