How to process iterators in parallel using ZIP


List comprehensions make it easy to take a source list and derive a new list by applying an expression to each item. For example, say that I want to multiply each element in a list by 5. First, here is how to do it with a simple for loop.

a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
multiply_by_5 = []
for x in a:
    # multiply each item by 5 and collect the result
    multiply_by_5.append(x * 5)
print(f"Output \n *** {multiply_by_5}")


*** [5, 10, 15, 20, 25, 30, 35, 40, 45, 50]

With a list comprehension, I can achieve the same outcome by specifying the expression and the input sequence to loop over.

# List comprehension
multiply_by_5 = [x * 5 for x in a]
print(f"Output \n *** {multiply_by_5}")


*** [5, 10, 15, 20, 25, 30, 35, 40, 45, 50]

Now, let us say you have a couple of lists whose elements you want to add pairwise.

# 1. Create two lists of numbers
list1 = [100, 200, 300, 400]
list2 = [500, 600, 700, 800]

# 2. Add the two lists to create a new list
list3 = []

# Using a loop over the indexes
for i in range(len(list1)):
    added_value = list1[i] + list2[i]
    list3.append(added_value)
print(f"Output \n*** {list3}")


*** [600, 800, 1000, 1200]

What matters here is that each item in the derived list (list3 in our case) is tied to the items in the source lists by index.

Here is a zip solution for the same two lists of integers, one containing 100, 200, 300 and 400, and the other containing 500, 600, 700 and 800. The inputs do not have to be lists. They could be other sequences, such as tuples, or any other iterable.

zip pairs up the elements at the same position, so 100 from list1 and 500 from list2 are zipped together into a tuple, and so on. As we iterate, each tuple is unpacked into the variables a and b.

# Pair items by position with zip and add each pair
list4 = [(a + b) for a, b in zip(list1, list2)]
print(f"Output \n*** {list4}")


*** [600, 800, 1000, 1200]
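
To make the pairing visible, the short sketch below prints the tuples that zip yields before any addition happens. It also mixes a list with a tuple to show that the inputs need not be of the same type. The names prices and taxes, and their values, are made up for illustration.

```python
# Hypothetical example data: one list and one tuple
prices = [100, 200, 300, 400]
taxes = (5, 10, 15, 20)

# zip yields one tuple per position, pairing items by index
pairs = list(zip(prices, taxes))
print(pairs)

# Unpack each tuple into two names and add them
totals = [price + tax for price, tax in pairs]
print(totals)
```

Wrapping zip in list() is only needed here to print the pairs. In a loop or comprehension you can iterate over zip directly.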

The solution above looks neat, but there is a serious issue you need to know about before using it in your code.

The built-in zip function silently truncates its output if the input iterators are of different lengths. Let us try it.

# add five extra numbers to list1 (arbitrary example values)
list1.extend([900, 1000, 1100, 1200, 1300])
print(f"Output \n*** Length of List1 is {len(list1)} , Length of List2 is {len(list2)}")

# run the zip against the list for addition.
list5 = [(a + b) for a, b in zip(list1, list2)]
print(f"*** {list5}")


*** Length of List1 is 9 , Length of List2 is 4
*** [600, 800, 1000, 1200]

When we print the sums in list5, you'll notice that the extra numbers added to list1 are missing. Even though they are in list1, they do not show up in the output of zip.

This is just how zip works. It keeps yielding tuples until any one of the iterators is exhausted. Even though list1 has more items to go, list2 is exhausted first, and so the loop exits.

Surprisingly, no exception is raised to notify you, so you have to be very careful with zip in production.
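
One way to guard against silent truncation is to compare the lengths yourself before zipping. The sketch below is only an illustration: add_pairwise is a hypothetical helper name, and it assumes both inputs are sequences that support len(). On Python 3.10 and later, zip also accepts strict=True, which raises a ValueError when the inputs differ in length.

```python
def add_pairwise(first, second):
    """Add two sequences item by item, refusing inputs of unequal length."""
    if len(first) != len(second):
        raise ValueError(
            f"length mismatch: {len(first)} items vs {len(second)} items"
        )
    return [a + b for a, b in zip(first, second)]


# Equal lengths: the sums are returned as usual
print(add_pairwise([100, 200, 300], [500, 600, 700]))

# Unequal lengths: the helper raises instead of dropping items
try:
    add_pairwise([100, 200, 300, 400, 1000], [500, 600, 700, 800])
except ValueError as err:
    print(f"Refused: {err}")
```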

For this problem, the itertools module provides a function called zip_longest.

zip_longest keeps going even after one of the iterators has been exhausted, filling in None for the missing values.

from itertools import zip_longest

list6 = []
for a, b in zip_longest(list1, list2):
    if b is None:
        # list2 ran out first
        print(" << do your logic here >> ")
    elif a is None:
        # list1 ran out first
        print(" << do your logic here >> ")
    else:
        list6.append(a + b)
print(f"Output \n*** {list6}")

<< do your logic here >>
<< do your logic here >>
<< do your logic here >>
<< do your logic here >>
<< do your logic here >>


*** [600, 800, 1000, 1200]
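
If a missing item can reasonably be treated as a default, zip_longest also takes a fillvalue argument that replaces None. The sketch below assumes that a missing number should count as 0, which may not suit every use case. The list names and values are made up for illustration.

```python
from itertools import zip_longest

# Hypothetical example data of unequal length
longer = [100, 200, 300, 400, 1000]
shorter = [500, 600, 700, 800]

# fillvalue replaces the default None once the shorter input has run out
sums = [a + b for a, b in zip_longest(longer, shorter, fillvalue=0)]
print(sums)
```

With a fill value there is no need for the None checks, but the extra items are no longer distinguishable from real zeros in the data.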

Conclusion:

  • The zip function is very handy if you want to iterate over multiple iterators in parallel.

  • When you pass iterators of different lengths, zip stops silently at the shortest one.

  • If you need to process iterators of different lengths, use zip_longest from itertools.

Published on 09-Nov-2020 10:22:20