How to use the subprocess module with pipes on Linux?

In Python, the subprocess module lets us start additional processes and communicate with them. Other facilities offer similar functionality, such as os.system(), os.popen() and the os.spawn*() family of functions, but subprocess is recommended over all of them because it provides a higher-level interface.
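To make the difference concrete, here is a minimal sketch that runs the same command through os.system() and through subprocess.run(). It assumes Python 3.7 or later (for the capture_output argument) and uses echo as a stand-in command; neither choice comes from the original article.

```python
import os
import subprocess

# os.system() hands a string to the shell and only returns an exit status;
# the command's output goes straight to the terminal.
status = os.system('echo hello')

# subprocess.run() takes an argument list, needs no shell, and can capture
# the output and the return code in one result object.
result = subprocess.run(['echo', 'hello'], capture_output=True, text=True)

print(result.returncode)
print(result.stdout)
```

The captured output is available as an ordinary string in result.stdout, which is the kind of convenience the higher-level interface provides.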

Before using pipes with the subprocess module, we first need to understand what the module does.

Example

Let’s consider a simple example in which the subprocess module runs an external command without interacting with it any further.

Create a Python file named sample.py and put the following code in that file −

import subprocess
# Pass the command and each argument as separate list items; no shell is needed
subprocess.call(['ls', '-1'])

Run the sample.py file using the command shown below −

python sample.py

Output

$ python sample.py

__init__.py
index.rst
interaction.py
repeater.py
signal_child.py
signal_parent.py
subprocess_check_call.py
subprocess_check_output.py
subprocess_check_output_error.py
subprocess_check_output_error_trap_output.py
subprocess_os_system.py
subprocess_pipes.py
subprocess_popen2.py
subprocess_popen3.py
subprocess_popen4.py
subprocess_popen_read.py
subprocess_popen_write.py
...

In practice, code that uses the subprocess module often ends up relying on the shell=True flag, which is not recommended and should be avoided in most cases.

For example −

def count_number_of_lines(website):
   # Unsafe: the URL is interpolated straight into a shell command line
   return subprocess.check_output('curl %s | wc -l' % website, shell=True)

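If I pass a website's URL to the function above, it returns the number of lines in the response served at that URL.

For reference, if I pass ‘www.google.com’, the result is as follows (under Python 3, check_output() returns a bytes object, so the value would be displayed with a b prefix) −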

Output

'7\n'

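This is not recommended, however. Because the URL is inserted directly into a shell command line, a crafted value can make the shell run arbitrary commands (shell injection), which is a serious risk whenever the input comes from an untrusted source.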
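The risk can be illustrated with a harmless sketch. The two helper functions and the sample input below are hypothetical and use echo instead of curl so that nothing touches the network; the point is only to show how the same string is treated with and without a shell.

```python
import subprocess

def echo_with_shell(text):
    # The string is parsed by the shell, so ';' starts a second command
    return subprocess.check_output('echo %s' % text, shell=True).decode()

def echo_without_shell(text):
    # The string is passed to echo as one literal argument
    return subprocess.check_output(['echo', text]).decode()

user_input = 'hello; echo INJECTED'

print(repr(echo_with_shell(user_input)))     # 'hello\nINJECTED\n'
print(repr(echo_without_shell(user_input)))  # 'hello; echo INJECTED\n'
```

With shell=True the text after the semicolon runs as a separate command. With an argument list it is just data.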

A better approach is to create the pipe in Python instead of in the shell, and for that we can change the code in the above example to something like this −

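def count_number_of_lines(website):
   args1 = ['curl', website]
   args2 = ['wc', '-l']
   # Start curl and capture its standard output in a pipe
   process_curl = subprocess.Popen(args1, stdout=subprocess.PIPE,
                  shell=False)
   # Start wc with curl's output as its standard input
   process_wc = subprocess.Popen(args2, stdin=process_curl.stdout,
                  stdout=subprocess.PIPE, shell=False)
   # Close our copy so that curl receives SIGPIPE if wc exits first
   process_curl.stdout.close()
   output = process_wc.communicate()[0]
   # Reap curl so that it does not linger as a zombie process
   process_curl.wait()
   return output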

Output

'7\n'
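The same two-process pattern can be tried without network access. The sketch below is an illustration rather than part of the original example: the function name count_lines_of_command is hypothetical, printf stands in for curl, and the result is decoded and converted to an integer for convenience.

```python
import subprocess

def count_lines_of_command(producer_args):
    # First process: writes its output into a pipe
    producer = subprocess.Popen(producer_args, stdout=subprocess.PIPE)
    # Second process: reads the first one's output as its standard input
    counter = subprocess.Popen(['wc', '-l'], stdin=producer.stdout,
                               stdout=subprocess.PIPE)
    # Close the parent's copy of the pipe so only wc holds the read end
    producer.stdout.close()
    output, _ = counter.communicate()
    producer.wait()
    # wc prints the count as text (possibly padded), so strip and convert
    return int(output.decode().strip())

print(count_lines_of_command(['printf', 'a\nb\nc\n']))  # 3
```

Returning an int instead of raw bytes avoids having to deal with the trailing newline at every call site.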
raja
Published on 31-Jul-2021 12:02:59