Generate Captcha using Python


CAPTCHA, short for Completely Automated Public Turing test to tell Computers and Humans Apart, is a test used on many websites to check whether the visitor is a human or a bot. It is used mainly for security, so that bots cannot flood a website with automated requests and skew its traffic statistics.

Its main purpose is to present tests that are easy for humans to solve but difficult for bots or other programs. These tests rely on a user's cognitive and thinking abilities.

There are various types of CAPTCHA available. Some of the most common CAPTCHA types are:

  • Image-based CAPTCHA: Users are shown images whose features are distorted to make image recognition difficult for computers.

  • Text-based CAPTCHA: A sequence of characters is shown with distorted features and random noise to make character recognition difficult.

  • Audio-based CAPTCHA: Users hear a recording of spoken characters or clips and must type what they hear into the system correctly.

  • Behavioural CAPTCHA: Users must perform specific actions that are difficult for bots to replicate or automate.

To generate a CAPTCHA in Python, we need an imaging library. We use Pillow, a maintained fork of PIL (the Python Imaging Library). To install it, type the following in the command prompt:

pip install Pillow

After that, we can import PIL in Python to generate our CAPTCHA.
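
A quick way to confirm the installation is sketched below. Note that the package is installed under the name Pillow but imported as PIL. The 10x10 test image is only an illustrative value.

```python
import PIL
from PIL import Image

# The package is installed as "Pillow" but imported as "PIL"
print('Pillow version:', PIL.__version__)

# Creating a tiny image confirms that the imaging core loads correctly
test_image = Image.new('RGB', (10, 10), 'yellow')
print('Test image size:', test_image.size)
```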

Example 1

In this example, we will learn to generate a Text-Based CAPTCHA using Python.

Algorithm

  • Import the necessary libraries.

  • Create a blank image with any colour as background.

  • Define the length of the CAPTCHA and join randomly chosen characters into a string.

  • Customise the font and font size of the resulting text.

  • Calculate the text size and the position that places it in the centre.

  • Draw the text at the calculated position so that it appears in the centre.

  • Add random noise to the image by colouring randomly chosen pixels black.

  • Set the width and height of the CAPTCHA image and the length of the CAPTCHA text, then call the function.

  • Save the captcha image.

import random
from PIL import Image, ImageDraw, ImageFont

def generate_captcha(width, height, length):
    # Create a blank image with a yellow background
    image = Image.new('RGB', (width, height), 'yellow')
    draw = ImageDraw.Draw(image)

    # Define the characters to be used in the CAPTCHA
    characters = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz1234567890'

    # Generate a random string of the given length
    # (random.sample never picks the same character twice)
    captcha_text = ''.join(random.sample(characters, length))

    # Define the font and its size (arial.ttf must be available on the system)
    font = ImageFont.truetype('arial.ttf', 72)

    # Calculate the text size and position it in the center.
    # draw.textsize() was removed in Pillow 10, so draw.textbbox() is used.
    left, top, right, bottom = draw.textbbox((0, 0), captcha_text, font=font)
    text_x = (width - (right - left)) / 2 - left
    text_y = (height - (bottom - top)) / 2 - top

    # Draw the text on the image
    draw.text((text_x, text_y), captcha_text, font=font, fill='black')

    # Add noise: each pixel has a 10% chance of being painted black
    for x in range(width):
        for y in range(height):
            if random.random() < 0.1:
                draw.point((x, y), fill='black')

    # Return the generated CAPTCHA image and the corresponding text
    return image, captcha_text

width = 700   # Width of the CAPTCHA image
height = 350  # Height of the CAPTCHA image
length = 6    # Length of the CAPTCHA text

captcha_image, captcha_text = generate_captcha(width, height, length)
captcha_image.save('captcha.png')
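
The function returns the text alongside the image so that a user's answer can be checked later. A minimal sketch of such a check follows. The helper name check_answer is hypothetical, and the exact, case-sensitive comparison is an assumption chosen because the character set mixes upper and lower case.

```python
import hmac

def check_answer(expected, submitted):
    """Return True if the submitted answer matches the CAPTCHA text exactly.

    check_answer is a hypothetical helper, not part of Pillow.
    """
    # Ignore accidental surrounding whitespace, but keep the check case-sensitive
    cleaned = submitted.strip()
    # compare_digest avoids leaking how many leading characters matched
    return hmac.compare_digest(expected.encode('utf-8'), cleaned.encode('utf-8'))

print(check_answer('aB3xY9', ' aB3xY9 '))
print(check_answer('aB3xY9', 'ab3xy9'))
```

Whether to ignore case is a design choice. A case-insensitive check is easier on users but reduces the number of possible answers.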

We store all the characters that may appear in the CAPTCHA in a string variable and pick a random selection of them. After defining the font, which is Arial, and the font size, we calculate the position that places the text in the centre of the image. We then draw the text on the blank image using the draw.text() method with a fill colour of 'black' (this can be changed to a different colour).
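
Two details of this step vary between systems: arial.ttf may not be installed (it is typically missing on Linux), and draw.textsize(), which many tutorials use to measure text, was removed in Pillow 10. The sketch below shows one way to handle both, assuming a recent Pillow release. The helper names load_font and centred_origin and the fallback font DejaVuSans.ttf are illustrative assumptions.

```python
from PIL import Image, ImageDraw, ImageFont

def load_font(size, candidates=('arial.ttf', 'DejaVuSans.ttf')):
    """Return the first candidate font that loads, else Pillow's built-in font."""
    for name in candidates:
        try:
            return ImageFont.truetype(name, size)
        except OSError:
            # Font file not found on this system; try the next candidate
            continue
    # The built-in fallback font is used at its default size here
    return ImageFont.load_default()

def centred_origin(draw, text, font, width, height):
    """Return the (x, y) drawing origin that centres `text` in the image."""
    left, top, right, bottom = draw.textbbox((0, 0), text, font=font)
    # Subtract the bounding-box offsets so the visible glyphs are centred
    x = (width - (right - left)) / 2 - left
    y = (height - (bottom - top)) / 2 - top
    return x, y

image = Image.new('RGB', (700, 350), 'yellow')
draw = ImageDraw.Draw(image)
font = load_font(72)
x, y = centred_origin(draw, 'aB3xY9', font, 700, 350)
draw.text((x, y), 'aB3xY9', font=font, fill='black')
```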

We then need to add noise to the image. This is done by looping over every pixel and drawing a random number between 0 and 1 for each one. If the number is less than 0.1, the pixel is painted black, so roughly 10% of the image is covered. A higher threshold paints more pixels and makes the text harder for humans to read. The noise distorts the image and makes automated character recognition harder, although it does not make it impossible.
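
To experiment with the threshold, the noise step can be pulled into a helper of its own. The sketch below is one possible version. The function name add_noise and its density and seed parameters are assumptions made for illustration.

```python
import random
from PIL import Image, ImageDraw

def add_noise(image, density=0.1, colour='black', seed=None):
    """Paint each pixel with probability `density`; return how many were painted."""
    rng = random.Random(seed)  # a seed makes the noise pattern repeatable
    draw = ImageDraw.Draw(image)
    width, height = image.size
    painted = 0
    for x in range(width):
        for y in range(height):
            if rng.random() < density:
                draw.point((x, y), fill=colour)
                painted += 1
    return painted

image = Image.new('RGB', (200, 100), 'yellow')
painted = add_noise(image, density=0.1, seed=42)
print(f'{painted / (200 * 100):.1%} of the pixels were painted')
```

Calling add_noise with a few different density values and comparing the saved images is a simple way to find a level that still leaves the text readable.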

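Output

Running the script saves the generated image as captcha.png in the current working directory. The image shows six random black characters centred on a yellow background, speckled with black noise.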

Conclusion

Generating CAPTCHAs is an effective way to prevent malicious and suspicious bots from gaining access to a website. As automated attacks become more common, CAPTCHAs are an increasingly important safeguard for websites.

However, with advances in technology and highly accurate OCR (Optical Character Recognition) techniques, some bots can bypass CAPTCHAs with relative ease. Malicious users can also train machine learning models on CAPTCHA images and bypass the test that way.

Updated on: 10-Aug-2023
