Get Indian Railways Station Code Using Python


Python is a flexible programming language, and web scraping is one of its many uses. In this blog post, we'll use Python to extract Indian Railways station codes. Every Indian railway station has a unique short alphabetic code, such as NDLS for New Delhi. These codes are used to reserve tickets, look up train timetables, and find other station information.

Installation

To start with, we need to install the requests and Beautiful Soup libraries. Requests is a Python library for sending HTTP requests, while Beautiful Soup is a library for parsing HTML and extracting data from it.

To install both libraries, open your terminal and type −

pip install requests
pip install beautifulsoup4

Algorithm

  • Define a function called get_html that takes a URL as input.

  • Inside the function, create a dictionary of headers that contains the User-Agent, Accept, and Accept-Language values.

  • Use the requests.get method to make a GET request to the URL using the headers dictionary and store the response in a variable called response.

  • Return the response text from the function.

  • Define a function called get_station_code that takes a station name as input.

  • Construct a URL for the station page by concatenating the station name to the base URL.

  • Call the get_html function with the constructed URL to retrieve the HTML data for the page and store it in a variable called html_data.

  • Parse the HTML data using the BeautifulSoup library and store the result in a variable called soup.

  • Use the find method of the soup object to locate the table element with the class extrtable, which contains the station code.

  • Use the find_all method of that table element to locate all the b elements inside it.

  • Retrieve the last element in the list of b elements using the -1 index and get its text value using the get_text method.

  • Return the station code from the function.

  • Call the get_station_code function with a station name as input to retrieve the station code.

  • Print the station code to the console.

Example

import requests
from bs4 import BeautifulSoup

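# function to get html data from a url
def get_html(url):
   headers = {
      'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64)',
      'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
      'Accept-Language': 'en-US,en;q=0.5',
   }
   # a timeout keeps the script from hanging if the site does not respond
   response = requests.get(url, headers=headers, timeout=10)
   # raise an exception for 4xx/5xx responses instead of parsing an error page
   response.raise_for_status()
   return response.text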

# main function to get station code from mapsofindia.com
def get_station_code(station_name):
   # construct url for the station page
   url = f"https://www.mapsofindia.com/railways/station-code/{station_name}.html"
    
   # get html data for the station page
   html_data = get_html(url)
    
   # parse html data using BeautifulSoup
   soup = BeautifulSoup(html_data, 'html.parser')
    
   # extract station code from html data
   station_code = soup.find("table", class_="extrtable").find_all('b')[-1].get_text()
    
   # return station code
   return station_code

# example usage
station_name = "pune-junction"
station_code = get_station_code(station_name)
print(f"Station Code for {station_name.title()} is {station_code}")

station_name = "new-delhi"
station_code = get_station_code(station_name)
print(f"Station Code for {station_name.title()} is {station_code}")

Output

Station Code for Pune-Junction is PUNE
Station Code for New-Delhi is NDLS

Explanation

This Python script is used to obtain the Indian Railways Station code for a given station name using web scraping.

The script starts by importing the necessary modules - requests and BeautifulSoup from bs4. The requests module is used to send HTTP requests while BeautifulSoup is used to parse HTML data.

The function get_html() is then defined. It takes a URL as input, sends an HTTP GET request to that URL using the requests module, and returns the page's HTML as a text string. The main function, get_station_code(), takes a station name as input and returns the station code. It first uses the station name to build the URL of the station's page, then calls get_html() with this URL to fetch the page's HTML.
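The example passes names that are already in URL form, such as "pune-junction". If the name comes from a user as "Pune Junction", it has to be converted first. The following is a minimal sketch of that step. The helper build_station_url is hypothetical, and the rule it applies (lowercase words joined by hyphens) is an assumption inferred only from the two station names used in this post, so other stations may need different handling.

```python
import re

BASE_URL = "https://www.mapsofindia.com/railways/station-code/"

def build_station_url(station_name):
    """Turn a display name such as 'Pune Junction' into a station page URL.

    Assumption: the site uses lowercase, hyphen-separated names in its URLs.
    """
    # lowercase, then collapse every run of non-alphanumeric characters into one hyphen
    slug = re.sub(r"[^a-z0-9]+", "-", station_name.lower()).strip("-")
    if not slug:
        raise ValueError("station_name must contain letters or digits")
    return f"{BASE_URL}{slug}.html"

print(build_station_url("Pune Junction"))
print(build_station_url("  New Delhi "))
```

Rejecting an empty name early avoids sending a request for a URL that cannot match any station.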

The HTML data is then parsed using BeautifulSoup. The soup object is created by passing the HTML data and html.parser to the BeautifulSoup class constructor. Then, the station code is extracted from the parsed HTML data by searching for the <table> element with the class "extrtable". This table contains all the station information including the station code, which is identified by the last <b> tag in the table. The .get_text() method is used to obtain the text content of this tag, which is the station code.
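To see the extraction step in isolation, it can be run against a small HTML string instead of a live page. The sketch below is illustrative only. SAMPLE_HTML is made-up markup that merely mimics a table with the class "extrtable", and the real page layout may differ. The helper extract_station_code is hypothetical and returns None when the table or its <b> tags are missing, rather than raising an error as the chained call in the example would.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only; the real page may be laid out differently.
SAMPLE_HTML = """
<table class="extrtable">
  <tr><td><b>Station Name</b></td><td><b>Station Code</b></td></tr>
  <tr><td>Pune Junction</td><td><b>PUNE</b></td></tr>
</table>
"""

def extract_station_code(html):
    """Return the text of the last <b> tag in the 'extrtable' table, or None."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", class_="extrtable")
    if table is None:
        return None  # no station table, for example on an error page
    bold_tags = table.find_all("b")
    if not bold_tags:
        return None  # table found, but it contains no <b> elements
    return bold_tags[-1].get_text(strip=True)

print(extract_station_code(SAMPLE_HTML))                             # PUNE
print(extract_station_code("<html><body>Not found</body></html>"))  # None
```

Because the scraper depends on the site's markup, a check like this makes it easier to notice when the page structure changes.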

Finally, the get_station_code() function returns the station code, which is then printed along with the station name using an f-string.

Applications

This example can be extended for use in many applications, such as software that simplifies ticket purchases, provides train information, or displays schedules. For instance, a train ticket booking app could let users enter a station name, look up its station code, and then use that code to book the right ticket.
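An application like this would probably look up the same stations many times, so it may be worth caching results instead of requesting the page on every lookup. The sketch below shows one possible approach using functools.lru_cache from the standard library. The names make_cached_lookup and fake_fetch are hypothetical. fake_fetch is a stand-in for get_station_code so that the sketch runs without network access.

```python
from functools import lru_cache

def make_cached_lookup(fetch_code, maxsize=256):
    """Wrap a station-code lookup function so repeated names are served from a cache."""
    @lru_cache(maxsize=maxsize)
    def cached(slug):
        return fetch_code(slug)

    def lookup(station_name):
        # normalise so "New-Delhi" and "new-delhi" share one cache entry
        return cached(station_name.strip().lower())

    return lookup

# Stand-in for get_station_code; records each call so cache hits are visible.
calls = []
def fake_fetch(slug):
    calls.append(slug)
    return {"pune-junction": "PUNE", "new-delhi": "NDLS"}[slug]

lookup = make_cached_lookup(fake_fetch)
print(lookup("new-delhi"), lookup("New-Delhi"), len(calls))  # NDLS NDLS 1
```

In a real application, get_station_code could be passed in place of fake_fetch.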

Conclusion

Python is a powerful language that can be used for various purposes, including web scraping. In this post, we learned how to extract Indian Railways station codes using Python. We used the requests library to send HTTP requests and BeautifulSoup to parse the returned HTML. We also learned how to construct the URL for a particular station and extract its station code from the HTML data. This code can serve as a building block for applications that provide train information, book tickets, or check train schedules.

Updated on: 18-Jul-2023
