urllib.parse — Parse URLs into components in Python

The urllib.parse module provides a standard interface to break Uniform Resource Locator (URL) strings into components or to combine the components back into a URL string. It also has functions to convert a "relative URL" to an absolute URL given a "base URL."
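Resolving a relative URL against a base URL is handled by urljoin(), which is part of the same module. The following is a minimal sketch; the example.com addresses are hypothetical and chosen only for illustration.

```python
from urllib.parse import urljoin

# Hypothetical base URL used only for illustration.
base = "https://example.com/docs/guide/index.html"

# A plain relative reference replaces the last path segment of the base.
sibling = urljoin(base, "intro.html")

# A reference starting with "/" replaces the whole path.
rooted = urljoin(base, "/about")

# ".." climbs one directory before the rest is resolved.
parent = urljoin(base, "../api/index.html")

print(sibling)  # https://example.com/docs/guide/intro.html
print(rooted)   # https://example.com/about
print(parent)   # https://example.com/docs/api/index.html
```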

This module supports the following URL schemes:

file
ftp
gopher
hdl
http
https
imap
mailto
mms
news
nntp
prospero
rsync
rtsp
rtspu
sftp
shttp
sip
sips
snews
svn
svn+ssh
telnet
wais
ws
wss

urlparse() Function

The urlparse() function parses a URL into six components, returning a 6-tuple. Each tuple item is a string, and % escapes are not expanded. The return value is an instance of a subclass of tuple with named attributes:

Attribute	Index	Value	Value if not present
scheme	0	URL scheme specifier	scheme parameter
netloc	1	Network location part	empty string
path	2	Hierarchical path	empty string
params	3	Parameters for last path element	empty string
query	4	Query component	empty string
fragment	5	Fragment identifier	empty string
username	n/a	User name	None
password	n/a	Password	None
hostname	n/a	Host name (lower case)	None
port	n/a	Port number as integer, if present	None

Example

from urllib.parse import urlparse

url = 'https://mail.google.com/mail/u/0/?tab=rm#inbox'
result = urlparse(url)
print(result)
print(f"Scheme: {result.scheme}")
print(f"Network Location: {result.netloc}")
print(f"Path: {result.path}")
print(f"Query: {result.query}")
print(f"Fragment: {result.fragment}")

Output

ParseResult(scheme='https', netloc='mail.google.com', path='/mail/u/0/', params='', query='tab=rm', fragment='inbox')
Scheme: https
Network Location: mail.google.com
Path: /mail/u/0/
Query: tab=rm
Fragment: inbox
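The username, password, hostname, and port attributes are derived from netloc rather than stored as tuple items. A short sketch, using a made-up URL with credentials and a port, shows how they might be read:

```python
from urllib.parse import urlparse

# Hypothetical URL with credentials and an explicit port, for illustration only.
url = "https://alice:secret@Example.COM:8443/inbox?tab=rm"
result = urlparse(url)

print(result.netloc)    # alice:secret@Example.COM:8443
print(result.username)  # alice
print(result.password)  # secret
print(result.hostname)  # example.com (always lower case)
print(result.port)      # 8443 (an int, not a string)
```

When the URL carries no credentials or port, these attributes are None, as listed in the table above.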

urlunparse() Function

The urlunparse() function constructs a URL from a tuple as returned by urlparse(). The parts argument can be any six-item iterable:

from urllib.parse import urlparse, urlunparse

url = 'https://mail.google.com/mail/u/0/?tab=rm#inbox'
parsed = urlparse(url)
reconstructed = urlunparse(parsed)
print(reconstructed)

Output

https://mail.google.com/mail/u/0/?tab=rm#inbox
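Because any six-item iterable is accepted, a URL can also be assembled from a plain tuple. The sketch below uses hypothetical component values. It also shows one possible way to change a single component: the result of urlparse() is a named tuple, so its _replace() method returns a modified copy.

```python
from urllib.parse import urlparse, urlunparse

# Order: scheme, netloc, path, params, query, fragment (values are hypothetical).
parts = ("https", "example.com", "/search", "", "q=python", "results")
built = urlunparse(parts)
print(built)  # https://example.com/search?q=python#results

# Swap out only the query, leaving the other components untouched.
updated = urlunparse(urlparse(built)._replace(query="q=urllib"))
print(updated)  # https://example.com/search?q=urllib#results
```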

urlsplit() Function

The urlsplit() function is similar to urlparse(), but does not split the params from the URL. This function returns a 5-tuple: (scheme, netloc, path, query, fragment):

from urllib.parse import urlsplit

url = 'https://mail.google.com/mail/u/0/?tab=rm#inbox'
result = urlsplit(url)
print(result)

Output

SplitResult(scheme='https', netloc='mail.google.com', path='/mail/u/0/', query='tab=rm', fragment='inbox')
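The difference between the two functions only becomes visible when the last path segment carries ';' parameters. A brief comparison, using a hypothetical URL:

```python
from urllib.parse import urlparse, urlsplit

# Hypothetical URL whose last path segment has a ";version=2" parameter.
url = "https://example.com/files/report;version=2?lang=en"

parsed = urlparse(url)
split = urlsplit(url)

# urlparse() moves the parameter into its own field.
print(parsed.path)    # /files/report
print(parsed.params)  # version=2

# urlsplit() leaves it as part of the path.
print(split.path)     # /files/report;version=2
```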

URL Quoting Functions

The URL quoting functions focus on taking program data and making it safe for use as URL components by quoting special characters and appropriately encoding non-ASCII text.

quote() Function

The quote() function replaces special characters in a string with %xx escapes. Letters, digits, and the characters '_.-~' are never quoted, and by default '/' is also left unquoted (the safe parameter defaults to '/'):

from urllib.parse import quote

url = 'https://mail.google.com/mail/u/0/?tab=rm#inbox'
quoted = quote(url)
print(quoted)

Output

https%3A//mail.google.com/mail/u/0/%3Ftab%3Drm%23inbox
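In practice quote() is usually applied to a single component rather than a whole URL, and the safe parameter controls which characters are left alone. The sketch below uses made-up strings to compare the default behaviour, an empty safe set, and the related quote_plus() function, which encodes spaces as '+':

```python
from urllib.parse import quote, quote_plus

# Hypothetical path fragment containing a slash and a space.
segment = "reports/2024 Q1.pdf"

default_quoted = quote(segment)          # "/" is kept because safe="/"
fully_quoted = quote(segment, safe="")   # "/" is escaped as well
form_quoted = quote_plus("2024 Q1 & Q2") # spaces become "+", "&" is escaped

print(default_quoted)  # reports/2024%20Q1.pdf
print(fully_quoted)    # reports%2F2024%20Q1.pdf
print(form_quoted)     # 2024+Q1+%26+Q2
```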

unquote() Function

The unquote() function replaces %xx escapes with their single-character equivalents:

from urllib.parse import quote, unquote

url = 'https://mail.google.com/mail/u/0/?tab=rm#inbox'
quoted = quote(url)
unquoted = unquote(quoted)
print(f"Original: {url}")
print(f"Quoted: {quoted}")
print(f"Unquoted: {unquoted}")

Output

Original: https://mail.google.com/mail/u/0/?tab=rm#inbox
Quoted: https%3A//mail.google.com/mail/u/0/%3Ftab%3Drm%23inbox
Unquoted: https://mail.google.com/mail/u/0/?tab=rm#inbox
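Escapes are decoded as UTF-8 by default, so multi-byte sequences collapse into a single character. The following sketch, with illustrative input strings, also contrasts unquote() with unquote_plus(), which additionally turns '+' into a space:

```python
from urllib.parse import unquote, unquote_plus

# "%C3%A9" is the UTF-8 encoding of "é".
word = unquote("caf%C3%A9")

# unquote() leaves "+" alone; unquote_plus() treats it as a space.
plain = unquote("a+b%20c")
form = unquote_plus("a+b%20c")

print(word)   # café
print(plain)  # a+b c
print(form)   # a b c
```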

urlencode() Function

The urlencode() function converts a mapping object or a sequence of two-element tuples to a percent-encoded ASCII text string. The resulting string is a series of key=value pairs separated by '&' characters:

from urllib.parse import urlencode

query_params = {"name": "Rajeev", "salary": 20000, "dept": "IT"}
encoded = urlencode(query_params)
print(encoded)

Output

name=Rajeev&salary=20000&dept=IT
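The sequence-of-tuples form mentioned above can be sketched as follows. The parameter names are hypothetical. The example also uses the doseq flag, which emits one key=value pair per list element, and parse_qs(), which reverses the encoding.

```python
from urllib.parse import urlencode, parse_qs

# A sequence of two-element tuples; values are converted to strings and quoted.
pairs = [("q", "url parsing"), ("page", 2)]
from_pairs = urlencode(pairs)
print(from_pairs)  # q=url+parsing&page=2

# With doseq=True, each item of a list value gets its own key=value pair.
multi = urlencode({"tag": ["python", "urllib"]}, doseq=True)
print(multi)  # tag=python&tag=urllib

# parse_qs() turns a query string back into a dict of lists.
decoded = parse_qs(multi)
print(decoded)  # {'tag': ['python', 'urllib']}
```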

Conclusion

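The urllib.parse module provides essential tools for URL manipulation in Python. Use urlparse() or urlsplit() to break a URL into components, urlunparse() to reassemble it, quote() and unquote() to percent-encode and decode individual components, and urlencode() to build query strings.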

Nitya Raut

Updated on: 2026-03-25T05:46:09+05:30
