Python Data Persistence - PyMongo module

MongoDB is a document oriented NoSQL database. It is a cross platform database distributed under server side public license. It uses JSON like documents as schema.

In order to provide capability to store huge data, more than one physical servers (called shards) are interconnected, so that a horizontal scalability is achieved. MongoDB database consists of documents.


A document is analogous to a row in a table of relational database. However, it doesn't have a particular schema. Document is a collection of key-value pairs - similar to dictionary. However, number of k-v pairs in each document may vary. Just as a table in relational database has a primary key, document in MongoDB database has a special key called "_id".

Before we see how MongoDB database is used with Python, let us briefly understand how to install and start MongoDB. Community and commercial version of MongoDB is available. Community version can be downloaded from

Assuming that MongoDB is installed in c:\mongodb, the server can be invoked using following command.


The MongoDB server is active at port number 22017 by default. Databases are stored in data/bin folder by default, although the location can be changed by –dbpath option.

MongoDB has its own set of commands to be used in a MongoDB shell. To invoke shell, use Mongo command.


A shell prompt similar to MySQL or SQLite shell prompt, appears before which native NoSQL commands can be executed. However, we are interested in connecting MongoDB database to Python.

PyMongo module has been developed by MongoDB Inc itself to provide Python programming interface. Use well known pip utility to install PyMongo.

pip3 install pymongo

Assuming that MongoDB server is up and running (with mongod command) and is listening at port 22017, we first need to declare a MongoClient object. It controls all transactions between Python session and the database.

from pymongo import MongoClient

Use this client object to establish connection with MongoDB server.

client = MongoClient('localhost', 27017)

A new database is created with following command.


MongoDB database can have many collections, similar to tables in a relational database. A Collection object is created by Create_collection() function.


Now, we can add one or more documents in the collection as follows −

from pymongo import MongoClient
studentlist=[{'studentID':1,'Name':'Juhi','age':20, 'marks'=100},
{'studentID':2,'Name':'dilip','age':20, 'marks'=110},
{'studentID':3,'Name':'jeevan','age':24, 'marks'=145}]

To retrieve the documents (similar to SELECT query), we should use find() method. It returns a cursor with the help of which all documents can be obtained.

for doc in docs:
   print (doc['Name'], doc['age'], doc['marks'] )

To find a particular document instead of all of them in a collection, we need to apply filter to find() method. The filter uses logical operators. MongoDB has its own set of logical operators as below −

Sr.No MongoDB operator & Traditional logical operator


equal to (==)



greater than (>)



greater than or equal to (>=)



if equal to any value in array



less than (<)



less than or equal to (<=)



not equal to (!=)



if not equal to any value in array

For example, we are interested in obtaining list of students older than 21 years. Using $gt operator in the filter for find() method as follows −

for doc in docs:
   print (doc.get('Name'), doc.get('age'), doc.get('marks'))

PyMongo module provides update_one() and update_many() methods for modifying one document or more than one documents satisfying a specific filter expression.

Let us update marks attribute of a document in which name is Juhi.

from pymongo import MongoClient
doc=db.students.find_one({'Name': 'Juhi'})
db['students'].update_one({'Name': 'Juhi'},{"$set":{'marks':150}})