How to remove duplicate records in MongoDB 3.x?

To remove duplicate records in MongoDB, use the aggregation pipeline with $group and $addToSet operators to identify duplicates, then remove them using deleteMany().

Syntax

db.collection.aggregate([
    {
        $group: {
            _id: { field: "$field" },
            duplicateIds: { $addToSet: "$_id" },
            count: { $sum: 1 }
        }
    },
    { $match: { count: { $gt: 1 } } }
]);
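
The field key in the syntax above is a placeholder. When a duplicate is defined by more than one field, the _id of the $group stage can hold several keys. The following is a minimal sketch in plain JavaScript, assuming a hypothetical helper named buildDuplicatePipeline (it is not part of MongoDB) and an illustrative LastName field that the sample data in this article does not contain:

```javascript
// Hypothetical helper (not a MongoDB API): builds the duplicate-finding
// pipeline for one or more field names.
function buildDuplicatePipeline(fields) {
    var groupKey = {};
    fields.forEach(function (name) {
        groupKey[name] = "$" + name; // e.g. { FirstName: "$FirstName" }
    });
    return [
        {
            $group: {
                _id: groupKey,
                duplicateIds: { $addToSet: "$_id" },
                count: { $sum: 1 }
            }
        },
        // Keep only groups that contain more than one document
        { $match: { count: { $gt: 1 } } }
    ];
}

// "LastName" is an assumed field used only for illustration.
var pipeline = buildDuplicatePipeline(["FirstName", "LastName"]);
console.log(JSON.stringify(pipeline, null, 2));
```

The returned array is an ordinary pipeline, so it could be passed to db.collection.aggregate(pipeline) in the shell.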

Sample Data

db.demo438.insertMany([
    { "FirstName": "Chris" },
    { "FirstName": "David" },
    { "FirstName": "Chris" },
    { "FirstName": "Bob" },
    { "FirstName": "David" }
]);
{
    "acknowledged": true,
    "insertedIds": [
        ObjectId("5e775c37bbc41e36cc3caea1"),
        ObjectId("5e775c3dbbc41e36cc3caea2"),
        ObjectId("5e775c40bbc41e36cc3caea3"),
        ObjectId("5e775c44bbc41e36cc3caea4"),
        ObjectId("5e775c47bbc41e36cc3caea5")
    ]
}

Display all documents from the collection:

db.demo438.find();
{ "_id": ObjectId("5e775c37bbc41e36cc3caea1"), "FirstName": "Chris" }
{ "_id": ObjectId("5e775c3dbbc41e36cc3caea2"), "FirstName": "David" }
{ "_id": ObjectId("5e775c40bbc41e36cc3caea3"), "FirstName": "Chris" }
{ "_id": ObjectId("5e775c44bbc41e36cc3caea4"), "FirstName": "Bob" }
{ "_id": ObjectId("5e775c47bbc41e36cc3caea5"), "FirstName": "David" }

Method 1: Identify Duplicate Groups

First, group the documents by the field and count each group. Any group with a count greater than 1 contains duplicates:

db.demo438.aggregate([
    {
        $group: {
            _id: { FirstName: "$FirstName" },
            duplicateIds: { $addToSet: "$_id" },
            count: { $sum: 1 }
        }
    }
]);
{ "_id": { "FirstName": "David" }, "duplicateIds": [ ObjectId("5e775c47bbc41e36cc3caea5"), ObjectId("5e775c3dbbc41e36cc3caea2") ], "count": 2 }
{ "_id": { "FirstName": "Bob" }, "duplicateIds": [ ObjectId("5e775c44bbc41e36cc3caea4") ], "count": 1 }
{ "_id": { "FirstName": "Chris" }, "duplicateIds": [ ObjectId("5e775c40bbc41e36cc3caea3"), ObjectId("5e775c37bbc41e36cc3caea1") ], "count": 2 }
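
To see what the grouping computes without a database, here is a minimal in-memory sketch in plain JavaScript. It imitates $group with $addToSet and $sum, then applies the $match filter from the Syntax section so that only duplicate groups remain. The function name findDuplicateGroups and the short string IDs are assumptions made for illustration; they stand in for the aggregation stages and the ObjectId values.

```javascript
// In-memory imitation of $group + $addToSet + $sum, followed by $match.
// Short string ids stand in for ObjectId values.
function findDuplicateGroups(docs, field) {
    var groups = Object.create(null);
    docs.forEach(function (doc) {
        var key = doc[field];
        if (!groups[key]) {
            groups[key] = { _id: {}, duplicateIds: [], count: 0 };
            groups[key]._id[field] = key;
        }
        // _id values are unique, so push behaves like $addToSet here
        groups[key].duplicateIds.push(doc._id);
        groups[key].count += 1; // $sum: 1
    });
    // $match: { count: { $gt: 1 } }
    return Object.keys(groups)
        .map(function (k) { return groups[k]; })
        .filter(function (g) { return g.count > 1; });
}

var sample = [
    { _id: "a1", FirstName: "Chris" },
    { _id: "a2", FirstName: "David" },
    { _id: "a3", FirstName: "Chris" },
    { _id: "a4", FirstName: "Bob" },
    { _id: "a5", FirstName: "David" }
];
var duplicateGroups = findDuplicateGroups(sample, "FirstName");
console.log(JSON.stringify(duplicateGroups));
```

With this sample, only the Chris and David groups are returned; Bob is filtered out because his group has a count of 1.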

Method 2: Remove Duplicates (Keep First Occurrence)

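Remove duplicate documents while keeping the first occurrence of each unique value. Because $addToSet does not guarantee the order of the collected IDs, sort them first so that the smallest _id (for default ObjectId values, normally the earliest-created document) is the one that is kept:

db.demo438.aggregate([
    {
        $group: {
            _id: { FirstName: "$FirstName" },
            duplicateIds: { $addToSet: "$_id" },
            count: { $sum: 1 }
        }
    },
    { $match: { count: { $gt: 1 } } }
]).forEach(function(doc) {
    // Sort the IDs (compared by their string form) so the smallest _id comes first
    doc.duplicateIds.sort();
    // Drop the first ID from the array so that document is kept
    doc.duplicateIds.shift();
    // Delete the remaining documents in this group
    db.demo438.deleteMany({ _id: { $in: doc.duplicateIds } });
});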

Verify Result

db.demo438.find();
{ "_id": ObjectId("5e775c37bbc41e36cc3caea1"), "FirstName": "Chris" }
{ "_id": ObjectId("5e775c3dbbc41e36cc3caea2"), "FirstName": "David" }
{ "_id": ObjectId("5e775c44bbc41e36cc3caea4"), "FirstName": "Bob" }
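
Which document survives depends on the order of the ID array at the moment shift() is called, and $addToSet does not guarantee that order. The following minimal sketch in plain JavaScript shows the selection step on its own. The helper name idsToDelete and the short string IDs are assumptions for illustration; in the shell the array would hold ObjectId values.

```javascript
// Hypothetical helper: given one duplicate group, return the ids to delete,
// keeping the smallest id. Strings stand in for ObjectId values.
function idsToDelete(group) {
    var ids = group.duplicateIds.slice(); // copy, so the input is not modified
    ids.sort();                           // deterministic order: smallest id first
    ids.shift();                          // drop the id of the document to keep
    return ids;
}

// The ids are deliberately listed out of insertion order, as $addToSet may return them.
var davidGroup = { _id: { FirstName: "David" }, duplicateIds: ["a5", "a2"], count: 2 };
console.log(idsToDelete(davidGroup));
```

Here the helper returns only a5, so the document with the smaller ID a2 is kept. Without the sort, shift() would have removed a5 from the array and a2 would have been deleted instead.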

Key Points

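  • $addToSet collects the _id values of every document in each group; the order of the resulting array is not guaranteed
  • shift() removes the first element from the array, so that document is excluded from the delete and kept; sort the array first if a specific document (such as the earliest) must survive
  • $match keeps only the groups with a count greater than 1 (the actual duplicates)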

Conclusion

Use aggregation to group documents by the field values that define a duplicate, then remove the extras with deleteMany(). This approach keeps one document for each unique value and removes the rest.

Updated on: 2026-03-15T02:59:31+05:30
