Find duplicate records in MongoDB?

To find duplicate records in MongoDB, use the aggregate framework with $group, $match, and $project stages. The aggregation pipeline groups documents by field values, counts occurrences, and filters results where count is greater than 1.

Syntax

db.collection.aggregate([
    { $group: { "_id": "$fieldName", "count": { $sum: 1 } } },
    { $match: { "_id": { $ne: null }, "count": { $gt: 1 } } },
    { $project: { "fieldName": "$_id", "_id": 0 } }
]);

Sample Data

db.findDuplicateRecordsDemo.insertMany([
    { "StudentFirstName": "John" },
    { "StudentFirstName": "John" },
    { "StudentFirstName": "Carol" },
    { "StudentFirstName": "Sam" },
    { "StudentFirstName": "Carol" },
    { "StudentFirstName": "Mike" }
]);
{
    "acknowledged": true,
    "insertedIds": [
        ObjectId("5c8a330293b406bd3df60e01"),
        ObjectId("5c8a330493b406bd3df60e02"),
        ObjectId("5c8a330c93b406bd3df60e03"),
        ObjectId("5c8a331093b406bd3df60e04"),
        ObjectId("5c8a331593b406bd3df60e05"),
        ObjectId("5c8a331e93b406bd3df60e06")
    ]
}

Display All Documents

db.findDuplicateRecordsDemo.find();
{ "_id": ObjectId("5c8a330293b406bd3df60e01"), "StudentFirstName": "John" }
{ "_id": ObjectId("5c8a330493b406bd3df60e02"), "StudentFirstName": "John" }
{ "_id": ObjectId("5c8a330c93b406bd3df60e03"), "StudentFirstName": "Carol" }
{ "_id": ObjectId("5c8a331093b406bd3df60e04"), "StudentFirstName": "Sam" }
{ "_id": ObjectId("5c8a331593b406bd3df60e05"), "StudentFirstName": "Carol" }
{ "_id": ObjectId("5c8a331e93b406bd3df60e06"), "StudentFirstName": "Mike" }

Find Duplicate Records

db.findDuplicateRecordsDemo.aggregate([
    { "$group": { "_id": "$StudentFirstName", "count": { "$sum": 1 } } },
    { "$match": { "_id": { "$ne": null }, "count": { "$gt": 1 } } },
    { "$project": { "StudentFirstName": "$_id", "_id": 0 } }
]);
{ "StudentFirstName": "Carol" }
{ "StudentFirstName": "John" }

How It Works

  • $group: Groups documents by StudentFirstName and counts occurrences
  • $match: Filters groups where count > 1 (duplicates) and excludes null values
  • $project: Reshapes output to show only the duplicate field names

Conclusion

Use MongoDB's aggregation pipeline to identify duplicate records by grouping, counting, and filtering. This approach efficiently finds any field values that appear more than once in your collection.

Updated on: 2026-03-15T00:07:25+05:30

2K+ Views

Kickstart Your Career

Get certified by completing the course

Get Started
Advertisements