Find all duplicate documents in a MongoDB collection by a key field?

To find all duplicate documents in a MongoDB collection by a key field, use the aggregation framework with $group and $match stages to group by the field and filter groups with count greater than 1.

Syntax

db.collection.aggregate([
    { $group: {
        _id: { fieldName: "$fieldName" },
        documents: { $addToSet: "$_id" },
        count: { $sum: 1 }
    }},
    { $match: { count: { $gte: 2 } }},
    { $sort: { count: -1 }}
]);
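The template above can be parameterised by field name. The following is a minimal sketch in plain JavaScript; buildDuplicatePipeline and its minCount parameter are hypothetical names introduced here for illustration, not part of MongoDB. It only builds the pipeline array, which can then be passed to aggregate() in the shell.

```javascript
// Hypothetical helper (not a MongoDB API): builds the duplicate-finding
// pipeline shown above for an arbitrary field name.
function buildDuplicatePipeline(fieldName, minCount = 2) {
    return [
        { $group: {
            // Computed property name, so the group key is labelled with the field
            _id: { [fieldName]: "$" + fieldName },
            documents: { $addToSet: "$_id" },
            count: { $sum: 1 }
        }},
        // Keep only groups that contain at least minCount documents
        { $match: { count: { $gte: minCount } }},
        { $sort: { count: -1 }}
    ];
}

// In the shell this could be used as:
//   db.collection.aggregate(buildDuplicatePipeline("StudentName"));
const pipeline = buildDuplicatePipeline("StudentName");
console.log(JSON.stringify(pipeline, null, 2));
```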

Sample Data

db.findDuplicateByKeyDemo.insertMany([
    {"StudentId": 1, "StudentName": "John"},
    {"StudentId": 2, "StudentName": "Carol"},
    {"StudentId": 3, "StudentName": "Carol"},
    {"StudentId": 4, "StudentName": "John"},
    {"StudentId": 5, "StudentName": "Sam"},
    {"StudentId": 6, "StudentName": "Carol"}
]);

Display all documents to see the data:

db.findDuplicateByKeyDemo.find().pretty();
{
    "_id": ObjectId("..."),
    "StudentId": 1,
    "StudentName": "John"
}
{
    "_id": ObjectId("..."),
    "StudentId": 2,
    "StudentName": "Carol"
}
{
    "_id": ObjectId("..."),
    "StudentId": 3,
    "StudentName": "Carol"
}
{
    "_id": ObjectId("..."),
    "StudentId": 4,
    "StudentName": "John"
}
{
    "_id": ObjectId("..."),
    "StudentId": 5,
    "StudentName": "Sam"
}
{
    "_id": ObjectId("..."),
    "StudentId": 6,
    "StudentName": "Carol"
}

Find Duplicate Documents

db.findDuplicateByKeyDemo.aggregate([
    { $group: {
        _id: { StudentName: "$StudentName" },
        UIDS: { $addToSet: "$_id" },
        COUNTER: { $sum: 1 }
    }},
    { $match: {
        COUNTER: { $gte: 2 }
    }},
    { $sort: { COUNTER: -1 }},
    { $limit: 10 }
]).pretty();

The output shows that Carol appears 3 times and John appears 2 times. Sam appears only once and is filtered out:

{
    "_id": {
        "StudentName": "Carol"
    },
    "UIDS": [
        ObjectId("5c7f5b248d10a061296a3c3c"),
        ObjectId("5c7f5b438d10a061296a3c3f"),
        ObjectId("5c7f5b1f8d10a061296a3c3b")
    ],
    "COUNTER": 3
}
{
    "_id": {
        "StudentName": "John"
    },
    "UIDS": [
        ObjectId("5c7f5b2d8d10a061296a3c3d"),
        ObjectId("5c7f5b168d10a061296a3c3a")
    ],
    "COUNTER": 2
}

How It Works

  • $group groups documents by the key field (StudentName)
  • $addToSet collects all document IDs for each group
  • $sum: 1 counts documents in each group
  • $match keeps only groups with a count ≥ 2
  • $sort orders results by duplicate count (descending)
  • $limit caps the output at the first 10 groups
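To make the stages listed above concrete, here is a rough in-memory equivalent in plain JavaScript. It is only an illustration of the logic, not how MongoDB executes the pipeline; findDuplicatesByKey is a hypothetical helper, and the _id strings are placeholders standing in for real ObjectId values.

```javascript
// Same sample data as the collection above, with placeholder _id strings.
const students = [
    { _id: "id1", StudentId: 1, StudentName: "John" },
    { _id: "id2", StudentId: 2, StudentName: "Carol" },
    { _id: "id3", StudentId: 3, StudentName: "Carol" },
    { _id: "id4", StudentId: 4, StudentName: "John" },
    { _id: "id5", StudentId: 5, StudentName: "Sam" },
    { _id: "id6", StudentId: 6, StudentName: "Carol" }
];

// Hypothetical helper that mimics $group, $match, and $sort in memory.
function findDuplicatesByKey(docs, key) {
    const groups = new Map();
    for (const doc of docs) {
        const value = doc[key];
        if (!groups.has(value)) {
            groups.set(value, { _id: { [key]: value }, UIDS: [], COUNTER: 0 });
        }
        const group = groups.get(value);
        group.UIDS.push(doc._id); // like $addToSet here, since every _id is unique
        group.COUNTER += 1;       // like $sum: 1
    }
    return [...groups.values()]
        .filter(group => group.COUNTER >= 2)        // like $match
        .sort((a, b) => b.COUNTER - a.COUNTER);     // like $sort: { COUNTER: -1 }
}

const duplicates = findDuplicatesByKey(students, "StudentName");
console.log(duplicates.map(g => `${g._id.StudentName}: ${g.COUNTER}`));
// prints [ 'Carol: 3', 'John: 2' ]
```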

Conclusion

Use MongoDB aggregation with $group and $match stages to efficiently identify duplicate documents by any key field. This approach provides both the duplicate values and their document IDs for further processing.
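As one example of further processing, the collected IDs can be used to work out which documents are surplus. The sketch below is plain JavaScript and assumes you want to keep one document per group and treat the rest as removable; extraIdsFromGroups is a hypothetical helper, and the ID strings are placeholders. Because $addToSet does not guarantee element order, which document ends up being kept is arbitrary.

```javascript
// Example groups in the shape produced by the aggregation above
// (placeholder strings stand in for ObjectId values).
const duplicateGroups = [
    { _id: { StudentName: "Carol" }, UIDS: ["id2", "id6", "id3"], COUNTER: 3 },
    { _id: { StudentName: "John" },  UIDS: ["id4", "id1"],        COUNTER: 2 }
];

// Hypothetical helper: keep the first ID in each group, return all the others.
function extraIdsFromGroups(groups) {
    const extraIds = [];
    for (const group of groups) {
        // slice(1) skips the one document that is kept
        extraIds.push(...group.UIDS.slice(1));
    }
    return extraIds;
}

const extraIds = extraIdsFromGroups(duplicateGroups);
console.log(extraIds);
// prints [ 'id6', 'id3', 'id1' ]

// In the shell, the surplus documents could then be removed with, for example:
//   db.findDuplicateByKeyDemo.deleteMany({ _id: { $in: extraIds } });
// This is destructive, so review the IDs before running it.
```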

Updated on: 2026-03-15T00:05:06+05:30
