Article Categories
- All Categories
-
Data Structure
-
Networking
-
RDBMS
-
Operating System
-
Java
-
MS Excel
-
iOS
-
HTML
-
CSS
-
Android
-
Python
-
C Programming
-
C++
-
C#
-
MongoDB
-
MySQL
-
Javascript
-
PHP
-
Economics & Finance
Selected Reading
MongoDB query to group duplicate documents
To group duplicate documents in MongoDB, use the $group stage in the aggregation pipeline. This operation groups documents by a specified field and eliminates duplicates by creating a single document per unique value.
Syntax
db.collection.aggregate([
{ $group: { _id: "$fieldName" } }
]);
Sample Data
db.demo501.insertMany([
{ "Name": "Chris" },
{ "Name": "Bob" },
{ "Name": "Chris" },
{ "Name": "John" },
{ "Name": "Chris" },
{ "Name": "John" },
{ "Name": "David" }
]);
{
"acknowledged": true,
"insertedIds": [
ObjectId("5e8752f0987b6e0e9d18f566"),
ObjectId("5e8752f4987b6e0e9d18f567"),
ObjectId("5e8752f8987b6e0e9d18f568"),
ObjectId("5e8752fb987b6e0e9d18f569"),
ObjectId("5e8752fd987b6e0e9d18f56a"),
ObjectId("5e875301987b6e0e9d18f56b"),
ObjectId("5e875307987b6e0e9d18f56c")
]
}
Display all documents to see the duplicates ?
db.demo501.find();
{ "_id": ObjectId("5e8752f0987b6e0e9d18f566"), "Name": "Chris" }
{ "_id": ObjectId("5e8752f4987b6e0e9d18f567"), "Name": "Bob" }
{ "_id": ObjectId("5e8752f8987b6e0e9d18f568"), "Name": "Chris" }
{ "_id": ObjectId("5e8752fb987b6e0e9d18f569"), "Name": "John" }
{ "_id": ObjectId("5e8752fd987b6e0e9d18f56a"), "Name": "Chris" }
{ "_id": ObjectId("5e875301987b6e0e9d18f56b"), "Name": "John" }
{ "_id": ObjectId("5e875307987b6e0e9d18f56c"), "Name": "David" }
Group Duplicate Documents
Use $group to get unique values from the Name field ?
db.demo501.aggregate([
{ $group: { _id: "$Name" } }
]);
{ "_id": "David" }
{ "_id": "John" }
{ "_id": "Bob" }
{ "_id": "Chris" }
Count Duplicates
Add a count field to see how many duplicates exist for each name ?
db.demo501.aggregate([
{
$group: {
_id: "$Name",
count: { $sum: 1 }
}
}
]);
{ "_id": "David", "count": 1 }
{ "_id": "John", "count": 2 }
{ "_id": "Bob", "count": 1 }
{ "_id": "Chris", "count": 3 }
Conclusion
The $group stage effectively eliminates duplicates by grouping documents with the same field value. Use $sum: 1 to count occurrences of each unique value for better duplicate analysis.
Advertisements
