Search for documents with similar arrays in MongoDB and order by similarity value

To search for documents with similar arrays in MongoDB and order them by similarity, use an aggregation pipeline: $unwind, $match, and $group count how many elements each document's array shares with a target array, $project turns that count into a similarity ratio, and $sort ranks the results.

Syntax

db.collection.aggregate([
    { $unwind: "$arrayField" },
    { $match: { arrayField: { $in: targetArray } } },
    { $group: { _id: "$_id", matches: { $sum: 1 } } },
    { $project: { _id: 1, matches: 1, similarity: { $divide: ["$matches", targetArray.length] } } },
    { $sort: { similarity: -1 } }
]);
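
The template above can be wrapped in a small function so the field name and target array are filled in consistently. The following is a minimal sketch: buildUnwindPipeline is a hypothetical helper name, not part of MongoDB, and it only builds the stage array without contacting a server.

```javascript
// Hypothetical helper (not a MongoDB API): builds the pipeline from the
// Syntax section for a given array field and target array.
function buildUnwindPipeline(field, target) {
    const path = "$" + field; // field path, e.g. "$ListOfSubject"
    return [
        { $unwind: path },
        { $match: { [field]: { $in: target } } },
        { $group: { _id: "$_id", matches: { $sum: 1 } } },
        { $project: { _id: 1, matches: 1, similarity: { $divide: ["$matches", target.length] } } },
        { $sort: { similarity: -1 } }
    ];
}

const pipeline = buildUnwindPipeline("ListOfSubject", ["MySQL", "MongoDB", "Java"]);
console.log(JSON.stringify(pipeline, null, 2));
// In the shell, the result could be passed on as: db.demo123.aggregate(pipeline);
```

Because the target array length is read once in JavaScript, it is embedded in the pipeline as a constant before the pipeline is sent to the server.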

Sample Data

db.demo123.insertMany([
    { "ListOfSubject": ["MySQL", "MongoDB", "Java"] },
    { "ListOfSubject": ["Python", "MongoDB", "C"] },
    { "ListOfSubject": ["MySQL", "MongoDB", "C++"] }
]);
{
    "acknowledged": true,
    "insertedIds": [
        ObjectId("5e2f24ac140daf4c2a3544b8"),
        ObjectId("5e2f24cd140daf4c2a3544b9"),
        ObjectId("5e2f24ce140daf4c2a3544ba")
    ]
}

Display all documents from the collection:

db.demo123.find();
{ "_id": ObjectId("5e2f24ac140daf4c2a3544b8"), "ListOfSubject": ["MySQL", "MongoDB", "Java"] }
{ "_id": ObjectId("5e2f24cd140daf4c2a3544b9"), "ListOfSubject": ["Python", "MongoDB", "C"] }
{ "_id": ObjectId("5e2f24ce140daf4c2a3544ba"), "ListOfSubject": ["MySQL", "MongoDB", "C++"] }

Example: Find Similar Arrays

Search for documents with arrays similar to ["MySQL", "MongoDB", "Java"] and order by similarity percentage:

var subjects = ["MySQL", "MongoDB", "Java"];
db.demo123.aggregate([
    { $unwind: "$ListOfSubject" },
    { $match: { ListOfSubject: { $in: subjects } } },
    { $group: { _id: "$_id", number: { $sum: 1 } } },
    { $project: { _id: 1, number: 1, percentage: { $divide: ["$number", subjects.length] } } },
    { $sort: { percentage: -1 } }
]);
{ "_id": ObjectId("5e2f24ac140daf4c2a3544b8"), "number": 3, "percentage": 1 }
{ "_id": ObjectId("5e2f24ce140daf4c2a3544ba"), "number": 2, "percentage": 0.6666666666666666 }
{ "_id": ObjectId("5e2f24cd140daf4c2a3544b9"), "number": 1, "percentage": 0.3333333333333333 }
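
As a variation, the same ranking could be computed without $unwind by using the $setIntersection and $size operators. This is a swapped-in technique, not the approach used above, and it behaves differently in two ways: documents with no matches stay in the output with a percentage of 0, and duplicate elements are counted once. The sketch below assumes the target array is non-empty, contains no strings starting with "$", and that the field, when present, holds an array. buildSetPipeline is a hypothetical helper name.

```javascript
// Hypothetical helper (not a MongoDB API): set-based similarity pipeline.
function buildSetPipeline(field, target) {
    // Treat a missing field as an empty array so $size never receives null.
    const values = { $ifNull: ["$" + field, []] };
    return [
        // Count the distinct elements shared with the target array.
        { $project: { number: { $size: { $setIntersection: [values, target] } } } },
        // Divide by the target length to get the similarity ratio.
        { $project: { number: 1, percentage: { $divide: ["$number", target.length] } } },
        { $sort: { percentage: -1 } }
    ];
}

const setPipeline = buildSetPipeline("ListOfSubject", ["MySQL", "MongoDB", "Java"]);
console.log(JSON.stringify(setPipeline, null, 2));
// In the shell, the result could be passed on as: db.demo123.aggregate(setPipeline);
```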

How It Works

  • $unwind emits one document per array element
  • $match keeps only the elements that appear in the target array
  • $group counts the remaining elements per original document (_id)
  • $project divides that count by the target array length to get the similarity ratio
  • $sort orders the results by similarity (highest first)
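
The stages above can be traced in plain JavaScript, without a database, to see where each number in the output comes from. This is only an illustrative sketch of the logic, not how MongoDB executes the pipeline; the short _id strings are stand-ins for the ObjectId values.

```javascript
// Sample documents from this article, with shortened stand-in ids (assumption).
const docs = [
    { _id: "b8", ListOfSubject: ["MySQL", "MongoDB", "Java"] },
    { _id: "b9", ListOfSubject: ["Python", "MongoDB", "C"] },
    { _id: "ba", ListOfSubject: ["MySQL", "MongoDB", "C++"] }
];
const subjects = ["MySQL", "MongoDB", "Java"];

// $unwind: one record per array element.
const unwound = docs.flatMap(d =>
    d.ListOfSubject.map(s => ({ _id: d._id, ListOfSubject: s })));

// $match: keep only elements present in the target array.
const matched = unwound.filter(r => subjects.includes(r.ListOfSubject));

// $group: count the surviving records per original document.
const counts = new Map();
for (const r of matched) counts.set(r._id, (counts.get(r._id) || 0) + 1);

// $project and $sort: turn counts into ratios and rank them, highest first.
const ranked = [...counts]
    .map(([_id, number]) => ({ _id, number, percentage: number / subjects.length }))
    .sort((a, b) => b.percentage - a.percentage);

console.log(ranked); // order of ids: b8 (3 matches), ba (2), b9 (1)
```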

Conclusion

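Use MongoDB's aggregation pipeline to find documents with similar arrays by counting shared elements and dividing by the length of the target array. In the example, the first document scores 1 (3 of 3 elements matched), the third scores about 0.67 (2 of 3), and the second about 0.33 (1 of 3). Documents that share no elements with the target are dropped by $match and do not appear in the results.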

Updated on: 2026-03-15T02:11:46+05:30
