Search for documents with similar arrays in MongoDB and order by similarity value


Let us create a collection with documents −

> db.demo123.insertOne({"ListOfSubject":['MySQL', 'MongoDB', 'Java']});
{
   "acknowledged" : true,
   "insertedId" : ObjectId("5e2f24ac140daf4c2a3544b8")
}
> db.demo123.insertOne({"ListOfSubject":['Python', 'MongoDB', 'C']});
{
   "acknowledged" : true,
   "insertedId" : ObjectId("5e2f24cd140daf4c2a3544b9")
}
> db.demo123.insertOne({"ListOfSubject":['MySQL', 'MongoDB', 'C++']});
{
   "acknowledged" : true,
   "insertedId" : ObjectId("5e2f24ce140daf4c2a3544ba")
}

Display all documents from the collection with the help of the find() method −

> db.demo123.find();

This will produce the following output −

{ "_id" : ObjectId("5e2f24ac140daf4c2a3544b8"), "ListOfSubject" : [ "MySQL", "MongoDB", "Java" ] }
{ "_id" : ObjectId("5e2f24cd140daf4c2a3544b9"), "ListOfSubject" : [ "Python", "MongoDB", "C" ] }
{ "_id" : ObjectId("5e2f24ce140daf4c2a3544ba"), "ListOfSubject" : [ "MySQL", "MongoDB", "C++" ] }

Following is the query to search for documents whose arrays are similar to a given array and to order them by similarity, highest first −

> var subjects = ['MySQL', 'MongoDB', 'Java'];
> db.demo123.aggregate([
...    {$unwind: "$ListOfSubject"},
...    {$match: {ListOfSubject:{ $in:subjects}}},
...    {$group: {_id: "$_id", number: {$sum: 1}}},
...    {$project: {_id: 1, number: 1, percentage: {$divide: ["$number",subjects.length]}}},
...    {$sort: {percentage: -1}}
... ]);

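This will produce the following output, where number is the count of array elements that also appear in the search array and percentage is that count divided by the length of the search array (a fraction from 0 to 1) −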

{ "_id" : ObjectId("5e2f24ac140daf4c2a3544b8"), "number" : 3, "percentage" : 1 }
{ "_id" : ObjectId("5e2f24ce140daf4c2a3544ba"), "number" : 2, "percentage" : 0.6666666666666666 }
{ "_id" : ObjectId("5e2f24cd140daf4c2a3544b9"), "number" : 1, "percentage" : 0.3333333333333333 }
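
To see how the stages of the pipeline above arrive at these numbers, the same calculation can be sketched in plain JavaScript with no database involved. This is only an illustration under stated assumptions: the docs array is an in-memory copy of the three documents with the ObjectId values written as plain strings, and rankBySimilarity is a hypothetical helper name, not part of any MongoDB API.

```javascript
// In-memory copy of the demo123 documents (assumption: _id values kept as plain strings).
const docs = [
  { _id: "5e2f24ac140daf4c2a3544b8", ListOfSubject: ["MySQL", "MongoDB", "Java"] },
  { _id: "5e2f24cd140daf4c2a3544b9", ListOfSubject: ["Python", "MongoDB", "C"] },
  { _id: "5e2f24ce140daf4c2a3544ba", ListOfSubject: ["MySQL", "MongoDB", "C++"] },
];
const subjects = ["MySQL", "MongoDB", "Java"];

// Hypothetical helper that mirrors the pipeline stages one by one.
function rankBySimilarity(docs, subjects) {
  const counts = new Map();
  for (const doc of docs) {
    // $unwind: treat every array element as its own record.
    for (const subject of doc.ListOfSubject) {
      // $match with $in: drop elements that are not in the search array.
      if (!subjects.includes(subject)) continue;
      // $group with $sum: 1 counts the surviving elements per original _id.
      counts.set(doc._id, (counts.get(doc._id) || 0) + 1);
    }
  }
  return [...counts.entries()]
    // $project: divide the match count by the length of the search array.
    .map(([_id, number]) => ({ _id, number, percentage: number / subjects.length }))
    // $sort: highest similarity first.
    .sort((a, b) => b.percentage - a.percentage);
}

const ranked = rankBySimilarity(docs, subjects);
console.log(ranked);
```

Note that a document sharing no element with the search array never reaches the grouping step, so it is missing from the result instead of appearing with a percentage of 0. The same holds for the aggregation query, because $match discards all of that document's unwound elements.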
raja
Published on 31-Mar-2020 11:34:42