mongodb - $lookup across 2 collections to check whether an ID exists takes too much time on Mongo - Stack Overflow

I am using a Quartz + Spring Batch cluster to read documents from Mongo for bulk processing. Since I cannot tag the original document as read (by adding a read flag), I add the ID of each read document to a migration collection and compare the IDs across the two collections using a $lookup, with the code below in the aggregation pipeline:

{
  $lookup: {
    from: 'migration_coll',
    // Join on _id in both collections.
    localField: '_id',
    foreignField: '_id',
    // Return only _id from the joined documents.
    pipeline: [
      { $project: { _id: 1 } }
    ],
    as: 'migrtedDocuments'
  }
}

I typed the snippet above by hand to give an idea of how I use the pipeline: it looks up matching _id values in the migration collection and projects only _id, to increase speed. However, with a large collection the query is really slow. With more than 2 million records, it takes more than 10 to 15 seconds to return a count.
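For context, here is a minimal sketch of what the complete count pipeline might look like around that stage. Only the $lookup stage comes from the snippet above. The $match and $count stages, the helper name buildUnprocessedCountPipeline, and the output field name unprocessed are assumptions added for illustration. Combining localField/foreignField with pipeline in one $lookup also assumes MongoDB 5.0 or later.

```javascript
// Hypothetical helper: builds the aggregation pipeline as a plain array.
// Only the $lookup stage is taken from the question; the $match and $count
// stages are assumptions about how unprocessed documents are counted.
function buildUnprocessedCountPipeline(migrationCollection) {
  return [
    {
      $lookup: {
        from: migrationCollection,
        localField: '_id',
        foreignField: '_id',
        // Keep the joined documents as small as possible.
        pipeline: [{ $project: { _id: 1 } }],
        as: 'migrtedDocuments'
      }
    },
    // Keep only documents with no matching _id in the migration collection.
    { $match: { migrtedDocuments: { $size: 0 } } },
    // Emit a single document of the form { unprocessed: <number> }.
    { $count: 'unprocessed' }
  ];
}

const pipeline = buildUnprocessedCountPipeline('migration_coll');
console.log(JSON.stringify(pipeline, null, 2));
```

In mongosh, the resulting array could be passed to db.source_coll.aggregate(pipeline), where source_coll is a placeholder for the source collection, which is not named here.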

Questions:

  1. Is there a better way to do this?
  2. What else can I use to keep track of processed documents if I can't modify the existing documents?

I am kind of stuck on this issue. Any help is appreciated.
