@홍재희 Hello, I am a teammate of Yutong and will assist you further with this matter.
After reading your post, a few possible causes come to mind. Splitting your data into smaller JSON files was a good move to avoid parsing issues, but there may still be problems with how the data is indexed and retrieved. If the vector store is not indexing all of the data correctly, it may not be able to retrieve the most recent data accurately. I am also concerned that the instruction to match 100% of the criteria may be too strict, leading to incomplete or incorrect results when the criteria are not met perfectly. You are correct that the LLM may have trouble with complex queries, especially ones that involve sorting or aggregating data. Lastly, retrieval could be returning only part of the data, or the data itself could contain inconsistent values.
With those risks called out, here are a few workarounds to try. First, can you reindex your stored data? This would ensure that all of the data is indexed correctly, and it would give you another opportunity to review the indexing logs and confirm that the process completes without errors.
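One way to confirm that a reindex picked up everything is to compare the record IDs in your source JSON files against the IDs the index reports. The sketch below is only an illustration: it assumes each file holds a JSON array of objects with an `id` field, and the names `load_source_ids` and `find_missing_ids` are hypothetical helpers, not part of any product API. How you obtain the set of indexed IDs depends on your vector store, so it is left as an input here.

```python
import json
from pathlib import Path


def load_source_ids(data_dir: str, id_field: str = "id") -> set[str]:
    """Collect the record IDs from every *.json file in data_dir.

    Assumption: each file contains a JSON array of objects that all
    carry an `id_field` key.
    """
    ids: set[str] = set()
    for path in sorted(Path(data_dir).glob("*.json")):
        with path.open(encoding="utf-8") as f:
            for record in json.load(f):
                ids.add(str(record[id_field]))
    return ids


def find_missing_ids(source_ids: set[str], indexed_ids: set[str]) -> list[str]:
    """Return the IDs present in the source files but absent from the index."""
    return sorted(source_ids - indexed_ids)
```

If `find_missing_ids` returns a non-empty list after reindexing, those records are a good place to start looking in the indexing logs.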
Next, can you relax your query criteria? See whether allowing partial matches that still meet the essential criteria gives you better results.
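To make the idea of a relaxed match concrete, here is a minimal sketch that requires every essential criterion but accepts records meeting only part of the optional ones. It assumes records and criteria are plain dictionaries compared by exact equality; the function names and the `min_optional` threshold are hypothetical and only illustrate the logic you might describe in your instructions or apply as a post-filter.

```python
def match_score(record: dict, criteria: dict) -> float:
    """Fraction of the criteria that the record satisfies exactly."""
    if not criteria:
        return 1.0
    hits = sum(1 for key, value in criteria.items() if record.get(key) == value)
    return hits / len(criteria)


def relaxed_filter(records, essential, optional, min_optional=0.5):
    """Keep records that meet every essential criterion and at least
    `min_optional` of the optional ones, best matches first."""
    kept = []
    for record in records:
        if match_score(record, essential) < 1.0:
            continue  # an essential criterion failed, so drop the record
        score = match_score(record, optional)
        if score >= min_optional:
            kept.append((score, record))
    # Sort on the score only; ties keep their original order.
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [record for _, record in kept]
```

Setting `min_optional=1.0` reproduces the strict 100% behavior, which makes it easy to compare the two modes on the same data.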
Lastly, if the above does not help, you will need to add debugging and logging to track how the queries are processed and where they might be failing. This will help both you and us understand whether the issue is with query execution or with data retrieval.
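As a starting point for that logging, the sketch below wraps the two stages separately so the logs show which one fails. It uses only Python's standard `logging` module; `retrieve` and `answer` are hypothetical callables standing in for your own retrieval and generation steps, since I do not know how your pipeline is structured.

```python
import logging
import time

logger = logging.getLogger("query_debug")


def run_logged_query(query: str, retrieve, answer):
    """Run retrieval, then answering, and log each stage.

    Assumptions: `retrieve(query)` returns a list of chunks and
    `answer(query, chunks)` returns the final result.
    """
    logger.info("query received: %r", query)

    start = time.perf_counter()
    try:
        chunks = retrieve(query)
    except Exception:
        logger.exception("retrieval failed for query %r", query)
        raise
    logger.info("retrieved %d chunks in %.3fs",
                len(chunks), time.perf_counter() - start)
    if not chunks:
        logger.warning("no chunks retrieved; check retrieval and indexing")

    start = time.perf_counter()
    try:
        result = answer(query, chunks)
    except Exception:
        logger.exception("answer generation failed for query %r", query)
        raise
    logger.info("answer produced in %.3fs", time.perf_counter() - start)
    return result
```

Calling `logging.basicConfig(level=logging.INFO)` once at startup is enough to see these messages on the console; sharing that output in your reply would help us narrow things down.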
We look forward to your reply.