Pre-processing: The system prepares the data by dividing it into smaller, manageable chunks. These chunks are then encoded into numerical representations (embeddings) for efficient retrieval.
Prompt Engineering: The query you plan to send to the LLM is analyzed, and query expansion techniques may be applied to broaden the search scope.
Retrieval: The refined query is sent to the retrieval system, which uses a combination of dense retrieval, sparse retrieval (in the case of hybrid search), and self-query retrieval (if applicable) to find the most relevant information chunks.
Re-ranking: The retrieved information is then re-ordered based on its relevance to the prompt, ensuring the LLM focuses on the most valuable content.
Generation: Finally, the LLM takes the top-ranked information chunks and uses them in conjunction with its internal knowledge to generate the final text output.
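The five stages above can be sketched end to end in plain Python. This is a minimal, illustrative toy, not a production implementation: the bag-of-words "embedding", the synonym table, and the overlap-based re-ranker are stand-ins I am assuming in place of real neural embedding models, learned query expansion, and cross-encoder re-rankers, and the final step only builds the prompt a real system would send to an LLM.

```python
import math
from collections import Counter

def chunk(text, size=40):
    """Pre-processing: split raw text into fixed-size word chunks."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def embed(text):
    """Encode a chunk as a bag-of-words vector (stand-in for a neural embedding)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def expand_query(query, synonyms):
    """Prompt engineering: broaden the query with synonyms (hypothetical table)."""
    terms = query.lower().split()
    extra = [s for t in terms for s in synonyms.get(t, [])]
    return " ".join(terms + extra)

def retrieve(query, chunks, k=3, alpha=0.5):
    """Hybrid retrieval: blend dense (cosine) and sparse (term-overlap) scores."""
    qv = embed(query)
    qterms = set(query.lower().split())
    scored = []
    for c in chunks:
        dense = cosine(qv, embed(c))
        sparse = len(qterms & set(c.lower().split())) / len(qterms) if qterms else 0.0
        scored.append((alpha * dense + (1 - alpha) * sparse, c))
    return [c for _, c in sorted(scored, reverse=True)[:k]]

def rerank(query, candidates):
    """Re-ranking: re-order candidates by exact-term overlap with the query
    (a real system would use a cross-encoder here)."""
    qterms = set(query.lower().split())
    return sorted(candidates,
                  key=lambda c: len(qterms & set(c.lower().split())),
                  reverse=True)

def build_prompt(query, contexts):
    """Generation: pack the top-ranked chunks into the prompt sent to the LLM."""
    ctx = "\n".join(f"- {c}" for c in contexts)
    return f"Answer using the context below.\nContext:\n{ctx}\nQuestion: {query}"
```

In a real pipeline each function would be backed by heavier machinery (an embedding model, a vector index, a re-ranking model, an LLM call), but the data flow between the stages is exactly this: chunk, embed, expand, retrieve, re-rank, generate.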