RAG Notes
RAG pipeline: a retrieval system (vector databases or search engines such as Pinecone, Weaviate, Elasticsearch) supplies context that is used to disambiguate the question and enrich the answer.
RAG does not help with training or fine-tuning the model itself.
RAG only uses the retrieved information to disambiguate the context and enrich the answer, for the one-time purpose of answering the current question.
The model itself may be fine-tuned on how to summarize the results of the retrieval system to improve the quality of the answer.
However, RAG itself does not teach the trained model any new facts.
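A minimal sketch of this flow, assuming a toy in-memory knowledge base and a naive keyword-overlap retriever standing in for a real vector database; the KNOWLEDGE_BASE, retrieve, and build_prompt names are illustrative. The point is that retrieved text only enters the prompt, while the model's weights stay untouched.

```python
# Minimal RAG sketch: retrieve relevant text, then prepend it to the prompt.
# The knowledge base, scoring, and prompt wording are hypothetical placeholders;
# a real system would use a vector database and an actual LLM API.

KNOWLEDGE_BASE = [
    "Pinecone and Weaviate are managed vector databases.",
    "Elasticsearch supports keyword and vector search.",
    "RAG enriches prompts with retrieved context; it does not update model weights.",
]

def retrieve(query: str, top_k: int = 2) -> list[str]:
    """Rank documents by naive word overlap with the query (stand-in for vector search)."""
    query_terms = set(query.lower().split())
    scored = [(len(query_terms & set(doc.lower().split())), doc) for doc in KNOWLEDGE_BASE]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:top_k] if score > 0]

def build_prompt(query: str) -> str:
    """Inject retrieved passages into the prompt; the model itself is untouched."""
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer using only the context above."

if __name__ == "__main__":
    prompt = build_prompt("Does RAG update the model weights?")
    print(prompt)  # This prompt would be sent to an already-trained LLM as-is.
```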
Improved Accuracy: by providing up-to-date, domain-specific knowledge at answer time.
Smaller Model Sizes: knowledge is offloaded to external systems rather than memorized inside the model's weights.
Dynamic Adaptability: RAG allows the system to adjust to new information without retraining the core model. (RAG does not retrain the model; it only uses the retrieved knowledge to enrich the answer.)
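A hedged sketch of the dynamic-adaptability point: new facts are added to the retrieval index, not to the model. The InMemoryIndex class and its documents are hypothetical stand-ins for an upsert into a vector database such as Pinecone, Weaviate, or Elasticsearch.

```python
# New knowledge is indexed, not learned: adding a document is cheap and immediate,
# and involves no gradient updates or retraining of the model.

class InMemoryIndex:
    def __init__(self) -> None:
        self.documents: list[str] = []

    def add(self, doc: str) -> None:
        # Index update only; model weights are never touched.
        self.documents.append(doc)

    def search(self, query: str, top_k: int = 2) -> list[str]:
        terms = set(query.lower().split())
        ranked = sorted(self.documents,
                        key=lambda d: len(terms & set(d.lower().split())),
                        reverse=True)
        return ranked[:top_k]

index = InMemoryIndex()
index.add("The 2023 pricing page lists the Pro plan at $20/month.")
index.add("The 2024 pricing page lists the Pro plan at $25/month.")  # new fact, zero retraining
print(index.search("What does the Pro plan cost in 2024?"))
```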
Chatbots for customer support. Technical Q&A systems relying on up-to-date documentation.
Personalized Applications: AI assistants grounded in user-specific data, such as emails, documents, or settings.
Generating reports or summaries based on current, retrieved information (see the prompt sketch below).
Note: the model is already trained on the general task of summarization; RAG only supplies the material to summarize.
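An illustrative prompt template for the report/summary use case, assuming the snippets have already been retrieved; the snippet text and the wording of the instructions are made up for the example.

```python
# The already-trained model is asked to summarize retrieved snippets;
# the template asks it to cite sources and stay within the listed facts.

retrieved_snippets = [
    "Q3 support tickets rose 12% month over month.",
    "The new documentation search reduced average resolution time by 8%.",
]

def summarization_prompt(snippets: list[str]) -> str:
    numbered = "\n".join(f"[{i + 1}] {s}" for i, s in enumerate(snippets))
    return (
        "Summarize the following retrieved information into a short report. "
        "Cite sources by their [number] and do not add facts that are not listed.\n\n"
        f"{numbered}"
    )

print(summarization_prompt(retrieved_snippets))
```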
Retrieval Quality: answer quality depends directly on the quality of the retrieval system.
Latency: the retrieval step adds latency to every request.
Hallucination Risk: the LLM might still "hallucinate" or incorrectly synthesize the retrieved information (a simple mitigation is sketched below).
Note: model deployments cannot be standalone; the retrieval system must be deployed and maintained alongside the model.
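A rough sketch of two mitigations for the limitations above, under the assumption of a placeholder retrieve function: timing the retrieval step to make the added latency visible, and instructing the model to answer only from the retrieved context to reduce hallucination.

```python
# Illustrative only: measure retrieval latency and build a "context-only" prompt.
import time

def retrieve(query: str) -> list[str]:
    # Placeholder retriever; a real one would call a vector database or search engine.
    return ["RAG enriches prompts with retrieved context; it does not update model weights."]

start = time.perf_counter()
context = retrieve("Does RAG update model weights?")
retrieval_ms = (time.perf_counter() - start) * 1000  # latency added before generation

prompt = (
    "Answer strictly from the context below. "
    "If the context does not contain the answer, reply 'I don't know.'\n\n"
    + "\n".join(context)
    + "\n\nQuestion: Does RAG update model weights?"
)
print(f"retrieval took {retrieval_ms:.1f} ms")
print(prompt)
```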