ML Tools
High-level summary of various tools and technologies used in ML.
Note: Code assistants can be:
- Purely LLM based.
- Augmented with RAG for better contextual awareness.
Further search (external or internal) can enrich the context.
RAG is a game changer for AI coding assistants, because it lets the model see relevant project code beyond the file being edited.
Context creation is a fine balance of the following components:
- Repo-level functions and comments (e.g. Sourcegraph Cody)
- Local surrounding code structure
- Imported modules and dependencies
- Dynamic fetching of API documentation (Cody, Codeium)
- Commit history analysis (open question: could it identify hot areas of change?)
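The balancing act above can be sketched as a packing problem: each context source competes for a limited prompt budget. The following is a minimal sketch only. The ContextPiece type, the priority numbers, and the character budget are hypothetical illustrations, not how Cody, Codeium, or any other assistant actually builds its context.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class ContextPiece:
    source: str    # hypothetical label, e.g. "surrounding_code", "imports", "api_docs"
    text: str      # the snippet itself
    priority: int  # assumed ranking: lower number = more important


def build_context(pieces: list[ContextPiece], budget_chars: int) -> str:
    """Greedily pack the highest-priority pieces into a fixed character budget."""
    selected = []
    used = 0
    for piece in sorted(pieces, key=lambda p: p.priority):
        block = f"# [{piece.source}]\n{piece.text}\n"
        if used + len(block) > budget_chars:
            continue  # does not fit; a smaller, lower-priority piece may still fit
        selected.append(block)
        used += len(block)
    return "".join(selected)


if __name__ == "__main__":
    pieces = [
        ContextPiece("api_docs", "long documentation text " * 20, 3),
        ContextPiece("surrounding_code", "def f(): pass", 0),
        ContextPiece("imports", "import os", 1),
    ]
    print(build_context(pieces, budget_chars=80))
```

A real assistant would more likely budget in tokens than in characters and rank pieces by relevance to the cursor position. The greedy loop only illustrates the trade-off between the sources.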
Question: How are embeddings used for RAG?
Embeddings are computed per chunk, e.g. per function, class, or file.
Embeddings are stored and managed in vector databases.
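Chunking at function and class level could look like the sketch below. It uses Python's standard ast module and handles Python source only. The function name chunk_source is a hypothetical helper, and the embedding step is left out on purpose: each returned chunk would be passed to whichever embedding model is in use.

```python
from __future__ import annotations

import ast


def chunk_source(source: str) -> list[tuple[str, str]]:
    """Split Python source into (name, code) chunks, one per top-level function or class."""
    tree = ast.parse(source)
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # get_source_segment returns the exact text the node was parsed from
            chunks.append((node.name, ast.get_source_segment(source, node)))
    return chunks


if __name__ == "__main__":
    example = "import os\n\ndef a():\n    return 1\n\nclass B:\n    def m(self):\n        return 2\n"
    for name, code in chunk_source(example):
        print(name, "->", len(code), "characters")
```

Top-level statements that are not functions or classes (imports, constants) are skipped here. A file-level chunk could cover them if needed.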
Embedding Models: OpenAI text-embedding-ada-002, CodeBERT, etc.
(FAISS, from Facebook, is a similarity-search library, not an embedding model.)
Vector Databases: FAISS (self-hosted), Pinecone, Weaviate
Vector databases support different algorithms for searching
against the embedding of the user query,
e.g. KNN, or hybrid vector-based + keyword search.
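To make the two search styles concrete, here is a minimal pure-Python sketch of brute-force KNN with cosine similarity, blended with a simple keyword-overlap score. The function names, the alpha weighting, and the hand-written two-dimensional vectors are assumptions for illustration. Real vector databases typically use approximate nearest-neighbour indexes and their own hybrid ranking schemes.

```python
import math
import re


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def keyword_score(query, text):
    """Fraction of query tokens that also appear in the text."""
    q = set(re.findall(r"[a-z0-9]+", query.lower()))
    t = set(re.findall(r"[a-z0-9]+", text.lower()))
    return len(q & t) / len(q) if q else 0.0


def hybrid_search(query, query_vec, index, k=2, alpha=0.7):
    """Return the top-k (score, text) pairs; alpha=1.0 gives pure vector KNN."""
    scored = []
    for text, vec in index:
        score = alpha * cosine(query_vec, vec) + (1 - alpha) * keyword_score(query, text)
        scored.append((score, text))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Toy index: (chunk text, made-up embedding)
    index = [
        ("def parse_json(data)", [1.0, 0.0]),
        ("def open_socket(host)", [0.0, 1.0]),
        ("def load_json_file(path)", [0.8, 0.6]),
    ]
    for score, text in hybrid_search("parse json", [1.0, 0.0], index):
        print(round(score, 2), text)
```

In practice the query vector comes from the same embedding model that produced the stored vectors, otherwise the similarity scores are meaningless.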
Cody - Open source
Codeium - Black box (closed); generous quota; Windsurf editor
Copilot - Microsoft
Cursor -
Continue - Open source
OpenDevin -
AnythingLLM
See https://medium.com/@justinmilner/the-top-coding-assistant-platforms-of-july-2024-a862e84c1b34
Cody (from Sourcegraph): Rating: 5
Codeium: Rating: 5
Copilot: Popular; ✅ Mature. Use cases: fixing imports, explaining code, generating commit messages and pull request messages. GitHub integration. Uses OpenAI models. Generous quota. IntelliCode is a complementary Microsoft product for autocompletion (works offline).
Continue:
AnythingLLM:
The following entries are also worth mentioning:
Tabnine: Good for enterprises; security focused. No BYOM (bring your own model). Allows enterprises to fine-tune the model and RAG.
Amazon Q: Not RAG/context sensitive, but still great for generating code for AWS-based solutions. Use cases: coding against Amazon services (e.g. the comment "#Implement DynamoDB Function" generates the code for you), Java version upgrades, semantic security scans, and flagging copyright violations. The paid tier supports RAG.
OpenDevin: