ML Tools

Overview

High-level summary of various tools and technologies used in ML.

Tools Like ChatGPT

  • ChatGPT
  • Gemini (Google; formerly Bard)

Misc Tools

  • DALL-E 3
  • Stable Diffusion

Libraries

  • PyTorch
  • TensorFlow

ML Apps

  • Lumen5 - Text to video for social media or learning
  • Pictory AI - Summarize videos and create short highlights
  • Remove.bg - Image background remover
  • QuillBot - Paraphrase, summarize and refine your writing
  • Missive - Streamline your email and chat.
  • Notion AI - Taking notes, managing tasks

Code IDEs Comparison

Key Concepts: RAG, Embeddings, LLM

   Note: Code assistants could be:

     - Purely LLM-based.
     - RAG-based, for better contextual awareness.
       Additional (external or internal) search can further enrich the context.

   RAG is a game changer for AI coding assistants.

   Context creation is a fine balance of the following components:
     - Repo-level functions and comments (e.g. Sourcegraph Cody)
     - Local surrounding code structure
     - Imported modules, dependencies
     - Dynamic fetching of API documentation (Cody, Codeium)
     - Analysis of commit history? (possibly to find hot areas of change)
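
   The balancing act above can be sketched as packing prioritized pieces
   into a fixed size budget. This is only an illustrative sketch, not how
   Cody or Codeium actually do it: the `ContextPiece` type, the priority
   scale, and the `max_chars` budget are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ContextPiece:
    source: str    # e.g. "local", "imports", "repo", "api_docs" (hypothetical labels)
    text: str      # the snippet that would be placed in the prompt
    priority: int  # lower number = more important (hypothetical scale)


def build_context(pieces, max_chars=2000):
    """Greedily pack the most important pieces into a character budget."""
    chosen = []
    used = 0
    for piece in sorted(pieces, key=lambda p: p.priority):
        if used + len(piece.text) > max_chars:
            continue  # this piece does not fit; a smaller one later still might
        chosen.append(piece)
        used += len(piece.text)
    # Label each piece with its source so the model can tell them apart.
    return "\n\n".join(f"# [{p.source}]\n{p.text}" for p in chosen)


pieces = [
    ContextPiece("repo", "def parse_config(path): ...", 3),
    ContextPiece("local", "cfg = parse_config('app.yaml')", 1),
    ContextPiece("api_docs", "yaml.safe_load(stream) parses a YAML document", 2),
]
print(build_context(pieces, max_chars=80))
```

   With a budget of 80 characters the local code and the API doc fit, and
   the repo-level function is dropped. Real assistants budget in tokens
   rather than characters.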

   Question: How are embeddings used for RAG?

   Embeddings are computed per chunk, e.g. functions, classes, files.
   Embeddings are stored and managed in vector databases.
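
   As a rough illustration of chunking at function/class level, the
   sketch below splits Python source into one chunk per top-level
   definition using the standard `ast` module. Each chunk's text is what
   would then be sent to an embedding model. The function name
   `chunk_python_source` and the dictionary layout are made up for this
   example; real assistants likely use language-agnostic parsers.

```python
import ast


def chunk_python_source(source: str):
    """Return one chunk per top-level function or class in `source`."""
    tree = ast.parse(source)
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            chunks.append({
                "name": node.name,
                "kind": type(node).__name__,
                # Exact source text of the definition (Python 3.8+).
                "text": ast.get_source_segment(source, node),
            })
    return chunks


example = '''
def add(a, b):
    return a + b

class Greeter:
    def hello(self):
        return "hi"
'''

for chunk in chunk_python_source(example):
    print(chunk["kind"], chunk["name"])
```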

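   Embedding Models: OpenAI text-embedding-ada-002, CodeBERT, etc.
                     (FAISS, from Facebook, is a similarity-search library
                     rather than an embedding model; it is listed under
                     vector databases.)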

   Vector Databases: FAISS (self-hosted library), Pinecone, Weaviate

   Vector databases support different search algorithms that take
   the embedding of the user query as input,
   e.g. KNN, hybrid (vector-based + keyword).
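
   To make the two search styles concrete, here is a minimal brute-force
   sketch in plain Python. The vectors are hand-written stand-ins for
   real embeddings, and the `alpha` weighting between vector and keyword
   scores is an assumption for illustration. Actual vector databases use
   approximate indexes and more careful keyword scoring (e.g. BM25).

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def keyword_score(query, text):
    """Fraction of query words that also appear in the text."""
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


def knn(query_vec, items, k=2):
    """Exact k-nearest neighbours by cosine similarity."""
    ranked = sorted(items, key=lambda it: cosine(query_vec, it["vec"]), reverse=True)
    return ranked[:k]


def hybrid(query, query_vec, items, k=2, alpha=0.5):
    """Blend vector similarity and keyword overlap; alpha weights the vector part."""
    def score(it):
        return (alpha * cosine(query_vec, it["vec"])
                + (1 - alpha) * keyword_score(query, it["text"]))
    return sorted(items, key=score, reverse=True)[:k]


# Hypothetical chunks with made-up 3-dimensional "embeddings".
items = [
    {"text": "parse json config file", "vec": [0.9, 0.1, 0.0]},
    {"text": "open database connection", "vec": [0.1, 0.9, 0.0]},
    {"text": "load yaml settings", "vec": [0.8, 0.2, 0.1]},
]

print(knn([1.0, 0.0, 0.0], items, k=1)[0]["text"])
print(hybrid("yaml settings", [1.0, 0.0, 0.0], items, k=1)[0]["text"])
```

   Pure KNN picks the chunk whose vector is closest to the query vector,
   while the hybrid score lets an exact keyword match ("yaml settings")
   outrank a slightly closer vector.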


Synopsis


     Cody              - Open
     Codeium           - Black box; generous free tier; from the Windsurf makers
     Copilot           - Microsoft
     Cursor            -

     Continue          - Open
     OpenDevin         -
     AnythingLLM



  • See https://medium.com/@justinmilner/the-top-coding-assistant-platforms-of-july-2024-a862e84c1b34

  • Cody (From Sourcegraph): Rating: 5

    • Initially a source code search company. Well funded.
    • ✅ Advanced Search, Advanced Context.
    • Great RAG implementation (hybrid of embeddings and keywords).
    • ❌ No Custom RAG Stack.
    • ✅ Open Source (3.4K stars) and ✅ BYOM (bring your own model).
  • Codeium: Rating: 5

    • From the makers of Windsurf; ✅ Generous Free Tier
    • Great RAG capability. ❌ Not BYOM, Not BYOE
  • Copilot:

    • Popular; ✅ Mature. Generous quota.
    • Uses an OpenAI model; GitHub integration.
    • Use Cases: Fix imports, explain code, generate commit messages and pull request messages.
    • IntelliCode is a complementary Microsoft product for autocompletion (works offline).

  • Continue:

    • Most configurable (using a text config file)
    • ✅ BYOM and Custom RAG Stack
    • Chat, Autocomplete, Edit, and Custom Actions within the IDE
    • ✅ Free; also paid with proxy and enterprise support.
  • AnythingLLM:

The following entries are worth mentioning:

  • Tabnine: Good for enterprises; security focused. No BYOM. Allows enterprises to fine-tune the model and RAG.

  • Amazon Q:

    • Not RAG/context sensitive, but still great for generating code for AWS solutions. The paid tier supports RAG.
    • Use Cases: Code Amazon solutions, e.g. the comment "#Implement DynamoDB Function" generates the code for you.
    • Also: Java version upgrades, semantic security scans, flagging copyright violations.

  • OpenDevin:

    • ✅ BYOM and Custom RAG Stack
    • Non-profit research project. In alpha stage only.
    • Designed to replicate Devin, an autonomous AI software engineer.

Todo: Build Semantic Search Index for Notes

  • Embed your own notes, store the embeddings in a FAISS index (vector DB), and search them.
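
A possible starting point for this todo, kept dependency-free on purpose: the sketch below uses a toy hashed bag-of-words embedding and a brute-force dot-product search. Both are stand-ins. In a real setup `embed` would call an actual embedding model and the list of vectors would be replaced by a FAISS index. The names `NotesIndex`, `embed`, and `DIM` are invented for this example.

```python
import hashlib
import math

DIM = 1024  # hypothetical vector size for the toy embedding


def embed(text):
    """Toy embedding: hashed bag-of-words, L2-normalised (not a real model)."""
    vec = [0.0] * DIM
    for word in text.lower().split():
        # md5 gives a hash that is stable across runs, unlike built-in hash().
        bucket = int(hashlib.md5(word.encode("utf-8")).hexdigest(), 16) % DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


class NotesIndex:
    """In-memory stand-in for a vector index over personal notes."""

    def __init__(self):
        self.notes = []
        self.vectors = []

    def add(self, note):
        self.notes.append(note)
        self.vectors.append(embed(note))

    def search(self, query, k=3):
        """Return up to k (score, note) pairs, best match first."""
        q = embed(query)
        scored = [
            (sum(a * b for a, b in zip(q, vec)), note)
            for vec, note in zip(self.vectors, self.notes)
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]


index = NotesIndex()
index.add("FAISS and Pinecone are vector databases")
index.add("PyTorch and TensorFlow are libraries")
index.add("Remove.bg removes image backgrounds")

for score, note in index.search("vector databases", k=2):
    print(round(score, 3), note)
```

Because the toy embedding only matches exact words, this behaves like keyword search. The semantic part comes from swapping in a real embedding model while keeping the same add/search shape.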