60 Generative AI Projects for Your Resume

Boost your resume with these amazing Generative AI project ideas, each designed to provide practical experience and highlight your skills with the latest technologies.

Below is a breakdown of each project: its difficulty level, the skills you'll develop, and resources such as tutorials and code to help you get started.

| Multimodal LLM Applications | LLM Fine-Tuning Projects | RAG (Retrieval Augmented Generation) Projects | Agentic AI Projects | Music and Audio Generation Projects |
| --- | --- | --- | --- | --- |
| Medical Diagnostics App with GPT-4 Vision | Fine Tune Phi-2 Model on Your Dataset | End To End Advanced RAG Project using Open Source LLM Models And Groq Inferencing engine | AI Agents from Scratch using Open Source AI | Text to Song Generation (With Vocals + Music) App using Generative AI |
| Visual Question Answering with IDEFICS 9B | Fine Tune a Multimodal LLM "IDEFICS 9B" for Visual Question Answering | RAG Pipeline from Scratch Using Ollama Python & Llama2 | AgentOps Library: Build Your Own AI Agents Monitoring Framework | Text to Music Generation App using Generative AI |
| AI Voice Assistant App using Multimodal LLM "Llava" and Whisper | Fine Tune Multimodal LLM "Idefics 2" using QLoRA | RAG Application using Langchain, OpenAI and FAISS | Build a Multi-Agent AI App from Scratch – no frameworks needed | Generate Music using Text2Music AI Model MusicGen by Meta AI |
| OCR & VQA with Qwen2-VL | Fine Tune Qwen2 VL Model using Llama Factory | RAG Application using Langchain Mistral AI and Weaviate db | Autogen AI Agents: AI Debates – Pizza vs. Sushi | Clone Any Voice to Generate Music and Speech |
| Chat with Video File using Qwen2 VL | Fine-Tuning with ReFT: Create an Emoji LLM for Medical Diagnosis | RAG Application Using OpenSource Framework LlamaIndex and Mistral-AI | Production Grade AI Agents using LangGraph (Map Reduce Implementation) | |
| Multimodal RAG with Qwen-2 and ColPali | Fine Tune DeepSeek Model on your Custom Dataset | RAG Pipeline Using Haystack and OpenAI | Build an Agentic RAG using Crew AI | |
| Janus 1.3B for Image Generation and RAG | GRPO Crash Course: Fine-Tuning DeepSeek for MATH! | RAG Application Using Haystack MistralAI Pinecone & FastAPI | Build Multi-agent AI system for Investment Risk Analysis | |
| Chat, Search & Summarize any Video using Vision AI Model | Fine Tune Llama 3 using ORPO | End To End Document Q&A RAG App With Gemma And Groq API | Build a Research Assistant AI Agent using Crew AI | |
| Multimodal AI Model for Radiology Reporting | Train a Small Language Model for Disease Symptoms | Building Real-Time RAG Pipeline With MongoDB and Pinecone | ADVANCED Python AI Multi-Agent Project | |
| MultiModal RAG Application Using LanceDB and LlamaIndex for Video Processing | Make LLM Fine Tuning 5x Faster with Unsloth | Chat With Multiple Documents using AstraDB and Langchain | Academic Task and Learning Agent System | |
| Multimodal RAG: Chat with PDFs (Images & Tables) | Multi GPU Fine Tuning of LLM using DeepSpeed and Accelerate | Built Powerful Multimodal RAG using Vertex AI (GCP), AstraDB and Langchain | ClauseAI | |
| MultiModal Summarizer | | RAG Based Chatbot With Memory (Chat History) | Content Intelligence | |
| Realtime Multimodal RAG Usecase with Google Gemini-Pro-Vision and Langchain | | | EU Green Compliance FAQ Bot | |
| End To End Resume Application Tracking System (ATS) Using Google Gemini Pro Vision LIM Model | | | ShopGenie | |
| | | | Weather Disaster Management AI Agent | |
| | | | Career Assistant for Hackathons | |
| | | | AInsight LangGraph | |
| | | | Blog Writer Swarm | |
| | | | Business Meme Generator | |

Multimodal LLM Applications

1. Medical Diagnostics App with GPT-4 Vision

Difficulty Level: 3/5

Description:

This project uses a multimodal LLM for medical image analysis to aid in diagnostics.

Skills Gained:

Multimodal AI, medical image analysis, diagnostic applications.

Resources:

2. Visual Question Answering with IDEFICS 9B

Difficulty Level: 3/5

Description:

Develop a system that answers questions based on visual input using the IDEFICS 9B model. It involves managing visual data and answering questions based on the content of an image.

Skills Gained:

Visual Question Answering (VQA), multimodal models, image understanding.

Resources:

3. AI Voice Assistant App using Multimodal LLM "Llava" and Whisper

Difficulty Level: 4/5

Description:

Create a voice assistant that understands voice and visual inputs using Llava and Whisper. Combines voice recognition, natural language processing, and visual understanding.

Skills Gained:

Multimodal AI, voice recognition, natural language processing, assistant applications.

Resources:

4. OCR & VQA with Qwen2-VL

Difficulty Level: 3/5

Description:

Build a model specialized for optical character recognition and visual question answering using the Qwen2-VL model. Requires understanding of text extraction from images and answering questions about visual content.

Skills Gained:

OCR, VQA, multimodal models, image and text processing.

Resources:

5. Chat with Video File using Qwen2 VL

Difficulty Level: 3/5

Description:

Create an application that allows users to interact with video content by asking questions, leveraging Qwen2-VL. Involves processing video data to understand its content.

Skills Gained:

Video understanding, multimodal AI, question answering.

Resources:

6. Multimodal RAG with Qwen-2 and ColPali

Difficulty Level: 4/5

Description:

Combines multimodal models with Retrieval Augmented Generation to answer questions based on images, utilizing Qwen-2 and ColPali. It involves not only processing images but also integrating them with a retrieval system.

Skills Gained:

Multimodal RAG, image and text processing, retrieval systems.

Resources:

7. Janus 1.3B for Image Generation and RAG

Difficulty Level: 3/5

Description:

This project uses Janus 1.3B for image generation and retrieval-augmented generation tasks. Requires understanding image generation and RAG systems with a smaller language model.

Skills Gained:

Image generation, RAG, smaller LLM implementation.

Resources:

8. Chat, Search & Summarize any Video using Vision AI Model

Difficulty Level: 2/5

Description:

Focused on video understanding, allowing users to chat, search, and summarize video content. Involves complex processing tasks for video understanding, summarization, and searching.

Skills Gained:

Video processing, summarization, search, multimodal models.

Resources:

9. Multimodal AI Model for Radiology Reporting

Difficulty Level: 3/5

Description:

Develop a model to automate radiology reporting, integrating image and text data. Requires in-depth knowledge of medical imaging and report generation.

Skills Gained:

Multimodal AI, medical imaging, report generation.

Resources:

10. MultiModal RAG Application Using LanceDB and LlamaIndex for Video Processing

Difficulty Level: 3/5

Description:

Builds a system that allows for querying of video content using LanceDB and LlamaIndex. Involves using these tools for video content processing and retrieval.

Skills Gained:

Video processing, RAG, vector databases, LlamaIndex.

Resources:

11. Multimodal RAG: Chat with PDFs (Images & Tables)

Difficulty Level: 2/5

Description:

Build a multimodal Retrieval-Augmented Generation (RAG) pipeline using LangChain and the Unstructured library to query complex PDFs containing various data types, leveraging LLMs like GPT-4 with vision.

Skills Gained:

Multimodal AI, data extraction.

Resources:

12. MultiModal Summarizer

Difficulty Level: 2/5

Description:

Create a summarization application that processes different types of media. Requires knowledge of summarization techniques and processing different media types.

Skills Gained:

Multimodal AI, summarization, media processing.

Resources:

13. Realtime Multimodal RAG Usecase with Google Gemini-Pro-Vision and Langchain

Difficulty Level: 2/5

Description:

Uses Google's Gemini Pro Vision with Langchain for multimodal RAG applications. Requires a deep understanding of both technologies.

Skills Gained:

Multimodal RAG, Google Gemini, Langchain.

Resources:

14. End To End Resume Application Tracking System (ATS) Using Google Gemini Pro Vision LIM Model

Difficulty Level: 3/5

Description:

Creates an ATS that leverages multimodal models to process resume content, including images and text. This is a full application for understanding and processing resume content for tracking.

Skills Gained:

Multimodal AI, resume processing, ATS development.

Resources:

LLM Fine-Tuning Projects


15. Fine Tune Phi-2 Model on Your Dataset

Difficulty Level: 3/5

Description:

Tailor a smaller language model, Phi-2, for specific tasks using fine-tuning. Requires understanding of model architectures and training procedures.

Skills Gained:

LLM fine-tuning, model adaptation, smaller model optimization.

Resources:

16. Fine Tune a Multimodal LLM "IDEFICS 9B" for Visual Question Answering

Difficulty Level: 4/5

Description:

Adapts the IDEFICS 9B model for visual question answering through fine-tuning. Requires knowledge of fine-tuning and visual question answering.

Skills Gained:

Multimodal fine-tuning, visual question answering, LLM customization.

Resources:

17. Fine Tune Multimodal LLM "Idefics 2" using QLoRA

Difficulty Level: 4/5

Description:

Uses QLoRA to fine-tune the multimodal Idefics 2 model. Requires a deep understanding of both multimodal models and the QLoRA technique.
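To get a feel for why QLoRA makes fine-tuning cheaper, here is a rough back-of-the-envelope sketch. It assumes the usual description of the technique: the base weights are frozen and stored in 4-bit precision, while small low-rank adapter matrices are trained. The layer size and rank below are hypothetical values picked for illustration, not Idefics 2's actual configuration, and the memory figure ignores the small overhead of quantization constants.

```python
def lora_trainable_params(d_in, d_out, rank):
    """A LoRA adapter adds two matrices: A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


def weight_bytes(num_params, bits):
    """Storage needed for num_params weights at the given bit width."""
    return num_params * bits / 8


# Hypothetical layer size and adapter rank, for illustration only.
d_in, d_out, rank = 4096, 4096, 16

full_params = d_in * d_out                       # frozen base weights
adapter_params = lora_trainable_params(d_in, d_out, rank)
trainable_fraction = adapter_params / full_params

print(f"frozen base weights: {full_params:,} params, "
      f"{weight_bytes(full_params, 4) / 2**20:.1f} MiB at 4-bit")
print(f"trainable adapter:   {adapter_params:,} params "
      f"({trainable_fraction:.2%} of the layer)")
```

In a real project these numbers come from the model's config and the adapter settings you choose; the point is only that the trainable part is a small fraction of each adapted layer.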

Skills Gained:

Multimodal fine-tuning, QLoRA, model optimization.

Resources:

18. Fine Tune Qwen2 VL Model using Llama Factory

Difficulty Level: 3/5

Description:

Fine-tunes the Qwen2 VL model for specific applications using Llama Factory. Requires experience with both the model and the Llama Factory tool.

Skills Gained:

Multimodal fine-tuning, Llama Factory, model adaptation.

Resources:

19. Fine-Tuning with ReFT: Create an Emoji LLM for Medical Diagnosis

Difficulty Level: 4/5

Description:

Uses ReFT (representation fine-tuning) to create a medical diagnosis model that responds with emojis. Involves applying fine-tuning to a task that is both medical and creative.

Skills Gained:

Fine-tuning, medical diagnosis, creative LLM applications.

Resources:

20. Fine Tune DeepSeek Model on your Custom Dataset

Difficulty Level: 3/5

Description:

Trains the DeepSeek model on a custom dataset to tailor it for specific tasks. Requires dataset management skills as well as experience with fine-tuning.

Skills Gained:

LLM fine-tuning, model customization, dataset management.

Resources:

21. GRPO Crash Course: Fine-Tuning DeepSeek for MATH!

Difficulty Level: 1/5

Description:

Optimizes the DeepSeek model for math-related tasks using GRPO (Group Relative Policy Optimization). Requires an understanding of mathematical reasoning with LLMs and reinforcement-learning-based fine-tuning.
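One ingredient of GRPO that is easy to try in isolation is the group-relative advantage: several completions are sampled for the same prompt, each gets a reward, and every reward is normalized against the rest of its group. The sketch below shows only that step. The reward values are made up, and using the population standard deviation with a small epsilon is an assumption of this example, not a detail taken from the tutorial.

```python
import statistics


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the mean and spread of its own group."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)  # population std: an assumption of this sketch
    return [(r - mean) / (std + eps) for r in rewards]


# Hypothetical rewards for four sampled answers to one math prompt:
# 1.0 if the final answer was correct, 0.0 otherwise.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Correct answers end up with a positive advantage and incorrect ones with a negative advantage, so the policy update favors whatever the better completions in the group did.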

Skills Gained:

LLM fine-tuning, GRPO, mathematical reasoning with LLMs.

Resources:

22. Fine Tune Llama 3 using ORPO

Difficulty Level: 3/5

Description:

Optimizes the Llama 3 model using ORPO (Odds Ratio Preference Optimization). Requires a strong understanding of fine-tuning.
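As a minimal sketch of the idea behind ORPO, the snippet below computes the odds-ratio term that is added to the usual supervised loss: it is small when the model gives the chosen response higher odds than the rejected one. The two probabilities are hypothetical inputs; in actual training they would be derived from the model's token log-probabilities, and the function names here are this example's own.

```python
import math


def log_odds(p):
    """log(p / (1 - p)) for a probability strictly between 0 and 1."""
    return math.log(p) - math.log(1.0 - p)


def odds_ratio_loss(p_chosen, p_rejected):
    """-log(sigmoid(z)), where z is the log odds ratio of chosen vs. rejected."""
    z = log_odds(p_chosen) - log_odds(p_rejected)
    return math.log1p(math.exp(-z))  # identical to -log(sigmoid(z))


# Hypothetical response probabilities, for illustration only.
good_case = odds_ratio_loss(p_chosen=0.8, p_rejected=0.2)
bad_case = odds_ratio_loss(p_chosen=0.2, p_rejected=0.8)
print(good_case, bad_case)
```

When the model already prefers the chosen response the penalty is close to zero; when it prefers the rejected one the penalty grows, which pushes the model toward the preferred style during fine-tuning.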

Skills Gained:

LLM fine-tuning, ORPO, model optimization.

Resources:

23. Train a Small Language Model for Disease Symptoms

Difficulty Level: 3/5

Description:

Creates a model specifically for identifying disease symptoms. Combines medical domain knowledge with a fine-tuned language model.

Skills Gained:

LLM fine-tuning, medical applications, symptom identification.

Resources:

24. Make LLM Fine Tuning 5x Faster with Unsloth

Difficulty Level: 2/5

Description:

Improves LLM fine-tuning speeds by using Unsloth. Requires an understanding of optimization techniques for fine-tuning.

Skills Gained:

LLM fine-tuning, performance optimization, Unsloth.

Resources:

25. Multi GPU Fine Tuning of LLM using DeepSpeed and Accelerate

Difficulty Level: 3/5

Description:

Fine-tunes large language models using multiple GPUs with DeepSpeed and Accelerate. Requires advanced knowledge of distributed training techniques.

Skills Gained:

LLM fine-tuning, distributed training, DeepSpeed, Accelerate.

Resources:

RAG (Retrieval Augmented Generation) Projects


26. End To End Advanced RAG Project using Open Source LLM Models And Groq Inferencing engine

Difficulty Level: 3/5

Description:

Creates a full RAG pipeline with ingestion, retrieval, and generation stages. Requires experience with the full RAG pipeline.
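The three stages can be sketched in a few lines of plain Python. This is a toy stand-in, not the project's actual stack: word counts replace a real embedding model, a list replaces a vector database, and the generation stage stops at building the prompt that would be sent to an LLM. All function names and the sample documents are invented for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def ingest(documents, chunk_size=12):
    """Ingestion: split documents into word chunks and store a vector per chunk."""
    index = []
    for doc in documents:
        words = doc.split()
        for start in range(0, len(words), chunk_size):
            chunk = " ".join(words[start:start + chunk_size])
            index.append((chunk, Counter(tokenize(chunk))))
    return index


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(index, question, k=2):
    """Retrieval: rank stored chunks by similarity to the question."""
    query = Counter(tokenize(question))
    ranked = sorted(index, key=lambda item: cosine(query, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question, chunks):
    """Generation (stubbed): assemble the prompt an LLM would receive."""
    context = "\n".join(f"- {c}" for c in chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


docs = [
    "Groq provides fast inference for open source language models.",
    "FAISS is a library for similarity search over dense vectors.",
]
index = ingest(docs)
question = "Which library does similarity search?"
top_chunks = retrieve(index, question, k=1)
prompt = build_prompt(question, top_chunks)
print(prompt)
```

Swapping the toy pieces for real ones (an embedding model, a vector store, a hosted LLM) keeps the same three-stage shape.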

Skills Gained:

RAG, end-to-end pipeline development, data ingestion, retrieval, generation.

Resources:

27. RAG Pipeline from Scratch Using Ollama Python & Llama2

Difficulty Level: 2/5

Description:

Builds a RAG pipeline using Ollama, Python, and Llama 2. Uses core libraries to create a RAG system.

Skills Gained:

RAG, Ollama, Llama 2, Python.

Resources:

28. RAG Application using Langchain, OpenAI and FAISS

Difficulty Level: 2/5

Description:

Implements RAG using Langchain, OpenAI, and FAISS for vector similarity search. Uses popular libraries to create a RAG system.

Skills Gained:

RAG, Langchain, OpenAI, FAISS.

Resources:

29. RAG Application using Langchain Mistral AI and Weaviate db

Difficulty Level: 3/5

Description:

Uses Mistral AI and Weaviate as part of the RAG architecture. Requires a good understanding of both tools.

Skills Gained:

RAG, Langchain, Mistral AI, Weaviate.

Resources:

30. RAG Application Using OpenSource Framework LlamaIndex and Mistral-AI

Difficulty Level: 3/5

Description:

Utilizes LlamaIndex and Mistral-AI for a RAG system. Uses LlamaIndex as an open-source framework and Mistral-AI for its models.

Skills Gained:

RAG, LlamaIndex, Mistral AI, open-source framework.

Resources:

31. RAG Pipeline Using Haystack and OpenAI

Difficulty Level: 3/5

Description:

Builds a RAG pipeline with the Haystack framework and OpenAI. Requires understanding the Haystack framework as well as OpenAI.

Skills Gained:

RAG, Haystack, OpenAI.

Resources:

32. RAG Application Using Haystack MistralAI Pinecone & FastAPI

Difficulty Level: 3/5

Description:

Uses Haystack, MistralAI, Pinecone, and FastAPI to create an end-to-end RAG application. Combines multiple technologies into a more complete app.

Skills Gained:

RAG, Haystack, MistralAI, Pinecone, FastAPI.

Resources:

33. End To End Document Q&A RAG App With Gemma And Groq API

Difficulty Level: 3/5

Description:

Implements an end-to-end document Q&A RAG app with Google Gemma and the Groq API.

Skills Gained:

RAG, Google Gemma, Groq API.

Resources:

34. Building Real-Time RAG Pipeline With MongoDB and Pinecone

Difficulty Level: 3/5

Description:

Builds a RAG pipeline for real-time applications using MongoDB and Pinecone.

Skills Gained:

RAG, real-time processing, MongoDB, Pinecone.

35. Chat With Multiple Documents using AstraDB and Langchain

Difficulty Level: 3/5

Description:

Creates a chatbot that can process multiple document types, built with AstraDB and Langchain.

Skills Gained:

RAG, chatbots, AstraDB, Langchain.

Resources:

36. Built Powerful Multimodal RAG using Vertex AI (GCP), AstraDB and Langchain

Difficulty Level: 3/5

Description:

Uses Vertex AI, AstraDB, and Langchain to build a multimodal RAG. Incorporates multiple cloud services in a RAG system.

Skills Gained:

Multimodal RAG, Vertex AI, AstraDB, Langchain.

Resources:

37. RAG Based Chatbot With Memory (Chat History)

Difficulty Level: 2/5

Description:

Creates a RAG-based chatbot with chat history. Adds the extra complexity of maintaining conversation memory across turns.
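One simple way to add memory is a sliding window: keep the last few turns and prepend them to every new prompt alongside the retrieved context. The sketch below assumes that approach. `ChatMemory` and `build_prompt` are names invented for this example, not part of any framework, and the retrieved chunk is hard-coded instead of coming from a real retriever.

```python
from collections import deque


class ChatMemory:
    """Keeps only the most recent turns so the prompt stays bounded."""

    def __init__(self, max_turns=3):
        self.turns = deque(maxlen=max_turns)  # oldest turn is dropped automatically

    def add(self, question, answer):
        self.turns.append((question, answer))

    def render(self):
        return "\n".join(f"User: {q}\nAssistant: {a}" for q, a in self.turns)


def build_prompt(memory, context_chunks, question):
    """Combine chat history, retrieved context, and the new question."""
    history = memory.render() or "(no earlier turns)"
    context = "\n".join(f"- {c}" for c in context_chunks)
    return (f"Conversation so far:\n{history}\n\n"
            f"Retrieved context:\n{context}\n\n"
            f"User: {question}\nAssistant:")


memory = ChatMemory(max_turns=2)
memory.add("What is RAG?", "Retrieval augmented generation.")
memory.add("Who uses it?", "Teams building document chatbots.")
memory.add("Is it hard?", "The basics are approachable.")

prompt = build_prompt(memory, ["RAG pairs a retriever with a generator."], "Can you recap?")
print(prompt)
```

A fixed window is the cheapest option. Summarizing older turns or rewriting the question with the history in mind are common next steps once this works.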

Skills Gained:

RAG, chatbots, chat history management, memory implementation.

Resources:

Agentic AI Projects


38. AI Agents from Scratch using Open Source AI

Difficulty Level: 3/5

Description:

Build AI agents from scratch using open-source AI tools to summarize, write, and sanitize sensitive information without pre-built frameworks.
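Without a framework, an agent can be as small as a function that takes text and returns text, with a coordinator chaining them together. The sketch below illustrates that shape using rule-based stand-ins: regular expressions redact emails and phone numbers, and a naive first-sentence cut plays the role of the summarizer. In the actual project an open-source LLM would do this work. The patterns and function names are assumptions made for this example.

```python
import re

# Deliberately simple patterns: they will miss edge cases a real system must handle.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def sanitize_agent(text):
    """Replace emails and phone numbers with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)


def summarize_agent(text):
    """Stand-in for an LLM summarizer: keep only the first sentence."""
    return text.split(". ")[0].rstrip(".") + "."


def run_pipeline(text, agents):
    """Coordinator: pass the text through each agent in order."""
    for agent in agents:
        text = agent(text)
    return text


raw = "Contact jane@example.com or +1 555-123-4567 for the report. It ships Friday."
result = run_pipeline(raw, [sanitize_agent, summarize_agent])
print(result)
```

Running the sanitizer first means sensitive details never reach the later agents, which is usually the order you want.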

Skills Gained:

AI agent development, open-source AI, summarization, information sanitization.

Resources:

39. AgentOps Library: Build Your Own AI Agents Monitoring Framework

Difficulty Level: 3/5

Description:

Build an AgentOps library in Python to monitor AI agents effectively, including installation, key metrics monitoring, and data visualization.
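A monitoring layer like this often starts as a decorator that records how often each agent runs, how long it takes, and how often it fails. The sketch below is one possible starting point under that assumption. It is not the API of any existing AgentOps package, and the `monitored` decorator, `METRICS` store, and `summarize` agent are all invented for illustration.

```python
import time
from collections import defaultdict
from functools import wraps

# In-memory metrics store: one record per agent name.
METRICS = defaultdict(lambda: {"calls": 0, "errors": 0, "total_seconds": 0.0})


def monitored(agent_name):
    """Record call count, error count, and total latency for the wrapped agent."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            except Exception:
                METRICS[agent_name]["errors"] += 1
                raise
            finally:
                METRICS[agent_name]["calls"] += 1
                METRICS[agent_name]["total_seconds"] += time.perf_counter() - start
        return wrapper
    return decorator


@monitored("summarizer")
def summarize(text):
    """Hypothetical agent: fails on empty input, otherwise truncates."""
    if not text:
        raise ValueError("empty input")
    return text[:40]


summarize("Agents need observability just like any other service.")
try:
    summarize("")
except ValueError:
    pass

print(dict(METRICS))
```

From here the collected numbers can be exported to a dashboard or plotted, which is where the data visualization part of the project comes in.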

Skills Gained:

AI agent monitoring, AgentOps library development, Python, data visualization.

Resources:

40. Build a Multi-Agent AI App from Scratch – no frameworks needed

Difficulty Level: 3/5

Description:

Build a multi-agent AI system from scratch using Python and OpenAI's GPT-4o model, with a Streamlit web interface for specialized tasks.

Skills Gained:

Multi-agent systems, Python, OpenAI GPT-4o, Streamlit web interface.

Resources:

41. Autogen AI Agents: AI Debates – Pizza vs. Sushi

Difficulty Level: 3/5

Description:

Use Autogen to enable LLMs like Claude and GPT-4o to debate "Which is tastier, Pizza or Sushi?".

Skills Gained:

Multi-agent systems, Autogen framework, LLM collaboration, agent role management.

Resources:

42. Production Grade AI Agents using LangGraph (Map Reduce Implementation)

Difficulty Level: 3/5

Description:

Leverage the LangGraph framework to build robust, stateful, multi-actor applications using map-reduce patterns.
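The map-reduce pattern itself is independent of LangGraph: fan a task out to several workers, then combine their partial results. The sketch below shows the pattern with Python's standard library only. It does not use LangGraph's API, and `map_step` is a stand-in for what would be an LLM call in the real project.

```python
from concurrent.futures import ThreadPoolExecutor


def map_step(chunk):
    """Stand-in for an LLM call that summarizes one chunk: keep the first sentence."""
    return chunk.split(".")[0].strip()


def reduce_step(partials):
    """Combine the partial results into a single output."""
    return " | ".join(partials)


def map_reduce(chunks, max_workers=4):
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map runs the workers concurrently but returns results in input order.
        partials = list(pool.map(map_step, chunks))
    return reduce_step(partials)


chunks = [
    "Sales rose in Q1. Details follow in the appendix.",
    "Costs were flat. Headcount did not change.",
    "Churn fell slightly. Support tickets also dropped.",
]
summary = map_reduce(chunks)
print(summary)
```

In a graph-based framework the map step becomes a set of parallel nodes and the reduce step a node that waits for all of them, with shared state carrying the partial results.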

Skills Gained:

LangGraph framework, stateful applications, multi-actor applications, map-reduce patterns.

Resources:

43. Build an Agentic RAG using Crew AI

Difficulty Level: 3/5

Description:

Enhance AI capabilities by combining retrieval mechanisms with generative models to create intelligent, autonomous agents using Crew AI.

Skills Gained:

Agentic RAG, Crew AI, retrieval mechanisms, generative models, autonomous agents.

Resources:

44. Build Multi-agent AI system for Investment Risk Analysis

Difficulty Level: 3/5

Description:

Create a multi-agent AI system for investment risk analysis. Configure agents to monitor market data and develop trading strategies.

Skills Gained:

Agentic RAG, Crew AI, generative models.

Resources:

45. Build a Research Assistant AI Agent using Crew AI

Difficulty Level: 3/5

Description:

Build a simple AI agent for healthcare research using the Crew AI framework.

Skills Gained:

Agentic RAG, Crew AI, retrieval mechanisms, generative models, autonomous agents.

Resources:

46. ADVANCED Python AI Multi-Agent Project

Difficulty Level: 3/5

Description:

Build an advanced multi-agent AI app with Python, Langflow, Astra DB, Streamlit, and more.

Skills Gained:

Agentic RAG, Streamlit, Langflow.

Resources:

47. Academic Task and Learning Agent System

Difficulty Level: 4/5

Description:

Build an intelligent multi-agent system that transforms the way students manage their academic life using LangGraph's workflow framework.

Skills Gained:

Multi-agent systems, workflow orchestration, personalized academic support.

Resources:

48. ClauseAI

Difficulty Level: 3/5

Description:

Develop an AI agent to assist with legal clause analysis and management.

Skills Gained:

Legal AI, text analysis, clause management.

Resources:

49. Content Intelligence

Difficulty Level: 3/5

Description:

Create an AI agent for content analysis and intelligence gathering.

Skills Gained:

Content analysis, intelligence gathering, NLP techniques.

Resources:

50. EU Green Compliance FAQ Bot

Difficulty Level: 3/5

Description:

Build a FAQ bot to assist with EU Green Compliance queries.

Skills Gained:

Compliance assistance, FAQ bots, question answering systems.

Resources:

51. ShopGenie

Difficulty Level: 3/5

Description:

Develop an AI shopping assistant to help users find and compare products.

Skills Gained:

Shopping assistance, product comparison, AI recommendations.

Resources:

52. Weather Disaster Management AI Agent

Difficulty Level: 4/5

Description:

Create an AI agent for managing and responding to weather disasters.

Skills Gained:

Disaster management, weather forecasting, emergency response.

Resources:

53. Career Assistant for Hackathons

Difficulty Level: 3/5

Description:

Develop an AI-powered mentor designed to simplify and support your journey through Generative AI learning, resume preparation, interview assistance, and job hunting.

Skills Gained:

Hackathon preparation, career assistance, AI guidance.

Resources:

54. AInsight LangGraph

Difficulty Level: 3/5

Description:

Create an AI agent to provide insights and analysis using LangGraph. AInsight automatically collects, processes, and summarizes AI/ML news for general audiences.

Skills Gained:

Data analysis, insights generation, LangGraph framework.

Resources:

55. Blog Writer Swarm

Difficulty Level: 3/5

Description:

Develop a swarm of AI agents to assist with blog writing and content creation.

Skills Gained:

Content creation, blog writing, swarm intelligence.

Resources:

56. Business Meme Generator

Difficulty Level: 3/5

Description:

Build an AI agent to generate business-themed memes for marketing and social media.

Skills Gained:

Meme generation, marketing assistance, AI creativity.

Resources:

Music and Audio Generation Projects


57. Text to Song Generation (With Vocals + Music) App using Generative AI

Difficulty Level: 3/5

Description:

Creates an application that generates songs, including vocals, from text using generative AI.

Skills Gained:

Audio generation, music generation, generative AI.

Resources:

58. Text to Music Generation App using Generative AI

Difficulty Level: 3/5

Description:

Builds an application that generates music from text using generative AI.

Skills Gained:

Music generation, generative AI.

Resources:

59. Generate Music using Text2Music AI Model MusicGen by Meta AI

Difficulty Level: 2/5

Description:

Generates music from text using MusicGen, the text-to-music model by Meta AI.

Skills Gained:

Music generation, generative AI.

Resources:

60. Clone Any Voice to Generate Music and Speech

Difficulty Level: 2/5

Description:

Build an application that generates music and speech by cloning voices.

Skills Gained:

Music generation, generative AI.

Resources: