60 Generative AI Projects for Your Resume

Boost your resume with these amazing Generative AI project ideas, each designed to provide practical experience and highlight your skills with the latest technologies.

Here's a breakdown of each project, including the skills you'll develop and the relevant tutorials and code to help you get started.

Multimodal LLM Applications	LLM Fine-Tuning Projects	RAG (Retrieval Augmented Generation) Projects	Agentic AI Projects	Music and Audio Generation Projects
Medical Diagnostics App with GPT-4 Vision	Fine Tune Phi-2 Model on Your Dataset	End To End Advanced RAG Project using Open Source LLM Models And Groq Inferencing engine	AI Agents from Scratch using Open Source AI	Text to Song Generation (With Vocals + Music) App using Generative AI
Visual Question Answering with IDEFICS 9B	Fine Tune a Multimodal LLM "IDEFICS 9B" for Visual Question Answering	RAG Pipeline from Scratch Using OLlama Python & Llama2	AgentOps Library: Build Your Own AI Agents Monitoring Framework	Text to Music Generation App using Generative AI
AI Voice Assistant App using Multimodal LLM "Llava" and Whisper	Fine Tune Multimodal LLM "Idefics 2" using QLoRA	RAG Application using Langchain, OpenAI and FAISS	Build a Multi-Agent AI App from Scratch – no frameworks needed	Generate Music using Text2Music AI Model MusicGen by Meta AI
OCR & VQA with Qwen2-VL	Fine Tune Qwen2 VL Model using Llama Factory	RAG Application using Langchain Mistral AI and Weviate db	Autogen AI Agents: AI Debates – Pizza vs. Sushi	Clone Any Voice to Generate Music and Speech
Chat with Video File using Qwen2 VL	Fine-Tuning with ReFT: Create an Emoji LLM for Medical Diagnosis	RAG Application Using OpenSource Framework LlamaIndex and Mistral-AI	Production Grade AI Agents using LangGraph (Map Reduce Implementation)
Multimodal RAG with Qwen-2 and ColPali	Fine Tune DeepSeek Model on your Custom Dataset	RAG Pipeline Using Haystack and OpenAI	Build an Agentic RAG using Crew AI
Janus 1.3B for Image Generation and RAG	GRPO Crash Course: Fine-Tuning DeepSeek for MATH!	RAG Application Using Haystack MistralAI Pinecone & FastAPI	Build Multi-agent AI system for Investment Risk Analysis
Chat, Search & Summarize any Video using Vision AI Model	Fine Tune Llama 3 using ORPO	End To End Document Q&A RAG App With Gemma And Groq API	Build a Research Assistant AI Agent using Crew AI
Multimodal AI Model for Radiology Reporting	Train a Small Language Model for Disease Symptoms	Building Real-Time RAG Pipeline With Mongodb and Pinecone	ADVANCED Python AI Multi-Agent Project
MultiModal RAG Application Using LanceDB and LlamaIndex for Video Processing	Make LLM Fine Tuning 5x Faster with Unsloth	Chat With Multiple Documents using AstraDB and Langchain	Academic Task and Learning Agent System
Multimodal RAG: Chat with PDFs (Images & Tables)	Multi GPU Fine Tuning of LLM using DeepSpeed and Accelerate	Built Powerful Multimodal RAG using Vertex AI(GCP), AstraDb and Langchain	ClauseAI
MultiModal Summarizer		RAG Based Chatbot With Memory(Chat History)	Content Intelligence
Realtime Multimodal RAG Usecase with Google Gemini-Pro-Vision and Langchain			EU Green Compliance FAQ Bot
End To End Resume Application Tracking System(ATS) Using Google Gemini Pro Vision LIM Model			ShopGenie
			Weather Disaster Management AI Agent
			Career Assistant for Hackathons
			AInsight LangGraph
			Blog Writer Swarm
			Business Meme Generator

Multimodal LLM Applications

1. Medical Diagnostics App with GPT-4 Vision

Difficulty Level: 3/5

Description:

This project uses a multimodal LLM for medical image analysis to aid in diagnostics.

Skills Gained:

Multimodal AI, medical image analysis, diagnostic applications.

Resources:

2. Visual Question Answering with IDEFICS 9B

Difficulty Level: 3/5

Description:

Develop a system that answers questions based on visual input using the IDEFICS 9B model. It involves managing visual data and answering questions based on the content of an image.

Skills Gained:

Visual Question Answering (VQA), multimodal models, image understanding.

Resources:

3. AI Voice Assistant App using Multimodal LLM "Llava" and Whisper

Difficulty Level: 4/5

Description:

Create a voice assistant that understands voice and visual inputs using Llava and Whisper. Combines voice recognition, natural language processing, and visual understanding.

Skills Gained:

Multimodal AI, voice recognition, natural language processing, assistant applications.

Resources:

4. OCR & VQA with Qwen2-VL

Difficulty Level: 3/5

Description:

Build a model specialized for optical character recognition and visual question answering using the Qwen2-VL model. Requires understanding of text extraction from images and answering questions about visual content.

Skills Gained:

OCR, VQA, multimodal models, image and text processing.

Resources:

5. Chat with Video File using Qwen2 VL

Difficulty Level: 3/5

Description:

Create an application that allows users to interact with video content by asking questions, leveraging Qwen2-VL. Involves processing video data to understand its content.

Skills Gained:

Video understanding, multimodal AI, question answering.

Resources:

6. Multimodal RAG with Qwen-2 and ColPali

Difficulty Level: 4/5

Description:

Combines multimodal models with Retrieval Augmented Generation to answer questions based on images, utilizing Qwen-2 and ColPali. It involves not only processing images but also integrating them with a retrieval system.

Skills Gained:

Multimodal RAG, image and text processing, retrieval systems.

Resources:

7. Janus 1.3B for Image Generation and RAG

Difficulty Level: 3/5

Description:

This project uses Janus 1.3B for image generation and retrieval-augmented generation tasks. Requires understanding image generation and RAG systems with a smaller language model.

Skills Gained:

Image generation, RAG, smaller LLM implementation.

Resources:

8. Chat, Search & Summarize any Video using Vision AI Model

Difficulty Level: 2/5

Description:

Focused on video understanding, allowing users to chat, search, and summarize video content. Involves complex processing tasks for video understanding, summarization, and searching.

Skills Gained:

Video processing, summarization, search, multimodal models.

Resources:

9. Multimodal AI Model for Radiology Reporting

Difficulty Level: 3/5

Description:

Develop a model to automate radiology reporting, integrating image and text data. Requires in-depth knowledge of medical imaging and report generation.

Skills Gained:

Multimodal AI, medical imaging, report generation.

Resources:

10. MultiModal RAG Application Using LanceDB and LlamaIndex for Video Processing

Difficulty Level: 3/5

Description:

Builds a system that allows for querying of video content using LanceDB and LlamaIndex. Involves using these tools for video content processing and retrieval.

Skills Gained:

Video processing, RAG, vector databases, LlamaIndex.

Resources:

11. Multimodal RAG: Chat with PDFs (Images & Tables)

Difficulty Level: 2/5

Description:

Build a multimodal Retrieval-Augmented Generation (RAG) pipeline using LangChain and the Unstructured library to query complex PDFs containing various data types, leveraging LLMs like GPT-4 with vision.

Skills Gained:

Multimodal AI, data extraction.

Resources:

12. MultiModal Summarizer

Difficulty Level: 2/5

Description:

Create a summarization application that processes different types of media. Requires knowledge of summarization techniques and processing different media types.

Skills Gained:

Multimodal AI, summarization, media processing.

Resources:

13. Realtime Multimodal RAG Usecase with Google Gemini-Pro-Vision and Langchain

Difficulty Level: 2/5

Description:

Uses Google's Gemini Pro Vision with Langchain for multimodal RAG applications. Requires a deep understanding of both technologies.

Skills Gained:

Multimodal RAG, Google Gemini, Langchain.

Resources:

14. End To End Resume Application Tracking System(ATS) Using Google Gemini Pro Vision LIM Model

Difficulty Level: 3/5

Description:

Creates an ATS that leverages multimodal models to process resume content, including images and text. This is a full application for understanding and processing resume content for tracking.

Skills Gained:

Multimodal AI, resume processing, ATS development.

Resources:

LLM Fine-Tuning Projects

15. Fine Tune Phi-2 Model on Your Dataset

Difficulty Level: 3/5

Description:

Tailor a smaller language model, Phi-2, for specific tasks using fine-tuning. Requires understanding of model architectures and training procedures.

Skills Gained:

LLM fine-tuning, model adaptation, smaller model optimization.

Resources:

16. Fine Tune a Multimodal LLM "IDEFICS 9B" for Visual Question Answering

Difficulty Level: 4/5

Description:

Adapts the IDEFICS 9B model for visual question answering through fine-tuning. Requires knowledge of fine-tuning and visual question answering.

Skills Gained:

Multimodal fine-tuning, visual question answering, LLM customization.

Resources:

17. Fine Tune Multimodal LLM "Idefics 2" using QLoRA

Difficulty Level: 4/5

Description:

Uses QLoRA to fine-tune the multimodal Idefics 2 model. Requires a deep understanding of both multimodal models and the QLoRA technique.
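
To make the appeal of QLoRA more concrete, here is a rough sketch of the arithmetic behind low-rank adapters. The layer shape (4096 x 4096) and rank (16) are illustrative assumptions, not values taken from Idefics 2. QLoRA trains small adapter matrices on top of frozen base weights stored in 4-bit precision; the storage figures below are estimates that ignore quantization overhead.

```python
def lora_param_count(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters added by one LoRA adapter: B (d_out x r) and A (r x d_in)."""
    return rank * (d_out + d_in)


def full_param_count(d_out: int, d_in: int) -> int:
    """Parameters in the frozen dense weight matrix W (d_out x d_in)."""
    return d_out * d_in


# Hypothetical layer shape and rank, chosen only for illustration.
d_out, d_in, rank = 4096, 4096, 16

full = full_param_count(d_out, d_in)
adapter = lora_param_count(d_out, d_in, rank)

# Approximate storage for the frozen weights at 16-bit vs 4-bit precision
# (ignores quantization constants and other overhead).
bytes_fp16 = full * 2
bytes_4bit = full // 2

print(f"frozen weights: {full:,} params")
print(f"LoRA adapter:   {adapter:,} trainable params ({adapter / full:.2%} of the layer)")
print(f"fp16 storage:   {bytes_fp16 / 2**20:.0f} MiB, 4-bit storage: {bytes_4bit / 2**20:.0f} MiB")
```

The point of the sketch is only the ratio: the adapter is a small fraction of the layer it modifies, which is why this style of fine-tuning fits on modest hardware.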

Skills Gained:

Multimodal fine-tuning, QLoRA, model optimization.

Resources:

18. Fine Tune Qwen2 VL Model using Llama Factory

Difficulty Level: 3/5

Description:

Fine-tunes the Qwen2 VL model for specific applications using Llama Factory. Requires experience with both the model and the Llama Factory tool.

Skills Gained:

Multimodal fine-tuning, Llama Factory, model adaptation.

Resources:

19. Fine-Tuning with ReFT: Create an Emoji LLM for Medical Diagnosis

Difficulty Level: 4/5

Description:

Uses fine-tuning techniques to create a medical diagnosis model that generates emojis. Involves creatively applying fine-tuning to a medical and creative task.

Skills Gained:

Fine-tuning, medical diagnosis, creative LLM applications.

Resources:

20. Fine Tune DeepSeek Model on your Custom Dataset

Difficulty Level: 3/5

Description:

Trains the DeepSeek model on a custom dataset to tailor it for specific tasks. Requires dataset management skills as well as experience with fine-tuning.

Skills Gained:

LLM fine-tuning, model customization, dataset management.

Resources:

21. GRPO Crash Course: Fine-Tuning DeepSeek for MATH!

Difficulty Level: 1/5

Description:

Optimizes the DeepSeek model for math-related tasks using GRPO. Requires an understanding of mathematical reasoning with LLMs and of Group Relative Policy Optimization (GRPO).
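
As a rough illustration of the "group relative" part of GRPO (Group Relative Policy Optimization), the sketch below scores each sampled answer against the other answers drawn for the same prompt. It covers only the advantage computation; the policy update, clipping, and KL penalty are omitted. The reward values are hypothetical, and the use of the population standard deviation is an assumption of this sketch.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """Normalize each reward against the mean and std of its own group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps avoids division by zero when every answer in the group gets the same reward.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four sampled answers to one math prompt:
# 1.0 if the final answer was correct, 0.0 otherwise.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Answers that beat their group's average get a positive advantage and the rest get a negative one, so no separate value model is needed to produce a baseline.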

Skills Gained:

LLM fine-tuning, GRPO, mathematical reasoning with LLMs.

Resources:

22. Fine Tune Llama 3 using ORPO

Difficulty Level: 3/5

Description:

Optimizes the Llama 3 model using the ORPO technique. Requires a strong understanding of fine-tuning.

Skills Gained:

LLM fine-tuning, ORPO, model optimization.

Resources:

23. Train a Small Language Model for Disease Symptoms

Difficulty Level: 3/5

Description:

Creates a model specifically for identifying disease symptoms. Combines medical domain knowledge with a fine-tuned language model.

Skills Gained:

LLM fine-tuning, medical applications, symptom identification.

Resources:

24. Make LLM Fine Tuning 5x Faster with Unsloth

Difficulty Level: 2/5

Description:

Improves LLM fine-tuning speed by using Unsloth. Requires an understanding of optimization techniques for fine-tuning.

Skills Gained:

LLM fine-tuning, performance optimization, Unsloth.

Resources:

25. Multi GPU Fine Tuning of LLM using DeepSpeed and Accelerate

Difficulty Level: 3/5

Description:

Fine-tunes large language models using multiple GPUs with DeepSpeed and Accelerate. Requires advanced knowledge of distributed training techniques.

Skills Gained:

LLM fine-tuning, distributed training, DeepSpeed, Accelerate.

Resources:

RAG (Retrieval Augmented Generation) Projects

26. End To End Advanced RAG Project using Open Source LLM Models And Groq Inferencing engine

Difficulty Level: 3/5

Description:

Creates a full RAG pipeline with ingestion, retrieval, and generation stages. Requires experience with the full RAG pipeline.
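
The three stages named above can be sketched in plain Python. This is a toy illustration, not the tutorial's implementation: the bag-of-words "embedding" stands in for a real embedding model, the corpus is hypothetical, and the generation stage stops at building the prompt that would be sent to an LLM.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    return re.findall(r"[a-z0-9]+", text.lower())


def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words term-count vector."""
    return Counter(tokenize(text))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def ingest(documents: list[str]) -> list[tuple[str, Counter]]:
    """Ingestion: store each chunk alongside its vector."""
    return [(doc, embed(doc)) for doc in documents]


def retrieve(index: list[tuple[str, Counter]], query: str, k: int = 2) -> list[str]:
    """Retrieval: return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]


def build_prompt(query: str, context: list[str]) -> str:
    """Generation input: the prompt that would be sent to the LLM."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only the context below.\n\nContext:\n{joined}\n\nQuestion: {query}"


# Hypothetical corpus for illustration.
docs = [
    "Groq provides low-latency inference for open-source LLMs.",
    "FAISS is a library for vector similarity search.",
    "Pizza dough needs time to rise before baking.",
]
index = ingest(docs)
question = "Which library does vector similarity search?"
top = retrieve(index, question, k=1)
print(build_prompt(question, top))
```

In a real project each function would be swapped for a production component (a document loader and chunker, an embedding model with a vector store, and an LLM call), but the data flow stays the same.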

Skills Gained:

RAG, end-to-end pipeline development, data ingestion, retrieval, generation.

Resources:

27. RAG Pipeline from Scratch Using OLlama Python & Llama2

Difficulty Level: 2/5

Description:

Builds a RAG pipeline using Ollama, Python, and Llama 2. Uses core libraries to create a RAG system.

Skills Gained:

RAG, Ollama, Llama 2, Python.

Resources:

28. RAG Application using Langchain, OpenAI and FAISS

Difficulty Level: 2/5

Description:

Implements RAG using Langchain, OpenAI, and the FAISS vector database. Uses popular libraries to create a RAG system.

Skills Gained:

RAG, Langchain, OpenAI, FAISS.

Resources:

29. RAG Application using Langchain Mistral AI and Weviate db

Difficulty Level: 3/5

Description:

Uses Mistral AI and Weaviate as part of the RAG architecture. Requires a good understanding of both tools.

Skills Gained:

RAG, Langchain, Mistral AI, Weaviate.

Resources:

30. RAG Application Using OpenSource Framework LlamaIndex and Mistral-AI

Difficulty Level: 3/5

Description:

Utilizes LlamaIndex and Mistral-AI for a RAG system. Uses LlamaIndex as an open source framework and Mistral-AI for its models.

Skills Gained:

RAG, LlamaIndex, Mistral AI, open-source framework.

Resources:

31. RAG Pipeline Using Haystack and OpenAI

Difficulty Level: 3/5

Description:

Builds a RAG pipeline with the Haystack framework and OpenAI. Requires an understanding of the Haystack framework as well as OpenAI.

Skills Gained:

RAG, Haystack, OpenAI.

Resources:

32. RAG Application Using Haystack MistralAI Pinecone & FastAPI

Difficulty Level: 3/5

Description:

Uses Haystack, MistralAI, Pinecone, and FastAPI to create an end-to-end RAG application. Combines multiple technologies to build a more complete app.

Skills Gained:

RAG, Haystack, MistralAI, Pinecone, FastAPI.

Resources:

33. End To End Document Q&A RAG App With Gemma And Groq API

Difficulty Level: 3/5

Description:

Implements an end-to-end document Q&A RAG app with Google Gemma and the Groq API.

Skills Gained:

RAG, Google Gemma, Groq API.

Resources:

34. Building Real-Time RAG Pipeline With Mongodb and Pinecone

Difficulty Level: 3/5

Description:

Builds a RAG pipeline for real-time applications using MongoDB and Pinecone.

Skills Gained:

RAG, real-time processing, MongoDB, Pinecone.

Resources:

35. Chat With Multiple Documents using AstraDB and Langchain

Difficulty Level: 3/5

Description:

Creates a chatbot that can process multiple document types using AstraDB and Langchain. Uses these two tools to make an application.

Skills Gained:

RAG, chatbots, AstraDB, Langchain.

Resources:

36. Built Powerful Multimodal RAG using Vertex AI(GCP), AstraDb and Langchain

Difficulty Level: 3/5

Description:

Uses Vertex AI, AstraDB, and Langchain to build a multimodal RAG. Incorporates multiple cloud services in a RAG system.

Skills Gained:

Multimodal RAG, Vertex AI, AstraDB, Langchain.

Resources:

37. RAG Based Chatbot With Memory(Chat History)

Difficulty Level: 2/5

Description:

Creates a RAG-based chatbot with chat history. Maintaining the chat history adds an extra level of complexity.
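
One simple way to add memory is to keep a bounded window of recent turns and prepend it to the prompt along with the retrieved context. The sketch below assumes that approach. The `ChatMemory` class, the `build_prompt` helper, and the sample conversation are all hypothetical names and data, and retrieval and the LLM call are left out.

```python
from collections import deque


class ChatMemory:
    """Keeps the most recent turns so follow-up questions have context."""

    def __init__(self, max_turns: int = 3):
        # A bounded deque drops the oldest turn automatically once full.
        self.turns = deque(maxlen=max_turns)

    def add(self, user: str, assistant: str) -> None:
        self.turns.append((user, assistant))

    def as_text(self) -> str:
        return "\n".join(f"User: {u}\nAssistant: {a}" for u, a in self.turns)


def build_prompt(memory: ChatMemory, retrieved: list[str], question: str) -> str:
    """Combine chat history, retrieved chunks, and the new question."""
    context = "\n".join(f"- {chunk}" for chunk in retrieved)
    return (
        f"Chat history:\n{memory.as_text()}\n\n"
        f"Context:\n{context}\n\n"
        f"User: {question}\nAssistant:"
    )


memory = ChatMemory(max_turns=2)
memory.add("What is RAG?", "Retrieval-augmented generation.")
memory.add("Which stores can it use?", "Vector stores such as FAISS.")
memory.add("Is it fast?", "That depends on the index.")  # evicts the oldest turn

prompt = build_prompt(memory, ["FAISS supports approximate search."], "How does that one scale?")
print(prompt)
```

A fixed window is the simplest policy; summarizing older turns or rewriting the follow-up question into a standalone query before retrieval are common alternatives.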

Skills Gained:

RAG, chatbots, chat history management, memory implementation.

Resources:

Agentic AI Projects

38. AI Agents from Scratch using Open Source AI

Difficulty Level: 3/5

Description:

Build AI agents from scratch using open-source AI tools to summarize, write, and sanitize sensitive information without pre-built frameworks.
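
As a small illustration of the sanitizing step, here is a framework-free sketch that redacts email addresses and phone numbers before text is passed on. It is a regex-based stand-in, not the tutorial's method, which may well use an LLM for this. The patterns are deliberately simple assumptions and will miss many real-world formats.

```python
import re

# Simplified patterns, assumed for illustration only.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def sanitize(text: str) -> str:
    """Replace emails and phone numbers with placeholder tokens."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)


sample = "Contact Jane at jane.doe@example.com or +1 (555) 010-9999."
print(sanitize(sample))
```

In an agent pipeline, a function like this would typically run as its own step so that the summarizing and writing agents never see the raw identifiers.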

Skills Gained:

AI agent development, open-source AI, summarization, information sanitization.

Resources:

39. AgentOps Library: Build Your Own AI Agents Monitoring Framework

Difficulty Level: 3/5

Description:

Build an AgentOps library in Python to monitor AI agents effectively, including installation, key metrics monitoring, and data visualization.
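
The kind of metrics such a monitoring layer collects can be sketched with a decorator that records call counts, errors, and latency per agent. This is a minimal home-grown sketch and does not use the API of any published AgentOps package. The `monitored` decorator, the `metrics` store, and the `summarize` agent are hypothetical.

```python
import time
from collections import defaultdict
from functools import wraps

# Per-agent counters; a real library would export these to a dashboard.
metrics = defaultdict(lambda: {"calls": 0, "errors": 0, "total_seconds": 0.0})


def monitored(name: str):
    """Record calls, errors, and elapsed time for the wrapped agent function."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            metrics[name]["calls"] += 1
            try:
                return fn(*args, **kwargs)
            except Exception:
                metrics[name]["errors"] += 1
                raise
            finally:
                metrics[name]["total_seconds"] += time.perf_counter() - start
        return wrapper
    return decorator


@monitored("summarizer")
def summarize(text: str) -> str:
    # Stand-in for an LLM call.
    if not text:
        raise ValueError("empty input")
    return text[:20]


summarize("A long document about agents.")
try:
    summarize("")
except ValueError:
    pass

print(dict(metrics))
```

From here, the visualization step amounts to plotting these counters over time, for example error rate and average latency per agent.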

Skills Gained:

AI agent monitoring, AgentOps library development, Python, data visualization.

Resources:

40. Build a Multi-Agent AI App from Scratch – no frameworks needed

Difficulty Level: 3/5

Description:

Build a Multi-Agent AI System from Scratch using Python and OpenAI's GPT-4o model with a Streamlit web interface for specialized tasks.

Skills Gained:

Multi-agent systems, Python, OpenAI GPT-4o, Streamlit web interface.

Resources:

41. Autogen AI Agents: AI Debates – Pizza vs. Sushi

Difficulty Level: 3/5

Description:

Use Autogen to enable LLMs like Claude and GPT-4o to debate "Which is tastier, Pizza or Sushi?".

Skills Gained:

Multi-agent systems, Autogen framework, LLM collaboration, agent role management.

Resources:

42. Production Grade AI Agents using LangGraph (Map Reduce Implementation)

Difficulty Level: 3/5

Description:

Leverage the LangGraph framework to build robust, stateful, multi-actor applications using map-reduce patterns.
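
The map-reduce pattern itself is independent of any framework: fan a task out over chunks in parallel, then combine the partial results. The sketch below shows that shape with the standard library only. It does not use LangGraph's API, and the `map_step` function is a stand-in for a per-chunk LLM call.

```python
from concurrent.futures import ThreadPoolExecutor


def map_step(chunk: str) -> str:
    """Stand-in for a per-chunk LLM call: keep the first sentence."""
    return chunk.split(".")[0].strip() + "."


def reduce_step(partials: list[str]) -> str:
    """Combine the partial results into one output."""
    return " ".join(partials)


def map_reduce(chunks: list[str]) -> str:
    with ThreadPoolExecutor(max_workers=4) as pool:
        # pool.map returns results in input order, even if workers finish out of order.
        partials = list(pool.map(map_step, chunks))
    return reduce_step(partials)


# Hypothetical input chunks for illustration.
chunks = [
    "Agents plan tasks. They also call tools.",
    "State is shared between nodes. Edges define control flow.",
]
summary = map_reduce(chunks)
print(summary)
```

In a graph-based framework the map step becomes a node that is invoked once per chunk and the reduce step becomes a node that aggregates their outputs from shared state.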

Skills Gained:

LangGraph framework, stateful applications, multi-actor applications, map-reduce patterns.

Resources:

43. Build an Agentic RAG using Crew AI

Difficulty Level: 3/5

Description:

Enhance AI capabilities by combining retrieval mechanisms with generative models to create intelligent, autonomous agents using Crew AI.

Skills Gained:

Agentic RAG, Crew AI, retrieval mechanisms, generative models, autonomous agents.

Resources:

44. Build Multi-agent AI system for Investment Risk Analysis

Difficulty Level: 3/5

Description:

Create a multi-agent AI system for investment risk analysis. Configure agents to monitor market data and develop trading strategies.

Skills Gained:

Agentic RAG, Crew AI, generative models.

Resources:

45. Build a Research Assistant AI Agent using Crew AI

Difficulty Level: 3/5

Description:

Build a simple AI agent for healthcare research using the Crew.ai framework.

Skills Gained:

Agentic RAG, Crew AI, retrieval mechanisms, generative models, autonomous agents.

Resources:

46. ADVANCED Python AI Multi-Agent Project

Difficulty Level: 3/5

Description:

Build an advanced multi-agent AI app with Python, Langflow, Astra DB, Streamlit, and more.

Skills Gained:

Agentic RAG, Streamlit, Langflow.

Resources:

47. Academic Task and Learning Agent System

Difficulty Level: 4/5

Description:

Build an intelligent multi-agent system that transforms the way students manage their academic life using LangGraph's workflow framework.

Skills Gained:

Multi-agent systems, workflow orchestration, personalized academic support.

Resources:

Code

48. ClauseAI

Difficulty Level: 3/5

Description:

Develop an AI agent to assist with legal clause analysis and management.

Skills Gained:

Legal AI, text analysis, clause management.

Resources:

Code

49. Content Intelligence

Difficulty Level: 3/5

Description:

Create an AI agent for content analysis and intelligence gathering.

Skills Gained:

Content analysis, intelligence gathering, NLP techniques.

Resources:

Code

50. EU Green Compliance FAQ Bot

Difficulty Level: 3/5

Description:

Build a FAQ bot to assist with EU Green Compliance queries.

Skills Gained:

Compliance assistance, FAQ bots, question answering systems.

Resources:

Code

51. ShopGenie

Difficulty Level: 3/5

Description:

Develop an AI shopping assistant to help users find and compare products.

Skills Gained:

Shopping assistance, product comparison, AI recommendations.

Resources:

Code

52. Weather Disaster Management AI Agent

Difficulty Level: 4/5

Description:

Create an AI agent for managing and responding to weather disasters.

Skills Gained:

Disaster management, weather forecasting, emergency response.

Resources:

Code

53. Career Assistant for Hackathons

Difficulty Level: 3/5

Description:

Develop an AI-powered mentor designed to simplify and support your journey through Generative AI learning, resume preparation, interview preparation, and job hunting.

Skills Gained:

Hackathon preparation, career assistance, AI guidance.

Resources:

Code

54. AInsight LangGraph

Difficulty Level: 3/5

Description:

Create an AI agent to provide insights and analysis using LangGraph. AInsight automatically collects, processes, and summarizes AI/ML news for general audiences.

Skills Gained:

Data analysis, insights generation, LangGraph framework.

Resources:

Code

55. Blog Writer Swarm

Difficulty Level: 3/5

Description:

Develop a swarm of AI agents to assist with blog writing and content creation.

Skills Gained:

Content creation, blog writing, swarm intelligence.

Resources:

Code

56. Business Meme Generator

Difficulty Level: 3/5

Description:

Build an AI agent to generate business-themed memes for marketing and social media.

Skills Gained:

Meme generation, marketing assistance, AI creativity.

Resources:

Code

Music and Audio Generation Projects

57. Text to Song Generation (With Vocals + Music) App using Generative AI

Difficulty Level: 3/5

Description:

Creates an application that generates songs, including vocals, from text using generative AI.