Llama 3 RAG on Google Colab

This repository contains an implementation of Retrieval-Augmented Generation (RAG) using the Llama 3 model on Google Colab. This project integrates LangChain and Chroma for document retrieval and embedding, demonstrating how to combine a retrieval system with a powerful language model for answering questions based on a custom dataset.

Overview

Retrieval-Augmented Generation (RAG) is a technique that enhances the capabilities of a language model by incorporating external knowledge sources, such as documents or web pages. This project leverages RAG by:

Using Llama 3 as the language model to generate answers.
Employing Chroma as the vector store for efficient document retrieval.
Utilizing LangChain for chaining language model and retrieval steps.

Features

Document Retrieval: Retrieve relevant documents based on a user’s query.
LLM Response Generation: Generate answers using Llama 3, augmented by the retrieved documents.
Google Colab Integration: The entire setup is designed to run on Google Colab, enabling free and accessible experimentation.

Setup Instructions

1. Clone the repository

git clone https://github.com/SamuelJayasingh/Llama3_RAG_for_Web.git
cd Llama3_RAG_for_Web

2. Install Required Packages

Install the necessary Python libraries using pip. Ensure your environment includes the following dependencies:

pip install langchain chromadb flask pandas requests gradio ollama

3. Open in Google Colab

To run this project in Google Colab:

Go to Google Colab.
Upload the .ipynb notebook included in this repository.
Run the cells to set up the environment and run the Llama 3 RAG model.

4. Prepare Your Data

To test the model on your own dataset:

Upload your data (URLs or text) in a CSV format.
The URLs will be loaded and split into chunks using LangChain’s text splitter.
The model will generate embeddings using OllamaEmbeddings and store them in Chroma for retrieval.

5. Run the Retrieval and Generation Pipeline

After setting up your dataset, you can ask questions to the Llama 3 model. The system will:

Retrieve relevant documents from the Chroma vector store.
Use Llama 3 to generate an answer based on the retrieved context.

6. Using the Gradio Interface

This project includes a Gradio-based interface for interacting with the RAG pipeline. Launch the Gradio UI by running the code, then enter your question in the text box to get a response.

Example Usage

Load your documents (from URLs or CSV files) into the vector store.

Ask a question through the interface:

Question: What is the role of LangChain in this project?

The system retrieves relevant documents and Llama 3 generates a response:

LangChain helps to manage and chain the different components of the retrieval and generation process. It connects the document retrieval system with the language model to provide context-aware answers.

Key Technologies

Llama 3: The language model used to generate context-aware answers.
LangChain: A framework for integrating LLMs with external sources of data, like databases or APIs.
Chroma: A vector database used to store document embeddings and enable fast retrieval.
Google Colab: A free, cloud-based platform for running Python code, including machine learning projects.
Gradio: A web interface that allows easy interaction with machine learning models.

Resources

LangChain Documentation: LangChain
Chroma Documentation: Chroma
Llama 3 Model: Meta AI LLaMA
Google Colab: Google Colab

Contribution

If you have any suggestions to this README or about the Script, feel free to inform me. And if you liked, you are free to use it for yourself.(P.S. Star it too!! 😬 )
Your Contributions are much welcomed here!

Fork the project

Compile your work

Call in for a Pull Request

Credits: Samuel Jayasingh

Last Edited on: 15/10/2024

License

This project is licensed under the MIT License. See the LICENSE file for details.