Vespa.ai: an open source platform for building efficient AI search and recommendation systems
Vespa.ai is an open source AI search and recommendation platform that focuses on processing large-scale data to provide efficient search, recommendation and personalization services. It supports vector search, text search and structured data processing, combined with machine learning models for real-time inference. Vespa can handle hundreds of millions of data items, response speed...
NodeRAG: A Heterogeneous Graph-Based Tool for Accurate Information Retrieval and Generation
NodeRAG is an open source Retrieval Augmented Generation (RAG) system hosted on GitHub and developed by Terry-Xu-666. It optimizes information retrieval and generation through heterogeneous graph structures, significantly improving retrieval accuracy and contextual relevance. NodeRAG supports local deployment, provides a user-friendly interface and can be ...
Morphik Core: an open source RAG platform for processing multimodal data
Morphik Core is an open source project developed by the morphik-org team and hosted on GitHub. It was formerly called DataBridge Core. This tool is a database designed for AI applications that can process text, images...
Rankify: a Python toolkit supporting information retrieval and reranking
Rankify is an open source Python toolkit developed by the Data Science Group at the University of Innsbruck, Austria. It focuses on information retrieval, reranking and retrieval-augmented generation (RAG), providing a unified framework. The toolkit comes with 40 built-in pre-retrieved benchmark datasets, support for 7 retrieval techniques and 24 ...
HippoRAG: A multi-hop knowledge retrieval framework based on long-term memory
HippoRAG is an open source framework developed by the OSU-NLP group at The Ohio State University, inspired by human long-term memory mechanisms. It combines Retrieval Augmented Generation (RAG), knowledge graphs, and Personalized PageRank to help Large Language Models (LLMs) continuously integrate knowledge from external documents. Hippo...
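As a rough illustration of the Personalized PageRank idea mentioned above, the following is a minimal sketch in plain Python, not HippoRAG's actual implementation. The function name, the toy entity graph and the seed choice are all hypothetical. The idea is that random-walk restarts jump back only to the seed nodes (for example, entities found in the query), so nodes close to the seeds in the graph accumulate more score than unrelated ones.

```python
def personalized_pagerank(graph, seeds, damping=0.85, iterations=50):
    """Power-iteration Personalized PageRank (illustrative sketch).

    graph: dict mapping each node to a list of neighbour nodes; every
           neighbour must also appear as a key.
    seeds: non-empty set of nodes (all present in graph) that the walk
           restarts from.
    """
    if not seeds:
        raise ValueError("seeds must not be empty")

    # Restart distribution: uniform over the seed nodes, zero elsewhere.
    restart = {node: (1.0 / len(seeds) if node in seeds else 0.0) for node in graph}
    rank = dict(restart)

    for _ in range(iterations):
        # With probability (1 - damping) the walk jumps back to a seed.
        nxt = {node: (1.0 - damping) * restart[node] for node in graph}
        for node, neighbours in graph.items():
            if not neighbours:
                # Dangling node: return its mass to the seeds.
                for target in graph:
                    nxt[target] += damping * rank[node] * restart[target]
                continue
            # Otherwise spread the node's mass evenly over its neighbours.
            share = damping * rank[node] / len(neighbours)
            for target in neighbours:
                nxt[target] += share
        rank = nxt
    return rank


# Hypothetical toy knowledge graph with two disconnected components.
graph = {
    "Stanford": ["Alice"],
    "Alice": ["Stanford", "Alzheimers"],
    "Alzheimers": ["Alice"],
    "Bob": ["Paris"],
    "Paris": ["Bob"],
}

# Assume the query mentions "Stanford", so it is the only seed.
scores = personalized_pagerank(graph, seeds={"Stanford"})
ranked = sorted(scores, key=scores.get, reverse=True)
```

In this toy graph, the nodes reachable from the seed ("Stanford", "Alice", "Alzheimers") receive all of the score, while the disconnected "Bob" and "Paris" stay at zero, which is the behaviour that makes this kind of ranking useful for multi-hop retrieval.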
LettuceDetect: an efficient tool for detecting hallucinations in RAG systems
LettuceDetect is a lightweight open-source tool developed by KRLabsOrg that specializes in detecting hallucinated content generated by Retrieval Augmented Generation (RAG) systems. It helps developers improve the accuracy of RAG systems by comparing context, questions and answers, and identifying parts of the answer that are not supported by the context...
dsRAG: A Retrieval Engine for Unstructured Data and Complex Queries
dsRAG is a high-performance retrieval engine designed to handle complex queries on unstructured data. It performs particularly well in handling challenging queries in dense text such as financial reports, legal documents, and academic papers. dsRAG employs three key approaches to improve performance: semantic segmentation, contextual self...
VideoRAG: A RAG framework for understanding ultra-long videos with support for multimodal retrieval and knowledge graph construction
VideoRAG is a retrieval-augmented generation framework designed for processing and understanding extremely long-context videos. The tool combines a graph-driven textual knowledge base with hierarchical multimodal context encoding to efficiently process hundreds of hours of video content on a single NVIDIA RTX 3090 GPU. VideoRAG works by moving...
PRAG: Parametric Retrieval-Augmented Generation Tool for Improving the Performance of Q&A Systems
PRAG (Parametric Retrieval-Augmented Generation) is an innovative retrieval-augmented generation tool that aims to enhance generation by embedding external knowledge directly into the parameter space of a Large Language Model (LLM). The tool overcomes the limitations of traditional contextual retrieval-augmented generation methods...
ColiVara: Visual Embedding Based Document Storage and Retrieval Service
ColiVara is a document storage and retrieval service based on visual embedding technology. It eliminates the need for Optical Character Recognition (OCR) or text extraction and avoids the problem of broken tables or lost images. ColiVara supports over 100 file formats including PDF, DOCX, PPTX, etc. and is able to automatically capture web pages...
Deeptrain: converting video content into information retrievable by large models
Deeptrain is a platform focused on AI video processing that integrates video content into various AI applications and supports over 200 language models. Users can train models directly by providing video URLs without having to download the videos. Deeptrain offers the ability to create video from...
UltraRAG: A One-Stop RAG System Solution to Simplify Data Construction and Model Fine-Tuning
UltraRAG is a RAG (Retrieval Augmented Generation) system solution jointly proposed by the THUNLP group at Tsinghua University, the NEUIR group at Northeastern University, ModelBest Inc. and the 9#AISoft team. The framework is based on agile deployment and modular construction, providing automated data construction, model fine-tuning, and inference evaluation technology body...
Fast GraphRAG: A Highly Accurate and Low-Cost Graph-Based Retrieval-Augmented Generation Tool
Fast GraphRAG is an open source tool developed by Circlemind AI to enable efficient and accurate retrieval-augmented generation (RAG) through knowledge graphs and the PageRank algorithm. The tool intelligently adapts to the user's usage scenarios, data and query requirements, providing interpretable, low-cost and efficient ...
Ragas: assessing RAG recall, QA accuracy and answer relevance
Ragas is a tool specifically designed to evaluate and optimize Retrieval Augmented Generation (RAG) systems. It provides a comprehensive set of evaluation metrics by analyzing the relationships between queries, retrieved contexts, and generated answers. These metrics include faithfulness, answer relevance, context relevance, context recall, and on...
Orama: a high-performance full-text and vector search engine
Orama is an open source high-performance search engine, written entirely in TypeScript, supporting full-text search, vector search and hybrid search. Orama is designed to work in any JavaScript runtime environment, providing fast and reliable search functionality. It is designed to be lightweight (less than 2KB) ...
XRAG: A Visual Evaluation Tool for Optimizing Retrieval-Augmented Generation Systems
XRAG (eXamining the Core) is a benchmarking framework designed for evaluating the underlying components of advanced retrieval-augmented generation (RAG) systems. By profiling and analyzing each core module, XRAG provides insights into how different configurations and components affect the overall performance of a RAG system. The framework supports a wide range of retrieval...
MiniRAG: A Simplified Retrieval-Augmented Generation Framework That Uses Entity Graph Indexing to Recall Relevant Text Chunks
MiniRAG is an extremely simple Retrieval Augmented Generation (RAG) framework that aims to enable good RAG performance even for small models through heterogeneous graph indexing and lightweight topology-enhanced retrieval. Developed by the Hong Kong University Data Science Laboratory (HKUDS), the project focuses on addressing small language models (SLMs) in the context of existing RA...
Cognita: An Open Source Framework for Building Modular RAG Applications and Rapidly Testing Diverse RAG Strategies
Cognita is an open source framework developed by TrueFoundry to simplify the development of RAG (Retrieval-Augmented Generation) based applications. The framework provides a structured, modular solution that makes it easy to take RAG technology from the prototype stage...
Vanna: Generating Accurate SQL Queries from Text Using RAG Techniques
Vanna is an MIT-licensed open source Python framework focused on generating SQL queries through RAG (Retrieval Augmented Generation) techniques. Users can train RAG models, apply them to their own data, and then ask questions, and Vanna will return appropriate SQL queries. These queries can be automatically run in a database...