BAGEL is an open source multimodal base model developed by the ByteDance Seed team and hosted on GitHub. It integrates text comprehension, image generation, and editing capabilities to support cross-modal tasks. The model has 7B active parameters (14B parameters in total) and uses Mixture-of-Tra...
DeepResearchAgent is an open source AI tool developed by SkyworkAI that focuses on automating deep research. It helps users quickly generate detailed research reports by combining search engines, web crawling, and large language models (LLMs). Users simply enter a research topic or question, and the tool automatically searches...
Muscle-Mem is an open source Python tool hosted on GitHub and developed by pig-dot-dev. It is designed to provide behavioral caching capabilities for AI agents to help reduce large language model (LLM) calls in repetitive tasks, thereby increasing runtime speed, reducing variability, and saving costs....
Simple Subtitling is an open source subtitle generation tool that automatically generates subtitles and labels speakers for video or audio files. The project, developed by Jaesung Huh and hosted on GitHub, aims to provide a simple and efficient subtitle generation solution. Through audio processing technology, the tool .....
arXiv Summarizer is an open source Python script, hosted on GitHub, designed to help users quickly retrieve academic papers from arXiv and generate summaries of them. It uses the free Gemini API for efficient text summarization and is suitable for researchers, students and academic...
Sim Studio is an open source platform for building AI agent workflows. It helps users quickly design, test, and deploy large language model (LLM) workflows through a lightweight, intuitive visual interface. Users can create complex multi-agent applications by drag-and-drop, without deep programming knowledge. It supports this ...
Mad Professor (暴躁的教授读论文) is an open source AI academic tool designed for researchers and students to simplify the reading and analysis of academic papers. It integrates PDF processing, AI translation, RAG search, AI Q&A and voice interaction. Users can import PDF papers...
AIstudioProxyAPI is an open source project that uses Node.js and Playwright to turn the Gemini chat functionality of the Google AI Studio web interface into a standard API by emulating the OpenAI API ...
Step1X-Edit is an open source image editing framework developed by the Stepfun AI team and hosted on GitHub. It combines a multimodal large language model (Qwen-VL) and a diffusion transformer (DiT) to allow users to edit an image with simple natural language commands, such as changing the background, removing an object, or transforming the style ....
Klavis AI is an open source platform focused on simplifying the use and integration of the Model Context Protocol (MCP), an open standard that allows AI applications to dynamically connect with external tools and data sources. Klavis AI offers Slack and Discord clients, hosted MCP servers, and simplicity...
RealtimeVoiceChat is an open source project focused on real-time, natural voice conversations with artificial intelligence. Users speak into a microphone; the system captures the audio through the browser, quickly converts it to text, generates a reply from a large language model (LLM), and then converts the reply text to speech output. The whole...
MiMo is an open source large language model project developed by Xiaomi, focusing on mathematical reasoning and code generation. The core product is the MiMo-7B family of models, which consists of a base model (Base), a supervised fine-tuning model (SFT), a reinforcement learning model trained from the base model (RL-Zero), and an SFT model trained from...
Muyan-TTS is an open source text-to-speech (TTS) model designed for podcasting scenarios. It is pre-trained on over 100,000 hours of podcast audio data and supports zero-shot speech synthesis to generate high-quality natural speech. The model is built on Llama-3.2-3B, and combined with the SoVITS decoder, it provides high...
CAD-MCP is an open source project that allows users to control drawing operations in CAD software through natural language commands. It combines natural language processing and CAD automation technology, so that users do not need to operate the CAD interface manually and can simply enter text commands to create and modify drawings. The project supports a variety of ...
GraphGen is an open-source framework developed by OpenScienceLab, an AI lab in Shanghai, hosted on GitHub, focused on optimizing supervised fine-tuning of Large Language Models (LLMs) by guiding synthetic data generation through knowledge graphs. It constructs fine-grained knowledge graphs from source text, utilizing the expected calibration error...
ACI.dev is an open source infrastructure platform designed to let AI agents integrate rapidly with more than 600 tools. It ensures that agents have secure access to tools such as Google Calendar, Slack, and Brave Search through multi-tenant authentication and fine-grained permissions management. Developers can...
llm.pdf is an open source project that allows users to run Large Language Models (LLMs) directly in PDF files. This project, developed by EvanZhouDev and hosted on GitHub, demonstrates an innovative approach: compiling llama.cpp via Emscripten as ...
Abogen is an open source tool designed to quickly convert ePub, PDF or plain text files to high quality audio. It uses the Kokoro-82M model to generate natural and smooth speech, and also supports synchronized subtitle generation, making it suitable for audiobooks, video dubbing or learning aids. Users can choose...
Local Deep Research is an open source AI research assistant designed to help users conduct deep research and generate detailed reports on complex problems. It can run locally, allowing users to accomplish research tasks without relying on cloud services. The tool combines local large language models (LLMs)...