Open Source Tools for Local Deployment of Large Models

LMCache: A Key-Value Cache Optimization Tool for Accelerating Large Language Model Inference
LMCache is an open source key-value (KV) cache optimization tool designed to make Large Language Model (LLM) inference more efficient. It reduces inference time and GPU resource consumption by caching and reusing the model's intermediate computation results (the KV cache), which makes it especially suitable for long-context scenarios. LMCache works with vL...
08-04 | 1.7K | 0 kudos
FastDeploy: An Open Source Tool for Rapid Deployment of AI Models
FastDeploy is an open source tool developed by the PaddlePaddle team that focuses on rapid deployment of deep learning models. It supports a wide range of hardware and frameworks, covers more than 20 scenarios such as image, video, text, and speech, and includes more than 150 mainstream models. FastDeploy provides production environment out-of-the-box ...
07-29 | 1.0K | 0 kudos
Web - macOS AI Browser: A Native AI-Powered Browser for macOS
Web is an open source macOS browser project developed by nuance-dev and hosted on GitHub. It is built on Apple's WebKit engine, uses the SwiftUI and Combine frameworks, and follows the MVVM architecture. The core feature of Web is the set of ...
07-29 | 1.1K | 0 kudos
Transformers: An Open Source Machine Learning Framework with Support for Text, Image, and Multimodal Tasks
Transformers is an open source machine learning framework developed by Hugging Face that provides state-of-the-art model definitions for inference and training on text, image, audio, and multimodal tasks. It simplifies the process of using models and is compatible with many mainstream deep learning frameworks such as PyTorch, Tens ...
07-06 | 1.2K | 0 kudos
Local LLM Notepad: A Portable Tool for Running Local Large Language Models Offline
Local LLM Notepad is an open source offline application that lets users run local large language models on any Windows computer from a USB device, with no Internet connection and no installation. Users simply copy a single executable file (EXE) and a model file (e.g. GGUF format) to a USB drive...
07-03 | 1.1K | 0 kudos
llm.pdf: An Experimental Project That Runs a Large Language Model in a PDF File
llm.pdf is an open source project that allows users to run Large Language Models (LLMs) directly in PDF files. This project, developed by EvanZhouDev and hosted on GitHub, demonstrates an innovative approach: compiling llama.cpp via Emscripten as ...
05-05 | 1.8K | 0 kudos
Aana SDK: An Open Source Tool for Easy Deployment of Multimodal AI Models
Aana SDK is an open source framework developed by Mobius Labs, named after the Malayalam word ആന (elephant). It helps developers quickly deploy and manage multimodal AI models that process text, images, audio, video, and other data. Aana SDK is based on the Ray distributed computing framework ...
03-25 | 1.9K | 0 kudos
BrowserAI: Running AI Models Locally in the Browser with WebGPU
BrowserAI is an open source tool that lets users run AI models locally, directly in the browser. It was developed by the Cloud-Code-AI team and supports models such as Llama, DeepSeek, and Kokoro. Users can complete text generation through the browser without a server or complex setup...
03-16 | 2.3K | 0 kudos
LitServe: Rapidly Deploying Enterprise-Grade Inference Services for General AI Models
LitServe is an open source AI model serving engine from Lightning AI, built on FastAPI and focused on rapidly deploying inference services for general-purpose AI models. It supports a wide range of scenarios, from large language models (LLMs), vision models, and audio models to classical machine learning models, providing batch...
03-10 | 1.8K | 0 kudos
Nexa: Compact Multimodal AI Solutions That Run Locally
Nexa AI is a platform focused on multimodal AI solutions that run locally. It offers a wide range of AI models, including Natural Language Processing (NLP), Computer Vision, and Speech Recognition and Generation (ASR and TTS), all of which can run locally on devices without relying on cloud-based services. This ...
02-01 | 2.2K | 0 kudos
vLLM: A Memory-Efficient LLM Inference and Serving Engine
vLLM is a high-throughput, memory-efficient inference and serving engine for Large Language Models (LLMs). Originally developed by the Sky Computing Lab at UC Berkeley, it has become a community project driven by academia and industry. vLLM aims to provide fast, easy...
01-17 | 2.3K | 0 kudos
Llama 3.2 Reasoning WebGPU: Running Llama-3.2 in a Browser
Transformers.js is a JavaScript library provided by Hugging Face designed to run state-of-the-art machine learning models directly in the browser without server support. The library is compatible with Hugging Face's Python version of transformer...
01-15 | 2.2K | 0 kudos
Harbor: A Containerized Toolset for Managing and Running AI Services, with One-Click Deployment of Local LLM Development Environments
Harbor is a containerized LLM toolset focused on simplifying the deployment and management of local AI development environments. Through a clean command line interface (CLI) and a companion application, it lets developers launch and manage all AI service components, including LLM backends, APIs, and front ends, with a single click...
01-02 | 2.8K | 0 kudos
Xinference: Easy Distributed AI Model Deployment and Serving
Xorbits Inference (Xinference for short) is a powerful and versatile library focused on providing distributed deployment and serving of language models, speech recognition models, and multimodal models. With Xorbits Inference, users can easily deploy and serve their own models or built-in advanced models,...
01-02 | 2.1K | 0 kudos
AI Dev Gallery: A Windows-Native AI Model Development Toolset for Integrating On-Device Models into Windows Applications
AI Dev Gallery is an AI development tools application from Microsoft (currently in public preview) designed for Windows developers. It provides a comprehensive platform to help developers integrate AI features into their Windows applications. The most notable feature of the tool is that it provides...
12-30 | 2.5K | 0 kudos
LightLLM: An Efficient, Lightweight Framework for Large Language Model Inference and Serving
LightLLM is a Python-based Large Language Model (LLM) inference and serving framework known for its lightweight design, ease of extension, and efficient performance. The framework leverages a variety of well-known open source implementations, including FasterTransformer, TGI, vLLM, and FlashAtten...
12-17 | 2.3K | 0 kudos
Transformers.js: Running Nearly 700 AI Models Locally in the Browser
Transformers.js is a JavaScript library developed by Hugging Face that lets users run state-of-the-art machine learning models directly in the browser without server support. The library is compatible with Hugging Face's Python trans...
12-02 | 2.5K | 0 kudos
GLM Edge: Zhipu AI Releases On-Device Large Language Models and Multimodal Understanding Models for Mobile, Car, and PC Platforms
GLM-Edge is a series of large language models and multimodal understanding models designed for on-device use, from Tsinghua University (Zhipu Qingyan). These models include GLM-Edge-1.5B-Chat, GLM-Edge-4B-Chat, GLM-Edge-V-2B and GLM-Edge-V-5...
12-01 | 2.4K | 0 kudos
EXO: Running Distributed AI Clusters on Idle Home Devices, with Support for Multiple Inference Engines and Automatic Device Discovery
Exo is an open source project that lets users run their own AI cluster on everyday devices (e.g. iPhone, iPad, Android, Mac, Linux, etc.). Through dynamic model partitioning and automatic device discovery, Exo can unify multiple devices so that they act as a single powerful GPU, supporting multiple models such as LLaMA, Mistral...
11-28 | 3.4K | 0 kudos