Qwen3-FineTuning-Playground is an open source project that provides a complete codebase dedicated to fine-tuning the Qwen3 family of large language models. Its aim is to provide clear, professional, and easily extensible fine-tuning code examples so that developers and researchers can easily put them into practice...
Verifiers is a library of modular components for creating Reinforcement Learning (RL) environments and training Large Language Model (LLM) agents. The goal of the project is to provide a set of reliable tools that allow developers to easily build, train, and evaluate LLM agents. Verifiers includes a transformer-based...
Radal is a low-code platform focused on helping organizations quickly build and optimize AI models. It enables users to train large language models (LLMs) without deep programming knowledge, through an intuitive interface and AI-assisted features. Developed by a team of industry and startup experts, the platform emphasizes efficient, customized AI solutions...
WhiteLightning is an open source command-line tool designed to help developers quickly generate lightweight text classification models with a single command. The tool uses a large language model to generate synthetic data, then trains ONNX models of less than 1 MB through teacher-student distillation; the resulting models run fully offline and are suitable for edge devices...
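The core idea of teacher-student distillation can be sketched in a few lines: a large "teacher" produces soft labels, and a tiny "student" is trained against those probabilities instead of hard labels. The sketch below is a minimal numpy illustration with invented data and a stand-in linear teacher; it is not WhiteLightning's actual pipeline, which uses an LLM teacher and exports the student to ONNX.

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented "synthetic data": 2-D features standing in for text embeddings.
X = rng.normal(size=(200, 2))

def teacher(x):
    # Stand-in teacher producing soft labels (class probabilities).
    logits = np.stack([x[:, 0] + x[:, 1], -(x[:, 0] + x[:, 1])], axis=1)
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

soft_labels = teacher(X)

# Tiny student: a single linear layer trained on the teacher's soft labels
# (cross-entropy against soft targets) via full-batch gradient descent.
W = np.zeros((2, 2))
b = np.zeros(2)
for _ in range(500):
    logits = X @ W + b
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    p = e / e.sum(axis=1, keepdims=True)
    grad = (p - soft_labels) / len(X)   # d(mean cross-entropy)/d(logits)
    W -= X.T @ grad
    b -= grad.sum(axis=0)

# The student should now agree with the teacher on most inputs.
agree = (np.argmax(X @ W + b, axis=1) == np.argmax(soft_labels, axis=1)).mean()
print(f"student/teacher agreement: {agree:.2f}")
```

A real pipeline would then export the trained student (via a framework's ONNX exporter) so it can run offline on edge devices.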
FineTuningLLMs is a GitHub repository created by author dvgodoy, based on his book A Hands-On Guide to Fine-Tuning LLMs with PyTorch and Hugging Face. This repository...
ReCall is an open source framework designed to train Large Language Models (LLMs) for tool invocation and reasoning through reinforcement learning, without relying on supervised data. It allows models to autonomously use and combine external tools, such as search engines and calculators, to solve complex tasks. ReCall supports user-defined tools suitable for...
GraphGen is an open-source framework developed by OpenScienceLab, an AI lab in Shanghai, and hosted on GitHub. It optimizes supervised fine-tuning of Large Language Models (LLMs) by using knowledge graphs to guide synthetic data generation: it constructs fine-grained knowledge graphs from source text, utilizing the expected calibration error...
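Expected calibration error (ECE), the metric GraphGen is described as using, measures the gap between a model's stated confidence and its actual accuracy. The sketch below uses the common equal-width binning formulation; this is an assumption about the variant, not GraphGen's exact code.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE = sum over bins b of (|b|/N) * |accuracy(b) - mean_confidence(b)|."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            # Weight each bin's |accuracy - confidence| gap by its share of samples.
            ece += mask.mean() * abs(correct[mask].mean() - confidences[mask].mean())
    return ece

# An overconfident predictor: 90% confidence but only 75% accuracy.
print(expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 1, 1, 0]))
```

A perfectly calibrated model has ECE 0; high-ECE regions of a knowledge graph flag facts the model claims to know but gets wrong, which is where targeted synthetic data is most valuable.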
MiniMind-V is an open source project, hosted on GitHub, designed to help users train a lightweight vision-language model (VLM) with only 26 million parameters in under an hour. It builds on the MiniMind language model, adding a vision encoder and a feature projection module to support joint processing of images and text...
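The encoder-plus-projection pattern described above can be sketched with shapes alone: a vision encoder emits patch features, and a learned projection maps them into the language model's embedding width so image tokens can be interleaved with text tokens. All dimensions and names below are invented for illustration, not MiniMind-V's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

vision_dim, llm_dim, n_patches = 512, 768, 49  # hypothetical sizes

# Patch features as they would come out of a vision encoder.
patch_features = rng.normal(size=(n_patches, vision_dim))

# Learnable projection into the LLM's embedding space.
W_proj = rng.normal(scale=0.02, size=(vision_dim, llm_dim))
image_tokens = patch_features @ W_proj          # (49, 768), same width as text embeddings

# An embedded text prompt of 12 tokens.
text_tokens = rng.normal(size=(12, llm_dim))

# The LLM consumes the concatenated sequence of image and text tokens.
sequence = np.concatenate([image_tokens, text_tokens], axis=0)
print(sequence.shape)  # (61, 768)
```

Because the projection is the only new trainable bridge, such models can reuse a frozen or lightly tuned language model, which is what keeps training fast and cheap.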
DeepCoder-14B-Preview is an open source code generation model developed by the Agentica team and released on the Hugging Face platform. It is based on DeepSeek-R1-Distilled-Qwen-14B and optimized with distributed reinforcement learning (RL) techniques...
WeClone is an open-source project that lets users create personalized digital doppelgängers by using WeChat chat logs and voice messages, combined with large language models and speech synthesis technology. The project can analyze a user's chatting habits to train the model, and can also generate realistic voice clones with a small number of voice samples. Ultimately, the digital ...
Search-R1 is an open source project developed by PeterGriffinJin on GitHub, built on the veRL framework. It uses reinforcement learning (RL) to train a large language model (LLM) to autonomously learn to reason and invoke a search engine to solve problems. The project supports Qwen2....
Optexity is an open source project on GitHub, developed by the Optexity team. Its core idea is to use human demonstration data to train AI to complete computer tasks, especially web page operations. The project consists of three code repositories: ComputerGYM, AgentAI, and Playwright...
Bonsai is an open source language model developed by deepgrove-ai with 500 million parameters, using ternary weights. It is based on the Llama architecture and the Mistral tokenizer design, with linear layers adapted to support ternary weights. The model mainly uses ...
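A ternary-weight linear layer of the kind described above quantizes full-precision weights to {-1, 0, +1} times a per-tensor scale, so the matrix multiply reduces to additions, subtractions, and one rescale. The magnitude-based threshold below (0.75 × mean |W|) is a common scheme from the ternary-networks literature and is an assumption, not Bonsai's exact quantizer.

```python
import numpy as np

def ternarize(W):
    # Zero out small weights, keep only the sign of large ones.
    threshold = 0.75 * np.abs(W).mean()
    T = np.sign(W) * (np.abs(W) > threshold)        # values in {-1, 0, +1}
    mask = T != 0
    # Per-tensor scale: mean magnitude of the surviving weights.
    scale = np.abs(W[mask]).mean() if mask.any() else 0.0
    return T, scale

def ternary_linear(x, W, b):
    T, scale = ternarize(W)
    # The matmul against T needs no multiplications; one scale restores magnitude.
    return x @ (T * scale) + b

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 4))
x = rng.normal(size=(2, 8))
y = ternary_linear(x, W, np.zeros(4))
print(y.shape)  # (2, 4)
```

Storing 2-bit ternary values instead of 16- or 32-bit floats is what lets a 500M-parameter model shrink far below its full-precision footprint.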
Second Me is an open source project developed by the Mindverse team that lets you create, on your own computer, an AI that acts like a "digital doppelgänger": it learns your speech patterns and habits from your words and memories, becoming an intelligent assistant that understands you. Its key feature is that all the data stays...
Easy Dataset is an open source tool, hosted on GitHub, designed specifically for building fine-tuning datasets for large language models (LLMs). It provides an easy-to-use interface that allows users to upload files, automatically segment content, and generate questions and answers, ultimately outputting structured datasets suitable for fine-tuning. The developer, Cona...
MM-EUREKA is an open source project developed by Shanghai Artificial Intelligence Laboratory, Shanghai Jiao Tong University, and other partners. It extends rule-based reinforcement learning from textual reasoning to multimodal scenarios, helping models process image and text information together. The core goal of this tool is to enhance the model in...
AI Toolkit by Ostris is an open source AI toolkit focused on supporting Stable Diffusion and FLUX.1 models for training and image generation tasks. Created and maintained by developer Ostris and hosted on GitHub, the toolkit aims to provide researchers and developers with flexible model...
X-R1 is a reinforcement learning framework open-sourced on GitHub by the dhcode-cpp team, aiming to provide developers with a low-cost, efficient tool for training models based on end-to-end reinforcement learning. Inspired by DeepSeek-R1 and open-r1, the project focuses on building an easy...
OpenManus-RL is an open source project jointly developed by UIUC-Ulab and the OpenManus team of the MetaGPT community, hosted on GitHub. The project enhances the reasoning and decision-making capabilities of large language model (LLM) agents through reinforcement learning (RL) techniques, based on Deepseek-R1...