gpt-oss-recipes
is a GitHub repository maintained by Hugging Face that provides scripts and Jupyter notebook tutorials for using OpenAI's GPT OSS models. The repository contains configuration and usage examples for OpenAI's latest open-source models, gpt-oss-120b and gpt-oss-20b. Known for their strong reasoning capabilities and efficient resource footprint, these models are suitable for developers to run in production environments or on personal devices. The code and documentation in the repository help users get started quickly with model inference, fine-tuning, and deployment, covering everything from environment setup to the implementation of complex tasks. All content is released under the Apache 2.0 license, which allows free use and modification.
Feature List
- Provides configuration scripts for gpt-oss-120b and gpt-oss-20b that make it easy to switch between model sizes.
- Contains environment setup code covering Python virtual environments and dependency installation.
- Provides inference examples that show how to use the models to generate text or perform tool calls.
- Supports model fine-tuning and includes an example using a multilingual reasoning dataset.
- Provides integration tutorials for frameworks such as Transformers, vLLM, and Ollama.
- Supports optimized configurations for running models on different hardware (H100 GPUs, consumer-grade devices).
Usage Guide
Installation process
To use the scripts in gpt-oss-recipes, first clone the repository and set up the Python environment. Here are the detailed steps:
- Clone the repository
Open a terminal and run the following commands to clone the repository locally:
git clone https://github.com/huggingface/gpt-oss-recipes.git
cd gpt-oss-recipes
- Create a virtual environment
Create a virtual environment with Python 3.11 to ensure compatibility; the uv tool is recommended:
uv venv gpt-oss --python 3.11
source gpt-oss/bin/activate
- Install dependencies
Install the necessary Python packages, including PyTorch and Transformers, by running:
uv pip install --upgrade pip
uv pip install torch==2.8.0 torchvision torchaudio --index-url https://download.pytorch.org/whl/test/cu128
uv pip install -U transformers accelerate
- Install the Triton kernels (optional)
If your hardware supports MXFP4 quantization (e.g., H100 or RTX 50xx), install the Triton kernels to optimize performance:
uv pip install git+https://github.com/triton-lang/triton.git@main#subdirectory=python/triton_kernels
Configuring the model
The repository supports two models: gpt-oss-120b (117B parameters, for high-performance GPUs) and gpt-oss-20b (21B parameters, for consumer-grade hardware). In the scripts, select a model by modifying the model_path variable. Example:
model_path = "openai/gpt-oss-20b"  # select the 20B model
# model_path = "openai/gpt-oss-120b"  # select the 120B model
The script automatically configures device mapping and optimization settings based on model size.
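As an illustration only (not necessarily the repository's exact logic), such a size-based switch could look like the following sketch, which shards the 120B checkpoint across available GPUs while keeping the 20B model on a single device:

from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "openai/gpt-oss-20b"  # or "openai/gpt-oss-120b"

# Illustrative assumption: shard the 120B model automatically, pin the 20B model to GPU 0.
device_map = "auto" if "120b" in model_path else {"": 0}

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, device_map=device_map, torch_dtype="auto")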
Running inference
The repository contains simple inference examples for generating text or performing specific tasks. The following example uses the gpt-oss-20b model to generate text:
- Open the inference.py file (or a similar script).
- Ensure that the model and tokenizer are loaded:
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "openai/gpt-oss-20b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype="auto")
- Enter a prompt and generate a result:
messages = [{"role": "user", "content": "How do I write a sorting algorithm in Python?"}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt", return_dict=True).to(model.device)
generated = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(generated[0]))
- Run the script and the model returns sample Python code for the sorting algorithm.
Adjusting the reasoning level
The model's reasoning depth can be adjusted through the system prompt. For example, to set a high reasoning level:
messages = [
    {"role": "system", "content": "Reasoning: high"},
    {"role": "user", "content": "Explain the basic principles of quantum computing"}
]
A higher reasoning level produces a more detailed reasoning process, which suits complex problems.
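For completeness, here is a minimal sketch of running generation with this system prompt, reusing the tokenizer and model loaded in the inference example above (the token budget is an arbitrary choice):

# Apply the chat template with the "Reasoning: high" system message and generate.
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt", return_dict=True).to(model.device)
generated = model.generate(**inputs, max_new_tokens=512)
# Decode only the newly generated tokens.
print(tokenizer.decode(generated[0][inputs["input_ids"].shape[-1]:]))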
Fine-tuning the model
The repository provides fine-tuning examples based on Hugging Face's TRL library and LoRA. Here are the steps for fine-tuning gpt-oss-20b:
- Download the multilingual reasoning dataset:
from datasets import load_dataset

dataset = load_dataset("HuggingFaceH4/Multilingual-Thinking", split="train")
- Configure LoRA parameters and load the model:
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model_name = "openai/gpt-oss-20b"
lora_config = LoraConfig(r=8, lora_alpha=32, target_modules=["q_proj", "v_proj"])
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")
model = get_peft_model(model, lora_config)  # wrap the base model with LoRA adapters
- Use the TRL library for fine-tuning (refer to finetune.ipynb in the repository); a rough sketch follows after this list.
- Save the fine-tuned model for specific tasks such as multilingual reasoning.
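As a hedged sketch of what a TRL SFT run might look like (the hyperparameters and output directory below are illustrative assumptions, not taken from the repository's notebook):

from trl import SFTConfig, SFTTrainer

# Illustrative hyperparameters; finetune.ipynb in the repository may use different values.
training_args = SFTConfig(
    output_dir="gpt-oss-20b-multilingual-reasoner",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=4,
    num_train_epochs=1,
    learning_rate=2e-4,
    logging_steps=10,
)

trainer = SFTTrainer(
    model=model,            # the LoRA-wrapped model from the previous step
    args=training_args,
    train_dataset=dataset,  # the Multilingual-Thinking split loaded above
)
trainer.train()
trainer.save_model(training_args.output_dir)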
Using vLLM or Ollama
For rapid deployment, the repository supports vLLM and Ollama:
- vLLM: Start an OpenAI-compatible server:
uv pip install --pre vllm==0.10.1+gptoss --extra-index-url https://wheels.vllm.ai/gpt-oss/
vllm serve openai/gpt-oss-20b
- Ollama: Runs on consumer-grade hardware:
ollama pull gpt-oss:20b
ollama run gpt-oss:20b
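Once the vLLM server is up, it exposes an OpenAI-compatible API (by default at http://localhost:8000/v1), so it can be queried with the standard openai Python client. A minimal sketch, assuming the default host and port:

from openai import OpenAI

# Point the OpenAI client at the local vLLM server; the API key is unused but required by the SDK.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="openai/gpt-oss-20b",
    messages=[{"role": "user", "content": "Give a one-sentence summary of MXFP4 quantization."}],
    max_tokens=200,
)
print(response.choices[0].message.content)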
Feature Highlights
- Tool calls: The model supports function calling and web search. For example, declaring a weather function (handling the returned tool call is sketched after this list):
# Assumes `client` is an OpenAI-compatible client (e.g., one configured for a hosted inference provider).
tools = [{
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Get the weather for a given location",
        "parameters": {"type": "object", "properties": {"location": {"type": "string"}}}
    }
}]
messages = [{"role": "user", "content": "What is the weather like in Paris?"}]
response = client.chat.completions.create(model="openai/gpt-oss-120b:cerebras", messages=messages, tools=tools)
- Multilingual reasoning: With fine-tuning, the model can generate reasoning traces in English, Spanish, French, and other languages. The user can specify the reasoning language, for example:
messages = [
    {"role": "system", "content": "Reasoning language: Spanish"},
    {"role": "user", "content": "¿Cuál es la capital de Australia?"}
]
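Following up on the tool-call example above, the response may contain tool_calls that your code executes before sending the result back to the model. A hedged sketch (the local get_current_weather implementation and its return value are hypothetical; `client`, `tools`, `messages`, and `response` refer to the tool-call example):

import json

# Hypothetical local implementation of the weather function declared in `tools`.
def get_current_weather(location: str) -> str:
    return json.dumps({"location": location, "temperature_c": 21, "condition": "sunny"})

tool_call = response.choices[0].message.tool_calls[0]
args = json.loads(tool_call.function.arguments)
result = get_current_weather(**args)

# Append the assistant's tool call and the tool result, then ask the model for the final answer.
messages.append(response.choices[0].message)
messages.append({"role": "tool", "tool_call_id": tool_call.id, "content": result})
final = client.chat.completions.create(model="openai/gpt-oss-120b:cerebras", messages=messages, tools=tools)
print(final.choices[0].message.content)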
Application Scenarios
- AI development experiments
Developers can use the scripts in the repository to test the performance of GPT OSS models on different tasks, such as text generation, code generation, or Q&A systems. Ideal for rapid prototyping.
- Local model deployment
Businesses or individuals can deploy gpt-oss-20b on local devices for privacy-sensitive scenarios such as internal document processing or customer support.
- Education and research
Researchers can use the fine-tuning tutorials to optimize the models on specific datasets (e.g., multilingual reasoning) and explore applications of large models in academic fields.
- Production environment integration
The repository supports deploying API servers via vLLM, which makes it easy to integrate the models into production environments such as chatbots or automated workflows.
FAQ
- What models does the repository support?
The repository supports gpt-oss-120b (117B parameters) and gpt-oss-20b (21B parameters), targeting high-performance GPUs and consumer hardware, respectively.
- How do I choose the right model?
gpt-oss-120b is recommended if you have an H100 GPU; if you are using a regular device (16 GB of memory), choose gpt-oss-20b.
- What hardware is required?
gpt-oss-20b requires 16 GB of RAM; gpt-oss-120b requires an 80 GB GPU (e.g., H100). MXFP4 quantization reduces resource requirements.
- How do I deal with errors during model inference?
Make sure the input and output use the harmony format. Check hardware compatibility and update dependencies such as PyTorch and the Triton kernels.