The following steps are required to install and run the GPT-OSS model:
- Download the model weights: fetch them from Hugging Face with the huggingface-cli tool, for example:
huggingface-cli download openai/gpt-oss-120b --include 'original/*' --local-dir gpt-oss-120b/
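If huggingface-cli is not already available, it ships with the huggingface_hub package (this install command assumes a standard pip environment):

pip install -U "huggingface_hub[cli]"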
- Configure your Python environment: create a virtual environment (Python 3.12 is recommended) and install the transformers, accelerate, and torch packages.
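On Linux or macOS, that setup might look like the following (the environment name here is illustrative):

python3.12 -m venv gpt-oss-env
source gpt-oss-env/bin/activate
pip install -U transformers accelerate torch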
- Run the model: it can be run in a variety of ways, including the Transformers, vLLM, and Ollama implementations. For example, to load it with the Transformers pipeline API:
from transformers import pipeline

pipe = pipeline('text-generation', model='openai/gpt-oss-20b', torch_dtype='auto', device_map='auto')
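A minimal generation call then looks like this; with chat-style input the pipeline applies the model's chat template, and the prompt text is illustrative:

messages = [{'role': 'user', 'content': 'Explain mixture-of-experts in one sentence.'}]
result = pipe(messages, max_new_tokens=128)
print(result[0]['generated_text'][-1])  # the last message is the assistant's reply

Alternatively, vLLM and Ollama can serve the model directly; as a sketch of each tool's standard CLI:

vllm serve openai/gpt-oss-20b
ollama run gpt-oss:20b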
Note that prompts must be rendered in the Harmony response format or the model will not work properly. For Apple Silicon devices, the weights will also need to be converted to Metal format.
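If you build prompts yourself rather than relying on Transformers' chat template, the openai-harmony Python package can render the Harmony format; a minimal sketch, assuming the API names shown in that package's documentation:

from openai_harmony import (
    Conversation,
    HarmonyEncodingName,
    Message,
    Role,
    load_harmony_encoding,
)

# Render a one-turn conversation into Harmony-formatted prompt tokens.
encoding = load_harmony_encoding(HarmonyEncodingName.HARMONY_GPT_OSS)
conversation = Conversation.from_messages([
    Message.from_role_and_content(Role.USER, 'Hello, gpt-oss!'),
])
prompt_tokens = encoding.render_conversation_for_completion(conversation, Role.ASSISTANT)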
This answer comes from the article "GPT-OSS: OpenAI's Open Source Big Model for Efficient Reasoning".