Transformers is an open source machine learning framework developed by Hugging Face focused on providing advanced model definitions that support inference and training for text, image, audio, and multimodal tasks. It simplifies the process of using models and is compatible with a variety of mainstream deep learning frameworks such as PyTorch, TensorFlow and Flax.Transformers provides more than 1 million pre-trained models covering the fields of natural language processing, computer vision, and speech recognition, which are widely used in academic research and commercial development. Users can quickly load models and perform tasks such as text generation, image segmentation, or speech-to-text with simple code. The framework is updated frequently, and the latest version supports new models such as Kyutai-STT and ColQwen2 to stay on the cutting edge of technology.
Function List
- Pipeline API that supports multiple tasks, simplifying operations such as text generation, speech recognition, image classification, and more.
- Provides over 1 million pre-trained models covering natural language processing, computer vision, and multimodal tasks.
- Compatible with PyTorch, TensorFlow, and Flax, it supports multiple training and inference frameworks.
- Supports downloading and caching models from Hugging Face Hub for offline use.
- Provide command line tools
transformers serve
In addition, it supports OpenAI-compliant HTTP servers. - Support for model fine-tuning and training, adapted to multiple training frameworks such as DeepSpeed and PyTorch-Lightning.
- Provides support for the latest models, such as Kyutai-STT (speech-to-text) and ColQwen2 (document retrieval).
Using Help
Installation process
Transformers is easy to install, supports Python 3.9+ environments, and is recommended to use a virtual environment to avoid dependency conflicts. Here are the detailed installation steps:
- Creating a Virtual Environment
Using Python'svenv
module to create a virtual environment:python -m venv transformers_env source transformers_env/bin/activate # Linux/Mac transformers_env\Scripts\activate # Windows
- Installing Transformers
utilizationpip
Install the latest stable version:pip install transformers
If GPU support is required, ensure that the appropriate CUDA driver is installed and run the following command to check GPU availability:
python -c "import torch; print(torch.cuda.is_available())"
To experience the latest features, you can install the development version from GitHub:
pip install git+https://github.com/huggingface/transformers
- Verify Installation
After the installation is complete, run the following command to test it:from transformers import pipeline print(pipeline('sentiment-analysis')('hugging face is awesome'))
The output should be something like
{'label': 'POSITIVE', 'score': 0.9998}
The results of the
Using the Pipeline API
The core feature of Transformers is the Pipeline API, which allows users to perform complex tasks without delving into the details of the model.The Pipeline API supports a wide variety of tasks, such as text generation, speech recognition, and image segmentation. Here's how it works:
- Text Generation
Use the Pipeline API for text generation:from transformers import pipeline generator = pipeline(task="text-generation", model="Qwen/Qwen2.5-1.5B") result = generator("The secret to baking a really good cake is") print(result[0]["generated_text"])
Models are automatically downloaded and cached to the default directory
~/.cache/huggingface/hub
. Users can set the environment variableTRANSFORMERS_CACHE
Change the cache path. - speech recognition
For speech-to-text tasks, the Pipeline API is equally simple:from transformers import pipeline asr = pipeline(task="automatic-speech-recognition", model="openai/whisper-large-v3") result = asr("https://huggingface.co/datasets/Narsil/asr_dummy/resolve/main/mlk.flac") print(result["text"])
The output is the text content of an audio file, e.g. "I have a dream ......".
- Command Line Interaction
Transformers provides command-line toolstransformers serve
If you want to start an HTTP server that is compatible with the OpenAI API, you can do so by clicking on the following link:transformers serve
Users can interact with the model via HTTP requests, making it suitable for integration into other applications.
Offline use
Transformers supports offline mode, which is suitable for network-less environments. Users can set it up by following steps:
- Download the model locally:
from huggingface_hub import snapshot_download snapshot_download(repo_id="meta-llama/Llama-2-7b-hf", repo_type="model")
- Set environment variables to enable offline mode:
export HF_HUB_OFFLINE=1
- Load the local model:
from transformers import LlamaForCausalLM model = LlamaForCausalLM.from_pretrained("./path/to/local/directory", local_files_only=True)
Offline mode ensures that model loading is not network dependent and is suitable for production environments.
Feature: New Model Support
Transformers is constantly updated to support the latest models. For example:
- Kyutai-STT: Speech-to-text model based on the Mimi codec with support for streaming audio processing. Install the preview version:
pip install git+https://github.com/huggingface/transformers@v4.52.4-Kyutai-STT-preview
- ColQwen2: A model for document retrieval, dealing with visual features of page images. The installation method is similar:
pip install git+https://github.com/huggingface/transformers@v4.52.4-ColQwen2-preview
These models will be officially released in subsequent versions (e.g. v4.53.0) and users can experience them in advance.
Fine-tuning and training
Transformers supports model fine-tuning and is compatible with a wide range of training frameworks. Users can use run_clm.py
Scripts for language model training:
HF_HUB_OFFLINE=1 python examples/pytorch/language-modeling/run_clm.py --model_name_or_path meta-llama/Llama-2-7b-hf --dataset_name wikitext
This feature is suitable for developers who need to customize their models.
application scenario
- academic research
Researchers use Transformers to load pre-trained models and quickly experiment with natural language processing or computer vision. For example, when testing new algorithms, Hugging Face Hub models can be called directly, saving training time. - business development
Enterprise developers use Transformers to build chatbots, voice assistants, or image analysis tools. For example, use the Pipeline API to quickly deploy text generation or speech recognition functionality for integration into products. - Education and learning
Students and beginners learn deep learning with Transformers, practicing tasks such as text categorization and translation with the help of pre-trained models and simple code to lower the learning barrier.
QA
- What programming frameworks does Transformers support?
Transformers is compatible with PyTorch, TensorFlow, and Flax, and supports a variety of training and inference frameworks such as DeepSpeed, PyTorch-Lightning, and vLLM. - How do you handle caching of model downloads?
Models are cached by default to the~/.cache/huggingface/hub
This can be done through the environment variableTRANSFORMERS_CACHE
Change the path. SettingHF_HUB_OFFLINE=1
Offline mode can be enabled to load only local models. - Do I need a GPU to use Transformers?
Not required, Transformers supports CPU operation, but GPU accelerates inference and training. A CPU-only version is available for installation to avoid installing CUDA dependencies.