Deploying Seed-OSS involves the following hardware requirements and optimization recommendations:
Hardware Requirements
- Basic Configuration: At least 1 NVIDIA H100-80G GPU is recommended.
- High-Performance Configuration: 4 GPUs are recommended to support higher-load tasks.
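As a rough sanity check on these GPU counts, weight memory can be estimated as parameter count times bytes per parameter. The sketch below assumes a 36-billion-parameter model; that figure is an assumption for illustration and is not stated in this answer. It counts only the weights and ignores KV cache, activations, and framework overhead, so real usage will be higher.

```python
# Bytes needed to store one parameter in each data type.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2}

# Assumption for illustration only: parameter count is not given in the text.
ASSUMED_PARAMS = 36e9


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Approximate weight memory in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    for dtype in BYTES_PER_PARAM:
        print(f"{dtype}: ~{weight_memory_gb(ASSUMED_PARAMS, dtype):.0f} GB of weights")
```

Under that assumption the weights take about 72 GB in bfloat16 versus 144 GB in float32, which suggests why a single 80 GB GPU is only a baseline and leaves little headroom for long-context KV cache.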
Optimization Recommendations
- Multi-GPU Inference: Allocate GPU resources with the tensor-parallel-size parameter; for example, tensor-parallel-size=8 splits the model across 8 GPUs.
- Data Type: Use bfloat16 to reduce the GPU memory footprint in large-scale deployments.
- Generation Configuration: temperature=1.1 and top_p=0.95 are recommended for diverse output. For specific tasks (e.g. Taubench), these can be adjusted to temperature=1 and top_p=0.7.
- Inference Framework: The vLLM inference framework is recommended to improve inference efficiency.
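The recommendations above can be sketched as a small helper that assembles a vLLM launch command and picks sampling settings per task. This is a minimal sketch, not an official deployment script: the model ID, the function names, and the preset names are assumptions introduced for illustration, and only the `--tensor-parallel-size` and `--dtype` options of `vllm serve` are used.

```python
import shlex

# Assumption: Hugging Face model ID used for illustration; not given in the text.
MODEL_ID = "ByteDance-Seed/Seed-OSS-36B-Instruct"

# Sampling values taken from the recommendations above; preset names are hypothetical.
SAMPLING_PRESETS = {
    "general": {"temperature": 1.1, "top_p": 0.95},
    "taubench": {"temperature": 1.0, "top_p": 0.7},
}


def build_vllm_command(model: str, num_gpus: int, dtype: str = "bfloat16") -> list[str]:
    """Return the argument list for `vllm serve` with tensor parallelism."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return [
        "vllm", "serve", model,
        "--tensor-parallel-size", str(num_gpus),  # one shard per GPU
        "--dtype", dtype,                          # bfloat16 halves memory vs float32
    ]


def sampling_params(task: str = "general") -> dict:
    """Return a copy of the sampling preset for a task, defaulting to 'general'."""
    return dict(SAMPLING_PRESETS.get(task, SAMPLING_PRESETS["general"]))


if __name__ == "__main__":
    # Print a shell-ready command for the 4-GPU high-performance configuration.
    print(shlex.join(build_vllm_command(MODEL_ID, num_gpus=4)))
    print(sampling_params("taubench"))
```

The returned dictionary can be merged into the body of a request to the server's OpenAI-compatible completion endpoint, which accepts `temperature` and `top_p` fields.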
Applied together, these optimizations can improve the performance and efficiency of Seed-OSS in real-world applications.
This answer comes from the article "Seed-OSS: Open Source Large Language Model for Long Context Reasoning and Versatile Applications".































