SkyServe is SkyPilot's subsystem designed for production-grade AI services, with key features including:
- elasticity scaling: By
replicaThe parameter sets the number of replicas (e.g., 2 A100 instances) and automatically load balances the traffic. - HTTPS Support: Built-in automatic certificate management (similar to Let's Encrypt) to enable secure access without additional configuration.
- Blue-Green Deployment: Supports seamless switching of model versions to minimize service downtime.
- Monitor Dashboard: Provides graphical presentation of key metrics such as QPS, latency, etc.
Configuration example:service:
replica: 2
ports: 8080
run: |
python serve.py --model llama
priming commandsky serve up serve.yaml -n llama-servicewill be generated ashttps://llama-service.skypilot.coof the access endpoints.
This answer comes from the articleSkyPilot: an open-source framework for efficiently running AI and batch tasks in any cloudThe































