Hibiki supports on-device (edge) deployment through model compression and a dedicated runtime. A streamlined 1B-parameter version of the model works with the MLX framework to run smoothly on mobile devices such as the iPhone 16 Pro. Deployment options include:
- MLX-Swift Mobile Optimization Framework
- Metal/CUDA hardware acceleration support
- 8-bit quantization, which reduces computational requirements
Empirical tests show that the 1B model draws only 1.2 W on the A17 Pro chip while sustaining continuous real-time translation. This edge-computing capability lets the system serve scenarios that traditional cloud-based translation cannot cover, such as field operations without network access and confidential meetings, making professional-grade voice translation considerably more usable.
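To make the 8-bit quantization option more concrete, here is a minimal sketch in plain Python. It is not taken from Hibiki or MLX, whose actual quantization scheme may differ. It assumes simple symmetric per-tensor quantization, and the function names (`quantize_int8`, `dequantize_int8`, `weight_memory_gb`) are hypothetical, introduced only for illustration.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127].

    Assumption: one shared scale for the whole tensor. Production runtimes
    often use per-group scales instead.
    """
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Recover approximate float values from the 8-bit integers."""
    return [v * scale for v in quantized]


def weight_memory_gb(num_params, bits_per_weight):
    """Storage needed for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


# Toy weights standing in for one tensor of the model.
weights = [0.5, -1.27, 0.02, 1.0]
quantized, scale = quantize_int8(weights)
restored = dequantize_int8(quantized, scale)

# Weight storage for a 1B-parameter model at 16-bit versus 8-bit precision.
fp16_gb = weight_memory_gb(1_000_000_000, 16)
int8_gb = weight_memory_gb(1_000_000_000, 8)
```

Under these assumptions, each restored weight differs from the original by at most half a quantization step (`scale / 2`), while weight storage for a 1B-parameter model drops from about 2 GB at 16-bit precision to about 1 GB at 8-bit. That reduction is the kind of saving that helps a model fit within a phone's memory budget.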
This answer comes from the article "Hibiki: a real-time speech translation model, streaming translation that preserves the characteristics of the original voice".