The iPhone 16 Pro's Neural Engine now delivers 35 TOPS of compute, which, combined with OLMoE's mixture-of-experts (MoE) architecture, makes running 7B-parameter models on mobile devices a practical reality. Performance tests show the M2 iPad Pro handling complex logical-reasoning tasks at a level close to the cloud-hosted Llama2-13B models of 2023. This hardware evolution directly lowers the barrier to entry for on-device AI, expanding OLMoE's target audience from the geek community to everyday developers.
Market data confirms the trend: in Q3 2024, downloads of sub-7B-parameter models suitable for on-device deployment grew 470% year-on-year. The OLMoE project has caught this technological inflection point: its recommended Core ML conversion toolchain converts the PyTorch model into Apple's chip-specific format, cutting inference energy consumption by 60%. Early integration cases show that after a securities app embedded OLMoE to run financial-report analysis locally, average user session time rose 22%, validating the business scenario.
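The article does not spell out the conversion commands, but the standard route on Apple platforms is Apple's coremltools package. The sketch below illustrates the general PyTorch-to-Core-ML pattern with a small stand-in module; the module, tensor shapes, deployment target, and file names are illustrative assumptions, not the OLMoE toolchain itself.

```python
# Minimal sketch of a PyTorch -> Core ML conversion using Apple's coremltools.
# TinyStandIn, the input shape, and the output path are illustrative assumptions;
# converting a real MoE checkpoint such as OLMoE involves additional export and
# quantization steps not shown here.
import torch
import coremltools as ct


class TinyStandIn(torch.nn.Module):
    """Placeholder network standing in for a much larger language model."""

    def __init__(self):
        super().__init__()
        self.proj = torch.nn.Linear(1024, 1024)

    def forward(self, x):
        return torch.relu(self.proj(x))


model = TinyStandIn().eval()
example_input = torch.rand(1, 1024)

# TorchScript trace is the form coremltools consumes.
traced = torch.jit.trace(model, example_input)

# Convert to an .mlpackage that Core ML can schedule onto the Neural Engine.
mlmodel = ct.convert(
    traced,
    inputs=[ct.TensorType(name="hidden_state", shape=example_input.shape)],
    compute_units=ct.ComputeUnit.ALL,           # CPU + GPU + Neural Engine
    minimum_deployment_target=ct.target.iOS17,  # assumption: recent OS baseline
)
mlmodel.save("TinyStandIn.mlpackage")
```

Setting `compute_units` to `ALL` lets Core ML dispatch eligible layers to the Neural Engine, which is the low-power execution path behind the energy savings the article describes.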
This answer is drawn from the article "Ai2 OLMoE: An Open Source iOS AI App Based on OLMoE Models Running Offline".































