
HumanOmni is the industry's first open-source multimodal large model focused on human video analysis

2025-08-28

HumanOmni's Industry Leadership

Developed by the HumanMLLM team and open-sourced on GitHub, HumanOmni is currently the industry's first multimodal large model with human video analysis as its core task. The model is pre-trained on 2.4 million human-centric video clips and 14 million instruction samples, then fine-tuned on 50,000 finely labeled video clips.

Its core value lies in three areas:

  • Complete analytical coverage: facial expression, body movement, and interaction scene recognition are handled simultaneously
  • Dynamic fusion mechanism: the weights of the three analysis branches are adjusted automatically according to the input
  • Open-source availability: the code, pre-trained models, and part of the datasets are publicly available
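
The dynamic fusion idea can be illustrated with a minimal sketch. This is not HumanOmni's actual implementation: the function names, the softmax gating, and the toy feature vectors below are all assumptions chosen for illustration. The sketch only shows how input-dependent scores could be turned into weights that blend three branch outputs.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_branches(branch_features, gate_scores):
    """Blend equally sized branch feature vectors with softmax weights.

    branch_features: list of feature vectors, one per branch (hypothetical).
    gate_scores: one raw score per branch; in a real model these would be
                 predicted from the input, here they are supplied by hand.
    """
    weights = softmax(gate_scores)
    dim = len(branch_features[0])
    fused = [
        sum(w * feats[i] for w, feats in zip(weights, branch_features))
        for i in range(dim)
    ]
    return weights, fused

# Toy features for the face, body, and interaction branches (assumed values).
face = [1.0, 0.0]
body = [0.0, 1.0]
interaction = [1.0, 1.0]

# A higher score for the face branch gives it the largest weight.
weights, fused = fuse_branches([face, body, interaction], [2.0, 0.0, 0.0])
print(weights, fused)
```

With these hand-picked scores, the face branch dominates the fused vector. In a trained model the scores would come from a learned gating module rather than being fixed.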

Compared with traditional unimodal models, HumanOmni performs strongly: it achieves an unweighted average recall (UAR) of 74.86% on the DFEW emotion recognition dataset, well ahead of GPT-4o's 50.57%. This result supports its standing as the first model dedicated to this domain.
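
For readers unfamiliar with the metric, UAR is the mean of the per-class recall values, so every emotion class counts equally regardless of how many samples it has. The sketch below computes it on made-up labels; the label names and data are illustrative assumptions and have nothing to do with the DFEW results quoted above.

```python
def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    classes = sorted(set(y_true))
    recalls = []
    for c in classes:
        total = sum(1 for t in y_true if t == c)
        correct = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        recalls.append(correct / total)
    return sum(recalls) / len(recalls)

def accuracy(y_true, y_pred):
    """Fraction of samples predicted correctly."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)

# Imbalanced toy data: four "happy" clips and one "sad" clip (assumed).
y_true = ["happy", "happy", "happy", "happy", "sad"]
# A classifier that always predicts "happy".
y_pred = ["happy"] * 5

print(accuracy(y_true, y_pred))                   # 0.8
print(unweighted_average_recall(y_true, y_pred))  # 0.5
```

The always-"happy" classifier looks good on accuracy but scores only 0.5 on UAR, which is why UAR is a common choice for imbalanced emotion datasets.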
