
A recent study by a team of Chinese researchers has shown that multimodal large language models (LLMs) can form human-like representations of object concepts. The finding may open new avenues for cognitive science research on artificial intelligence (AI) and provide a foundation for developing AI systems whose conceptual processing mirrors that of humans.
With the rise of advanced LLMs, scientists have begun to explore whether these models, trained on linguistic and multimodal datasets, can develop conceptual representations akin to human understanding. He Huiguang, a researcher at the Institute of Automation of the Chinese Academy of Sciences (CAS), noted that understanding an object involves recognizing not only its physical attributes but also its emotional significance and cultural meaning.
The collaborative study, carried out by the Institute of Automation and the CAS Center for Excellence in Brain Science, combined behavioral analysis with neuroimaging to investigate how closely LLM representations converge with human cognition. By constructing a low-dimensional conceptual model from the models' behavior, the researchers found that the 66 dimensions derived from LLM behavioral data closely align with patterns of neural activity in category-selective regions of the human brain.
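The article does not describe the analysis itself, so the sketch below is only a minimal illustration of one standard way such an alignment between a model's conceptual dimensions and brain activity could be quantified: representational similarity analysis (RSA). The object counts, ROI size, and all data here are randomly simulated placeholders, not values from the study.

```python
# Illustrative sketch only: compares a hypothetical LLM-derived conceptual
# embedding (66 dimensions per object) with simulated fMRI responses from a
# brain region of interest, using representational similarity analysis (RSA).
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(0)

n_objects = 200   # hypothetical number of object concepts
n_dims = 66       # dimensionality reported in the article
n_voxels = 500    # hypothetical size of the brain region of interest

# Simulated stand-ins for the real data.
model_embedding = rng.normal(size=(n_objects, n_dims))   # LLM-derived dimensions
roi_responses = rng.normal(size=(n_objects, n_voxels))   # fMRI responses per object

# Build a representational dissimilarity matrix for each space:
# pairwise distances between objects (1 - Pearson correlation).
model_rdm = pdist(model_embedding, metric="correlation")
brain_rdm = pdist(roi_responses, metric="correlation")

# RSA score: rank correlation between the two dissimilarity structures.
# A high value would mean objects that are similar in the model's conceptual
# space are also similar in the brain region's response space.
rho, p_value = spearmanr(model_rdm, brain_rdm)
print(f"RSA (Spearman rho) = {rho:.3f}, p = {p_value:.3g}")
```

With the random placeholder data the correlation will hover around zero; the study's reported alignment would correspond to a substantially positive score on real behavioral and neuroimaging data.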
Moreover, the research indicated that when judging objects, humans tend to blend visual attributes with semantic context, whereas LLMs rely more heavily on semantic labels and abstract concepts.
Source: news.cgtn.com