Document detail
ID

oai:arXiv.org:2408.15077

Topic
Computer Science - Computer Vision... Computer Science - Artificial Inte... Computer Science - Machine Learnin...
Author
Ravva, Pavan Uttej; Kiafar, Behdokht; Kullu, Pinar; Li, Jicheng; Bhat, Anjana; Barmaki, Roghayeh Leila
Category

Computer Science

Year

2024

Listing date

9/4/2024

Keywords
multimodal framework autism science computer
Abstract

Autism spectrum disorder (ASD) is characterized by significant challenges in social interaction and comprehending communication signals.

Recently, therapeutic interventions for ASD have increasingly used deep learning-powered computer vision techniques to monitor individual progress over time.

These models are trained on private datasets from the autism community, and privacy-related restrictions on data sharing make it difficult to compare results across models.

This work introduces MMASD+, an enhanced version of the novel open-source dataset called Multimodal ASD (MMASD).

MMASD+ consists of diverse data modalities, including 3D-Skeleton, 3D Body Mesh, and Optical Flow data.

It integrates the YOLOv8 and Deep SORT algorithms to distinguish between the therapist and the children, addressing a significant barrier in the original dataset.
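
As an illustration of how tracker output could be turned into therapist/child labels, here is a minimal sketch. It is not the authors' method: the abstract does not say how roles are assigned. The sketch assumes a detector and tracker have already produced per-frame bounding boxes with stable track IDs, and it applies a hypothetical heuristic (the track with the largest median box height is the adult therapist). The function name `assign_roles` and the sample detections are invented for illustration.

```python
from collections import defaultdict
from statistics import median


def assign_roles(detections):
    """Label each track ID as 'therapist' or 'child'.

    detections: iterable of (frame_idx, track_id, (x1, y1, x2, y2)),
    i.e. the kind of output a detector plus tracker would provide.
    ASSUMPTION: the tallest track (by median box height) is the therapist.
    """
    heights = defaultdict(list)
    for _frame_idx, track_id, (_x1, y1, _x2, y2) in detections:
        heights[track_id].append(y2 - y1)
    if not heights:
        return {}
    # The median is less sensitive than the mean to a few bad boxes.
    median_height = {tid: median(h) for tid, h in heights.items()}
    therapist_id = max(median_height, key=median_height.get)
    return {
        tid: ("therapist" if tid == therapist_id else "child")
        for tid in median_height
    }


# Hypothetical detections for two tracked people over two frames.
detections = [
    (0, 1, (100, 50, 180, 400)),
    (0, 2, (300, 200, 360, 400)),
    (1, 1, (102, 52, 182, 401)),
    (1, 2, (305, 198, 362, 402)),
]
roles = assign_roles(detections)
```

A height heuristic like this would fail when the therapist is seated or partly out of frame, so a real pipeline would likely need additional cues.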

Additionally, a Multimodal Transformer framework is proposed to predict 11 action types and the presence of ASD.
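
To make the idea of fusing modalities for two prediction tasks concrete, the toy sketch below runs a single self-attention step over one embedding per modality, pools the result, and feeds it to an 11-way action head and a binary ASD head. It is only an assumption-laden illustration, not the proposed framework: the embedding size, the random untrained weights, the identity query/key/value projections, and all function names are hypothetical.

```python
import math
import random


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a row-major matrix by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def self_attention(tokens):
    """Single-head scaled dot-product attention.

    Identity projections are used for brevity (an assumption); a real
    Transformer layer learns separate query, key, and value projections.
    """
    dim = len(tokens[0])
    fused = []
    for query in tokens:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in tokens
        ]
        weights = softmax(scores)
        fused.append(
            [sum(w * tok[j] for w, tok in zip(weights, tokens)) for j in range(dim)]
        )
    return fused


def predict(modality_embeddings, w_action, w_asd):
    """Fuse modality tokens, mean-pool, and apply the two task heads."""
    fused = self_attention(modality_embeddings)
    pooled = [sum(col) / len(fused) for col in zip(*fused)]
    action_probs = softmax(matvec(w_action, pooled))    # 11 action types
    asd_logit = matvec(w_asd, pooled)[0]
    asd_prob = 1.0 / (1.0 + math.exp(-asd_logit))       # ASD presence
    return action_probs, asd_prob


rng = random.Random(0)
DIM = 8  # hypothetical embedding size
# One stand-in embedding each for skeleton, body mesh, and optical flow.
embeddings = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(3)]
w_action = [[rng.uniform(-0.1, 0.1) for _ in range(DIM)] for _ in range(11)]
w_asd = [[rng.uniform(-0.1, 0.1) for _ in range(DIM)]]
action_probs, asd_prob = predict(embeddings, w_action, w_asd)
```

Because the weights are random, the outputs carry no meaning; the sketch only shows the data flow from three modality tokens to the two outputs.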

This framework achieves an accuracy of 95.03% for predicting action types and 96.42% for predicting ASD presence, an improvement of over 10% compared to models trained on single data modalities.

These findings highlight the advantages of integrating multiple data modalities within the Multimodal Transformer framework.

Ravva, Pavan Uttej; Kiafar, Behdokht; Kullu, Pinar; Li, Jicheng; Bhat, Anjana; Barmaki, Roghayeh Leila. 2024. MMASD+: A Novel Dataset for Privacy-Preserving Behavior Analysis of Children with Autism Spectrum Disorder.
