UNIFIND - Competenze e Professionalità

unime.it
Self-Supervised Hypergraph Learning for Enhanced Multimodal Representation

Article
Publication Date:
2024
Abstract:
Hypergraph neural networks have gained substantial popularity in capturing complex correlations between data items in multimodal datasets. In this study, we propose a novel approach called the self-supervised hypergraph learning (SHL) framework that focuses on extracting hypergraph features to improve multimodal representation. Our method utilizes a dual embedding strategy and leverages SHL to improve the accuracy and robustness of the model. To achieve this, we employ a hypergraph learning framework to extract global context effectively by capturing rich inter-modal dependencies. Additionally, we introduce a novel self-supervised learning (SSL) component that utilizes the interaction graph data, thereby strengthening the robustness of the model. By jointly optimizing hypergraph feature extraction and SSL, SHL significantly improves the performance of multimodal representation tasks. To validate the effectiveness of our approach, we construct two comprehensive multimodal micro-video recommendation datasets using publicly available data (TikTok and MovieLens-10M). Prior to dataset creation, we meticulously handle invalid entries and outliers and complete missing modality information using external auxiliary sources, such as YouTube. These datasets are made publicly available to the research community for evaluation purposes. Experimental results on the above recommendation datasets demonstrate that the proposed SHL approach outperforms state-of-the-art baselines, highlighting its superior performance in multimodal representation tasks.
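The abstract names two ingredients: hypergraph feature extraction and a contrastive self-supervised objective. The sketch below is purely illustrative and is not the authors' implementation; the function names, shapes, and the specific normalization (the standard hypergraph convolution X' = Dv⁻¹ H De⁻¹ Hᵀ X Θ with an InfoNCE-style loss between two embedding views) are assumptions chosen to make the idea concrete.

```python
import numpy as np

def hypergraph_conv(X, H, Theta):
    """One hypergraph convolution step: X' = Dv^-1 H De^-1 H^T X Theta.

    X:     (n_nodes, d_in) node features
    H:     (n_nodes, n_edges) incidence matrix (H[i, e] = 1 if node i is in hyperedge e)
    Theta: (d_in, d_out) learnable projection
    """
    Dv = H.sum(axis=1)            # node degrees
    De = H.sum(axis=0)            # hyperedge degrees
    msg = H.T @ X                 # gather node features into each hyperedge
    msg = msg / De[:, None]       # average within each hyperedge
    out = H @ msg                 # scatter hyperedge summaries back to nodes
    out = out / Dv[:, None]       # average over a node's incident hyperedges
    return out @ Theta

def contrastive_ssl_loss(z1, z2, tau=0.5):
    """InfoNCE-style loss: row i of z1 should match row i of z2
    (two views of the same node) against all other rows."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    sim = z1 @ z2.T / tau
    sim = sim - sim.max(axis=1, keepdims=True)   # numerical stability
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))
```

In a joint training loop of the kind the abstract describes, the convolution output would provide one embedding view, a second view would come from the interaction graph, and the task loss and the contrastive loss would be optimized together.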
CRIS Type:
14.a.1 Journal article
Keywords:
hypergraph neural networks; micro-video; multimodal; self-supervised learning
Author list:
Shu, H.; Meng, C.; De Meo, P.; Wang, Q.; Zhu, J.
University authors:
DE MEO Pasquale
Link to full record:
https://iris.unime.it/handle/11570/3291491
Published in:
IEEE ACCESS (Journal)