
Video question answering supported by a multi-task learning objective

Falcon A. • Serra G. • Lanz O.
2023
  • journal article

Journal
MULTIMEDIA TOOLS AND APPLICATIONS
Abstract
Video Question Answering (VideoQA) concerns building models that can analyze a video and produce a meaningful answer to questions about its visual content. To encode the given question, word embedding techniques are used to compute a token representation suitable for neural networks. Yet almost all works in the literature use the same technique, even though recent advances in NLP have produced better solutions. This lack of analysis is a major shortcoming. To address it, this paper presents a twofold contribution on the choice of word embedding and its relation to question encoding. First, we integrate four of the most popular word embedding techniques into three recent VideoQA architectures and investigate how they influence performance on two public datasets, EgoVQA and PororoQA. We show that, through the learning process, the embeddings come to carry question type-dependent characteristics. Second, to leverage this result, we propose a simple yet effective multi-task learning protocol that uses an auxiliary task defined on the question types. With the proposed learning strategy, significant improvements are observed in most of the combinations of network architecture and embedding under analysis.
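As an illustration only, a multi-task objective with an auxiliary question-type task can be sketched as a main answer-classification loss plus a weighted auxiliary loss. The abstract does not state the exact form of the objective, so the weighted sum below, the `aux_weight` parameter, and all function names are assumptions made for this sketch, not the authors' implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Cross-entropy of a single example given raw logits and a class index."""
    return -math.log(softmax(logits)[target])


def multitask_loss(answer_logits, answer_target,
                   qtype_logits, qtype_target, aux_weight=0.5):
    """Main answer loss plus a weighted auxiliary question-type loss.

    ASSUMPTION: a plain weighted sum; `aux_weight` is a hypothetical
    hyperparameter, not a value taken from the paper.
    """
    main = cross_entropy(answer_logits, answer_target)
    aux = cross_entropy(qtype_logits, qtype_target)
    return main + aux_weight * aux


# Hypothetical example: 5 candidate answers, 4 question types.
loss = multitask_loss([2.0, 0.1, -1.0, 0.3, 0.0], 0,
                      [0.5, 1.5, -0.2, 0.0], 1)
```

Setting `aux_weight` to zero recovers single-task training on the answers alone, which gives a natural baseline for measuring what the auxiliary task contributes.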
DOI
10.1007/s11042-023-14333-0
Archive
https://hdl.handle.net/11390/1245364
info:eu-repo/semantics/altIdentifier/scopus/2-s2.0-85150592452
https://ricerca.unityfvg.it/handle/11390/1245364
Rights
open access
Subjects
  • Multi-task learning

  • Video question answer...

  • Vision and language

  • Word embedding techni...
