Multitask Representation Learning for Multimodal Estimation of Depression Level

Authors: A. Qureshi, S. Saha, M. Hasanuzzaman, G. Dias

Abstract: Depression is a serious medical condition that affects a large number of people around the world. It significantly affects the way one feels, causing a persistent lowering of mood. In this work, we propose a novel multitask learning attention-based deep neural network model, which facilitates the fusion of various modalities. In particular, we use this network to both regress and classify the level of depression. Acoustic, textual, and visual modalities have been used to train our proposed network. Various experiments have been carried out on the benchmark Distress Analysis Interview Corpus - Wizard of Oz (DAIC-WOZ) dataset. From the results, we empirically show that a) multitask networks co-trained on regression and classification outperform their single-task counterparts, and b) fusing all the modalities yields the most accurate estimation of depression with respect to regression. Our proposed approach outperforms the state of the art by 4.93% on root mean squared error and 1.50% on mean absolute error for regression, while we establish new baseline values for depression classification, namely 66.66% accuracy and an F-score of 0.53.
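The abstract describes an attention-based fusion of acoustic, textual, and visual representations feeding jointly trained regression and classification heads. The sketch below shows one plausible way to wire such a network in PyTorch; the encoder shapes, the single-layer softmax attention over modalities, and the loss weight `lam` are illustrative assumptions, not the authors' published architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultitaskFusionNet(nn.Module):
    """Illustrative multitask, attention-fused network. Dimensions and
    the fusion mechanism are assumptions, not the paper's exact design."""

    def __init__(self, acoustic_dim, text_dim, visual_dim,
                 hidden_dim=128, num_classes=2):
        super().__init__()
        # One encoder per modality, projecting into a shared space.
        self.enc_a = nn.Linear(acoustic_dim, hidden_dim)
        self.enc_t = nn.Linear(text_dim, hidden_dim)
        self.enc_v = nn.Linear(visual_dim, hidden_dim)
        # Scalar attention score per modality embedding.
        self.attn = nn.Linear(hidden_dim, 1)
        # Task-specific heads sharing the fused representation.
        self.regressor = nn.Linear(hidden_dim, 1)             # depression score
        self.classifier = nn.Linear(hidden_dim, num_classes)  # depression class

    def forward(self, x_a, x_t, x_v):
        # Stack per-modality embeddings: (batch, 3, hidden_dim).
        h = torch.stack([torch.tanh(self.enc_a(x_a)),
                         torch.tanh(self.enc_t(x_t)),
                         torch.tanh(self.enc_v(x_v))], dim=1)
        # Softmax attention over the three modalities, then weighted sum.
        w = torch.softmax(self.attn(h), dim=1)
        fused = (w * h).sum(dim=1)
        return self.regressor(fused).squeeze(-1), self.classifier(fused)

def multitask_loss(score_pred, score_true, logits, labels, lam=1.0):
    # Co-training objective: MSE for regression plus (weighted)
    # cross-entropy for classification, as the abstract's two tasks suggest.
    return F.mse_loss(score_pred, score_true) + \
           lam * F.cross_entropy(logits, labels)
```

In a setup like this, both heads backpropagate into the shared encoders and attention layer, which is the mechanism by which co-training regression and classification can improve over single-task networks.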

Publication Date: June 2019 (accepted)

Published in: IEEE Intelligent Systems