
Profiling Twitter Users Using Autogenerated Features Invariant to Data Distribution

With the spread of the Web and social media, automatic user profiling classifiers applied to digital content have become extremely important in application contexts related to social and forensic studies. In many research papers on this topic, a large part of the work is devoted to a costly manual "feature engineering" phase, in which semantic, syntactic, and often language-dependent features must be carefully chosen to be relevant for the profiling task. In contrast to this approach, in this work we propose a Twitter user profiling classifier that exploits deep learning techniques to automatically generate user features that are a) optimal for the user profiling task, and b) able to counter the covariate shift problem caused by differences in data distribution between the training and test sets. In the best configuration found, the system achieves strong accuracy on both English and Spanish, with an average final accuracy of more than 0.83.

CLEF (Working Notes) 2019, Lugano, Switzerland, 2019

IIT authors:

Type: Contribution in conference proceedings
Field of reference: Computer Science & Engineering

File: Clef2019.pdf

Activity: Algorithmics for web technologies