
Semantically-aware Statistical Metrics via Weighting Kernels

Distance metrics between statistical distributions are widely used as an efficient means to aggregate and simplify the underlying probabilities, thus enabling high-level analyses. In this paper we investigate the collisions that can arise with such metrics, and a mitigation technique rooted in kernels. In detail, we first show that the existence of colliding functions (so-called iso-curves) is widespread across metrics and families of functions (e.g., Gaussian, heavy-tailed). We then propose a solution based on kernels for augmenting distance metrics and summary statistics, thus avoiding collisions and highlighting semantically relevant phenomena. This study is supported by a thorough theoretical evaluation of our solution against a large number of functions and metrics, complemented by a real-world evaluation carried out by applying our solution to an existing problem. Some further research avenues are also discussed. The theoretical construction and the achieved results show the soundness, viability, and quality of our proposal, which, besides being interesting on its own, also paves the way for further research in the highlighted directions.
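To make the notion of a collision concrete, the following is a minimal illustrative sketch, not the construction used in the paper. It assumes a plain L1 distance on discretized Gaussians and a hypothetical logistic-shaped weighting kernel; the names `discretize`, `l1_distance`, and `kernel` are introduced here only for illustration. Two distributions shifted in opposite directions are equally distant from a reference under the unweighted metric, while the weighted variant tells them apart.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Density of a normal distribution N(mu, sigma^2) at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def discretize(mu, sigma, grid):
    """Evaluate a Gaussian on the grid and normalize it to sum to 1."""
    weights = [gaussian_pdf(x, mu, sigma) for x in grid]
    total = sum(weights)
    return [w / total for w in weights]

def l1_distance(p, q, kernel=None):
    """L1 distance between two discrete distributions.

    If a kernel is given, each pointwise difference is multiplied by the
    kernel value at that grid point (an assumed weighting scheme).
    """
    if kernel is None:
        kernel = [1.0] * len(p)
    return sum(k * abs(a - b) for k, a, b in zip(kernel, p, q))

# Symmetric grid on [-10, 10] with step 0.1.
grid = [i / 10 for i in range(-100, 101)]

reference = discretize(0.0, 1.0, grid)
shifted_left = discretize(-2.0, 1.0, grid)
shifted_right = discretize(2.0, 1.0, grid)

# Unweighted metric: the two shifts collide (same distance from the reference).
plain_left = l1_distance(reference, shifted_left)
plain_right = l1_distance(reference, shifted_right)

# Hypothetical kernel that gives more weight to differences at larger x,
# standing in for "the right-hand side is semantically more important".
kernel = [1.0 + 1.0 / (1.0 + math.exp(-x)) for x in grid]

weighted_left = l1_distance(reference, shifted_left, kernel)
weighted_right = l1_distance(reference, shifted_right, kernel)

print(f"plain:    left={plain_left:.4f}  right={plain_right:.4f}")
print(f"weighted: left={weighted_left:.4f}  right={weighted_right:.4f}")
```

In this sketch the kernel encodes an assumed preference for one region of the domain; any other non-constant weighting that breaks the mirror symmetry would separate the two shifted distributions in the same way.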

The 6th IEEE International Conference on Data Science and Advanced Analytics (DSAA'19), Washington, USA, 2019

External authors: Roberto Di Pietro (Hamad Bin Khalifa University)
IIT authors:

Type: Conference proceedings contribution
Discipline area: Computer Science & Engineering

File: Cresci, 2019, Semantically-aware Statistical Metrics via Weighting Kernels.pdf

Activity: Social Media Analysis