
A matter of words: NLP for quality evaluation of Wikipedia medical articles

Automatic quality evaluation of Web information is a highly relevant task with many fields of application, especially in critical domains such as medicine. We start from the intuition that the content quality of medical Web documents is affected by domain-specific features: first, the use of a specific vocabulary (Domain Informativeness); then, the adoption of specific codes (like those used in the infoboxes of Wikipedia articles) and the type of document (e.g., historical or technical). In this paper, we propose to leverage specific domain features to improve the results of the evaluation of Wikipedia medical articles. We rely on Natural Language Processing (NLP) and dictionary-based techniques to extract the biomedical concepts in a text. The results of our experiments confirm that, by considering domain-oriented features, it is possible to obtain appreciable improvements over existing solutions, mainly for those articles that other approaches classify less accurately.


16th International Conference on Web Engineering (ICWE2016), Lugano, 2016

External authors: Angelo Spognardi (DTU Compute, Kgs. Lyngby, Denmark)
IIT authors:

Vittoria Cozza


Type: Contribution in conference proceedings
Subject area: Information Technology and Communication Systems

File: icwe16short.pdf

Activity: Information bubbles and detection of fakes on the Web