Web Crawling and Processing with Limited Resources for Business Intelligence and Analytics Applications

this activity for enterprises span from the reduction of operating costs due to a more sensible internal organization to a more productive and informed decision process. To be effective, BI relies heavily on the availability of a huge amount of (possibly high-quality) data. The steady decrease in the cost of acquiring, storing and analyzing large knowledge bases has motivated big companies to invest in BI technologies. SMEs (Small and Medium-sized Companies), by contrast, have so far been excluded from the benefits of BI because of their limited budget and resources. In this paper we show that a satisfactory BI activity is possible even with a small budget. Our ultimate goal is not necessarily to propose novel solutions but to provide practitioners with a sort of hitchhiker’s guide to cost-effective web-based BI. In particular, we discuss how the Web can be used as a cheap yet reliable source of information, where crawling, data cleaning and classification can be achieved using a limited amount of CPU, storage space and bandwidth.

Journal of Software, 2018

IIT authors:

Loredana Marialuisa Genovese

Type: Article in a non-ISI journal
Subject area: Information Technology and Communication Systems

File: jsw.pdf
Pages 300 to 316

Activity: Algorithms for web technologies