Paper Search Console

Journal Title

Title of Journal: World Wide Web

Abbreviation: World Wide Web

Publisher

Springer US

ISSN

1573-1413

Online refresh strategies for content based feed a

Authors: Roxana Horincar, Bernd Amann, Thierry Artières
Publish Date: 2014/05/02
Volume: 18, Issue: 4, Pages: 913-947

Abstract

With the rapid growth of data sources, services, and devices connected to the Internet, online available web content is getting more and more diverse and dynamic. In order to facilitate the efficient dissemination of evolving and temporary information, many web applications publish their new information as RSS and Atom documents, which are then collected and transformed by RSS aggregators like Feedly or Yahoo! News. This article addresses the particular issue of large-scale aggregation of highly dynamic information sources by focusing on the design of optimal refresh strategies for large collections of RSS feed documents. First, we introduce two quality measures specific to RSS aggregation, which reflect the information completeness and average freshness of the result feeds. Then, we propose a best-effort feed refresh strategy that achieves maximum aggregation quality compared with all other existing policies with the same average number of refreshes. This strategy is based on specific online change estimation models developed after a deep analysis of the temporal publication characteristics of a representative collection of real-world RSS feeds. The presented methods have been implemented and tested against synthetic and real-world RSS feed data sets.

