Authors: KeYan Cao, GuoRen Wang, DongHong Han, GuoHui Ding, AiXia Wang, LingXu Shi
Publish Date: 2014/05/17
Volume: 29, Issue: 3, Pages: 436-448
Abstract
Outlier detection on data streams is an important task in data mining. The challenges become even larger when considering uncertain data. This paper studies the problem of outlier detection on uncertain data streams. We propose Continuous Uncertain Outlier Detection (CUOD), which can quickly determine the nature of the uncertain elements by pruning to improve efficiency. Furthermore, we propose a pruning approach, Probability Pruning for Continuous Uncertain Outlier Detection (PCUOD), to reduce the detection cost. It is an estimated outlier probability method which can effectively reduce the amount of calculation. The cost of the PCUOD incremental algorithm can satisfy the demands of uncertain data streams. Finally, a new method for parameter variable queries to CUOD is proposed, enabling the concurrent execution of different queries. To the best of our knowledge, this paper is the first work to perform outlier detection on uncertain data streams that can handle parameter variable queries simultaneously. Our methods are verified using both real data and synthetic data. The results show that they are able to reduce the required storage and running time.

The work is supported by the National Natural Science Foundation of China under Grant Nos. 61025007, 61328202, 61173029, 61100024, 61332006, and 61073063, the National High Technology Research and Development 863 Program of China under Grant No. 2012AA011004, and the National Basic Research 973 Program of China under Grant No. 2011CB302200G.
Keywords: