Journal Title
Title of Journal: Data Mining and Knowledge Discovery
Abbreviation: Data Min Knowl Disc
Authors: Parvathi Chundi, Daniel J. Rosenkrantz
Publish Date: 2008/04/18
Volume: 17, Issue: 3, Pages: 377-401
Abstract
We propose a special type of time series, which we call an itemset time series, to facilitate the temporal analysis of software version histories, email logs, stock market data, etc. In an itemset time series, each observed data value is a set of discrete items. We formalize the concept of an itemset time series and present efficient algorithms for segmenting a given itemset time series. Segmentation of a time series partitions the time series into a sequence of segments, where each segment is constructed by combining consecutive time points of the time series. Each segment is associated with an item set that is computed from the item sets of the time points in that segment, using a function which we call a measure function. We then define a concept called the segment difference, which measures the difference between the item set of a segment and the item sets of the time points in that segment. The segment difference values are required to construct an optimal segmentation of the time series. We describe novel and efficient algorithms to compute segment difference values for each of the measure functions described in the paper. We outline a dynamic-programming-based scheme to construct an optimal segmentation of the given itemset time series. We use the itemset time series segmentation techniques to analyze the temporal content of three different data sets: Enron email, stock market data, and a synthetic data set. The experimental results show that an optimal segmentation of itemset time series data captures much more temporal content than a segmentation constructed based on the number of time points in each segment, without examining the item set data at the time points, and can be used to analyze different types of temporal data.
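The abstract does not give the paper's exact definitions, so the following is only an illustrative sketch of how a measure function, a segment difference, and a dynamic-programming segmentation could fit together. It assumes, hypothetically, that the measure function is either the union or the intersection of the item sets in a segment, that the segment difference is the sum of symmetric-difference sizes between the segment's item set and each time point's item set, and that the number of segments k is fixed in advance. All function names are invented for illustration, and the brute-force cost table is not the efficient computation the paper describes.

```python
from typing import Callable, FrozenSet, List, Sequence, Tuple

ItemSet = FrozenSet[str]
Measure = Callable[[Sequence[ItemSet]], ItemSet]


def union_measure(points: Sequence[ItemSet]) -> ItemSet:
    """Assumed measure function: items present at any time point."""
    out = set()
    for p in points:
        out |= p
    return frozenset(out)


def intersection_measure(points: Sequence[ItemSet]) -> ItemSet:
    """Assumed measure function: items present at every time point."""
    out = set(points[0])
    for p in points[1:]:
        out &= p
    return frozenset(out)


def segment_difference(points: Sequence[ItemSet], measure: Measure) -> int:
    """Assumed difference: total symmetric-difference size between the
    segment's item set and the item set of each time point in it."""
    rep = measure(points)
    return sum(len(rep ^ p) for p in points)


def optimal_segmentation(
    series: Sequence[ItemSet], k: int, measure: Measure
) -> Tuple[int, List[Tuple[int, int]]]:
    """Split `series` into k contiguous segments minimizing the summed
    segment difference. Returns (total, [(start, end), ...]) with
    half-open index ranges."""
    n = len(series)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= len(series)")

    # cost[i][j]: difference of the segment series[i:j] (brute force).
    cost = [[0] * (n + 1) for _ in range(n + 1)]
    for i in range(n):
        for j in range(i + 1, n + 1):
            cost[i][j] = segment_difference(series[i:j], measure)

    inf = float("inf")
    # best[s][j]: minimal total for covering series[:j] with s segments.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0
    for s in range(1, k + 1):
        for j in range(s, n + 1):
            for i in range(s - 1, j):
                cand = best[s - 1][i] + cost[i][j]
                if cand < best[s][j]:
                    best[s][j] = cand
                    back[s][j] = i

    # Walk the back-pointers to recover the segment boundaries.
    bounds = []
    j = n
    for s in range(k, 0, -1):
        i = back[s][j]
        bounds.append((i, j))
        j = i
    bounds.reverse()
    return best[k][n], bounds


series = [
    frozenset({"a", "b"}),
    frozenset({"a", "b"}),
    frozenset({"c"}),
    frozenset({"c", "d"}),
]
total, bounds = optimal_segmentation(series, 2, union_measure)
print(total, bounds)
```

Under these assumptions, the example groups the two identical leading time points into one segment and the two trailing time points into another. Swapping in `intersection_measure` changes the item set associated with each segment without changing the dynamic program itself.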