Journal Title
Title of Journal: Social Network Analysis and Mining
Abbreviation: Soc Netw Anal Min
Publisher: Springer Vienna
Authors: Evangelos E. Papalexakis, Tudor Dumitras, Duen Horng Chau, B. Aditya Prakash, Christos Faloutsos
Publish Date: 2014/11/27
Volume: 4, Issue: 1, Pages: 240-
Abstract
How does malware propagate? Does it form spikes over time? Does it resemble the propagation pattern of benign files, such as software patches? Does it spread uniformly over countries? How long does it take for a URL that distributes malware to be detected and shut down? In this work, we answer these questions by analyzing patterns from 22 million malicious and benign files found on 1.6 million hosts worldwide during the month of June 2011. We conduct this study using the WINE database available at Symantec Research Labs. Additionally, we explore the research questions raised by sampling on such large databases of executables. The importance of studying the implications of sampling is twofold: first, sampling is a means of reducing the size of the database, hence making it more accessible to researchers; second, every such data collection can be perceived as a sample of the real world. We discover the SharkFin temporal propagation pattern of executable files, the GeoSplit pattern in the geographical spread of machines that report executables to Symantec’s servers, and the Periodic Power Law (Ppl) distribution of the lifetime of URLs, and we show how to efficiently extrapolate crucial properties of the data from a small sample. We further investigate the propagation pattern of benign and malicious executables, unveiling latent structures in the way these files spread. To the best of our knowledge, our work represents the largest study of propagation patterns of executables.

We thank Vern Paxson and Marc Dacier for their early feedback on the design and effects of the WINE sampling strategy. The data analyzed in this paper are available for follow-on research as the reference data set WINE-2012-006. Research was also sponsored by the Army Research Laboratory and was accomplished under Cooperative Agreement Number W911NF-09-2-0053.

The model assumes a total number of $N$ machines that can be infected. Let $U(n)$ be the number of machines that are not infected at time $n$, $I(n)$ be the count of machines that got infected up to time $n-1$, and $\Delta I(n)$ be the count of machines infected exactly at time $n$. Then $U(n+1) = U(n) - \Delta I(n+1)$, with initial conditions $\Delta I(0) = 0$ and $U(0) = N$. Additionally, we let $\beta$ be the strength of that executable file. We assume that the infectiveness of a file on a machine drops as a specific power law based on the elapsed time $\tau$ since the file infected that machine, i.e., $f(\tau) = \beta \tau^{-1.5}$. Finally, we also have to consider one more parameter for our model, the “external shock”, or in other words the first appearance of a file: let $n_b$ be the time at which this initial burst appeared, and let $S(n_b)$ be the size of the shock (count of infected machines).
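
The quantities defined above can be put together in a small simulation. The following is only a minimal sketch, not the authors' implementation. The excerpt gives the bookkeeping for the uninfected count, the power-law decay of infectiveness, and the external shock, but it does not state how the newly infected count is computed at each step. The sketch assumes that each uninfected machine is exposed to the sum of past infections (plus the shock at the burst time), each weighted by the decayed infectiveness. The function name `simulate_sharkfin`, its parameter names, and the clamp that keeps the uninfected count non-negative are all hypothetical choices made for illustration.

```python
def simulate_sharkfin(N, beta, n_b, shock, steps):
    """Simulate U(n) (uninfected) and dI(n) (newly infected) for `steps` ticks.

    N      -- total number of machines that can be infected
    beta   -- strength of the executable file
    n_b    -- time of the external shock (first appearance of the file)
    shock  -- size of the shock S(n_b), as a count of infected machines
    """
    U = [float(N)]   # initial condition U(0) = N
    dI = [0.0]       # initial condition Delta I(0) = 0

    for n in range(steps):
        # ASSUMPTION (not stated in the text): infection pressure at time
        # n + 1 is the sum over past times t >= n_b of the machines infected
        # at t (plus the shock at t == n_b), each weighted by the decayed
        # infectiveness f(tau) = beta * tau**-1.5 with tau = n + 1 - t.
        pressure = 0.0
        for t in range(n_b, n + 1):
            s = shock if t == n_b else 0.0
            tau = n + 1 - t
            pressure += (dI[t] + s) * beta * tau ** -1.5

        # Hypothetical clamp: never infect more machines than remain.
        new_infections = min(U[n], U[n] * pressure)

        dI.append(new_infections)
        # Bookkeeping from the text: U(n+1) = U(n) - Delta I(n+1).
        U.append(U[n] - new_infections)

    return U, dI
```

Calling, for example, `simulate_sharkfin(N=1000, beta=0.001, n_b=3, shock=10.0, steps=20)` returns two lists of length 21. Nothing is infected before the shock at `n_b`, and the uninfected count only decreases afterwards.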