Authors: Pete Beckman, Kamil Iskra, Kazutomo Yoshii, Susan Coghlan, Aroon Nataraj
Publish Date: 2008/01/24
Volume: 11, Issue: 1, Pages: 3-16
Abstract
We investigate operating system noise, which we identify as one of the main reasons for a lack of synchronicity in parallel applications. Using a microbenchmark, we measure the noise on several contemporary platforms and find that, even with a general-purpose operating system, noise can be limited if certain precautions are taken. We then inject artificially generated noise into a massively parallel system and measure its influence on the performance of collective operations. Our experiments indicate that on extreme-scale platforms, the performance is correlated with the largest interruption to the application, even if the probability of such an interruption on a single process is extremely small. We demonstrate that synchronizing the noise can significantly reduce its negative influence.
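The scaling claim in the abstract can be illustrated with a small calculation. The sketch below is not the model or the benchmark used in the paper. It assumes a simplified setting in which each process is independently interrupted with probability `p_single` during one collective operation, every interruption has the same length `detour_us`, and the collective finishes only when the slowest process arrives. The function names and parameters are hypothetical and chosen for illustration only.

```python
def prob_any_interrupted(p_single: float, n_procs: int) -> float:
    """Probability that at least one of n_procs processes is interrupted.

    Assumes interruptions are independent across processes (an assumption
    of this sketch, not a result from the paper).
    """
    return 1.0 - (1.0 - p_single) ** n_procs


def expected_delay_us(detour_us: float, p_single: float, n_procs: int,
                      synchronized: bool = False) -> float:
    """Expected extra time added to one collective operation, in microseconds.

    Unsynchronized noise: the collective waits for the slowest process, so
    it is delayed whenever any process is interrupted.
    Synchronized noise: all processes are interrupted at the same moment,
    so the collective is delayed only with the single-process probability.
    """
    if synchronized:
        return detour_us * p_single
    return detour_us * prob_any_interrupted(p_single, n_procs)


if __name__ == "__main__":
    # Hypothetical values: a 100 us detour with a 1-in-10,000 chance per process.
    for n in (1, 1_000, 100_000):
        unsync = expected_delay_us(100.0, 1e-4, n)
        sync = expected_delay_us(100.0, 1e-4, n, synchronized=True)
        print(f"n={n}: unsynchronized={unsync:.4f} us, synchronized={sync:.4f} us")
```

Under these assumptions, a per-process interruption probability of 1e-4 still gives a probability above 0.99 that at least one of 100,000 processes is interrupted, which is why a rare interruption on a single process can dominate a collective at scale, and why aligning the interruptions in time keeps the expected delay at the single-process level.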