We tackle the problem of predicting the performance of MapReduce applications by designing accurate progress indicators, which keep programmers informed of the percentage of computation time completed during the execution of a job. This is especially important in pay-as-you-go cloud environments, where slow jobs can be aborted to avoid excessive costs. Performance predictions can also serve as a building block for several profile-guided optimizations. Because they assume that running time depends linearly on input size, state-of-the-art techniques can be seriously harmed by data skewness, load imbalance, and straggling tasks. We thus design a novel profile-guided progress indicator, called NearestFit, that operates without the linearity assumption and in a fully online way (i.e., without resorting to profile data collected from previous executions). NearestFit exploits a careful combination of nearest neighbor regression and statistical curve fitting techniques. The fine-grained profiles required by our theoretical progress model are approximated through space- and time-efficient data streaming algorithms. We implemented NearestFit on top of Hadoop 2.6.0. An extensive empirical assessment on the Amazon EC2 platform over a variety of benchmarks shows that it remains accurate even when competitors incur non-negligible errors and wide prediction fluctuations.
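The contrast between a linear running-time model and nearest neighbor regression can be illustrated with a minimal sketch. This is not the NearestFit algorithm: the function names, the choice of k, and the synthetic quadratic profile are all assumptions made purely for illustration.

```python
def linear_estimate(observed, size):
    """Proportional extrapolation: assumes time grows linearly with input size."""
    total_size = sum(s for s, _ in observed)
    total_time = sum(t for _, t in observed)
    return total_time / total_size * size


def nearest_neighbor_estimate(observed, size, k=2):
    """Average the running times of the k observed tasks closest in input size."""
    nearest = sorted(observed, key=lambda p: abs(p[0] - size))[:k]
    return sum(t for _, t in nearest) / len(nearest)


def cost(size):
    """Hypothetical super-linear task cost (assumption): time = size^2 / 100."""
    return size * size / 100


# Synthetic profile of completed tasks: (input size, running time).
observed = [(s, cost(s)) for s in range(10, 90, 10)]

query = 75                      # input size of a task that has not finished yet
true_time = cost(query)
linear = linear_estimate(observed, query)
nearest = nearest_neighbor_estimate(observed, query)

print(f"true={true_time:.2f} linear={linear:.2f} nearest={nearest:.2f}")
```

On this synthetic profile the proportional estimate falls well short of the true cost of the large task, while the average over the two nearest observed sizes stays close to it. Real profiles are noisier, and the result depends on how densely the observed sizes cover the query.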
On data skewness, stragglers, and MapReduce progress indicators / Coppa, Emilio; Finocchi, Irene. - (2015), pp. 139-152. (Paper presented at the ACM Symposium on Cloud Computing (SoCC), held in Hawaii, USA, 27-30 August 2015) [10.1145/2806777.2806843].
Title: | On data skewness, stragglers, and MapReduce progress indicators | |
Authors: | Coppa, Emilio; Finocchi, Irene | |
Publication date: | 2015 | |
Handle: | http://hdl.handle.net/11385/192551 | |
ISBN: | 978-1-4503-3651-2 | |
Appears in categories: | 04.1 - Contribution in conference proceedings (Paper in Proceedings) |