Comprehensive Elastic Resource Management to Ensure Predictable Performance for Scientific Applications on Public IaaS Clouds (UCC 2014)


Scientists increasingly rely on large-scale compute resources on public IaaS clouds to run their applications efficiently. Unfortunately, the auto-scaling techniques offered by public cloud providers are reactive, which can lead to slow response times and poor job deadline satisfaction rates. To address these problems, we designed an end-to-end elastic resource management system for scientific applications on public IaaS clouds. The system employs three strategies: 1) an accurate and dynamic job execution time predictor, 2) a resource evaluation scheme that balances cost and performance, and 3) an 'availability-aware' job scheduling algorithm. We deployed the system on Amazon Web Services and compared it with other state-of-the-art resource management schemes. Experimental results show that our system improves the deadline satisfaction rate by 9% to 32% over other schemes, while also providing better cost-efficiency than these state-of-the-art approaches.

Proceedings of the 7th IEEE/ACM International Conference on Utility and Cloud Computing (UCC 2014)