Cost-Driven Scheduling Strategy for Scientific Workflow under Multi-cloud Environment
LIN Bing1,2, GUO Wen-Zhong1,2, CHEN Guo-Long1,2, CHEN Huang-Ning1
1.College of Mathematics and Computer Science, Fuzhou University, Fuzhou 350116 2.Fujian Provincial Key Laboratory of Networking Computing and Intelligent Information Processing, Fuzhou University, Fuzhou 350108
Abstract:Aiming at the deadline-constrained scientific workflow scheduling problem under multi-cloud environment, the concept of partial critical paths algorithm is introduced. A cost-driven scheduling strategy for scientific workflow is proposed to reduce the execution cost of workflow as much as possible and satisfy its deadline constraint. The characteristics of multi-cloud environment and scientific workflows are taken into account in this strategy. Firstly, the adjacent two tasks with a common directed cut-edge are merged into a single task based on the workflow structure. Then, the partial critical paths with subdeadline constraints are searched based on the critical parent iterative mechanism. Finally, the most suitable instances are allocated to the partial critical path and all the tasks in the path are scheduled to their corresponding instance. Various workflows are used for evaluating the proposed strategy and experimental results show that the proposed strategy has a better execution efficiency and a lower workflow execution cost.
[1] Bittencourt L F, Madeira E R M, da Fonseca N L S. Scheduling in Hybrid Clouds. IEEE Communications Magazine, 2012, 50(9): 42-47 [2] Tindell K W, Burns A, Wellings A J. Allocating Hard Real-Time Tasks: An NP-Hard Problem Made Easy. Real-Time Systems, 1992, 4(2): 145-165 [3] Armbrust M, Fox A, Griffith R, et al. A View of Cloud Computing. Communications of the ACM, 2010, 53(4): 50-58 [4] Wang Q, Li X F, Wang J. A Data Placement and Task Scheduling Algorithm in Cloud Computing. Journal of Computer Research and Development, 2014, 51(11): 2416-2426 (in Chinese) (王 强,李雄飞,王 婧.云计算中的数据放置与任务调度算法.计算机研究与发展, 2014, 51(11): 2416-2426) [5] Kwok Y K, Ahmad I. Static Scheduling Algorithms for Allocating Directed Task Graphs to Multiprocessors. ACM Computing Surveys, 1999, 31(4): 406-471 [6] Song Y J, Yang X Z, Li D Y, et al. The Cloud Scheduler Politics of Multiprocessor Multitask Real Time Systems. Chinese Journal of Computers, 2000, 23(10): 1107-1113 (in Chinese) (宋远骏,杨孝宗,李德毅,等.多机多任务实时系统云调度策略.计算机学报, 2000, 23(10): 1107-1113) [7] Cao H J, Jin H, Wu X X, et al. DAGMap: Efficient and Depend-able Scheduling of DAG Workflow Job in Grid. The Journal of Supercomputing, 2010, 51(2): 201-223 [8] Chen W N, Zhang J. An Ant Colony Optimization Approach to a Grid Workflow Scheduling Problem with Various QoS Requirements. IEEE Trans on Systems, Man, and Cybernetics: Applications and Reviews, 2008, 39(1): 29-43 [9] Abrishami S, Naghibzadeh M, Epema D H J. Deadline-Constrained Workflow Scheduling Algorithms for Infrastructure as a Service Clouds. Future Generation Computer Systems, 2013, 29(1): 158-169 [10] Pandey S, Wu L L, Guru S M, et al. A Particle Swarm Optimization-Based Heuristic for Scheduling Workflow Applications in Cloud Computing Environments // Proc of the 24th IEEE International Conference on Advanced Information Networking and Applications. Perth, USA, 2010: 400-407 [11] Wu Z J, Ni Z W, Gu L C, et al. A Revised Discrete Particle Swarm Optimization for Cloud Workflow Scheduling // Proc of the International Conference on Computational Intelligence and Security. Nanning, China, 2010: 184-188 [12] Li J Y, Qiu M K, Ming Z, et al. Online Optimization for Scheduling Preemptable Tasks on IaaS Cloud Systems. Journal of Parallel and Distributed Computing, 2012, 72(5): 666-677 [13] Lin X Y, Wu C Q. On Scientific Workflow Scheduling in Clouds under Budget Constraint // Proc of the 42nd International Confe-rence on Parallel Processing. Lyon, France, 2013: 90-99 [14] Liu S W, Kong L M, Ren K J, et al. A Two-Step Data Placement and Task Scheduling Strategy for Optimizing Scientific Workflow Performance on Cloud Computing Platform. Chinese Journal of Computers, 2011, 34(11): 2121-2130 (in Chinese) (刘少伟,孔令梅,任开军,等.云环境下优化科学工作流执行性能的两阶段数据放置与任务调度策略.计算机学报, 2011, 34(11): 2121-2130) [15] Abrishami S, Naghibzadeh M, Epema D H J. Cost-Driven Scheduling of Grid Workflows Using Partial Critical Paths. IEEE Trans on Parallel and Distributed Systems, 2012, 23(8): 1400-1414 [16] Bharathi S, Chervenak A, Deelman E, et al. Characterization of Scientific Workflows // Proc of the 3rd Workshop on Workflows in Support of Large-Scale Science. Austin, USA, 2008. DOI: 10.1109/WORKS.2008.4723958 [17] Van den Bossche R, Vanmechelen K, Broeckhove J. Online Cost-Efficient Scheduling of Deadline-Constrained Workloads on Hybrid Clouds. Future Generation Computer Systems, 2013, 29(4): 973-985 [18] Wang X D. Data Structures and Algorithm Design. Beijing, China: China Machine Press, 2012 (in Chinese) (王晓东.数据结构与算法设计.北京:机械工业出版社, 2012) [19] Zhao F, Wang B S, Lu Z X, et al. Impact of Non-cut Link Fai-lures on Network Traffic. Journal on Communications, 2006, 27(11A): 184-188 (in Chinese) (赵 锋,王宝生,卢泽新,等.非割边链路故障对网络流量的影响分析.通信学报, 2006, 27(11A): 184-188) [20] Yi S, Andrzejak A, Kondo D. Monetary Cost-Aware Checkpointing and Migration on Amazon Cloud Spot Instances. IEEE Trans on Services Computing , 2012, 5(4): 512-524 [21] Topcuoglu H, Hariri S, Wu M Y. Performance-Effective and Low-Complexity Task Scheduling for Heterogeneous Computing. IEEE Trans on Parallel and Distributed Systems, 2002, 13(3): 260-274