Scheduling for high performance computing with reinforcement learning

Authors

Scott Hutchison, Kansas State University; Daniel Andresen, Kansas State University; William Hsu, Kansas State University; Benjamin Parsons, Engineer Research and Development Center, Vicksburg, MS; Mitchell Neilsen, Kansas State University

Keywords:

High Performance Computing, Scheduling, Artificial Intelligence, Reinforcement Learning, Deep Learning

Synopsis

This is a Chapter in:

Book:
Competitive Tools, Techniques, and Methods

Print ISBN 978-1-6692-0008-6
Online ISBN 978-1-6692-0007-9

Series:
Chronicle of Computing

Chapter Abstract:

Job scheduling for high performance computing systems involves building a policy to optimize for a particular metric, such as minimizing job wait time or maximizing system utilization. Different administrators may value one metric over another, and the desired policy may change over time. Tuning a scheduling application to optimize for a particular metric is challenging, time-consuming, and error-prone. However, reinforcement learning can quickly learn different scheduling policies from log data and effectively apply those policies to other workloads. This research demonstrates that a reinforcement learning agent trained using the proximal policy optimization algorithm performs 18.44% better than algorithmic scheduling baselines for one metric and has comparable performance for another. Reinforcement learning can learn scheduling policies that optimize for multiple metrics and can select not only which job in the queue to schedule next, but also the machine on which to run it. The agent considers jobs with three resource constraints (CPU, GPU, and memory) while respecting individual machine resource constraints.
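
To make the setup described in the abstract concrete, the following is a minimal, hypothetical sketch of how such a scheduling problem can be expressed with the tools cited below: an OpenAI Gym environment (Brockman et al., 2016) trained with Stable-Baselines3 PPO (Raffin et al., 2021; Schulman et al., 2017). The machine count, resource capacities, reward values, and episode length are illustrative assumptions, not the authors' implementation; infeasible (job, machine) choices are simply penalized here rather than masked as in Huang & Ontañón (2020). The sketch assumes the classic Gym reset/step API and a Stable-Baselines3 release compatible with it.

```python
import numpy as np
import gym
from gym import spaces


class HPCSchedulingEnv(gym.Env):
    """Toy HPC job scheduling environment (illustrative only).

    Observation: remaining (CPU, GPU, memory) on each machine, followed by the
    (CPU, GPU, memory) request of each of the first QUEUE_LEN waiting jobs.
    Action: an integer encoding a (queue slot, machine) pair.
    """

    NUM_MACHINES = 4
    QUEUE_LEN = 8
    CAPACITY = np.array([32.0, 4.0, 128.0])  # CPUs, GPUs, memory (GB) per machine

    def __init__(self):
        super().__init__()
        obs_len = 3 * (self.NUM_MACHINES + self.QUEUE_LEN)
        self.observation_space = spaces.Box(0.0, np.inf, shape=(obs_len,), dtype=np.float32)
        self.action_space = spaces.Discrete(self.QUEUE_LEN * self.NUM_MACHINES)

    def _new_job(self):
        # Random request that always fits on an empty machine.
        return np.array([np.random.randint(1, 17),   # CPUs
                         np.random.randint(0, 3),    # GPUs
                         np.random.randint(1, 65)],  # memory (GB)
                        dtype=np.float32)

    def _obs(self):
        return np.concatenate([self.free.ravel(), self.queue.ravel()]).astype(np.float32)

    def reset(self):
        self.free = np.tile(self.CAPACITY, (self.NUM_MACHINES, 1))
        self.queue = np.stack([self._new_job() for _ in range(self.QUEUE_LEN)])
        self.steps = 0
        return self._obs()

    def step(self, action):
        slot, machine = divmod(int(action), self.NUM_MACHINES)
        request = self.queue[slot]
        if np.all(request <= self.free[machine]):
            self.free[machine] -= request       # place the job on that machine
            self.queue[slot] = self._new_job()  # backfill the queue slot
            reward = 1.0                        # reward a successful placement
        else:
            reward = -0.1                       # penalize an infeasible choice
        self.steps += 1
        done = self.steps >= 200                # fixed-length episodes
        return self._obs(), reward, done, {}


# Train a PPO agent on the toy environment (hyperparameters left at defaults).
from stable_baselines3 import PPO

env = HPCSchedulingEnv()
model = PPO("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=10_000)
```

A realistic environment would additionally model job arrivals from a workload trace and release a machine's resources when a job finishes; the reward function is where a site's preferred metric, such as job wait time or system utilization, would be encoded.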

Cite this paper as:
Hutchison S., Andresen D., Hsu W., Parsons B., Neilsen M. (2024) Scheduling for high performance computing with reinforcement learning. In: Tiako P.F. (ed) Competitive Tools, Techniques, and Methods. Chronicle of Computing. OkIP. APDC24#6. https://doi.org/10.55432/978-1-6692-0007-9_1

Presented at:
The 2024 OkIP International Conference on Advances in Parallel and Distributed Computing (APDC) in Oklahoma City, Oklahoma, USA, and Online, on April 3, 2024

Contact:
Scott Hutchison
scotthutch@ksu.edu

References

Brockman, G., Cheung, V., Pettersson, L., Schneider, J., Schulman, J., Tang, J., & Zaremba, W. (2016). OpenAI Gym. arXiv preprint arXiv:1606.01540.

Chapin, S., Cirne, W., Feitelson, D., Jones, J., Leutenegger, S., Schwiegelshohn, U., . . . Talby, D. (1999). Benchmarks and standards for the evaluation of parallel job schedulers. Job Scheduling Strategies for Parallel Processing: IPPS/SPDP'99 Workshop, JSSPP'99 (pp. 67-90). San Juan: Springer Berlin Heidelberg.

Corbalan, J., & D’Amico, M. (2021). Modular workload format: Extending SWF for modular systems. Workshop on Job Scheduling Strategies for Parallel Processing (pp. 43-55). Cham: Springer International Publishing.

Dósa, G., & Sgall, J. (2014). Optimal analysis of best fit bin packing. International Colloquium on Automata, Languages, and Programming (pp. 429-441). Berlin, Heidelberg: Springer Berlin Heidelberg.

Girden, E. R. (1992). ANOVA: Repeated measures (No. 84). Sage.

Huang, S., & Ontañón, S. (2020). A closer look at invalid action masking in policy gradient algorithms. arXiv preprint arXiv:2006.14171.

Jette, M., & Grondona, M. (2003). SLURM: Simple Linux Utility for Resource Management. Proceedings of ClusterWorld Conference and Expo. San Jose, California.

Liang, S., Yang, Z., Jin, F., & Chen, Y. (2020). Data centers job scheduling with deep reinforcement learning. Advances in Knowledge Discovery and Data Mining: 24th Pacific-Asia Conference, PAKDD 2020 (pp. 906-917). Singapore: Springer International Publishing.

Mao, H., Alizadeh, M., Menache, I., & Kandula, S. (2016). Resource management with deep reinforcement learning. Proceedings of the 15th ACM Workshop on Hot Topics in Networks (pp. 50-56).

Raffin, A., Hill, A., Gleave, A., Kanervisto, A., Ernestus, M., & Dormann, N. (2021). Stable-Baselines3: Reliable Reinforcement Learning Implementations. Journal of Machine Learning Research, 22, 1-8. Retrieved from http://jmlr.org/papers/v22/20-1364.html

Schulman, J., Wolski, F., Dhariwal, P., Radford, A., & Klimov, O. (2017). Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347.

Student. (1908). The probable error of a mean. Biometrika, 6(1), 1-25.

Ullman, J. D. (1975). NP-complete scheduling problems. Journal of Computer and System Sciences, 10(3), 384-393.

Wang, Q., Zhang, H., Qu, C., Shen, Y., Liu, X., & Li, J. (2021). RLSchert: An HPC job scheduler using deep reinforcement learning and remaining time prediction. Applied Sciences, 11(20), 9448.

Zhang, D., Dai, D., He, Y., Bao, F. S., & Xie, B. (2020). RLScheduler: an automated HPC batch job scheduler using reinforcement learning. SC20: International Conference for High Performance Computing, Networking, Storage and Analysis (pp. 1-15). IEEE.

Published

August 17, 2024

Online ISSN

2831-350X

Print ISSN

2831-3496
