oai:arXiv.org:2311.12308
Computer Science
2023
11/29/2023
Scientific workflows facilitate computational, data manipulation, and sometimes visualization steps for scientific data analysis.
They are vital for reproducing and validating experiments, which usually involve computational steps in scientific simulations and data analysis.
These workflows are often developed by domain scientists using Jupyter notebooks, which are convenient yet face limitations: they struggle to scale with larger data sets, lack failure tolerance, and depend heavily on the stability of underlying tools and packages.
To address these issues, Jup2Kub has been developed.
This software system translates workflows from Jupyter notebooks into a distributed, high-performance Kubernetes environment, enhancing fault tolerance.
It also manages software dependencies to maintain operational stability amidst changes in tools and packages.
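The translation step described above can be illustrated with a minimal sketch. It is not Jup2Kub's actual algorithm, which the abstract does not detail. It assumes that each code cell of an nbformat-style notebook becomes one Kubernetes `batch/v1` Job, that a pinned container image tag stands in for dependency management, and that retries (`backoffLimit`, `restartPolicy`) stand in for fault tolerance. The function names, the Job naming scheme, and the default image are hypothetical, and the sketch ignores how data would be passed between cells.

```python
import json


def code_cells(notebook_json: str) -> list[str]:
    """Return the source text of each code cell in an nbformat-style notebook."""
    nb = json.loads(notebook_json)
    cells = []
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # skip markdown and raw cells
        src = cell.get("source", "")
        # nbformat stores source either as one string or as a list of lines
        cells.append("".join(src) if isinstance(src, list) else src)
    return cells


def job_manifest(step: int, source: str, image: str = "python:3.11-slim") -> dict:
    """Build a Kubernetes batch/v1 Job manifest that runs one cell's code.

    The pinned image tag is an assumption standing in for dependency management.
    """
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": f"notebook-step-{step}"},  # hypothetical naming scheme
        "spec": {
            "backoffLimit": 3,  # retry a failed step up to three times
            "template": {
                "spec": {
                    "restartPolicy": "OnFailure",
                    "containers": [
                        {
                            "name": "step",
                            "image": image,
                            "command": ["python", "-c", source],
                        }
                    ],
                }
            },
        },
    }


def notebook_to_jobs(notebook_json: str) -> list[dict]:
    """Translate every code cell into one Job manifest, numbered from 1."""
    return [
        job_manifest(i, src)
        for i, src in enumerate(code_cells(notebook_json), start=1)
    ]
```

Each returned dictionary can be serialized with `json.dumps` and passed to `kubectl apply -f`, which accepts JSON manifests as well as YAML.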
Comment: for associated software, see https://github.com/shirou10086/Scientificworkflow
Duan, Jinli, Dennis, Shasha, 2023, Jup2Kub: algorithms and a system to translate a Jupyter Notebook pipeline to a fault tolerant distributed Kubernetes deployment