Document detail
ID

oai:arXiv.org:2311.12308

Topic
Computer Science - Distributed, Pa... Computer Science - Software Engine...
Author
Duan, Jinli; Dennis, Shasha
Category

Computer Science

Year

2023

Listing date

11/29/2023

Keywords
scientific workflows distributed data software
Abstract

Scientific workflows facilitate computational, data manipulation, and sometimes visualization steps for scientific data analysis.

They are vital for reproducing and validating experiments, usually involving computational steps in scientific simulations and data analysis.

These workflows are often developed by domain scientists using Jupyter notebooks, which are convenient yet face limitations: they struggle to scale with larger data sets, lack failure tolerance, and depend heavily on the stability of underlying tools and packages.

To address these issues, Jup2Kub has been developed.

This software system translates workflows from Jupyter notebooks into a distributed, high-performance Kubernetes environment, enhancing fault tolerance.

It also manages software dependencies to maintain operational stability amidst changes in tools and packages.

Comment: for associated software, see https://github.com/shirou10086/Scientificworkflow

Duan, Jinli; Dennis, Shasha, 2023, Jup2Kub: algorithms and a system to translate a Jupyter Notebook pipeline to a fault tolerant distributed Kubernetes deployment
