ADASS XXXI

Stimela 2.0: containerization and workflow management for data pipelines
2021-10-27, 10:30–10:45, Grand Ballroom

The stimela software package (Makhathini 2018) was designed to address two glaring problems in radio astronomy software: the rigid nature of data reduction pipelines, and the heterogeneous and often conflicting dependencies of the software packages involved. Stimela is a containerization and workflow management framework. More specifically, it removes the cumbersome need to craft the perfect computing environment whenever a data reduction has to interface more than one software package. Stimela leverages container technology, together with a modular Python-based approach, to provide a simple, unified interface to a wide range of radio astronomy software packages. This has allowed users to focus on optimising their pipelines instead of fine-tuning computing environments. Examples of stimela deployment “in the field” include the CARACal and VerMeerKAT pipelines, two of the pipelines capable of producing quality end-to-end MeerKAT reductions.

Three years later, stimela is ready for a second major release (stimela2), which provides a YAML-based syntax for composable processing recipes, much greater flexibility in software installations (supporting a mix of direct binary installations, virtual environments, and containers), and distributed pipelines that can be seamlessly deployed on most cluster infrastructures.


Theme

Solutions for workflow management and reproducibility, Building accessible and friendly user interfaces