ADASS XXXI

Jonathan Kenyon

Biography

I am a relatively recent graduate from Rhodes University, having obtained a Ph.D. in the Department of Physics and Electronics for my thesis entitled "CubiCal: A fast radio interferometric calibration suite exploiting complex optimisation". I am a technically minded person who enjoys writing and optimizing code. My focus is on the calibration of radio interferometer data, but I have dabbled in imaging and deconvolution in the past. I am always keen to hear about new techniques and technologies and how they may be applied to my work.

Affiliation

Rhodes University/SARAO

Position

Postdoctoral Research Fellow

GitHub ID

JSKenyon


Sessions

10-25
10:00
15min
QuartiCal - embarrassingly parallel calibration using Numba and Dask
Jonathan Kenyon

We live in the era of Big Data. Where once it was a looming threat, a problem for our future selves, it is now very much upon us. Existing radio interferometers, such as MeerKAT and LOFAR, already produce unprecedented quantities of visibility data, and even they will soon be dwarfed by the planned SKA and ngVLA. Despite the modernity of these instruments, they will still require extensive calibration in order to correct various science-limiting effects. Thus, calibration is and will remain an integral part of radio interferometric data reduction. This has motivated the development of QuartiCal, a Python application that leverages two contemporary packages to make calibration scalable, distributable and fast. The first such package is Dask, a library for parallel and distributed computing with Python. It provides parallel, Big Data collections that extend familiar interfaces (e.g. NumPy, Pandas). Using these collections, it is possible to construct task graphs that can be understood by Dask’s dynamic task schedulers. This allows appropriately written code to scale from executing locally on a laptop to remotely on a compute cluster. The second package of interest is Numba. Numba is a just-in-time compiler for a subset of Python/NumPy that can provide C-like speed without forfeiting the expressiveness and dynamism of Python. It has been used to extensively optimize the computationally demanding components of the calibration algorithms. QuartiCal convincingly outperforms its predecessor, CubiCal, in terms of both wall time and memory footprint. Finally, in testing QuartiCal, we have found that the Measurement Set (backed by the Casacore Table Data System) can limit parallel performance.

Grand Ballroom