Bernie Boscoe is an information scientist in Los Angeles, California. Prior to this, she was a postdoctoral researcher in the Physics and Astronomy Department at UCLA, where she worked on the Machine Learning in Astronomy project, funded by the Alfred P. Sloan Foundation. Her work looks at the adoption of machine learning techniques into traditional computational approaches in astronomy, with an interest in how tools from industry can be used in research settings.

Affiliation – Occidental College
Position – Visiting Assistant Professor, Computer Science
As machine learning is increasingly adopted into astronomy research practices, additional workflows must be added to preserve reproducible and replicable results. However, because machine learning tools are often drawn from industry or computer science domains, how these imported tools can fit into traditional astronomy computational practices and enhance results is not well understood. In this talk, I start from the position that machine learning models can rarely be rerun after a year's time, let alone after a longer span. I ask the question: how can scientific results be verified, compared against, or built upon if the artefacts necessary to examine how those results were obtained have not been saved? With this in mind, I suggest three straightforward additions to typical machine learning workflows to improve reproducibility: (1) fixing the training data used to create the model, (2) following best practices in saving the model, including the parameters and hyperparameters used in the final tuning, and (3) describing additional artefacts from model evaluation that enable a further understanding of the outputs the model produced. I will conclude with the steps enacted in our team's research project using machine learning to predict galactic redshifts, and present lessons learned from our experience reproducing machine learning models.
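The three additions above could be sketched in code roughly as follows. This is a minimal illustration using only the Python standard library, not the speaker's actual pipeline: the model is a hypothetical placeholder (a dictionary of fitted coefficients), and the function name `save_run` and the file layout are assumptions for the example.

```python
import hashlib
import json
import pickle
from pathlib import Path

def save_run(train_rows, model, hyperparams, metrics, out_dir="run_artifacts"):
    """Persist the artefacts needed to re-examine a trained model later."""
    out = Path(out_dir)
    out.mkdir(exist_ok=True)

    # (1) Fix the training data: store a content hash so the exact
    # dataset used to create this model can later be verified.
    blob = json.dumps(train_rows, sort_keys=True).encode()
    (out / "train_data.sha256").write_text(hashlib.sha256(blob).hexdigest())

    # (2) Save the model together with the parameters and
    # hyperparameters used in the final tuning.
    with open(out / "model.pkl", "wb") as f:
        pickle.dump(model, f)
    (out / "hyperparams.json").write_text(json.dumps(hyperparams, indent=2))

    # (3) Record evaluation artefacts (here just summary metrics)
    # that help a later reader understand the model's outputs.
    (out / "metrics.json").write_text(json.dumps(metrics, indent=2))
    return out

# Example usage with toy values (all hypothetical).
artifacts = save_run(
    train_rows=[[0.1, 1.2], [0.3, 2.9]],
    model={"coef": [0.5, 1.1]},
    hyperparams={"learning_rate": 0.01, "epochs": 50, "seed": 42},
    metrics={"rmse": 0.08},
)
print(sorted(p.name for p in artifacts.iterdir()))
```

In practice the data hash would be computed over the dataset files themselves, and the model would be saved with whatever persistence mechanism the ML framework provides; the point is that each of the three artefact types is written out alongside the model rather than discarded after training.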