Cataloguing Deep Space: Data Sciences Institute research software development support office seeds Zoobot project

October 16, 2024 by Cormac Rea - Data Sciences Institute

Astronomers and aerospace engineers are continuously driven to design and build better tools with which to monitor and explore outer space. Recent breakthroughs have resulted in new billion-dollar telescopes (e.g., Euclid and Rubin) that can provide reams of detailed photographs from distant reaches of the universe.

But with each breakthrough arrive new problems; for instance, how can astronomers accurately organize, label, measure, catalogue and eventually make use of this seemingly infinite cache of images?

Enter Zoobot3D, a cutting-edge software development project funded by the Data Sciences Institute that connects AI with human ingenuity to efficiently measure, label, annotate and catalogue images of deep space. Zoobot3D will be the first and only software tool for galaxy feature segmentation, underpinning a new field of research and helping researchers answer questions that would otherwise be impossible to answer.

Essentially, Zoobot3D will help researchers develop maps to millions of previously unknown galaxies … and who knows what we might find there?

Euclid’s view of the Perseus cluster of galaxies.

Co-led by Jo Bovy, a professor and Canada Research Chair in Galactic Astrophysics at the David A. Dunlap Department of Astronomy & Astrophysics, and Joshua Speagle, an assistant professor of astrostatistics in the Department of Statistical Sciences, the Zoobot project was awarded funding under the DSI Research Software Development Program.

“From the dawn of humanity, people have looked at the sky and classified the phenomena that can be observed on the celestial tapestry,” says Bovy. “This has led to fundamental insights, such as that the Earth is not at the centre of the Universe and that the Milky Way galaxy is but one of an enormous number of galaxies.”

“Understanding this ‘zoo’ of galaxies across time allows us to piece together how galaxies form and evolve and how our own Milky Way fits into this picture. By partnering with the DSI, we are able to bring the power of modern software development and data science to bear on this problem.”

Euclid’s view of spiral galaxy IC 342.

“Historically, astronomers have looked through every image of galaxies — and they have looked through many thousands and tens of thousands — and then they divided them into different buckets,” explains Zoobot Team Lead and Department of Astronomy & Astrophysics postdoctoral fellow, Mike Walmsley.

“But as telescopes have become much more powerful, it’s impossible to do that for the millions of images each telescope now collects.”

“We’ve been running a citizen science project named Galaxy Zoo, showing galaxies to hundreds of thousands of people and asking them to annotate those images — partly to get those same measurements that astronomers are used to, and partly to see what might be there that we didn’t expect,” adds Walmsley.

“Zoobot adds to the picture by helping to really focus on the first of those goals — the making of measurements at scale.”

Three graphics showing the zooniverse experiment tasks.

Certain technical challenges with the Zoobot3D project required a research software engineer who could package the custom annotation tools so that other researchers could create their own labels and seamlessly retrain the model on their own data.

“This has been a very interesting project,” says DSI senior software developer, Conor Klamann. “Its purpose — the creation of maps of outer space — is undeniably fascinating, and developing the software itself has given us the opportunity to evaluate, select, and integrate several cutting-edge open-source tools.”

“It’s always amazing to see what the open-source community has created, and it’s gratifying to think that citizen-scientists will be using our software to advance our knowledge of the world (and beyond!).”

An image of stars and text that says "Zooniverse Experiment, crowd-sourced labeling, get started."