Scientific discovery often begins with the observation of an inconsistency with the "normal" patterns in our data. Detecting such observations, something different and incongruous with the rest of the data, is what we call anomaly detection. Looking for anomalies differs from most other tasks in that we do not know exactly what to look for; we only need to look for something different. The goal of this workshop is to nurture the community of researchers working at the intersection of machine learning and various scientific domains toward scientific discovery.
This workshop will also serve as the award ceremony, with special recognition of the winners, for the 1st HDR Interdisciplinary Machine Learning Challenge on anomaly detection. The challenge is a series of four challenges: one for each of three distinct scientific domains (biology, physics, and climate science) and a combined challenge across domains. A critical element of this challenge was the integration of FAIR and reproducible science.
Our one-day workshop will include keynote and invited talks, contributed paper presentations, a poster session, a panel discussion, and presentations by the winning teams of their challenge solutions.
The intended audience for this workshop includes (a) AI/ML/data science researchers working on topics such as anomaly and novelty detection, out-of-distribution detection, open-world recognition, scientific discovery, and FAIR datasets and reproducible workflows, who are looking for novel interdisciplinary research problems; and (b) domain scientists working on problems amenable to data-driven scientific discovery.
9:00 - 9:10am: Opening Remarks
9:10 - 9:50am: Eric Nalisnick—Anomalous Anomalies: Monitoring and Adapting Anomaly Detectors
Abstract: Anomaly detectors perform essential tasks ranging from protecting an autonomous system from abnormal inputs to isolating interesting scientific signals that may lead to discovery. But how do we know that the anomaly detector itself will be robust? The detector could fail if the distribution of expected anomalies changes, and to make matters worse, we usually don't have an abundance of test-time anomalies labeled as such. In this talk, I will discuss my group's recent work on monitoring whether our anomaly detector is still valid and on adapting it to data shift, without supervision.
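As a purely illustrative aside (this is not the method presented in the talk), the sketch below shows one simple way the monitoring problem can be posed: a deployed detector's anomaly scores on new, unlabeled batches are compared against a reference score distribution with a two-sample Kolmogorov-Smirnov test, flagging when the detector itself may no longer be valid. The detector choice, threshold, and synthetic data are assumptions made only for the example.

```python
# Hypothetical sketch: monitoring a deployed anomaly detector for shift.
# NOT the method from the talk; only a minimal illustration of the monitoring
# problem, in which the anomaly *scores* themselves are tested for drift.
import numpy as np
from scipy.stats import ks_2samp
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# Train a detector on "normal" data and record its reference score distribution.
X_train = rng.normal(size=(2000, 5))
detector = IsolationForest(random_state=0).fit(X_train)
reference_scores = detector.score_samples(X_train)

def monitor(batch, alpha=0.01):
    """Return (anomaly_scores, shift_detected) for a new unlabeled batch."""
    scores = detector.score_samples(batch)
    # Two-sample KS test between deployment-time scores and reference scores.
    _, p_value = ks_2samp(scores, reference_scores)
    return scores, p_value < alpha

# A batch drawn from a shifted distribution should trigger the monitor.
X_shifted = rng.normal(loc=0.7, size=(500, 5))
_, shifted = monitor(X_shifted)
print("shift detected:", shifted)
```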
Bio: Eric Nalisnick is an assistant professor at Johns Hopkins University. His research interests span statistical machine learning and probabilistic modeling, with an emphasis on quantifying uncertainty in deep learning, human-AI collaboration, specifying prior knowledge, and detecting distribution shift. He was previously an assistant professor at the University of Amsterdam, a postdoctoral researcher at the University of Cambridge, and a PhD student at the University of California, Irvine. Eric has also held research positions at DeepMind, Microsoft, Twitter, and Amazon. His papers have been recognized with selective oral presentations (ECCV 2024) and awards (AISTATS 2023, AISTATS 2024).
9:50 - 10:30am: Jennifer Ngadiuba—Boosting sensitivity to new physics at the LHC with anomaly detection
Abstract: Anomaly detection techniques have been proposed as a way to mitigate the impact of model-specific assumptions when searching for new physics at the LHC. In this talk, I will discuss how these techniques, when based on modern AI developments, could be utilized at different stages of the data-processing workflow, from real-time systems to offline analysis, and the impact they could have in revolutionizing the current paradigms in the search for new physics.
Bio: Jennifer Ngadiuba, a Wilson Fellow at Fermilab since 2021, specializes in searching for new physics in collider data and advancing AI applications in high-energy physics. After earning her Ph.D. from the University of Zurich, she contributed to CMS experiment studies, focusing on diboson resonances and jet substructure techniques. Starting when she was a research fellow at CERN and Caltech, she pioneered deep learning for anomaly detection and fast machine learning on FPGAs for real-time systems in particle physics. Her work earned her the DOE AI4HEP award and the AI2050 fellowship in 2023, recognizing her transformative contributions to experimental physics and machine learning.
10:30 - 11:30am: Coffee Break and Poster Session (Papers and Associated Posters)
11:30 - 12:10pm: Adji Bousso Dieng—Vendi Scoring For Discovery
Abstract: This talk will cover the concepts, tools, and methods that make up Vendi Scoring, a new research direction focused on the concept of diversity. I'll begin by introducing the Vendi Scores, a family of diversity metrics rooted in ecology and quantum mechanics, along with their extensions. Next, I'll discuss algorithms for efficiently searching large materials databases and exploring complex energy landscapes, such as those found in molecular simulations, using the Vendi Scores. Finally, I'll introduce the new concept of 'algorithmic microscopy,' which stems from Vendi Scoring, and describe the Vendiscope, the first algorithmic microscope designed to help scientists zoom in on large data collections for data-driven discovery.
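For readers unfamiliar with the metric, the Vendi Score of a collection is the exponential of the Shannon entropy of the eigenvalues of the normalized similarity matrix, i.e., an effective count of distinct items. The minimal sketch below is only an illustration; the RBF similarity and its bandwidth are assumptions for the example, not choices from the talk.

```python
# Minimal sketch of the Vendi Score (Friedman & Dieng, 2023): the exponential
# of the Shannon entropy of the eigenvalues of the normalized similarity matrix.
# The RBF similarity below is an illustrative choice only.
import numpy as np

def vendi_score(X, gamma=1.0):
    """Effective number of distinct items in X under an RBF similarity."""
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-gamma * sq_dists)          # similarity matrix with K[i, i] = 1
    eigvals = np.linalg.eigvalsh(K / len(X))
    eigvals = eigvals[eigvals > 1e-12]     # drop numerical zeros
    return float(np.exp(-(eigvals * np.log(eigvals)).sum()))

# Duplicated points yield a low score; spread-out points yield a higher one.
rng = np.random.default_rng(0)
print(vendi_score(np.zeros((10, 3))))          # ~1: all items identical
print(vendi_score(rng.normal(size=(10, 3))))   # close to 10: diverse items
```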
Bio: Adji Bousso Dieng is an Assistant Professor of Computer Science at Princeton University, where she leads Vertaix, a lab doing research at the intersection of artificial intelligence and the natural sciences. She is affiliated with the Chemical and Biological Engineering Department, the Princeton Materials Institute, the Princeton Quantum Initiative, the Andlinger Center for Energy and the Environment, and the High Meadows Environmental Institute (HMEI) at Princeton. She is also a Research Scientist at Google AI and the founder and President of the nonprofit The Africa I Know. She has recently been named an Early-Career Distinguished Presenter at the MRS Spring Meeting, one of 10 African Scholars to watch in 2025 by The Africa Report, an Outstanding Recent Alumna by Columbia University's Graduate School of Arts and Sciences, an AI2050 Early Career Fellow by Schmidt Sciences, and the Annie T. Randall Innovator of 2022 for her research and advocacy by the American Statistical Association. She received her Ph.D. from Columbia University. Her doctoral work received many recognitions, including a Google Ph.D. Fellowship in Machine Learning, a Rising Star in Machine Learning nomination by the University of Maryland, and a Savage Award from the International Society for Bayesian Analysis for her doctoral thesis. Dieng's research has been covered in media outlets such as New Scientist and TechXplore. She hails from Kaolack, Senegal.
12:10 - 12:30pm: Suhee Yoon—Diffusion based Semantic Outlier Generation via Nuisance Awareness for Out-of-Distribution Detection
Abstract: Out-of-distribution (OOD) detection, which determines whether a given sample is part of the in-distribution (ID), has recently shown promising results through training with synthetic OOD datasets. Nonetheless, existing methods often produce outliers that are considerably distant from the ID, showing limited efficacy for capturing subtle distinctions between ID and OOD. To address these issues, we propose a novel framework, Semantic Outlier generation via Nuisance Awareness (SONA), which notably produces challenging outliers by directly leveraging pixel-space ID samples through diffusion models. Our approach incorporates SONA guidance, providing separate control over semantic and nuisance regions of ID samples. Thereby, the generated outliers achieve two crucial properties: (i) they present explicit semantic-discrepant information, while (ii) maintaining various levels of nuisance resemblance with ID. Furthermore, the improved OOD detector training with SONA outliers facilitates learning with a focus on semantic distinctions. Extensive experiments demonstrate the effectiveness of our framework, achieving an impressive AUROC of 88% on near-OOD datasets, surpassing the performance of baseline methods by a significant margin of approximately 6%.
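For context on the reported numbers, AUROC is the standard evaluation metric for OOD detection: ID samples are labeled 0, OOD samples 1, and the detector's outlier scores are ranked. The snippet below illustrates only that evaluation step on synthetic scores; it is not the SONA pipeline, and the score distributions are assumptions for the example.

```python
# Illustrative only: how AUROC numbers for OOD detection are typically computed.
# ID samples get label 0, OOD samples label 1, and roc_auc_score ranks the
# detector's outlier scores (higher = more OOD-like).
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Hypothetical outlier scores from some detector.
id_scores = rng.normal(loc=0.0, scale=1.0, size=1000)    # in-distribution
ood_scores = rng.normal(loc=1.5, scale=1.0, size=1000)   # near-OOD

labels = np.concatenate([np.zeros_like(id_scores), np.ones_like(ood_scores)])
scores = np.concatenate([id_scores, ood_scores])
print(f"AUROC: {roc_auc_score(labels, scores):.3f}")
```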
Bio: Suhee Yoon is an AI Research Scientist at LG AI Research, specializing in AI Safety & Reliability with a focus on robustness under distribution shifts and efficient adaptation of large-scale foundation models. She holds a Master of Science in Industrial Engineering from Sungkyunkwan University. Her research spans various data domains, including computer vision, chemistry, and tabular data, aiming to develop AI models that are both reliable and adaptable to real-world challenges.
12:30 - 2:00pm: Lunch (not provided)
2:00 - 2:10pm: Challenge Overview
2:10 - 2:50pm: Butterfly Challenge Talk
2:50 - 3:30pm: Gravitational Waves Talk
3:30 - 4:00pm: Coffee Break
4:00 - 4:40pm: Sea Level Rise Challenge Talk
4:40 - 5:00pm: Overall Challenge Talk
5:00 - 5:30pm: Closing Remarks and Discussion of Next Challenge
We encourage participation from researchers working on a broad range of topics that explore AI/ML techniques to detect novel patterns and anomalies within data and promote scientific discovery. Examples of research questions include (but are not limited to):
We offer an extended submission deadline with the same Camera-Ready and Poster Deadlines; keep in mind that these fall after the AAAI Early Registration Deadline has passed.
We are accepting extended abstract submissions presenting positions, reviews, or research results (up to 2 pages, excluding references). Shortened versions (up to 6 pages, excluding references) of articles under submission or accepted at other venues (or presented after Oct. 1, 2024) are acceptable as long as they do not violate the dual-submission policy of the other venue. An appendix (no page limit) may be added after the references. All submissions will undergo double-blind peer review.
Submissions should follow the AAAI template format (two-column, camera-ready style; AAAI Author Kit) and be submitted via CMT.
Accepted papers will NOT be archived in the AAAI proceedings. This allows authors to extend their work afterward and submit it to a conference or journal.