Leman Akoglu
Professor at Carnegie Mellon University
Andreas Krause
Professor at ETH Zürich
Aaron Klein
Senior Scientist at Amazon Web Services
Stephen Roberts
Professor at University of Oxford
Mitra Baratchi
Professor at Leiden University
Rich Caruana
Senior Principal Researcher at Microsoft Research
Irina Rish
Professor at Université de Montréal
Leman Akoglu
Title: Automating Unsupervised Learning
Learning models are equipped with hyperparameters (HPs) that control their bias-variance trade-off and, consequently, their generalization performance. Carefully tuning these HPs is therefore of utmost importance for learning "good" models. The supervised ML community has focused on AutoML for effective algorithm selection and hyperparameter optimization (HPO), especially in high dimensions. Yet automating unsupervised learning remains significantly under-studied.
In this talk, I will present vignettes of our recent research toward unsupervised model selection, specifically in the context of anomaly detection. Especially with the advent of end-to-end trainable deep learning based models that exhibit a long list of HPs, and the attractiveness of self-supervised learning objectives for unsupervised anomaly detection, I will demonstrate that effective model selection becomes ever more critical, opening up challenges as well as opportunities. Finally, I will touch on the potential of pre-trained models toward zero-shot anomaly detection.
Leman Akoglu is the Heinz College Dean’s Associate Professor of Information Systems at Carnegie Mellon University. She received her Ph.D. from CSD/SCS of Carnegie Mellon University in 2012. Dr. Akoglu’s research interests are graph mining, pattern discovery and anomaly detection, with applications to fraud and event detection in diverse real-world domains. She is a recipient of the SDM/IBM Early Career Data Mining Research award (2020), the National Science Foundation CAREER award (2015) and the US Army Research Office Young Investigator award (2013). Her early work on graph anomalies was recognized with the Most Influential Paper award (PAKDD 2020), having previously won Best Paper (PAKDD 2010), along with several “best paper” awards at top-tier conferences. Her research has been supported by the NSF, US ARO, DARPA, Adobe, Capital One Bank, Facebook, Northrop Grumman, PNC Bank, PwC, and Snap Inc.
Aaron Klein
Title: Compressing Large Language Models via Neural Architecture Search
Large Language Models mark a new era in Artificial Intelligence. However, their large size poses challenges for inference in real-world applications due to significant GPU memory requirements and high inference latency. Neural architecture search (NAS) finds more resource-efficient neural network architectures in a data-driven way that jointly optimize performance and efficiency. While NAS is usually used to discover entirely new architectures, in this talk, we show how we can use it to prune large language models, finding sub-networks that optimally trade off efficiency, for example in terms of model size or latency, and generalization performance. We also discuss how we can apply weight-sharing NAS approaches from the literature to accelerate the search process. With NAS, we can prune smaller encoder networks by up to 50% with less than a 5% loss in downstream performance. To foster further research in this direction, we also present a hardware-aware benchmark for GPT-2 type models that utilizes surrogate predictions to approximate various hardware metrics across different devices. In the future, we hope that this work leads to a practical application of NAS that advances the efficiency and scalability of model deployment, enabling more practical and widespread application of large language models across various industries and devices.
I am the head of the AutoML research group at ScaDS.AI (Center for Scalable Data Analytics and Artificial Intelligence) in Leipzig. I am also a co-host of the virtual AutoML Seminar as part of the ELLIS units in Berlin and Freiburg. Alongside my collaborators, I lead the development of the open-source library SyneTune for large-scale hyperparameter optimization and neural architecture search. Until 2024, I worked as a senior scientist at AWS, where I was part of the long-term science team of SageMaker, AWS’s machine learning cloud platform, and the science team of Amazon Q, the GenAI assistant of AWS. Prior to that, I completed my PhD at the University of Freiburg under the supervision of Frank Hutter in 2019. My collaborators from the University of Freiburg and I won the ChaLearn AutoML Challenge in 2015. I co-organized the workshop on neural architecture search at ICLR 2020 and ICLR 2021, and served as the local chair for the AutoML Conference in 2023.
Mitra Baratchi
Title: AutoML From Raw Spatio-Temporal Observations to Decisions
Modern sensing technologies have provided the possibility of sensing the world in a way that has not been possible before, generating massive spatio-temporal data sources. How can we use such data to understand and even change the complex world around us for the better? In this talk, I will discuss unique machine learning challenges in transforming such data into actionable decisions. These challenges call for automated solutions to address various problems, from filling the gaps in the data to filling the gaps in the knowledge acquired from data alone. I will present a few examples of such problems and automated solutions to address them.
Mitra Baratchi is an assistant professor of artificial intelligence at Leiden University, where she leads the Spatio-Temporal Data Analysis and Reasoning (STAR) group and co-leads the Automated Design of Algorithms research group. Her research interests lie in spatio-temporal, time-series, and mobility data modelling. She focuses strongly on developing algorithms for wearable sensor data, Earth observations and other open spatio-temporal data sources. Specifically, she explores the design of algorithms that can automatically handle all necessary data processing tasks from the point of data collection to high-level modelling, extraction of information, and effective decision-making from such data. Her research targets applications in a broad range of urban, environmental, and industrial domains, for which she has collaborated notably with the European Space Agency, the Netherlands Institute for Space Research, the Honda Research Institute, various municipalities, and researchers in other scientific disciplines.
Irina Rish
Irina Rish is a Full Professor in the Computer Science and Operations Research Department at the Université de Montréal (UdeM) and a core faculty member of MILA – Quebec AI Institute, where she leads the Autonomous AI Lab. Dr. Rish holds a Canada Excellence Research Chair (CERC) in Autonomous AI and a Canadian Institute for Advanced Research (CIFAR) Canada AI Chair. Her extensive research career spans multiple AI domains, from automated reasoning and probabilistic inference in graphical models to machine learning, sparse modeling, and neuroscience-inspired AI. Her current research concentrates on continual learning, out-of-distribution generalization, and robustness of AI systems, as well as understanding neural scaling laws and emergent behaviors, with respect to both capabilities and alignment, in large-scale foundation models, a vital stride towards achieving maximally beneficial Artificial General Intelligence (AGI). She teaches courses on AI scaling and alignment and runs the Neural Scaling & Alignment workshop series. Dr. Rish is a recipient of the INCITE 2023 and other compute grants from the US Department of Energy, and leads several projects on Scalable Foundation Models on the Summit and Frontier supercomputers at the Oak Ridge Leadership Computing Facility, focusing on developing open-source large-scale AI models. She is also a co-founder and the Chief Science Officer of nolano.ai, a company focused on training, compression and fast inference in large-scale AI models.
Stephen Roberts
Title: Learning the Dynamics & the Dynamics of Learning
We start by looking at how imposing structure in hierarchical models can transform dynamical system learning. Not all problems, however, are so readily solved by structural priors. We then turn to some of the concepts behind understanding the loss surfaces and dynamics of the learning process. Understanding the complexities of deep-model loss surfaces leads to better learning procedures and improved generalisation performance, with some interesting insights along the way.
Steve Roberts is Professor of Machine Learning at the University of Oxford. He is a Fellow of the Royal Academy of Engineering, co-leads Oxford’s Eric & Wendy Schmidt AI in Science Programme, is director of the Oxford ELLIS unit and is co-founder of the Oxford AI company Mind Foundry. Steve’s interests lie in the theory, methodology and application of machine learning to real-world problem domains. His current research includes applications in astrophysics, climate and environment, ecology, finance and engineering, as well as a range of theoretical and methodological problems.
Rich Caruana
Title: New Frontiers and Opportunities for AutoML
AutoML began by optimizing hyperparameters such as learning rate and regularization. Quickly, however, AutoML embraced larger pieces of the ML pipeline, and hyperparameters eventually grew to include the choice of learning algorithm and pre- and post-processing steps such as feature coding, feature selection, missing-value imputation, model calibration and neural net architecture. Now LLMs and transformers have created new opportunities for AutoML. And methods such as causal learning and high-accuracy glass-box learning suggest that future AutoML may need to focus less on accuracy and more on correctness.
Rich Caruana is a senior principal researcher at Microsoft Research. Before joining Microsoft, Rich was on the faculty in the Computer Science Department at Cornell University, at UCLA’s Medical School, and at CMU’s Center for Learning and Discovery. Rich’s Ph.D. is from Carnegie Mellon University, where he worked with Tom Mitchell and Herb Simon. His thesis on Multi-Task Learning helped create interest in a new subfield of machine learning called Transfer Learning. Rich received an NSF CAREER Award in 2004 (for Meta Clustering), best paper awards in 2005 (with Alex Niculescu-Mizil), 2007 (with Daria Sorokina), and 2014 (with Todd Kulesza, Saleema Amershi, Danyel Fisher, and Denis Charles), co-chaired KDD in 2007 (with Xindong Wu), and serves as area chair for NIPS, ICML, and KDD. His current research focus is on learning for medical decision making, transparent modeling, deep learning, and computational ecology.
Andreas Krause
Title: Machine Learning in the Optimization and Discovery Loop
Many problems in science and engineering, from discovering novel molecules to tuning machine learning systems, can be viewed as estimating and optimizing an unknown function that is accessible only through noisy experiments. The field of Bayesian optimization seeks to address these challenges by quantifying uncertainty about the unknown objective and utilizing the uncertainty to navigate the exploration-exploitation dilemma. In this talk, I will present recent work motivated by key challenges in complex applications such as protein design, robotics and AutoML. In particular, I will discuss meta-learning of probabilistic models from related tasks and simulations, directing exploration in combinatorial search spaces via reinforcement learning, and exploiting causal structure to improve both sample efficiency and interpretability.
Andreas Krause is a Professor of Computer Science at ETH Zurich, where he leads the Learning & Adaptive Systems Group. He also serves as Academic Co-Director of the Swiss Data Science Center and Chair of the ETH AI Center, and co-founded the ETH spin-off LatticeFlow. Before that he was an Assistant Professor of Computer Science at Caltech. He received his Ph.D. in Computer Science from Carnegie Mellon University (2008) and his Diplom in Computer Science and Mathematics from the Technical University of Munich, Germany (2004). He is a Max Planck Fellow at the Max Planck Institute for Intelligent Systems, an ACM Fellow, an ELLIS Fellow, a Microsoft Research Faculty Fellow and a Kavli Frontiers Fellow of the US National Academy of Sciences. He received the Rössler Prize, ERC Starting Investigator and ERC Consolidator grants, the German Pattern Recognition Award, an NSF CAREER award, and the ETH Golden Owl teaching award. His research on machine learning and adaptive systems has received awards at several premier conferences and journals, including the ACM SIGKDD Test of Time award 2019 and the ICML Test of Time award 2020. Andreas Krause served as Program Co-Chair for ICML 2018 and General Chair for ICML 2023, and serves as Action Editor for the Journal of Machine Learning Research. In 2023, he was appointed to the United Nations’ High-level Advisory Body on AI.