List of Accepted Tutorials
Explaining the “Unexplainable” Large Language Models
Organizers: Zhen Tan, Song Wang, Tianlong Chen, Jing Ma, Jundong Li and Huan Liu
Abstract: The integration of Large Language Models (LLMs) into critical societal functions has intensified the urgent demand for transparency and trust. While post-hoc attribution and Chain-of-Thought reasoning serve as primary explainability approaches, they often prove unreliable, yielding brittle or illusory insights into model behavior. This tutorial examines why that is. We first establish the theoretical intractability of complete, mechanistic explanations and clarify the intrinsic barriers to full transparency, then pivot to principled, user-centric alternatives such as concept-based interpretability and controlled data attribution. We review the foundations of these techniques and their modern extensions for comprehensive explanation, inference-time intervention, and editability. Finally, we demonstrate how these methods foster effective human-AI collaboration in high-stakes scientific applications. By synthesizing foundational theory, critical analysis, and cutting-edge techniques, this tutorial provides a unique perspective on developing the next generation of explainable and trustworthy AI.
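To give a concrete flavor of the concept-based interpretability this tutorial covers, below is a minimal sketch of a CAV-style linear concept probe. It assumes hidden-state activations have already been extracted from some layer of an LLM; the random arrays standing in for those activations, and all shapes, are illustrative placeholders, not part of any specific method presented in the tutorial.

```python
# Minimal sketch of a linear concept probe (CAV-style), assuming
# hidden-state activations were already extracted from an LLM layer.
# The random arrays below are placeholders for real activations.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Rows are examples, columns are hidden units of the chosen layer.
acts_with_concept = rng.normal(loc=0.5, size=(100, 768))
acts_without_concept = rng.normal(loc=0.0, size=(100, 768))

X = np.vstack([acts_with_concept, acts_without_concept])
y = np.array([1] * 100 + [0] * 100)

probe = LogisticRegression(max_iter=1000).fit(X, y)

# The normalized weight vector acts as a concept direction:
# projecting new activations onto it scores concept presence.
cav = probe.coef_[0] / np.linalg.norm(probe.coef_[0])
score = acts_with_concept @ cav
print(f"mean concept score (positive set): {score.mean():.3f}")
```

The same probe direction can then be used for inference-time intervention, e.g., by nudging activations along or against the concept vector, which is one of the user-centric control mechanisms the tutorial discusses.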
A Comprehensive Guide to Time-Series Anomaly Detection
Organizers: John Paparrizos, Paul Boniol, Qinghua Liu and Themis Palpanas
Abstract: Anomaly detection is a fundamental data analytics task across scientific fields and industries. In recent years, interest has increasingly turned to applying anomaly detection techniques to time series. In this tutorial, we take a holistic view of anomaly detection in time series, comprehensively covering detection algorithms from the 1980s through the most recent state-of-the-art techniques. Importantly, the scope of this tutorial extends beyond algorithmic discussion to the latest advancements in benchmarking and evaluation measures for this area. In particular, our interactive systems enable the exploration of methods and benchmarking results, promoting user comprehension. Furthermore, this tutorial extensively explores automated solutions for unsupervised model selection, introduces a new taxonomy, and engages with the challenges and recent findings, particularly the difficulty these solutions face in outperforming simple random choice. Driven by the limited generalizability of current detection algorithms, we review recent applications of Foundation Models to anomaly detection, motivating further research in the area.
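As a small taste of the distance-based family of detectors the tutorial surveys, the sketch below scores each subsequence of a series by the distance to its nearest non-overlapping neighbor, the idea behind discord discovery and the matrix profile. The naive O(n²) implementation, the window length, and the synthetic series are all illustrative choices, not a reference implementation of any particular method.

```python
# Minimal sketch of distance-based time-series anomaly detection:
# each subsequence is scored by the distance to its nearest
# non-trivial neighbor (discord discovery / matrix-profile idea).
import numpy as np

def discord_scores(ts: np.ndarray, m: int) -> np.ndarray:
    """Naive O(n^2) nearest-neighbor distance per length-m subsequence."""
    n = len(ts) - m + 1
    subs = np.stack([ts[i:i + m] for i in range(n)])
    scores = np.empty(n)
    for i in range(n):
        d = np.linalg.norm(subs - subs[i], axis=1)
        # Exclude trivial matches: subsequences overlapping position i.
        d[max(0, i - m):min(n, i + m)] = np.inf
        scores[i] = d.min()
    return scores

# Synthetic sine wave with an injected anomaly around index 500.
t = np.linspace(0, 50, 1000)
ts = np.sin(t)
ts[500:510] += 3.0
scores = discord_scores(ts, m=32)
print("most anomalous subsequence starts at", int(scores.argmax()))
```

A high score means no similar pattern exists anywhere else in the series, which is exactly the evaluation question (what counts as anomalous, and how to score it) that the benchmarking part of the tutorial examines.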
GNN Explainers 2.0: User-Centric and Data-Driven Insights
Organizers: Arijit Khan, Xiangyu Ke, Yinghui Wu and Francesco Bonchi
Abstract: Graph neural networks (GNNs) are powerful deep learning models for graph-structured data, excelling in domains such as social networks, knowledge graphs, bioinformatics, transportation, the World Wide Web, and finance, on tasks like node/graph classification, link prediction, entity resolution, question answering, recommendation, and fraud detection. Despite their empirical success, GNNs remain largely opaque: their multi-layer message passing and complex feature interactions make it hard for practitioners and stakeholders to understand why a model produced a particular prediction. The first wave of explainability research (GNN Explainers 1.0; e.g., GNNExplainer, PGExplainer, SubgraphX, PGMExplainer, GraphLime, GCFExplainer, CF2, GNN-LRP) made important progress by identifying influential nodes, edges, subgraphs, and features, yet these methods typically offer one-off, task-limited explanations. Practical debugging and accountability demand richer, layer-wise provenance and interactive, configurable explanations, so that data scientists can trace transformations and non-technical stakeholders can query and understand GNN behavior via familiar interfaces, including structured queries, counterfactual evidence, and natural language. This tutorial surveys advances toward user-centered GNN explanations (GNN Explainers 2.0), shows how data science principles can improve comprehension, usability, and trust, and presents representative works, open challenges, and opportunities for the web and data mining community.
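The core mechanism behind perturbation-based explainers such as GNNExplainer can be sketched compactly: learn a soft mask over edges that preserves the model's prediction while penalizing mask size, so that surviving edges constitute the explanation. The sketch below uses a toy one-layer mean-aggregation GNN in plain PyTorch; the graph, weights, and hyperparameters are illustrative stand-ins rather than the published algorithm.

```python
# Minimal sketch of a GNNExplainer-style learned edge mask over a toy
# one-layer mean-aggregation GNN; all shapes/values are illustrative.
import torch

torch.manual_seed(0)
num_nodes, num_feats, num_classes = 6, 4, 3
edges = torch.tensor([[0, 1], [1, 2], [2, 3], [3, 4], [4, 5], [5, 0]])
X = torch.randn(num_nodes, num_feats)
W = torch.randn(num_feats, num_classes)  # stand-in for trained weights

def gnn_forward(X, edges, edge_weight):
    # Weighted mean aggregation over incoming edges, then a linear map.
    agg = torch.zeros_like(X)
    deg = torch.zeros(X.shape[0], 1)
    src, dst = edges[:, 0], edges[:, 1]
    agg.index_add_(0, dst, edge_weight.unsqueeze(1) * X[src])
    deg.index_add_(0, dst, edge_weight.unsqueeze(1))
    return (agg / deg.clamp(min=1e-6)) @ W

target = gnn_forward(X, edges, torch.ones(len(edges))).argmax(dim=1)

# Learn a soft edge mask that preserves predictions but stays sparse.
mask_logits = torch.zeros(len(edges), requires_grad=True)
opt = torch.optim.Adam([mask_logits], lr=0.1)
for _ in range(200):
    mask = torch.sigmoid(mask_logits)
    loss = (torch.nn.functional.cross_entropy(gnn_forward(X, edges, mask), target)
            + 0.05 * mask.sum())  # fidelity term + sparsity penalty
    opt.zero_grad(); loss.backward(); opt.step()

print("edge importances:", torch.sigmoid(mask_logits).detach().round(decimals=2))
```

The 2.0 agenda the tutorial describes goes beyond such one-off masks: keeping layer-wise provenance of the aggregation steps, and exposing the resulting explanations through queryable, user-facing interfaces.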
Democratizing RAGs with Structured Knowledge
Organizers: Yu Wang, Zhisheng Qi, Yongjia Lei, Haoyu Han, Harry Shomer, Kaize Ding, Yu Zhang, Ryan Rossi and Hui Liu
Abstract: Retrieving external knowledge to Augment the Generation of downstream task solutions (RAG) has become a standard practice in powering knowledge-intensive applications. However, real-world knowledge often manifests in heterogeneous yet distinctive structures (e.g., tabular schemas, social networked relations, and hierarchical document trees), whose effective modeling demands specialized techniques, practical engineering skills, and domain expertise. Meanwhile, the growing adoption of RAG systems in high-stakes scenarios underscores the need for rigorous safety considerations. Despite the importance of this structural perspective, the current landscape remains fragmented: concepts, techniques, and datasets are often defined in isolation across different knowledge structures. Moreover, few approaches adequately consider how structured knowledge shapes the safety of RAG systems. Against this backdrop, our tutorial offers a timely and distinctive structural perspective on RAG. We begin with an architectural overview of structured RAG systems across their full lifecycle, highlighting their canonical designs. We then examine how these design principles can be specialized for different knowledge structures, such as documents, networks, and tables, showcasing their unique applications and introducing complementary perspectives that balance utility and safety.
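The retrieval stage of a structured RAG pipeline can be illustrated with a toy example: rows of a table are linearized into text, indexed, and the top-k matches are assembled into the prompt for the generator. In the sketch below, TF-IDF similarity is a deliberate stand-in for a learned embedding model, and the table, query, and prompt template are all hypothetical.

```python
# Minimal sketch of the retrieval stage of a structured (tabular) RAG
# pipeline: rows are linearized to text, indexed, retrieved by
# similarity, and packed into a prompt. TF-IDF stands in for a learned
# embedder; the data and prompt template are illustrative.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

table = [
    {"city": "Paris", "country": "France", "population_m": 2.1},
    {"city": "Lyon", "country": "France", "population_m": 0.5},
    {"city": "Berlin", "country": "Germany", "population_m": 3.6},
]
# Linearize each row so a text retriever can index structured records.
rows = [", ".join(f"{k}: {v}" for k, v in r.items()) for r in table]

vectorizer = TfidfVectorizer()
index = vectorizer.fit_transform(rows)

def retrieve(query: str, k: int = 2) -> list[str]:
    sims = cosine_similarity(vectorizer.transform([query]), index)[0]
    return [rows[i] for i in sims.argsort()[::-1][:k]]

question = "What is the population of Berlin?"
context = "\n".join(retrieve(question))
prompt = f"Answer using only this context:\n{context}\n\nQ: {question}\nA:"
print(prompt)  # this prompt would then be sent to the generator LLM
```

How to linearize each knowledge structure (table rows here, but equally graph neighborhoods or document-tree paths), and what retrieved content is safe to place in the prompt, are exactly the design and safety questions this tutorial takes up.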
Uncertainty Quantification for Dynamical Networks
Organizers: Zhiqian Chen and Zonghan Zhang
Abstract: Dynamical networks are essential for understanding how network structures interact with the dynamic processes unfolding over them. For example, in adaptive social networks, individuals’ opinions influence and are influenced by their connections, leading to co-evolutionary patterns. Similarly, in neuroscience, the plasticity of neural networks dynamically reshapes their structure in response to activity. The topology of these networks profoundly affects their behavior, making its analysis critical for understanding stability, synchronization, and cascading failures. Uncertainty quantification (UQ) in dynamical networks addresses the challenges posed by incomplete or noisy knowledge of network structure, parameters, and external influences. For instance, fluctuating edge weights, evolving node connections, and stochastic interactions introduce uncertainties that affect predictions. In epidemiology, unknown contact patterns or varying transmission rates can significantly impact outbreak modeling, while in power systems, uncertainties in demand and renewable energy integration challenge reliability assessments. UQ systematically evaluates these uncertainties, offering techniques to quantify their effects and develop robust predictions. This half-day tutorial bridges the study of dynamical networks with the systematic framework of UQ, providing a comprehensive understanding of their interplay and practical applications. The tutorial begins with an introduction to dynamical networks, exploring their structural and behavioral characteristics through real-world examples in epidemiology, neuroscience, and engineering. It then transitions to UQ, covering foundational methods such as probabilistic simulations, sensitivity analysis, and stochastic modeling. Advanced topics, including machine learning-based surrogate modeling for computationally efficient UQ, will be discussed. The session concludes with an exploration of open challenges, such as integrating data-driven and physics-based models, and strategies for scaling UQ techniques to high-dimensional systems.
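As a taste of the probabilistic-simulation methods this tutorial covers, the sketch below runs a Monte Carlo ensemble of a discrete-time SIR epidemic on a random contact network: the transmission rate is treated as uncertain, sampled from a prior, and the resulting spread of outbreak sizes is summarized. The network model, parameter ranges, and prior are illustrative assumptions, not values from the tutorial itself.

```python
# Minimal sketch of Monte Carlo uncertainty quantification for an
# epidemic on a network: sample the uncertain transmission rate from a
# prior, simulate, and summarize outbreak-size uncertainty. Illustrative.
import numpy as np
import networkx as nx

rng = np.random.default_rng(0)
G = nx.erdos_renyi_graph(n=200, p=0.03, seed=0)

def sir_outbreak_size(beta: float, gamma: float = 0.2, steps: int = 100) -> int:
    """Discrete-time SIR on G; returns total nodes ever infected."""
    state = {v: "S" for v in G}  # states: S, I, R
    state[0] = "I"  # seed the outbreak at node 0
    for _ in range(steps):
        infected = [v for v in G if state[v] == "I"]
        if not infected:
            break
        for v in infected:
            for u in G.neighbors(v):
                if state[u] == "S" and rng.random() < beta:
                    state[u] = "I"
            if rng.random() < gamma:
                state[v] = "R"
    return sum(s != "S" for s in state.values())

# Uncertain transmission rate: sample a prior and propagate it forward.
betas = rng.uniform(0.02, 0.10, size=200)
sizes = np.array([sir_outbreak_size(b) for b in betas])
print(f"outbreak size: mean={sizes.mean():.1f}, "
      f"90% interval=[{np.percentile(sizes, 5):.0f}, {np.percentile(sizes, 95):.0f}]")
```

Brute-force ensembles like this become expensive on large, high-dimensional networks, which is precisely the motivation for the sensitivity-analysis and surrogate-modeling techniques covered later in the session.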