Publications | RespAI Lab

2025

OrgAccess: A Benchmark for Role Based Access Control in Organization Scale LLMs

Debdeep Sanyal, Umakanta Maharana, Yash Sinha, and 4 more authors

2025

Investigating Pedagogical Teacher and Student LLM Agents: Genetic Adaptation Meets Retrieval Augmented Generation Across Learning Style

Debdeep Sanyal, Agniva Maiti, Umakanta Maharana, and 4 more authors

In The 2025 Conference on Empirical Methods in Natural Language Processing, 2025

EMNLP Main CORE A*
Nine Ways to Break Copyright Law and Why Our LLM Won’t: A Fair Use Aligned Generation Framework

Aakash Sen Sharma, Debdeep Sanyal, Priyansh Srivastava, and 4 more authors

In The 2025 Conference on Empirical Methods in Natural Language Processing, 2025

The Findings of EMNLP CORE A*
Guardians of Generation: Dynamic Inference-Time Copyright Shielding with Adaptive Guidance for AI Image Generation

Soham Roy, Abhishek Mishra, Shirish Karande, and 1 more author

2025

U&ME Workshop, ICCV-2025

Modern text-to-image generative models can inadvertently reproduce copyrighted content memorized from their training data, raising serious concerns about potential copyright infringement. We introduce Guardians of Generation, a model-agnostic, inference-time framework for dynamic copyright shielding in AI image generation. Our approach requires no retraining or modification of the generative model weights and integrates with existing diffusion pipelines. It augments the generation process with an adaptive guidance mechanism comprising three components: a detection module, a prompt rewriting module, and a guidance adjustment module. The detection module monitors user prompts and intermediate generation steps to identify features indicative of copyrighted content before they manifest in the final output. If such content is detected, the prompt rewriting module transforms the user’s prompt by sanitizing or replacing references that could trigger copyrighted material while preserving the prompt’s intended semantics. The guidance adjustment module steers the diffusion process away from flagged content by modulating the model’s sampling trajectory. Together, these components form a robust shield that enables a tunable balance between preserving creative fidelity and ensuring copyright compliance. We validate our method on a variety of generative models such as Stable Diffusion, SDXL, and Flux, demonstrating substantial reductions in copyrighted content generation with negligible impact on output fidelity or alignment with user intent. This work provides a practical, plug-and-play safeguard for generative image models, enabling more responsible deployment under real-world copyright constraints.
Right Prediction, Wrong Reasoning: Uncovering LLM Misalignment in RA Disease Diagnosis

Umakanta Maharana, Sarthak Verma, Avarna Agarwal, and 4 more authors

2025

This paper examines cases in which an LLM reaches the right prediction in RA disease diagnosis through wrong reasoning, and what that misalignment implies for medical diagnostics.

ALU: Agentic LLM Unlearning

Debdeep Sanyal and Murari Mandal

2025

COLM-2025

Information removal or suppression in large language models (LLMs) is a desired functionality, useful in AI regulation, legal compliance, safety, and privacy. LLM unlearning methods aim to remove information on demand from LLMs. Current LLM unlearning methods struggle to balance unlearning efficacy and utility due to the competing nature of these objectives. Keeping the unlearning process computationally feasible without assuming access to the model weights is an overlooked area. We present the first agentic LLM unlearning (ALU) method, a multi-agent, retrain-free, model-agnostic approach to LLM unlearning that achieves effective unlearning while preserving utility. Our ALU framework unlearns by involving multiple LLM agents, each designed for a specific step in the unlearning process, without the need to update model weights for any of the agents in the framework. Users can easily request any set of unlearning instances in any sequence, and ALU seamlessly adapts in real time. This requires no changes to the underlying LLM. Through extensive experiments on established benchmarks (TOFU, WMDP, WPU) and jailbreaking techniques (many-shot, target masking, other languages), we demonstrate that ALU consistently stands out as the most robust LLM unlearning framework among current state-of-the-art methods while incurring a low constant-time cost. We further highlight ALU’s superior performance compared to existing methods when evaluated at scale. Specifically, ALU is assessed on up to 1000 unlearning targets, exceeding the evaluation scope of all previously proposed LLM unlearning methods.

2024

ConDa: Fast Federated Unlearning with Contribution Dampening

Vikram S Chundawat, Pushkar Niroula, Prasanna Dhungana, and 3 more authors

ArXiv, 2024

Federated learning (FL) has enabled collaborative model training across decentralized data sources or clients. While adding new participants to a shared model does not pose great technical hurdles, removing a participant and their related information from the shared model remains a challenge. To address this problem, federated unlearning has emerged as a critical research direction, seeking to remove information from globally trained models without harming the model performance on the remaining data. Most modern federated unlearning methods are costly: they either use the remaining clients' data to retrain the global model or require heavy computation on the client or server side. We introduce Contribution Dampening (ConDa), a framework that performs efficient unlearning by tracking the parameters through which each client affects the global model and applying synaptic dampening to the global model parameters that carry privacy-infringing contributions from the forgetting client. Our technique does not require clients' data or any kind of retraining, and it does not put any computational overhead on either the client or server side. We perform experiments on multiple datasets and demonstrate that ConDa is effective at forgetting a client’s data. In experiments conducted on the MNIST, CIFAR10, and CIFAR100 datasets, ConDa proves to be the fastest federated unlearning method, outperforming the nearest state-of-the-art approach by at least 100x. Our emphasis is on the non-IID federated learning setting, which presents the greatest challenge for unlearning. Additionally, we validate ConDa’s robustness through backdoor and membership inference attacks. We envision this work as a crucial component for FL in adhering to legal and ethical requirements.
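
The dampening step in the ConDa abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the function name `dampen_forget_client`, the per-client contribution arrays, the ratio threshold, and the dampening factor are all hypothetical. The sketch only shows the general idea of shrinking global parameters where the forgetting client's tracked contribution dominates that of the remaining clients.

```python
import numpy as np

def dampen_forget_client(global_params, contributions, forget_id,
                         ratio_threshold=1.0, strength=1.0):
    """Hypothetical sketch of contribution-based dampening.

    global_params: 1-D array of global model parameters.
    contributions: dict mapping client id -> array of tracked update
        magnitudes per parameter (an assumed bookkeeping format).
    forget_id: id of the client whose influence should be dampened.
    """
    params = np.asarray(global_params, dtype=float)
    forget = np.abs(np.asarray(contributions[forget_id], dtype=float))
    retain = [np.abs(np.asarray(c, dtype=float))
              for cid, c in contributions.items() if cid != forget_id]
    # Average contribution of the remaining clients (zero if there are none).
    retain_mean = np.mean(retain, axis=0) if retain else np.zeros_like(forget)

    # Select parameters where the forgetting client dominates.
    mask = forget > ratio_threshold * retain_mean
    # Factor in [0, 1]: smaller when the forgetting client's share is larger.
    factor = np.minimum(strength * retain_mean / (forget + 1e-12), 1.0)

    damped = params.copy()
    damped[mask] = params[mask] * factor[mask]
    return damped, mask

if __name__ == "__main__":
    params = np.array([1.0, 2.0, 3.0])
    contribs = {"a": [0.1, 0.1, 4.0], "b": [0.2, 0.2, 1.0], "c": [0.2, 0.2, 1.0]}
    damped, mask = dampen_forget_client(params, contribs, forget_id="a")
    print(damped, mask)
```

No client data and no retraining appear in the sketch, only stored contribution statistics, which matches the property the abstract emphasizes.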
UnStar: Unlearning with Self-Taught Anti-Sample Reasoning for LLMs

Yash Sinha, Murari Mandal, and Mohan S. Kankanhalli

2024

The key components of machine learning are data samples for training, a model for learning patterns, and a loss function for optimizing accuracy. Analogously, unlearning can potentially be achieved through anti-data samples (or anti-samples), an unlearning method, and a reversed loss function. While prior research has explored unlearning methods and reversed loss functions, the potential of anti-samples remains largely untapped. In this paper, we introduce UnSTAR: Unlearning with Self-Taught Anti-Sample Reasoning for large language models (LLMs). Our contributions are threefold: first, we propose a novel concept of anti-sample-induced unlearning; second, we generate anti-samples by leveraging misleading rationales, which help reverse learned associations and accelerate the unlearning process; and third, we enable fine-grained targeted unlearning, allowing for the selective removal of specific associations without impacting related knowledge, something not achievable by previous works. Results demonstrate that anti-samples offer an efficient, targeted unlearning strategy for LLMs, opening new avenues for privacy-preserving machine learning and model modification.
EcoVal: An Efficient Data Valuation Framework for Machine Learning

Ayush K Tarun, Vikram S Chundawat, Murari Mandal, and 3 more authors

2024

KDD-2024 Core A*

Quantifying the value of data within a machine learning workflow can play a pivotal role in making more strategic decisions in machine learning initiatives. The existing Shapley value based frameworks for data valuation in machine learning are computationally expensive, as they require a considerable amount of repeated model training to obtain the Shapley value. In this paper, we introduce EcoVal, an efficient data valuation framework that estimates the value of data for machine learning models in a fast and practical manner. Instead of working directly with individual data samples, we determine the value of a cluster of similar data points. This value is then propagated among all the member points of the cluster. We show that the overall data value can be determined by estimating the intrinsic and extrinsic value of each data point. This is enabled by formulating the performance of a model as a production function, a concept popularly used to estimate the amount of output based on factors like labor and capital in a traditional free economic market. We provide a formal proof of our valuation technique and elucidate the principles and mechanisms that enable its accelerated performance. We demonstrate the real-world applicability of our method by showcasing its effectiveness for both in-distribution and out-of-sample data. This work addresses one of the core challenges of efficient data valuation at scale in machine learning models.
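
For readers unfamiliar with the term used in the EcoVal abstract, a standard textbook example of a production function is the Cobb-Douglas form, where output Y depends on labor L and capital K, and A, α, and β are constants. It is given here only to illustrate the concept; the abstract does not state which functional form EcoVal uses.

```latex
Y = A \, L^{\alpha} K^{\beta}
```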
Multi-Modal Recommendation Unlearning for Legal, Licensing, and Modality Constraints

Yash Sinha, Murari Mandal, and Mohan Kankanhalli

2024

Unlearning methods for recommender systems (RS) have emerged to address privacy issues and concerns about legal compliance. However, evolving user preferences and content licensing issues remain unaddressed. This is particularly true in the case of multi-modal recommender systems (MMRS), which aim to accommodate the growing influence of multi-modal information on user preferences. Previous unlearning methods for RS are inapplicable to MMRS because the multi-modal user-item behavior data graph is incompatible with the matrix-based representation of RS. Partitioning-based methods degrade recommendation performance and incur significant overhead costs during aggregation. This paper introduces MMRecUN, a new framework for multi-modal recommendation unlearning, which, to the best of our knowledge, is the first attempt in this direction. Given the trained recommendation model and marked forget data, we devise a Reverse Bayesian Personalized Ranking (BPR) objective to force the model to forget it. MMRecUN employs both reverse and forward BPR loss mechanisms to selectively attenuate the impact of interactions within the forget set while concurrently reinforcing the significance of interactions within the retain set. Our experiments demonstrate that MMRecUN outperforms baseline methods across various unlearning requests when evaluated on benchmark multi-modal recommender datasets. MMRecUN achieves recall performance improvements of up to 49.85% compared to the baseline methods. It is up to 1.3× faster than the Gold model, which is trained on retain data from scratch. MMRecUN offers advantages such as superior performance in removing target elements, preservation of performance for retained elements, and zero overhead costs in comparison to previous methods.
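
The interplay of forward and reverse BPR losses described in the MMRecUN abstract can be sketched as follows. The forward loss is the standard BPR objective, the negative log-sigmoid of the score gap between an interacted and a non-interacted item. The reverse loss shown here, which simply flips that gap on the forget set, and the mixing weight `alpha` are assumptions for illustration; the abstract does not give the exact formulation, and the function names are hypothetical.

```python
import numpy as np

def log_sigmoid(x):
    # Numerically stable log(sigmoid(x)) = -log(1 + exp(-x)).
    return -np.logaddexp(0.0, -x)

def bpr_loss(pos_scores, neg_scores):
    # Standard BPR: push interacted items above non-interacted ones.
    return -np.mean(log_sigmoid(pos_scores - neg_scores))

def reverse_bpr_loss(pos_scores, neg_scores):
    # Assumed reverse form: penalize ranking forget-set items above negatives.
    return -np.mean(log_sigmoid(neg_scores - pos_scores))

def unlearning_loss(retain_pos, retain_neg, forget_pos, forget_neg, alpha=0.5):
    # Forward BPR reinforces the retain set; reverse BPR attenuates the forget set.
    return (alpha * bpr_loss(retain_pos, retain_neg)
            + (1.0 - alpha) * reverse_bpr_loss(forget_pos, forget_neg))

if __name__ == "__main__":
    retain_pos, retain_neg = np.array([2.0, 1.5]), np.array([0.1, -0.3])
    forget_pos, forget_neg = np.array([3.0]), np.array([0.0])
    print(unlearning_loss(retain_pos, retain_neg, forget_pos, forget_neg))
```

Minimizing the combined loss lowers the scores of forget-set interactions relative to negatives while keeping retain-set rankings intact, which is the trade-off the abstract describes.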
Distill to Delete: Unlearning in Graph Networks with Knowledge Distillation

Yash Sinha, Murari Mandal, and Mohan Kankanhalli

2024

Graph unlearning has emerged as a pivotal method to delete information from a pre-trained graph neural network (GNN). One may delete nodes, a class of nodes, edges, or a class of edges. An unlearning method enables the GNN model to comply with data protection regulations (i.e., the right to be forgotten), adapt to evolving data distributions, and reduce the GPU-hours carbon footprint by avoiding repetitive retraining. Existing partitioning and aggregation-based methods have limitations due to their poor handling of local graph dependencies and additional overhead costs. More recently, GNNDelete offered a model-agnostic approach that alleviates some of these issues. Our work takes a novel approach to these challenges in graph unlearning through knowledge distillation: distill to delete in GNN (D2DGN). It is a model-agnostic distillation framework where the complete graph knowledge is divided and marked for retention and deletion. It performs distillation with response-based soft targets and feature-based node embeddings while minimizing KL divergence. The unlearned model effectively removes the influence of deleted graph elements while preserving knowledge about the retained graph elements. D2DGN surpasses the performance of existing methods when evaluated on various real-world graph datasets by up to 43.1% (AUC) in edge and node unlearning tasks. Other notable advantages include better efficiency, better performance in removing target elements, preservation of performance for the retained elements, and zero overhead costs. Notably, our D2DGN surpasses the state-of-the-art GNNDelete in AUC by 2.4%, improves the membership inference ratio by +1.3, requires 10.2 × 10^6 fewer FLOPs per forward pass, and is up to 3.2× faster.
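
The response-based distillation with KL divergence mentioned in the D2DGN abstract can be illustrated with a generic knowledge-distillation sketch. It shows only the usual soft-target KL term between a teacher's and a student's output distributions. The temperature value, the function names, and the use of a single teacher are assumptions for illustration; how D2DGN splits knowledge into retained and deleted parts and how it treats node embeddings is not specified here.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax over the last axis, shifted for stability.
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kl_divergence(p, q, eps=1e-12):
    # Mean over rows of KL(p || q).
    return float(np.mean(np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)))

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    # Soft targets from the teacher guide the student (hypothetical helper).
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return kl_divergence(p, q)

if __name__ == "__main__":
    teacher = np.array([[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]])
    student = np.array([[1.0, 1.0, 0.0], [0.0, 0.0, 2.0]])
    print(distillation_loss(teacher, student))
```

In a retain/delete setup, one would plausibly apply such a term on retained elements against the original model and a different target on deleted elements, but that split is an assumption rather than a description of the paper.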
Unlearning or Concealment? A Critical Analysis and Evaluation Metrics for Unlearning in Diffusion Models

Aakash Sen Sharma, Niladri Sarkar, Vikram Chundawat, and 2 more authors

2024

A Unified Framework for Continual Learning and Machine Unlearning

Romit Chatterjee, Vikram Chundawat, Ayush Tarun, and 2 more authors

2024

2023

Fast yet effective machine unlearning

Ayush K Tarun, Vikram S Chundawat, Murari Mandal, and 1 more author

IEEE Transactions on Neural Networks and Learning Systems, 2023

IEEE TNNLS
Zero-Shot Machine Unlearning

Vikram S. Chundawat, Ayush K. Tarun, Murari Mandal, and 1 more author

IEEE Transactions on Information Forensics and Security, 2023

IEEE TIFS
Deep Regression Unlearning

Ayush Kumar Tarun, Vikram Singh Chundawat, Murari Mandal, and 1 more author

In Proceedings of the 40th International Conference on Machine Learning, 23–29 Jul 2023

ICML-2023 Core A*

With the introduction of data protection and privacy regulations, it has become crucial to remove the lineage of data on demand from a machine learning (ML) model. In the last few years, there have been notable developments in machine unlearning to remove the information of certain training data efficiently and effectively from ML models. In this work, we explore unlearning for the regression problem, particularly in deep learning models. Unlearning in classification and simple linear regression has been considerably investigated. However, unlearning in deep regression models remains largely unexplored. In this work, we introduce deep regression unlearning methods that generalize well and are robust to privacy attacks. We propose the Blindspot unlearning method, which uses a novel weight optimization process. A randomly initialized model partially exposed to the retain samples and a copy of the original model are used together to selectively imprint knowledge about the data that we wish to keep and scrub off the information of the data we wish to forget. We also propose a Gaussian fine-tuning method for regression unlearning. The existing unlearning metrics for classification are not directly applicable to regression unlearning. Therefore, we adapt these metrics for the regression setting. We conduct regression unlearning experiments for computer vision, natural language processing, and forecasting applications. Our methods show excellent performance for all these datasets across all the metrics. Source code: https://github.com/ayu987/deep-regression-unlearning
Can Bad Teaching Induce Forgetting? Unlearning in Deep Networks Using an Incompetent Teacher

Vikram S Chundawat, Ayush K Tarun, Murari Mandal, and 1 more author

Proceedings of the AAAI Conference on Artificial Intelligence, Jun 2023

AAAI-2023 Core A*
A universal metric for robust evaluation of synthetic tabular data

Vikram S Chundawat, Ayush K Tarun, Murari Mandal, and 2 more authors

IEEE Transactions on Artificial Intelligence, Jun 2023

IEEE-TAI