News
Date | Update |
---|---|
Feb 22, 2025 | 🎉 Two new papers are out! 🎯 "Soteria: Language-Specific Functional Parameter Steering for Multilingual Safety Alignment" and "MemeSense: An Adaptive In-Context Framework for Social Commonsense Driven Meme Moderation" |
Feb 13, 2025 | 🎉 Paper accepted at NAACL 2025 Industry Track! 🎯 "Breaking Boundaries: Investigating the Effects of Model Editing on Cross-linguistic Performance" |
Jan 23, 2025 | 🎉 Paper accepted at NAACL 2025 Main! 🎯 "Navigating the Cultural Kaleidoscope: A Hitchhiker's Guide to Sensitivity in Large Language Models" |
Dec 14, 2024 | 🎉 Paper accepted at AAAI 2025 AI Alignment Track! 🎯 "SafeInfer: Context Adaptive Decoding Time Safety Alignment for Large Language Models" |
Nov 15, 2024 | 🎉 Paper accepted at ICWSM 2025! 🎯 "How (un)ethical are instruction-centric responses of LLMs? Unveiling the vulnerabilities of safety guardrails to harmful queries" |
Oct 15, 2024 | 🎉 Paper accepted at EMNLP 2024 Industry Track! 🎯 "Context Matters: Pushing the Boundaries of Open-Ended Answer Generation with Graph-Structured Knowledge Context" |
Sep 15, 2024 | 🎉 Paper accepted at EMNLP 2024 Main! 🎯 "Safety Arithmetic: A Framework for Test-time Safety Alignment of Language Models by Steering Parameters and Activations" |