Unbraiding the bounce: superluminality around the corner DA Dobre, AV Frolov, JTG Ghersi, S Ramazanov, A Vikman Journal of Cosmology and Astroparticle Physics 2018 (03), 020, 2018 | 78 | 2018 |
Soft prompt threats: Attacking safety alignment and unlearning in open-source LLMs through the embedding space L Schwinn, D Dobre, S Xhonneux, G Gidel, S Günnemann Advances in Neural Information Processing Systems 37, 9086-9116, 2024 | 34 | 2024 |
Adversarial attacks and defenses in large language models: Old and new threats L Schwinn, D Dobre, S Günnemann, G Gidel Proceedings on, 103-117, 2023 | 32 | 2023 |
Clipped stochastic methods for variational inequalities with heavy-tailed noise E Gorbunov, M Danilova, D Dobre, P Dvurechenskii, A Gasnikov, G Gidel Advances in Neural Information Processing Systems 35, 31319-31332, 2022 | 31 | 2022 |
Echoes from the scattering of wavepackets on wormholes JTG Ghersi, AV Frolov, DA Dobre Classical and Quantum Gravity 36 (13), 135006, 2019 | 23 | 2019 |
Learning diverse attacks on large language models for robust red-teaming and safety tuning S Lee, M Kim, L Cherif, D Dobre, J Lee, SJ Hwang, K Kawaguchi, G Gidel, ... arXiv preprint arXiv:2405.18540, 2024 | 12 | 2024 |
Raising the bar for certified adversarial robustness with diffusion models T Altstidl, D Dobre, B Eskofier, G Gidel, L Schwinn arXiv preprint arXiv:2305.10388, 2023 | 8 | 2023 |
Sarah Frank-Wolfe: Methods for constrained optimization with best rates and practical features A Beznosikov, D Dobre, G Gidel arXiv preprint arXiv:2304.11737, 2023 | 8 | 2023 |
In-context learning can re-learn forbidden tasks S Xhonneux, D Dobre, J Tang, G Gidel, D Sridhar arXiv preprint arXiv:2402.05723, 2024 | 7 | 2024 |
Dissecting adaptive methods in GANs S Jelassi, D Dobre, A Mensch, Y Li, G Gidel arXiv preprint arXiv:2210.04319, 2022 | 5 | 2022 |
A generative approach to LLM harmfulness detection with special red flag tokens S Xhonneux, D Dobre, M Mohfakhami, L Schwinn, G Gidel arXiv preprint arXiv:2502.16366, 2025 | | 2025 |
On the Scalability of Certified Adversarial Robustness with Generated Data T Altstidl, D Dobre, A Kosmala, B Eskofier, G Gidel, L Schwinn Advances in Neural Information Processing Systems 37, 102255-102278, 2024 | | 2024 |
In-Context Learning, Can It Break Safety? S Xhonneux, D Dobre, M Noukhovitch, J Tang, G Gidel, D Sridhar ICML 2024 Next Generation of AI Safety Workshop | | |
Navigating the Impending Arms Race between Attacks and Defenses in LLMs L Schwinn, D Dobre, S Günnemann, G Gidel | | |