Yangsibo Huang
Verified email at google.com - Homepage
Title
Cited by
Year
Evaluating Gradient Inversion Attacks and Defenses in Federated Learning
Y Huang, S Gupta, Z Song, K Li, S Arora
NeurIPS 2021, 2021
Cited by 268 · 2021
Catastrophic Jailbreak of Open-Source LLMs via Exploiting Generation
Y Huang, S Gupta, M Xia, K Li, D Chen
ICLR 2024, 2024
Cited by 192 · 2024
Detecting pretraining data from large language models
W Shi, A Ajith, M Xia, Y Huang, D Liu, T Blevins, D Chen, L Zettlemoyer
ICLR 2024, 2024
Cited by 175 · 2024
Deep Q learning driven CT pancreas segmentation with geometry-aware U-Net
Y Man*, Y Huang*, J Feng, X Li, F Wu
IEEE Transactions on Medical Imaging, 2019
Cited by 168 · 2019
Instahide: Instance-hiding schemes for private distributed learning
Y Huang, Z Song, K Li, S Arora
ICML 2020, 2020
Cited by 163 · 2020
Recovering Private Text in Federated Learning of Language Models
S Gupta*, Y Huang*, Z Zhong, T Gao, K Li, D Chen
NeurIPS 2022, 2022
Cited by 77 · 2022
TextHide: Tackling Data Privacy in Language Understanding Tasks
Y Huang, Z Song, D Chen, K Li, S Arora
EMNLP 2020, 2020
Cited by 61 · 2020
Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications
B Wei*, K Huang*, Y Huang*, T Xie, X Qi, M Xia, P Mittal, M Wang, ...
ICML 2024, 2024
Cited by 53 · 2024
Advancing differential privacy: Where we are now and future directions for real-world deployment
R Cummings, D Desfontaines, D Evans, R Geambasu, Y Huang, ...
Harvard Data Science Review, 2024
Cited by 53* · 2024
DeepMC: a deep learning method for efficient Monte Carlo beamlet dose calculation by predictive denoising in magnetic resonance-guided radiotherapy
R Neph, Q Lyu, Y Huang, YM Yang, K Sheng
Physics in Medicine & Biology 66 (3), 035022, 2021
Cited by 43* · 2021
Privacy Implications of Retrieval-Based Language Models
Y Huang, S Gupta, Z Zhong, K Li, D Chen
EMNLP 2023, 2023
Cited by 28 · 2023
Privacy-Preserving Learning via Deep Net Pruning
Y Huang, Y Su, S Ravi, Z Song, S Arora, K Li
arXiv preprint arXiv:2003.01876, 2020
Cited by 27* · 2020
MUSE: Machine Unlearning Six-way Evaluation for Language Models
W Shi, J Lee, Y Huang, S Malladi, J Zhao, A Holtzman, D Liu, ...
arXiv preprint arXiv:2407.06460, 2024
Cited by 21 · 2024
A Safe Harbor for AI Evaluation and Red Teaming
S Longpre, S Kapoor, K Klyman, A Ramaswami, R Bommasani, ...
ICML 2024, 2024
Cited by 21 · 2024
A Dataset Auditing Method for Collaboratively Trained Machine Learning Models
Y Huang, CY Huang, X Li, K Li
IEEE Transactions on Medical Imaging, 2022
Cited by 15 · 2022
NN-Adapter: Efficient Domain Adaptation for Black-Box Language Models
Y Huang, D Liu, Z Zhong, W Shi, YT Lee
arXiv preprint arXiv:2302.10879, 2023
Cited by 14 · 2023
SORRY-bench: Systematically evaluating large language model safety refusal behaviors
T Xie, X Qi, Y Zeng, Y Huang, UM Sehwag, K Huang, L He, B Wei, D Li, ...
arXiv preprint arXiv:2406.14598, 2024
Cited by 13 · 2024
: Auditing Data Removal from Trained Models
Y Huang, X Li, K Li
International Conference on Medical Image Computing and Computer-Assisted …, 2021
Cited by 11 · 2021
Evaluating Copyright Takedown Methods for Language Models
B Wei, W Shi, Y Huang, NA Smith, C Zhang, L Zettlemoyer, K Li, ...
arXiv preprint arXiv:2406.18664, 2024
Cited by 9 · 2024
AI Risk Management Should Incorporate Both Safety and Security
X Qi, Y Huang, Y Zeng, E Debenedetti, J Geiping, L He, K Huang, ...
arXiv preprint arXiv:2405.19524, 2024
Cited by 9 · 2024
Articles 1–20