Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning | Y Zhai, H Bai, Z Lin, J Pan, S Tong, Y Zhou, A Suhr, S Xie, Y LeCun, Y Ma, ... | NeurIPS'24, 2024 | 51* | 2024 |
DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning | H Bai, Y Zhou, M Cemri, J Pan, A Suhr, S Levine, A Kumar | NeurIPS'24, 2024 | 33 | 2024 |
White-Box Transformers via Sparse Rate Reduction: Compression Is All There Is? | Y Yu, S Buchanan, D Pai, T Chu, Z Wu, S Tong, H Bai, Y Zhai, ... | JMLR (Long), 2023 | 13* | 2023 |
ICP algorithm: Theory, practice and its SLAM-oriented taxonomy | H Bai | CDS'22 (Oral), 2022 | 12 | 2022 |
CharmBana: Progressive Responses with Real-Time Internet Search for Knowledge-Powered Conversations | RG Reddy, S Suresh, H Bai, W Yao, MS Sidhu, K Aggarwal, P Sonawane, ... | WSDM'24, 2024 | 3* | 2024 |
Improving Neuron-level Interpretability with White-box Language Models | H Bai, Y Ma | CPAL 2025 (Oral), 2024 | 1 | 2024 |
Social Commonsense-Guided Search Query Generation for Open-Domain Knowledge-Powered Conversations | RG Reddy, H Bai, W Yao, SCE Suresh, H Ji, CX Zhai | EMNLP'23, 2023 | 1 | 2023 |
Digi-Q: Learning Q-Value Functions for Training Device-Control Agents | H Bai, Y Zhou, LE Li, S Levine, A Kumar | ICLR 2025, 2025 | | 2025 |