Hanlin Zhu
Verified email at berkeley.edu
Title
Cited by
Year
Starling-7B: Improving helpfulness and harmlessness with RLAIF
B Zhu, E Frick, T Wu, H Zhu, K Ganesan, WL Chiang, J Zhang, J Jiao
First Conference on Language Modeling, 2024
Cited by 116*, 2024
Guided dialog policy learning: Reward estimation for multi-domain task-oriented dialog
R Takanobu, H Zhu, M Huang
Conference on Empirical Methods in Natural Language Processing, 100-110, 2019
Cited by 101, 2019
Optimal conservative offline RL with general function approximation via augmented Lagrangian
P Rashidinejad, H Zhu, K Yang, S Russell, J Jiao
arXiv preprint arXiv:2211.00716, 2022
Cited by 45, 2022
Vector-matrix-vector queries for solving linear algebra, statistics, and graph problems
C Rashtchian, DP Woodruff, H Zhu
Approximation, Randomization, and Combinatorial Optimization. Algorithms and …, 2020
Cited by 38, 2020
Learning Personalized Alignment for Evaluating Open-ended Text Generation
D Wang, K Yang, H Zhu, X Yang, A Cohen, L Li, Y Tian
arXiv preprint arXiv:2310.03304, 2023
Cited by 21*, 2023
Importance weighted actor-critic for optimal conservative offline reinforcement learning
H Zhu, P Rashidinejad, J Jiao
Advances in Neural Information Processing Systems 36, 49579-49602, 2023
Cited by 18, 2023
Towards optimal statistical watermarking
B Huang, H Zhu, B Zhu, K Ramchandran, MI Jordan, JD Lee, J Jiao
arXiv preprint arXiv:2312.07930, 2023
Cited by 16, 2023
End-to-end story plot generator
H Zhu, A Cohen, D Wang, K Yang, X Yang, J Jiao, Y Tian
arXiv preprint arXiv:2310.08796, 2023
Cited by 10, 2023
Efficient prompt caching via embedding similarity
H Zhu, B Zhu, J Jiao
arXiv preprint arXiv:2402.01173, 2024
Cited by 7, 2024
On representation complexity of model-based and model-free reinforcement learning
H Zhu, B Huang, S Russell
arXiv preprint arXiv:2310.01706, 2023
Cited by 7, 2023
Towards a Theoretical Understanding of the 'Reversal Curse' via Training Dynamics
H Zhu, B Huang, S Zhang, M Jordan, J Jiao, Y Tian, SJ Russell
Advances in Neural Information Processing Systems 37, 90473-90513, 2024
Cited by 6, 2024
Average-case communication complexity of statistical problems
C Rashtchian, D Woodruff, P Ye, H Zhu
Conference on Learning Theory, 3859-3886, 2021
Cited by 6, 2021
Provably efficient offline goal-conditioned reinforcement learning with general function approximation and single-policy concentrability
H Zhu, A Zhang
Advances in Neural Information Processing Systems 36, 4177-4198, 2023
Cited by 5, 2023
Provably efficient reinforcement learning via surprise bound
H Zhu, R Wang, J Lee
International Conference on Artificial Intelligence and Statistics, 4006-4032, 2023
Cited by 5, 2023
How Do LLMs Perform Two-Hop Reasoning in Context?
T Guo, H Zhu, R Zhang, J Jiao, S Mei, MI Jordan, S Russell
arXiv preprint arXiv:2502.13913, 2025
2025
Token Assorted: Mixing Latent and Text Tokens for Improved Language Model Reasoning
DJ Su, H Zhu, Y Xu, J Jiao, Y Tian, Q Zheng
arXiv preprint arXiv:2502.03275, 2025
2025
Avoiding Catastrophe in Online Learning by Asking for Help
B Plaut, H Zhu, S Russell
arXiv preprint arXiv:2402.08062, 2024
2024