Nathan Lambert
Research Scientist, Allen AI
Verified email at allenai.org - Homepage
Title
Cited by
Year
Zephyr: Direct distillation of lm alignment
L Tunstall, E Beeching, N Lambert, N Rajani, K Rasul, Y Belkada, ...
arXiv preprint arXiv:2310.16944, 2023
Cited by 564, 2023
[Github] Diffusers: State-of-the-art diffusion models
P von Platen, S Patil, A Lozhkov, P Cuenca, N Lambert, K Rasul, ...
https://github.com/huggingface/diffusers, 2022
Cited by 490*, 2022
Open LLM Leaderboard
E Beeching, C Fourrier, N Habib, S Han, N Lambert, N Rajani, ...
https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard, 2023
Cited by 338, 2023
Olmo: Accelerating the science of language models
D Groeneveld, I Beltagy, P Walsh, A Bhagia, R Kinney, O Tafjord, AH Jha, ...
arXiv preprint arXiv:2402.00838, 2024
Cited by 280*, 2024
[Github] Trl: Transformer reinforcement learning
L von Werra, Y Belkada, L Tunstall, E Beeching, T Thrush, N Lambert
https://github.com/lvwerra/trl, 2020
Cited by 229*, 2020
Rewardbench: Evaluating reward models for language modeling
N Lambert, V Pyatkin, J Morrison, LJ Miranda, BY Lin, K Chandu, N Dziri, ...
arXiv preprint arXiv:2403.13787, 2024
Cited by 217, 2024
Dolma: An open corpus of three trillion tokens for language model pretraining research
L Soldaini, R Kinney, A Bhagia, D Schwenk, D Atkinson, R Authur, ...
arXiv preprint arXiv:2402.00159, 2024
Cited by 210*, 2024
Low Level Control of a Quadrotor with Deep Model-Based Reinforcement Learning
N Lambert, DS Drew, J Yaconelli, R Calandra, S Levine, KSJ Pister
IEEE Robotics and Automation Letters 4 (4), 4224-4230, 2019
Cited by 199, 2019
Camels in a changing climate: Enhancing lm adaptation with tulu 2
H Ivison, Y Wang, V Pyatkin, N Lambert, M Peters, P Dasigi, J Jang, ...
arXiv preprint arXiv:2311.10702, 2023
Cited by 183, 2023
On the importance of hyperparameter optimization for model-based reinforcement learning
B Zhang, R Rajan, L Pineda, N Lambert, A Biedenkapp, K Chua, F Hutter, ...
International Conference on Artificial Intelligence and Statistics, 4015-4023, 2021
Cited by 137, 2021
[Blog] Illustrating reinforcement learning from human feedback (RLHF)
N Lambert, L Castricato, L von Werra, A Havrilla
https://hf.co/blog/rlhf, 2022
Cited by 131*, 2022
Objective Mismatch in Model-based Reinforcement Learning
N Lambert, B Amos, O Yadan, R Calandra
Learning for Dynamics and Control (L4DC), 2020
Cited by 115, 2020
A survey on data selection for language models
A Albalak, Y Elazar, SM Xie, S Longpre, N Lambert, X Wang, ...
arXiv preprint arXiv:2402.16827, 2024
Cited by 98, 2024
Toward controlled flight of the ionocraft: a flying microrobot using electrohydrodynamic thrust with onboard sensing and no moving parts
D Drew, N Lambert, C Schindler, K Pister
IEEE Robotics and Automation Letters 3 (4), 2807-2813, 2018
Cited by 90, 2018
Molmo and pixmo: Open weights and open data for state-of-the-art multimodal models
M Deitke, C Clark, S Lee, R Tripathi, Y Yang, JS Park, M Salehi, ...
arXiv preprint arXiv:2409.17146, 2024
Cited by 88*, 2024
Tülu 3: Pushing Frontiers in Open Language Model Post-Training
N Lambert, J Morrison, V Pyatkin, S Huang, H Ivison, F Brahman, ...
arXiv preprint arXiv:2411.15124, 2024
Cited by 57*, 2024
The Alignment Handbook
L Tunstall, E Beeching, N Lambert, N Rajani, S Huang, K Rasul, ...
https://github.com/huggingface/alignment-handbook, 2023
Cited by 57, 2023
Mbrl-lib: A modular library for model-based reinforcement learning
L Pineda, B Amos, A Zhang, NO Lambert, R Calandra
arXiv preprint arXiv:2104.10159, 2021
Cited by 57, 2021
Learning generalizable locomotion skills with hierarchical reinforcement learning
T Li, N Lambert, R Calandra, F Meier, A Rai
IEEE International Conference on Robotics and Automation (ICRA), 413-419, 2020
Cited by 50, 2020
Wildguard: Open one-stop moderation tools for safety risks, jailbreaks, and refusals of llms
S Han, K Rao, A Ettinger, L Jiang, BY Lin, N Lambert, Y Choi, N Dziri
arXiv preprint arXiv:2406.18495, 2024
Cited by 46, 2024