Pierre Richemond
Google DeepMind
Verified email at deepmind.com
Title
Cited by
Year
Bootstrap your own latent: a new approach to self-supervised learning
JB Grill, F Strub, F Altché, C Tallec, P Richemond, E Buchatskaya, ...
Advances in neural information processing systems 33, 21271-21284, 2020
Cited by: 7552; Year: 2020
Bootstrap your own latent: a new approach to self-supervised learning
JB Grill, F Strub, F Altché, C Tallec, P Richemond, E Buchatskaya, ...
Advances in neural information processing systems 33, 21271-21284, 2020
Cited by: 530; Year: 2020
Data distributional properties drive emergent in-context learning in transformers
S Chan, A Santoro, A Lampinen, J Wang, A Singh, P Richemond, ...
Advances in neural information processing systems 35, 18878-18891, 2022
Cited by: 318; Year: 2022
BYOL works even without batch statistics
PH Richemond, JB Grill, F Altché, C Tallec, F Strub, A Brock, S Smith, ...
arXiv preprint arXiv:2010.10241, 2020
Cited by: 112; Year: 2020
Continuous diffusion for categorical data
S Dieleman, L Sartran, A Roshannai, N Savinov, Y Ganin, PH Richemond, ...
arXiv preprint arXiv:2211.15089, 2022
Cited by: 97; Year: 2022
Generalized preference optimization: A unified approach to offline alignment
Y Tang, ZD Guo, Z Zheng, D Calandriello, R Munos, M Rowland, ...
arXiv preprint arXiv:2402.05749, 2024
Cited by: 72; Year: 2024
Human alignment of large language models through online preference optimisation
D Calandriello, D Guo, R Munos, M Rowland, Y Tang, BA Pires, ...
arXiv preprint arXiv:2403.08635, 2024
Cited by: 34; Year: 2024
Understanding self-predictive learning for reinforcement learning
Y Tang, ZD Guo, PH Richemond, BA Pires, Y Chandak, R Munos, ...
International Conference on Machine Learning, 33632-33656, 2023
Cited by: 34; Year: 2023
Categorical SDEs with simplex diffusion
PH Richemond, S Dieleman, A Doucet
arXiv preprint arXiv:2210.14784, 2022
Cited by: 28; Year: 2022
On Wasserstein reinforcement learning and the Fokker-Planck equation
PH Richemond, B Maginnis
arXiv preprint arXiv:1712.07185, 2017
Cited by: 26; Year: 2017
Scaling instructable agents across many simulated worlds
MA Raad, A Ahuja, C Barros, F Besse, A Bolt, A Bolton, B Brownfield, ...
arXiv preprint arXiv:2404.10179, 2024
Cited by: 25; Year: 2024
Zipfian environments for reinforcement learning
SCY Chan, AK Lampinen, PH Richemond, F Hill
Conference on Lifelong Learning Agents, 406-429, 2022
Cited by: 17; Year: 2022
Offline regularised reinforcement learning for large language models alignment
PH Richemond, Y Tang, D Guo, D Calandriello, MG Azar, R Rafailov, ...
arXiv preprint arXiv:2405.19107, 2024
Cited by: 12; Year: 2024
SemPPL: Predicting pseudo-labels for better contrastive representations
M Bošnjak, PH Richemond, N Tomasev, F Strub, JC Walker, F Hill, ...
arXiv preprint arXiv:2301.05158, 2023
Cited by: 9; Year: 2023
The edge of orthogonality: A simple view of what makes BYOL tick
PH Richemond, A Tam, Y Tang, F Strub, B Piot, F Hill
International Conference on Machine Learning, 29063-29081, 2023
Cited by: 8; Year: 2023
Memory-efficient episodic control reinforcement learning with dynamic online k-means
A Agostinelli, K Arulkumaran, M Sarrico, P Richemond, AA Bharath
arXiv preprint arXiv:1911.09560, 2019
Cited by: 6; Year: 2019
Sample-efficient reinforcement learning with maximum entropy mellowmax episodic control
M Sarrico, K Arulkumaran, A Agostinelli, P Richemond, AA Bharath
arXiv preprint arXiv:1911.09615, 2019
Cited by: 5; Year: 2019
Combining learning rate decay and weight decay with complexity gradient descent-Part I
PH Richemond, Y Guo
arXiv preprint arXiv:1902.02881, 2019
Cited by: 5; Year: 2019
Scaling instructable agents across many simulated worlds
M Abi Raad, A Ahuja, C Barros, F Besse, A Bolt, A Bolton, B Brownfield, ...
arXiv e-prints, arXiv:2404.10179, 2024
Cited by: 4; Year: 2024
A short variational proof of equivalence between policy gradients and soft Q learning
PH Richemond, B Maginnis
arXiv preprint arXiv:1712.08650, 2017
Cited by: 4; Year: 2017