Michal Valko

Cited by

	All	Since 2019
Citations	14218	13369
h-index	46	41
i10-index	107	100

Citations per year

Year	Citations
2013	62
2014	61
2015	107
2016	141
2017	167
2018	200
2019	323
2020	608
2021	1374
2022	2676
2023	3420
2024	4861

Public access

53 articles available, 0 articles not available

Based on funding mandates

Co-authors

Rémi Munos, Google DeepMind (verified email at inria.fr)
Mohammad Gheshlaghi Azar, Cohere (verified email at cohere.com)
Bilal Piot, Google Deepmind (verified email at google.com)
Daniele Calandriello, Research Scientist, DeepMind (verified email at google.com)
Corentin Tallec, DeepMind (verified email at google.com)
Zhaohan Daniel Guo, DeepMind (verified email at google.com)
Jean-bastien Grill (verified email at google.com)
Pierre Ménard, OvGU Magdeburg (verified email at inria.fr)
Florent Altché, Research Engineer, DeepMind (verified email at google.com)
Alessandro Lazaric, Research Scientist, Facebook Artificial Intelligence Research (verified email at inria.fr)
Pierre Richemond, Google DeepMind (verified email at deepmind.com)
Yunhao Tang, Research Scientist, DeepMind (verified email at columbia.edu)
Emilie Kaufmann, CNRS & Univ. Lille (CRIStAL) (verified email at inria.fr)
Mark Rowland, Research Scientist, Google DeepMind (verified email at google.com)
Omar Darwiche Domingues, Cohere (verified email at cohere.com)
Branislav Kveton, Adobe Research (verified email at adobe.com)
Milos Hauskrecht, Professor of Computer Science, University of Pittsburgh (verified email at pitt.edu)
Matteo Pirotta, Research Scientist, Meta (FAIR) (verified email at fb.com)
Shantanu Thakoor, Research Engineer at DeepMind (verified email at google.com)
Carl Doersch, Google DeepMind (verified email at google.com)


Michal Valko

Llama @ Meta Paris & Inria & MVA - Ex: Gemini and BYOL @ Google DeepMind

Verified email at meta.com

fine-tuning LLMs, rl with human feedback, deep reinforcement learning


Title	Cited by	Year
Bootstrap your own latent: A new approach to self-supervised learning; JB Grill, F Strub, F Altché, C Tallec, PH Richemond, E Buchatskaya, ...; Neural Information Processing Systems, 2020	6924	2020
The llama 3 herd of models; A Dubey, A Jauhri, A Pandey, A Kadian, A Al-Dahle, A Letman, A Mathur, ...; arXiv preprint arXiv:2407.21783, 2024	1109	2024
Large-scale representation learning on graphs via bootstrapping; S Thakoor, C Tallec, MG Azar, R Munos, P Veličković, M Valko; International Conference on Learning Representations, 2022	469*	2022
A general theoretical paradigm to understand learning from human preferences; MG Azar, M Rowland, B Piot, D Guo, D Calandriello, M Valko, R Munos; International Conference on Artificial Intelligence and Statistics, 2024	298	2024
Finite-time analysis of kernelised contextual bandits; M Valko, N Korda, R Munos, I Flaounas, N Cristianini; Uncertainty in Artificial Intelligence, 2013	295	2013
Outlier detection for patient monitoring and alerting; M Hauskrecht, I Batal, M Valko, S Visweswaran, GF Cooper, G Clermont; Journal of Biomedical Informatics, 2013	179	2013
Online influence maximization under independent cascade model with semi-bandit feedback; Z Wen, B Kveton, M Valko, S Vaswani; Neural Information Processing Systems, 2017	155*	2017
Stochastic simultaneous optimistic optimization; M Valko, A Carpentier, R Munos; International Conference on Machine Learning, 2013	142	2013
Broaden your views for self-supervised video learning; A Recasens, P Luc, JB Alayrac, L Wang, F Strub, C Tallec, M Malinowski, ...; International Conference on Computer Vision, 2021	139	2021
Spectral bandits for smooth graph functions; M Valko, R Munos, B Kveton, T Kocák; International Conference on Machine Learning, 2014	137	2014
Efficient learning by implicit exploration in bandit problems with side observations; T Kocák, G Neu, M Valko, R Munos; Neural Information Processing Systems, 2014	134	2014
Episodic reinforcement learning in finite MDPs: Minimax lower bounds revisited; O Darwiche Domingues, P Ménard, E Kaufmann, M Valko; Algorithmic Learning Theory, 2021	127	2021
Black-box optimization of noisy functions with unknown smoothness; JB Grill, M Valko, R Munos; Neural Information Processing Systems, 2015	114	2015
Simple regret for infinitely many armed bandits; A Carpentier, M Valko; International Conference on Machine Learning, 2015	109	2015
Game Plan: What AI can do for Football, and What Football can do for AI; K Tuyls, S Omidshafiei, P Muller, Z Wang, J Connor, D Hennes, I Graham, ...; Journal of Artificial Intelligence Research 71, 41-88, 2021	108	2021
BYOL works even without batch statistics; PH Richemond, JB Grill, F Altché, C Tallec, F Strub, A Brock, S Smith, ...; NeurIPS 2020 Workshop: Self-Supervised Learning - Theory and Practice, 2020	107	2020
Gamification of pure exploration for linear bandits; R Degenne, P Ménard, X Shang, M Valko; International Conference on Machine Learning, 2020	98	2020
Adaptive reward-free exploration; E Kaufmann, P Ménard, OD Domingues, A Jonsson, E Leurent, M Valko; Algorithmic Learning Theory, 2021	95	2021
Gaussian process optimization with adaptive sketching: Scalable and no regret; D Calandriello, L Carratino, A Lazaric, M Valko, L Rosasco; Conference on Learning Theory, 2019	88	2019
Fast active learning for pure exploration in reinforcement learning; P Ménard, OD Domingues, A Jonsson, E Kaufmann, E Leurent, M Valko; International Conference on Machine Learning, 2021	87	2021
