Prototypical networks for few-shot learning J Snell, K Swersky, R Zemel Advances in neural information processing systems 30, 2017 | 9799 | 2017 |
Taking the human out of the loop: A review of Bayesian optimization B Shahriari, K Swersky, Z Wang, RP Adams, N De Freitas Proceedings of the IEEE 104 (1), 148-175, 2015 | 5751 | 2015 |
Big self-supervised models are strong semi-supervised learners T Chen, S Kornblith, K Swersky, M Norouzi, GE Hinton Advances in neural information processing systems 33, 22243-22255, 2020 | 2523 | 2020 |
Learning fair representations R Zemel, Y Wu, K Swersky, T Pitassi, C Dwork International conference on machine learning, 325-333, 2013 | 2231 | 2013 |
Neural networks for machine learning lecture 6a overview of mini-batch gradient descent G Hinton, N Srivastava, K Swersky Cited on 14 (8), 2, 2012 | 1674 | 2012 |
Meta-learning for semi-supervised few-shot classification M Ren, E Triantafillou, S Ravi, J Snell, K Swersky, JB Tenenbaum, ... arXiv preprint arXiv:1803.00676, 2018 | 1666 | 2018 |
Scalable Bayesian optimization using deep neural networks J Snoek, O Rippel, K Swersky, R Kiros, N Satish, N Sundaram, M Patwary, ... International conference on machine learning, 2171-2180, 2015 | 1336 | 2015 |
Generative moment matching networks Y Li, K Swersky, R Zemel International conference on machine learning, 1718-1727, 2015 | 1046 | 2015 |
Multi-task Bayesian optimization K Swersky, J Snoek, RP Adams Advances in neural information processing systems 26, 2013 | 941 | 2013 |
The variational fair autoencoder C Louizos, K Swersky, Y Li, M Welling, R Zemel arXiv preprint arXiv:1511.00830, 2015 | 752 | 2015 |
Meta-dataset: A dataset of datasets for learning to learn from few examples E Triantafillou, T Zhu, V Dumoulin, P Lamblin, U Evci, K Xu, R Goroshin, ... arXiv preprint arXiv:1903.03096, 2019 | 728 | 2019 |
Neural networks for machine learning G Hinton, N Srivastava, K Swersky Coursera, video lectures 264 (1), 2146-2153, 2012 | 728 | 2012 |
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context M Reid, N Savinov, D Teplyashin, D Lepikhin, T Lillicrap, J Alayrac, ... arXiv preprint arXiv:2403.05530, 2024 | 617 | 2024 |
Your classifier is secretly an energy based model and you should treat it like one W Grathwohl, KC Wang, JH Jacobsen, D Duvenaud, M Norouzi, ... arXiv preprint arXiv:1912.03263, 2019 | 613 | 2019 |
Predicting deep zero-shot convolutional neural networks using textual descriptions J Lei Ba, K Swersky, S Fidler Proceedings of the IEEE international conference on computer vision, 4247-4255, 2015 | 527 | 2015 |
Flexibly fair representation learning by disentanglement E Creager, D Madras, JH Jacobsen, M Weis, K Swersky, T Pitassi, ... International conference on machine learning, 1436-1445, 2019 | 404 | 2019 |
Freeze-thaw Bayesian optimization K Swersky, J Snoek, RP Adams arXiv preprint arXiv:1406.3896, 2014 | 325 | 2014 |
Two sides of the same coin: Heterophily and oversmoothing in graph convolutional neural networks Y Yan, M Hashemi, K Swersky, Y Yang, D Koutra 2022 IEEE International Conference on Data Mining (ICDM), 1287-1292, 2022 | 306 | 2022 |
Input warping for Bayesian optimization of non-stationary functions J Snoek, K Swersky, R Zemel, R Adams International conference on machine learning, 1674-1682, 2014 | 292 | 2014 |
Lecture 6a overview of mini–batch gradient descent G Hinton, N Srivastava, K Swersky Coursera Lecture slides https://class.coursera.org/neuralnets-2012-001 …, 2012 | 292 | 2012 |