Soham De
DeepMind
Verified email at google.com
Title
Cited by
Year
Gemma: Open models based on Gemini research and technology
G Team, T Mesnard, C Hardin, R Dadashi, S Bhupatiraju, S Pathak, ...
arXiv preprint arXiv:2403.08295, 2024
Cited by: 682 | Year: 2024
High-Performance Large-Scale Image Recognition Without Normalization
A Brock, S De, SL Smith, K Simonyan
International Conference on Machine Learning, 2021
Cited by: 634 | Year: 2021
Adversarial robustness through local linearization
C Qin, J Martens, S Gowal, D Krishnan, K Dvijotham, A Fawzi, S De, ...
Advances in Neural Information Processing Systems, 13847-13856, 2019
Cited by: 351 | Year: 2019
Training quantized nets: A deeper understanding
H Li*, S De*, Z Xu, C Studer, H Samet, T Goldstein
Advances in Neural Information Processing Systems, 5813-5823, 2017
Cited by: 250 | Year: 2017
Resurrecting Recurrent Neural Networks for Long Sequences
A Orvieto, SL Smith, A Gu, A Fernando, C Gulcehre, R Pascanu, S De
arXiv preprint arXiv:2303.06349, 2023
Cited by: 235 | Year: 2023
On the Origin of Implicit Regularization in Stochastic Gradient Descent
SL Smith, B Dherin, DGT Barrett, S De
International Conference on Learning Representations, 2021
Cited by: 216 | Year: 2021
Unlocking High-Accuracy Differentially Private Image Classification through Scale
S De, L Berrada, J Hayes, SL Smith, B Balle
ICML Workshop on Theory and Practice of Differential Privacy, 2022
Cited by: 205* | Year: 2022
Batch normalization biases residual blocks towards the identity function in deep networks
S De, S Smith
Advances in Neural Information Processing Systems 33, 2020
Cited by: 180* | Year: 2020
The loosening of American culture over 200 years is associated with a creativity–order trade-off
JC Jackson, M Gelfand, S De, A Fox
Nature Human Behaviour 3 (3), 244-250, 2019
Cited by: 170* | Year: 2019
Convergence guarantees for RMSProp and ADAM in non-convex optimization and an empirical comparison to Nesterov acceleration
S De, A Mukherjee, E Ullah
ICML Workshop on Modern Trends in Nonconvex Optimization for Machine Learning, 2018
Cited by: 163* | Year: 2018
Automated inference with adaptive batches
S De, A Yadav, D Jacobs, T Goldstein
Artificial Intelligence and Statistics, 1504-1513, 2017
Cited by: 156* | Year: 2017
Characterizing signal propagation to close the performance gap in unnormalized ResNets
A Brock, S De, SL Smith
International Conference on Learning Representations, 2021
Cited by: 140 | Year: 2021
BYOL works even without batch statistics
PH Richemond, JB Grill, F Altché, C Tallec, F Strub, A Brock, S Smith, ...
NeurIPS Workshop on Self-Supervised Learning: Theory and Practice, 2020
Cited by: 134* | Year: 2020
On the Generalization Benefit of Noise in Stochastic Gradient Descent
S Smith, E Elsen, S De
International Conference on Machine Learning, 9058-9067, 2020
Cited by: 130* | Year: 2020
The impact of neural network overparameterization on gradient confusion and stochastic gradient descent
KA Sankararaman*, S De*, Z Xu, WR Huang, T Goldstein
International Conference on Machine Learning, 8469-8479, 2020
Cited by: 119 | Year: 2020
Understanding norm change: An evolutionary game-theoretic approach
S De, DS Nau, MJ Gelfand
Proceedings of the 16th Conference on Autonomous Agents and MultiAgent …, 2017
Cited by: 77* | Year: 2017
Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models
S De, SL Smith, A Fernando, A Botev, G Cristian-Muraru, A Gu, R Haroun, ...
arXiv preprint arXiv:2402.19427, 2024
Cited by: 73* | Year: 2024
Layer-specific adaptive learning rates for deep networks
B Singh, S De, Y Zhang, T Goldstein, G Taylor
2015 IEEE 14th International Conference on Machine Learning and Applications …, 2015
Cited by: 70 | Year: 2015
Differentially Private Diffusion Models Generate Useful Synthetic Images
S Ghalebikesabi, L Berrada, S Gowal, I Ktena, R Stanforth, J Hayes, S De, ...
arXiv preprint arXiv:2302.13861, 2023
Cited by: 59 | Year: 2023
Efficient distributed SGD with variance reduction
S De, T Goldstein
2016 IEEE International Conference on Data Mining (ICDM), 2016
Cited by: 55* | Year: 2016