Implicit bias of SGD for diagonal linear networks: a provable benefit of stochasticity S Pesme, L Pillaud-Vivien, N Flammarion NeurIPS 2021, 2021 | 120 | 2021 |
(S)GD over Diagonal Linear Networks: Implicit Regularisation, Large Stepsizes and Edge of Stability M Even, S Pesme, S Gunasekar, N Flammarion NeurIPS 2023, 2023 | 47* | 2023 |
Saddle-to-Saddle Dynamics in Diagonal Linear Networks S Pesme, N Flammarion NeurIPS 2023, 2023 | 40 | 2023 |
Online robust regression via SGD on the ℓ1 loss S Pesme, N Flammarion NeurIPS 2020, 2020 | 40 | 2020 |
On convergence-diagnostic based step sizes for stochastic gradient descent S Pesme, A Dieuleveut, N Flammarion ICML 2020, 2020 | 26 | 2020 |
Leveraging Continuous Time to Understand Momentum When Training Diagonal Linear Networks HG Papazov, S Pesme, N Flammarion AISTATS 2024, 2024 | 8 | 2024 |
Implicit Bias of Mirror Flow on Separable Data S Pesme, RA Dragomir, N Flammarion NeurIPS 2024, 2024 | 2 | 2024 |
Deep learning theory through the lens of diagonal linear networks SW Pesme EPFL, 2024 | 1 | 2024 |