Devansh Arpit
Rashi.ai
Verified email at rashi.ai
Title
Cited by
Year
A closer look at memorization in deep networks
D Arpit, S Jastrzębski, N Ballas, D Krueger, E Bengio, MS Kanwal, ...
ICML 2017 (arXiv preprint arXiv:1706.05394), 2017
Cited by: 1782; Year: 2017
On the spectral bias of deep neural networks
N Rahaman, D Arpit, A Baratin, F Draxler, M Lin, FA Hamprecht, Y Bengio, ...
ICML 2019 (arXiv preprint arXiv:1806.08734), 2018
Cited by: 1089*; Year: 2018
Three factors influencing minima in SGD
S Jastrzębski, Z Kenton, D Arpit, N Ballas, A Fischer, Y Bengio, A Storkey
ICANN 2018 (arXiv preprint arXiv:1711.04623), 2017
Cited by: 484; Year: 2017
The Break-Even Point on Optimization Trajectories of Deep Neural Networks
S Jastrzebski, M Szymczak, S Fort, D Arpit, J Tabor, K Cho, K Geras
ICLR 2020 (arXiv preprint arXiv:2002.09572), 2020
Cited by: 142; Year: 2020
Normalization propagation: A parametric technique for removing internal covariate shift in deep networks
D Arpit, Y Zhou, BU Kota, V Govindaraju
ICML 2016 (arXiv preprint arXiv:1603.01431), 2016
Cited by: 140; Year: 2016
Residual connections encourage iterative inference
S Jastrzebski, D Arpit, N Ballas, V Verma, T Che, Y Bengio
ICLR 2018 (arXiv preprint arXiv:1710.04773), 2017
Cited by: 128; Year: 2017
A walk with SGD
C Xing, D Arpit, C Tsirigotis, Y Bengio
arXiv preprint arXiv:1802.08770, 2018
Cited by: 111; Year: 2018
Why regularized auto-encoders learn sparse representation?
D Arpit, Y Zhou, H Ngo, V Govindaraju
ICML 2016 (arXiv preprint arXiv:1505.05561), 2015
Cited by: 90; Year: 2015
Ensemble of averages: Improving model selection and boosting performance in domain generalization
D Arpit, H Wang, Y Zhou, C Xiong
NeurIPS 2022, 2021
Cited by: 86; Year: 2021
Deep Nets Don't Learn via Memorization
D Krueger, N Ballas, S Jastrzebski, D Arpit, MS Kanwal, T Maharaj, ...
ICLR 2017 Workshop, 2017
Cited by: 65; Year: 2017
Fraternal Dropout
K Zolna, D Arpit, D Suhubdy, Y Bengio
ICLR 2018 (arXiv preprint arXiv:1711.00066), 2017
Cited by: 60; Year: 2017
How to Initialize your Network? Robust Initialization for WeightNorm & ResNets
D Arpit, V Campos, Y Bengio
NeurIPS 2019, 2019
Cited by: 51; Year: 2019
Catastrophic Fisher Explosion: Early Phase Fisher Matrix Impacts Generalization
S Jastrzebski, D Arpit, O Astrand, G Kerg, H Wang, C Xiong, R Socher, ...
ICML 2021, 2020
Cited by: 47; Year: 2020
h-detach: Modifying the LSTM Gradient Towards Better Optimization
D Arpit, B Kanuparthi, G Kerg, NR Ke, I Mitliagkas, Y Bengio
ICLR 2019 (arXiv preprint arXiv:1810.03023), 2018
Cited by: 44; Year: 2018
Variational Bi-LSTMs
S Shabanian, D Arpit, A Trischler, Y Bengio
arXiv preprint arXiv:1711.05717, 2017
Cited by: 41; Year: 2017
Is joint training better for deep auto-encoders?
Y Zhou, D Arpit, I Nwogu, V Govindaraju
arXiv preprint arXiv:1405.1380, 2014
Cited by: 40; Year: 2014
BOLAA: Benchmarking and orchestrating LLM-augmented autonomous agents
Z Liu, W Yao, J Zhang, L Xue, S Heinecke, R Murthy, Y Feng, Z Chen, ...
arXiv preprint arXiv:2308.05960, 2023
Cited by: 33; Year: 2023
Finding Flatter Minima with SGD
S Jastrzębski, Z Kenton, D Arpit, N Ballas, A Fischer, Y Bengio, A Storkey
ICLR 2018 Workshop, 2018
Cited by: 33; Year: 2018
The benefits of over-parameterization at initialization in deep ReLU networks
D Arpit, Y Bengio
arXiv preprint arXiv:1901.03611, 2019
Cited by: 32; Year: 2019
Merlion: A machine learning library for time series
A Bhatnagar, P Kassianik, C Liu, T Lan, W Yang, R Cassius, D Sahoo, ...
arXiv preprint arXiv:2109.09265, 2021
Cited by: 26; Year: 2021