Risk and parameter convergence of logistic regression
Z Ji, M Telgarsky
arXiv preprint arXiv:1803.07300, 2018
Cited by: 299* | Year: 2018

Gradient descent aligns the layers of deep linear networks
Z Ji, M Telgarsky
arXiv preprint arXiv:1810.02032, 2018
Cited by: 220 | Year: 2018

Polylogarithmic width suffices for gradient descent to achieve arbitrarily small test error with shallow relu networks
Z Ji, M Telgarsky
arXiv preprint arXiv:1909.12292, 2019
Cited by: 179 | Year: 2019

Directional convergence and alignment in deep learning
Z Ji, M Telgarsky
Advances in Neural Information Processing Systems 33, 17176-17186, 2020
Cited by: 144 | Year: 2020

Characterizing the implicit bias via a primal-dual analysis
Z Ji, M Telgarsky
Algorithmic Learning Theory, 772-804, 2021
Cited by: 64* | Year: 2021

Gradient descent follows the regularization path for general losses
Z Ji, M Dudík, RE Schapire, M Telgarsky
Conference on Learning Theory, 2109-2136, 2020
Cited by: 54 | Year: 2020

Neural tangent kernels, transportation mappings, and universal approximation
Z Ji, M Telgarsky, R Xian
arXiv preprint arXiv:1910.06956, 2019
Cited by: 47 | Year: 2019

Early-stopped neural networks are consistent
Z Ji, J Li, M Telgarsky
Advances in Neural Information Processing Systems 34, 1805-1817, 2021
Cited by: 32 | Year: 2021

Generalization bounds via distillation
D Hsu, Z Ji, M Telgarsky, L Wang
arXiv preprint arXiv:2104.05641, 2021
Cited by: 29 | Year: 2021

Fast margin maximization via dual acceleration
Z Ji, N Srebro, M Telgarsky
International Conference on Machine Learning, 4860-4869, 2021
Cited by: 28 | Year: 2021

Reproducibility in optimization: Theoretical framework and limits
K Ahn, P Jain, Z Ji, S Kale, P Netrapalli, GI Shamir
Advances in Neural Information Processing Systems 35, 18022-18033, 2022
Cited by: 13 | Year: 2022

Actor-critic is implicitly biased towards high entropy optimal policies
Y Hu, Z Ji, M Telgarsky
arXiv preprint arXiv:2110.11280, 2021
Cited by: 13 | Year: 2021

Approximation power of random neural networks
B Bailey, Z Ji, M Telgarsky, R Xian
arXiv preprint arXiv:1906.07709, 2019
Cited by: 7 | Year: 2019

Think before you speak: Training language models with pause tokens
S Goyal, Z Ji, AS Rawat, AK Menon, S Kumar, V Nagarajan
arXiv preprint arXiv:2310.02226, 2023
Cited by: 6 | Year: 2023

Agnostic learnability of halfspaces via logistic loss
Z Ji, K Ahn, P Awasthi, S Kale, S Karp
International Conference on Machine Learning, 10068-10103, 2022
Cited by: 6 | Year: 2022

Social welfare and profit maximization from revealed preferences
Z Ji, R Mehta, M Telgarsky
International Conference on Web and Internet Economics, 264-281, 2018
Cited by: 6 | Year: 2018

Wikidata Vandalism Detection-The Loganberry Vandalism Detector at WSDM Cup 2017
Q Zhu, H Ng, L Liu, Z Ji, B Jiang, J Shen, H Gui
arXiv preprint arXiv:1712.06922, 2017
Cited by: 6 | Year: 2017

Depth Dependence of μP Learning Rates in ReLU MLPs
S Jelassi, B Hanin, Z Ji, SJ Reddi, S Bhojanapalli, S Kumar
arXiv preprint arXiv:2305.07810, 2023
Cited by: 4 | Year: 2023

Convex analysis at infinity: An introduction to astral space
M Dudík, RE Schapire, M Telgarsky
arXiv preprint arXiv:2205.03260, 2022
Cited by: 2 | Year: 2022

The implicit bias of gradient descent: from linear classifiers to deep networks
Z Ji
University of Illinois at Urbana-Champaign, 2022
Year: 2022