Zachary Kenton
Google DeepMind
Verified email at google.com
Title
Cited by
Year
Ethical and social risks of harm from language models
L Weidinger, J Mellor, M Rauh, C Griffin, J Uesato, PS Huang, M Cheng, ...
arXiv preprint arXiv:2112.04359, 2021
Cited by: 915; Year: 2021
Taxonomy of risks posed by language models
L Weidinger, J Uesato, M Rauh, C Griffin, PS Huang, J Mellor, A Glaese, ...
Proceedings of the 2022 ACM Conference on Fairness, Accountability, and …, 2022
Cited by: 552; Year: 2022
Three factors influencing minima in SGD
S Jastrzębski, Z Kenton, D Arpit, N Ballas, A Fischer, Y Bengio, A Storkey
arXiv preprint arXiv:1711.04623, 2017
Cited by: 520; Year: 2017
Alignment of language agents
Z Kenton, T Everitt, L Weidinger, I Gabriel, V Mikulik, G Irving
arXiv preprint arXiv:2103.14659, 2021
Cited by: 158; Year: 2021
On the relation between the sharpest directions of DNN loss and the SGD step length
S Jastrzębski, Z Kenton, N Ballas, A Fischer, Y Bengio, A Storkey
arXiv preprint arXiv:1807.05031, 2018
Cited by: 127; Year: 2018
A systematic comparison of Bayesian deep learning robustness in diabetic retinopathy tasks
A Filos, S Farquhar, AN Gomez, TGJ Rudner, Z Kenton, L Smith, ...
arXiv preprint arXiv:1912.10481, 2019
Cited by: 124; Year: 2019
Specification gaming: the flip side of AI ingenuity
V Krakovna, J Uesato, V Mikulik, M Rahtz, T Everitt, R Kumar, Z Kenton, ...
DeepMind Blog 3, 2020
Cited by: 110; Year: 2020
Imitating interactive intelligence
J Abramson, A Ahuja, I Barr, A Brussee, F Carnevale, M Cassin, ...
arXiv preprint arXiv:2012.05672, 2020
Cited by: 71; Year: 2020
Ethical and social risks of harm from language models. arXiv
L Weidinger, J Mellor, M Rauh, C Griffin, J Uesato, PS Huang, M Cheng, ...
arXiv preprint arXiv:2112.04359 10, 2021
Cited by: 65; Year: 2021
Goal misgeneralization: Why correct specifications aren't enough for correct goals
R Shah, V Varma, R Kumar, M Phuong, V Krakovna, J Uesato, Z Kenton
arXiv preprint arXiv:2210.01790, 2022
Cited by: 57; Year: 2022
Explaining grokking through circuit efficiency
V Varma, R Shah, Z Kenton, J Kramár, R Kumar
arXiv preprint arXiv:2309.02390, 2023
Cited by: 40; Year: 2023
The squeezed limit of the bispectrum in multi-field inflation
Z Kenton, DJ Mulryne
Journal of Cosmology and Astroparticle Physics 2015 (10), 018, 2015
Cited by: 39; Year: 2015
Finding flatter minima with SGD
S Jastrzębski, Z Kenton, D Arpit, N Ballas, A Fischer, Y Bengio, A Storkey
Cited by: 38; Year: 2018
D-brane potentials in the warped resolved conifold and natural inflation
Z Kenton, S Thomas
Journal of High Energy Physics 2015 (2), 1-42, 2015
Cited by: 38; Year: 2015
Discovering agents
Z Kenton, R Kumar, S Farquhar, J Richens, M MacDermott, T Everitt
Artificial Intelligence 322, 103963, 2023
Cited by: 31; Year: 2023
The ethics of advanced AI assistants
I Gabriel, A Manzini, G Keeling, LA Hendricks, V Rieser, H Iqbal, ...
arXiv preprint arXiv:2404.16244, 2024
Cited by: 29; Year: 2024
Width of minima reached by stochastic gradient descent is influenced by learning rate to batch size ratio
S Jastrzębski, Z Kenton, D Arpit, N Ballas, A Fischer, Y Bengio, A Storkey
Artificial Neural Networks and Machine Learning–ICANN 2018: 27th …, 2018
Cited by: 29; Year: 2018
Generalizing from a few environments in safety-critical reinforcement learning
Z Kenton, A Filos, Y Gal, O Evans
Safe Machine Learning workshop at ICLR, 2019
Cited by: 25*; Year: 2019
Benchmarking Bayesian deep learning with diabetic retinopathy diagnosis
A Filos, S Farquhar, AN Gomez, TGJ Rudner, Z Kenton, L Smith, ...
Preprint at https://arxiv.org/abs/1912.10481, 2019
Cited by: 23; Year: 2019
The separate universe approach to soft limits
Z Kenton, DJ Mulryne
Journal of Cosmology and Astroparticle Physics 2016 (10), 035, 2016
Cited by: 22; Year: 2016