Scalable agent alignment via reward modeling: a research direction J Leike, D Krueger, T Everitt, M Martic, V Maini, S Legg arXiv preprint arXiv:1811.07871, 2018 | 370 | 2018 |
Reducing sentiment bias in language models via counterfactual evaluation PS Huang, H Zhang, R Jiang, R Stanforth, J Welbl, J Rae, V Maini, ... arXiv preprint arXiv:1911.03064, 2019 | 217 | 2019 |
Machine learning for humans V Maini, S Sabri 2017 | 99 | 2017 |
Building safe artificial intelligence: specification, robustness, and assurance PA Ortega, V Maini, DMS Team DeepMind Safety Research Blog, 2018 | 44 | 2018 |
Machine Learning for Humans, Part 5: Reinforcement Learning V Maini URL: https://medium.com/machine-learning-for-humans/reinforcement-learning …, 2017 | 5 | 2017 |
Machine Learning For Humans (6 X 9): Introduction to Machine Learning with Python V Maini, S Sabri Alanna Maldonado, 2023 | 2 | 2023 |