Minqi Jiang

Research Scientist, Google DeepMind

The rapid rise of computational power allows ever more capable AI agents to be trained in simulation. A simulator, of course, fully reflects neither reality nor human preferences. How can AI agents learn useful, human-aligned behaviors in simulation that transfer to new settings and people?

I consider this question through the lens of generalization, human-AI coordination, and open-ended learning, as part of the Autonomous Assistants team at Google DeepMind.

News

Dec 2023: Co-organized the 2nd Workshop on Agent Learning in Open Endedness (ALOE) at NeurIPS 2023 🌱, seeking to bridge ideas of open-ended evolution in ALife with self-supervised machine learning. The event was a lot of fun and drew out a special community of researchers.

Dec 2023: Joined DeepMind as a Research Scientist.

Nov 2023: Released minimax, a library for rapid experimentation with autocurricula methods for RL in JAX, including new parallelized and multi-GPU/TPU versions of PLR and ACCEL.

Sep 2023: Became a Doctor (of computers). You can find my dissertation on arXiv.

Select works

Learning Curricula in Open-Ended Worlds
M Jiang
PhD Dissertation, 2023
[paper]
minimax: Efficient Baselines for Autocurricula in JAX
M Jiang, M Dennis, E Grefenstette, T Rocktäschel
ALOE, 2023
[paper, code, tl;dr]
General Intelligence Requires Rethinking Exploration
M Jiang, T Rocktäschel, E Grefenstette
Royal Society Open Science, 2023
[paper, tl;dr, blog]
MAESTRO: Open-Ended Environment Design for Multi-Agent Reinforcement Learning
M Samvelyan, A Khan, M Dennis, M Jiang, J Parker-Holder, R Raileanu, J Foerster, T Rocktäschel
ICLR, 2023
[paper, tl;dr]
Grounding Aleatoric Uncertainty in Unsupervised Environment Design
M Jiang, M Dennis, J Parker-Holder, A Lupu, H Küttler, E Grefenstette, T Rocktäschel, J Foerster
NeurIPS, 2022
[paper, tl;dr]
Evolving Curricula with Regret-Based Environment Design
J Parker-Holder*, M Jiang*, M Dennis, M Samvelyan, J Foerster, E Grefenstette, T Rocktäschel (*Equal contribution)
ICML, 2022
[paper, code, tl;dr, demo discussion, video explainer, interview]
Replay-Guided Adversarial Environment Design
M Jiang*, M Dennis*, J Parker-Holder, J Foerster, E Grefenstette, T Rocktäschel (*Equal contribution)
NeurIPS, 2021
[paper, code, tl;dr]
Prioritized Level Replay
M Jiang, E Grefenstette, T Rocktäschel
ICML, 2021
↳ Key component of current state-of-the-art on OpenAI Procgen
[paper, code, tl;dr]