Outracing Champion Gran Turismo Drivers with Deep Reinforcement Learning

Peter R. Wurman, Samuel Barrett, Kenta Kawamoto, James MacGlashan, Kaushik Subramanian, Thomas J. Walsh, Roberto Capobianco, Alisa Devlic, Franziska Eckert, Florian Fuchs, Leilani Gilpin, Varun Kompella, Piyush Khandelwal, HaoChih Lin, Patrick MacAlpine, Declan Oller, Craig Sherstan, Takuma Seno, Michael D. Thomure, Houmehr Aghabozorgi, Leon Barrett, Rory Douglas, Dion Whitehead, Peter Duerr, Peter Stone, Michael Spranger, Hiroaki Kitano: Outracing Champion Gran Turismo Drivers with Deep Reinforcement Learning. In: Nature, vol. 602, iss. 7896, pp. 223–229, 2022.

Abstract

Many potential applications of artificial intelligence
involve making real-time decisions in physical systems
while interacting with humans. Automobile racing
represents an extreme example of these conditions; drivers
must execute complex tactical manoeuvres to pass or block
opponents while operating their vehicles at their traction
limits. Racing simulations, such as the PlayStation game
Gran Turismo, faithfully reproduce the non-linear control
challenges of real race cars while also encapsulating the
complex multi-agent interactions. Here we describe how we
trained agents for Gran Turismo that can compete with the
world's best e-sports drivers. We combine
state-of-the-art, model-free, deep reinforcement learning
algorithms with mixed-scenario training to learn an
integrated control policy that combines exceptional speed
with impressive tactics. In addition, we construct a
reward function that enables the agent to be competitive
while adhering to racing's important, but under-specified,
sportsmanship rules. We demonstrate the capabilities of
our agent, Gran Turismo Sophy, by winning a head-to-head
competition against four of the world's best Gran Turismo
drivers. By describing how we trained championship-level
racers, we demonstrate the possibilities and challenges of
using these techniques to control complex dynamical
systems in domains where agents must respect imprecisely
defined human norms.

BibTeX

@article{nature22,
title = {Outracing Champion Gran Turismo Drivers with Deep Reinforcement Learning},
author = {Peter R. Wurman and Samuel Barrett and Kenta Kawamoto and James MacGlashan and Kaushik Subramanian and Thomas J. Walsh and Roberto Capobianco and Alisa Devlic and Franziska Eckert and Florian Fuchs and Leilani Gilpin and Varun Kompella and Piyush Khandelwal and HaoChih Lin and Patrick MacAlpine and Declan Oller and Craig Sherstan and Takuma Seno and Michael D. Thomure and Houmehr Aghabozorgi and Leon Barrett and Rory Douglas and Dion Whitehead and Peter Duerr and Peter Stone and Michael Spranger and Hiroaki Kitano},
doi = {10.1038/s41586-021-04357-7},
year  = {2022},
date = {2022-02-10},
urldate = {2022-02-10},
journal = {Nature},
volume = {602},
issue = {7896},
pages = {223--229},
abstract = {Many potential applications of artificial intelligence 
 involve making real-time decisions in physical systems 
 while interacting with humans. Automobile racing 
 represents an extreme example of these conditions; drivers 
 must execute complex tactical manoeuvres to pass or block 
 opponents while operating their vehicles at their traction 
 limits. Racing simulations, such as the PlayStation game 
 Gran Turismo, faithfully reproduce the non-linear control 
 challenges of real race cars while also encapsulating the 
 complex multi-agent interactions. Here we describe how we 
 trained agents for Gran Turismo that can compete with the 
 world's best e-sports drivers. We combine 
 state-of-the-art, model-free, deep reinforcement learning 
 algorithms with mixed-scenario training to learn an 
 integrated control policy that combines exceptional speed 
 with impressive tactics. In addition, we construct a 
 reward function that enables the agent to be competitive 
 while adhering to racing's important, but under-specified, 
 sportsmanship rules. We demonstrate the capabilities of 
 our agent, Gran Turismo Sophy, by winning a head-to-head 
 competition against four of the world's best Gran Turismo 
 drivers. By describing how we trained championship-level 
 racers, we demonstrate the possibilities and challenges of 
 using these techniques to control complex dynamical 
 systems in domains where agents must respect imprecisely 
 defined human norms.},
keywords = {journal},
pubstate = {published},
tppubtype = {article}
}