Business Intelligence Info

Tencent details how its MOBA-playing AI system beats 99.81% of human opponents

December 25, 2019   Big Data

In August, Tencent announced it had developed an AI system capable of defeating teams of pros in a five-on-five match in Honor of Kings (or Arena of Valor, depending on the region). This was a noteworthy achievement: Honor of Kings occupies the video game subgenre known as multiplayer online battle arena games (MOBAs), which are incomplete information games in the sense that players are unaware of the actions other players choose. The endgame, then, isn’t merely AI that achieves superhuman Honor of Kings performance, but insights that might be used to develop systems capable of solving some of society’s toughest challenges.

A paper published this week peels back the layers of Tencent’s technique, which the coauthors describe as “highly scalable.” They claim its novel strategies enable it to explore the game map “efficiently,” with an actor-critic architecture that self-improves over time.

As the researchers point out, real-time strategy games like Honor of Kings require highly complex action control compared with traditional board games and Atari games. Their environments also tend to be more complicated (Honor of Kings has 10^600 possible states and 10^18,000 possible actions) and the objectives more complex on the whole. Agents must not only learn to plan, attack, and defend but also to control skill combos and to induce and deceive opponents, all while contending with hazards like creeps and fully automated turrets.

Tencent’s architecture consists of four modules: Reinforcement Learning (RL) Learner, Artificial Intelligence (AI) Server, Dispatch Module, and Memory Pool.

The AI Server — which runs on a single processor core, thanks to some clever compression — dictates how the AI model interacts with objects in the game environment. It generates episodes via self-play, and, based on the features it extracts from the game state, it predicts players’ actions and forwards them to the game core for execution. The game core then returns the next state and the corresponding reward value, or the value that spurs the model toward certain Honor of Kings goals.

As for the Dispatch Module, it’s bundled with several AI Servers on the same machine, and it collects data samples consisting of rewards, features, action probabilities, and more before compressing and sending them to Memory Pools. The Memory Pool — which is also a server — supports samples of various lengths and data sampling based on the generated time, and it implements a circular queue structure that performs storage operations in a data-efficient fashion.
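The circular queue idea can be illustrated with a minimal sketch. This is not Tencent’s implementation: the class name `CircularMemoryPool`, its methods, and the uniform sampling are assumptions made for illustration, and the time-based sampling the article mentions is left out.

```python
import random
from typing import Any, List, Optional


class CircularMemoryPool:
    """Hypothetical fixed-capacity ring buffer: new samples overwrite the oldest."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self.slots: List[Optional[Any]] = [None] * capacity
        self.write_index = 0  # next slot to write (and, once full, to overwrite)
        self.size = 0         # number of slots currently holding a sample

    def push(self, sample: Any) -> None:
        """Store a sample in constant time, evicting the oldest when full."""
        self.slots[self.write_index] = sample
        self.write_index = (self.write_index + 1) % self.capacity
        self.size = min(self.size + 1, self.capacity)

    def sample(self, batch_size: int, rng: random.Random) -> List[Any]:
        """Draw a uniform random batch without replacement (an assumption)."""
        if batch_size > self.size:
            raise ValueError("not enough samples stored")
        return rng.sample(self.slots[: self.size], batch_size)

    def oldest_to_newest(self) -> List[Any]:
        """Return stored samples in insertion order."""
        if self.size < self.capacity:
            return self.slots[: self.size]
        return self.slots[self.write_index:] + self.slots[: self.write_index]


pool = CircularMemoryPool(capacity=3)
for step in range(1, 6):
    pool.push(step)
ordered = pool.oldest_to_newest()   # [3, 4, 5]: samples 1 and 2 were overwritten
batch = pool.sample(2, random.Random(0))
```

Because writes only ever move one index forward, storage never has to shift or reallocate, which is one plausible reading of the “data-efficient” storage described above.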

Lastly, the Reinforcement Learner, a distributed training environment, accelerates policy updates with the aforementioned actor-critic approach. Multiple Reinforcement Learners fetch data in parallel from Memory Pools, with which they communicate using shared memory. One mechanism (target attention) helps with enemy target selection, while another, a long short-term memory (LSTM) network capable of learning long-term dependencies, teaches hero players skill combos critical to inflicting “severe” damage.
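The article does not say which actor-critic variant Tencent uses, so the following is only a generic one-step advantage actor-critic sketch. The function name `actor_critic_losses`, its parameters, and the discount factor are assumptions for illustration.

```python
import math
from typing import Tuple


def actor_critic_losses(
    log_prob_action: float,
    reward: float,
    value_state: float,
    value_next_state: float,
    gamma: float = 0.99,
    done: bool = False,
) -> Tuple[float, float, float]:
    """Return (advantage, policy_loss, value_loss) for a single transition.

    The critic's estimate value_state is compared with a one-step bootstrapped
    target; the gap (the advantage) scales the actor's log-probability term.
    In a real trainer the advantage is treated as a constant when
    differentiating the policy loss.
    """
    bootstrap = 0.0 if done else gamma * value_next_state
    td_target = reward + bootstrap
    advantage = td_target - value_state
    policy_loss = -log_prob_action * advantage  # actor: reinforce better-than-expected actions
    value_loss = 0.5 * advantage ** 2           # critic: regress toward the target
    return advantage, policy_loss, value_loss


advantage, policy_loss, value_loss = actor_critic_losses(
    log_prob_action=math.log(0.5),
    reward=1.0,
    value_state=0.5,
    value_next_state=0.0,
    gamma=0.9,
)
```

A positive advantage means the action turned out better than the critic predicted, so minimizing the policy loss raises that action’s probability.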

The Tencent researchers’ system encodes image features and game state information such that each unit and enemy target is represented numerically. An action mask cleverly incorporates prior knowledge of experienced human players, preventing the AI from attempting to traverse physically “forbidden” areas of game maps (like challenging terrain).
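One common way to apply an action mask is to zero out the probability of disallowed actions before sampling. The sketch below assumes that approach; the helper `masked_softmax` and the three-action example are hypothetical, not taken from the paper.

```python
import math
from typing import List


def masked_softmax(logits: List[float], legal: List[bool]) -> List[float]:
    """Softmax over legal actions only; illegal actions get probability 0."""
    if len(logits) != len(legal):
        raise ValueError("logits and legal must have the same length")
    if not any(legal):
        raise ValueError("at least one action must be legal")
    # Subtract the largest legal logit for numerical stability.
    max_logit = max(l for l, ok in zip(logits, legal) if ok)
    exps = [math.exp(l - max_logit) if ok else 0.0 for l, ok in zip(logits, legal)]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical example: the second action (say, walking into impassable
# terrain) is masked out even though the raw policy scores it highly.
probs = masked_softmax([1.0, 2.0, 3.0], [True, False, True])
```

The masked action can never be sampled, so the agent does not spend training time rediscovering rules that human players already know.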

In experiments, the paper’s coauthors ran the framework across a total of 600,000 cores and 1,064 graphics cards (a mixture of Nvidia Tesla P40s and Nvidia V100s), which crunched 16,000 features containing unconcealed unit attributes and game information. Training one hero required 48 graphics cards and 18,000 processor cores at a speed of about 80,000 samples per second per card. And collectively for every day of training, the system accumulated the equivalent of 500 years of human experience.

The AI’s Elo score, derived from a system for calculating the relative skill levels of players in zero-sum games, unsurprisingly increased steadily with training, the coauthors note. It became relatively stable within 80 hours, according to the researchers, and within just 30 hours it began to defeat the top 1% of human Honor of Kings players.
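For readers unfamiliar with Elo, here is a minimal sketch of the standard rating update. The 400-point scale and the K-factor of 32 are conventional defaults assumed for illustration; the article does not state which parameters the researchers used.

```python
from typing import Tuple


def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability-like expected score of player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(
    rating_a: float, rating_b: float, score_a: float, k: float = 32.0
) -> Tuple[float, float]:
    """Return new ratings after one game; score_a is 1 (win), 0.5 (draw), or 0 (loss)."""
    delta = k * (score_a - expected_score(rating_a, rating_b))
    # Zero-sum: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta


new_a, new_b = update_elo(1500.0, 1500.0, score_a=1.0)  # (1516.0, 1484.0)
```

Ratings move more when a result is surprising, which is why a steadily rising Elo score indicates an agent that keeps beating progressively stronger versions of its opposition.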

The system executes actions via the AI model every 133 milliseconds, or about the response time of a top amateur player. Five professional players (“QGhappy.Hurt,” “WE.762,” “TS.NuanYang,” “QGhappy.Fly,” and “eStarPro.Cat”) were invited to play against it, as well as a “diversity” of players attending the ChinaJoy 2019 conference in Shanghai between August 2 and August 5.

The researchers note that despite eStarPro.Cat’s prowess with mage-type heroes, the AI achieved five kills per game and was killed only 1.33 times per game on average. In public matches, its win rate was 99.81% over 2,100 matches, and five of the eight AI-controlled heroes managed a 100% win rate.

They’re far from the only ones whose AI beat human players. DeepMind’s AlphaStar beat 99.8% of human StarCraft 2 players, while OpenAI’s OpenAI Five framework defeated a professional team twice in public matches.

The Tencent researchers say that they plan to make both their framework and algorithms open source in the near future, toward the goal of fostering research on complex games like Honor of Kings.

Big Data – VentureBeat
