Business Intelligence Info

Baidu open-sources NLP model it claims achieves state-of-the-art results in Chinese language tasks

March 21, 2019   Big Data

Baidu, the Beijing conglomerate behind the eponymous Chinese search engine, invests heavily in natural language processing (NLP) research. In October, it debuted an AI model capable of beginning a translation just a few seconds into a speaker’s speech and finishing seconds after the end of a sentence, and in 2016 and 2017, it launched SwiftScribe, a web app powered by its DeepSpeech platform, and TalkType, a dictation-centric Android keyboard.

Building on that and other previous work, Baidu this week detailed ERNIE (Enhanced Representation through kNowledge IntEgration), a natural language model based on its PaddlePaddle deep learning platform. The company claims it achieves “high accuracy” on a range of language processing tasks, including natural language inference, semantic similarity, named entity recognition, sentiment analysis, and question-answer matching, and that it’s state-of-the-art with respect to Chinese language understanding.

The source code and pretrained models are available on GitHub.

“In recent years, unsupervised pre-trained language models have made great progress on various NLP tasks,” Baidu explained in a blog post. “[But] early work in this field focused on context-independent word embedding. [T]hese models mainly focused on the original language signals, not on semantic units in the text … We considered that if the model can learn the implicit knowledge from texts, its performances on various tasks will be further improved.”

Toward that end, the character-based ERNIE was architected to learn the semantic representation of concepts by ingesting paragraphs containing partially masked words. It’s a versatile approach: Baidu says that unlike systems that rely on word-level modeling to suss out relationships among parts of speech, ERNIE is able to comprehend the “compositional meaning” of sequential characters like “红色, 蓝色, 绿色,” which mean red, blue, and green, respectively.
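To make the masking idea concrete, here is a minimal Python sketch contrasting per-character masking with masking whole multi-character units. It is an illustration under stated assumptions, not Baidu's code: the function names, the `[MASK]` token, the masking ratio, and the hand-written span boundaries are all hypothetical, and the article does not specify how ERNIE chooses which units to mask.

```python
import random

MASK = "[MASK]"  # placeholder token; the name is an assumption for this sketch


def mask_characters(chars, ratio, rng):
    """Character-level masking: each character is masked independently."""
    return [MASK if rng.random() < ratio else c for c in chars]


def mask_spans(chars, spans, ratio, rng):
    """Span-level masking: every character of a selected span is masked together.

    `spans` is a list of (start, end) index pairs (end exclusive) marking
    multi-character units such as words or entity names.
    """
    out = list(chars)
    for start, end in spans:
        if rng.random() < ratio:
            for i in range(start, end):
                out[i] = MASK
    return out


# "红色蓝色绿色" is red, blue, green written as three two-character words.
chars = list("红色蓝色绿色")
spans = [(0, 2), (2, 4), (4, 6)]  # hand-labelled word boundaries (assumed)

rng = random.Random(0)
char_masked = mask_characters(chars, ratio=0.5, rng=rng)
span_masked = mask_spans(chars, spans, ratio=0.5, rng=rng)
```

With span-level masking, a model can never see half of “红色” while predicting the other half, so it has to recover the whole unit from the surrounding context rather than from its own neighboring character.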

Furthermore, ERNIE uses a dialogue language model to tackle question-answer scenarios, along with a technique called dialogue response loss. Essentially, it takes two adjacency pairs — two utterances by two speakers, one after the other — and encodes them mathematically to identify the speakers’ roles and learn implicit relationships in the exchange.
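As a rough illustration of what tagging each utterance with its speaker's role could look like at the input level, the sketch below flattens a short exchange into character tokens paired with per-token role IDs. The function name, the `[SEP]` separator, and the role labels are assumptions made for this example; the article does not describe ERNIE's actual input format or how dialogue response loss is computed.

```python
def encode_dialogue(turns):
    """Flatten dialogue turns into character tokens with a parallel list of role IDs.

    `turns` is a list of (role, text) pairs. Each distinct role gets an integer
    ID in order of first appearance, and every turn ends with a separator token
    that carries the same role ID as the turn it closes.
    """
    role_to_id = {}
    tokens, role_ids = [], []
    for role, text in turns:
        rid = role_to_id.setdefault(role, len(role_to_id))
        for tok in list(text) + ["[SEP]"]:
            tokens.append(tok)
            role_ids.append(rid)
    return tokens, role_ids


# A two-turn exchange: a query ("Q") followed by a response ("R").
tokens, role_ids = encode_dialogue([("Q", "你好吗"), ("R", "很好")])
```

In a model of this kind, the role IDs would typically index a learned embedding that is added to each token's representation, giving the network a signal about who said what.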

To validate ERNIE’s design, the researchers fed it online encyclopedia articles, news clippings, and forum threads, and had it infer knowledge omitted from sample paragraphs. It managed to correctly fill in prompts like “Relativity is a theory about space-time and gravity, which was founded by _________” (ERNIE’s answer: “Einstein”) and “The surface area of the Earth is 510 million square kilometers, which of 71 percent are ________, 29 percent are land” (ERNIE: “ocean”). More impressively, when tested on XNLI, a benchmark devised by Facebook and New York University researchers, it outperformed Google’s BERT on Chinese data.

Baidu says it plans to integrate ERNIE with “a variety of products.” One likely beneficiary is DuerOS, a suite of software developer kits (SDKs), APIs, and turnkey solutions that enable original equipment manufacturers to build Baidu’s voice platform into smart speakers, refrigerators, washing machines, set-top boxes, and more. To date, more than 200 companies have launched 110 DuerOS-powered products, and Baidu announced in November that DuerOS is installed on over 150 million devices and has more than 35 million monthly active users.


Big Data – VentureBeat
