MIT CSAIL’s AI can detect fake news and political bias

October 4, 2018   Big Data

Fake news continues to rear its ugly head. In March of this year, half of the U.S. population reported seeing deliberately misleading articles on news websites. A majority of respondents to a recent Edelman survey, meanwhile, said that they couldn’t judge the veracity of media reports. And given that fake news has been shown to spread faster than real news, it’s no surprise that almost seven in ten people are concerned it might be used as a “weapon.”

Researchers at the Massachusetts Institute of Technology’s Computer Science and Artificial Intelligence Lab (CSAIL) and the Qatar Computing Research Institute believe they’ve engineered a partial solution. In a study that’ll be presented later this month at the 2018 Empirical Methods in Natural Language Processing (EMNLP) conference in Brussels, Belgium, they describe an artificially intelligent (AI) system that can determine whether a source is accurate or politically prejudiced.

The researchers used it to create an open-source dataset of more than 1,000 news sources annotated with “factuality” and “bias” scores. They claim it’s the largest of its kind.

“A [promising] way to fight ‘fake news’ is to focus on their source,” the researchers wrote. “While ‘fake news’ are spreading primarily on social media, they still need a ‘home’, i.e., a website where they would be posted. Thus, if a website is known to have published non-factual information in the past, it is likely to do so in the future.”

The novelty of the AI system lies in its broad contextual understanding of the mediums it evaluates: rather than extract features (the variables on which the machine learning model trains) from news articles in isolation, it considers crowdsourced encyclopedias, social media, and even the structure of URLs and web traffic data in determining trustworthiness.

It’s built on a Support Vector Machine (SVM) — a supervised system commonly used for classification and regression analysis — that was trained to evaluate factuality and bias on a three-point (low, mixed, and high) and seven-point scale (extreme-left, left, center-left, center, center-right, right, extreme-right), respectively.
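The two-classifier setup described above can be sketched in a few lines of scikit-learn. This is a minimal illustration only: the feature vectors and labels below are synthetic stand-ins, not the researchers' actual features or data, and the kernel choice is an assumption.

```python
# Two separate SVM classifiers: one for 3-point factuality,
# one for 7-point bias. Synthetic data for illustration.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 16))           # one feature vector per news source

factuality = rng.integers(0, 3, 300)     # 0=low, 1=mixed, 2=high
bias = rng.integers(0, 7, 300)           # 0=extreme-left ... 6=extreme-right

fact_clf = SVC(kernel="rbf").fit(X, factuality)
bias_clf = SVC(kernel="rbf").fit(X, bias)

fact_pred = fact_clf.predict(X[:1])
bias_pred = bias_clf.predict(X[:1])
```

Treating factuality and bias as two independent classification tasks keeps each label space small, at the cost of ignoring any correlation between the two.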

According to the team, the system needs only 150 articles to reliably determine whether a new source can be trusted. It’s 65 percent accurate at detecting whether a news source has a low, mixed, or high level of “factuality,” and 70 percent accurate at detecting whether it’s left-leaning, right-leaning, or moderate.

On the articles front, it applies a multi-pronged test to the copy and headline, analyzing not just the structure, sentiment, and engagement (in this case, the number of shares, reactions, and comments on Facebook), but also the topic, complexity, bias, and morality (based on Moral Foundations Theory, a social psychological theory intended to explain the origins of and variation in human moral reasoning). It calculates a score for each feature, then averages that score over a set of articles.
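The score-then-average step can be sketched as follows. The feature names mirror those listed above, but the per-article scorers here are placeholder functions; in the real system each score comes from a trained NLP model.

```python
# Per-source profile: score each article on each feature,
# then average over the source's articles.
import statistics

FEATURES = ["structure", "sentiment", "engagement",
            "topic", "complexity", "bias", "morality"]

def score_article(article: str) -> dict:
    # Placeholder scorers producing values in [0, 1);
    # the real system uses trained models per feature.
    return {f: (hash((f, article)) % 100) / 100 for f in FEATURES}

def source_profile(articles: list[str]) -> dict:
    scores = [score_article(a) for a in articles]
    return {f: statistics.mean(s[f] for s in scores) for f in FEATURES}

profile = source_profile(["article one text", "article two text"])
```

Averaging over a batch of articles smooths out single-article outliers, which is why the system needs a modest sample (on the order of 150 articles) per source.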

Above: A chart showing where news sources in the researchers’ database fall regarding factuality and bias.

Wikipedia and Twitter also feed into the system’s predictive models. As the researchers note, the absence of a Wikipedia page may indicate that a website isn’t credible, or a page might mention that the source in question is satirical or expressly left-leaning. Moreover, they point out that publications without verified Twitter accounts, or those with recently created accounts which obfuscate their location, are less likely to be impartial.
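The Wikipedia and Twitter signals described above can be encoded as simple numeric features. The field names and normalizations below are assumptions made for illustration; in practice the values would come from the Wikipedia and Twitter APIs.

```python
# Illustrative encoding of source metadata as model features.
from dataclasses import dataclass

@dataclass
class SourceMetadata:
    has_wikipedia_page: bool
    wikipedia_page_length: int      # characters; 0 if no page
    twitter_verified: bool
    twitter_account_age_days: int
    twitter_has_location: bool

def metadata_features(m: SourceMetadata) -> list[float]:
    return [
        float(m.has_wikipedia_page),
        m.wikipedia_page_length / 10_000,    # crude length normalization
        float(m.twitter_verified),
        m.twitter_account_age_days / 365,    # account age in years
        float(m.twitter_has_location),
    ]

vec = metadata_features(SourceMetadata(True, 25_000, True, 2_000, True))
```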

The last two vectors the model takes into account are URL structure and web traffic. It detects URLs that attempt to mimic those of credible news sources (e.g., “foxnews.co.cc” rather than “foxnews.com”) and considers each website’s Alexa Rank, a metric calculated from the number of overall pageviews it receives.
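A rough illustration of the look-alike URL check: flag a domain that closely matches a known outlet's domain without being identical. The known-domain list, the similarity measure, and the threshold are all assumptions for the demo, not the paper's method.

```python
# Flag domains that nearly match a credible outlet's domain.
from difflib import SequenceMatcher

KNOWN = ["foxnews.com", "nytimes.com", "bbc.co.uk"]

def looks_like_mimic(domain: str, threshold: float = 0.8) -> bool:
    if domain in KNOWN:
        return False                 # exact match is the real site
    return any(
        SequenceMatcher(None, domain, known).ratio() >= threshold
        for known in KNOWN
    )

looks_like_mimic("foxnews.co.cc")    # near-match to foxnews.com
```

A production system would also want explicit checks for homoglyphs and suspicious public suffixes, which a plain similarity ratio misses.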

The team trained the system on 1,066 news sources from Media Bias/Fact Check (MBFC), a website where human fact-checkers manually annotate sites with accuracy and bias data. To produce the aforementioned database, they set it loose on 10 to 100 articles per website (94,814 in total).

As the researchers painstakingly detail in their report, not every feature was a useful predictor of factuality and/or bias. For example, some websites without Wikipedia pages or established Twitter profiles were unbiased, and news sources ranked highly in Alexa weren’t consistently less biased or more factual than their less-trafficked competitors.

Interesting patterns emerged. Articles from fake news websites were more likely to use hyperbolic and emotional language, and left-leaning outlets were more likely to mention fairness and reciprocity. Publications with longer Wikipedia pages, meanwhile, were generally more credible, as were those with URLs containing a minimal number of special characters and complicated subdirectories.

In the future, the team intends to explore whether the system can be adapted to other languages (it was trained exclusively on English), and whether it can be trained to detect region-specific biases. And it plans to launch an app that’ll automatically respond to news items with articles “that span the political spectrum.”

“If a website has published fake news before, there’s a good chance they’ll do it again,” Ramy Baly, lead author on the paper and a postdoctoral associate, said. “By automatically scraping data about these sites, the hope is that our system can help figure out which ones are likely to do it in the first place.”

They’re far from the only ones attempting to combat the spread of fake news with AI.

Delhi-based startup Metafact taps natural language processing algorithms to flag misinformation and bias in news stories and social media posts. And AdVerify.ai, a software-as-a-service platform that launched in beta last year, parses articles for misinformation, nudity, malware, and other problematic content, cross-referencing a regularly updated database of thousands of fake and legitimate news items.

Facebook, for its part, has experimented with deploying AI tools that “identify accounts and false news,” and it recently acquired London-based startup Bloomsbury AI to aid in its fight against misleading stories.

Some experts aren’t convinced that AI’s up to the task. Dean Pomerleau, a Carnegie Mellon University Robotics Institute scientist who helped organize the Fake News Challenge, a competition to crowdsource bias detection algorithms, told The Verge in an interview that AI lacked the nuanced understanding of language necessary to suss out untruths and false statements.

“We actually started out with a more ambitious goal of creating a system that could answer the question ‘Is this fake news, yes or no?’” he said. “We quickly realized machine learning just wasn’t up to the task.”

Human fact-checkers aren’t necessarily better. This year, Google suspended Fact Check, a tag that appeared next to stories in Google News that “include information fact-checked by news publishers and fact-checking organizations,” after conservative outlets accused it of exhibiting bias against them.

Whatever the ultimate solution — whether AI, human curation, or a mix of both — it can’t come fast enough. Gartner predicts that by 2022, if current trends hold, a majority of people in the developed world will see more false than true information.

Big Data – VentureBeat
