Facebook details the AI technology behind Instagram Explore

November 25, 2019   Big Data

According to Facebook, over half of Instagram’s roughly 1 billion users visit Instagram Explore to discover videos, photos, livestreams, and Stories each month. Predictably, building the underlying recommendation engine — which curates the billions of pieces of content uploaded to Instagram — posed an engineering challenge, not least because it works in real time.

In a blog post published this morning, Facebook for the first time peeled back the curtains on Explore’s inner workings. Its three-part ranking funnel, which the company says was architected with a custom query language and modeling techniques, extracts 65 billion features and makes 90 million model predictions every second. And that’s just the tip of the iceberg.

Tools

Before the team behind Explore embarked on building a content recommendation system, they developed tools to conduct large-scale experiments and obtain strong signals on the breadth of users’ interests. The first of these was IGQL, a meta language that provided the level of abstraction needed to assemble candidate algorithms in one place.

IGQL's execution is optimized in C++, which Facebook says helps minimize latency and compute resources without sacrificing extensibility. The language is both statically validated and high-level, enabling engineers to write recommendation algorithms in a “Python-like” fashion. It also complements an account embeddings component, which helps identify topically similar profiles as part of a retrieval pipeline that focuses on account-level information.

Above: Demonstration of ig2vec predicting account similarity.

Image Credit: Facebook

A framework called ig2vec treats the Instagram accounts a user interacts with as a sequence of words in a sentence, which informs a model's predictions about which accounts the user might interact with next. (Facebook notes that a sequence of accounts interacted with in a single session is more likely to be topically coherent than a random set of accounts.) Concurrently, Facebook AI Similarity Search (FAISS), the company's nearest-neighbor retrieval library, queries millions of accounts based on the distance metric used in embedding training.
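The retrieval step described above can be sketched in a few lines. This is only an illustration: the account names and three-dimensional vectors are made up, and a brute-force cosine-similarity scan stands in for FAISS, which performs the same kind of lookup at a far larger scale.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def nearest_accounts(query_id, embeddings, k=2):
    """Return the k accounts whose embeddings are closest to the query account."""
    query = embeddings[query_id]
    scored = [
        (cosine_similarity(query, vector), account)
        for account, vector in embeddings.items()
        if account != query_id
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [account for _, account in scored[:k]]

# Hypothetical, hand-made embeddings; real ones would be learned from sessions.
EMBEDDINGS = {
    "pottery_a": [0.9, 0.1, 0.0],
    "pottery_b": [0.8, 0.2, 0.1],
    "surfing_a": [0.1, 0.9, 0.2],
    "surfing_b": [0.0, 0.8, 0.3],
}

print(nearest_accounts("pottery_a", EMBEDDINGS, k=1))
```

In this toy data, accounts on the same topic point in roughly the same direction, so the nearest neighbor of a pottery account is the other pottery account.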

A classifier system is trained to predict a topic for a set of accounts based solely on the embedding, which when compared with human-labeled topics makes evident how well the embeddings capture topical similarity. It’s an important step, because retrieving accounts similar to those a user has expressed interest in helps narrow down a per-profile ranking inventory.

Ranking accounts in Explore based on interests necessitated predicting the most relevant content for each person, according to Facebook, and gave rise to a lightweight ranking distillation model that preselects candidates before passing them to complex ranking models. Using knowledge in the form of input candidates with features and outputs from the more complicated models, the simpler model tries to approximate the main ranking models as much as possible via direct (and indirect) learning.
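As a rough sketch of the distillation idea, a cheap "student" model can be fit to the outputs of an expensive "teacher" model rather than to ground-truth labels. Everything here is an assumption made for illustration: the teacher formula, the three features, and the choice of a linear student trained by gradient descent are not taken from Facebook's post.

```python
def teacher_score(features):
    """Stand-in for an expensive ranking model (hypothetical formula)."""
    x1, x2, x3 = features
    return 2.0 * x1 + 1.0 * x2 + 0.1 * x3 * x3

def student_score(weights, features):
    """Cheap linear model that only looks at the first two features."""
    w1, w2, bias = weights
    x1, x2, _ = features
    return w1 * x1 + w2 * x2 + bias

def train_student(examples, epochs=5000, learning_rate=0.5):
    """Fit the student to the teacher's outputs with full-batch gradient descent."""
    weights = [0.0, 0.0, 0.0]
    n = len(examples)
    for _ in range(epochs):
        grads = [0.0, 0.0, 0.0]
        for features in examples:
            error = student_score(weights, features) - teacher_score(features)
            grads[0] += error * features[0]
            grads[1] += error * features[1]
            grads[2] += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grads)]
    return weights

# Hypothetical candidates, each described by three features.
CANDIDATES = [
    (0.9, 0.8, 0.1), (0.1, 0.2, 0.9), (0.5, 0.5, 0.5),
    (0.7, 0.1, 0.3), (0.2, 0.9, 0.6), (0.4, 0.3, 0.2),
]

weights = train_student(CANDIDATES)
by_student = sorted(CANDIDATES, key=lambda f: student_score(weights, f), reverse=True)
print(by_student[0])  # the candidate the student would pass on first
```

The student never sees the third feature, yet it reproduces the teacher's top and bottom picks on this data, which is the property a preselection model needs.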

Building Explore

Explore consists of two main stages, according to the team that designed it: the candidate generation stage (also known as the sourcing stage) and the ranking stage.

During the candidate generation stage, Explore taps accounts that users have interacted with previously to identify “seed accounts” of interest. Seed accounts are only a fraction of the accounts that share a given interest, but they help identify topically similar accounts when combined with the above-mentioned embeddings.

Knowing the accounts that might appeal to a user is the first step toward sussing out which content might float their boat. IGQL allows different candidate sources to be represented as distinct subqueries, and this enables Explore to find tens of thousands of eligible candidates for the average person across many types of sources.
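The idea of combining several candidate sources can be sketched in plain Python. This is not IGQL syntax, which the article does not show; the function names, account names, and post IDs below are all hypothetical.

```python
def accounts_similar_to_seeds(seed_accounts, similar_accounts):
    """Expand seed accounts into topically similar accounts."""
    expanded = []
    for seed in seed_accounts:
        expanded.extend(similar_accounts.get(seed, []))
    return expanded

def media_from_accounts(accounts, media_by_account):
    """Collect the media posted by a list of accounts."""
    media = []
    for account in accounts:
        media.extend(media_by_account.get(account, []))
    return media

def merge_sources(*sources):
    """Union several candidate lists, dropping duplicates in first-seen order."""
    seen = set()
    merged = []
    for source in sources:
        for item in source:
            if item not in seen:
                seen.add(item)
                merged.append(item)
    return merged

# Hypothetical toy data.
SIMILAR = {"ceramics_daily": ["clay_studio", "kiln_life"]}
MEDIA = {
    "ceramics_daily": ["post_1"],
    "clay_studio": ["post_2", "post_3"],
    "kiln_life": ["post_3", "post_4"],
}

seeds = ["ceramics_daily"]
source_a = media_from_accounts(seeds, MEDIA)  # media from the seed accounts
source_b = media_from_accounts(accounts_similar_to_seeds(seeds, SIMILAR), MEDIA)
candidates = merge_sources(source_a, source_b)
print(candidates)
```

Each source stays a separate, independently testable function, and the merge step is the only place that needs to know how many sources exist.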

Above: This graphic shows a typical source for Instagram Explore recommendations.

Image Credit: Facebook

To ensure the recommended content remains safe and appropriate for users of all ages, signals are used to filter out anything that might not be eligible. Algorithms detect and filter spam and other content, typically before an inventory is built for each user.

Those filtering systems are quite effective, if Facebook’s latest Community Standards Enforcement Report is any indication. The network says that 845,000 pieces of content relating to self-injury and self-harm were removed in Q3 2019, of which 79.1% were detected proactively, and that over 99% of child nudity and exploitation posts were deleted over the past four quarters.

For every Explore ranking request, 500 candidates are selected from the thousands sampled and are passed along to the ranking stage. It’s there that they encounter a three-part infrastructure intended to balance relevance with computation efficiency.

In the first pass of the ranking stage, a distillation model mimics the combination of the other two passes using a minimal number of features. It picks the 150 highest-quality and most relevant candidates out of the 500. In the second pass, a model with a full set of dense features narrows those 150 to the top 50 candidates. Lastly, another model with a full set of features chooses the best 25 candidates, which populate the Explore grid.
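The 500 to 150 to 50 to 25 funnel can be sketched as three successive top-k selections. The cut-off sizes come from the article; the candidate features and the three scoring functions are placeholders invented for this example.

```python
import heapq

def top_k(candidates, score_fn, k):
    """Keep the k highest-scoring candidates, best first."""
    return heapq.nlargest(k, candidates, key=score_fn)

def rank_funnel(candidates, first_pass, second_pass, final_pass):
    """Run the three passes, shrinking the candidate set at each step."""
    stage1 = top_k(candidates, first_pass, 150)  # cheapest model, most candidates
    stage2 = top_k(stage1, second_pass, 50)      # heavier model, fewer candidates
    return top_k(stage2, final_pass, 25)         # heaviest model, fewest candidates

# Hypothetical candidates with two made-up features each.
CANDIDATES = [
    {"id": i, "a": ((i * 37) % 101) / 100, "b": ((i * 53) % 97) / 96}
    for i in range(500)
]

def first_pass(c):
    return c["a"]

def second_pass(c):
    return c["a"] + c["b"]

def final_pass(c):
    return c["a"] + 2 * c["b"]

grid = rank_funnel(CANDIDATES, first_pass, second_pass, final_pass)
print(len(grid))
```

The design point is that the most expensive scorer only ever sees 50 items, while the cheapest one absorbs the cost of scanning all 500.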

Above: An illustration of the current final-pass model architecture.

Image Credit: Facebook

Because the first-pass distillation model only mimics the ranking order of the other two stages, those later stages still have to decide which content is most relevant. They do so with a multi-task, multi-layer neural network that predicts the actions people might take on each piece of content, from positive actions such as tapping Like or Save to negative actions like tapping the See Fewer Posts Like This button. The predictions are then combined using an arithmetic formula, called a value model, that captures how prominent each signal should be. It is a weighted sum, so if a person saving a post is deemed more important than their liking a post, the save prediction gets the higher weight.
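A value model of this kind can be sketched as a weighted sum of predicted action probabilities, with the negative action subtracted. The weights and probabilities below are invented for illustration; the article does not give real values.

```python
# Hypothetical weights: a save counts for more than a like, and a
# "See Fewer Posts Like This" tap is penalized heavily.
WEIGHTS = {"like": 1.0, "save": 2.0, "negative": 5.0}

def value_score(predictions, weights=WEIGHTS):
    """Combine per-action probabilities into a single ranking score."""
    return (
        weights["like"] * predictions["like"]
        + weights["save"] * predictions["save"]
        - weights["negative"] * predictions["negative"]
    )

# Two made-up posts with predicted action probabilities.
often_liked = {"like": 0.6, "save": 0.05, "negative": 0.01}
often_saved = {"like": 0.3, "save": 0.40, "negative": 0.01}

print(value_score(often_liked), value_score(often_saved))
```

With these weights, the post that is more likely to be saved outranks the post that is more likely to be liked, even though its like probability is half as large.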

In the interest of maintaining a “rich balance” between new content and existing content, the Explore team incorporated a rule into the aforementioned value model that boosts content diversity. It downranks posts from the same author or seed account by adding a penalty factor so users don’t see multiple posts from the same person or the same seed account in Explore.
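One way to picture the diversity rule is a greedy re-ranking in which each post's score is multiplied by a penalty for every post already chosen from the same author. The penalty value, the field names, and the greedy strategy are assumptions made for this sketch; the same approach would apply to seed accounts by keying on a different field.

```python
def rerank_with_diversity(candidates, penalty=0.5):
    """Greedily pick posts, discounting authors that already appear in the result."""
    remaining = list(candidates)
    picked_per_author = {}
    ranked = []
    while remaining:
        best = max(
            remaining,
            key=lambda c: c["score"] * penalty ** picked_per_author.get(c["author"], 0),
        )
        ranked.append(best)
        remaining.remove(best)
        picked_per_author[best["author"]] = picked_per_author.get(best["author"], 0) + 1
    return ranked

# Hypothetical value-model scores for three posts from two authors.
POSTS = [
    {"id": "a1", "author": "x", "score": 0.9},
    {"id": "a2", "author": "x", "score": 0.8},
    {"id": "b1", "author": "y", "score": 0.7},
]

print([post["id"] for post in rerank_with_diversity(POSTS)])
```

Without the penalty the two posts from author x would sit next to each other; with it, the second post from x is discounted to 0.4 and the post from author y moves ahead of it.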

“We rank the most relevant content based on the final value model score of each ranking candidate in a descendant way,” wrote the blog authors. “One of the most exciting parts of building Explore is the ongoing challenge of finding new and interesting ways to help our community discover the most interesting and relevant content on Instagram. We’re continuously evolving Instagram Explore, whether by adding media formats like Stories [or] entry points to new types of content, such as shopping posts and IGTV videos.”

Big Data – VentureBeat
