Business Intelligence Info
Microsoft researchers are teaching AI to write stories about groups of photos

April 14, 2016   Big Data

Microsoft researchers have come up with a novel way of getting computers to tell stories about what’s happening in multiple photographs by using artificial intelligence (AI). Today the company is publishing an academic paper describing the technology, which could one day power services that are especially useful to the visually impaired. The paper will also detail the photos, captions, and “stories” developed in the research.

The new capability is significant because it goes well beyond just identifying objects in images, or even videos, in order to generate captions.

“It’s still hard to evaluate, but minimally you want to get the most important things in a dimension. With storytelling, a lot more that comes in is about what the background is and what sort of stuff might have been happening around the event,” Microsoft researcher Margaret Mitchell told VentureBeat in an interview.

To advance the state of the art in this area, Microsoft relied on people to write captions for individual images, as well as captions for images in a specific order. Engineers then used the information to teach machines how to come up with entire stories to tell about those sequences of images.

The method involves deep learning, a type of artificial intelligence that Microsoft has previously used for tasks like speech recognition and machine translation. Facebook, Google, and other companies are also actively engaged in this research area.

In this case, a recurrent neural network was trained on the images and words. Mitchell and her colleagues borrowed an approach from machine translation called sequence-to-sequence learning. “Here, what we’re doing is we’re saying that every image is fed through a convolutional network to provide one part of the sequence, and you can go over the sequence to create a general encoding of a sequence of images, and then from that general encoding, we can decode out to the story,” said Mitchell, the principal investigator on the paper.
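The pipeline Mitchell describes (per-image features, a recurrent pass over the sequence to build one encoding, then decoding words from that encoding) can be sketched roughly as follows. This is a toy illustration, not the researchers' model: `cnn_features` is a hypothetical stand-in for a real convolutional network, the weights are random and untrained, the vocabulary is invented, and sharing one recurrent matrix between encoder and decoder is a simplification made here.

```python
import math
import random


def cnn_features(image, dim=4):
    """Hypothetical stand-in for a convolutional network: averages chunks
    of a flat pixel list into a fixed-length feature vector."""
    chunk = max(1, len(image) // dim)
    return [sum(image[i * chunk:(i + 1) * chunk]) / chunk for i in range(dim)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def rnn_step(w_in, w_h, x, h):
    """One simple recurrent update: h' = tanh(W_in x + W_h h)."""
    return [math.tanh(a + b) for a, b in zip(matvec(w_in, x), matvec(w_h, h))]


def encode(images, w_in, w_h):
    """Fold the image sequence into a single hidden state (the encoding)."""
    h = [0.0] * len(w_h)
    for image in images:
        h = rnn_step(w_in, w_h, cnn_features(image), h)
    return h


def decode(h, w_h, w_out, vocab, max_len=5):
    """Greedily emit the highest-scoring word per step, starting from the encoding."""
    words = []
    for _ in range(max_len):
        scores = matvec(w_out, h)
        word = vocab[scores.index(max(scores))]
        if word == "<end>":
            break
        words.append(word)
        # Assumption: the decoder reuses the encoder's recurrent weights.
        h = [math.tanh(v) for v in matvec(w_h, h)]
    return words


def random_matrix(rows, cols, rng):
    """Untrained weights, for illustration only."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
HIDDEN, FEAT = 3, 4
VOCAB = ["we", "went", "to", "the", "beach", "<end>"]  # invented vocabulary
w_in = random_matrix(HIDDEN, FEAT, rng)
w_h = random_matrix(HIDDEN, HIDDEN, rng)
w_out = random_matrix(len(VOCAB), HIDDEN, rng)

# Three tiny "photos", each a flat list of pixel intensities.
photos = [[0.1, 0.5, 0.9, 0.3], [0.7, 0.2, 0.4, 0.8], [0.6, 0.6, 0.1, 0.0]]
encoding = encode(photos, w_in, w_h)
story = decode(encoding, w_h, w_out, VOCAB)
print(encoding, story)
```

Because the weights are untrained, the words produced are meaningless; the point is only the data flow from image sequence to a single encoding to a word sequence.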

She and her collaborators, some of whom work at the Facebook Artificial Intelligence Research (FAIR) lab, sought to improve the system’s initial output by putting certain rules in place. For instance, “the same content word cannot be produced more than once within a given story,” as they write in the paper.
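A rule like the one quoted can be applied at decoding time. The sketch below is one possible reading, not the paper's implementation: it assumes greedy decoding over per-step word scores, and `FUNCTION_WORDS` is a hypothetical stop-word list used here to decide what counts as a content word.

```python
# Hypothetical list of function words that are allowed to repeat.
FUNCTION_WORDS = {"the", "a", "an", "and", "of", "to", "in", "was", "we"}


def decode_without_repeats(step_scores):
    """Greedy decode over per-step {word: score} dicts, skipping any
    content word that already appeared earlier in the story."""
    used_content = set()
    story = []
    for scores in step_scores:
        # Try candidates from most to least likely.
        for word in sorted(scores, key=scores.get, reverse=True):
            if word in FUNCTION_WORDS or word not in used_content:
                break
        else:
            continue  # every candidate was a repeated content word
        story.append(word)
        if word not in FUNCTION_WORDS:
            used_content.add(word)
    return story


step_scores = [
    {"the": 0.9, "beach": 0.1},
    {"beach": 0.8, "sun": 0.2},
    {"was": 0.7, "beach": 0.3},
    {"beach": 0.6, "sunny": 0.4},  # "beach" is blocked here, so "sunny" wins
]
print(decode_without_repeats(step_scores))  # ['the', 'beach', 'was', 'sunny']
```

In the last step the top-scoring word is a repeat, so the decoder falls back to the next candidate, which nudges the output toward more varied wording.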

Figure (image not reproduced): An example of stories for the images in sequence shown at the bottom of the figure. Image credit: Screenshot

The final result is language that’s less literal, more abstract, and more engaging. Over time, this sort of language could have great potential: people who aren’t able to see the photos could get an understanding of what they convey together as a set.

This would be a good next step to follow the recent wave of research into identifying objects and people in images and videos for the blind. In fact, that’s an area Mitchell has recently been exploring in association with blind Microsoft software developer Saqib Shaikh.

Sighted people who are learning a second language could also benefit from visual storytelling, and it could inspire kids to think more creatively about what they’re seeing in the world, Mitchell said.

People are increasingly capturing multi-image files with the cameras on their phones, whether those are animated GIF-like Live Photos from iPhones or entire videos. So it will become more important for machines to understand what’s going on across those larger sets of frames. It’s no longer enough to just recognize what’s appearing in each individual frame. Mitchell sees the research going in that direction, though the team isn’t quite there yet.

“It’s just some simple heuristics, really, but it shows the wealth of information we’re able to pull out from these models,” Mitchell said. “It’s really positive and quite hopeful moving forward.”

See the academic paper for more detail. Microsoft also has an official blog post about the research.

Big Data – VentureBeat
