
Tag Archives: Could

Cashierless tech could detect shoplifting, but bias concerns abound

January 24, 2021   Big Data

As the pandemic continues to rage around the world, it’s becoming clear that COVID-19 will endure longer than some health experts initially predicted. Owing in part to slow vaccine rollouts, rapidly spreading new strains, and politically charged rhetoric around social distancing, the novel coronavirus is likely to become endemic, necessitating changes in the ways we live our lives.

Some of those changes might occur in brick-and-mortar retail stores, where touch surfaces like countertops, cash, credit cards, and bags are potential viral spread vectors. The pandemic appears to have renewed interest in cashierless technology like Amazon Go, Amazon’s chain of stores that allow shoppers to pick up and purchase items without interacting with a store clerk. Indeed, Walmart, 7-Eleven, and cashierless startups including AiFi, Standard, and Grabango have expanded their presence over the past year.

But as cashierless technology becomes normalized, there’s a risk it could be used for purposes beyond payment, particularly shoplifting detection. While shoplifting detection isn’t problematic on its face, case studies illustrate that it’s susceptible to bias and other flaws that could, at worst, result in false positives.

Synthetic datasets

The bulk of cashierless platforms rely on cameras, among other sensors, to monitor the individual behaviors of customers in stores as they shop. Video footage from the cameras feeds into machine learning classification algorithms, which identify when a shopper picks up and places an item in a shopping cart, for example. During a session at Amazon’s re:Mars conference in 2019, Dilip Kumar, VP of Amazon Go, explained that Amazon engineers use errors like missed item detections to train the machine learning models that power its Go stores’ cashierless experiences. Synthetic datasets boost the diversity of the training data and ostensibly the robustness of the models, which use both geometry and deep learning to ensure transactions are associated with the right customer.
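
To make this concrete, here is a minimal, hypothetical sketch of how synthetic variation might be injected into training data for an item-pickup classifier. The augmentation parameters and the toy frame are illustrative assumptions, not Amazon's pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)

def synthesize_variants(frame: np.ndarray, n: int = 5) -> list[np.ndarray]:
    """Create n synthetic variants of a video frame by randomizing
    brightness, contrast, and color balance -- a stand-in for the richer
    appearance randomization a real synthetic-data pipeline would apply."""
    variants = []
    for _ in range(n):
        brightness = rng.uniform(0.6, 1.4)           # global lighting change
        color_shift = rng.uniform(0.8, 1.2, size=3)  # per-channel tint
        v = np.clip(frame * brightness * color_shift, 0.0, 1.0)
        variants.append(v.astype(np.float32))
    return variants

# Toy "dataset": one labeled clip frame (H x W x RGB) marked as a pick-up event.
base_frame = rng.random((64, 64, 3), dtype=np.float32)
frames = [base_frame] + synthesize_variants(base_frame, n=5)
labels = [1] * len(frames)  # 1 = "item picked up"

print(f"{len(frames)} training frames from 1 real example")
```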

The problem with this approach is that synthetic datasets, if poorly audited, might encode biases that machine learning models then learn to amplify. Back in 2015, a software engineer discovered that the image recognition algorithms deployed in Google Photos, Google’s photo storage service, were labeling Black people as “gorillas.” Google’s Cloud Vision API recently mislabeled thermometers held by people with darker skin as guns. And countless experiments have shown that image-classifying models trained on ImageNet, a popular (but problematic) dataset containing photos scraped from the internet, automatically learn humanlike biases about race, gender, weight, and more.

Jerome Williams, a professor and senior administrator at Rutgers University’s Newark campus, told NBC that a theft-detection algorithm might wind up unfairly targeting people of color, who are routinely stopped on suspicion of shoplifting more often than white shoppers. A 2006 study of toy stores found that not only were middle-class white women often given preferential treatment, but also that the police were never called on them, even when their behavior was aggressive. And in a recent survey of Black shoppers published in the Journal of Consumer Culture, 80% of respondents reported experiencing racial stigma and stereotypes when shopping.


“The people who get caught for shoplifting is not an indication of who’s shoplifting,” Williams told NBC. In other words, Black shoppers who feel they’ve been scrutinized in stores might be more likely to appear nervous while shopping, which might be perceived by a system as suspicious behavior. “It’s a function of who’s being watched and who’s being caught, and that’s based on discriminatory practices.”

Some solutions explicitly designed to detect shoplifting track gait — patterns of limb movements — among other physical characteristics. It’s a potentially problematic measure considering that disabled shoppers, among others, might have gaits that appear suspicious to an algorithm trained on footage of able-bodied shoppers. As the U.S. Department of Justice’s Civil Rights Division, Disability Rights Section notes, some people with disabilities have a stagger or slurred speech related to neurological disabilities, mental or emotional disturbance, or hypoglycemia, and these characteristics may be misperceived as intoxication, among other states.

Tokyo startup Vaak’s anti-theft product, VaakEye, was reportedly trained on more than 100 hours of closed-circuit television footage to monitor the facial expressions, movements, hand movements, clothing choices, and over 100 other aspects of shoppers. AI Guardsman, a collaboration between Japanese telecom company NTT East and tech startup Earth Eyes, scans live video for “tells” like when a shopper looks for blind spots or nervously checks their surroundings.

NTT East, for one, makes no claims that its algorithm is perfect. It sometimes flags well-meaning customers who pick up and put back items and salesclerks restocking store shelves, a spokesperson for the company told The Verge. Despite this, NTT East claimed its system couldn’t be discriminatory because it “does not find pre-registered individuals.”

Walmart’s AI- and camera-based anti-shoplifting technology, which is provided by Everseen, came under scrutiny last May over its reportedly poor detection rates. In interviews with Ars Technica, Walmart workers said their top concern with Everseen was false positives at self-checkout. The employees believe that the tech frequently misinterprets innocent behavior as potential shoplifting.

Industry practices

Trigo, which emerged from stealth in July 2018, aims to bring checkout-less experiences to existing “medium to small” brick-and-mortar convenience stores. For a monthly subscription fee, the company supplies both high-resolution, ceiling-mounted cameras and an on-premises “processing unit” that runs machine learning-powered tracking software. Data is beamed from the unit to a cloud processing provider, where it’s analyzed and used to improve Trigo’s algorithms.

Trigo claims that it anonymizes the data it collects, that it can’t identify individual shoppers beyond the products they’ve purchased, and that its system is 99.5% accurate on average at identifying purchases. But when VentureBeat asked about what specific anti-shoplifting detection features the product offers and how Trigo trains algorithms that might detect theft, the company declined to comment.

Grabango, a cashierless tech startup founded by Pandora cofounder Will Glaser, also declined to comment for this article. Zippin says it requires shoppers to check in with a payment method and that staff is alerted only when malicious actors “sneak in somehow.” And Standard Cognition, which claims its technology can account for changes like when a customer puts back an item they initially considered purchasing, says it doesn’t and hasn’t ever offered shoplifting detection capabilities to its customers.

“Standard does not monitor for shoplifting behavior and we never have … We only track what people pick up or put down so we know what to charge them for when they leave the store. We do this anonymously, without biometrics,” CEO Jordan Fisher told VentureBeat via email. “An AI-driven system that’s trained responsibly with diverse sets of data should in theory be able to detect shoplifting without bias. But Standard won’t be the company doing it. We are solely focused on the checkout-free aspects of this technology.”


Above: OTG’s Cibo Express is the first confirmed brand to deploy Amazon’s “Just Walk Out” cashierless technology.

Separate interviews with The New York Times and Fast Company in 2018 tell a different story, however. Michael Suswal, Standard Cognition’s cofounder and chief operating officer, told The Times that Standard’s platform could look at a shopper’s trajectory, gaze, and speed to detect and alert a store attendant to theft via text message. (In the privacy policy on its website, Standard says it doesn’t collect biometric identifiers but does collect information about “certain body features.”) He also said that Standard hired 100 actors to shop for hours in its San Francisco demo store in order to train its algorithms to recognize shoplifting and other behaviors.

“We learn behaviors of what it looks like to leave,” Suswal told The Times. “If they’re going to steal, their gait is larger, and they’re looking at the door.”

A patent filed by Standard in 2019 would appear to support the notion that Standard developed a system to track gait. The application describes an algorithm trained on a collection of images that can recognize the physical features of customers moving in store aisles between shelves. This algorithm is designed to identify one of 19 different on-body points including necks, noses, eyes, ears, shoulders, elbows, wrists, hips, ankles, and knees.
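
The 19 on-body points the patent describes resemble the keypoints a standard pose-estimation model outputs. Purely as an illustration of how such keypoints could feed a gait signal (this is not Standard's method, and the 19th "mid_hip" point is an assumption added to round out the list):

```python
import numpy as np

KEYPOINTS = ["nose", "neck", "l_eye", "r_eye", "l_ear", "r_ear",
             "l_shoulder", "r_shoulder", "l_elbow", "r_elbow",
             "l_wrist", "r_wrist", "l_hip", "r_hip",
             "l_knee", "r_knee", "l_ankle", "r_ankle", "mid_hip"]  # 19 points

def stride_lengths(frames: np.ndarray) -> np.ndarray:
    """frames: (T, 19, 2) array of (x, y) keypoints per video frame.
    Returns the horizontal ankle separation per frame, a crude gait signal."""
    l_ankle = frames[:, KEYPOINTS.index("l_ankle"), 0]
    r_ankle = frames[:, KEYPOINTS.index("r_ankle"), 0]
    return np.abs(l_ankle - r_ankle)

rng = np.random.default_rng(1)
walk = rng.random((30, 19, 2))   # 30 frames of synthetic keypoints
signal = stride_lengths(walk)
print("mean stride proxy:", signal.mean())
```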

Santa Clara-based AiFi also says its cashierless solution can recognize “suspicious behavior” inside of stores within a defined set of shopping behaviors. Like Amazon, the company uses synthetic datasets to generate a set of training and testing data without requiring customer data. “With simulation, we can randomize hairstyle, color, clothing, and body shape to ensure that we have a diverse and unbiased datasets,” a spokesperson told VentureBeat. “We respect user privacy and do not use facial recognition or personally identifiable information. It is our mission to change the future of shopping to make it automated, privacy-conscious, and inclusive.”

A patent filed in 2019 by Accel Robotics reveals the startup’s proposed anti-shoplifting solution, which optionally relies on anonymous tags that don’t reveal a person’s identity. By analyzing camera images over time, a server can attribute motion to a person and purportedly infer whether they took items from a shelf with malintent. Shopper behavior can be tracked over multiple visits if “distinguishing characteristics” are saved and retrieved for each visitor, which could be used to identify shoplifters who’ve previously stolen from the store.

“[The system can be] configured to detect shoplifting when the person leaves the store without paying for the item. Specifically, the person’s list of items on hand (e.g., in the shopping cart list) may be displayed or otherwise observed by a human cashier at the traditional cash register screen,” the patent description reads. “The human cashier may utilize this information to verify that the shopper has either not taken anything or is paying/showing for all items taken from the store. For example, if the customer has taken two items from the store, the customer should pay for two items from the store.”

Lack of transparency

For competitive reasons, cashierless tech startups are generally loath to reveal the technical details of their systems. But this does a disservice to the shoppers subjected to them. Without transparency regarding the applications of these platforms and the ways in which they’re developed, it will likely prove difficult to engender trust among shoppers, shoplifting detection capabilities or no.

Zippin was the only company VentureBeat spoke with that volunteered information about the data used to train its algorithms. It said that depending on the particular algorithm to be trained, the size of the dataset varies from a few thousand to a few million video clips, with training performed in the cloud and models deployed to the stores after training. But the company declined to say what steps it takes to ensure the datasets are sufficiently diverse and unbiased, whether it uses actors or synthetic data, and whether it continuously retrains algorithms to correct for errors.

Systems like AI Guardsman learn from their mistakes over time by letting store clerks and managers flag false positives as they occur. It’s a step in the right direction, but without more information about how these systems work, it’s unlikely to allay shoppers’ concerns about bias and surveillance.

Experts like Christopher Eastham, a specialist in AI at the law firm Fieldfisher, call for frameworks to regulate the technology. And even Ryo Tanaka, the founder of Vaak, argues there should be notice before customers enter stores so that they can opt out. “Governments should operate rules that make stores disclose information — where and what they analyze, how they use it, how long they use it,” he told CNN.


DeepMind’s improved protein-folding prediction AI could accelerate drug discovery

December 1, 2020   Big Data


The recipe for proteins — large molecules consisting of amino acids that are the fundamental building blocks of tissues, muscles, hair, enzymes, antibodies, and other essential parts of living organisms — is encoded in DNA. It’s these genetic definitions that circumscribe their three-dimensional structures, which in turn determine their capabilities. But protein “folding,” as it’s called, is notoriously difficult to figure out from a corresponding genetic sequence alone. DNA contains only information about chains of amino acid residues and not those chains’ final form.

In December 2018, DeepMind attempted to tackle the challenge of protein folding with a machine learning system called AlphaFold. The product of two years of work, AlphaFold could, the Alphabet subsidiary said at the time, predict structures more precisely than prior solutions. Lending credence to this claim, the system beat 98 competitors in the Critical Assessment of Structure Prediction (CASP) protein-folding competition in Cancun, where it successfully predicted the structure of 25 out of 43 proteins.

DeepMind now asserts that AlphaFold has outgunned competing protein-folding prediction methods for a second time. In the results from the 14th CASP assessment, a newer version of AlphaFold — AlphaFold 2 — achieved an average error comparable to the width of an atom (about 0.1 nanometers), competitive with the results from experimental methods.

“We have been stuck on this one problem — how do proteins fold up — for nearly 50 years,” University of Maryland professor John Moult, cofounder and chair of CASP, told reporters during a briefing last week. “To see DeepMind produce a solution for this, having worked personally on this problem for so long and after so many stops and starts, wondering if we’d ever get there, is a very special moment.”

Protein folding

Solutions to many of the world’s challenges, like developing treatments for diseases, can ultimately be traced back to proteins. Antibody proteins are shaped like a “Y,” for example, enabling them to latch onto viruses and bacteria, and collagen proteins are shaped like cords, which transmit tension between cartilage, bones, skin, and ligaments. In SARS-CoV-2, the novel coronavirus, a spike-like protein changes shape to interact with another protein on the surface of human cells, allowing it to force entry.

It was biochemist Christian Anfinsen who hypothesized in 1972 that a protein’s amino acid sequence could determine its structure. This laid the groundwork for attempts to predict a protein’s structure based on its amino acid sequence as an alternative to expensive, time-consuming experimental methods like nuclear magnetic resonance, X-ray crystallography, and cryo-electron microscopy. Complicating matters, however, is the raw complexity of protein folding. Scientists estimate that because of the incalculable number of interactions between the amino acids, it would take longer than 13.8 billion years to figure out all the possible configurations of a typical protein before identifying the right structure.


Above: AlphaFold’s architecture in schematic form.

Image Credit: DeepMind

DeepMind says its approach with AlphaFold draws inspiration from the fields of biology, physics, and machine learning, as well as the work of scientists over the past half-century. Taking advantage of the fact that a folded protein can be thought of as a “spatial graph,” where amino acid residues (amino acids contained within a peptide or protein) are nodes and edges connect the residues in close proximity, AlphaFold leverages an AI algorithm that attempts to interpret the structure of this graph while reasoning over the implicit graph it’s building, using evolutionarily related sequences, multiple sequence alignment, and a representation of amino acid residue pairs.

By iterating through this process, AlphaFold can learn to predict the underlying structure of a protein and determine its shape within days, according to DeepMind. Moreover, the system can self-assess which parts of each protein structure are reliable using an internal confidence measure.

DeepMind says that the newest release of AlphaFold, which will be detailed in a forthcoming paper, was trained on roughly 170,000 protein structures from the Protein Data Bank, an open source database for structural data of large biological molecules. The company tapped 128 of Google’s third-generation tensor processing units (TPUs), special-purpose AI accelerator chips available through Google Cloud, for compute resources roughly equivalent to 100 to 200 graphics cards. Training took a few weeks. For the sake of comparison, it took DeepMind 44 days to train a single agent within its StarCraft 2-playing AlphaStar system using 32 third-gen TPUs.

DeepMind declined to reveal the cost of training AlphaFold. But Google charges Google Cloud customers $32 per hour per third-generation TPU, which works out to about $688,128 per week.
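
The weekly figure follows directly from the quoted hourly rate. A quick back-of-the-envelope check, assuming all 128 TPUs are billed continuously for a full week:

```python
tpus = 128
rate_per_tpu_hour = 32        # USD, third-generation TPU on Google Cloud
hours_per_week = 24 * 7

weekly_cost = tpus * rate_per_tpu_hour * hours_per_week
print(weekly_cost)  # 688128
```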

Measuring progress

In 1994, Moult and University of California, Davis professor Krzysztof Fidelis founded CASP as a biennial blind assessment to catalyze research, monitor progress, and establish the state of the art in protein structure prediction. It’s considered the gold standard for benchmarking predictive techniques, because CASP chooses structures that have only recently been experimentally determined as targets for teams to test their prediction methods against. Some were still awaiting validation at the time of AlphaFold’s assessment.

Because the target structures aren’t published in advance, CASP participants must blindly predict the structure of each of the proteins. These predictions are then compared to the ground-truth experimental data when that data becomes available.

The primary metric used by CASP to measure the accuracy of predictions is the global distance test, which ranges from 0 to 100. It’s essentially the percentage of amino acid residues within a certain threshold distance from the correct position. A score of around 90 is informally considered to be competitive with results obtained from experimental methods; AlphaFold achieved a median score of 92.4 overall and a median score of 87 for proteins in the free-modeling category (i.e., those without templates).
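
As a rough illustration of how such a score can be computed (simplified; the official GDT_TS also searches over structure superpositions, which is omitted here):

```python
import numpy as np

def gdt_ts(pred: np.ndarray, truth: np.ndarray) -> float:
    """pred, truth: (N, 3) arrays of residue coordinates in angstroms.
    Returns a GDT_TS-style score in [0, 100]: the mean, over the thresholds
    1, 2, 4, and 8 angstroms, of the percentage of residues whose predicted
    position lies within that distance of the true position."""
    dists = np.linalg.norm(pred - truth, axis=1)
    fractions = [(dists <= t).mean() for t in (1.0, 2.0, 4.0, 8.0)]
    return 100.0 * float(np.mean(fractions))

rng = np.random.default_rng(0)
truth = rng.random((100, 3)) * 50
pred = truth + rng.normal(scale=0.5, size=truth.shape)  # small prediction errors
print(round(gdt_ts(pred, truth), 1))
```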


Above: The results of the CASP14 competition.

Image Credit: DeepMind

“What we saw in CASP14 was a group delivering atomic accuracy off the bat,” Moult said. “This [progress] gives you such excitement about the way science works — about how you can never see exactly, or even approximately, what’s going to happen next. There are always these surprises. And that really as a scientist is what keeps you going. What’s going to be the next surprise?”

Real-world applications

DeepMind makes the case that AlphaFold, if further refined, could be applied to previously intractable problems in the field of protein folding, including those related to epidemiological efforts. Earlier this year, the company predicted several protein structures of SARS-CoV-2, including ORF3a, whose makeup was formerly a mystery. At CASP14, DeepMind predicted the structure of another coronavirus protein, ORF8, which has since been confirmed by experimentalists.

Beyond pandemic response, DeepMind expects that AlphaFold will be used to explore the hundreds of millions of proteins for which science currently lacks models. Since DNA specifies the amino acid sequences that comprise protein structures, advances in genomics have made it possible to read protein sequences from the natural world, with 180 million protein sequences and counting in the publicly available Universal Protein database. In contrast, given the experimental work needed to translate from sequence to structure, only around 170,000 protein structures are in the Protein Data Bank.

DeepMind says it’s committed to making AlphaFold available “at scale” and collaborating with partners to explore new frontiers, like how multiple proteins form complexes and interact with DNA, RNA, and small molecules. Improving the scientific community’s understanding of protein folding could lead to more effective diagnoses and treatment of diseases such as Parkinson’s and Alzheimer’s, as these are believed to be caused by misfolded proteins. And it could aid in protein design, leading to protein-secreting bacteria that make wastewater biodegradable, for instance, and enzymes that can help manage pollutants such as plastic and oil.


Above: A ground-truth folded protein compared with AlphaFold 2’s prediction.

Image Credit: DeepMind

In any case, it’s a milestone for DeepMind, whose work has principally focused on the games domain. Its AlphaStar system bested professional players at StarCraft 2, following wins by AlphaZero at Go, chess, and shogi. While some of DeepMind’s work has found real-world application, chiefly in datacenters, Waymo’s self-driving cars, and the Google Play Store’s recommendation algorithms, DeepMind has yet to achieve a significant AI breakthrough in a scientific area such as protein folding or glass dynamics modeling. These new results might mark a shift in the company’s fortunes.

“AlphaFold represents a huge leap forward that I hope will really accelerate drug discovery and help us to better understand disease. It’s pretty mind blowing,” DeepMind CEO Demis Hassabis said during the briefing last week. “We advanced the state of the art in the field, so that’s fantastic, but there’s still a long way to go before we’ve solved it.”


AI that directs drones to film ‘exciting’ shots could lower video production costs

November 24, 2020   Big Data


Because of their ability to detect, track, and follow objects of interest while maintaining safe distances, drones have become an important tool for professional and amateur filmmakers alike. Even so, quadcopters’ camera controls remain difficult to master. Drones might take different paths for the same scenes even if their positions, velocities, and angles are carefully tuned, potentially ruining the consistency of a shot.

In search of a solution, Carnegie Mellon, University of Sao Paulo, and Facebook researchers developed a framework that enables users to define drone camera shots working from labels like “exciting,” “enjoyable,” and “establishing.” Using a software simulator, they generated a database of video clips with a diverse set of shot types and then leveraged crowdsourcing and AI to learn the relationship between the labels and certain semantic descriptors.

Videography can be a costly endeavor. Filming a short commercial runs $1,500 to $3,500 on the low end, a hefty expense for small-to-medium-sized businesses. This leads some companies to pursue in-house solutions, but not all have the expertise required to execute on a vision. AI like Facebook’s, as well as Disney’s and Pixar’s, could lighten the load in a meaningful way.


The coauthors of this new framework began by conducting a series of experiments to determine the “minimal perceptually valid step sizes” — i.e., the minimum number of shots a drone had to take — for various shot parameters. Next, they built a dataset of 200 videos using these steps and tasked volunteers from Amazon Mechanical Turk with assigning scores to semantic descriptors. The scores informed a machine learning model that mapped the descriptors to parameters that could guide the drone through shots. Lastly, the team deployed the framework to a real-world Parrot Bepop 2 drone, which they claim managed to generalize well to different actors, activities, and settings.
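
As a purely illustrative sketch of the descriptor-to-parameter mapping (not the authors' model; the shot-parameter names and score ranges here are assumptions), a simple regression fit on crowdsourced scores might look like this:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy crowdsourced data: each row holds a clip's descriptor scores
# ["exciting", "enjoyable", "establishing"]; each target row holds the
# shot parameters that produced the clip (illustrative names).
descriptor_scores = rng.uniform(1, 7, size=(200, 3))   # Likert-style scores
shot_params = rng.uniform(0, 1, size=(200, 4))         # e.g. distance, height, tilt, speed

# Least-squares linear map from descriptors to shot parameters.
X = np.hstack([descriptor_scores, np.ones((200, 1))])  # add bias column
W, *_ = np.linalg.lstsq(X, shot_params, rcond=None)

# A user asks for a very "exciting", mildly "enjoyable" shot.
query = np.array([6.5, 4.0, 2.0, 1.0])
print("suggested shot parameters:", query @ W)
```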


The researchers assert that while the framework targets nontechnical users, experts could adapt it to gain more control over the model’s outcome. For example, they could learn separate generative models for individual shot types and exert more direction over the model’s inputs and outputs.

“Our … model is able to successfully generate shots that are rated by participants as having the expected degrees of expression for each descriptor,” the researchers wrote. “Furthermore, the model generalizes well to other simulated scenes and to real-world footages, which strongly suggests that our semantic control space is not overly attached to specific features of the training environment nor to a single set of actor motions.”

In the future, the researchers hope to explore a larger set of parameters to control each shot, including lens zoom and potentially even soundtracks. They’d also like to extend the framework to take into account features like terrain and scenery.


The Retirement of SharePoint 2010 Could Destroy Your CRM Business Processes

October 22, 2020   CRM News and Info

Find out how one organization thought it was good to go, but after running the Modernization Scanner discovered thousands of broken workflows.

If you’ve been using SharePoint 2010 to manage your CRM system, you might be in trouble. As you may have heard, Microsoft announced that SharePoint 2010 workflows are being retired. This announcement stems from the company’s focus on adopting newer tools and technology and its push to have users quickly adopt Power Automate.

Given the company’s push toward Power Automate, this announcement was only a matter of time, but it could leave businesses that are still relying on these workflows in a tough spot. But what exactly is being turned off?

Here’s what we know so far: SharePoint 2010 workflows were switched off for newly created tenants on August 1, 2020, and migrating any workflows to a new tenant is no longer possible. To help make the transition away from SharePoint 2010 easier, here’s what you can expect.

Any workflows that are running on SharePoint 2010 in an existing tenant must be moved by November 1, 2020 and be recreated in either Power Automate or another system. If you don’t move them, they may stop running. 

Beginning on the first of November, Microsoft will start removing users’ ability to create or execute 2010 workflows. While you won’t be able to run existing 2010 workflows, you can still access them. Please note that as Microsoft continues to move towards newer technology, you may lose access to these workflows in the future. 

It’s expected that the following workflows will be affected by the retirement of SharePoint 2010:

  • Signature collection: This workflow routes a Microsoft document to gather electronic signatures. 
  • Gather feedback: The workflow that gathers feedback from specific people or parties. 
  • Classic pages publishing approval: When you need to send content or sample web pages to clients or customers for approval, this is the workflow you would use. 
  • Three-state: This workflow is used to manage business processes that require users to track a number of items or issues such as sales leads, project tasks, or support issues. 
  • Approvals: When you need approval from designated parties, this workflow routes the item or document to the right person. 

Moving SharePoint 2010 Workflows to Power Automate isn’t a simple task. While migrating simple workflows is relatively straightforward, migrating long, complicated processes can take much longer than the few months given by Microsoft. You don’t want to run the risk of losing access to the business processes you rely on after November 1 as this can create numerous problems for your organization. So what are you to do?

It’s recommended to get in touch with a Microsoft representative or Microsoft Partner that has experience in using Power Platform. JourneyTEAM is one of those partners. We’ve already assisted numerous clients in understanding what their options are as well as provided assistance in moving to Power Automate.

What about on-premises environments? They won’t be affected. Microsoft has specified that on-premises workflows for SharePoint 2010 and SharePoint 2013 will continue to be supported until 2026.

As of today, Microsoft has made no announcement that SharePoint 2013 workflows will be retired. However, while they’ll still be supported, they will be deprecated.

By default, all SharePoint 2013 workflows will be turned off for new tenants after November 1, 2020. However, should you need to activate 2013 workflows on a new tenant, you can do so using the PowerShell script provided by Microsoft.

Don’t think that 2010 workflows are safe in SharePoint 2013. While Microsoft has not yet announced the retirement of SharePoint 2013 workflows, it’s only a matter of time before that announcement comes, especially considering that Microsoft has repeatedly emphasized that its preferred workflow solution is Power Automate and Office 365. To be safe, any workflows you recreate in SharePoint 2013 should also be recreated in Power Automate to ensure you don’t lose access to them.

Moving your workflows directly to Power Automate can help you save hours of work. But migrating to the platform isn’t as straightforward as it seems. SharePoint 2010 and 2013 have features that differ from those in Power Automate, which means some workflows can be recreated and others can’t. You’ll need to determine how complex your workflows are and whether Power Automate supports all the actions contained in each existing workflow. To help you figure this out, Microsoft has created the Modernization Scanner.

While you only have a few months to migrate your workflows, there’s no need to start panicking. Your first step is analyzing your system and figuring out exactly which workflows will be turned off on November 1, 2020. 

This is exactly what the Modernization Scanner tool does. It scans your entire system and pinpoints exactly which workflows will be affected by the retirement. Plus, it can help with further modernization by providing a report that details:

  • Which lists and libraries can be modernized
  • Where rebuilding of classic portals needs to be done
  • Which classic workflows, InfoPaths, and blog pages are being used

The Modernization Scanner will show you exactly which workflows will be affected at the beginning of November. Additionally, the insight gained from the scanner will specify which SharePoint 2010 features can’t be recreated in Power Automate or another modern system. This enables you to work with your IT team to figure out how to recreate those actions and tasks within a modern server, so that your business processes won’t be disrupted. 

Even with a thorough knowledge of your system, you can never be 100% sure that there aren’t a few SharePoint 2010 workflows hiding in the shadows. Consider this: JourneyTEAM recently worked with a client that used the Modernization Scanner on one part of their system. When the scan was complete, they discovered 1,000 SharePoint 2010 workflows that several of their business processes relied heavily on. With the Modernization Scanner, they were able to avoid overloading their IT team with work at the beginning of November.

Now that you can see how the retirement of SharePoint will affect your business processes, it’s time to get to work. Here’s what you need to do: First, you’ll need to get in touch with someone from JourneyTEAM. After you’ve spoken with one of our team members, you’ll need to set up and run the Modernization Scanner in a specific tenant which will gather the data needed to determine which workflows will be affected by SharePoint’s retirement. Once you have that data, send it to your JourneyTEAM contact. Using that data, we’ll help you determine what, if any, steps need to be taken.

Working directly with our SharePoint Knowledge Management Team, you’ll get the exact support you need to modernize your SharePoint 2010 workflows. Whether you have questions about how to run the scanner or run into a few minor issues, our team can help. JourneyTEAM has numerous online resources available as well as tutorials and work sessions that can get you started. 

Don’t wait and see if your business processes will be affected after November 1. Contact JourneyTEAM today to get started. 

JourneyTEAM can help you integrate your solution with ease. Our team of professionals can help you navigate services and put them to work for you. Contact us today to discuss strategy or run a Modernization Scanner to see if you have any potential workflow threats.



About JourneyTEAM

Article by: Dave Bollard, Head of Marketing | 801-436-6636

JourneyTEAM has more than 25 years of experience in delivering IT solutions to businesses. As a Microsoft Gold Certified Partner, we’ve worked closely with Microsoft to carefully provide organizations with Microsoft products, including SharePoint, Azure, Office 365, Dynamics 365, SL, AX, GP, NAV, and CRM, and make them work for you. Whether you’re in need of a collaboration, marketing, sales, or productivity solution, JourneyTEAM can help you find the right technology for your business. To learn more about our team, visit our website at www.journeyteam.com.

 


Researchers detail texture-swapping AI that could be used to create deepfakes

July 8, 2020   Big Data

In a preprint paper published on Arxiv.org, researchers at the University of California, Berkeley and Adobe Research describe the Swapping Autoencoder, a machine learning model designed specifically for image manipulation. They claim it can modify any image in a variety of ways, including texture swapping, while remaining “substantially” more efficient compared with previous generative models.

The researchers acknowledge that their work could be used to create deepfakes, or synthetic media in which a person in an existing image or video is replaced with someone else’s likeness. In a human perceptual study, subjects were fooled 31% of the time by images created using the Swapping Autoencoder. But they also say that proposed detectors can successfully spot images manipulated by the tool at least 73.9% of the time, suggesting the Swapping Autoencoder is no more harmful than other AI-powered image manipulation tools.

“We show that our method based on an auto-encoder model has a number of advantages over prior work, in that it can accurately embed high-resolution images in real-time, into an embedding space that disentangles texture from structure, and generates realistic output images … Each code in the representation can be independently modified such that the resulting image both looks realistic and reflects the unmodified codes,” the coauthors of the study wrote.

The researchers’ approach isn’t novel in the sense that many AI models can edit portions of images to create new images. For example, the MIT-IBM Watson AI Lab released a tool that lets users upload photographs and customize the appearance of pictured buildings, flora, and fixtures, and Nvidia’s GauGAN can create lifelike landscape images that never existed. But these models tend to be challenging to design and computationally intensive to run.



By contrast, the Swapping Autoencoder is lightweight, using image swapping as a “pretext” task for learning an embedding space useful for image manipulation. It encodes a given image into two separate latent codes — a “structure” code and a “texture” code — intended to represent structure and texture, and during training, the structure code learns to correspond to the layout or structure of a scene while the texture codes capture properties about the scene’s overall appearance.
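
A minimal sketch of the structure/texture split, using a toy PyTorch encoder-decoder; this is not the paper's architecture, only an illustration of the code-swapping idea:

```python
import torch
import torch.nn as nn

class ToySwappingAutoencoder(nn.Module):
    """Encodes an image into a spatial 'structure' code and a global
    'texture' code; decoding (structure_a, texture_b) swaps appearance."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Conv2d(3, 16, kernel_size=3, padding=1)
        self.to_structure = nn.Conv2d(16, 8, kernel_size=1)   # keeps spatial layout
        self.to_texture = nn.AdaptiveAvgPool2d(1)             # collapses to a global vector
        self.decode = nn.Conv2d(8 + 16, 3, kernel_size=3, padding=1)

    def encode(self, x):
        h = torch.relu(self.backbone(x))
        structure = self.to_structure(h)   # (B, 8, H, W)
        texture = self.to_texture(h)       # (B, 16, 1, 1)
        return structure, texture

    def decode_pair(self, structure, texture):
        texture_map = texture.expand(-1, -1, *structure.shape[2:])
        return torch.sigmoid(self.decode(torch.cat([structure, texture_map], dim=1)))

model = ToySwappingAutoencoder()
a, b = torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64)
structure_a, _ = model.encode(a)
_, texture_b = model.encode(b)
hybrid = model.decode_pair(structure_a, texture_b)   # layout of a, appearance of b
print(hybrid.shape)  # torch.Size([1, 3, 64, 64])
```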

In an experiment, the researchers trained Swapping Autoencoder on a data set containing images of churches, animal faces, bedrooms, people, mountain ranges, and waterfalls and built a web app that offers fine-grained control over uploaded photos. The app supports global style editing and region editing as well as cloning, with a brush tool that replaces the structure code from another part of the image.

“Tools for creative expression are an important part of human culture … Learning-based content creation tools such as our method can be used to democratize content creation, allowing novice users to synthesize compelling images,” the coauthors wrote.


AI tools could improve fake news detection by analyzing users’ interactions and comments

June 1, 2020   Big Data

In a paper published on the preprint server Arxiv.org, researchers affiliated with Microsoft and Arizona State University propose an approach to detecting fake news that leverages a technique called weak social supervision. They say that by enabling the training of fake news-detecting AI even in scenarios where labeled examples aren’t available, weak social supervision opens the door to exploring how aspects of user interactions indicate news might be misleading.

According to the Pew Research Center, approximately 68% of U.S. adults got their news from social media in 2018 — which is worrisome considering misinformation about the pandemic continues to go viral, for instance. Companies from Facebook and Twitter to Google are pursuing automated detection solutions, but fake news remains a moving target owing to its topical and stylistic diverseness.

Building on a study published in April, the coauthors of this latest work suggest that weak supervision — where noisy or imprecise sources provide data labeling signals — could improve fake news detection accuracy without requiring fine-tuning. To this end, they built a framework dubbed Tri-relationship for Fake News (TiFN) that models social media users and their connections as an “interaction network” to detect fake news.

Interaction networks describe the relationships among entities like publishers, news pieces, and users; given an interaction network, TiFN’s goal is to embed different types of entities, following from the observation that people tend to interact with like-minded friends. In making its predictions, the framework also accounts for the fact that connected users are more likely to share similar interests in news pieces; that publishers with a high degree of political bias are more likely to publish fake news; and that users with low credibility are more likely to spread fake news.
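
To illustrate the interaction-network idea (a sketch only; the node attributes are invented and TiFN's actual embedding model is far more involved), publishers, news pieces, and users can be represented as typed nodes in a graph:

```python
import networkx as nx

G = nx.Graph()

# Node types in the interaction network.
G.add_node("publisher_1", kind="publisher", political_bias="high")
G.add_node("news_1", kind="news")
G.add_node("user_1", kind="user", credibility=0.2)
G.add_node("user_2", kind="user", credibility=0.9)

# Relations: publishers publish news, users share news, users follow users.
G.add_edge("publisher_1", "news_1", relation="publishes")
G.add_edge("user_1", "news_1", relation="shares")
G.add_edge("user_2", "user_1", relation="follows")

# Weak supervision signal: news shared mainly by low-credibility users and
# tied to a high-bias publisher is more likely to be fake.
sharers = [n for n in G.neighbors("news_1") if G.nodes[n]["kind"] == "user"]
avg_cred = sum(G.nodes[u]["credibility"] for u in sharers) / len(sharers)
print("average sharer credibility:", avg_cred)
```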


To test whether TiFN’s weak social supervision could help detect fake news effectively, the team validated it against a Politifact data set containing 120 true news pieces and 120 verifiably fake pieces shared among 23,865 users. Versus baseline detectors that consider only news content and some social interactions, they report that TiFN achieved between 75% and 87% accuracy, even with a limited amount of weak social supervision (within 12 hours after the news was published).

In another experiment involving a separate custom framework called Defend, the researchers sought to use as a weak supervision signal news sentences and user comments explaining why a piece of news is fake. Tested on a second Politifact data set consisting of 145 true news and 270 fake news pieces with 89,999 comments from 68,523 users on Twitter, they say that Defend achieved 90% accuracy.

“[W]ith the help of weak social supervision from publisher-bias and user-credibility, the detection performance is better than those without utilizing weak social supervision. We [also] observe that when we eliminate news content component, user comment component, or the co-attention for news contents and user comments, the performances are reduced. [This] indicates capturing the semantic relations between the weak social supervision from user comments and news contents is important,” wrote the researchers. “[W]e can see within a certain range, more weak social supervision leads to a larger performance increase, which shows the benefit of using weak social supervision.”


Google’s federated analytics method could analyze end user data without invading privacy

May 28, 2020   Big Data

In a blog post today, Google laid out the concept of federated analytics, a practice of applying data science methods to the analysis of raw data that’s stored locally on edge devices. As the tech giant explains, it works by running local computations over a device’s data and making only the aggregated results — not the data from the particular device — available to authorized engineers.

While federated analytics is closely related to federated learning, an AI technique that trains an algorithm across multiple devices holding local samples, it only supports basic data science needs. It’s “federated learning lite” — federated analytics enables companies to analyze user behaviors in a privacy-preserving and secure way, which could lead to better products. Google for its part uses federated techniques to power Gboard’s word suggestions and Android Messages’ Smart Reply feature.

“The first exploration into federated analytics was in support of federated learning: how can engineers measure the quality of federated learning models against real-world data when that data is not available in a data center? The answer was to re-use the federated learning infrastructure but without the learning part,” Google research scientist Daniel Ramage and software engineer Stefano Mazzocchi said in a statement. “In federated learning, the model definition can include not only the loss function that is to be optimized, but also code to compute metrics that indicate the quality of the model’s predictions. We could use this code to directly evaluate model quality on phones’ data.”

As an example, in a user study, Gboard engineers measured the overall quality of word prediction models against raw typing data held on phones. Participating phones downloaded a candidate model, locally computed a metric of how well the model’s predictions matched words that were actually typed, and then uploaded the metric without any adjustment to the model itself or any change to the Gboard typing experience. By averaging the metrics uploaded by many phones, engineers learned a population-level summary of model performance.
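
A stripped-down sketch of that evaluation loop, with each simulated phone computing a local metric on data that never leaves it and the server seeing only the aggregate; this is illustrative code, not Google's infrastructure:

```python
import numpy as np

rng = np.random.default_rng(0)

def local_metric(phone_typing_data, model_predict):
    """Runs on the device: fraction of typed words the model predicted."""
    hits = [model_predict(ctx) == word for ctx, word in phone_typing_data]
    return float(np.mean(hits))

# Simulated fleet: each phone holds its own (context, next-word) pairs.
phones = [[("hello", "world"), ("good", "morning")] for _ in range(100)]
toy_model = lambda ctx: "world" if ctx == "hello" else rng.choice(["morning", "night"])

# Each device uploads only a single number; the server averages them.
metrics = [local_metric(data, toy_model) for data in phones]
print("population-level accuracy:", round(float(np.mean(metrics)), 3))
```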


In a separate study, Gboard engineers wanted to discover words commonly typed by users and add them to dictionaries for spell-checking and typing suggestions. They trained a character-level recurrent neural network on phones, using only the words typed on these phones that weren’t already in the global dictionary. No typed words ever left the phones, but the resulting model could then be used in the datacenter to generate samples of frequently typed character sequences — i.e., the new words.

Beyond model evaluation, Google uses federated analytics to support the Now Playing feature on its Pixel phones, which shows what song might be playing nearby. Under the hood, Now Playing taps an on-device database of song fingerprints to identify music near a phone without the need for an active network connection.

When it recognizes a song, Now Playing records the track name into the on-device history, and when the phone is idle and charging while connected to Wi-Fi, Google’s federated learning and analytics server sometimes invites it to join a “round” of computation with hundreds of phones. Each phone in the round computes the recognition rate for the songs in its Now Playing history and uses a secure aggregation protocol to encrypt the results. The encrypted rates are sent to the federated analytics server, which doesn’t have the keys to decrypt them individually; when combined with the encrypted counts from the other phones in the round, the final tally of all song counts can be decrypted by the server.
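
The core trick behind such a protocol can be sketched with additive masking: each phone adds random masks that cancel only in the sum, so the server learns the total without seeing any individual count. This is a simplification; the real protocol also uses cryptographic key agreement and handles devices that drop out.

```python
import numpy as np

rng = np.random.default_rng(0)
counts = np.array([3, 0, 5, 1])            # per-phone song recognition counts

# Pairwise masks: phone i adds masks[i][j] and subtracts masks[j][i] for every
# peer j, so every mask cancels when the server sums the uploads.
n = len(counts)
masks = rng.integers(0, 1000, size=(n, n))
uploads = [counts[i] + masks[i].sum() - masks[:, i].sum() for i in range(n)]

print("individual uploads:", uploads)      # reveal nothing useful on their own
print("server-side total :", sum(uploads)) # equals counts.sum() == 9
```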

The result enables Google’s engineers to improve the song database without any phone revealing which songs were heard, for example, by making sure the database contains truly popular songs. Google claims that in its first improvement iteration, federated analytics resulted in a 5% increase in overall song recognition across all Pixel phones globally.

“We are also developing techniques for answering even more ambiguous questions on decentralized datasets like ‘what patterns in the data are difficult for my model to recognize?’ by training federated generative models. And we’re exploring ways to apply user-level differentially private model training to further ensure that these models do not encode information unique to any one user,” wrote Ramage and Mazzocchi. “It’s still early days for the federated analytics approach and more progress is needed to answer many common data science questions with good accuracy … [B]ut federated analytics enables us to think about data science differently, with decentralized data and privacy-preserving aggregation in a central role.”


AI Weekly: Machine learning could lead cybersecurity into uncharted territory

February 15, 2020   Big Data

Once a quarter, VentureBeat publishes a special issue to take an in-depth look at trends of great importance. This week, we launched issue two, examining AI and security. Across a spectrum of stories, the VentureBeat editorial team took a close look at some of the most important ways AI and security are colliding today. It’s a shift with high costs for individuals, businesses, cities, and critical infrastructure targets — data breaches alone are expected to cost more than $5 trillion by 2024 — and high stakes.

Throughout the stories, you may notice a theme: AI does not appear to be used much in cyberattacks today. However, cybersecurity companies increasingly rely on AI to identify threats and sift through data to defend targets.

Security threats are evolving to include adversarial attacks against AI systems; more expensive ransomware targeting cities, hospitals, and public-facing institutions; misinformation and spear phishing attacks that can be spread by bots on social media; and deepfakes and synthetic media that have the potential to become security vulnerabilities.

In the cover story, European correspondent Chris O’Brien dove into how the spread of AI in security can lead to less human agency in the decision-making process, with malware evolving to adapt and adjust to security firm defense tactics in real time. Should costs and consequences of security vulnerabilities increase, ceding autonomy to intelligent machines could begin to seem like the only right choice.

We also heard from security experts like McAfee CTO Steve Grobman, F-Secure’s Mikko Hypponen, and Malwarebytes Lab director Adam Kujawa, who talked about the difference between phishing and spear phishing, addressed an anticipated rise in personalized spear phishing attacks ahead, and spoke generally to the fears — unfounded and not — around AI in cybersecurity.

VentureBeat staff writer Paul Sawers took a look at how AI could be used to reduce the massive job shortage in the cybersecurity sector, while Jeremy Horwitz explored how cameras in cars and home security systems equipped with AI will impact the future of surveillance and privacy.

AI editor Seth Colaner examines how security and AI can seem heartless and inhuman but still rely heavily on people, who remain a critical factor in security, both as defenders and targets. Human susceptibility is still a big part of why organizations become soft targets, and education around how to properly guard against attacks can lead to better protection.

We don’t know yet the extent to which those carrying out attacks will come to rely on AI systems. And we don’t know yet if open source AI opened Pandora’s box, or to what extent AI might increase threat levels. One thing we do know is that cybercriminals don’t appear to need AI to be successful today.

I’ll leave it to you to read the special issue and draw your own conclusions, but one quote worth remembering comes from Shuman Ghosemajumder, formerly known as the “click fraud czar” at Google and now CTO at Shape Security, in Sawers’ article. “[Good actors and bad actors] are both automating as much as they can, building up DevOps infrastructure and utilizing AI techniques to try to outsmart the other,” he said. “It’s an endless cat-and-mouse game, and it’s only going to incorporate more AI approaches on both sides over time.”

For AI coverage, send news tips to Khari Johnson and Kyle Wiggers and AI editor Seth Colaner — and be sure to subscribe to the AI Weekly newsletter and bookmark our AI Channel.

Thanks for reading,

Khari Johnson

Senior AI Staff Writer


Jussie Smollett Could Return To Final Season Of ‘Empire’

January 4, 2020   Humor

The sixth and final season of the Fox musical drama Empire has continued on without the inclusion of Jussie Smollett, who played Jamal Lyon. However, in an interview with TV Line, showrunner Brett Mahoney revealed that discussions are underway for a possible Smollett return.

“It would be weird in my mind to end this family show and this family drama of which he was such a significant part of without seeing him,” Mahoney said. “It’s fair to say it’s being discussed, but there’s no plan as of yet to bring him back. There’s been no decision made.”

For those unfamiliar, Smollett was written out at the end of the fifth season following accusations that he filed a false police report claiming he was the victim of a racist and homophobic attack in Chicago in January. Charges against Smollett were eventually dropped in March.

Empire‘s midseason finale aired on Tuesday, December 17, 2019.




AI has a privacy problem, but these techniques could fix it

December 22, 2019   Big Data

Artificial intelligence promises to transform — and indeed, has already transformed — entire industries, from civic planning and health care to cybersecurity. But privacy remains an unsolved challenge in the industry, particularly where compliance and regulation are concerned.

Recent controversies put the problem into sharp relief. The Royal Free London NHS Foundation Trust, a division of the U.K.’s National Health Service based in London, provided Alphabet’s DeepMind with data on 1.6 million patients without their consent. Google — whose health data-sharing partnership with Ascension became the subject of scrutiny in November — abandoned plans to publish scans of chest X-rays over concerns that they contained personally identifiable information. This past summer, Microsoft quietly removed a data set (MS Celeb) with more than 10 million images of people after it was revealed that some weren’t aware they had been included.

Separately, tech giants including Apple and Google have been the subject of reports uncovering the potential misuse of recordings collected to improve assistants like Siri and Google Assistant. In April, Bloomberg revealed that Amazon employs contract workers to annotate thousands of hours of audio from Alexa-powered devices, prompting the company to roll out user-facing tools that quickly delete cloud-stored data.

Increasingly, privacy isn’t merely a question of philosophy, but table stakes in the course of business. Laws at the state, local, and federal levels aim to make privacy a mandatory part of compliance management. Hundreds of bills that address privacy, cybersecurity, and data breaches are pending or have already been passed in 50 U.S. states, territories, and the District of Columbia. Arguably the most comprehensive of them all — the California Consumer Privacy Act — was signed into law roughly two years ago. That’s not to mention the Health Insurance Portability and Accountability Act (HIPAA), which requires companies to seek authorization before disclosing individual health information. And international frameworks like the EU’s General Data Protection Regulation (GDPR) aim to give consumers greater control over personal data collection and use.

AI technologies have not historically been developed with privacy in mind. But a subfield of machine learning — privacy-preserving machine learning — seeks to pioneer approaches that might prevent the compromise of personally identifiable data. Of the emerging techniques, federated learning, differential privacy, and homomorphic encryption are perhaps the most promising.

Neural networks and their vulnerabilities

The so-called neural networks at the heart of most AI systems consist of functions (neurons) arranged in layers that transmit signals to other neurons. Those signals — the product of data, or inputs, fed into the network — travel from layer to layer and slowly “tune” the network, in effect adjusting the synaptic strength (weights) of each connection. Over time, the network extracts features from the data set and identifies cross-sample trends, eventually learning to make predictions.

Neural networks don’t ingest raw images, videos, audio, or text. Rather, samples from training corpora are encoded numerically as scalars (single numbers), vectors (ordered arrays of scalars), and matrices (scalars arranged into rows and columns). A fourth entity type, the tensor, encapsulates scalars, vectors, and matrices and generalizes them to an arbitrary number of dimensions, along with the linear transformations (or relations) that act on them.
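
As a rough illustration, with arbitrary shapes, these objects look like the following in NumPy; a batch of images is the kind of rank-4 tensor most frameworks actually consume.

```python
import numpy as np

scalar = np.float32(0.5)              # a single pixel intensity
vector = np.array([0.1, 0.5, 0.9])    # one row of pixels
matrix = np.zeros((28, 28))           # a 28x28 grayscale image
tensor = np.zeros((32, 28, 28, 3))    # a batch of 32 RGB images (rank-4 tensor)

for obj in (scalar, vector, matrix, tensor):
    print(np.ndim(obj), np.shape(obj))
```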

In spite of these transformations, it’s often possible to discern potentially sensitive information from the outputs of the neural network. The data sets themselves are vulnerable, too, because they’re not typically obfuscated, and because they’re usually stored in centralized repositories that are vulnerable to data breaches.

By far the most common form of machine learning reverse engineering is the membership inference attack, in which an attacker, armed with one or more data points, determines whether those points belonged to the corpus on which a target model was trained. As it turns out, removing sensitive information from a data set doesn’t mean it can’t be re-inferred, because AI is exceptionally good at recreating samples. Barring the use of privacy-preserving techniques, trained models incorporate compromising information about whatever data they’re fed.
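
As a rough sketch of the idea (not a reproduction of any published attack), a simple loss-threshold attack guesses that samples on which a target model’s loss is unusually low were members of its training set. The synthetic data, model choice, and threshold below are all illustrative stand-ins.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# A small "private" training set and a held-out set drawn from the same distribution.
X_train, y_train = rng.normal(size=(50, 10)), rng.integers(0, 2, 50)
X_out, y_out = rng.normal(size=(200, 10)), rng.integers(0, 2, 200)

target = LogisticRegression().fit(X_train, y_train)  # the model under attack

def per_sample_loss(model, X, y):
    # Cross-entropy loss of the model on each individual sample.
    p = model.predict_proba(X)[np.arange(len(y)), y]
    return -np.log(np.clip(p, 1e-12, 1.0))

# The attacker flags a sample as a "member" when its loss falls below a threshold
# (an illustrative value here; real attacks calibrate it, e.g., with shadow models).
threshold = 0.6
tpr = (per_sample_loss(target, X_train, y_train) < threshold).mean()
fpr = (per_sample_loss(target, X_out, y_out) < threshold).mean()
print(f"flagged as members: {tpr:.0%} of true members vs {fpr:.0%} of non-members")
```

The more a model overfits its training data, the larger the gap between those two rates, which is exactly the leakage membership inference exploits.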

In one study, researchers from the University of Wisconsin and the Marshfield Clinic Research Foundation were able to extract patients’ genomic information from a machine learning model that was trained to predict medical dosage. In another, Carnegie Mellon and University of Wisconsin-Madison research scientists managed to reconstruct specific head shot images from a model trained to perform facial recognition.

A more sophisticated data extraction attack employs generative adversarial networks, or GANs: two-part AI systems consisting of a generator that produces samples and a discriminator that attempts to distinguish the generated samples from real-world samples. The generator learns to produce samples closely resembling those in the original corpus without ever having direct access to them, learning the data’s distribution solely through its interaction with the discriminator.
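
For readers unfamiliar with the setup, the sketch below is a generic, minimal GAN trained on a toy one-dimensional distribution in PyTorch. It illustrates the generator/discriminator interplay described above, not the reconstruction attacks themselves, and every hyperparameter is arbitrary.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy "real" data: a Gaussian the generator never observes directly.
def real_batch(n=64):
    return torch.randn(n, 1) * 0.5 + 2.0

G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))                 # noise -> sample
D = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())   # sample -> P(real)

opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCELoss()

for step in range(2000):
    # 1) Train the discriminator to separate real samples from generated ones.
    real, fake = real_batch(), G(torch.randn(64, 8)).detach()
    loss_d = bce(D(real), torch.ones(64, 1)) + bce(D(fake), torch.zeros(64, 1))
    opt_d.zero_grad()
    loss_d.backward()
    opt_d.step()

    # 2) Train the generator to fool the discriminator.
    fake = G(torch.randn(64, 8))
    loss_g = bce(D(fake), torch.ones(64, 1))
    opt_g.zero_grad()
    loss_g.backward()
    opt_g.step()

samples = G(torch.randn(5000, 8))
print("generated mean/std:", samples.mean().item(), samples.std().item())  # drifts toward 2.0 / 0.5
```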

In 2017, researchers demonstrated that GANs could be trained to produce prototypical samples of a private set, revealing sensitive information from this set. In another study, a team used GANs to infer the samples that were used to train an image-generating machine learning model, with up to a 100% success rate in a “white-box” setting where they had access to the target model’s parameters (e.g., the variables a chosen AI technique uses to adjust to data).

Fortunately, there’s hope in the form of approaches like federated learning and differential privacy.

Federated learning

Quite simply, federated learning is a technique that trains an AI algorithm across decentralized devices or servers (i.e., nodes) holding data samples without exchanging those samples, enabling multiple parties to build a common machine learning model without pooling their data. That’s in contrast to classical decentralized approaches, which typically assume local data samples are identically distributed.

A central server might be used to orchestrate the steps of the algorithm and act as a reference clock, or the arrangement might be peer-to-peer (in which case no such server exists). Regardless, local models are trained on local data samples, and the weights are exchanged among the models at some frequency to generate a global model.

It’s an iterative process broken up into sets of interactions known as federated learning rounds, where each round consists of transmitting the current global model state to participating nodes. Local models are trained on the nodes to produce a set of potential model updates at each node, and then the local updates are aggregated and processed into a single global update and applied to the global model.
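
A schematic of one such round, assuming a toy linear-regression model and synthetic per-node data, might look like the sketch below in plain NumPy; real deployments add client sampling, secure aggregation, compression, and failure handling.

```python
import numpy as np

rng = np.random.default_rng(0)

# Each node holds its own private data for a toy linear-regression problem.
true_w = np.array([2.0, -1.0, 0.5])
nodes = []
for _ in range(5):
    X = rng.normal(size=(100, 3))
    y = X @ true_w + rng.normal(scale=0.1, size=100)
    nodes.append((X, y))

def local_update(w, X, y, lr=0.05, epochs=5):
    """Train the current global model on local data; only the weights leave the node."""
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

global_w = np.zeros(3)
for round_ in range(20):                                           # federated learning rounds
    local_ws = [local_update(global_w, X, y) for X, y in nodes]    # 1) train locally on each node
    sizes = [len(y) for _, y in nodes]
    global_w = np.average(local_ws, axis=0, weights=sizes)         # 2) aggregate into a global update

print(global_w)  # approaches true_w even though no raw samples were ever shared
```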

Federated learning has been deployed in production by Google, a federated learning pioneer. Google uses it for personalization in its Gboard predictive keyboard across “tens of millions” of iOS and Android devices. Alongside the Pixel 4 launch, Google debuted an improved version of its Now Playing music-recognizing feature that aggregates the play counts of songs in a federated fashion, identifying the most popular songs by locality to improve recognition. And the company recently debuted a module for its TensorFlow machine learning framework dubbed TensorFlow Federated, which is intended to make it easier to experiment with deep learning and other computations on decentralized data.

Of course, no technique is without its flaws; federated learning requires frequent communication among nodes during the learning process. In practice, exchanging model parameters demands significant amounts of bandwidth, processing power, and memory. Other challenges include an inability to inspect training examples, as well as bias stemming in part from the fact that models train only when power and a means of transmitting their parameters are available.

Differential privacy

Federated learning goes hand in hand with differential privacy, a system for publicly sharing information about a data set by describing patterns of groups within the corpus while withholding data about individuals. It usually entails injecting a small amount of noise into the raw data, or into the updates a model computes from that data, so that it becomes difficult for malicious actors to extract the original records from the trained model.

Intuitively, an algorithm can be considered differentially private if an observer seeing its output cannot tell if a particular individual’s information was used in the computation. A differentially private federated learning process, then, enables nodes to jointly learn a model while hiding what data any node holds.
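
The canonical illustration of that definition is the Laplace mechanism, sketched below with made-up data and an arbitrary epsilon: noise scaled to the query’s sensitivity (here 1, since one person can change a count by at most 1) is added to a statistic before it is released.

```python
import numpy as np

rng = np.random.default_rng(0)

def dp_count(values, predicate, epsilon=0.5, sensitivity=1.0):
    """Release a noisy count; adding or removing one individual shifts the true count by at most `sensitivity`."""
    true_count = sum(predicate(v) for v in values)
    noise = rng.laplace(loc=0.0, scale=sensitivity / epsilon)
    return true_count + noise

ages = [23, 35, 47, 51, 29, 62, 40]              # toy private data
print(dp_count(ages, lambda a: a > 40))          # noisy answer to "how many people are over 40?"
```

Smaller epsilon values mean more noise and stronger privacy, at the cost of a less accurate answer.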

TensorFlow Privacy, an open source library for TensorFlow, operates on this principle. Specifically, it trains models using a modified stochastic gradient descent that clips each per-example gradient update to a maximum norm, averages the clipped updates, and adds random noise to that average. This prevents the memorization of rare details, and it offers some assurance that two machine learning models will be statistically indistinguishable whether a person’s data is used in their training or not.
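
A plain-NumPy sketch of that clip-then-noise step, not TensorFlow Privacy’s actual API, and with an illustrative clipping norm and noise multiplier, looks roughly like this:

```python
import numpy as np

rng = np.random.default_rng(0)

def dp_sgd_step(w, per_example_grads, lr=0.1, clip_norm=1.0, noise_multiplier=1.1):
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        clipped.append(g * min(1.0, clip_norm / (norm + 1e-12)))  # 1) clip each per-example gradient
    avg = np.mean(clipped, axis=0)                                # 2) average the clipped gradients
    sigma = noise_multiplier * clip_norm / len(per_example_grads)
    noisy = avg + rng.normal(0.0, sigma, size=avg.shape)          # 3) add calibrated Gaussian noise
    return w - lr * noisy

# Toy usage: per-example gradients for a 3-parameter model.
grads = [rng.normal(size=3) for _ in range(32)]
w = dp_sgd_step(np.zeros(3), grads)
print(w)
```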

Apple has been using some form of differential privacy since 2017 to identify popular emojis, media playback preferences in Safari, and more, and the company combined it with federated learning in its latest mobile operating system release (iOS 13). Both techniques help to improve the results delivered by Siri, as well as apps like Apple’s QuickType keyboard and iOS’ Found In Apps feature. The latter scans both calendar and mail apps for the names of contacts and callers whose numbers aren’t stored locally.

For their part, researchers from Nvidia and King’s College London recently employed federated learning to train a neural network for brain tumor segmentation, a milestone Nvidia claims is a first for medical image analysis. Their model uses a data set from the BraTS (Multimodal Brain Tumor Segmentation) Challenge of 285 patients with brain tumors, and as with the approaches taken by Google and Apple, it leverages differential privacy to add noise to that corpus.

“This way, [each participating node] stores the updates and limits the granularity of the information that we actually share among the institutions,” Nicola Rieke, Nvidia senior researcher, told VentureBeat in a previous interview. “If you only see, let’s say, 50% or 60% of the model updates, can we still combine the contributions in the way that the global model converges? And we found out ‘Yes, we can.’ It’s actually quite impressive. So it’s even possible to aggregate the model in a way if you only share 10% of the model.”

Of course, differential privacy isn’t perfect, either. Any noise injected into the underlying data, input, output, or parameters impacts the overall model’s performance. In one study, after adding noise to a training data set, the authors noted a decline in predictive accuracy from 94.4% to 24.7%.

An alternative privacy-preserving machine learning technique — homomorphic encryption — suffers from none of those shortcomings, but it’s far from an ace in the hole.

Homomorphic encryption

Homomorphic encryption isn’t new (IBM researcher Craig Gentry developed the first fully homomorphic scheme in 2009), but it has gained traction in recent years, coinciding with advances in compute power and efficiency. It’s a form of cryptography that enables computation directly on encrypted data (ciphertexts), such that the result, once decrypted, exactly matches the result of the same operations performed on the unencrypted data. Using this technique, a “cryptonet” (i.e., a neural network that can be applied to encrypted data) can compute on a client’s data and return an encrypted result, which the client can then decrypt, using a key that was never shared publicly, to obtain the actual answer.
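
As a toy demonstration of the core property, the sketch below implements the additively homomorphic Paillier scheme with deliberately insecure parameters: multiplying two ciphertexts produces a ciphertext of the sum of the plaintexts. Production libraries rely on fully homomorphic schemes (such as BFV or CKKS) with far larger keys.

```python
import math
import random

# Toy Paillier keypair (tiny, insecure parameters; requires Python 3.8+ for pow(x, -1, n)).
p, q = 293, 433                                       # real deployments use primes of 1,024+ bits
n, n2 = p * q, (p * q) ** 2
lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)     # lcm(p - 1, q - 1)
g = n + 1

def L(x):
    return (x - 1) // n

mu = pow(L(pow(g, lam, n2)), -1, n)                   # modular inverse used during decryption

def encrypt(m):
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    return (L(pow(c, lam, n2)) * mu) % n

a, b = 42, 17
c_sum = (encrypt(a) * encrypt(b)) % n2                # multiplying ciphertexts...
print(decrypt(c_sum))                                 # ...decrypts to 59, the sum of the plaintexts
```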

“If I send my MRI images, I want my doctor to be able to see them immediately, but nobody else,” Jonathan Ballon, vice president of Intel’s IoT group, told VentureBeat in an interview earlier this year. “[Homomorphic] encryption delivers that, and in addition, the model itself is encrypted. So a company … can put that model [on a public cloud], and that [cloud provider] has no idea what their model looks like.”

In practice, homomorphic encryption libraries don’t yet fully leverage modern hardware, and they’re at least an order of magnitude slower than conventional models. But newer projects like cuHE, an accelerated encryption library, claim speedups of 12 to 50 times on various encrypted tasks over previous implementations. Moreover, libraries like PySyft and tf-encrypted — which are built on Facebook’s PyTorch machine learning framework and TensorFlow, respectively — have made great strides in recent months. So, too, have abstraction layers like HE-Transformer, a backend for nGraph (Intel’s neural network compiler) that delivers leading performance on some cryptonets.

In fact, just a few months ago, Intel researchers proposed nGraph-HE2, a successor to HE-Transformer that enables inference on standard, pretrained machine learning models using their native activation functions. They report in a paper that it was 3 times to 88 times faster at runtime in terms of scalar encoding (the encoding of a numeric value into an array of bits) with double the throughput, and that additional multiplication and addition optimizations yielded a further 2.6 times to 4.2 times runtime speedup.

IBM senior research scientist Flavio Bergamaschi has investigated the use of hardware at the edge to implement homomorphic encryption operations. In a recent study, he and colleagues deployed a local homomorphic database on a device equipped with an AI camera, enabling search to be performed directly on that camera. They report that performance was “homomorphically fast,” with lookup taking only 1.28 seconds per database entry, which amounted to a 200-entry query in five minutes.

“We are at what I call inflection points in performance,” he told VentureBeat in a recent phone interview. “Now, fully homomorphic encryption is fast enough in terms of performance that it’s perfectly adequate for certain use cases.”

On the production side, Bergamaschi and team worked with a U.S.-based banking client to encrypt a machine learning process using homomorphic techniques. That machine learning process — a linear regression model with well over a dozen variables — analyzed 24 months of transaction data from current account holders to predict the financial health of those accounts, partly to recommend products like loans. Motivated by the client’s privacy and compliance concerns, the IBM team encrypted the existing model and the transaction data in question, and they ran predictions using both the encrypted and unencrypted model to compare performance. While the former ran slower than the latter, the accuracy was the same.

“This is an important point. We showed that if we didn’t have any model for [our] prediction, we could take transaction data and perform the training of a new model in production,” Bergamaschi said.

Enthusiasm for homomorphic encryption has given rise to a cottage industry of startups aiming to bring it to production systems. Newark, New Jersey-based Duality Technologies, which recently attracted funding from one of Intel’s venture capital arms, pitches its homomorphic encryption platform as a privacy-preserving solution for “numerous” enterprises, particularly those in regulated industries. Banks can conduct privacy-enhanced financial crime investigations across institutions, so goes the company’s sales pitch, while scientists can tap it to collaborate on research involving patient records.

But like federated learning and differential privacy, homomorphic encryption offers no magic bullet. Even leading techniques can calculate only polynomial functions — a nonstarter for the many activation functions in machine learning that are non-polynomial. Plus, operations on encrypted data can involve only additions and multiplications of integers, which poses a challenge in cases where learning algorithms require floating point computations.
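
A common workaround is to swap non-polynomial activations for low-degree polynomial approximations, as early cryptonets did with a simple square activation. The sketch below, in plain NumPy with an illustrative interval and degree, fits such an approximation to ReLU and reports the error an encrypted model would inherit.

```python
import numpy as np

x = np.linspace(-3, 3, 601)
relu = np.maximum(0, x)

# Fit a degree-2 polynomial to ReLU over [-3, 3]; only additions and multiplications
# remain, which is what a homomorphic scheme can evaluate.
coeffs = np.polyfit(x, relu, deg=2)
approx = np.polyval(coeffs, x)

print("coefficients:", coeffs)
print("max absolute error on [-3, 3]:", np.abs(approx - relu).max())
```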

“In domains where you can take 10 seconds to turn around your inference, [homomorphic encryption] is fine, but if you need a three-millisecond turnaround time today, there’s just no way to do it,” Ballon said. “The amount of computation is too high, and this goes back to the domain of engineering.”

Since 2014, Bergamaschi and colleagues have experimented with hardware approaches to accelerating homomorphic operations. Historically, bandwidth has been the biggest stumbling block: accelerators yield strong benchmark performance individually, but not strong system-level performance overall, because shuttling the data those operations require between the processor and the accelerator consumes a great deal of bandwidth.

The solution might lie in techniques that make more efficient use of processors’ on-chip memory. A paper published by researchers at the Korea Advanced Institute of Science and Technology advocates the use of a combined cache for all normal and security-supporting data, as well as memory scheduling and mapping schemes for secure processors and a type-aware cache insertion module. They say that together, the combined approaches could reduce encryption performance degradation from 25%-34% to less than 8%-14% in typical 8-core and 16-core secure processors, with minimal extra hardware costs.

A long way to go

New techniques might solve some of the privacy issues inherent in AI and machine learning, but they’re in their infancy and not without their shortcomings.

Federated learning trains algorithms across decentralized edge devices without exchanging their data samples, but it’s difficult to inspect and at the mercy of fluctuations in power, compute, and network connectivity. Differential privacy, which shares aggregate information about a data set while withholding information about individuals, suffers dips in accuracy caused by the injected noise. As for homomorphic encryption, a form of encryption that allows computation on encrypted data, it remains slow and computationally demanding.

Nevertheless, folks like Ballon believe all three approaches are steps in the right direction. “This is very similar to going from HTTP to HTTPS,” Ballon said. “We’ll have the tools and capabilities to make [privacy in machine learning] seamless someday, but we’re not quite there yet.”

Big Data – VentureBeat
