Category Archives: Big Data

Researchers propose Porcupine, a compiler for homomorphic encryption

January 23, 2021   Big Data

Homomorphic encryption (HE) is a privacy-preserving technology that enables computational workloads to be performed directly on encrypted data. HE enables secure remote computation, as cloud service providers can compute on data without viewing highly sensitive content. But despite its appeal, performance and programmability challenges remain a barrier to HE’s widespread adoption.
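
To make the idea concrete: in an additively homomorphic scheme, arithmetic performed on ciphertexts survives decryption. The toy Paillier-style sketch below (Python, with deliberately tiny illustrative primes) is for intuition only; it is not secure and is not the scheme Porcupine targets.

```python
# Toy Paillier-style additively homomorphic encryption. Illustration only:
# the primes are far too small to be secure, and this is not the HE scheme
# that compilers such as Porcupine actually target.
import math
import random

p, q = 293, 433                      # tiny illustrative primes (assumption)
n, n_sq = p * q, (p * q) ** 2
g = n + 1
lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)   # lcm(p-1, q-1)

def L(x):
    return (x - 1) // n

mu = pow(L(pow(g, lam, n_sq)), -1, n)               # precomputed decryption factor

def encrypt(m):
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(g, m, n_sq) * pow(r, n, n_sq)) % n_sq

def decrypt(c):
    return (L(pow(c, lam, n_sq)) * mu) % n

# The homomorphic property: multiplying ciphertexts adds the plaintexts,
# so the sum is computed without ever seeing the inputs in the clear.
a, b = 17, 25
c_sum = (encrypt(a) * encrypt(b)) % n_sq
assert decrypt(c_sum) == a + b
print(decrypt(c_sum))                                # -> 42
```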

Realizing the potential of HE will likely require developing a compiler that can translate a plaintext, unencrypted codebase into encrypted code on the fly. In a step toward this, researchers at Facebook, New York University, and Stanford created Porcupine, a “synthesizing compiler” for HE. They say it results in speedups of up to 51% compared to heuristic-driven, entirely hand-optimized code.

Given a plaintext reference implementation, Porcupine synthesizes HE code that performs the same computation, the researchers explain. Internally, Porcupine models instruction noise, latency, behavior, and HE program semantics with a component called Quill. Quill enables Porcupine to reason about and search for HE kernels that are verifiably correct while minimizing the code’s latency and noise accumulation. The result is a suite that automates and optimizes the mapping and scheduling of plaintext to HE code.

In experiments, the researchers evaluated Porcupine using a range of image processing and linear algebra programs. According to the researchers, for small programs, Porcupine was able to find the same optimized implementations as hand-written baselines. And on larger, more complex programs, Porcupine discovered optimizations like factorization and even application-specific optimizations involving separable filters.

“Our results demonstrate the efficacy and generality of our synthesis-based compilation approach and further motivates the benefits of automated reasoning in HE for both performance and productivity,” the researchers wrote. “Porcupine abstracts away the details of constructing correct HE computation so that application designers can concentrate on other design considerations.”

Enthusiasm for HE has given rise to a cottage industry of startups aiming to bring it to production systems. Newark, New Jersey-based Duality Technologies, which recently attracted funding from one of Intel’s venture capital arms, pitches its HE platform as a privacy-preserving solution for “numerous” enterprises, particularly those in regulated industries. Banks can conduct privacy-enhanced financial crime investigations across institutions, so goes the company’s sales pitch, while scientists can tap it to collaborate on research involving patient records.

But HE offers no magic bullet. Even leading techniques can calculate only polynomial functions — a nonstarter for the many activation functions in machine learning that are non-polynomial. Plus, operations on encrypted data can involve only additions and multiplications of integers, which poses a challenge in cases where learning algorithms require floating point computations.


Center for Applied Data Ethics suggests treating AI like a bureaucracy

January 22, 2021   Big Data

A recent paper from the Center for Applied Data Ethics (CADE) at the University of San Francisco urges AI practitioners to adopt terms from anthropology when reviewing the performance of large machine learning models. The research suggests using this terminology to interrogate and analyze bureaucracy, states, and power structures in order to critically assess the performance of large machine learning models with the potential to harm people.

“This paper centers power as one of the factors designers need to identify and struggle with, alongside the ongoing conversations about biases in data and code, to understand why algorithmic systems tend to become inaccurate, absurd, harmful, and oppressive. This paper frames the massive algorithmic systems that harm marginalized groups as functionally similar to massive, sprawling administrative states that James Scott describes in Seeing Like a State,” the author wrote.

The paper was authored by CADE fellow Ali Alkhatib, with guidance from director Rachel Thomas and CADE fellows Nana Young and Razvan Amironesei.

The researchers particularly look to the work of James Scott, who has examined hubris in administrative planning and sociotechnical systems. In Europe in the 1800s, for example, timber industry companies began using abridged maps and a field called “scientific forestry” to carry out monoculture planting in grids. While the practice resulted in higher initial yields in some cases, productivity dropped sharply in the second generation, underlining the validity of scientific principles favoring diversity. Like those abridged maps, Alkhatib argues, algorithms can both summarize and transform the world and are an expression of the difference between people’s lived experiences and what bureaucracies see or fail to see.

The paper, titled “To Live in Their Utopia: Why Algorithmic Systems Create Absurd Outcomes,” was recently published and has been accepted to the ACM Conference on Human Factors in Computing Systems (CHI), which will be held in May.

Recalling Scott’s analysis of states, Alkhatib warns against harms that can result from unhampered AI, including the administrative and computational reordering of society, a weakened civil society, and the rise of an authoritarian state. Alkhatib notes that such algorithms can misread and punish marginalized groups whose experiences do not fit within the confines of data considered to train a model.

People privileged enough to be considered the default by data scientists and who are not directly impacted by algorithmic bias and other harms may see the underrepresentation of race or gender as inconsequential. Data Feminism authors Catherine D’Ignazio and Lauren Klein describe this as “privilege hazard.” As Alkhatib put it, “other people have to recognize that race, gender, their experience of disability, or other dimensions of their lives inextricably affect how they experience the world.”

He also cautions against uncritically accepting AI’s promise of a better world.

“AIs cause so much harm because they exhort us to live in their utopia,” the paper reads. “Framing AI as creating and imposing its own utopia against which people are judged is deliberately suggestive. The intention is to square us as designers and participants in systems against the reality that the world that computer scientists have captured in data is one that surveils, scrutinizes, and excludes the very groups that it most badly misreads. It squares us against the fact that the people we subject these systems to repeatedly endure abuse, harassment, and real violence precisely because they fall outside the paradigmatic model that the state — and now the algorithm — has constructed to describe the world.”

At the same time, Alkhatib warns people not to see AI-driven power shifts as inevitable.

“We can and must more carefully reckon with the parts we play in empowering algorithmic systems to create their own models of the world, in allowing those systems to run roughshod over the people they harm, and in excluding and limiting interrogation of the systems that we participate in building.”

Potential solutions the paper offers include undermining oppressive technologies and following the guidance of Stanford AI Lab researcher Pratyusha Kalluri, who advises asking whether AI shifts power, rather than whether it meets a chosen numeric definition of fair or good. Alkhatib also stresses the importance of individual resistance and refusal to participate in unjust systems to deny them power.

Other recent solutions include a culture change in computer vision and NLP, reduction in scale, and investments to reduce dependence on large datasets that make it virtually impossible to know what data is being used to train deep learning models. Failure to do so, researchers argue, will leave a small group of elite companies to create massive AI models such as OpenAI’s GPT-3 and the trillion-parameter language model Google introduced earlier this month.

The paper’s cross-disciplinary approach is also in line with a diverse body of work AI researchers have produced within the past year. Last month, researchers released the first details of OcéanIA, which treats a scientific project for identifying phytoplankton species as a challenge for machine learning, oceanography, and science. Other researchers have advised a multidisciplinary approach to advancing the fields of deep reinforcement learning and NLP bias assessment.

We’ve also seen analysis of AI that teams sociology and critical race theory, as well as anticolonial AI, which calls for recognizing the historical context associated with colonialism in order to understand which practices to avoid when building AI systems. And VentureBeat has written extensively about the fact that AI ethics is all about power.

Last year, a cohort of well-known members of the algorithmic bias research community created an internal algorithm-auditing framework to close AI accountability gaps within organizations. That work asks organizations to draw lessons from the aerospace, finance, and medical device industries. Coauthors of the paper include Margaret Mitchell and Timnit Gebru, who used to lead the Google AI ethics team together. Since then, Google has fired Gebru and, according to a Google spokesperson, opened an investigation into Mitchell.

With control of the presidency and both houses of Congress in the U.S., Democrats could address a range of tech policy issues in the coming years, from laws regulating the use of facial recognition by businesses, governments, and law enforcement to antitrust actions to rein in Big Tech. However, a 50-50 Senate means Democrats may be forced to consider bipartisan or moderate positions in order to pass legislation.

The Biden administration emphasized support for diversity and distaste for algorithmic bias in a televised ceremony introducing the science and technology team on January 16. Vice President Kamala Harris has also spoken passionately against algorithmic bias and automated discrimination. In the first hours of his administration, President Biden signed an executive order to advance racial equality that instructs the White House Office of Science and Technology Policy (OSTP) to participate in a newly formed working group tasked with disaggregating government data. This initiative is based in part on concerns that an inability to analyze such data impedes efforts to advance equity.


Soci raises $80 million to power data-driven localized marketing for enterprises

January 22, 2021   Big Data

Soci, a platform that helps brick-and-mortar businesses deploy localized marketing campaigns, has raised $80 million in a series D round of funding led by JMI Equity.

The raise comes at a crucial time for businesses, with retailers across the spectrum having to rapidly embrace ecommerce due to the pandemic. However, businesses with local brick-and-mortar stores will still be around in a post-pandemic world. By focusing on their “local” presence, including offering local pages (e.g. Facebook) and reviews (e.g. Google and Yelp), businesses can lure customers away from Amazon and its ilk. This is where Soci comes into play.

Founded in 2012, San Diego-based Soci claims hundreds of enterprise-scale clients, such as Hertz and Ace Hardware, which use the Soci platform to manage local search, reviews, and content across their individual business locations. It’s all about ensuring that companies maintain accurate and consistent location-specific information, which can be particularly challenging for businesses with thousands of outlets.

“For multi-location enterprises, the ability to connect with local audiences across the most influential marketing networks like Google, Yelp, and Facebook was critical to keeping their local businesses afloat through the pandemic,” Soci cofounder and CEO Afif Khoury told VentureBeat.

Moreover, Soci offers analytics that can help determine which locations are performing best in terms of social reach and engagement, integrating with all the usual touchpoints where businesses typically connect to customers, such as Facebook, Yelp, and Google.

“Soci is now housing and analyzing all of the most critical marketing data from every significant local marketing channel, such as search, social, reviews, and ads,” Khoury continued.

Above: Soci local marketing data

Soci had previously raised around $35 million, and with its latest cash injection the company plans to double down on sales and M&A activity. Its lead investor hints at the direction Soci is taking, given that JMI Equity is largely focused on enterprise software companies like financial planning platform Adaptive Insights, which Workday acquired a few years ago for more than $1.5 billion.

Looking to the future, Soci said it plans to enhance its data integrations, spanning all the common business tools used by enterprises, to build a more complete picture that meshes data from the physical and virtual worlds.

“As Soci continues to integrate with other important ecosystems and technologies such as CRM, point-of-sale, and rewards programs, it will begin to effectively combine online and offline data and deliver an extremely robust customer profile that will enrich the insights we provide and enable much more effective marketing and customer service strategies,” Khoury said.


Aurora partners with Paccar to develop driverless trucks

January 20, 2021   Big Data

Self-driving startup Aurora today announced a partnership with Paccar to build and deploy autonomous trucks. It’s Aurora’s first commercial application in trucking, and the company says it will combine its engineering teams around an “accelerated development program” to create driverless-capable trucks starting with the Peterbilt 579 and the Kenworth T680.

Some experts predict the pandemic will hasten adoption of autonomous vehicles for delivery. Self-driving cars, vans, and trucks promise to minimize the risk of spreading disease by limiting driver contact. This is particularly true with regard to short-haul freight, which is experiencing a spike in volume during the outbreak. The producer price index for local truckload carriage jumped 20.4% from July to August, according to the U.S. Bureau of Labor Statistics, most likely propelled by demand for short-haul distribution from warehouses and distribution centers to ecommerce fulfillment centers and stores.

Aurora — which recently acquired Uber’s Advanced Technologies Group, the ride-hailing company’s driverless vehicle division, reportedly for around $4 billion — says it will work with Paccar to create an “expansive” plan for future autonomous trucks. Aurora and Paccar plan to work closely on “all aspects of collaboration,” from component sourcing and vehicle technology enhancements to the integration of the Peterbilt and Kenworth vehicles with Aurora’s hardware, software, and operational services.

Aurora will test and validate the driverless Peterbilt and Kenworth trucks at Paccar’s technical center in Mt. Vernon, Washington, as well as on public roads. The companies expect them to be deployed in North America within the next several years, during which time Paccar and Aurora will evaluate additional collaboration opportunities with Peterbilt, Kenworth, and DAF truck models and geographies.

Aurora, which was cofounded by Chris Urmson, one of the original leaders of the Google self-driving car project that became Waymo, has its sights set on freight delivery for now. In January, Aurora said that after a year of focusing on capabilities including merging, nudging, and unprotected left-hand turns, its autonomous system — the Aurora Driver, which has been integrated into six different types of vehicles to date, including sedans, SUVs, minivans, commercial vans, and freight trucks — can perform each seamlessly, “even in dense urban environments.” More recently, Aurora, which recently said it has over 1,600 employees, announced it will begin testing driverless vehicles, including semi trucks, in parts of Texas.

Last year, Aurora raised investments from Amazon and others totaling $600 million at a valuation of over $2 billion, a portion of which it spent to acquire lidar sensor startup Blackmore. (Lidar, a fixture on many autonomous vehicle designs, measures the distance to target objects by illuminating them with laser light and measuring the reflected pulses.) Now valued at $10 billion, Pittsburgh-based Aurora has committed to hiring more workers, with a specific focus on mid- to senior-level engineers in software and infrastructure, robotics, hardware, cloud, and firmware. The ATG purchase could grow the size of its workforce from around 600 to nearly 1,200, accounting for ATG’s roughly 1,200 employees.

Paccar, which was founded in 1905, is among the largest manufacturers of medium- and heavy-duty trucks in the world. The company engages in the design, manufacture, and customer support of light-, medium- and heavy-duty trucks under the Kenworth, Peterbilt, Leyland Trucks, and DAF nameplates.

The value of goods transported as freight cargo in the U.S. was estimated to be about $50 billion each day in 2013. And the driverless truck market — which is anticipated to reach 6,700 units globally after totaling $54.23 billion in 2019 — stands to save the logistics and shipping industry $70 billion annually while boosting productivity by 30%. Besides promised cost savings, the growth of trucking automation has been driven by a shortage of drivers. In 2018, the American Trucking Associations estimated that 50,000 more truckers were needed to close the gap in the U.S., despite the sidelining of proposed U.S. Transportation Department screenings for sleep apnea.


Database trends: Why you need a ledger database

January 18, 2021   Big Data

The problem: The auto dealer can’t sell the car without being paid. The bank doesn’t want to loan the money without insurance. The insurance broker doesn’t want to write a policy without payment. The three companies need to work together as partners, but they can’t really trust each other.

When businesses need to cooperate, they need a way to verify and trust each other. In the past, they traded signed and sealed certificates. Today, you can deliver the same assurance with digital signatures, a mathematical approach that uses secret keys to let people or their computers validate data. Ledger databases are a new mechanism for marrying data storage with some cryptographic guarantees.

The use cases

Any place where people need to build a circle of trust is a good place to deploy a ledger database.

  • Cryptocurrencies like Bitcoin inspired the approach by creating a software tool for tracking the true owner of every coin. The blockchain run by the nodes in the Bitcoin network is a good example of how signatures can validate all transactions that change ownership.
  • Shipping companies need to track goods as they flow through a network of trucks, ships, and planes. Loss and theft can be minimized if each person along the way explicitly transfers control.
  • Manufacturers, especially those that create products like pharmaceuticals, want to make sure that no counterfeits enter the supply chain.
  • Coalitions, especially industry groups, often need to work together while still competing. A ledger database can share a record of events while providing some assurance that the history is accurate and unchanged.

The solution

Standard databases track a sequence of transactions that add, delete, or change entries. Ledger databases add a layer of digital signatures for each transaction so that anyone can audit the list and see that it was constructed correctly. More important, the audit shows that no one has gone back to adjust a previous transaction and change history, so to speak.

The digital signatures form a chain that links the individual rows or entries. Each signature is constructed to certify the data in the new row and also the data in the previous row. Taken together, all of the signatures added over time certify the sequence in which data was added to the log. An auditor can look at some or all of the signatures to make sure they’re correct.
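
As a rough illustration of that chaining (a hand-rolled sketch, not the internals of any particular ledger product), each appended row can carry a digest computed over its own contents plus the previous row’s digest, so an auditor can replay the chain and detect any rewrite:

```python
# Minimal append-only ledger sketch: each entry's digest covers the entry
# and the previous digest, so silently rewriting history breaks the chain.
import hashlib
import hmac
import json

SIGNING_KEY = b"demo-key-held-by-the-certifying-authority"   # illustrative only

def digest(entry: dict, prev_digest: str) -> str:
    payload = json.dumps(entry, sort_keys=True).encode() + prev_digest.encode()
    return hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()

def append(ledger: list, entry: dict) -> None:
    prev = ledger[-1]["digest"] if ledger else "genesis"
    ledger.append({"entry": entry, "digest": digest(entry, prev)})

def audit(ledger: list) -> bool:
    prev = "genesis"
    for row in ledger:
        if row["digest"] != digest(row["entry"], prev):
            return False                      # history was altered
        prev = row["digest"]
    return True

ledger = []
append(ledger, {"event": "loan approved", "amount": 25000})
append(ledger, {"event": "policy issued", "premium": 900})
assert audit(ledger)

ledger[0]["entry"]["amount"] = 1              # tamper with an old row...
assert not audit(ledger)                      # ...and the audit fails
```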

In the case of Bitcoin, the database tracks the flow of every coin over time since the system was created. The transactions are grouped together in blocks that are processed about every ten minutes, and taken together, the chain of these blocks provides a history of the owner of every coin.

Bitcoin also includes an elaborate consensus protocol where anyone can compete to solve a mathematical puzzle and validate the next block on the chain. This ritual is often called “mining” because the person who solves this computational puzzle is rewarded with several coins. The protocol was designed to remove the need for central control by one trusted authority — an attractive feature for some coin owners. It is open and offers a relatively clear mechanism for resolving disputes.
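
The puzzle itself boils down to searching for a nonce whose hash falls below a target. The sketch below is a toy version at trivially low difficulty, included only to illustrate the kind of computation being described; real Bitcoin mining works over full block headers at vastly higher difficulty.

```python
# Toy proof-of-work sketch: find a nonce whose SHA-256 hash has a given
# number of leading zero bits. Illustration of the "puzzle" only.
import hashlib

def mine(block_data: bytes, difficulty_bits: int = 16) -> int:
    target = 1 << (256 - difficulty_bits)
    nonce = 0
    while True:
        h = hashlib.sha256(block_data + nonce.to_bytes(8, "big")).digest()
        if int.from_bytes(h, "big") < target:
            return nonce                      # puzzle solved; block accepted
        nonce += 1

print(mine(b"block: alice pays bob 1 coin"))
```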

Many ledger databases avoid this elaborate ritual. The cost of competing to solve these mathematical puzzles is quite high because of the energy that computers consume while they’re solving the puzzle. The architects of these systems just decide at the beginning who will be the authority to certify the changes. In other words, they choose the parties that will create the digital signatures that bless each addition, without running a competition at each step.

In the example from the car sales process, each of the three entities may choose to validate each other’s transactions. In some cases, the database vendor also acts as an authority in case there are any external questions.

The legacy players

Database vendors have been adding cryptographic algorithms to their products for some time. All of the major companies, like Oracle or Microsoft, offer mechanisms for encrypting the data to add security and offer privacy. The same toolkits include algorithms that can add digital signatures to each database row. In many cases, the features are included in the standard licenses, or can be added for very little cost.

The legacy companies are also adding explicit features that simplify the process. Oracle, for instance, added blockchain tables to version 21c of its database. They aren’t much different from regular tables, but they only support inserting rows. Each row is pushed through a hash function, and then the result from the previous row is added as a column to the next row that’s inserted. Deletions are tightly controlled.

The major databases also tend to have encryption toolkits that can be integrated to achieve much the same assurance. One approach with MySQL adds a digital signature to the rows. It is often possible to adapt an existing database and schema to become a ledger database by adding an extra field to each row. If the signature of the previous row is added to the new row, a chain of authentication can be created.

The upstarts

There are hundreds of startups exploring this space. Some are tech companies that are approaching the ledger database space like database developers. You could think of some others as accidental database creators.

It is a bit of a reach to include all of the various cryptocurrencies as ledger databases in this survey, but they are all managing distributed blockchains that store data. Some, like Ethereum, offer elaborate embedded processing that can create arbitrary digital contracts. Some of the people who are nominally buying a crypto coin as an asset are actually using the purchase to store data in the currency’s blockchain.

The problem for many users is that the cost of storing data depends on the cost of creating a transaction, and in most cases, these can be prohibitive for regular applications. It might make sense for special transactions that are small enough, rare enough, and important enough to need the extra assurance that comes from a public blockchain. For this reason, most of the current users tend to be speculators or people who want to hold the currency, not groups that need to store a constant volume of bits.

Amazon is offering the Quantum Ledger Database, a pay-as-you-go service with what the company calls an “SQL-like API.” All writes are cryptographically sealed with the SHA-256 hash function, allowing any auditor to go through the history to double-check the time of all events. The pricing is based upon the volume of data stored, the size of any indices built upon the data, and the amount of data that leaves the service. (It’s worth noting that the word “quantum” is just a brand name. It does not imply that a quantum computer is involved.)

The Hyperledger Fabric is a tool that creates a lightly interconnected version of the blockchain that can be run inside of an organization and shared with some trusted partners. It’s designed for scenarios where a few groups need to work together with data that isn’t shared openly. The code is an open source constellation of a number of different programs, which means that it’s not as easy to adopt as a single database. IBM is one company that’s offering commercial versions, and many of the core routines are open source.

Microsoft’s Blockchain service is more elaborate. It’s designed to support arbitrary digital contracts, not just store some bits. The company offers both a service to store the data and a full development platform for creating an architecture that captures your workflow. The contracts can be set up either for your internal teams or across multiple enterprises to bind companies in a consortium.

BigchainDB is built on the MongoDB NoSQL model. Any MongoDB query will work. The database will track the changes and share them with a network of nodes that will converge upon the correct value. The consensus-building algorithms can survive failed nodes and recover.

Is there anything a ledger can’t do?

Because it’s just a service for storing data, any bits that might be stored in a traditional database can be stored in a ledger database. The cost of updating the cryptographic record for each transaction, though, may not be worth it for many high-volume applications that don’t need the extra assurance. Adding the extra digital signature requires more computation. It’s not a significant hurdle for low-volume tables like a bank account where there may be only a few transactions per day. The need for accuracy and trust far outweighs the costs. But it could be prohibitive for something like a log file of high-volume activity that has little need for assurance. If some fraction of a social media chat application disappeared tomorrow, the world would survive.

The biggest question is just how important it will be to trust the historical record in the future. If there’s only a slim chance that someone might want to audit the transaction journal, then the extra cost of computing the signatures or the hash values may not be worth it.

This article is part of a series on enterprise database technology trends.


Incoming White House science and technology leader on AI, diversity, and society

January 18, 2021   Big Data

Technologies like artificial intelligence and human genome editing “reveal and reflect even more about the complex and sometimes dangerous social architecture that lies beneath the scientific progress that we pursue,” Dr. Alondra Nelson said today at a televised ceremony introducing President-elect Joe Biden’s science team. On Friday, the Biden transition team appointed Nelson to the position of OSTP deputy director for science and society. Biden will be sworn in Wednesday to officially become the 46th president of the United States.

Nelson said in the ceremony that science is a social phenomenon and a reflection of people, their relationships, and their institutions. This means it really matters who’s in the room when new technology like AI is developed, she said. This is also why for much of her career she has sought to understand the perspectives of people who are not typically included in the development of emerging technology. Connections between our scientific and social worlds have never been as urgent as they are today, she said, and there’s never been a more important moment to situate scientific development in ethical values like equality, accountability, justice, and trustworthiness.

“When we provide inputs to the algorithm; when we program the device; when we design, test, and research; we are making human choices, choices that bring our social world to bear in a new and powerful way,” she said. “As a Black woman researcher, I am keenly aware of those who are missing from these rooms. I believe we have a responsibility to work together to make sure that our science and technology reflects us, and when it does it reflects all of us, that it reflects who we truly are together. This too is a breakthrough. This too is an innovation that advances our lives.”

Nelson’s comments allude to trends of pervasive algorithmic bias and a well-documented lack of diversity among teams deploying artificial intelligence. Those trends appear to have converged when Google fired AI ethics co-lead Timnit Gebru last month. Algorithmic bias has been shown to disproportionately and negatively impact the lives of Black people in a number of ways, including use of facial recognition leading to false arrests, adverse health outcomes for millions, and unfair lending practices. A study published last month found that diversity on teams developing and deploying artificial intelligence is a key to reducing algorithmic bias.

Dr. Eric Lander will be nominated to serve as director of the OSTP and presidential science advisor. In remarks today, he called America’s greatest asset its “unrivaled diversity” and spoke of science and tech policy that creates new industries and jobs but also ensures benefits of progress are “shared broadly among all Americans.”

“Scientific progress is about someone seeing something that no one’s ever seen before because they bring a different lens, different experiences, different questions, different passions. No one can top America in that regard, but we have to ensure that everyone not only has a seat at the table, but a place at the lab bench,” he said.

Biden also spoke at the ceremony, referring to the team he has assembled as one that will help “restore America’s hope in the frontier of science” while tackling advances in health care and challenges like climate change.

“We have the most diverse population in the world that’s in a democracy, and there’s so much we can do. I can’t tell you how excited we’ve been about doing this. We saved it for last. I know it’s not naming Department of Defense or attorney general, but I tell you what: You have more impact on what our children are going to face and our grandchildren are going to have opportunities to do than anyone,” he said.

As part of today’s announcement, Biden said the presidential science advisor will be a cabinet-level position for the first time in U.S. history. Vice President-elect Kamala Harris, whose mother worked as a scientist at UC Berkeley, also spoke. She concluded her remarks with an endorsement of funding for science, technology, engineering, and mathematics (STEM) education and an acknowledgment of Dr. Kizzmekia Corbett, a Black female scientist whose contributions helped create the Moderna COVID-19 vaccine.

The Biden-Harris campaign platform has also pledged to address some forms of algorithmic bias. While the Trump administration signed a few international agreements supporting trustworthy AI, the current president’s harsh immigration policy and bigoted rhetoric undercut any chance of leadership when it comes to addressing the ways algorithmic bias leads to discrimination or civil rights violations.

Earlier this week, members of the Trump administration introduced the AI Initiatives Office to guide a national AI strategy following the passage of the National Defense Authorization Act (NDAA). The AI Initiatives Office might be one of the only federal offices to depict a neural network and eagle in its seal.


Researchers propose using the game Overcooked to benchmark collaborative AI systems

January 15, 2021   Big Data

Deep reinforcement learning systems are among the most capable in AI, particularly in the robotics domain. However, in the real world, these systems encounter a number of situations and behaviors to which they weren’t exposed during development.

In a step toward systems that can collaborate with humans in order to help them accomplish their goals, researchers at Microsoft, the University of California, Berkeley, and the University of Nottingham developed a methodology for applying a testing paradigm to human-AI collaboration that can be demonstrated in a simplified version of the game Overcooked. Players in Overcooked control a number of chefs in kitchens filled with obstacles and hazards to prepare meals to order under a time limit.

The team asserts that Overcooked, while not necessarily designed with robustness benchmarking in mind, can successfully test potential edge cases in states a system should be able to handle as well as the partners the system should be able to play with. For example, in Overcooked, systems must contend with scenarios like when plates are accidentally left on counters and when a partner stays put for a while because they’re thinking or away from their keyboard.

Above: Screen captures from the researchers’ test environment.

The researchers investigated a number of techniques for improving system robustness, including training a system with a diverse population of other collaborative systems. Over the course of experiments in Overcooked, they observed whether several test systems could recognize when to get out of the way (like when a partner was carrying an ingredient) and when to pick up and deliver orders after a partner has been idling for a while.

According to the researchers, current deep reinforcement learning agents aren’t very robust — at least not as measured by Overcooked. None of the systems they tested scored above 65% in the video game, suggesting, the researchers say, that Overcooked can serve as a useful human-AI collaboration metric in the future.

“We emphasize that our primary finding is that our [Overcooked] test suite provides information that may not be available by simply considering validation reward, and our conclusions for specific techniques are more preliminary,” the researchers wrote in a paper describing their work. “A natural extension of our work is to expand the use of unit tests to other domains besides human-AI collaboration … An alternative direction for future work is to explore meta learning, in order to train the agent to adapt online to the specific human partner it is playing with. This could lead to significant gains, especially on agent robustness with memory.”


IBM acquires Taos to supplement its cloud expertise as workloads shift

January 15, 2021   Big Data

IBM announced that it’s acquiring Taos, a provider of managed and professional IT services with a strong focus on public cloud computing platforms. At the same time, Deloitte Consulting announced that it has completed its previously announced acquisition of HashedIn Technologies Private Limited, a software engineering and product development firm specializing in cloud native technologies.

Terms of both deals were undisclosed. However, they both come at a time when the number of workloads being shifted to public clouds has accelerated significantly during the COVID-19 pandemic. That shift is altering the center of data gravity in the enterprise in a way that requires IT services providers to add additional application and data management expertise that spans multiple clouds.

Most of the data that organizations manage today still resides in on-premises environments. It’s unlikely that all data will end up either on-premises or in the cloud; rather, organizations will find themselves managing data as it ebbs and flows across multiple centers of data gravity, said David Sun, director of corporate business development for IBM Services, in an interview with VentureBeat.

“All our clients are telling us their applications and data will reside in multiple clouds and hybrid cloud computing environments,” said Sun.

Taos, which will operate as an IBM company based in San Jose, California, will remain with IBM after the rest of IBM’s managed services business focused on infrastructure is spun out sometime next year. That relationship is similar to the one IBM is establishing with 7Summits, an IT services provider focused on the Salesforce platform that IBM acquired last week. This follows IBM’s announcement last month that it plans to acquire Nordcloud, a cloud consulting leader in Europe. Overall, mergers and acquisitions among IT services providers are at an all-time high.

The challenge enterprise IT teams face today is that there is a general shortage of cloud computing expertise. IBM and other IT service providers are counting on organizations to augment the limited IT expertise and resources they have with external cloud expertise. That expertise is even more sought after now because most enterprise IT organizations don’t have a lot of experience employing cloud native technologies such as containers, Kubernetes, and serverless computing frameworks.

A report published earlier this week by Information Services Group (ISG), a technology research and advisory firm, finds commercial outsourcing contracts with an annual contract value of $5 million or more involving as-a-service platform and managed services reached $16 billion in the fourth quarter of 2020. That’s up 13% over last year and 9% over the third quarter. Managed services specifically accounted for $7.2 billion for the quarter, which according to the report marks the first time managed service deal sizes have returned to their pre-pandemic levels.

A significant percentage of those managed services contracts revolve around cloud computing projects, said ISG president Steve Hall. Managed service providers (MSPs) are trying to balance a decline in demand for managing on-premises IT infrastructure against cloud computing opportunities that require more software expertise, Hall said. “Many MSPs have been focused on the data center,” he said.

The biggest challenge in the last year has been the simple fact that landing new business typically requires in-person meetings because of the level of trust that needs to be established between service providers and their end customers, noted Hall.

Less clear at the moment is to what degree organizations will ultimately rely more on outsourcing as IT environments become more complex. The overall percentage of IT consumed as a managed service has been relatively small compared to the trillions of dollars in applications and infrastructure that are managed internally by IT staff. However, Gartner is forecasting that the market for cloud professional services will exceed $200 billion by 2024.

Internal IT staff, of course, are often resistant to relying on service providers they perceive as a potential threat to their jobs. The challenge IT services providers face is either overcoming that bias or bypassing internal IT altogether by establishing a relationship with a business executive or CIO who believes external providers have a level of expertise in one area or another that is needed much sooner than an internal team can acquire it.


Ring rolls out end-to-end video encryption after a class action lawsuit

January 13, 2021   Big Data

In September, Amazon-owned Ring announced that it would bring end-to-end video encryption to its lineup of home security devices. While the company already encrypted videos in storage and during transmission, end-to-end encryption secures videos on-device, preventing third parties without special keys from decrypting and viewing the recordings. The feature launches today in technical preview for compatible Ring products.

The rollout of end-to-end encryption comes after dozens of plaintiffs filed a class action lawsuit against Ring, alleging they had been subjected to death threats, racial slurs, and blackmail after their Ring cameras were hacked. In 2019, a data leak exposed the personal information of over 3,000 Ring users, including log-in emails, passwords, time zones, and the names people give to specific Ring cameras. Following the breach, Ring began requiring two-step verification for user sign-ins and launched a compromised password check feature that cross-references login credentials against a list of known compromised passwords.

In a whitepaper, Ring explains that end-to-end encryption, which is available as a setting within the Ring app, is designed so users can view videos on enrolled smartphones only. Videos are encrypted with keys that are themselves encrypted with an algorithm that creates a public and private key. The public key encrypts, but the private key is required to decrypt. Only users have access to the private key, which is stored on their smartphone and decrypts the symmetric key, and by extension, encrypted videos.
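
Conceptually this is envelope encryption: a symmetric key protects the bulky video, and only an asymmetric key pair held on the user’s phone can unwrap that symmetric key. The sketch below, built on the Python cryptography package, shows the general pattern only; it is not Ring’s actual implementation or key hierarchy.

```python
# Envelope-encryption sketch (illustrative pattern only, not Ring's scheme).
from cryptography.fernet import Fernet
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding

# Key pair that, in the real system, would live only on the user's device.
device_private = rsa.generate_private_key(public_exponent=65537, key_size=2048)
device_public = device_private.public_key()

video = b"...raw video bytes..."

# 1. Encrypt the video with a fresh symmetric key.
video_key = Fernet.generate_key()
encrypted_video = Fernet(video_key).encrypt(video)

# 2. Wrap (encrypt) the symmetric key with the device's public key.
oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)
wrapped_key = device_public.encrypt(video_key, oaep)

# Only the holder of the private key can unwrap the key and view the video.
recovered_key = device_private.decrypt(wrapped_key, oaep)
assert Fernet(recovered_key).decrypt(encrypted_video) == video
```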

When a user opts into end-to-end encryption, the Ring app presents a 10-word auto-generated passphrase used to secure the cryptographic keys. (Ring says these words are randomly selected from a dictionary of 7,776.) The passphrase, which can be used to enroll additional smartphones, is generated on-device. But the public portion of the instance key pair and the account data key pair are copied to the Ring cloud after being signed by the account-signing key, as are the locally encrypted private portions of the account-signing key pair and the account data key pair.
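
A 7,776-entry dictionary is the size of a standard diceware wordlist (6^5 five-dice combinations), so each word contributes roughly 12.9 bits of entropy. Below is a minimal sketch of how such a passphrase might be drawn, assuming a hypothetical local wordlist file rather than Ring’s actual generator.

```python
# Diceware-style passphrase sketch. The wordlist path is a hypothetical
# stand-in; this is not Ring's actual generator.
import math
import secrets

with open("wordlist_7776.txt") as f:               # assumed local wordlist
    words = [line.split()[-1] for line in f if line.strip()]
assert len(words) == 7776                          # 6**5 entries

passphrase = " ".join(secrets.choice(words) for _ in range(10))
print(passphrase)
print(10 * math.log2(7776))                        # ~129 bits of entropy
```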

Ring notes that end-to-end encryption disables certain features, including AI-dependent features that decrypt videos for processing work like motion verification and people-only mode. However, Live View, which decrypts video locally on-device, will continue to run while end-to-end encryption is enabled. And users can share videos through Ring’s controversial Neighbors Public Safety Service, which connects residents with local law enforcement, by downloading an end-to-end encrypted video to their smartphone, which saves it in decrypted form.

Users can switch off end-to-end encryption at any time, but any videos encrypted with end-to-end encryption can’t be decrypted; the keys to access those videos are removed permanently in the process. Conversely, turning on end-to-end encryption doesn’t encrypt any videos created before enrollment because the service only encrypts videos created post-enrollment.

Ring recently made headlines for a deal it reportedly struck with over 400 police departments nationwide that would allow authorities to request that owners volunteer footage from Ring cameras within a specific time and location. Ring, which has said it would not hand over footage if confronted with a subpoena but would comply when given a search warrant, has law enforcement partnerships in more than 1,300 cities.

Advocacy groups like Fight for the Future and the Electronic Frontier Foundation have accused Ring of using its cameras and Neighbors app (which delivers safety alerts) to build a private surveillance network via police partnerships. The Electronic Frontier Foundation in particular has singled Ring out for marketing strategies that foster fear and promote a sale-spurring “vicious cycle,” and for “[facilitating] reporting of so-called ‘suspicious’ behavior that really amounts to racial profiling.”


Google trained a trillion-parameter AI language model

January 12, 2021   Big Data

Parameters are the key to machine learning algorithms. They’re the part of the model that’s learned from historical training data. Generally speaking, in the language domain, the correlation between the number of parameters and sophistication has held up remarkably well. For example, OpenAI’s GPT-3 — one of the largest language models ever trained, at 175 billion parameters — can make primitive analogies, generate recipes, and even complete basic code.

In what might be one of the most comprehensive tests of this correlation to date, Google researchers developed and benchmarked techniques they claim enabled them to train a language model containing more than a trillion parameters. They say their 1.6-trillion-parameter model, which appears to be the largest of its kind to date, achieved an up to 4 times speedup over the previously largest Google-developed language model (T5-XXL).

As the researchers note in a paper detailing their work, large-scale training is an effective path toward powerful models. Simple architectures, backed by large datasets and parameter counts, surpass far more complicated algorithms. But effective, large-scale training is extremely computationally intensive. That’s why the researchers pursued what they call the Switch Transformer, a “sparsely activated” technique that uses only a subset of a model’s weights, or the parameters that transform input data within the model.

The Switch Transformer builds on mixture of experts, an AI model paradigm first proposed in the early ’90s. The rough concept is to keep multiple experts, or models specialized in different tasks, inside a larger model and have a “gating network” choose which experts to consult for any given data.
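
A minimal numpy sketch of that gating idea: a router scores each token, and only the single highest-scoring expert processes it (top-1 routing, as the Switch Transformer does). The toy shapes and random weights below are illustrative assumptions, not the paper’s configuration.

```python
# Top-1 mixture-of-experts routing sketch (illustrative, not the paper's code).
import numpy as np

rng = np.random.default_rng(0)
d_model, d_ff, n_experts, n_tokens = 16, 32, 4, 8   # toy sizes (assumptions)

router_w = rng.normal(size=(d_model, n_experts))
experts_w1 = rng.normal(size=(n_experts, d_model, d_ff))
experts_w2 = rng.normal(size=(n_experts, d_ff, d_model))

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def switch_layer(tokens):                       # tokens: (n_tokens, d_model)
    gate = softmax(tokens @ router_w)           # router probability per expert
    chosen = gate.argmax(axis=-1)               # top-1: one expert per token
    out = np.zeros_like(tokens)
    for e in range(n_experts):                  # only the chosen experts do work
        idx = np.where(chosen == e)[0]
        if idx.size == 0:
            continue
        h = np.maximum(tokens[idx] @ experts_w1[e], 0.0)      # ReLU FFN expert
        out[idx] = gate[idx, e:e + 1] * (h @ experts_w2[e])   # scale by gate prob
    return out

print(switch_layer(rng.normal(size=(n_tokens, d_model))).shape)   # (8, 16)
```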

The novelty of the Switch Transformer is that it efficiently leverages hardware designed for dense matrix multiplications — mathematical operations widely used in language models — such as GPUs and Google’s tensor processing units (TPUs). In the researchers’ distributed training setup, their models split unique weights on different devices so the weights increased with the number of devices but maintained a manageable memory and computational footprint on each device.

In an experiment, the researchers pretrained several different Switch Transformer models using 32 TPU cores on the Colossal Clean Crawled Corpus, a 750GB dataset of text scraped from Reddit, Wikipedia, and other web sources. They tasked the models with predicting missing words in passages where 15% of the words had been masked out, as well as other challenges, like retrieving text to answer a list of increasingly difficult questions.
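
The masking objective itself is easy to illustrate. The sketch below assumes simple whitespace tokenization and an illustrative sentinel token; real preprocessing for models like this uses subword vocabularies and sentinel IDs.

```python
# Sketch of the masked-word pretraining objective (15% of tokens hidden).
import random

random.seed(0)
MASK_RATE = 0.15

def mask_tokens(text):
    tokens = text.split()
    targets = {}
    for i in range(len(tokens)):
        if random.random() < MASK_RATE:
            targets[i] = tokens[i]           # what the model must predict
            tokens[i] = "<mask>"             # hypothetical sentinel token
    return tokens, targets

masked, targets = mask_tokens("the switch transformer routes each token to a single expert")
print(masked)     # e.g. ['the', 'switch', '<mask>', 'routes', ...]
print(targets)    # e.g. {2: 'transformer', ...}
```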

The researchers claim their 1.6-trillion-parameter model with 2,048 experts (Switch-C) exhibited “no training instability at all,” in contrast to a smaller model (Switch-XXL) containing 395 billion parameters and 64 experts. However, on one benchmark — the Stanford Question Answering Dataset (SQuAD) — Switch-C scored lower (87.7) versus Switch-XXL (89.6), which the researchers attribute to the opaque relationship between fine-tuning quality, computational requirements, and the number of parameters.

This being the case, the Switch Transformer led to gains in a number of downstream tasks. For example, it enabled an over 7 times pretraining speedup while using the same amount of computational resources, according to the researchers, who demonstrated that the large sparse models could be used to create smaller, dense models fine-tuned on tasks with 30% of the quality gains of the larger model. In one test where a Switch Transformer model was trained to translate between over 100 different languages, the researchers observed “a universal improvement” across 101 languages, with 91% of the languages benefitting from an over 4 times speedup compared with a baseline model.

“Though this work has focused on extremely large models, we also find that models with as few as two experts improve performance while easily fitting within memory constraints of commonly available GPUs or TPUs,” the researchers wrote in the paper. “We cannot fully preserve the model quality, but compression rates of 10 to 100 times are achievable by distilling our sparse models into dense models while achieving ~30% of the quality gain of the expert model.”

In future work, the researchers plan to apply the Switch Transformer to “new and across different modalities,” including image and text. They believe that model sparsity can confer advantages in a range of different media, as well as multimodal models.

Unfortunately, the researchers’ work didn’t take into account the impact of these large language models in the real world. Models often amplify the biases encoded in their public training data; a portion of that data is not uncommonly sourced from communities with pervasive gender, race, and religious prejudices. AI research firm OpenAI notes that this can lead to placing words like “naughty” or “sucked” near female pronouns and “Islam” near words like “terrorism.” Other studies, like one published in April by Intel, MIT, and Canadian AI initiative CIFAR researchers, have found high levels of stereotypical bias from some of the most popular models, including Google’s BERT and XLNet, OpenAI’s GPT-2, and Facebook’s RoBERTa. This bias could be leveraged by malicious actors to foment discord by spreading misinformation, disinformation, and outright lies that “radicalize individuals into violent far-right extremist ideologies and behaviors,” according to the Middlebury Institute of International Studies.

It’s unclear whether Google’s policies on published machine learning research might have played a role in this. Reuters reported late last year that researchers at the company are now required to consult with legal, policy, and public relations teams before pursuing topics such as face and sentiment analysis and categorizations of race, gender, or political affiliation. And in early December, Google fired AI ethicist Timnit Gebru, reportedly in part over a research paper on large language models that discussed risks, including the impact of their carbon footprint on marginalized communities and their tendency to perpetuate abusive language, hate speech, microaggressions, stereotypes, and other dehumanizing language aimed at specific groups of people.
