Category Archives: Big Data

Mobileye, Intel’s $15.3 billion computer vision acquisition, partners with Nissan to crowdsource maps for autonomous cars

Mobileye, the Israeli computer vision firm that’s currently in the process of being acquired by chip giant Intel for $15.3 billion, has announced a new tie-up with automotive giant Nissan to generate “anonymized, crowdsourced data” for precision mapping in autonomous cars.

Founded in 1999, Mobileye builds the visual smarts behind cars’ driver assistance systems that include adaptive cruise control and lane departure warnings. Its technology is already used by the likes of BMW, Volvo, Buick, and Cadillac, and last summer Mobileye announced a partnership with BMW and Intel to put self-driving cars into full production by 2021. The trio later committed to putting 40 autonomous test cars on public roads in the second half of 2017, before Intel went all-in and decided to buy Mobileye outright.

For self-driving cars to become a reality, carmakers and the technology companies they work with need access to accurate maps of roads and the environment around which autonomous cars will operate — these high-definition maps complement on-board sensors and add an additional level of safety. Mobileye’s existing Road Experience Management (REM) engine is essentially a mapping and localization toolset that can use any camera-equipped vehicle to capture and process data around geometry and physical landmarks, send it to the cloud, and then feed this back into autonomous car systems using minimal bandwidth. It’s basically a crowdsourced data collection effort using millions of cars already on the roads.

Mobileye already powers Nissan’s recently announced ProPilot, a system that’s similar to Tesla’s AutoPilot offering, which can already automate some car functions on the road, including steering and acceleration. And with Nissan recently kicking off its first self-driving car trials in London, it seems now is the time for Mobileye to work with Nissan to boost its crowdsourced mapping efforts.

This adds another major automaker to Mobileye’s existing roster of REM partners, which includes the previously announced General Motors and Volkswagen, while the likes of Audi, BMW, and Daimler are on board via their ownership of the HERE mapping platform that partnered with Mobileye last year.

The more carmakers sign up to integrate Mobileye’s REM, the more data can be combined to scale the system to cover every locale where humans drive.

Big Data – VentureBeat

VentureBeat is hiring an AI reporter

We’re looking for an experienced reporter to help lead our coverage of artificial intelligence.

As startups and big corporations invest money and talent into AI, VentureBeat aims to cover both the broad ways AI will change life as we know it and the technical infrastructure underpinning it.

As VentureBeat’s AI reporter, you’ll help define our daily coverage of AI and cloud technology — from incremental developments to breakthroughs that may one day beat the Turing Test, cross the Uncanny Valley, and make self-driving cars possible. We also appreciate an appropriately jaundiced view of the many consumer apps already powered by AI. You’ll be responsible for covering breaking news on this topic in a fast-paced newsroom, developing and maintaining key industry contacts, and turning those connections into scoops.

Please be available to work from our San Francisco headquarters. This is a full-time, salaried position with health benefits and a flexible time-off policy. Candidates should have at least two years of journalistic experience writing on deadline in a fast-paced online newsroom.

Finally, it would be great if you love to read VentureBeat. Seriously, though, you should already read VentureBeat!

Please send a resume (or LinkedIn page) and cover letter containing three links to your best stories to Questions? Please get in touch (with “AI reporter” in the subject line).

Big Data – VentureBeat

How Investing in Data Quality Saves You Money

You know that data quality helps your data analysts perform their jobs better and can smooth the transfer of data across your organization. But data quality is crucial for another reason: It translates directly to cold, hard cash. Here’s how to increase your data quality return on investment…

When it comes to big data, you may think of expensive storage infrastructure and sophisticated platforms like Hadoop as the most significant places to invest your money. But while it’s true that you should invest in data storage and analytics technology, investing in data quality is equally crucial.

The reason is that poor-quality data can undercut your business operations in a variety of ways. No matter how much you spend on analytics – or marketing, recruitment, planning and other endeavors based on those analytics – you’re shooting yourself in the foot if the data you’re working with is subject to inconsistencies, inaccuracies, and other quality issues.

Consider the following ways in which investing in data quality can save you – or earn you – big money.

Making the most of marketing

Marketing is key to attracting and keeping customers. If your marketing team’s efforts are based on low-quality data, they will chronically come up short.

Think about it. If the email addresses you collect for prospects are not accurate, your marketing campaigns will end up in black holes. If the data you collect about customer preferences turns out to be inconsistent, your marketing team will make plans based on information that doesn’t reflect reality.

The list of marketing problems that can result from low-quality data could go on. The point is that your return on the investment you make in marketing efforts is only as great as the quality of the data at the foundation of your marketing campaigns.

Keeping customers happy

In addition to using marketing to attract new customers, you also want to keep the customers you have. Quality data is key here, too.

Why? Because your ability to meet and exceed the expectations of your customers is largely based on the accuracy of the data you collect about their preferences and behavior. If time zone information in your database of customer transactions is incorrect, you might end up inaccurately predicting when customer demand for your services is highest. As a result, you’ll fall short of being able to guarantee service availability when your customers want it most.
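The time zone example above can be made concrete with a short sketch. It assumes a handful of hypothetical transaction timestamps stored in UTC and a hypothetical helper, peak_hour; neither comes from the original post. It shows how recording the wrong UTC offset moves the apparent peak of customer demand by several hours.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

# Hypothetical transaction timestamps, stored in UTC.
utc_times = [
    datetime(2017, 5, 1, 17, 5, tzinfo=timezone.utc),
    datetime(2017, 5, 1, 17, 40, tzinfo=timezone.utc),
    datetime(2017, 5, 1, 18, 10, tzinfo=timezone.utc),
]


def peak_hour(times, utc_offset_hours):
    """Return the busiest local hour, given the customers' UTC offset."""
    local = timezone(timedelta(hours=utc_offset_hours))
    hours = Counter(t.astimezone(local).hour for t in times)
    return hours.most_common(1)[0][0]


# Offset assumed to be the correct one for these customers (UTC-7).
print(peak_hour(utc_times, -7))
# Offset wrongly recorded as UTC+0: the peak appears seven hours later.
print(peak_hour(utc_times, 0))
```

With the assumed correct offset the peak falls at 10 in the morning local time; with the wrong offset it appears to fall at 17, which is the kind of error that would put staff and capacity in the wrong part of the day.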

As another example, consider the importance of making sure you maintain accurate data about your customers in order to deliver excellent customer service. When a customer calls you with a complaint or question, you don’t want to misidentify him or her because of inaccurate information linking a phone number to a name. Nor do you want to route customer calls to the wrong call center because the data you have about a customer’s geographic location is wrong.

Staying compliant

Compliance is money – there’s no denying that – and quality data can do much to help ensure that you meet regulatory compliance requirements.

Without quality data, you may end up failing to secure sensitive customer information as required by compliance policies because you have no way of separating data that needs to be protected from the rest of the information you store. Or, you may run afoul of regulatory authorities because you can’t rely on low-quality data to detect and prevent fraudulent activity, another area where data quality is key.

Keeping employees happy

Good employees are hard to find, and they can be even harder to keep. That’s especially true if poor data gets in the way of their ability to do their jobs.

Whether they’re in marketing, HR, legal, development or any other area of your organization, most employees depend on data to accomplish what you expect of them. If you are unable to deliver the quality data they require, they’ll become frustrated and less productive. They may ultimately choose to look for work elsewhere.

Low employee productivity and high turnover rates translate to higher staffing costs.

Keeping operations efficient

Just as your employees can’t do their job without quality data, so will your business fail to operate efficiently without good data.

In a world where data is at the root of almost everything a company does, inaccurate or inconsistent data slows down processes, creates unnecessary delays, introduces problems that teams have to scramble to fix, and so on.

Quality data helps you avoid these mistakes and keep your business lean and mean – which translates to greater cost-efficiency.

Achieving data quality with Syncsort

With Syncsort, you can ensure data quality and streamline your data integration workflows at the same time. That’s because Syncsort has added Trillium’s data quality software to its suite of Big Data integration solutions.

To learn more about how Syncsort helps you maximize your data quality return on investment by ensuring the quality of even the hardest-to-manage data – legacy data – check out the latest eBook, “Bringing Big Data to Life.”

Syncsort blog

Blog Update: How Are Companies Using Hadoop?

It’s no secret that Hadoop is popular, and our readers have shown that they’re interested to hear the latest on all things Hadoop. On the heels of our annual Hadoop Survey, we recently updated a blog post originally published in 2015, titled “Who Is Using Hadoop? And What Are They Using It For?” The revised post provides a more accurate look at companies using Hadoop today and what business challenges they are tackling with this Big Data tool.

What Are Companies Using Hadoop For? Which Organizations are Using It?

Whether you’re new to Hadoop or an experienced early adopter, it’s always useful to have the inside track of what’s happening in the fast-paced world of technology. Who wouldn’t want to know if their competitors were likely investing in the Big Data solution or whether their industry is finding success with the platform?

Read the updated post, “Who Is Using Hadoop? And What Are They Using It For?”, to discover the answers to key questions around which organizations are using Hadoop, such as:

  • WHO? Who is using Hadoop? What does the adoption rate currently look like?
  • WHAT? What does the return on this investment look like? How is it providing business value?
  • HOW? How are companies using Hadoop? Which industries are finding the most value and success?

The updated post also includes statistics and infographics from the Hadoop Perspectives for 2017 eBook around the value of access and integration of legacy and/or mainframe data into the Hadoop platform.

Syncsort blog

14 AI startups will compete for $1.5 million from Nvidia

Artificial intelligence is hot, and you can tell that because both giant companies and tiny startups are excited about it. Nvidia, which had $6.9 billion in revenues last year, is in touch with more than 2,000 AI startups around the world. And this week, the graphics chip maker and AI company took a step in figuring out which ones are the best.

Jen-Hsun Huang, CEO of Nvidia, hosted a Shark Tank-style event called Nvidia Inception to find the best AI startups. Huang and a panel of judges listened to pitches from 14 AI startups across three categories. The finalists were selected from more than 600 contestants who entered the Nvidia Inception contest, and the winners will walk away with $1.5 million in cash at a dinner on May 10 at Nvidia’s GPU Technology Conference.

“We are in the beginning of one of the largest computing revolutions that we have ever been through,” said Huang, at the beginning of the event. “The AI revolution is upon us.”

Above: Jeff Herbst of Nvidia.

Image Credit: Dean Takahashi

The judges include Gavin Baker, portfolio manager for Fidelity Investments; Tammy Kiely, global head of semiconductor investment banking at Goldman Sachs; Shu Nyatta, investor for the SoftBank Group; Thomas Laffont, senior managing director for Coatue Management; and Prashant Sharma, global chief technology officer for Microsoft Accelerator.

Jeff Herbst, vice president of business development at Nvidia, said in an interview with VentureBeat that the company decided to create a meaningful award to recognize the amazing work being done by AI startups. The three awards will focus on the “hottest emerging startup,” the “most disruptive startup,” and the startup with the “most potential for social impact.”

I listened to the companies give their pitches to the judges, and this story will focus on the four companies that gave pitches for the hottest emerging startup. The qualifying rule for these companies is that they cannot have raised more than $5 million yet. The winner in each category will receive $375,000, and the runner-up in each category will receive $125,000. (We will do other stories on the disruptive and social impact startups later.)

Above: Focal Systems’ tablet for grocery store carts.

Image Credit: Dean Takahashi

Francois Chaubard, CEO of Focal Systems, said he went to a talk in 2015 and got all excited about self-driving cars. But afterward, he went to a grocery store with a friend and found that the shopping experience was anything but automated. So he set out to create the “operating system for brick and mortar retail.”

Part of the solution is hardware. Focal attaches a generic Android tablet to a shopping cart (in a way where it can’t be destroyed by kids). It has a side-mounted camera that looks at the shelves, and there are two separate cameras that see what you put in the cart. Those cameras send the data to a hub in the store, which analyzes them. Using computer vision and artificial intelligence, it figures out what you have placed in the cart, tallies the total, and displays what you are buying on the tablet as you shop. You can skip the checkout counter and wheel the cart out to your car.

“We take the AI algorithms that are super cool and apply them to the in-store shopping experience,” Chaubard said. “RFID [radio frequency identification tags] promised this, but it never accomplished it.”

The side camera on the tablet can take images of the aisles and discern which items are out of stock. That means that clerks no longer have to do that task. Currently, product makers often have to pay $30 per store to find out if their own items are out of stock; this could reduce that cost. Shoppers can also use the coupon finder on the tablet to get discounts on items they’re going to buy. The tablet can also handle voice-driven customer service.

Chaubard said his research showed that a single Wal-Mart store employs 238 people with a labor cost of $572,000 per month. Of the cost, 24 percent covers cashiers, 22 percent customer service, 11 percent scanning, and 43 percent stocking. That means that with Focal Systems, about 57 percent of that labor cost (scanning and stocking) is “redeployable” for $326,000 a month in savings.
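The figures in that pitch can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted above; the variable and function names are illustrative. It shows that $326,000 is 57 percent of the monthly total, while the scanning and stocking shares as listed add up to 54 percent, so which tasks Chaubard counted as redeployable is not clear from the pitch as reported.

```python
# Monthly labor cost for a single store, as quoted in the pitch.
MONTHLY_LABOR_COST = 572_000

# Share of labor cost by task, in percent, as quoted in the pitch.
SHARES = {
    "cashiers": 24,
    "customer service": 22,
    "scanning": 11,
    "stocking": 43,
}


def cost_for(percent):
    """Dollar amount corresponding to a percentage of monthly labor cost."""
    return MONTHLY_LABOR_COST * percent / 100


total_share = sum(SHARES.values())
scan_stock_share = SHARES["scanning"] + SHARES["stocking"]

print(f"all shares sum to {total_share} percent")
print(f"scanning + stocking: {scan_stock_share} percent, ${cost_for(scan_stock_share):,.0f}")
print(f"quoted 57 percent: ${cost_for(57):,.0f}")
```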

The addressable market is about $13 billion, Chaubard said. Focal Systems is working with major product companies such as Anheuser-Busch, and it charges a monthly service fee for the retailer. It could also make money via the data it collects in the stores.

Rivals include Amazon, but many product companies and retailers don’t want to work with a competitor, and Focal Systems positions itself as a neutral services provider.

“Others are terrified of Amazon moving into this space, and everyone is looking for an answer,” Chaubard said.

The company has 15 employees, and it has raised $2.5 million from Zetta Venture Partners and SoftTech VC.

Above: Tim Delisle, CEO of Datalogue.

Image Credit: Dean Takahashi

Tim Delisle, CEO of Datalogue, said that he heard that the best companies are formed by those who feel the pain of the customers. He said that, as a former data scientist at Merck, he knew that data scientists were feeling a lot of pain. They want to be analysts with deep insights into their data, but they are often more like data janitors, cleaning up data so that it can be in the right form for them to analyze.

Datalogue uses deep learning to take that cleanup task off their hands and free them up for analysis. The New York startup is working with its customers to analyze their data and set it up in a way that it becomes useful. For instance, some sensitive data has to be anonymized before it is passed on to another party. Datalogue can take that data and do that work.

“We offload the data scientist who has to spend 80 percent of time on cleaning up data,” Delisle said.

Datalogue’s business is to automate data preparation. It uses AI to understand semantically what is in the data, regardless of the type of data. It can parse, slice, flag, and map the appropriate data so that a customer can get some insight. And Datalogue can also do tasks such as detecting fake phone numbers in data at a “superhuman scale,” Delisle said.

Delisle said his tech can get the job done ten times faster than the competition, and ten times cheaper than manual labor. Initial “anchor” customers in each category can help pave the way for Datalogue to serve all of the customers in that category. A couple of anchor customers are paying more than $500,000 a year for Datalogue’s work. The company is working with customers such as large financial data providers.

A customer purchases certain “ontologies,” or types of data they want to identify and clean. The mapping of that data allows the customer to see the data in a more transparent way and deal with issues such as privacy regulations.

“Our competitive advantage is we get exposed to data that Amazon has never seen,” Delisle said. “We build solutions to automate the data cleaning processes of a much larger breadth of data.”

The company has five employees and was founded in February 2016. It has raised $1.5 million from Flybridge, Bloomberg Beta, and Nvidia.

Above: Scott Stephenson, CEO of Deep Gram.

Image Credit: Dean Takahashi

Scott Stephenson, CEO of Deep Gram, started on his march toward recognizing spoken words through dark matter. He has a background in particle physics, and he was searching for dark matter, a type of matter that physicists and astronomers care a lot about, deep underneath the ground near a dam in China. His team had to create a way to analyze waveforms to identify dark matter.

It turns out that this skill is exactly what is needed to recognize spoken words, Stephenson said. He said that voice recognition services like Siri have an error rate as high as 50 percent in recognizing conversational words. But Deep Gram has used deep learning and its waveform recognition technology to take the accuracy for recognizing spoken words to over 80 percent, he said.

“We build audio AI brains,” Stephenson said.

This can be useful in places like call centers, where transcribing and analyzing calls is a part of the process of reviewing quality. Typically, only 2 percent of the calls are ever analyzed in this way. But the Deep Gram tech can analyze audio data and spot keywords, transcribe, and get insights from phone calls. It can do the same for video footage and online media.

The company has 15 employees, and it has raised $1.8 million from Compound and Y Combinator. It is focused on English for now, but could expand to other languages later.

Above: Tanay Tandon, CEO of Athelas.

Image Credit: Dean Takahashi

Theranos, the blood analysis company that crashed and burned in a fraud scandal, might have fared better if it had gone down the path of Athelas. While Theranos tried to perform all sorts of tests on a drop of blood, Athelas has made an inexpensive machine that is focused on just one of the most common types of blood tests.

Using computer vision and deep learning algorithms, Athelas has created a machine that looks at a drop of blood and identifies how many white blood cells it has, said Tanay Tandon, founder of Athelas.

“Blood is the window into someone’s health,” said Tandon. “The core of what makes it possible is deep learning.”

The imaging system has a patented flow test strip that can spread the drop out to a single cell layer. After it is scanned, the convolutional neural network goes to work on identifying what is in the sample. It can identify problems in a couple of minutes, and it can then tell you the results of the test much more quickly than current methods. It can detect white blood cell trends, leukemia, infections, inflammation, and other problems.

Athelas is going after a $50 billion market. The machine costs about $250 to make, and Athelas sells it for about $500. It can also generate about $5 in revenue per test. That is far less expensive than standard lab tests, which typically cost $30 to $50 each. Moreover, Tandon said, about $100 billion is wasted every year in treating diseases that are diagnosed late.

The company did a clinical trial with 350 patients, and it identified undetected leukemia in one patient. It is now doing about 100 tests per week with full accuracy, Tandon said. He hopes to ship about 10,000 machines by the end of the year.

The company has six employees, and it has raised $3.5 million from Sequoia Capital and Y Combinator. The product has been clinically validated and is undergoing clearance from the Food and Drug Administration. Athelas (named after a healing plant in The Lord of the Rings) was formed in May 2016.

Over time, Athelas could expand the testing to other kinds of liquid analysis, such as urine. The company would like to focus on selling a subscription model where there’s a monthly fee for a certain number of tests.

Big Data – VentureBeat

Twitter’s Gnip chief Chris Moody is joining Foundry Group

Twitter is losing another key executive as Gnip chief executive Chris Moody has announced his departure. He has joined venture capital firm Foundry Group as a partner, a move Moody described as a “once-in-a-lifetime opportunity.” It’s believed that his last day will be at the end of May.

For more than two decades, Moody has been involved in the enterprise either as an executive or consultant. He’s worked at Oracle, IBM, and Aquent before joining Gnip as chief operating officer in 2011 and then assuming the role of CEO at the big data platform in 2013, leading up to its acquisition by Twitter in 2014. Since then, he’s served as a vice president and general manager of the company’s data and enterprise solutions.

Moody’s relationship with Foundry Group isn’t new, as both he and the firm are from the Boulder, Colorado area, and the firm had been an investor in Gnip. When it came time for Foundry to raise its next fund last September, the partners decided to begin having conversations with Moody.

“We knew Chris was an extraordinary board member as well as an extremely seasoned CEO. We had a great affinity for each other, and he shared our value system. When the five of us sat around talking about Chris, after each conversation we got more excited about having him join us, especially as we learned about his personal view for the next decade of his life,” wrote Brad Feld, managing director for Foundry Group.

The departure of Moody strikes another blow to Twitter’s developer relations, especially among brands using the service’s feed of data. But it’s likely that the company already has a backup in place, although a name was not immediately known. We’ve reached out to Twitter for additional information. His resignation joins others in the developer advocacy and platform side who have recently left, including developer advocacy lead Bear Douglas, senior developer advocate Romain Huet, head of developer relations Jeff Sandquist, and senior director of developer and platform relations Prashant Sridharan.

And let’s not forget the other executives who have departed since 2016, such as COO Adam Bain, chief technology officer Adam Messinger, vice president of communications Natalie Kerris, vice president of product Josh McFarland, Vine general manager Jason Toff, and vice president of global media Katie Stanton.

Moody’s move to venture capital could be nothing more than a sign that he wanted to become an investor. But what will Twitter do now to maintain its relationship with brands eager to tap into the service’s firehose of data?

Although no specific investment themes were stated, it’s possible that Moody could focus on the enterprise and finding the next big data platform that could make a big impact in the marketplace.

Big Data – VentureBeat


PLEASE NOTE: this blog’s focus has changed. As of 9/1/2013, my posts will be zeroing in on the FUTURE OF PRIVACY and the role of ‘BIG DATA’ as well as on the latest developments in HUMAN-MACHINE relationships. Previously, this blog provided insights on…

Privacy, Big Data, Human Futures by Gerd Leonhard

Expert Interview (Part 2): Sean Anderson Talks about Spark Structured Streaming and Cloud Support

In Part 1, Cloudera’s Sean Anderson (@SeanAndersonBD) summarized what’s new in Spark 2.0. In Part 2, he talks more about new features for Spark Structured Streaming, including how unified APIs simplify support for streaming and batch workloads, and support for Spark in the Cloud.

Anderson: In Spark 2.0, the ecosystem combined the functional APIs, and now you have a unified API for both batch and streaming jobs. It’s pretty nice to not have to use different interfaces to achieve this. There’s still native language support, and they are still very simplified and easy-to-use APIs, but for both of those types of workloads.

Roberts: Ooh! Streaming and batch together in one interface is something Syncsort has been pushing for a while! That’s great to hear. Very validating.

Anderson: Then the last improvement was around Spark Structured Streaming, which is a streaming API that runs on top of Spark SQL. That generally gives us better performance on micro-batch or streaming workloads, and really helps with things like out of order data handling.

There was this issue with Spark Streaming before where you may have outputs that resolve themselves quicker than the actual inputs or variables. So you have a lot of really messy out of order data that people had to come up with homegrown solutions to address.

And now that Spark Structured Streaming has essentially extensible table rows forever, you can really do that out of order data handling a lot better.
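The idea Anderson describes, treating a stream as a table that keeps growing and aggregating by when an event happened rather than when it arrived, can be illustrated without Spark at all. The sketch below is plain Python, not the Spark API; the function name update_window_counts and the sample timestamps are hypothetical. Each arriving event is added to the count for the time window its own timestamp belongs to, so late arrivals still land in the right window.

```python
from collections import defaultdict


def update_window_counts(counts, event_time, window_seconds=60):
    """Add one event to the running per-window counts, keyed by event time."""
    window_start = (event_time // window_seconds) * window_seconds
    counts[window_start] += 1
    return counts


# Event timestamps in seconds, listed in the order they ARRIVE,
# which is not the order in which they happened.
arrivals = [5, 70, 10, 130, 65, 20]

counts = defaultdict(int)
for event_time in arrivals:
    update_window_counts(counts, event_time)

# The result is the same as if the events had arrived in sorted order.
print(dict(sorted(counts.items())))
```

Because the grouping key is derived from the event's own timestamp, arrival order does not change the result. A real streaming engine also has to decide how long to keep old windows open for late data, which this sketch ignores.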

Related: Syncsort goes native for Hadoop and Spark mainframe data integration

Roberts: Streaming and batch seem like they’ve always been two separate things, and they’re becoming more and more just two different ways to handle data. We are also seeing a lot of push towards Cloud. What else are you seeing coming up that looks exciting?

Anderson: For us, really understanding how we guide our customers on deploying in the Cloud is great. There are persistent clusters, and there are transient clusters. For ETL, what’s the best design pattern for that? For exploratory data science, what’s the best for that? For machine learning, what’s the best for cloud-based scoring? So giving customers some guidance on those aspects is key.

Recently, we announced S3 integration for Apache Spark, which allows us to run Spark jobs on data that already lives in S3. The transient nature of clusters makes it very easy to just spin up compute resources and run a Spark job on data that lives in S3. And then you don’t have to spend all that time moving the data and going through all the manual work on the front end.

Roberts: Really work on the data right where it is.

Anderson: Exactly. That’s Spark in the Cloud.

Syncsort recently announced support for Spark 2.0 in our DMX-h Intelligent Execution (IX) capabilities. Be sure to check that out, and see what the experts have to say about Spark in our recent eBook.  

Also, be sure to read the third and final part of this interview on Friday. Paige and Sean talk about two new projects that Cloudera is excited about, Apache Livy and Apache Spot.

Syncsort blog

New password guidelines say everything we thought about passwords is wrong

When I recently discovered a draft of new guidelines for password management from NIST (the National Institute of Standards and Technology), I was amazed by the number of very progressive changes it proposed.

Although NIST’s rules are not mandatory for nongovernmental organizations, they usually have a huge influence, as many corporate security professionals use them as base standards and best practices when forming policies for their companies. I was therefore also surprised by the lack of attention this document, finalized March 31, received from both official media and the blogosphere. After all, those changes are supposed to affect literally everyone who browses the Internet.

Here is a quick look at the three main changes NIST has proposed:

No more periodic password changes. This is a huge change of policy as it removes a significant burden from both users and IT departments. It’s been clear for a long time that periodic changes do not improve password security but only make it worse, and now NIST research has finally provided the proof.

No more imposed password complexity (like requiring a combination of letters, numbers, and special characters). This means users now can be less “creative” and avoid passwords like “Password1$”, which only provide a false sense of security.

Mandatory validation of newly created passwords against a list of commonly-used, expected, or compromised passwords. Users will be prevented from setting passwords like “password”, “12345678”, etc. which hackers can easily guess.
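Taken together, the second and third changes describe a password check that looks quite different from the familiar complexity rules. The following is a minimal sketch, not an implementation of the NIST document: the function name is_acceptable, the tiny blocklist, and the minimum length of eight characters are assumptions made for illustration. A real deployment would load a large list of common and breached passwords.

```python
# Tiny illustrative blocklist (assumption). A real system would load a
# large list of commonly used, expected, or compromised passwords.
COMMON_PASSWORDS = {"password", "12345678", "qwertyuiop", "password1$"}

# Assumed minimum length for user-chosen passwords.
MIN_LENGTH = 8


def is_acceptable(password):
    """Accept a password if it is long enough and not on the blocklist.

    No composition rules (digits, symbols, mixed case) are enforced.
    """
    if len(password) < MIN_LENGTH:
        return False
    # Compare case-insensitively so "Password" is caught as well.
    return password.lower() not in COMMON_PASSWORDS


for candidate in ["password", "Password1$", "correct horse battery staple"]:
    print(candidate, is_acceptable(candidate))
```

Note that a long passphrase made only of lowercase letters and spaces passes, while “Password1$” is rejected even though it would satisfy a typical complexity policy.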

So why haven’t we seen any coverage of the changes considering how much of a departure they are from previous advice — and considering every average user is going to be affected? I think there are several reasons for the radio silence.

First, many people now suffer from password fatigue. Users are tired of and disappointed with password rules. They are forced to follow all these complex guidelines, remember and periodically change dozens or hundreds of different passwords, and yet we still hear about an enormous number of security breaches caused by compromised passwords. Users, especially less sophisticated ones, seem to have reconciled themselves to this situation and perceive it as a matter of course, so no one believes it can be improved.

Second, we’ve seen a widespread introduction of MFA (multi-factor authentication), also known as two-factor authentication, which supposedly pushes the password problem to the background. Let me remind you that unlike traditional authentication by password (“something you know”), MFA requires a second factor like “something you have” (hardware token, mobile phone) or “something you are” (usually biometric, such as fingerprint or face recognition). Indeed, if my account is protected by a reliable second factor such as a one-time code texted to my iPhone or generated on demand by my Yubikey, why should I care about passwords anymore? I can just use the same password I remember on every account that is protected by MFA. Unfortunately, this assumption is only partially true because MFA is reliable only when both factors are secure.
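To show what a “something you have” factor can look like in practice, here is a minimal sketch of HOTP, the counter-based one-time password algorithm defined in RFC 4226. Which scheme any particular token or texted code uses is not stated in this article, so treat the choice of HOTP as an assumption. The secret below is the published RFC test secret, not a real key.

```python
import hashlib
import hmac
import struct


def hotp(secret, counter, digits=6):
    """Compute an HMAC-based one-time password as described in RFC 4226."""
    # HMAC-SHA1 over the counter encoded as an 8-byte big-endian integer.
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low 4 bits of the last byte select an offset.
    offset = digest[-1] & 0x0F
    # Take 4 bytes at that offset and clear the top bit.
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


# Shared secret from the RFC 4226 test vectors (never hard-code a real one).
secret = b"12345678901234567890"
print(hotp(secret, 0))
print(hotp(secret, 1))
```

The code depends on a secret shared between the device and the server, which is why the second factor is only as trustworthy as the protection of that secret and of the channel that delivers the code.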

Finally, more diligent users these days have access to a large variety of password management software, both commercial and free, which can significantly improve user experience and security. With password management software, I only need to remember one password that unlocks my personal “password vault”, so I don’t have to worry about all the complexity rules or frequent password changes; my password manager will generate, store, and enter a secure random password every time I need one. However, there are still scenarios when we cannot use a password manager (unlocking our phone, computer, or door, for example).
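To make the “generate” step a password manager performs more concrete, here is a minimal sketch using Python’s standard `secrets` module. The alphabet, the default length of 20, and the function name `generate_password` are assumptions chosen for illustration; actual password managers make their own choices here.

```python
import secrets
import string

# Assumed character set for this sketch: letters, digits, and punctuation.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def generate_password(length=20):
    """Build a random password from ALPHABET using a CSPRNG."""
    if length < 1:
        raise ValueError("length must be at least 1")
    # secrets.choice draws from the operating system's secure random source,
    # unlike the random module, which is not meant for security purposes.
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


if __name__ == "__main__":
    print(generate_password())
```

Because the user never has to remember the result, the password can be long and fully random, which is exactly what makes the vault approach practical.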

So are these changes NIST is proposing still relevant and important? Of course they are. Despite the desperate attempts of many security startups to introduce new authentication methods, passwords are here to stay for a while, if not forever, and millions of people around the world will appreciate even small improvements in user experience and security.

Slava Gomzin is author of the book Hacking Point of Sale (Wiley, 2014) and Bitcoin for Nonmathematicians (Universal Publishers, 2016). He is VP of Information Security and Technology at Pieces Technologies, a health tech startup. Previously he was Director of Information Security at Parkland Center for Clinical Innovation (PCCI) and was a security and payments technologist at Hewlett-Packard, where he helped create products that are integrated into modern payment processing ecosystems using the latest security technologies. He blogs about information security and technology at

Big Data – VentureBeat

Big Data Context: Targeting Relevant Data that’s Fit for Purpose

During the Enterprise Data World conference last week, it was clear that many organizations are wrestling with the rapid, necessary changes in information management and governance. Many are assessing where they are in this process, and some are even asking “where to start?”

Understanding Big Data Context to Find Relevant Data

William McKnight of McKnight Associates made an important point in his opening keynote: “don’t talk yourself out of starting.” Stan Christiaens of Collibra added that organizations are finding different points to get underway, whether facilitating self-service analytics, enabling data stewards to care for data, working on critical compliance requirements, or freeing data scientists to find relevant data. But, as Mike Nicosia of TIAA commented, while maturity assessments may provide insight, “without context, you cannot make good decisions.”

This is a challenge for data-driven businesses as they endeavor to get actionable insights from critical enterprise data assets, leveraging next-generation Big Data environments.

My opening day was filled with tutorials on Data Modeling wrapped around my own presentation “Finding Quality in the Data Lake”. In the morning, I heard about advanced, but traditional techniques for modeling the enterprise data warehouse. That afternoon, I learned about the challenges of modeling for NoSQL databases.

What struck me in comparing the two was context – that is, the understood context of a given piece of data. In the first, the originating context is stripped away to get to a model of an entity – a computerized representation through data of some real-world object. In the second, the context is maintained through the use of techniques such as document stores or graphed relationships. As the instructor in the latter tutorial noted, “context is critical.”

As I’ve recently reflected on the meaning of data quality in the emerging structure of the Data Lake, the notion of context for Big Data takes on primary importance. Nicosia used the analogy of a cholesterol test. If you’ve had the test and the doctor says you are at 250, what does that mean? Is it good, is it bad?

You need context – context that includes a definition of what the data is, how it’s recorded, whether it has a scale of measurement, and even whether there is a prior value or measurement for comparison.
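The cholesterol analogy can be sketched in a few lines of Python. The `Measurement` class, its field names, and the sample numbers (a unit of mg/dL, a “high” threshold of 240, and a prior reading of 270) are all hypothetical values chosen for illustration; they are not from the talk and are not medical guidance.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Measurement:
    """A value plus the context needed to interpret it (hypothetical model)."""
    name: str
    value: float
    unit: Optional[str] = None              # how the value is recorded
    high_threshold: Optional[float] = None  # scale of measurement
    prior_value: Optional[float] = None     # earlier reading for comparison

    def interpret(self):
        # Without a unit and a scale, the bare number cannot be judged.
        if self.unit is None or self.high_threshold is None:
            return f"{self.name} = {self.value}: cannot interpret without context"
        level = "high" if self.value >= self.high_threshold else "not high"
        text = f"{self.name} = {self.value} {self.unit}: {level}"
        if self.prior_value is not None:
            change = self.value - self.prior_value
            text += f", change of {change:+g} since the prior reading"
        return text


if __name__ == "__main__":
    print(Measurement("total cholesterol", 250).interpret())
    print(Measurement("total cholesterol", 250, "mg/dL", 240, 270).interpret())
```

The same number yields two very different answers: with no context the value is uninterpretable, while with a unit, a scale, and a prior reading it becomes a statement someone can act on.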

A Question of Big Data Context: How do we find relevant data for “John Doe”?

However, Big Data context is not simply a reflection of what data means. As Andrew Patricio, former CDO of the District of Columbia Public Schools, asked, “What problem are you trying to solve?” There needs to be a focus on “relevant data.”

Theresa DelVecchio Dys, Director of Social Policy Research and Analysis at Feeding America, noted that their Data First Initiative started with a problem statement. As she put it, “not all data is good for all things.”

Data Quality Helps Target Fit For Purpose Data

For Feeding America, which coordinates a nationwide network of food banks serving over 46 million people each year, quality data is critical, yet its programs must focus on service and the operational processes that support it. The context of how and where data can be effectively and efficiently gathered is a key factor – too much focus on exactness in data collection upfront can lead to long lines, which results in the people it serves turning away, the exact opposite of its intent! Patricio reiterated this point when he noted that “a goal of effectiveness instead of quality goes towards the solution.”

With an understanding that we, as part of organizations, are trying to solve problems, we can focus on asking key questions, testing hypotheses, and evaluating outcomes. These are activities that must be supported by data, in context, and allow us to make determinations as to what data is fit for purpose.

Laura Sebastian-Coleman, Data Quality Center of Excellence Lead for Cigna, noted specifically that data quality depends on:

  • Fitness for Purpose – how well the data meets the expectations of consumers (always with some constraints)
  • Representational Effectiveness – how consistent the data is with the defined or modeled concepts
  • Data Knowledge – how well consumers understand and can decode the data

Without this knowledge, which depends on the context of the data, our Data Lakes or even our Data Warehouses are doomed to become “Data Graveyards.”

4 Steps for Achieving Trust in Your Data: 1) Know your goal or at least form a hypothesis, 2) Understand your data by measuring data quality, 3) Determine if it’s relevant data, i.e., “fit for purpose”, and 4) Document and validate your results
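Steps 2 and 3 above can be sketched as a simple completeness measurement followed by a fitness check. This is only one possible reading of “measuring data quality”; the function names, the sample field names, and the 0.9 threshold are hypothetical choices for this example, and a real assessment would cover more dimensions than completeness.

```python
def completeness(records, field):
    """Fraction of records in which the field is present and non-empty."""
    if not records:
        return 0.0
    filled = sum(1 for r in records if r.get(field) not in (None, ""))
    return filled / len(records)


def fit_for_purpose(records, required_fields, min_completeness=0.9):
    """Check whether every field the purpose needs is complete enough.

    Returns (is_fit, scores), so the measurement can be documented (step 4).
    The 0.9 default is an illustrative threshold, not a recommendation.
    """
    scores = {f: completeness(records, f) for f in required_fields}
    is_fit = all(score >= min_completeness for score in scores.values())
    return is_fit, scores


if __name__ == "__main__":
    sample = [
        {"name": "John Doe", "zip": "75201"},
        {"name": "Jane Roe", "zip": ""},
    ]
    # The same data can be fit for one purpose and unfit for another.
    print(fit_for_purpose(sample, ["name"]))
    print(fit_for_purpose(sample, ["name", "zip"]))
```

The required fields stand in for the goal from step 1: whether the data is “good” depends on which fields the question at hand actually needs.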

We make assumptions and take risks as we build out these data repositories. We assume that consumers understand what problems they are trying to solve. Sebastian-Coleman reminded us that we assume that consumers will:

How CDOs Help Big Data Consumers Make Big Decisions

In the closing keynote, a panel of Chief Data Officers emphasized the need to understand the language of the business and the criticality of communication and transparency. This knowledge is key to helping data consumers make informed decisions with Big Data context. As McKnight commented in kicking off the conference, “top performers realize they need data, that they are in the business of data” regardless of their industry, and that “it takes knowledge and focus to get it right, not just more time and budget.”

There are a lot of starting points, a lot of pathways, in managing information in this rapidly changing data landscape. As McKnight said, “beyond the mountain is another mountain,” and Patricio reflected that this is a “continuous cycle of processing and evaluation.”

Our data lakes will not be static, and we cannot afford to let them become data graveyards. Preventing that requires us to continually reflect on the business problems we are trying to solve, to ask questions of the data, to understand the context of the data, and to measure and evaluate the fitness of the data for our purposes. With Big Data context in mind, we can mature our organizations and make more effective data-driven business decisions.

For more information on how to improve data quality in your customer database, watch our recent webinar Getting Closer to Your Customers with Trillium Precise.

Syncsort blog