Category Archives: Big Data

Former IBM employee pleads guilty to ‘economic espionage’ after stealing trade secrets for China

A former developer for IBM pled guilty on Friday to economic espionage and to stealing trade secrets related to a type of software known as a clustered file system, which IBM sells to customers around the world.

Xu Jiaqiang stole the secrets during his stint at IBM from 2010 to 2014 “to benefit the National Health and Family Planning Commission of the People’s Republic of China,” according to the U.S. Justice Department.

In a press release describing the criminal charges, the Justice Department also stated that Xu tried to sell secret IBM source code to undercover FBI agents posing as tech investors. (The agency does not explain if Xu’s scheme to sell to tech investors was to benefit China or to line his own pockets.)

Part of the sting involved Xu demonstrating the stolen software, which speeds computer performance by distributing work across multiple servers, on a sample network. The former employee acknowledged that others would know the software had been taken from IBM, but said he could write additional code to help mask its origin.

Xu, a Chinese national who studied computer science at the University of Delaware, will be sentenced on October 13.

The Justice Department’s press release does not identify IBM, but instead refers to “the Victim Company.” But other news outlets name IBM as the target of the theft, while a LinkedIn page with Xu’s name shows he worked at IBM as a file system developer during the relevant dates.

IBM did not immediately respond to a request for comment on Sunday.

This isn’t the first time that Chinese nationals have carried out economic espionage against American companies. In 2014, the Justice Department charged five Chinese hackers for targeting U.S. nuclear and solar energy firms. And late last year, the agency charged three others for hacking U.S. law firms with the goal of trading on insider information that they obtained.

This story originally appeared on Fortune.com. Copyright 2017

Big Data – VentureBeat

NEW! Bloor Spotlight Paper: Big Data & the Mainframe, Issues and Opportunities

In this new white paper from Bloor, common issues around Big Data deployments are discussed, including strategies for resolving them – or even turning them into opportunities.

“More or less every major organisation in the world has a mainframe at the heart of its enterprise and it is critical that big data deployments are viewed from that perspective rather than treated as isolated efforts that are distinct from the mainframe environment.”
– Author Philip Howard

Discover the six issues involved in Big Data deployments – and how to resolve them – by downloading Bloor Spotlight: Big Data and The Mainframe, Issues and Opportunities today!

Syncsort blog

Dark Data, Light Data, and Agility

“Agile” is the operative word in today’s IT world. If you’re not agile, you’re not cutting it in the technology business. And if you work with data, staying agile means being able to turn “dark data” into “light data” instantly.

To explain what I mean by that, let me first define the terms:

  • Agile is a word used by programmers and system admins to describe an approach to software development and deployment that is flexible and scalable. The idea behind the agile mindset is to move away from the legacy world, where rigid frameworks and vendor lock-in constrained the way software was produced and managed, and enter a new world where organizations can adapt quickly to changing technological needs and opportunities.
  • Dark data refers to data that an organization collects but does not use. In most cases, the data remains unused – and therefore “dark” – for one of two reasons: issues accessing it, or a failure to build it into data analytics workflows. (Related: 4 Ways to Use Dark Data)
  • Light data is the opposite of dark data. Light data is data that you collect effectively and use to achieve business insights. 

Staying Agile in a World of Data

The idea of being agile is used first and foremost in the world of software development. But it can and should be extended to data management as well.

If you develop software, being agile means that you can switch to new development frameworks quickly, scale your application without limit and so on.

Similarly, being agile when it comes to data management means the ability to derive value quickly from whatever type of data you collect, no matter where it is generated or stored.

In other words, if you have an agile data management process in place, you enjoy the ability to make the most of your data, no matter what the circumstances. You work with your data in whichever ways make the most sense for your business goals, rather than being constrained by details of your data management environment.

Dark Data, Light Data and Agility

Being able to translate dark data into light data is a vital element of staying agile from a data management perspective.

If your data remains dark, you’re not agile because you are not making the most of the data available to you. You’re missing out on the valuable insights you could derive from the data.

So, to achieve agility, you need to ensure that you are turning dark data into light data whenever you need. For most organizations, this means the ability to offload data from inaccessible storage environments, such as mainframes, into modern analytics platforms, like Hadoop. There, data that was once dark can become light because it can be integrated into the rest of your analytics workflow.
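
In practice, the offload step often amounts to decoding mainframe records into a format analytics tools can ingest. Here is a minimal sketch, assuming fixed-width EBCDIC (code page 037) records with a hypothetical field layout – the offsets and field names are illustrative, not any particular system's format:

```python
import codecs

RECORD_LEN = 40  # hypothetical fixed-width record length

def ebcdic_record_to_csv(record: bytes) -> str:
    """Decode one fixed-width EBCDIC record into a CSV row (hypothetical layout)."""
    text = codecs.decode(record, "cp037")   # EBCDIC code page 037 -> str
    cust_id = text[0:8].strip()
    name = text[8:28].strip()
    balance = text[28:40].strip()
    return f"{cust_id},{name},{balance}"

# Example: build one EBCDIC record, then convert it into Hadoop-friendly CSV.
sample = codecs.encode(
    "00001234".ljust(8) + "ACME CORP".ljust(20) + "1999.95".ljust(12), "cp037"
)
print(ebcdic_record_to_csv(sample))  # 00001234,ACME CORP,1999.95
```

Once records are in a plain-text form like this, they can flow into the same analytics workflow as any other data source.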

Syncsort’s DMX-h and other Big Data solutions help you achieve this type of data-management agility by turning dark data light. To learn more about how organizations are using tools like Syncsort’s to drive digital transformation with their data, check out Syncsort’s annual Hadoop Market Adoption Survey.

Syncsort blog

Google’s AlphaGo narrowly beat the top Go player because it avoids risks

Google’s AlphaGo program bested the world’s top Go player by the slimmest margin possible in the first of three matches Tuesday, but that doesn’t mean humanity is safe.

The AI won a match against Ke Jie as part of an exhibition during the Future of Go Summit in Wuzhen, China. While Ke made moves reminiscent of his computerized opponent, he was eventually defeated by half a point.

AlphaGo is supposed to maximize its chances of winning, rather than maximize the margin of its victory, according to DeepMind founder Demis Hassabis.

“AlphaGo always tries to maximize its probability of winning, rather than try to maximize the size of the winning margin,” he said during a post-match press conference. “So, whenever we see it has a decision to make, it will always take what it thinks is the most certain path to victory, with less risk.”

That behavior is an interesting quirk of AlphaGo’s computerized nature, and points towards one of the key issues with understanding machine learning systems based on neural nets. It can be hard for humans to understand the decisions neural nets make, but figuring that out will be a key part of a future full of AIs.

It’s possible for the program to win by half a point because of a rule known as Komi, which is designed to compensate the player who goes second. AlphaGo started the game with a 7.5 point advantage because Ke went first.
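
The half-point arithmetic works out as a simple calculation; this is a minimal sketch using illustrative board counts, not the actual game score:

```python
# Area scoring with komi: White receives compensation for moving second.
KOMI = 7.5  # White's compensation in this match

def white_margin(black: float, white: float, komi: float = KOMI) -> float:
    """Return White's margin of victory (negative means Black wins)."""
    return (white + komi) - black

# Because komi is a half-integer, a drawn count is impossible, and the
# closest possible result is a half-point win.
print(white_margin(black=184, white=177))  # 0.5 -> White wins by half a point
```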

The program’s margin of victory tendencies weren’t exposed during its highly publicized matches against South Korean grandmaster Lee Sedol last year because all of those games ended in one player resigning before scores were counted.

AlphaGo’s win isn’t exactly surprising. The program already beat 19-year-old Ke during a series of matches it played online under a pseudonym. Tuesday’s victory is another in a long series of wins for AlphaGo, which come over a year after its series against Lee.

There’s still more Go to come this week. Ke Jie and AlphaGo have two more one-on-one matches to play, on Thursday and Saturday. The machine will also team up with humans in a Pair Go match on Friday. Another Friday exhibition match will pit AlphaGo against five top players, all working together to defeat it.

Big Data – VentureBeat

How Data Quality Can Improve Data Governance

You know data quality is important for optimizing analytics results and speeding business insights. But have you thought about how data quality and IT governance go together? If not, this post is for you.

Defining Data Governance

Let’s start this discussion of the link between data quality and data governance by defining what data governance means.

To do that, we have to start by explaining the concept of IT governance.

IT governance refers to a set of policies that define how technological resources can be used and managed. At any large organization, establishing IT governance policies is essential to make sure that systems and resources are properly managed in a consistent way.

Data governance, as you might have inferred by now, is the subset of governance practices that relate to data. If your organization collects, monitors or analyzes data – as virtually every type of business does today – it should have data governance policies in place.

Data governance helps ensure consistency and compliance in the way data is collected, stored and accessed. Strong governance policies in the realm of data are especially important because data tends to be subject to several complex regulatory compliance requirements.

Using Data Quality to Facilitate Data Governance

So, where does data quality come into the picture? What does it have to do with data governance?

To put it simply, data quality is essential for data governance because ensuring data quality is the only way to be certain that your data governance policies are consistently followed and enforced.

After all, these policies – like any type of rule – are only worth anything when people actually adhere to them. And when it comes to regulating the way data is stored and managed, adherence to the rules is very difficult to enforce if your data is inconsistent or filled with errors.

If you lack strong data quality – and tools to help maximize data quality – your data governance efforts can be undercut in the following ways:

  • Data governance policies will be followed more closely with some bodies of data than with others. For example, data governance may be ignored on legacy systems, where unsupported file formats or inconsistent data structures make data governance harder to enforce.
  • Missing or inaccurate data within your databases makes it hard to identify which types of data are subject to which data governance rules. For instance, if errors within your data cause you to mistake personal customer information (which needs to be protected for privacy purposes) for non-private data, you will fail to enforce your data governance policies.
  • Data governance audits, which are the only way you or outside authorities can determine whether data governance rules are being followed, become difficult and ineffective when the data being reviewed is filled with inconsistencies or errors.
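
The second point above – mistaking personal data for non-private data – can be guarded against with even simple automated checks. Here is a minimal sketch of a rule-based classifier; the field names and the email pattern are illustrative assumptions, not any particular product's rules:

```python
import re

# Hypothetical governance rule: any record containing an email address
# must be tagged as personal data before it enters the analytics pipeline.
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")

def classify_record(record: dict) -> str:
    """Return 'personal' if any field looks like an email, else 'non-private'."""
    for value in record.values():
        if isinstance(value, str) and EMAIL_RE.search(value):
            return "personal"
    return "non-private"

records = [
    {"id": "1001", "contact": "jane.doe@example.com"},
    {"id": "1002", "contact": "n/a"},
]
for r in records:
    print(r["id"], classify_record(r))  # 1001 personal / 1002 non-private
```

A check like this is only as good as the data behind it – which is exactly why data quality underpins governance.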

Achieving Data Quality with Syncsort

Syncsort provides Big Data integration solutions that make data management a seamless process – no matter which types of systems you work with – and data quality software to ensure you can trust your data in and out of the data lake. To learn more about how Syncsort solutions like the Trillium Data Quality products and DMX-h can simplify data governance, check out the Syncsort resource library today.

Syncsort blog

Meet Blair Hanley Frank, VentureBeat’s new staff writer

We’re thrilled to announce Blair Hanley Frank will join VentureBeat as a staff writer effective May 22, 2017. His areas of coverage will include artificial intelligence, big data, cloud infrastructure, and enterprise technologies.

Above: Blair Hanley Frank joins VentureBeat as a staff writer covering AI, big data, cloud infrastructure, and enterprise.

Blair comes to VentureBeat from IDG News Service where he served as a U.S. correspondent and provided breaking news and analysis of the public cloud, productivity and operating systems businesses. His articles have appeared in PC World, CIO, and InfoWorld, among others. Prior to IDG, Blair wrote for GeekWire, focusing on such large companies as Apple, Google, Microsoft, Amazon, Facebook and Twitter. He holds a B.A. in English from Whitman College.

Please join us in welcoming Blair Hanley Frank to VentureBeat! You can follow him at @belril.

Big Data – VentureBeat

Now Available: TDWI 2017 Checklist Report

Marketing is a key business driver within the modern digital enterprise. With significant advances in data management, software automation, and customer analytics, today’s marketer is empowered with a single view of each customer.

Further, new sophisticated practices in omnichannel marketing leverage the single customer view and related technical practices to more precisely target marketing to customers and prospects.

This Checklist Report from TDWI dives deep into the data requirements of modern digital marketing, with a focus on the single customer view and omnichannel marketing. The goal of the report is to accelerate your understanding of evolving data and marketing best practices and tools to better equip your organization to execute or strengthen its digital marketing programs.

Download the report, New Data Practices for a Single Customer View and Omnichannel Marketing, today!

Syncsort blog

4 Ways to Use Dark Data

If you’re like most organizations, you collect a lot of dark data – data that you don’t put to work. But your data doesn’t have to stay dark. Keep reading for examples of ways you can put dark data to work to gain new insights.

The Sources of Dark Data

Before discussing use cases involving dark data, let’s take a quick look at where dark data comes from and why it is so prevalent in many businesses.

Infrastructure and business operations generate much more data than many companies are equipped to interpret.

For example, your networking devices probably generate huge amounts of information. Even if you take the time to collect all that machine data, it remains dark unless you analyze it.

In other cases, an inability to work with data efficiently is the reason the data stays in the dark. If the data is stored in a format that your analytics tools don’t support, you lack the ability to turn it into actionable information. In other cases, dark data may be stored on devices from which it is difficult to offload into analytics platforms.

Putting Your Dark Data to Work

The crucial point to understand about dark data is that it doesn’t have to remain dark. The minute you take dark data and leverage it to gain insights, the data becomes actionable and is no longer dark.

To illustrate the point, consider the following examples of ways in which common forms of dark data can be used:

  • Networking machine data.  As noted above, servers, firewalls, network monitoring tools and other parts of your environment generate large amounts of machine data related to network operations. Avoid dark networking data by using this information to analyze network security, as well as to monitor network activity patterns to ensure that your network infrastructure is never under- or over-utilized.
  • Customer support logs.  Most businesses maintain records of customer-support interactions that include information such as when a customer contacted the business, which type of communication channel was used, how long the engagement lasted and so on. Don’t make the mistake of leaving this data in the dark, or using it only when you need to research a customer issue. Instead, build it into your analytics workflows by leveraging it to help understand when your customers are most likely to contact you, what their preferred methods of contact are and so on.
  • “Legacy” system logs.  If you have mainframes or other older types of systems running in your environment, you may think that there is no way to use modern analytics tools to understand them. But you can. By offloading system logs and other data from these systems into an analytics platform like Hadoop, you can make sure you are not leaving this “legacy” data in the dark.
  • Non-textual data.  Most data analytics workflows are built around textual data, which is easier to ingest. You can also make use of video, audio or other non-textual files, however. You can analyze the metadata associated with them or, if appropriate, translate speech to text to gain more insight into the content of the data itself. The effort may not be worth it in every case, but the bigger point worth keeping in mind is that your non-textual data doesn’t have to be dark data. There are ways to make it actionable when you need it to be.
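
As a concrete illustration of the first item, machine data becomes useful with only a few lines of code once it is out of the dark. Here is a minimal sketch that counts network log events per hour to surface utilization patterns; the log format and device names are hypothetical:

```python
from collections import Counter

# Hypothetical log lines: "<ISO timestamp> <device> <event details>".
LOG_LINES = [
    "2017-05-22T09:15:02 fw01 ALLOW tcp 443",
    "2017-05-22T09:47:31 fw01 DENY tcp 23",
    "2017-05-22T14:05:10 sw02 LINK-UP port 7",
]

def events_per_hour(lines):
    """Count log events per hour to spot under- or over-utilized periods."""
    counts = Counter()
    for line in lines:
        timestamp = line.split()[0]   # e.g. 2017-05-22T09:15:02
        counts[timestamp[:13]] += 1   # truncate to the hour (YYYY-MM-DDTHH)
    return counts

print(events_per_hour(LOG_LINES))
```

Feeding real logs through an aggregation like this is the difference between machine data that sits dark and machine data that answers questions.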

Related: Making Dark Data Light Again

Meeting the Dark Data Challenge

No matter which types of dark data your organization collects, or how it is stored, the key to keeping data out of the dark is to ensure that you have a means of translating data from one form to another and ingesting it easily into whichever analytics platform you use.

Syncsort’s suite of Big Data solutions, which includes data access, translation and integration tools like DMX-h, provides that functionality. It allows you to move data easily into Hadoop from environments that are traditionally very dark ones for data, like mainframes.

To learn more, check out Syncsort’s eBook, Bringing Big Data to Life, to discover best practices for managing dark data with Hadoop.

Syncsort blog

Yes, You Can Pay Off Technical Debt – Even on Mainframes

If you’re in the enviable position of having paid off your student loans, your house, your credit bills and any other debt you might have, you may think you’re debt-free. But if you run legacy software and hardware, there’s a good chance you suffer from technical debt. Here’s what technical debt means and how to pay it off.

Put simply, technical debt is any kind of legacy software or hardware that makes you less than 100 percent efficient but that you can’t easily get rid of.

Technical debt could result from poorly written application code that you haven’t been able to update because it would break dependencies elsewhere in your software stack. It could be an old storage array with low I/O rates that you still use because no one on your staff knows how to set up a new array.

Or it could be a mainframe system that doesn’t support the functionality of modern servers, but on which you still rely because you don’t have the budget to replace it.

In each of these cases, continuing to use the outdated system creates technical debt because it takes you more time or resources to achieve the same results that you could obtain in less time with a more modern or updated solution.

Plus, just as interest causes monetary debt to grow the longer you take to pay it off, your technical debt increases the longer you keep legacy hardware or software in place without updating them.

The Solution to Technical Debt

An important thing to understand about technical debt is that replacing your systems from the ground up is not the only way to solve it.

In fact, that approach is not feasible at all. No organization can afford to replace all of its hardware and software the moment it starts to age – and even if it could, you’d still be fighting an unwinnable battle because hardware and software evolve so quickly that an app or server ceases to be the latest and greatest solution available the moment after you’ve set it up.

So, while you should upgrade systems when possible, a healthier and more realistic approach to handling technical debt is to find cost-effective solutions that allow you to keep the systems you have while updating their functionality.

Mainframes and Technical Debt

As an illustration of how that can be done, consider mainframes.

While mainframes are by no means dead, they’re no longer the first choice of most organizations for storing and analyzing data. If you rely on mainframes today, it’s probably because they were built into your infrastructure long ago and replacing them is not practical.

Related: Breathing New Life into the Mainframe

Yet the fact that legacy mainframes are not, on their own, as well suited to today’s workloads as commodity servers does not mean you can’t upgrade your mainframes to make them work better. Using tools like the Big Data integration solutions available from Syncsort, you can equip your mainframes to handle modern workloads. Products like DMX-h allow you to migrate data seamlessly from mainframes to modern analytics environments like Hadoop, where it can be processed as efficiently as it could be on the latest, most expensive server.

An approach like this exemplifies the proper solution to technical debt. In the case of mainframes, data integration tools provide a cost-effective solution for eliminating technical debt by modernizing the way you work with data without having to replace all of your infrastructure. Learn more about Syncsort’s mainframe solutions.

Want to learn more about strategies to reduce your mainframe technical debt? Download Syncsort’s new eBook, “5 Strategies for Mainframe Optimization: New DB Optimization and NW Management Choices.”

Syncsort blog

Technology vs Humanity – The Future is already here. A film by…


Privacy, Big Data, Human Futures by Gerd Leonhard