Category Archives: Big Data
Enterprise software giant SAP launched the SAP.iO Fund today to invest tens of millions of dollars in startups building products on SAP platforms, data, or APIs.
Starting with an initial investment of $35 million, the fund will invest in dozens of enterprise startups in the coming years. SAP also plans to fund startup accelerators in tech capitals around the world, such as San Francisco and Berlin.
Both the investment fund and accelerator programs will focus on machine learning, Internet of Things (IoT), and big data, according to SAP chief strategy officer Deepak Krishnamurthy.
SAP.iO Fund will invest in 10 to 15 companies in its first year, 25 to 30 the following year, and up to 50 in its third year, Krishnamurthy said.
Each seed round will receive roughly $250,000, with additional investment in Series A funding rounds. Applicants will be asked to leverage APIs from SAP systems.
“We’re at this crossroad now where machine learning, artificial intelligence, the internet of everything, blockchain, the notion of all of these unbelievable market forces and opportunities are going to still require that core system,” SAP CEO Bill McDermott told a crowd at SAP.iO offices in San Francisco Wednesday. “It’s going to be that core system, the inter-enterprise transactions between buyers and sellers in efficient global networks, is ultimately how customers are going to fundamentally rethink business models.”
Separate from the SAP.iO Fund is SAP.iO Foundry, a set of startup accelerators planned for Berlin, San Francisco, New York City, and Tel Aviv.
The Berlin and San Francisco accelerator programs will take place at SAP facilities, where startups will work alongside SAP intrapreneurs. The Berlin accelerator will be managed by TechStars, one of the first startup accelerators.
The New York and Tel Aviv accelerators will be partnerships, powered by TechStars in New York and The Junction in Tel Aviv.
Accelerator programs will last between 13 and 16 weeks.
“We want you to build your franchise and your dreams on our core business offering, and by doing that you can bring your imagination to the equation,” McDermott said. “As long as you’re authentic and you’re really running the software, we’ll put the full weight and power of SAP and all of our friends behind you and your idea.”
The first investment by the SAP.iO Fund is in Parable Sciences, a big data analytics company also known as Paradata.
“They’re very close to what we do from a supply chain and procurement perspective,” Krishnamurthy said.
SAP did not disclose the size of its investment in Parable Sciences.
PLEASE NOTE: this blog’s focus has changed. As of 9/1/2013, my posts will be zeroing in on the FUTURE OF PRIVACY and the role of ‘BIG DATA’ as well as on the latest developments in HUMAN-MACHINE relationships. Previously, this blog provided insights on…
Are you ready to be microchipped?
Walmart may use drones like in-store messenger pigeons.
The retail giant was granted a patent this week for a system in which drones would shuttle products between different departments inside its stores. The idea is to free customers from having to walk across its super-sized emporiums to find what they want and from having to wait while employees return from far-away storerooms.
Ultimately, Walmart believes that drones “can greatly improve the customer experience without overburdening the human associates of the facility.” Waiting, the filing said, “can contribute to reduced customer satisfaction.”
The drones would mostly avoid buzzing above customers’ heads by being routed over shelves instead of store aisles, according to the patent filing. But the filing did leave the door open to humans and flying delivery assistants crossing paths.
Although flying drones over people’s heads may not be a safety risk, steering clear of customers may “nevertheless provide an increased feeling of security for those below,” the filing said.
Of course, Walmart’s new patent is no guarantee that it will use drones. Companies file patents all the time, and not every one of them is put into use.
Fortune contacted Walmart for more information and will update this story if it responds.
As part of the drone delivery system, each store would have a computer system that would function as automated air traffic control for dispatching drones and deciding their flight paths. Sensors on the drones would help them avoid obstacles.
Several drone landing zones would be created in stores, some of which would be in plain sight of customers while others may be hidden from view.
Drones may use nets or hooks to grab and carry items, depending on the product’s shape and weight. If needed, employees would attach items to drones after receiving specific instructions through text messages or an app.
Walmart’s proposed drone delivery system is different from other high-profile drone delivery projects by the likes of Amazon, Google parent company Alphabet, and UPS. Those initiatives involve drones delivering goods to people’s homes, but drone experts generally say that such delivery systems are years away because of factors like federal regulations, limited drone battery life, and the high cost of larger drones that are capable of traveling long distances.
A recent Gartner report said that businesses could start using drones to deliver goods within their own facilities, something akin to what Walmart is proposing. Additionally, a business that operates in a large office park could use drones to distribute packages like pneumatic tube systems used years ago for inter-office mail, Gartner analyst Gerald Van Hoy told Fortune in February.
This story originally appeared on Fortune.com. Copyright 2017
Wi-Fi is crucial to the way we work today. Fast, reliable, and consistent wireless coverage in an enterprise is business-critical. Many day-to-day operations in the enterprise depend on it. And yet, most of the time, IT teams are flying blind when it comes to individual experience. This springs from two main challenges.
The first challenge is data collection. We want to know the state of every user at every given time. But these states change constantly as network conditions and user locations change. With tens of thousands of devices being tracked, there is a huge amount of information to be collected. This volume of data simply cannot be handled in an access point or a controller running on an appliance with fixed memory and CPU.
The second challenge is data analysis. It takes considerable time and effort to sort through event logs and data dumps to get meaningful insights. And significant Wi-Fi intelligence is required to actually make heads or tails out of the data.
Someday soon, I believe, big data and machine learning will solve the above hurdles. It will allow me to ask my network how it is feeling, it will tell me where it hurts, and it will provide detailed prescriptions for fixing the problem (or will automatically fix it for me). While this seems to be a futuristic vision, the foundation to achieve it is already being laid out through big data tools and machine learning techniques like unsupervised training algorithms.
Using these technologies, we can now continuously update models that measure and enforce the experience for our wireless users. For example, we can ensure specific internet speeds (i.e., throughput) in real time with a high level of accuracy. This allows the IT staff to know a wireless user is suffering before the user even realizes it, and thus before they have to log a call with the help desk.
Once a user problem is detected, machine learning classification algorithms can isolate the root cause of the problem. For example, is the throughput issue due to interference, capacity, or LAN/WAN issues? After isolating the problem, machine learning can then automatically reconfigure resources to mediate the issue. This minimizes the time and effort IT teams spend on troubleshooting, while delivering the best possible wireless experience.
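To make the idea concrete, here is a minimal stdlib-Python sketch of the two steps described above: flagging an anomalous throughput sample against a learned baseline, then classifying the likely root cause. The metric names (`retry_rate`, `clients_on_ap`, `wan_latency_ms`) and thresholds are invented for illustration; a production system would use trained classifiers on far richer telemetry, not hand-written rules.

```python
import statistics

# Hypothetical known-good throughput samples (Mbps) forming the baseline.
history = [52.1, 49.8, 51.3, 50.5, 48.9, 50.2, 49.5, 51.0]

def is_anomalous(sample_mbps, baseline, k=3.0):
    """Flag a throughput sample more than k standard deviations below baseline."""
    mean = statistics.mean(baseline)
    stdev = statistics.stdev(baseline)
    return sample_mbps < mean - k * stdev

def classify_root_cause(metrics):
    """Rough rule-based stand-in for an ML classification step."""
    if metrics["retry_rate"] > 0.3:
        return "interference"
    if metrics["clients_on_ap"] > 40:
        return "capacity"
    if metrics["wan_latency_ms"] > 100:
        return "lan/wan"
    return "unknown"

# One suffering client: low throughput with a high frame-retry rate.
sample = {"throughput_mbps": 12.0, "retry_rate": 0.45,
          "clients_on_ap": 18, "wan_latency_ms": 20}

if is_anomalous(sample["throughput_mbps"], history):
    print("user is suffering:", classify_root_cause(sample))
```

The same structure extends naturally: the "automatically reconfigure resources" step would dispatch on the returned cause, for example by changing the channel when interference is detected.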
I’ve written before about how artificial intelligence will revolutionize Wi-Fi. I would love to simply unleash IT teams on sifting through troves of data so they can glean meaningful information, but it is like finding a needle in a haystack. Machine learning is key to automating mundane operational tasks like packet captures, event correlation, and root cause analysis. In addition, it can provide predictive recommendations to keep our wireless network out of trouble.
Also key to this vision is the elastic scale and programmability that modern cloud elements bring to the table. The cloud is the only medium suitable for treating Wi-Fi like a big data problem. It has the capacity to store tremendous amounts of data, with a distributed architecture that can analyze this data at tremendous speed.
Wi-Fi isn’t new. But how we use Wi-Fi has evolved. And now more than ever, Wi-Fi needs to perform flawlessly. We are in an era where wireless needs to be managed like a service, with all the flexibility, agility, and reliability of other business-critical platforms. With machine learning, big data, and the cloud, this new paradigm is quickly becoming a reality.
Ajay Malik is a wireless technology expert at Google.
Your mainframes may be among the oldest parts of your infrastructure. But they can also help you meet the newest challenges you face in the realm of cybersecurity. Here’s how.
If you follow IT security news, you know that the scope and nature of threats have evolved significantly in just the past few years. Long gone are the days when opportunistic worms and malware were your biggest worry, and some anti-virus software sufficed to secure your systems.
Cybersecurity Threats Today
Today, you have to contend with the threat of attacks carried out by experts who target your systems in particular, rather than just looking for an easy opportunity to steal some data or compute resources.
Meanwhile, the problem of Distributed Denial of Service, or DDoS, attacks is now greater than ever due to changes like the proliferation of IoT devices that hackers can use as a foundation for launching attacks. The Dyn DNS outage that occurred last fall – which shut down dozens of major websites for hours – made that distinctly clear.
Complicating matters further is an increase in the consequences that businesses face today when they are successfully attacked. In an age when virtually all data is digital and regulatory compliance fines are steeper than ever, the cost of cyberattacks adds up to hundreds of billions of dollars per year collectively. And beyond dollars, your company also suffers a major reputation hit if it joins the ranks of businesses that suffer high-profile security breaches.
Your Mainframe’s Role in Cybersecurity
How can you keep your company off of that list? There are many different steps you should take and tools you should implement, of course.
But for any business that relies on mainframes to power its operations, integrating those systems into the cybersecurity strategy is an essential part of the solution to preventing breaches.
Why? Because the data on your mainframes is the basis for detecting anomalies that, in many cases, are the first sign of an intrusion or breach.
After all, in industries like banking and aviation – where businesses work with high volumes of sensitive data and are lucrative targets for attackers – mainframes process millions of transactions per day. By establishing a baseline of normal transaction activity and searching for patterns that seem to be out of place, you can develop a proactive cybersecurity strategy that goes far beyond passively relying on firewalls or antivirus software to keep information secure.
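The baselining idea above can be sketched in a few lines of Python. This is an illustrative toy, not how any real mainframe security product works: the hourly counts and the three-sigma band are invented, and actual feeds (e.g., SMF records) carry far richer fields than a bare transaction count.

```python
import statistics

# Hypothetical hourly transaction counts observed on past days,
# keyed by hour of day, standing in for a learned baseline.
baseline = {
    9:  [11800, 12050, 11950, 12100],
    10: [15200, 14900, 15100, 15350],
}

def hourly_band(hour, k=3.0):
    """Return the (low, high) band of normal activity for this hour."""
    counts = baseline[hour]
    mean = statistics.mean(counts)
    stdev = statistics.stdev(counts)
    return mean - k * stdev, mean + k * stdev

def out_of_pattern(hour, observed):
    """True when the observed count falls outside the learned band."""
    lo, hi = hourly_band(hour)
    return not (lo <= observed <= hi)

print(out_of_pattern(9, 30000))   # a sudden surge warrants a closer look
print(out_of_pattern(9, 12000))   # within normal range
```

In practice the baseline would be multidimensional (by transaction type, origin, account segment) and continuously refreshed, but the principle of comparing live activity against learned norms is the same.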
Stopping Threats in Real Time
That’s not all. The huge volume of data processed by your mainframes can also help you detect and stop attacks in real time. It allows banks, for instance, to identify a fraudulent credit card transaction and stop it while it’s in progress.
That’s the ultimate goal, because detecting a breach and shutting it down before attackers have made off with the goods is much more effective than identifying an intrusion after the fact.
Integrating Your Mainframes into Your Cybersecurity Strategy
Now that you know why your mainframes should be an important part of your cybersecurity defense plan, you should also understand what it takes to integrate them effectively.
The answer here revolves around ensuring that your mainframe data can be fed easily into the tools you use for anomaly detection and other security processes. That’s trickier than it may sound because most threat detection tools are not designed to work natively with mainframe data.
That’s why you need a data integration solution, like Ironstream for logs or “machine” data and DMX-h for application data. Ironstream, for example, streams critical mainframe SMF files and other logs seamlessly into modern analytics platforms and SIEMs (security information and event management solutions) so it can be instantly correlated with other security data. Because manually transferring mainframe data to commodity servers and analytics environments takes a long time, automated solutions are the only way to enable real-time threat detection based on mainframe data.
Remember, too, that data quality counts when it comes to security. In order to avoid false positives – or, worse, false negatives – in your anomaly detection routines, you should make sure that your data is as clean and accurate as possible. Solutions like Trillium help significantly on this front.
Security is top of mind for many corporate security officers, but it also affects many others in IT, including mainframers. Read SIEM is Here: What You Should Know to learn what Security Information and Event Management (SIEM) is and why it’s relevant to you.
Themes at Gartner Data and Analytics Summit: Abundant Data, Scarcity of Trust, and the Need for Data Literacy
The Gartner Data and Analytics Summit this week opened with a number of dichotomies which our decision-makers face, including an abundance of data that we don’t always trust. Improving our data literacy may be the key to reversing this trend.
Data Abundance vs. Scarcity of Trust
One of the dichotomies that particularly resonated was that of Scarcity vs. Abundance. For instance, even though we are now in a world of abundant data, we continue to see a scarcity of trust. Some recent studies have shown that more than 60% of executives are not very confident in their data and analytics insights.
What results instead is another dichotomy: Confusion instead of Clarity. As I attended sessions covering a wide range of topics on data usage and analytics, I considered whether there was an opportunity to improve trust and quality in data while reducing confusion.
We have exponential growth in data, computing power, and access, yet one analyst noted we have only linear growth in the ability to use or consume that data. This creates a gap where we get overwhelmed by “facts” and instead fall back on “gut feel” or subconscious decision-making.
In her keynote, entrepreneur and author Margaret Heffernan noted that while we may have all the data, many times we find that no one will listen; that data alone will not drive change, particularly when it runs against the established, prevalent model of thought. This is at the heart of the trust issue: established models attract confirming data and repel disconfirming data.
The Need for Better Data Literacy
Yet at the same time, as Sam Esmail, creator and writer of the TV series Mr. Robot, noted, we can’t forget what data does not contain: human intelligence. Humans provide context, perspective, and solutions. There is a need, then, to come back to and invest in “data literacy” and to help understand whether we are asking the right questions of data – the “why?” and “who?” rather than just the “what?” and “how?”
What does this mean for understanding the quality of abundant data in our data lakes, a fundamental component to ensure not only trust in data but also validity in analytics and analytical models? Margaret Heffernan commented that “sometimes the data not going into the model is what counts” and that “anomalies are always interesting.”
Traditionally, our industry has had a very black or white view of data quality: it’s good or it’s bad. Data must conform to a number of dimensions such as completeness and validity if it is to be considered good data, and if it is bad then it’s an error that must be resolved or removed. That view may fit for tightly controlled operational processes, but if we’ve thrown out the anomalies, how can we truly test our data or find new business insights? Simply put, we can’t.
We need to step back and understand that “data literacy” requires us to ask and understand the business problem first (the “why?”), to understand what different users need (the “who?”), and then to understand the questions we may need to ask of data. As Heffernan observed, “one of the greatest uses of data is to provide disconfirmation of mental models.” We can’t do “data quality” for the sake of achieving “data quality.”
Instead, we need to provide a platform to bring in the range of data that may be relevant, including an understanding of its original context. Then, we let the data tell us what it can, see what it shows us even if it doesn’t fit the mental model of “good” data, and finally establish what quality data means to that business problem and the models, algorithms, and analytics we build to address it. (All the while bearing in mind that the quality data requirements of one problem may be completely different from those of another problem.)
This approach changes how, what, and where we establish “data quality” in the data lake. Data quality shifts away from being a gatekeeper or filter to the data lake, to becoming a core part of the toolset to understand, explore, and refine the data that has arrived for the different users to take advantage of.
Data literacy then, is about providing users and consumers with the scientific approach towards data (including data quality) that allows them to frame questions in ways that help establish clarity rather than confusion; to prove or disprove established models; to generate new models, analyses, and reports that are supported by data; and to achieve understanding and insights that move all of us towards a greater abundance of trust.
Check out these Common Data Quality Use Cases to learn more about how to build trust in data by ensuring the most accurate, verified, and complete information.
The internet has made it easier than ever to collect data about your customers. But does that data actually help more than it hurts? Only if it’s accurate, quality data. For effective customer engagement, you need to add data quality to your business toolbox.
Customer Engagement Today
Gone are the days when knowing your customers’ names and addresses was enough to remain competitive. Today, having a complete, 360-degree picture of your customers is essential.
That’s because technology has created new ways to reach customers and to nuance your relationship with them. You no longer have to treat customers as a monolithic group. Instead, you can leverage things like real-time analytics to deliver custom, individualized offers to different customers.
Approaching customers in this way is not just a nice-to-have feature. It’s essential in an age when your competitors are doing it. If you want to stay ahead, you need to have a complete understanding of your customers and use that information effectively when engaging them.
Customer Data Today
What does compiling a complete picture of your customers entail? It involves collecting a broad range of nuanced data points and making sure they are accurate.
Consider the following types of data points and how their role in understanding customers today is different than it was in the past:
- Phone numbers: Once upon a time, phone numbers were all tied to landlines that were often shared by multiple people. Today, almost everyone has a personal cell phone number. Collecting accurate phone numbers is, therefore, more important than ever for engaging people on an individual basis.
- Email addresses: Remember in the 1990s when email was mostly used for fun? Those days are over. Email is now a primary mode of communication for serious purposes. And because people tend to have multiple email addresses, it’s crucial to have the right ones on hand.
- Location data: It’s no longer enough just to have a customer’s home address on file. To enable real-time, dynamic engagement, you need accurate information about where your customer is at a given moment. Geo-spatial technology helps you get that information, but it’s only useful if it’s accurate – and geo-spatial data sources, like device IP addresses, are not necessarily reliable indicators when used blindly.
Validating customer email addresses is an important part of the data quality process.
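As a small illustration of the kind of syntactic checks this involves, here is a stdlib-Python sketch that cleans email addresses and normalizes US-style phone numbers. The regular expressions are deliberately simple and illustrative, not RFC-complete; commercial data-quality tools also verify deliverability, deduplicate records, and standardize against reference data.

```python
import re

# Illustrative patterns only: a loose email shape and a 10-digit US phone
# number with an optional leading country code "1".
EMAIL_RE = re.compile(r"^[\w.+-]+@[\w-]+(\.[\w-]+)+$")
US_PHONE_RE = re.compile(r"^\D*1?\D*(\d{3})\D*(\d{3})\D*(\d{4})\D*$")

def clean_email(value):
    """Trim and lowercase, returning None when the shape is invalid."""
    value = value.strip().lower()
    return value if EMAIL_RE.match(value) else None

def normalize_phone(value):
    """Reduce assorted phone formats to a canonical 555-123-4567 form."""
    m = US_PHONE_RE.match(value)
    return "-".join(m.groups()) if m else None

print(clean_email("  Jane.Doe+promo@Example.COM "))  # jane.doe+promo@example.com
print(normalize_phone("(555) 123-4567"))             # 555-123-4567
print(clean_email("not-an-email"))                   # None
```

Even checks this basic catch a surprising share of entry errors before bad records reach downstream analytics.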
The list could go on, but the main point is this: The opportunities for understanding your customers through data are more nuanced and important today than ever. But in order to make the most of those opportunities, the data you rely on needs to be accurate.
Data Quality with Trillium
With Trillium – which includes Trillium Precise, a Data-as-a-Service offering that makes getting started and using Trillium for data quality easier than ever – you can ensure that the data you rely on to understand your customers is as accurate as possible.
In turn, you can engage customers more effectively than ever, creating personalized experiences and tailored offerings that will keep you ahead of the competition in today’s hyper-digitized world. Browse the suite of Syncsort products.