“Without Data, Nothing” — Building Apps That Last With Data
Every company is becoming a data company. Data-Powered Apps delves into how product teams are infusing insights into applications and services to build products that will delight users and stand the test of time.
In philosophy, a “sine qua non” is something without which the phenomenon under consideration cannot exist. For modern apps, that “something” is data and analytics. No matter what a company does, a brilliant app concept alone is insufficient. You have to deftly integrate data and analytics into your product to succeed.
Whatever your audience, your users are getting more and more used to seeing data and analytics infused throughout apps, products, and services of all kinds. We’ll dig into ways companies can use data and analytics to succeed in the modern app marketplace and look at some now-extinct players that might have thrived with the right data in their platforms.

Sentiment analysis in customer messages
Yik Yak was an anonymous chat app that looked promising initially but failed because of problems that could have been resolved with data and analytics. What made Yik Yak popular was the novel feature that let members chat anonymously with others in the same geographic vicinity. Unfortunately, that feature was also the cause of the app’s demise: Yik Yak was capitalized as a startup with about $75 million in funding and grew to a valuation of $400 million before uncontrolled cyberbullying ruined its reputation. Once Yik Yak’s name had been tarnished by abusive chat, the company could not sell ads on its platform, meaning it could no longer monetize its innovative concept.
How could Yik Yak have used data and analytics to avert disaster? Luma Health showed how message data can be analyzed for mood and meaning by using AI/ML methods on a data lake of chat messages. Yik Yak could have tagged message content with the originating IP address and then quickly blocked messages from that IP after abusive language was detected. This hindsight can now become foresight for other enterprising companies.
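To make this concrete, here is a minimal, hypothetical sketch of that kind of moderation loop in Python: score each incoming message for abusive language and block a source once it crosses a strike threshold. The lexicon, thresholds, and data model are invented for illustration; a real pipeline would use a trained sentiment or toxicity model over a chat data lake, as the Luma Health example suggests.

```python
# Illustrative sketch only: a lexicon-based abuse filter with per-source blocking.
# The lexicon, thresholds, and data model are hypothetical, not Yik Yak's or
# Luma Health's actual pipeline.
from dataclasses import dataclass, field

ABUSIVE_TERMS = {"idiot", "loser", "freak"}  # stand-in lexicon; real systems use ML models


@dataclass
class AbuseFilter:
    threshold: int = 2                      # strikes before a source is blocked
    strikes: dict = field(default_factory=dict)
    blocked: set = field(default_factory=set)

    def score(self, message: str) -> int:
        """Count abusive terms in a message (a crude stand-in for sentiment analysis)."""
        words = {w.strip(".,!?").lower() for w in message.split()}
        return len(words & ABUSIVE_TERMS)

    def submit(self, source_ip: str, message: str) -> bool:
        """Return True if the message is accepted, False if it is dropped."""
        if source_ip in self.blocked:
            return False
        if self.score(message) > 0:
            self.strikes[source_ip] = self.strikes.get(source_ip, 0) + 1
            if self.strikes[source_ip] >= self.threshold:
                self.blocked.add(source_ip)
            return False
        return True


if __name__ == "__main__":
    f = AbuseFilter()
    print(f.submit("10.0.0.1", "You are such a loser"))  # False: flagged, first strike
    print(f.submit("10.0.0.1", "Total idiot"))           # False: second strike, IP blocked
    print(f.submit("10.0.0.1", "Hello neighbors!"))      # False: source already blocked
```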
The benefits of leveraging collective data
Color Labs was another successful startup whose failure could have been avoided with the right analytics. Although the company’s investment in AI and convolutional neural networks (CNNs) may have been significant, in retrospect, an innovative use of these technologies on the right data could have given it a better shot at survival. The basic service model behind Color Labs’ app was that users would share images and then see images from other users who were posting pictures in the same vicinity (a media-based counterpart to Yik Yak’s concept). The app failed in part for the same reason new dating apps often fail: it needed to go live with a million users on day one. Color Labs’ users joined up only to find little or nothing posted in their vicinity, which gave them little incentive to post and share and left them feeling alone in an empty room. The company ultimately folded.
How could data insights have solved this problem for Color Labs? Leveraging the right collective datasets with CNNs could have identified images already freely shared on the internet and tagged to a geographic location. Those images could have been used to populate the app and get the user-engagement ball rolling. Using CNNs in that way is expensive but justifiable if it means keeping the company afloat long enough to reach profitability. New dating app startups actually use a similar trick: purchasing a database of names and pictures and then filling in the blanks to create an artificial set of matches that temporarily satisfies new subscribers’ craving for instant gratification (one such database is marketed as “50,000 profiles”). The gamble is that new subscribers will remain hopeful long enough for real subscribers to join and make the platform viable. Color Labs could have benefited from existing media at a much lower cost in terms of ethical compromise as well.
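As a rough illustration of the feed-seeding idea (not Color Labs’ actual architecture), the sketch below filters a hypothetical catalog of already-public, geotagged images down to those near a new user, the kind of content that could fill an otherwise empty feed. The CNN classification step that would vet image content is omitted.

```python
# Hypothetical sketch of "seeding" an empty local feed from publicly shared,
# geotagged images. The catalog is an assumption; in practice a CNN would
# classify and verify image content before it is surfaced.
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometers."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))

def seed_feed(user_lat, user_lon, public_images, radius_km=5, limit=20):
    """Pick already-public images tagged near the new user to fill an empty feed."""
    nearby = [img for img in public_images
              if haversine_km(user_lat, user_lon, img["lat"], img["lon"]) <= radius_km]
    return nearby[:limit]

catalog = [
    {"id": "img-1", "lat": 40.7420, "lon": -73.9890},   # near Madison Square Park
    {"id": "img-2", "lat": 34.0522, "lon": -118.2437},  # Los Angeles: too far away
]
print(seed_feed(40.7484, -73.9857, catalog))  # -> only img-1 is close enough
```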
Forecasting and modeling business costs
Shyp was an ingenious service app that failed for a number of reasons, but one of those reasons could have been fixed easily with data insights. The basic innovation of Shyp was to package an item for you and then ship it using a standard service like FedEx. The company’s shortcut, which turned out to be a business model error, was to charge a fixed rate of $5 for packaging. Whether the item to ship was a mountain bike or a keychain, that flat packaging rate was a hole in Shyp’s hull, one that sank the company in short order.
Shyp’s mistake could have been cleverly resolved by using the wealth of existing data about object volume, weight, fragility, temperature sensitivity, and other factors to create an intelligent packaging price calculator. Such a database could even have included local variations in the price of packing materials such as foam peanuts, tape, boxes, and bubble wrap, and presented the calculation at the time of payment. Flat fees are attractive and can be used as loss leaders when trying to win new customers or differentiate yourself in a crowded market, but if you aren’t Amazon, you need to square the circle somehow. A data-driven algorithm for shipping prices (or whatever your service is) doesn’t just make good business sense — it can even be a selling point!
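Here is a minimal sketch of what such a calculator might look like. The rate table and weighting factors are invented for illustration; a production version would be fit to historical packaging-cost data and local material prices.

```python
# A minimal sketch of a data-driven packaging price calculator. The rate table,
# factors, and weights are invented for illustration only.
MATERIAL_COST_PER_LITER = {"NYC": 0.09, "SF": 0.11, "AUSTIN": 0.07}  # hypothetical local prices (USD)

def packaging_price(volume_liters, weight_kg, fragile, region, base_fee=1.50):
    """Estimate a packaging charge from item attributes instead of a flat $5 fee."""
    materials = volume_liters * MATERIAL_COST_PER_LITER.get(region, 0.10)
    handling = 0.25 * weight_kg                 # heavier items take longer to pack
    fragility = 1.35 if fragile else 1.0        # bubble wrap, extra padding, care
    return round((base_fee + materials + handling) * fragility, 2)

print(packaging_price(volume_liters=0.2, weight_kg=0.05, fragile=False, region="NYC"))  # keychain ~ $1.53
print(packaging_price(volume_liters=250, weight_kg=13, fragile=True, region="NYC"))     # mountain bike ~ $36.79
```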
Social vs. personal networks: Sentiment analysis in data
“Path” fashioned itself as an anti-Facebook: According to its founder, former Facebook developer Dave Morin, Path was a “personal network,” not a social network, where people could share “the story of their lives with their closest friends and family.” And for a moment it almost looked like Path might allow people to do just that. The startup boasted a whopping $500 million valuation, with steadfast investor confidence that lasted all the way until it faded into obscurity, ultimately being purchased by a Korean tech firm and then removed from app stores. Path intended to enforce its mission of providing personal networks of true friends by limiting each user’s friend count to 50. The friend limit was perceived as detrimental to Path’s success at a time when Facebook users often had thousands of friends, but this alone did not account for the apparent irrelevance of the novel app. What was the missing piece? Data analysis.
Path could have sustained itself as a stalwart alternative to Facebook for users disenchanted with the endless mill of likes and heart emojis. The key would have been sentiment analysis of user message content: By using natural language processing to distinguish close friends from distant acquaintances, Path could have offered its users an innovative platform for knowing who their “real friends” were.
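A toy sketch of that idea: rank contacts by a closeness score built from message volume and a small positive-word lexicon. The scoring heuristic is purely illustrative; a real implementation would rely on proper NLP sentiment models over full message history.

```python
# Hypothetical sketch of ranking "real friends" from message history. The scoring
# is a toy heuristic, not a production NLP approach.
from collections import defaultdict

POSITIVE = {"love", "thanks", "miss", "congrats", "happy"}

def closeness_scores(messages):
    """messages: list of (contact, text). Returns contacts ranked by closeness score."""
    counts, warmth = defaultdict(int), defaultdict(int)
    for contact, text in messages:
        counts[contact] += 1
        warmth[contact] += len({w.strip(".,!?").lower() for w in text.split()} & POSITIVE)
    scores = {c: counts[c] + 2 * warmth[c] for c in counts}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

history = [
    ("maya", "Miss you! Dinner Friday?"),
    ("maya", "Thanks for yesterday, love you"),
    ("vendor-bot", "Your invoice is attached"),
]
print(closeness_scores(history))  # maya ranks far above vendor-bot
```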
Data analytics and the competitive future
We have seen that startup apps based on ingenious concepts and with funding levels over $100 million failed for a variety of reasons that could have been ameliorated or averted with savvy, transformative uses of data, analytics, and insights. One of the original e-hailing taxi companies failed for no other reason than the founding designers’ lack of awareness that yellow cab drivers in New York at that time did not carry mobile phones!
Data is not only useful for calculating and forecasting the future; it’s a must-have for your app. Every company with a novel concept to unleash into the market must face the reality, as these companies did, that a good idea alone won’t guarantee an app’s success. Innovative use of data in concert with that idea is something that no modern app can survive without.

Jack Cieslak is a 10-year veteran of the tech world. He’s written for Amazon, CB Insights, and others, on topics ranging from ecommerce and VC investments to crazy product launches and top-secret startup projects.
How Profectus Delivers Value from Data
Every company is becoming a data company. In Data-Powered Businesses, we dive into the ways that companies of all kinds are digitally transforming to make smarter data-driven decisions, monetize their data, and create companies that will thrive in our current era of Big Data.
Streamlining data management across high-volume transactions
Profectus is an international technology and services company that provides leading technologies for rebate and deal management, contract compliance, and accounts payable audits. Founded 20 years ago, with offices in Australia, New Zealand, the U.S., and the U.K., Profectus provides solutions leveraged by 100 ASX-listed companies, including Westpac, HSBC, Coca-Cola Amatil, Vodafone, Coles, Kmart, JP Morgan, and Rio Tinto, to name a few.
For Profectus, data is absolutely everything: accounts payable data coming in as direct feeds from ERP finance systems, the hundreds of thousands of invoices Profectus’ solutions ingest on behalf of its customers, and the agreement data its customers have with their suppliers.
“We crunch enormous Accounts Payable data files, and thousands of rebate agreements, and invoices,” Profectus’ Chief Technology Officer, Mark Webster, told attendees. “In the retail sector, for one of our biggest customers, we have 4TB of data that we crunch through every few months. That’s billions and billions of rows of data that we go through to find the different variances in order to find the best value for our customers.”

Part of Profectus’ suite of services is ensuring that every transaction is aligned with a particular deal. But Mark revealed that despite the data-rich services the company provides, a lot of teams still use Excel spreadsheets.
“These have their limitations due to their size and data sets,” he explained. “And when you become a large organization, spreadsheets just aren’t going to cut it for you anymore.”
According to Mark, Profectus found that, on average, somewhere between 3.5 and 4 transactions per 10,000 contained an error. This number may seem relatively insignificant, but when repeated across millions or even billions of transactions, these errors add up: at four errors per 10,000, a billion transactions contain roughly 400,000 errors.
“With our solutions, we’re able to save millions of dollars for our clients, simply because we are able to find the details of these transactions buried deep in the data,” Mark explained. “And the reason why we’re able to do that is because we really pride ourselves on focusing on the detail and accuracy of our data analysis. We don’t use aggregate data, we don’t use rollups. We use full detail — and that’s where we find the full value.”
Leveraging smarter data tools to unlock deeper insights
Profectus does a lot of processing, with around 90 people in their office busy “crunching” through data row by row. But with the company growing fast, the challenge is finding a better way to boost the productivity, efficiency, and accuracy of processing these vast volumes of data at scale.
“Our COO was wondering, how could we possibly bring on more customers and then try to grow the team?” Mark said. “If a customer signs up, well, sales are doing their jobs properly. But as they bring all these extra customers on, who can service them? Our business is growing, but our cost base is growing with it, because we just have to hire more and more people to trawl through more and more spreadsheets — that can’t be the sustainable way to do it.”
Profectus began looking for technology to take over: a solution that would automate the processing of extremely large volumes of data.
“We wanted to have algorithms, ‘visualization stations’, that actually tease out the differences in the data in a lot more automated way, so that we’re not just throwing more and more human capital at it, but actually leveraging smarter technology,” he added. “Spreadsheets just die at a certain size, and communicating the results becomes extremely difficult.”
“Think about the resources taken for teams to carefully handcraft and curate large spreadsheets, then attach them into an email. Then the customer comes back with various edits and more attachments. Trying to merge all the edits and figure out which version is the right one just gets out of control. And this whole process just breaks down at scale.”
Discovering the “single source of truth” with Sisense
For Profectus, a streamlined, automated online system with a single source of truth was the “holy grail” solution.
“We did a very thorough and rigorous examination of the BI space and we put all of the different platforms through the wringer, but Sisense came out as the leading BI solution on the market,” Mark said. “With Sisense, not only is the data stored safely and securely, but we can extract the full value from our data and we can get the consistent, repeatable, and scalable answers our business needs.
“We also are using embedded analytics, with a portal that our customers can log in to and see easily for a more unified customer experience — and Sisense allows us to do this far more easily.”
Importantly, it was the sheer scalable power of Sisense’s solution that Profectus found unmatched in the market.
Unlocking data in Snowflake to deliver insights through Sisense
With a high-powered data warehouse in place, Profectus needed a tool to unlock data that answered critical business questions. Through the pairing of Sisense and Snowflake, the Profectus team is now able to unlock the data in Snowflake alongside the datasets they provide, including CSVs, spreadsheets, and third-party API integrations. Snowflake’s speed supports the live connections, ensuring Profectus sees the freshest data in its warehouse whenever up-to-date metrics are needed.
“My team now relies on Sisense and Snowflake to simplify a variety of recurring data aggregation workflows, from reports to spend analysis. Anything that used to require manually aggregating and merging spreadsheets can be pulled out of Sisense.”
“As an example, we ran a representative data set that we had in our Snowflake data warehouse through a competing solution, but we killed the process at 20 minutes because that was already unacceptable both from a customer experience and cost perspective,” Mark explained. “With Sisense, we ran the same data set, and it processed the query within 20 seconds! That was our aha moment.”
“This sort of data efficiency gain is a big deal for us, because it helps us to achieve the scale we need to serve our customers and grow as an organization.”
The data-driven vision for the future
Moving forward, Profectus is excited to reap the benefits of its new “Project Delta,” which involves leveraging Sisense’s solution as part of a revolutionary shift toward smarter data-driven decision-making.
“Project Delta for us is all about leveraging the right technology solutions to instigate new and exciting change,” Mark explained. “We want to enable behavior change in our customers, and for our customers to be able to optimize their business decisions, transform the way they do business with their suppliers, and help them enjoy much greater value. We’re confidently shifting towards automating a lot of our processing, taking the problem away from all the 90 people who have to manually check line after line of data, and actually getting the computer to do the job.”
“Importantly, we’re putting the right visualizations online to solve our communications problems, so our customers, their suppliers, and our own analysts can all log into the same solution and look at the same source of data treatment. They can all actually see the same story at the same time consistently, with full version control and no errors.”
Ultimately, Profectus wasn’t just looking for a “software vendor,” but for a technology and business partner to work with in bringing these solutions to market.
“This is where Sisense really shines for us, because they have very much the same vision that we have around how to unlock insights from data and then take powerful actions based on those insights,” Mark added. “Sisense has a very compelling vision, which fits perfectly with what we’re trying to achieve.”

David Huynh is a Customer Success Manager with Sisense. He holds a degree in Business Information Systems and has spent the last 9 years in a variety of fields including sales and project management. David is passionate about helping businesses leverage data and technology to succeed. When not in the office, he enjoys cooking, travelling, and working on cars.
Sisense and Signals Analytics Bring the Power of External Data to the Enterprise
We’re stronger when we work together. In our Partner Showcase, we highlight the amazing integrations, joint projects, and new functionalities created and sustained by working with our technology partners at companies like AWS, Google, and others.
Business teams constantly want to know how their companies are performing — against their internal goals and those of the market they compete in. They benchmark their performance against their previous results, what their customers are asking for, what their customers are buying, and ideally what their customers will buy. To get their answers, businesses typically rely on data sources that are all internal, showing decision-makers only part of the picture.
That’s now in the past. Today, through a strategic partnership, Signals Analytics and Sisense are making it easy to incorporate external data analytics into a company’s main BI environment. The result is a broader, more holistic view of the market coupled with more actionable and granular insights. Infusing these analytics everywhere democratizes data usage and access to critical insights across the enterprise.
Organizations that are truly data-driven know how to leverage a wide range of internal and external data sources in their decision-making. The integration of Signals Analytics in the Sisense business intelligence environment gets them there faster and more seamlessly, without the need for specialized resources to build complex systems.
Kobi Gershoni, Signals Analytics co-founder and chief research officer

Why external data analytics?
The integration of Signals Analytics with the Sisense platform delivers on the promise of advanced analytics — infusing intelligence at the right place and the right time, upleveling standard decisions to strategic decisions, and speeding the time to deployment. Combining internal and external data unlocks powerful insights that can drive innovation, product development, marketing, partnerships, acquisitions, and more.

Primary use cases for external data analytics
External data is uniquely well-suited to inform decision points across the product life cycle, from identifying unmet needs to predicting sales for specific attributes, positioning against the competition, measuring outcomes, and more. By incorporating a wide range of external data sources that are connected and contextualized, users benefit from a more holistic picture of the market.
For example, when combining product reviews, product listings, social media, blogs, forums, news sites, and more with sales data, the accuracy rate for predictive analytics jumps from 36% to over 70%. Similar results are seen when going from social listening alone to using a fully connected and contextualized external data set to generate predictions.

The Sisense and Signals Analytics partnership: What you need to know
- Signals Analytics provides the connected and contextualized datasets for specific fast-moving consumer goods (FMCG) categories
- Sisense users can tap into one of the broadest external datasets available and unleash the power of this connected data in their Sisense dashboards
- The ROI of the analytics investment dramatically increases when combining historical data, sales, inventory, and customer data with Signals Analytics data
Integrate external data analytics in your Sisense environment in three easy steps
Step 1: Connect
From your Sisense UI, use the Snowflake data connector to connect to the Signals Analytics Data Mart. The data can be queried live in the Sisense ElastiCube.
Step 2: Select
Once the data connection has been established, select the data types needed by filtering the relevant “Catalog.”
Step 3: Visualize
Select the dimensions, measures, and filters to apply, then visualize.

More data sources, better decisions
Your company is sitting on a large supply of data, but unless and until you find the right datasets to complement it, the questions you can answer and the insights you can harness from it will be limited. Whatever your company does and whatever questions you are trying to answer, mashing up data from a variety of sources, inside the right platform, is vital to surfacing game-changing insights.
To get started on the next leg of your analytics journey, start a free trial or become a partner.

Sisense Q4 2020: Analytics for Every User With AI-Powered Insights
Sisense News is your home for corporate announcements, new Sisense features, product innovation, and everything we roll out to empower our users to get the most out of their data.
Every company is becoming a data company; there’s no getting around it. Savvy organizations know they don’t need to fear data and analytics — they see better insights as the pathway to a brighter future.
Yet a recent Gartner survey shows that 50% of organizations lack sufficient data literacy skills to achieve business value. With our Q4 release, Sisense is bridging the skillset gap to help organizations unlock business potential faster with AI-powered explanations, an enhanced live data experience, and a robust new reporting service.

Smarter insights with AI-powered data explanations
Looking at a chart, it can be difficult to uncover actionable insights. Data doesn’t yell, “Your sales increased by 10% last month because an influx of men aged 18-24 in Seattle bought more beard-trimming kits without any promotion.”
Often, to find those types of insights, you slice, dice, and filter. However, this requires a level of familiarity with the data itself. Most of us don’t have the time to dig deep into the data to get rich intelligence that we can use in our daily and strategic decisions.
Now you don’t have to! Sisense fuses your business expertise with deep insights through AI-powered Explanations. Sisense Explanations provides easy, deep data exploration for every user, across the entire data journey.
To start, Sisense does the heavy lifting by highlighting anomalies and points of interest in your data for further exploration. Or a user can simply click on any point to discover the driving force behind the data. Sisense leverages all of the dimensions in your data and runs combinations to determine the exact impact that each variable has on a data point.
As another example, if your sales went up by 10%, Sisense might explain that the increase was attributable to both a specific product category and a certain customer age group, with a visual display of the breakdown. Within seconds, any user can understand the data without the need for specific technical expertise and get the deep intelligence to uplevel every decision.
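The general idea behind this kind of dimensional breakdown can be sketched as a simple contribution analysis: for each value of a dimension, compare its change between two periods against the overall change. This is only an illustration of the concept, not Sisense’s actual Explanations algorithm, and the numbers are invented.

```python
# A toy contribution analysis, illustrating the general idea behind dimensional
# "explanations" (this is not Sisense's algorithm; the data is invented).
def explain_change(previous, current):
    """previous/current: dicts mapping a dimension value (e.g., product category)
    to a metric (e.g., sales). Returns each value's share of the total change."""
    total_delta = sum(current.values()) - sum(previous.values())
    contributions = {
        key: current.get(key, 0) - previous.get(key, 0)
        for key in set(previous) | set(current)
    }
    return {k: round(v / total_delta, 2) for k, v in
            sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)}

last_month = {"grooming": 120_000, "apparel": 300_000, "footwear": 180_000}
this_month = {"grooming": 165_000, "apparel": 305_000, "footwear": 190_000}
print(explain_change(last_month, this_month))
# grooming accounts for ~75% of the 10% overall increase
```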
Fast insights: Enhanced live data experience
In today’s global economy, your business doesn’t stop at 5 p.m., so why should your data-driven decisions be limited to those hours? These kinds of round-the-clock decisions require the most up-to-date information, which can only be surfaced with the aid of real-time data. Our goal at Sisense is to continually make sure your most critical real-time decisions can be made with ease.
For every query, Sisense translates live widget information into SQL. This quarter, we released a new translator service to reduce the time and cost of these queries on your Snowflake live connection, with other live connections in beta. Reduce data query time by up to 70% and give your live widgets up to a 15% performance boost!
Code-savvy users can crack open the optimized SQL code that drives live visualizations to easily understand and validate the logic. It’s also simple to track, monitor, and optimize activity on your live data sources to better understand the context of your live queries.
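As a rough illustration of what translating widget information into SQL involves (a simplified sketch, not Sisense’s translator service or its query dialect), a widget’s dimensions, measures, and filters map naturally onto a GROUP BY query:

```python
# Simplified sketch of turning a widget definition into SQL. Illustrative only;
# it does not reflect Sisense's actual translator service.
def widget_to_sql(table, dimensions, measures, filters=None, limit=1000):
    """measures: list of (aggregation, column); filters: list of SQL predicates."""
    select = ", ".join(dimensions + [f"{agg}({col}) AS {agg}_{col}" for agg, col in measures])
    sql = f"SELECT {select} FROM {table}"
    if filters:
        sql += " WHERE " + " AND ".join(filters)
    if dimensions:
        sql += " GROUP BY " + ", ".join(dimensions)
    return sql + f" LIMIT {limit}"

print(widget_to_sql(
    table="sales.orders",
    dimensions=["region", "product_category"],
    measures=[("SUM", "revenue"), ("COUNT", "order_id")],
    filters=["order_date >= '2020-10-01'"],
))
# SELECT region, product_category, SUM(revenue) AS SUM_revenue, ...
# GROUP BY region, product_category LIMIT 1000
```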
Deeper understanding with next-level reporting
As you bring more robust intelligence to everyone at your company, we continue to make it easy to deliver repeatable and ad hoc reports with a new reporting engine. Now, reporting is better than ever, with the ability to generate PDFs and images at scale and share reports with everyone in the organization.
Create robust reports that feature your customizations, such as Sisense BloX, to ensure every insight is visible and can be understood. The technology is packaged in a new microservice with dedicated queues that can be scaled so your reports are always delivered on time, at the volume you need.
Toward a data-powered future
Think of all the data your company is sitting on. Now think of all the hoops you need to jump through to make sense of any of it. There are countless decisions you and your colleagues make every day that could be enhanced with the right insights, drawn from your complex array of datasets. Bridging the skills gaps to allow your organization to successfully leverage its datasets is challenging, but with the enhanced functionality of the latest Sisense release, you’re one step closer to becoming a truly data-driven organization.

Mindi Grissom is Director of Product Marketing at Sisense. She has over 5 years of experience in the technology industry, helping thousands of organizations transform their business with data and analytics.
Your Data, Your Brand: Creating Trust in Integrated Workflows and Reporting
Every company is a data company. Insights Everywhere explores the ways companies are evolving to include analytics in their products as a market differentiator and revenue driver.
The key to success for apps of all kinds is stickiness, or getting users to, well, use your app regularly. The stickiest business applications are those that can be seamlessly embedded into the toolkit of your target audience’s daily work environment. Analytic apps are applications that can be bidirectionally embedded, rebranded, and integrated into everyday workflows, connecting employees and customers with data and insights to help them make smarter decisions.
When you’re evaluating a tool to enrich your core product or service, certain features are de rigueur: security, governance, compliance, etc. However, beyond these nuts-and-bolts elements, you also need to consider features that impact how your brand presents itself to users.

The power of branding
Brand-related features may even be the most important criteria when choosing a platform like Sisense to enhance your analytic app. Making sure your analytic app matches your brand’s look and feel is vital to ensuring that the application enjoys the full trust of the user. This includes the app’s user interface, colors, and fonts, but more importantly, you need to drive home the idea that data is integral to your organization. Your data (and any analytics presented to internal or external users) is one of your most important assets, and how these insights are displayed impacts your integrity.
Making data-based decisions is becoming increasingly important to organizations of all kinds. Savvy companies are finding ways to infuse insights into their workers’ daily tasks, allowing them to seamlessly make decisions within a business application. Additionally, these same companies are integrating insights and analytics into their customer-facing products to increase stickiness and even drive new revenue.
Wherever you’re putting data and insights, it’s imperative that any analytics presented to decision-makers be displayed within your brand’s guidelines. In short, your data and analytics must look and feel like they are coming from you.
For this reason, the Sisense product team requires every component to be built with APIs, keeping this integration of data and analytics into other applications in mind. These APIs empower developers to tap into any Sisense interface or functionality and enhance, rebrand, or integrate it into the brand’s own analytic apps and off-the-shelf business systems.

Custom analytics with the Linux Pivot API
The newly released Linux Pivot encompasses a new set of APIs, allowing brands to customize the look and feel of pivot tables, transform the data in them, or change their structure to fit their unique needs.
Developers are motivated to extend and enhance their software using the functionality that standard, supported APIs (like the JavaScript APIs) offer over the pivot table. Using these APIs, developers can ensure that their custom solutions are backward compatible with every version upgrade and count on the vendor to announce breaking changes well in advance, giving them time to mitigate issues. Their software becomes a fully functioning extension of the Sisense Pivot Table, never a hack.
To jump-start the adoption of the new Pivot Table APIs and offer cloud-native customers a list of reusable extensions, Sisense partnered with Paldi Solutions. The joint project took only a few weeks and some well-planned lines of code. Paldi’s team posted several extensions in the Sisense Plugins forum, including adding visual indicators to cells, checkboxes as interactive filters, and presenting sparklines to accompany the data (plus several other features).
Ravid Paldi, CEO of Paldi Solutions, puts it this way: “Two especially interesting features are the new ‘transformPivot()’ function with a very convenient cell selector for cells you wish to manipulate, and the ability to inject React components as your cell’s content. Cool stuff!”

For example, many of our customers in the retail industry present images of their products that align with each product’s key performance indicators inside an analytic app. These are presented in pivot tables; each image is clickable and allows the marketing manager to easily jump into their e-commerce campaign management tool or the product page on their Content Management System.
Taking our example a few steps further, let’s assume that inside the pivot table you have the forecast sales for each product, along with the allocated marketing budget. Using Sisense’s AI capabilities such as Forecast and AI Trends, the marketing manager can adjust the campaign budget within the analytic app to achieve the greatest return. It’s a fully integrated workflow that makes the marketing manager’s job easier and more productive.
Empowering with plugins
These add-ons extending the power of pivot tables are only a few of the numerous solutions customers, partners, and Sisensers post to the Plugins forum. As an open platform, Sisense offers an abundance of APIs, both REST and JavaScript, to embed, rebrand, extend, and customize analytic apps.
Whatever your company’s core product or service is, infusing data and analytics throughout will make it stickier, deliver more value to internal and external users, and can even help drive revenue from your data. Using a platform like Sisense, with an emphasis on flexible, powerful APIs (versus building your analytics from scratch internally), can be a game-changer when it comes to simplifying your analytics deployment. Analytics are the future of every industry, so choose a partner that’s as committed to your success as you are.

Tomer Lapid is a Product Manager for Sisense analytics and reporting. He brings 20 years of experience in a variety of R&D, customer-facing, leadership, and product management roles that he combines into one superpower: problem-solving!
Unlocking Data Storage: The Traditional Data Warehouse vs. Cloud Data Warehouse
We live in a world of data: There’s more of it than ever before, in a ceaselessly expanding array of forms and locations. Dealing with Data is your window into the ways data teams are tackling the challenges of this new world to help their companies and their customers thrive.
The data industry has changed drastically over the last 10 years, with perhaps some of the biggest changes happening in the realm of data storage and processing.
The datasphere is expanding at an exponential rate, and companies of all sizes are sitting on immense data stores. And where does all this data live? The cloud.
Modern businesses are born on the cloud: Their systems are built with cloud-native architecture, and their data teams work with cloud data systems instead of on-premises servers.
The proliferation of cloud options has coincided with a lower bar to entry for younger companies, but businesses of all ages have seen the sense of storing their data online instead of on-premises.
The increased interest in cloud storage (and increased volume of data being stored) coincides with an increased demand for data processing engines that can handle more data than ever before.
The shift to the cloud has opened a lot of doors for teams to build bolder products and infuse insights of all kinds into their in-house workflows, user apps, and more.
The cloud is the future, but how did we get here?
Let’s dig into the history of the traditional data warehouse versus cloud data warehouses.

Data warehouse vs. databases
The growing popularity of data warehouses has caused a misconception that they are wildly different from databases. While the architecture of traditional data warehouses and cloud data warehouses does differ, the ways in which data professionals interact with them (via SQL or SQL-like languages) are roughly the same.
The primary differentiator is the data workload they serve. Let’s explore:
| Data warehouse: online analytical processing (OLAP) | Database: online transaction processing (OLTP) |
| --- | --- |
| Write once, read many | Write many, read many |
| Best for large table scans | Best for short table scans |
| Typically a collection of many data sources | Usually one source that serves an application |
| Petabyte-level storage | Terabyte-level storage |
| Columnar-based storage | Row-based storage |
| Lower concurrency | Higher concurrency |
| Examples: Redshift, BigQuery, Snowflake | Examples: Postgres, MySQL |
Given that both data warehouses and databases can be queried with SQL, the skillset required to use a data warehouse versus a database is roughly the same. The decision as to which one to use then comes down to what problem you’re looking to solve.
If there’s a need for data storage and processing of transactional data that serves an application, then an OLTP database is great. However, if the goal is to perform complex analytics on large sets of data from disparate sources, a warehouse is the better solution.
Before we look at modern data warehouses, it’s important to understand where data warehouses started to see why cloud data warehouses solve many analytics challenges.

Traditional vs. Cloud Explained
Traditional data warehouses
Before the rush to move infrastructure to the cloud, the data being captured and stored by businesses was already increasing, and thus there was a need for an alternative to OLTP databases that could process large volumes of data more efficiently. Businesses began to build what are now seen as traditional data warehouses.
A traditional data warehouse is typically a multi-tiered series of servers, data stores, and applications.
While the organization of these layers has been refined over the years, the interoperability of the technologies, the myriad software, and orchestration of the systems make the management of these systems a challenge.
Further, these traditional data warehouses are typically on-premises solutions, which makes updating and managing their technology an additional layer of support overhead.
Cloud data warehouses
Traditional data warehouses solved the problem of processing and synthesizing large data volumes, but they presented new challenges for the analytics process.
Cloud data warehouses took the benefits of the cloud and applied them to data warehouses — bringing massive parallel processing to data teams of all sizes.
Software updates, hardware, and availability are all managed by a third-party cloud provider.
Scaling the warehouse as business analytics needs grow is as simple as clicking a few buttons (and in some cases, it is even automatic).
The warehouse being hosted in the cloud makes it more accessible, and with a rise in cloud SaaS products, integrating a company’s myriad cloud apps (Salesforce, Marketo, etc.) with a cloud data warehouse is simple.
The reduced overhead and cost of ownership with cloud data warehouses often makes them much cheaper than traditional warehouses.
Cloud data warehouses in your data stack
We know what data warehouses do, but with so many applications that have their own databases and reporting, where does the warehouse fit inside your data stack?
To answer this question, it’s important to consider what a cloud data warehouse does best: efficiently store and analyze large volumes of data. The cloud data warehouse does not replace your OLTP database, but instead serves as a repository in which you can load and store data from your databases and cloud SaaS tools.
With all of your data in one place, the warehouse acts as an efficient query engine for cleaning the data, aggregating it, and reporting it — often quickly querying your entire dataset with ease for ad hoc analytics needs.
In recent years, there has been a rise in the use of data lakes, and cloud data warehouses are positioning themselves to be paired well with these. Data lakes are essentially sets of structured and unstructured data living in flat files in some kind of data storage. Cloud data warehouses have the ability to connect directly to lakes, making it easy to pair the two data strategies.
A data-driven future powered by cloud data warehouse technologies
The three most popular cloud data warehouse technologies are Amazon’s Redshift, Snowflake, and Google’s BigQuery. They each handle the same workloads relatively well but differ in how computing and storage are architected within the warehouse.
While they’re all great options, the right choice will be based on the scaling needs and data type requirements of the business. Beyond that, the pricing structure for the three varies slightly, and based on the use case, certain warehouses can be more affordable than others.
As the number of cloud data warehouse options on the market grows, niche players will rise and fall in every industry, with companies choosing this or that cloud option based on its ability to handle their data uniquely well.
Whatever your company does and wherever you’re trying to infuse insights, be it into workflows or customer-facing apps, there’ll be a cloud option that works for you.
The future is in the clouds, and companies that understand this and look for ways to put their data in the right hands at the right time will succeed in amazing ways.

Adam Luba is an Analytics Engineer at Sisense who boasts almost five years in the data and analytics space. He’s passionate about empowering data-driven business decisions and loves working with data across its full life cycle.
Why Data Will Power the Self-Driving Car Revolution
We live in a world of data: There’s more of it than ever before, in a ceaselessly expanding array of forms and locations. Dealing with Data is your window into the ways data teams are tackling the challenges of this new world to help their companies and their customers thrive.
Over the years, some bold predictions have been made about the impact autonomous vehicles (AVs) will have on our daily lives. Researchers from the National Highway Traffic Safety Administration estimated that fully autonomous cars could reduce traffic fatalities by up to 94% by eliminating accidents due to human error. Meanwhile, “Science” magazine reported that introducing even just a small number of AVs onto the roads could improve overall traffic flow and reduce trip times. And the European Commission stated that transport will be “…cleaner, cheaper and more accessible to the elderly and to people with reduced mobility” as a result of fully automated and connected mobility systems.
It’s difficult not to get excited about this future. However, according to experts, it might be further away than we’ve been led to believe. Real-life AVs are a huge undertaking, composed of regulatory hurdles, programming and data challenges, and a massive culture shift. They will change the world in ways that only science fiction writers and futurists have envisioned, and data will play a huge role in that story.

Baby steps for self-driving vehicles
There’s been huge buzz around self-driving cars for years now, and countless startups and established car companies have set about tackling every part of the AV puzzle. Despite the extraordinary efforts of many of the biggest automotive industry players, fully autonomous cars are still inaccessible, except in special pilot programs.
“While we are already seeing a small number of AVs being tested on our roads today, they have limited capabilities and can only drive in very specific conditions,” explained Ryan Pietzsch, a driver safety education expert with the National Safety Council, a not-for-profit organization promoting health and safety in the U.S. “I liken this to the advent of the cellphone. The first cellphones had limited abilities, and their coverage was extremely narrow. They still don’t work in all areas, but the network is much improved. We are finding the same with AVs. The reality is that most cars on our roads have a very low level of autonomy.”
The Society of Automotive Engineers defines six levels of driving automation, from 0 (fully manual) to 5 (fully autonomous). These levels have been adopted by the U.S. Department of Transportation.
“There really are no autonomous vehicles operating today,” said Mike Ramsey, vice president and analyst for the automotive and smart mobility division at analyst firm Gartner. “There are some automakers, like Tesla and GM, that offer systems that can handle some driving tasks but require people to be paying attention. These are so-called level 2 vehicles. What most people think of as autonomous vehicles are what we would call level 4 or 5, where the car drives itself and a human never has to pay attention. Level 3 is conditional automation, where the car could drive itself in specific areas, but where a person would have to take over when those conditions aren’t met. There are some researchers who think that this type of system won’t really be possible because people can’t be trusted to pay enough attention to the environment.”
There are numerous challenges to the safe introduction of fully autonomous cars to roads filled with human drivers. However, the experts agree that there is one critical enabler in expediting their adoption — data.
Data is the dealbreaker
“Data is a critical factor in getting to where we need to be,” explained Ramsey. “AVs are the most advanced version of artificial intelligence (AI) that we are working on right now and require an enormous amount of data to do machine learning to improve the computer’s ability to understand the world and make decisions. Almost a limitless amount of data is required to train vehicles because what we are trying to do is duplicate how a human mind works.”
Cara Bloom, a senior cybersecurity engineer at not-for-profit organization Mitre Labs and former staff researcher at Carnegie Mellon University’s CyLab, agreed. “Computers are the new drivers,” she said. “Much of the data that is used to drive non-autonomous vehicles is the exact same data that AVs will use, but the difference is in what processes that data: a person or a computer. The road conditions, signage, weather, maps, and predictions about other cars on the road are all ‘data’ that both people and computers must process to drive safely. But AVs won’t just use data; they will create it and use the new information to make new decisions — some of which are not decisions we have been afforded before.”
“Data is extremely important to the improvement of all advanced driver-assistance systems and autonomous features,” added Pietzsch. “Specifically, data is advancing the improvement of sensor systems such as light detection and ranging (LiDAR), as well as sensor performance — reducing false activations and improving overall autonomous performance. When we see level 4 or level 5 AVs on our roads, it will be because of data engineers’ ability to collect the correct data, interpolate the data correctly, invest in hardware changes when needed, and implement successful changes and improvements.”

Data centers on wheels
For level 4 and 5 AVs to become commonplace on our roads, it’s clear that more computing power is needed — especially when you consider that today, even at lower levels of autonomy, connected cars generate around 25 gigabytes of data per hour. AVs of the future will require different types of storage — and lots of it — to gather data from LiDAR, radar, cameras, and other sensors as well as in-vehicle infotainment, navigation systems, and maintenance data. In fact, according to forecasts by Western Digital, the storage capacity per vehicle could amount to 11 terabytes by 2030.
“The most advanced prototypes of level 4 and 5 AVs carry huge computers,” said Ramsey. “These computers need to get smaller so that processing can be done in the car itself — this is important to reduce the amount of time lag and the cost of transferring data to the cloud.”
Ramsey said that, while all real AI and machine learning (ML) processing is done in the cloud right now, this will change. “While we won’t get to the stage where cars will do most of the heavy lifting and ML onboard, what we will see is real-time data analytics in vehicles. For example, an AV driving down a street will recognize a feature of the neighborhood that isn’t in its HD map and react accordingly. If it has to do this repeatedly, then it will make an adjustment on board and send information to the cloud, but it will have already adjusted its behavior based on what it sees in its environment.”
Meanwhile, Pietzsch said, further advances need to be made in how data is retrieved remotely. “Some progress is being made already,” he said. “Advancements in data sharing to the cloud will greatly improve accuracy and advancement of ML. We are starting to see software updates based on ML being sent directly to vehicles through satellites. This provides the most up-to-date technology to the vehicle, which is important if we ever move away from having an engineer physically in the AV directly plugged into the computer.”
Security and privacy concerns
For Bloom, however, the biggest hurdles to getting autonomous cars on our roads revolve around the privacy and security of data. “Because AVs collect data in public where there is little ‘reasonable expectation of privacy’, they are not subject to many of the privacy laws in the U.S. and abroad,” she explained. “The data collected by AVs in the U.S. will likely be owned by the collector of the data, not the data subject. The data subjects themselves are unlikely to have the option to opt out of data collection on public roads by AVs or other sensors, except to avoid such sensors entirely.”
Bloom also pointed out that if AVs are in a collective fleet, such as for ride-sharing, the data could be centralized, stored, analyzed, and sold for profit (as has happened with other centralized data aggregators). “If AVs have facial recognition and license plate recognition systems, that data could be used to surveil populations and sold for profit — in addition to being used for socially beneficial purposes such as safety and traffic management,” she said. “For example, is it OK if a fleet of AVs collect license plate data to track down a vehicle that’s involved in an Amber Alert? What if this data is also used for open warrants? For insurance company premiums? Advertising? Since it is infeasible for people to opt out of all data collection by AVs, it is essential to fulfill their expectations upfront to prevent harm. Makers of AVs will need to determine what acceptable and safe data use is before implementing these technologies. If not, they could face backlash from consumers and regulators.”
AVs will also need advanced encryption schemes and stringent technical and policy measures to protect the location privacy of the passengers, Bloom said. “Without security, the vehicles will not be safe or trustworthy: They could be rendered inoperable by ransomware, used to surveil populations, or intentionally endanger passengers and others.”
A call for collaboration
In addition to the aforementioned concerns, the experts agreed that, in order to make better progress and realize the many benefits of AVs, the industry needs to better collaborate.
“Industry collaboration is undoubtedly key to future success,” said Ramsey. “Labeled data is so critical to train machine learning models to develop and deploy AVs.”
Thankfully, progress is being made in this direction. Earlier this year, Waymo (formerly the Google self-driving car project) and Ford released open datasets of information collected during AV tests and challenged developers to use them to come up with faster and smarter self-driving algorithms. Meanwhile, U.S. startup Scale AI, in collaboration with LiDAR manufacturer Hesai, launched an open-source dataset called PandaSet that can be used for training machine learning models for autonomous driving.
The U.S. Department of Transportation has also been working with stakeholders to prioritize and facilitate the iterative development of voluntary data exchanges to accelerate safe integration of AVs. Improving access to work zone data is one of the top needs identified.
“We launched the WZDx Specification to jump-start the voluntary adoption of a basic work zone data specification through collaboration with data producers and data users,” explained a spokesperson from the organization. “Longer term, the goal is to enable collaborative maintenance and expansion of the specification to meet the emerging needs of [automated driving systems].”
Work is also underway to facilitate the sharing of key mapping data. “All players operating in the self-driving vehicle industry need to agree on defining how mapping data can be shared between companies and authorities, to speed up the development of safe self-driving vehicles, without hindering competition,” stated a recent report by British-government-backed AV accelerator organization Zenzic. “Merging mapping data from regional sources requires streamlining to avoid multiple different ways of processing and handling data. Mapping data quality, specifically accuracy and precision of such data, is seen to be more important than resolution.”
The Zenzic report advises the connected and self-driving technology industries to follow the gaming, weather, and building information modeling sectors in a quest for common terminology.
The road toward an autonomous future
Creating a safe and successful AV industry is likely to bring huge economic and social benefits to consumers and industry alike. Major automakers, technology giants, and specialized startups have already invested more than $50 billion in AVs over the past five years, and their investments will only continue in the years to come.
For Pietzsch, it’s money well spent. “There’s a lot still to learn, but as our knowledge of data science expands, so too will the development of AVs,” he said. “This will also have far-reaching implications for other areas. After all, data science impacts our daily lives. The lessons that NASA has given us over the years have spawned countless opportunities and have influenced industries from adhesives to transportation. Similarly, I expect that the science that is occurring in the development of AVs will create opportunities and spark advancements in data analytics that can be used in other industries, and other industries that are not currently in the AV area may find themselves in it.”

Lindsay James is a journalist and copywriter with over 20 years’ experience writing for enterprise business audiences. She has had the privilege of creating all sorts of copy for some of the world’s biggest companies and is a regular contributor to The Record, Compass, and IT Pro.
Sisense and Adobe: Custom Analytics + Custom Visuals
We’re stronger when we work together. In our Partner Showcase, we highlight the amazing integrations, joint projects, and new functionalities developed with companies like Adobe and others.
You didn’t become a product developer to leave your dreams and visions half-realized. When it comes to building amazing apps, design matters. The Sisense data and analytics platform already gives you unparalleled flexibility when it comes to what you can do with your data as you embed insights into your product. Now, enhanced integration with two heavy-hitter Adobe Creative Cloud programs, XD and Photoshop, takes your ability to create and deploy custom visuals to new heights.

Design reigns supreme
It’s not enough for your app to employ data and analytics in interesting, compelling ways; it also needs to look great. App design (visual style, UI, UX, etc.) has undergone rapid evolution in the past decade. Consumers want a smooth, easy-to-navigate experience, and they also want your app to look great. Plus, your friends in the marketing department want your app’s style to perfectly match your brand guidelines — especially when embedding third-party analytics like Sisense into your product.
Branding matters! Whether they are cognizant of this or not, your users know your brand. Your colors and font choices are integral to your brand’s conception in your audience’s mind. When you deploy analytics and data elements into your product, they need to match your look and feel.
The right integrations allow you to take this further, blending your branding needs with custom visuals. Whether you want an interactive animated visual or a custom image to go in your dashboard, Adobe XD and Photoshop can help you create it. Adobe has set the industry standard for beautiful design and our integration empowers product teams to work with their design colleagues to turn beautiful concepts into functional reality.

Sisense + Adobe XD: Vibrant, versatile vectors
The Sisense data and analytics platform is built to differentiate the analytics/dashboards you’re providing to your end-users. Rebuilt from the ground up for cloud-native architecture, wherever your data lives and whatever insights you want to present to your users, you can do it with Sisense.
But what about custom vector visuals that go beyond the usual Sisense range of options? With the Sisense plugin for Adobe XD, if your designers can imagine it, you can implement it. Custom animated visuals, like this thermometer, add a dynamic element that will delight users and convey usable information in a compelling way:
Representing data in interesting, consumable ways gives users a faster, more engaging way to understand data. It also gives product developers like you a way to create more beautiful analytic apps and truly bring your wildest ideas to life. (Tech aficionados will appreciate that the plugin was rewritten from scratch using React and features SVG Export to control whether code is minified; read more here!)
Sisense + Photoshop: Beautiful, functional bitmaps
Photoshop is one of the most vital programs in the modern design world. Sisense and Adobe have taken our collaboration to the next level with the release of the Sisense plugin for Photoshop, which lets you put custom Photoshop visuals into your Sisense embedded analytics deployment.
Simply put, whatever you and your product team can dream up, you can have a designer with Photoshop skills put together. Then you bring your beautiful, functional imagery into Sisense. That’s right, you’re not just dressing up normal Sisense insights with fancy pictures; the Sisense plugin for Adobe Photoshop provides advanced automation to update graphics and text on dashboards in real-time. Changes to the content are controlled in Adobe Photoshop and automatically reflected on Sisense dashboards. Take a look at what customers can already do with it:

Breaking down silos; building better products
The unparalleled custom imagery you can now create inside your analytic apps with Sisense’s plugins for Adobe XD and Adobe Photoshop is a game-changing leap forward for product teams and design teams alike.
For starters, it breaks down the imaginary walls between these two teams, allowing more people in the organization to build analytics, as designers can now easily create widgets that will live inside the embedded Sisense analytics. It also integrates designers into the process of creating analytics for end-users, instead of keeping them at a distance or using their skills piecemeal.
These plugins also remove friction when building analytics and dashboards: The diverse teams building your embedded analytics can share and collaborate on images, graphics, and the analytics functionality itself all during the same process. The result is a better product, faster!
Your users will love your new creations too. Creating fun, interactive designs is the perfect way to reduce chart/data overload. Again: Design reigns supreme! Users demand fun, easy-to-use, beautiful experiences. A better user experience also translates to increased stickiness and user satisfaction (and in the long run more revenue!).
The math is simple: (Sisense + Adobe) * (Builders + Designers) = better, more beautiful insights for all.
To install the Sisense plugin for Photoshop, visit our listing on the Adobe marketplace and learn more about the installation steps here. To install the Sisense plugin for XD, click on this link and read the Sisense support documentation here. Go forth and build boldly and beautifully!

Lio Fleishman is the Partnership Solutions Engineer at Sisense and is passionate about JavaScript and front-end engineering tools. He is obsessed with making engineers' day-to-day work easier and better.
Harnessing Streaming Data: Insights at the Speed of Life
We live in a world of data: There’s more of it than ever before, in a ceaselessly expanding array of forms and locations. Dealing with Data is your window into the ways data teams are tackling the challenges of this new world to help their companies and their customers thrive.
Streaming data analytics is expected to grow into a $38.6 billion market by 2025. The world is moving faster than ever, and companies processing large amounts of rapidly changing or growing data need to evolve to keep up — especially with the growth of Internet of Things (IoT) devices all around us. Real-time insights provide businesses with unique advantages that are rapidly becoming must-haves for staying competitive in a variety of markets. Let's look at a few ways that different industries take advantage of streaming data.

How industries can benefit from streaming data
- Sales, marketing, ad tech: Making faster marketing decisions, optimizing ad spend
- Security, fraud detection: Reducing the time needed to detect and respond to malicious threats
- Manufacturing, supply chain management: Increasing inventory accuracy, avoiding production line downtime, monitoring current IoT and machine-to-machine data
- Energy, utilities: Analyzing IoT data to alert and address equipment issues, get ahead of potential issues, help reduce maintenance costs
- Finance, fintech: Tracking customer behavior, analyzing account activities, responding to fraud and customer needs immediately and proactively while customers are engaged
- Automotive: Monitoring connected, autonomous cars in real time to optimize routes, avoid traffic, and diagnose mechanical issues
As real-time analytics and machine learning stream processing are growing rapidly, they introduce a new set of technological and conceptual challenges. In this piece, we’ll dig into those challenges and how Upsolver and Sisense are helping tackle them.
Technological challenges to handling streaming data
Stateful transformations
One of the main challenges when dealing with streaming data comes from performing stateful transformations for individual events. Unlike a batch processing job that runs within an isolated batch with clear start and end times, a stream processing job runs continuously on each event separately. Operations like API calls, joins, and aggregations that used to run every few minutes/hours now need to run many times per second. Dealing with this challenge requires caching the relevant context on the processing instances (state management) using techniques like sliding time windows.
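To make the contrast with batch jobs concrete, here's a minimal Python sketch of per-event processing with a sliding time window. It isn't Upsolver code; the event fields, the five-minute window, and the per-customer running total are illustrative assumptions.

```python
# A minimal sketch of per-event stateful processing with a sliding time window.
# Event shape, window length, and field names are illustrative assumptions.
from collections import defaultdict, deque

WINDOW_SECONDS = 300  # 5-minute sliding window

# State: for each customer, keep (timestamp, amount) pairs still inside the window
state = defaultdict(deque)

def process_event(event):
    """Handle one event as it arrives and return the current windowed total."""
    key = event["customer_id"]
    ts = event["timestamp"]  # epoch seconds
    window = state[key]
    window.append((ts, event["amount"]))

    # Evict events that have fallen out of the sliding window
    while window and window[0][0] < ts - WINDOW_SECONDS:
        window.popleft()

    # The aggregation runs on every event, not once per batch
    return sum(amount for _, amount in window)

# Totals update continuously as events stream in
print(process_event({"customer_id": "c1", "timestamp": 1000, "amount": 20.0}))  # 20.0
print(process_event({"customer_id": "c1", "timestamp": 1120, "amount": 15.0}))  # 35.0
print(process_event({"customer_id": "c1", "timestamp": 1500, "amount": 5.0}))   # 5.0 (older events evicted)
```

The key point is that both the aggregation and the eviction run on every single event, which is exactly the state-management burden a batch pipeline never faces.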
Optimizing object storage
Another goal that teams dealing with streaming data may have is managing and optimizing a file system on object storage. Streaming data tends to be very high-volume and therefore expensive to store. However, cloud object storage, like Amazon S3, is a very cost-effective solution (starting at $23 per month per terabyte for hot storage at time of writing) compared to traditional databases and Kafka (which creates three replicas by default on local storage — that's a lot of data being stored!).
The challenge with object storage is the complexity of optimizing its file system by combining the right file format, compression, and size. For example, small files are a performance anti-pattern for object storage (50X impact), but using streaming data forces us to create such files.
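As a rough illustration of the compaction idea, the sketch below merges many small line-delimited JSON event files into one larger, compressed Parquet file. The paths, file layout, and schema are assumptions for the example; a production pipeline would do this incrementally against S3 rather than on a local folder.

```python
# A hedged sketch of compaction: merge many small JSON event files into one
# columnar, compressed Parquet file that query engines can scan efficiently.
# Paths and file layout are invented for the example.
import glob
import pandas as pd

small_files = glob.glob("raw_events/2021-06-01/*.json")  # many tiny files

# Read and concatenate the small files (each one is line-delimited JSON here)
frames = [pd.read_json(path, lines=True) for path in small_files]
events = pd.concat(frames, ignore_index=True)

# Write a single, larger, compressed columnar file
events.to_parquet("events_2021-06-01.parquet", compression="snappy")
```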
Cleaning up dirty data
Every data professional knows that ensuring data quality is vital to producing usable query results. Streaming data can be extra challenging in this regard, as it tends to be “dirty,” with new fields that are added without warning and frequent mistakes in the data collection process. In order to bridge the gap to analytics-ready data, developers have to be able to address data quality issues quickly.
The best architecture for that is called “event sourcing.” Implementing this requires a repository of all raw events, a schema on read, and an execution engine that transforms raw events into tables. Every time analytics data needs to be adjusted, the developer will run a processing job from the raw data repository (time travel/replay/reprocessing).
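Here's a bare-bones sketch of that event-sourcing pattern in Python: an immutable repository of raw events, schema applied on read, and a transformation that can be fixed and replayed at any time. The file layout and field names (orderId, netTotal, salesTax) are invented for the example.

```python
# A minimal sketch of event sourcing: keep immutable raw events, apply
# schema on read, and rebuild ("replay") the analytics table whenever the
# transformation changes. File layout and field names are assumptions.
import glob
import json

def read_raw_events(path_glob="raw_events/**/*.jsonl"):
    """Raw repository: every event ever collected, never modified in place."""
    for path in glob.glob(path_glob, recursive=True):
        with open(path) as f:
            for line in f:
                yield json.loads(line)  # schema applied on read, not on write

def transform(event):
    """Current transformation from raw event to analytics row.
    Fix a data-quality bug here, replay, and the whole table is rebuilt."""
    return {
        "order_id": event.get("orderId"),
        "net_total": float(event.get("netTotal", 0) or 0),
        "sales_tax": float(event.get("salesTax", 0) or 0),
    }

def replay():
    """Reprocess the full raw history into a fresh analytics table."""
    return [transform(e) for e in read_raw_events()]

orders_table = replay()  # "time travel": the table is always reproducible from raw data
```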
Broader considerations
These are just a few of the technical considerations that teams need to grapple with when attempting to use real-time data. They also have to orchestrate huge volumes of jobs to handle rapidly changing data, work to ensure data consistency with exactly-once processing, and deal with concurrent requests from a variety of users all trying to get insights from the same data at the same time.

How Upsolver solves stream processing
Sisense has partnered with Upsolver to help solve these technical challenges. Upsolver has spent countless hours eliminating this engineering complexity so that companies can solve these types of data challenges in real time.

Job orchestration, exactly-once data consistency, and file system management are heavy engineering problems. Solving them in-house is complex, and that complexity creates a data engineering bottleneck that slows the analytics process.
Upsolver encapsulates the streaming engineering complexity by empowering every technical user (data engineers, DBAs, analysts, scientists, developers) to ingest, discover, and prepare streaming data for analytics. These experts can define transformations from streams to tables and govern the processing progress using a visual, SQL-based interface. Engineering complexity is abstracted from the user via an execution engine that turns multiple data sources (stream and batch) into tables in various databases.
Let’s dig into how it works!
Step 1: Connect to data sources — cloud storage, data streams, databases
Once inside the Upsolver user interface (UI), users simply click "Add a new data source" and choose one of the built-in connectors for cloud storage, databases, or streams. The UI lets users parse source data in formats including JSON, CSV, Avro, Parquet, and Protobuf.

A sample from the parsed data is displayed before ingestion starts:

Schema on read and statistics per field are automatically detected and presented to the user:

Step 2: Define stateful transformations
Now that each data source has been set up, it's time to define an output, Upsolver's entity for processing jobs. Each output creates one table in a target sink and populates it continuously with data based on the transformations the user defines.

Transformations can be defined via the UI, SQL, or both (bidirectional sync between UI and SQL).
(Note: The SQL statement isn’t for querying data like in databases. In Upsolver, SQL is used to define continuous processing jobs so the SQL statement is executed once for every source event.)
Upsolver provides over 200 built-in functions and ANSI SQL support out of the box to simplify stream processing. These native features hide implementation complexities from users so their time isn’t wasted on customized coding.
Upsolver also provides stateful transformations that allow the user to join the currently processed event with other data sources, run aggregation (with and without time windows), and deduplicate similar events.
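To show what one of these stateful operations is doing conceptually, here's a plain-Python sketch of time-windowed deduplication. It's a stand-in for the idea only, not Upsolver's SQL or its IF_DUPLICATE function; the one-hour window and the event_id key are assumptions.

```python
# Concept sketch only (plain Python, not Upsolver syntax): drop events whose
# key was already seen inside a recent time window.
from collections import OrderedDict

WINDOW_SECONDS = 3600  # treat repeats within an hour as duplicates

seen = OrderedDict()  # event key -> last-seen timestamp, oldest first

def is_duplicate(event):
    key, ts = event["event_id"], event["timestamp"]
    # Expire keys that have aged out of the window
    while seen and next(iter(seen.values())) < ts - WINDOW_SECONDS:
        seen.popitem(last=False)
    duplicate = key in seen
    seen[key] = ts
    seen.move_to_end(key)  # keep ordering by recency
    return duplicate

stream = [
    {"event_id": "a1", "timestamp": 10},
    {"event_id": "a1", "timestamp": 50},    # duplicate within the window
    {"event_id": "a1", "timestamp": 4000},  # same key, but the window has passed
]
print([is_duplicate(e) for e in stream])  # [False, True, False]
```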
Adding transformations using the UI:

Editing transformations with SQL:
The output below calculates the order total by aggregating net total and sales tax.

IF_DUPLICATE function for filtering similar events:

Defining keys for tables in Amazon Redshift:
This feature is necessary to support use cases like change data capture log streaming and enforcing GDPR/CCPA compliance in Amazon Redshift.

Step 3: Execute jobs and query optimization
Now that your output is ready, click on RUN to execute your job.
First, fill in the login information for Redshift:

Second, choose the Upsolver cluster that will run the output and the time frame over which to run it.
Upsolver clusters run on Amazon EC2 spot instances and scale out automatically based on compute utilization.

Note that during this entire process, the user didn't need to define anything except data transformations: The processing job is automatically orchestrated, and exactly-once data consistency is guaranteed by the engine. The result of implementing these best practices is faster queries in Redshift, which in turn power your dashboards in Sisense.
Step 4: Query
Log in to your Sisense environment with at least data designer privileges. We’ll be creating a live connection to the Redshift cluster that was set up in Step 3 and a simple dashboard. The first step is to navigate to the “Data” tab and create a new live model. You can give it whatever name you like.

Now, let’s connect to Redshift. Click the “+ Data” button in the upper right hand corner of the screen. Select Redshift from the list of available connections. Next, you’ll need to enter your Redshift location and credentials. The location is the URL/endpoint for the cluster. The credentials used should have read access to the data you are using. Make sure to check “Use SSL” and “Trust Server Certificate” (most clusters require these options by default).
When ready, click “Next” and select the database you’ll be connecting to. Then click “Next” again.
Here you can see all of the tables you have access to, separated by schema. Navigate to the schema/tables you want to import, and select a few to start with. You can always add more later. When ready, click “Done” to add these tables to your model.

Once you have tables to work with, it’s time to connect them. In this case, the tables link on “orderId.” Simply drag one table onto the other and select the linking field (the key) on both sides. When you see the line appear between the tables, you can click “Done.” Finally, click “Publish” in the upper right hand corner, and you’re ready to create a dashboard!

Now, navigate over to the "Analytics" tab. This is where we'll create a new dashboard to view and explore the data. There, you'll see a button to create a new dashboard. Click it and select the live model you just published. You can give the dashboard a different name, or it will default to the same name as the model.

Now, it’s time to build the dashboard and explore your data. In the following animation, we create a few different visualizations based on fields from both tables. Notice that we are selecting fields to use as both dimensions and measures from both tables in the model. Sisense is automatically joining the tables for us, and Upsolver keeps the data in both tables synced with the stream.

Feel free to explore your data now. You can left-click on elements of the dashboard to place a filter and right-click to drill down. You can also interact with filters on the right hand filter panel. Again, you can filter on fields in both tables, and Sisense will determine the right queries for you.

Getting your streaming data to work for you
Streaming data analytics is important for businesses to make critical decisions in real time. To get there, it’s necessary to solve several engineering complexities that are introduced with streaming and aren’t addressed with the batch technology stack. In this article, we showed how Upsolver, AWS, and Sisense can be used together to deliver an end-to-end solution for streaming data analytics that is quick to set up, easy to operate without coding, and scales elastically using cloud computing/storage.
Click here to find out more about Upsolver. Visit this page to become a Sisense Platform Partner and here to learn more about our partnership with AWS.


Ori Rafael has a passion for taking technology and making it useful for people and organizations. Before founding Upsolver, Ori held a variety of technology management roles in the IDF's elite technology intelligence unit, followed by corporate roles. Ori has a B.A. in Computer Science and an MBA.

Mei Long, PM at Upsolver, has held senior positions at many high-profile technology startups. Before Upsolver, she played an instrumental role on teams that contributed to the Apache Hadoop, Spark, Zeppelin, Kafka, and Kubernetes projects. Mei has a B.S. in Computer Engineering.
Quantitative and Qualitative Data: A Vital Combination
We live in a world of data: There’s more of it than ever before, in a ceaselessly expanding array of forms and locations. Dealing with Data is your window into the ways organizations and data teams tackle the challenges of this new world to help their companies and their customers thrive.
Almost every modern organization is now a data-generating machine. As soon as a company’s systems are computerized, the data-generation engine starts up. When these systems connect with external groups — customers, subscribers, shareholders, stakeholders — even more data is generated, collected, and exchanged. And, as industrial, business, domestic, and personal Internet of Things devices become increasingly intelligent, they communicate with each other and share data to help calibrate performance and maximize efficiency. The result, as Sisense CEO Amir Orad wrote, is that every company is now a data company.
Smart use of your data can be the key to optimizing processes, identifying new opportunities, and gaining or keeping a competitive edge. But are you paying attention to all of your data? Do you have the means to handle every kind of data? And can you take full advantage of the insights it can reveal? Your answers will depend on whether you can gather and analyze both quantitative and qualitative data. Let’s consider the differences between the two, and why they’re both important to the success of data-driven organizations.

Digging into quantitative data
Most commonly, we think of data as numbers that show information such as sales figures, marketing data, payroll totals, financial statistics, and other data that can be counted and measured objectively.
This is quantitative data. It's "hard," structured data that answers questions such as "How many?" or "How often?" It's typically organized into rows and columns and stored in a relational database, in which relationships are defined among those rows and columns.
All descriptive statistics can be calculated from quantitative data. It's analyzed through numerical comparison and statistical inference, and findings are reported as statistical analyses.
Because quantitative data is always numeric, it's relatively straightforward to sort, manage, analyze, visualize, and do calculations with. Spreadsheet software like Excel or Google Sheets and traditional database management systems mainly deal with quantitative data, and they're great at generating basic visualizations like graphs and charts from static data. The challenge comes when the data becomes huge and fast-changing.
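As a small illustration of how readily quantitative data yields to calculation, here's a hypothetical sales table summarized with descriptive statistics in Python (the columns and figures are made up for the example):

```python
# Tabular, numeric data lends itself directly to objective, repeatable calculation.
# The table below is invented for illustration.
import pandas as pd

sales = pd.DataFrame({
    "region": ["North", "South", "East", "West"],
    "units_sold": [1200, 950, 1430, 1100],
    "revenue_usd": [36000.0, 28500.0, 42900.0, 33000.0],
})

print(sales[["units_sold", "revenue_usd"]].describe())  # count, mean, std, min, quartiles, max
print(sales["revenue_usd"].sum())                       # an objective, repeatable total
```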
Why is quantitative data important?
Quantitative data is often viewed as the bedrock of your business intelligence and analytics program because it can reveal valuable insights for your organization. These numbers show performance, efficiency, reach, market share, revenue in, and expenses out. This is fundamental information when it comes to understanding how your organization is doing.
Additionally, quantitative data forms the basis on which you can confidently infer, estimate, and project future performance, using techniques such as regression analysis, hypothesis testing, and Monte Carlo simulations.
These techniques allow you to:
- See trends and relationships among factors so you can identify operational areas that can be optimized
- Compare your data against hypotheses and assumptions to show how decisions might affect your organization
- Anticipate risk and uncertainty via mathematical modeling (see the Monte Carlo sketch after this list)
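For instance, here's a minimal Monte Carlo sketch in Python that projects profit under uncertainty. The demand and price distributions are invented assumptions standing in for estimates you would derive from your own historical data.

```python
# A minimal Monte Carlo sketch: project profit under uncertainty by simulating
# many scenarios. The distributions and fixed cost are invented assumptions.
import numpy as np

rng = np.random.default_rng(seed=42)
N = 100_000  # number of simulated scenarios

# Inputs assumed to be estimated from historical (quantitative) data
units = rng.normal(loc=10_000, scale=1_500, size=N)  # monthly units sold
price = rng.normal(loc=29.0, scale=2.0, size=N)      # average selling price
cost = 180_000                                        # fixed monthly cost

profit = units * price - cost

print(f"Expected monthly profit: ${profit.mean():,.0f}")
print(f"Chance of a loss: {(profit < 0).mean():.1%}")
print(f"5th-95th percentile range: ${np.percentile(profit, 5):,.0f} to ${np.percentile(profit, 95):,.0f}")
```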
Consequently, using quantitative data, you can make strategic and tactical decisions that will benefit your organization and drive growth.

What are the problems with quantitative data?
Despite its many uses, quantitative data presents two main challenges for a data-driven organization.
First, data isn’t created in a uniform, consistent format. It’s generated by a host of sources in different ways. To access it effectively, you need to organize it and clean it to avoid mistakes, oversights, and repetitions that could compromise the integrity of your data and, by extension, the accuracy of your insights.
The solution to this problem is to ensure that your BI and analytics platform can handle as many sources and forms of data as possible, both on-premises and on the cloud, so that you don’t neglect or miss any of it.
Second, and more challenging, is the fact that not all the information you generate and collect is structured, quantitative data. As data is produced from a growing array of sources, quantitative data may be as little as 20% of all the data available to most organizations. If you focus your efforts entirely on quantitative data, you’ll overlook a huge amount of valuable information, and your insights and decision-making could become distorted as a result. You need a solution that can access and analyze the other 80%: qualitative data.
Exploring qualitative data
Qualitative data is unstructured, meaning it's not in a predefined format. In its raw state, it comes in a variety of forms: text, social media comments, phone call transcripts, various log files, images, audio, video, and more.
Often, the information in qualitative data is categorical: it describes the characteristics or qualities of data units, answering questions like "what type," "which category," or "who" (or "which persona"). These characteristics paint a picture of the context and environment in which the data was produced and help build an understanding of feelings, opinions, and intentions. That's because qualitative data is concerned with the perspective of customers, users, or stakeholders. It's typically collected through less rigid, measurable means than quantitative data: comments, recordings, interviews, focus group reports, and more, usually received in the natural language of the informants.
Qualitative data is much more subjective, “soft” information than quantitative data. However, advanced analytics can now identify and classify this information and transform it into findings that lead to game-changing insights for organizations.
As the sources of data continue to proliferate, an increasing proportion of it is unstructured and qualitative. It’s more complex, and it requires more storage. Traditional methods of gathering and organizing data can’t organize, filter, and analyze this kind of data effectively. Advanced technology and new approaches are needed. What seem at first to be very random, disparate forms of qualitative data require the capacity of data warehouses, data lakes, and NoSQL databases to store and manage them.
Qualitative data benefits: Unlocking understanding
Qualitative data can go where quantitative data can’t. For example, it’s the gateway to sentiment analysis — understanding how users, customers, and stakeholders think and feel, as well as what they do. Techniques that focus on qualitative data, such as content analysis and narrative analysis, enable you to:
- Analyze text to find the most common themes in open-ended data such as user feedback, interview transcripts, and surveys, so you can pinpoint the most important focus areas (a minimal sketch of this follows the list)
- Better interpret feelings or perceptions about your organization, product, services, processes, or brand, to help identify what changes or innovations would be most effective
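As a toy example of that first technique, the sketch below counts recurring terms in open-ended feedback to surface candidate themes. Real content analysis would use proper NLP tooling and text cleaning; the comments and stopword list here are invented for illustration.

```python
# A bare-bones sketch of content analysis: count recurring terms in
# open-ended feedback to hint at themes worth a closer qualitative read.
# The feedback strings and stopword list are invented examples.
from collections import Counter
import re

feedback = [
    "Love the dashboard, but export to PDF is slow",
    "Export keeps failing on large reports",
    "The new dashboard layout is great",
    "Pricing page is confusing",
]

STOPWORDS = {"the", "is", "on", "to", "but", "a", "and", "new"}

words = []
for comment in feedback:
    tokens = re.findall(r"[a-z]+", comment.lower())
    words.extend(t for t in tokens if t not in STOPWORDS)

# The most frequent terms point to candidate themes (here: dashboard, export)
print(Counter(words).most_common(5))
```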
Having access to this information gives you a new dimension of insights on which to base decisions and determine the tactical and strategic direction of your organization. With qualitative data, you can understand intention as well as behavior, thereby making predictive analytics more accurate and giving you fuller insights. You can analyze and learn from the large volume of unstructured data to ensure that your data-driven decisions are as solid as possible.
For example, Skullcandy, the manufacturer of headphones, earbuds, and other audio and wireless products, uses predictive and sentiment analysis to understand customers better. This enables the company’s product development team to explore opportunities for reducing returns and warranty requests for new products before they’re even released and get reactions from customers about current products. Skullcandy uses these insights to hone its new products based on what customers say they like and dislike, so that it can offer more attractive products to its target market.

Getting the most from qualitative data
Qualitative data eludes the rigid categorization and the storage limitations of traditional databases. It provides raw material that is more varied and harder to organize than structured, quantitative data. Making sense of it and deriving patterns from it calls for newer, more advanced technology.
Natural language processing (NLP), powered by machine learning, is how your BI and analytics platform can understand the meaning of unstructured data such as emails, comments, feedback, and instant messages. It enables search-driven analytics: you can ask the system questions in natural, everyday language and receive visualizations and embeddable insights in return, even when what you're asking for is relatively complex from an analytical point of view.
Furthermore, systems powered by AI and augmented analytics continuously learn what people choose to do with data. Algorithms identify patterns in the data, making search results faster, more accurate, and more complete. You can also integrate AI analytics tools with other interfaces like Amazon Alexa and chatbots, drawing on these technologies' impressive developments in NLP.
With all this potential to find new insights, it’s no surprise that 97% of business leaders say their businesses are investing in big data and AI initiatives.
Better together: Working with qualitative data and quantitative data
Quantitative data is the bedrock of your BI and analytics. It provides a solid foundation on which you can build data visualizations and insights that are fundamental to your organization.
With that foundation in place, qualitative data enables you to take your analytics further, get more understanding from a fast-growing array of sources, and power your decision-making with data from even more perspectives.
You have to be able to work with both kinds of data in order to unlock the most comprehensive insights. Together, they can help set the stage for game-changing pivots and innovations that can keep your organization at the head of the pack.

Adam Murray began his career in corporate communications and public relations in London and New York before moving to Tel Aviv. He’s spent the last ten years working with tech companies like Amdocs, Gilat Satellite Systems, and Allot Communications. He holds a Ph.D. in English literature. When he’s not spending time with his wife and son, he’s preoccupied with his beloved football team, Tottenham Hotspur.