
Tag Archives: Code

You don’t code? Do machine learning straight from Microsoft Excel

December 31, 2020   Big Data



Machine learning and deep learning have become an important part of many applications we use every day. There are few domains that the fast expansion of machine learning hasn’t touched. Many businesses have thrived by developing the right strategy to integrate machine learning algorithms into their operations and processes. Others have lost ground to competitors after ignoring the undeniable advances in artificial intelligence.

But mastering machine learning is a difficult process. You need to start with a solid knowledge of linear algebra and calculus, master a programming language such as Python, and become proficient with data science and machine learning libraries such as Numpy, Scikit-learn, TensorFlow, and PyTorch.

And if you want to create machine learning systems that integrate and scale, you’ll have to learn cloud platforms such as Amazon AWS, Microsoft Azure, and Google Cloud.

Naturally, not everyone needs to become a machine learning engineer. But almost everyone who runs a business or organization that systematically collects and processes data can benefit from some knowledge of data science and machine learning. Fortunately, there are several courses that provide a high-level overview of machine learning and deep learning without going too deep into math and coding.

But in my experience, a good understanding of data science and machine learning requires some hands-on experience with algorithms. In this regard, a very valuable and often-overlooked tool is Microsoft Excel.

To most people, MS Excel is a spreadsheet application that stores data in tabular format and performs very basic mathematical operations. But in reality, Excel is a powerful computation tool that can solve complicated problems. Excel also has many features that allow you to create machine learning models directly in your workbooks.

While I’ve been using Excel’s mathematical tools for years, I didn’t come to appreciate its use for learning and applying data science and machine learning until I picked up Learn Data Mining Through Excel: A Step-by-Step Approach for Understanding Machine Learning Methods by Hong Zhou.

Learn Data Mining Through Excel takes you through the basics of machine learning step by step and shows how you can implement many algorithms using basic Excel functions and a few of the application’s advanced tools.

While Excel will in no way replace Python machine learning, it is a great window to learn the basics of AI and solve many basic problems without writing a line of code.

Linear regression machine learning with Excel

Linear regression is a simple machine learning algorithm that has many uses for analyzing data and predicting outcomes. Linear regression is especially useful when your data is neatly arranged in tabular format. Excel has several features that enable you to create regression models from tabular data in your spreadsheets.

One of the most intuitive is the data chart tool, which is a powerful data visualization feature. For instance, the scatter plot chart displays the values of your data on a cartesian plane. But in addition to showing the distribution of your data, Excel’s chart tool can create a machine learning model that can predict the changes in the values of your data. The feature, called Trendline, creates a regression model from your data. You can set the trendline to one of several regression algorithms, including linear, polynomial, logarithmic, and exponential. You can also configure the chart to display the parameters of your machine learning model, which you can use to predict the outcome of new observations.

You can add several trendlines to the same chart. This makes it easy to quickly test and compare the performance of different machine learning models on your data.


Above: Excel’s Trendline feature can create regression models from your data.

In addition to exploring the chart tool, Learn Data Mining Through Excel takes you through several other procedures that can help develop more advanced regression models. These include formulas such as LINEST and LINREG, which calculate the parameters of your machine learning models based on your training data.

The author also takes you through the step-by-step creation of linear regression models using Excel’s basic formulas such as SUM and SUMPRODUCT. This is a recurring theme in the book: You’ll see the mathematical formula of a machine learning model, learn the basic reasoning behind it, and create it step by step by combining values and formulas in several cells and cell arrays.

While this might not be the most efficient way to do production-level data science work, it is certainly a very good way to learn the workings of machine learning algorithms.

Other machine learning algorithms with Excel

Beyond regression models, you can use Excel for other machine learning algorithms. Learn Data Mining Through Excel provides a rich roster of supervised and unsupervised machine learning algorithms, including k-means clustering, k-nearest neighbor, naive Bayes classification, and decision trees.

The process can get a bit convoluted at times, but if you stay on track, the logic will easily fall into place. For instance, in the k-means clustering chapter, you’ll get to use a vast array of Excel formulas and features (INDEX, IF, AVERAGEIF, ADDRESS, and many others) across several worksheets to calculate cluster centers and refine them. This is not a very efficient way to do clustering, but you’ll be able to track and study your clusters as they are refined in each consecutive sheet. From an educational standpoint, the experience is very different from programming books, where you feed your data points to a machine learning library function and it outputs the clusters and their properties.


Above: When doing k-means clustering on Excel, you can follow the refinement of your clusters on consecutive sheets.

In the decision tree chapter, you will go through the process of calculating entropy and selecting features for each branch of your machine learning model. Again, the process is slow and manual, but seeing under the hood of the machine learning algorithm is a rewarding experience.

In many of the book’s chapters, you’ll use the Solver tool to minimize your loss function. This is where you’ll see the limits of Excel, because even a simple model with a dozen parameters can slow your computer down to a crawl, especially if your data sample is several hundred rows in size. But the Solver is an especially powerful tool when you want to fine-tune the parameters of your machine learning model.


Above: Excel’s Solver tool fine-tunes the parameters of your model and minimizes loss functions.

Deep learning and natural language processing with Excel

Learn Data Mining Through Excel shows that Excel can even express advanced machine learning algorithms. There’s a chapter that delves into the meticulous creation of deep learning models. First, you’ll create a single-layer artificial neural network with fewer than a dozen parameters. Then you’ll expand on the concept to create a deep learning model with hidden layers. The computation is very slow and inefficient, but it works, and the components are the same: cell values, formulas, and the powerful Solver tool.


Above: Deep learning with Microsoft Excel gives you a view under the hood of how deep neural networks operate.

In the last chapter, you’ll create a rudimentary natural language processing (NLP) application, using Excel to create a sentiment analysis machine learning model. You’ll use formulas to create a “bag of words” model, preprocess and tokenize hotel reviews, and classify them based on the density of positive and negative keywords. In the process you’ll learn quite a bit about how contemporary AI deals with language and how different it is from the way we humans process written and spoken language.

Excel as a machine learning tool

Whether you’re making C-level decisions at your company, working in human resources, or managing supply chains and manufacturing facilities, a basic knowledge of machine learning will be important if you will be working with data scientists and AI people. Likewise, if you’re a reporter covering AI news or work at a PR agency on behalf of a company that uses machine learning, writing about the technology without knowing how it works is a bad idea (I will write a separate post about the many awful AI pitches I receive every day). In my opinion, Learn Data Mining Through Excel is a smooth and quick read that will help you gain that important knowledge.

Beyond learning the basics, Excel can be a powerful addition to your repertoire of machine learning tools. While it’s not good for dealing with big data sets and complicated algorithms, it can help with the visualization and analysis of smaller batches of data. The results you obtain from a quick round of data mining in Excel can provide pertinent insights for choosing the right direction and machine learning algorithm to tackle the problem at hand.

Ben Dickson is a software engineer and the founder of TechTalks. He writes about technology, business, and politics.

This story originally appeared on Bdtechtalks.com. Copyright 2020


Big Data – VentureBeat

Debugging Code in Dynamics 365 Finance & Operations

November 26, 2020   Microsoft Dynamics CRM

The purpose of this blog is to show how to quickly debug X++ code using the debugging feature in Microsoft Visual Studio for Dynamics 365 Finance & Operations (D365 FO). More specifically, we’ll be debugging an error message issued from D365 FO. To accomplish this, we must first understand the logic behind this error message and trace from where in the code it was issued. To debug the X++ code in…


PowerObjects- Bringing Focus to Dynamics CRM

arXiv now allows researchers to submit code with their manuscripts

October 11, 2020   Big Data

Papers with Code today announced that preprint paper archive arXiv will now allow researchers to submit code alongside research papers, giving computer scientists an easy way to analyze, scrutinize, or reproduce claims of state-of-the-art AI or novel advances in what’s possible.

An assessment of the AI industry released a week ago found that only 15% of papers submitted by researchers today include their code.

Maintained by Cornell University, arXiv hosts manuscripts from fields like biology, mathematics, and physics, and it has become one of the most popular places online for artificial intelligence researchers to publicly share their work. Preprint repositories give researchers a way to share their work immediately, before undergoing what can be a long peer review process as practiced by reputable scholarly journals. Code shared on arXiv will be submitted through Papers with Code and can be found in a Code tab for each paper.

“Having code on arXiv makes it much easier for researchers and practitioners to build on the latest machine learning research,” Papers with Code cocreator Robert Stojnic said in a blog post today. “We also hope this change has ripple effects on broader computational science beyond machine learning. Science is cumulative. Open science, including making available key artefacts such as code, helps to accelerate progress by making research easier to build upon.”

Started in 2018, Papers with Code focuses on encouraging reproducibility of AI model results and, as the name states, submitting research with code. The Papers with Code website shares nearly 2,000 papers and code from across major fields in AI like natural language processing, computer vision, adversarial machine learning, and robotics. Papers with Code was initially founded in part by members of Facebook AI Research. Last year, Facebook and Papers with Code launched PyTorch Hub to encourage reproducibility.

In the past year or so, sharing code along with a research paper manuscript has become standard at major AI research conferences. At ICML 2019, nearly 70% of authors submitted code with their papers by the start of the conference. ICML organizers found that 90% of researchers who submitted code came from academia, and about 27% included an author from industry. Conversely, among papers submitted without code, nearly 84% included an author from industry and about 27% an author from academia. People developing software or AI inside companies may be more likely to view secrecy as important to protecting intellectual property or financial interests.

NeurIPS began to experiment with a code submission policy in 2019 and put an official code submission policy into effect this year. In other news about evolving standards for AI researchers, earlier this year NeurIPS began to require all paper submissions to include social impact statements and declare any potential financial conflicts of interest.

Big Data – VentureBeat

Intel researchers create AI system that rates similarity of 2 pieces of code

July 29, 2020   Big Data

In partnership with researchers at MIT and the Georgia Institute of Technology, Intel scientists say they’ve developed an automated engine — Machine Inferred Code Similarity (MISIM) — that can determine when two pieces of code perform similar tasks, even when they use different structures and algorithms. MISIM ostensibly outperforms current state-of-the-art systems by up to 40 times, showing promise for applications from code recommendation to automated bug fixing.

With the rise of heterogeneous computing — i.e., systems that use more than one kind of processor — software platforms are becoming increasingly complex. Machine programming (a term coined by Intel Labs and MIT) aims to tackle this with automated, AI-driven tools. A key technology is code similarity, or systems that attempt to determine whether two code snippets show similar characteristics or achieve similar goals. Yet building accurate code similarity systems is a relatively unsolved problem.

MISIM works because of its novel context-aware semantic structure (CASS), which susses out the purpose of a given bit of source code using AI and machine learning algorithms. Once the structure of the code is integrated with CASS, algorithms assign similarity scores based on the jobs the code is designed to perform. If two pieces of code look different but perform the same function, the models rate them as similar — and vice versa.

CASS can be configured to a specific context, enabling it to capture information that describes the code at a higher level. And it can rate code without using a compiler, a program that translates human-readable source code into computer-executable machine code. This confers the usability advantage of allowing developers to execute on incomplete snippets of code, according to Intel.

Intel says it’s expanding MISIM’s feature set and moving it from the research to the demonstration phase, with the goal of creating a code recommendation engine to assist internal and external researchers programming across its architectures. The proposed system would be able to recognize the intent behind an algorithm and offer candidate codes that are semantically similar but with improved performance.

That could save employers a few headaches — not to mention helping developers themselves. According to a study published by the University of Cambridge’s Judge Business School, programmers spend 50.1% of their work time not programming and half of their programming time debugging. And the total estimated cost of debugging is $312 billion per year. AI-powered code suggestion and review tools like MISIM promise to cut development costs substantially while enabling coders to focus on more creative, less repetitive tasks.

“If we’re successful with machine programming, one of the end goals is to enable the global population to be able to create software,” Justin Gottschlich, Intel Labs principal scientist and director of machine programming research, told VentureBeat in a previous interview. “One of the key things you want to do is enable people to simply specify the intention of what they’re trying to express or trying to construct. Once the intention is understood, with machine programming, the machine will handle the creation of the software — the actual programming.”

Big Data – VentureBeat

Amazon launches AI-powered code review service CodeGuru in general availability

June 30, 2020   Big Data

Amazon today announced the general availability of CodeGuru, an AI-powered developer tool that provides recommendations for improving code quality. It was first revealed during the company’s Amazon Web Services (AWS) re:Invent 2019 conference in Las Vegas, and starting today, it’s available with usage-based pricing.

Software teams perform code reviews to check the logic, syntax, and style before new code is added to an existing application codebase — it’s an industry-standard practice. But it’s often challenging to find enough developers to perform reviews and monitor the apps post-deployment. Plus, there’s no guarantee those developers won’t miss problems, resulting in bugs and performance issues.

CodeGuru ostensibly solves this with a component that integrates with existing integrated development environments (IDEs) and taps AI algorithms trained on over 10,000 of the most popular open source projects to evaluate code as it’s being written. Where there’s an issue, CodeGuru proffers a human-readable comment that explains what the issue is and suggests potential remediations. The tool also finds the most inefficient and unproductive lines of code by creating a profile that takes into account things like latency and processor utilization.

It’s a two-part system. CodeGuru Reviewer — which uses a combination of rule mining and supervised machine learning models — detects deviations from best practices for using AWS APIs and SDKs, flagging common problems that can lead to production issues, such as missing pagination, error handling with batch operations, and the use of classes that are not thread-safe. Developers commit their code as usual to the repository of their choice (e.g. GitHub, GitHub Enterprise, Bitbucket Cloud, and AWS CodeCommit) and add Reviewer as one of the code reviewers. Reviewer then analyzes existing code bases in the repository, identifies bugs and issues, and creates a baseline for successive code reviews by opening a pull request. The service also provides a dashboard that lists information for all code reviews, which reflects feedback solicited from developers.


CodeGuru Profiler delivers specific recommendations on issues like extravagant recreation of objects, expensive deserialization, usage of inefficient libraries, and excessive logging. Users install an agent in their app that observes the app run time and profiles the app to detect code quality issues (along with details on latency and CPU usage). Profiler then uses machine learning to automatically identify code and anomalous behaviors that are most impacting latency and CPU usage. The information is brought together in a profile that shows the areas of code that are most inefficient. This profile includes recommendations on how developers can fix issues to improve performance and also estimates the cost of continuing to run inefficient code.

Amazon says that CodeGuru — which encodes AWS’ best practices — has been used internally to optimize 80,000 applications, leading to tens of millions of dollars in savings. In fact, Amazon claims that some teams were able to reduce processor utilization by 325% and lower costs by 39% in just a year.

CodeGuru is available now in US East (N. Virginia), US East (Ohio), US West (Oregon), EU (Ireland), EU (London), EU (Frankfurt), EU (Stockholm), Asia Pacific (Singapore), Asia Pacific (Sydney), and Asia Pacific (Tokyo) with availability expanding to additional regions in the coming months. Early adopters include Atlassian, cloud tech consultancy EagleDream Technologies, enterprise software developer DevFactory, condominium review website operator Renga, and scheduling program startup YouCanBook.me.

Big Data – VentureBeat

Improve Row Count Estimates for Table Variables without Changing Code

May 27, 2020   BI News and Info

You have probably heard that table variables work fine when they contain only a small number of records but don’t perform well when they contain a large number of records. A solution for this problem was implemented in version 15.x of SQL Server (Azure SQL Database and SQL Server 2019) with the rollout of a feature called Table Variable Deferred Compilation.

Table Variable Deferred Compilation is one of many new performance features introduced in Azure SQL Database and SQL Server 2019. It is part of the Intelligent Query Processing (IQP) family. See Figure 1 for a diagram that shows all the IQP features introduced in Azure SQL Database and SQL Server 2019, as well as the features that were originally part of Adaptive Query Processing in the older generation of Azure SQL Database and SQL Server 2017.


Figure 1: Intelligent Query Processing

In releases of SQL Server prior to 15.x, the database engine made an incorrect assumption about the number of rows in a table variable. Because of this bad assumption, the generated execution plan didn’t work well when a table variable contained lots of rows. With the introduction of SQL Server 2019, the database engine now defers the compilation of a query that uses a table variable until the table variable is used the first time. By doing this, the database engine can more accurately identify cardinality estimates for table variables. With more accurate cardinality numbers, queries that have large numbers of rows in a table variable perform better. Those queries need to run against a database with the database compatibility level set to 150 (version 15.x of SQL Server) to take advantage of this feature. To better understand how deferred compilation improves the performance of table variables that contain a large number of rows, I’ll run through an example, but first, I’ll discuss what the problem is with table variables in versions of SQL Server prior to 15.x.

What is the Problem with Table Variables?

A table variable is defined using a DECLARE statement in a batch or stored procedure. Table variables don’t have distribution statistics and don’t trigger recompiles. Because of this, SQL Server is not able to estimate the number of rows in a table variable the way it does for normal tables. When the optimizer compiles code that contains a table variable, prior to 15.x, it assumes the table is empty. This assumption causes the optimizer to compile the query using an expected row count of 1 as the cardinality estimate for the table variable. Because the optimizer thinks a table variable contains a single row, it picks operators for the execution plan that work well with a small set of records, like the NESTED LOOPS operator for a JOIN operation. Operators that work well on a small number of records do not always scale well when a table variable contains a large number of rows. Microsoft documented this problem and recommends temporary tables as a better choice when more than 100 rows are involved. Additionally, Microsoft recommends that if you are joining a table variable with other tables, you consider using the query hint RECOMPILE to make sure that table variables get the correct cardinality estimates. Without proper cardinality estimates, queries with large table variables are known to perform poorly.
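To make the RECOMPILE recommendation concrete, here is a minimal sketch of that pre-150 workaround (my addition, not part of the original article), reusing the same WideWorldImportersDW objects the later listings use:

-- Pre-150 workaround: OPTION (RECOMPILE) forces the statement to compile
-- after @MyCities has been populated, so the optimizer sees the real row count
DECLARE @MyCities TABLE ([City Key] int NOT NULL);

INSERT INTO @MyCities
  SELECT [City Key] FROM Dimension.City;

SELECT O.[Order Key], TV.[City Key]
FROM Fact.[Order] AS O
INNER JOIN @MyCities AS TV
  ON O.[City Key] = TV.[City Key]
OPTION (RECOMPILE);

The trade-off is a compile on every execution, which is why the deferred compilation behavior described next is attractive: it gets the better estimate without the extra recompiles.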

With the introduction of version 15.x and the Table Variable Deferred Compilation feature, the optimizer delays the compilation of a query that uses a table variable until just before it is used the first time. This allows the optimizer to know the correct cardinality estimates of a table variable. When the optimizer has an accurate cardinality estimate, it has a good chance of picking execution plan operators that perform well for the number of rows in a table variable. In order for the optimizer to defer the compilation, the database must have its compatibility level set to 150. To show how deferred compilation of table variables works, I’ll show an example of this new feature in action.
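Before trying the examples, it may be worth confirming which compatibility level the database is currently running under; a quick check (my addition) is:

-- Check the current compatibility level of the test database
SELECT name, compatibility_level
FROM sys.databases
WHERE name = N'WideWorldImportersDW';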

Table Variable Deferred Compilation in Action

To understand how deferred compilation works, I will run through some sample code that uses a table variable in a JOIN operation. That sample code can be found in Listing 1.

Listing 1: Sample Test Code that uses Table Variable in JOIN operation

USE WideWorldImportersDW;
GO

DECLARE @MyCities TABLE ([City Key] int not null);

INSERT INTO @MyCities
  SELECT [City Key] FROM Dimension.City;

SELECT O.[Order Key], TV.[City Key]
FROM Fact.[Order] as O INNER JOIN @MyCities as TV
ON O.[City Key] = TV.[City Key];

As you can see, this code uses the WideWorldImportersDW database, which can be downloaded here. In this script, I first declare my table variable @MyCities and then insert 116,295 rows from the Dimension.City table into the variable. That variable is then used in an INNER JOIN operation with the Fact.[Order] table.
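If you are following along, a quick count (my addition) confirms that Dimension.City supplies the 116,295 rows mentioned above:

-- Should return 116,295 in an unmodified copy of WideWorldImportersDW
SELECT COUNT(*) AS CityRows
FROM Dimension.City;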

To show the deferred compilation in action, I will need to run the code in Listing 1 twice. The first execution will be run against WideWorldImportersDW using compatibility level 140, and the second execution will run against this same database using compatibility level 150. The script I will use to compare how table variables work under the two different compatibility levels can be found in Listing 2.

Listing 2: Comparison Test Script

USE WideWorldImportersDW;
GO
-- Turn on time statistics
SET STATISTICS TIME ON;
GO
---------------------------------------------------
-- Test #1 - Using SQL Server 2017 compatibility --
---------------------------------------------------
ALTER DATABASE WideWorldImportersDW SET COMPATIBILITY_LEVEL = 140;
GO
DECLARE @MyCities TABLE ([City Key] int not null);
INSERT INTO @MyCities
  SELECT [City Key] FROM Dimension.City;
SELECT O.[Order Key], TV.[City Key]
FROM Fact.[Order] as O JOIN @MyCities as TV
ON O.[City Key] = TV.[City Key]
---------------------------------------------------
-- Test #2 - Using SQL Server 2019 compatibility --
---------------------------------------------------
ALTER DATABASE WideWorldImportersDW SET COMPATIBILITY_LEVEL = 150;
GO
USE WideWorldImportersDW;
GO
DECLARE @MyCities TABLE ([City Key] int not null);
INSERT INTO @MyCities
  SELECT [City Key] FROM Dimension.City;
SELECT O.[Order Key], TV.[City Key]
FROM Fact.[Order] as O JOIN @MyCities as TV
ON O.[City Key] = TV.[City Key];
GO

When I run the code in Listing 2, I run it from a query window in SQL Server Management Studio (SSMS), with the Include Actual Execution Plan query option turned on. The execution plans I get when I run the Test #1 and Test #2 queries can be found in Figure 2 and Figure 3, respectively.


Figure 2: Execution Plan for Test #1 code in Listing 2, using compatibility level 140


Figure 3: Execution Plan for Test #2 code in Listing 2, using compatibility level 150

If you compare the execution plans in Figures 2 and 3, you will see that they are a little different. When compatibility level 140 was used, my test query used a NESTED LOOPS operation to join the table variable to the Fact.[Order] table, whereas under compatibility level 150, the optimizer picked a HASH MATCH operator for the join operation. This occurred because the Test #1 query used an estimated row count of 1 for the table variable @MyCities, whereas the Test #2 query was able to use the deferred table variable compilation feature, which allowed the optimizer to use an estimated row count of 116,295 for the table variable. These estimated row count numbers can be verified by looking at the Table Scan operator properties for each execution plan, which are shown in Figure 4 and Figure 5, respectively.


Figure 4: Table Scan properties when Test #1 query ran under compatibility level 140


Figure 5: Table Scan properties when Test #2 query ran under compatibility level 150

Reviewing the table scan properties shows that the optimizer used the correct estimated row count when compatibility level 150 was used, whereas under compatibility level 140 the optimizer estimated a row count of 1. Also note that the query that ran under compatibility level 150 used BATCH mode for the TABLE SCAN operation, whereas the compatibility level 140 query ran using ROW mode. You may be asking yourself now: how much faster does the test code run under compatibility level 150 than under the older compatibility level 140?

Comparing Performance between Compatibility Mode 140 and 150

In order to compare the performance of running my test query under both compatibility levels, I executed the script in Listing 1 ten different times under each of the two compatibility levels. I then calculated the average CPU and elapsed time for the two different compatibility levels, and finally graphed the average performance numbers in Figure 6.
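The article doesn’t show how the averages were gathered. One low-effort alternative to copying SET STATISTICS TIME output by hand is to read averages from the plan cache after the runs; the following is only a sketch (my addition) and assumes the test batches are still cached:

-- Average CPU and elapsed time (ms) for cached statements that reference @MyCities
SELECT qs.execution_count,
       qs.total_worker_time  / 1000.0 / qs.execution_count AS avg_cpu_ms,
       qs.total_elapsed_time / 1000.0 / qs.execution_count AS avg_elapsed_ms,
       SUBSTRING(st.text, 1, 100) AS query_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
WHERE st.text LIKE N'%@MyCities%'
  AND st.text NOT LIKE N'%dm_exec_query_stats%'   -- exclude this diagnostic query itself
ORDER BY qs.total_worker_time DESC;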


Figure 6: Performance Comparison between Compatibility Mode 140 and 150.

When the test query was run under compatibility level 150, it used a fraction of the CPU it used under compatibility level 140, and its elapsed time was 4.6 times faster than when using compatibility level 140. This is a significant performance improvement. But since batch mode processing was used for the compatibility level 150 test, I can’t assume all of this improvement came from the Deferred Table Variable Compilation feature alone.

In order to remove the batch mode from my performance test, I’m going to run my test query under compatibility mode 150 one more time. But this time my test will run with a query hint to disable the batch mode feature. The script I will use for this additional test can be found in Listing 3.

Listing 3: Test #2 query with Batch Mode disabled

USE WideWorldImportersDW;
GO
DECLARE @MyCities TABLE ([City Key] int not null);
INSERT INTO @MyCities
  SELECT [City Key] FROM Dimension.City;
SELECT O.[Order Key], TV.[City Key]
FROM Fact.[Order] as O JOIN @MyCities as TV
ON O.[City Key] = TV.[City Key]
OPTION(USE HINT('DISALLOW_BATCH_MODE'));
GO 10

The graph in Figure 7 shows the new performance comparison results using deferred compilation and row mode features when my test ran under compatibility level 150.


Figure 7: Table Variable Deferred Compilation Comparison with Batch Mode disabled

With the Batch Mode feature disabled, CPU time went up significantly compared to my previous test when batch mode was enabled, but the elapsed time was only slightly different. Deferred compilation still seems to provide significant performance improvements by delaying the compilation of a query until the table variable is used the first time. I have to wonder whether the deferred compilation feature also improves the cardinality estimate issue caused by parameter sniffing with a parameterized query.

Does Deferred Compilation Help with Parameter Sniffing?

Parameter sniffing has been known to cause performance issues when a compiled execution plan is executed multiple times using different parameter values. But does the deferred table variable compilation feature in 15.x solve this parameter sniffing issue? To determine whether or not it does, let me create a stored procedure named GetOrders to test this out. That stored procedure CREATE statement can be found in Listing 4.

Listing 4: Code to test out parameter sniffing

USE WideWorldImportersDW;
GO
CREATE OR ALTER PROC GetOrders(@CityKey int)
AS
DECLARE @MyCities TABLE ([City Key] int not null);
INSERT INTO @MyCities
  SELECT [City Key] FROM Dimension.City
  WHERE [City Key] < @CityKey;
SELECT *
FROM Fact.[Order] as O INNER JOIN @MyCities as TV
ON O.[City Key] = TV.[City Key]
GO

The number of rows returned by the stored procedure in Listing 4 is controlled by the value passed in the parameter @CityKey. To test whether the deferred compilation feature solves the parameter sniffing issue, I will run the code in Listing 5.

Listing 5: Code to see if deferred compilation resolves parameter sniffing issue

USE WideWorldImportersDW;
GO
SET STATISTICS IO ON;
DBCC FREEPROCCACHE;
-- First Test
EXEC GetOrders @CityKey = 10;
-- Second Test
EXEC GetOrders @CityKey = 231412;

The code in Listing 5 first runs the test stored procedure using a value of 10 for the parameter. The second execution uses the value 231412 for the parameter. These two different parameters cause the stored procedure to process drastically different numbers of rows. After I run the code in Listing 5, I will explore the execution plan for each execution of the stored procedure. I will look at the properties of the TABLE SCAN operation to see what the optimizer thinks are the estimated and actual row counts for the table variable in each execution. The table scan properties for each execution can be found in Figure 8 and Figure 9, respectively.


Figure 8: Table Scan Statistics for the first execution of the test stored procedure


Figure 9: Table Scan Statistics for the second execution of the test stored procedure

Both executions got the same estimated row count but considerably different actual row counts. This means that the table variable deferred compilation feature of version 15.x doesn’t resolve the parameter sniffing problem for a stored procedure.
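Since deferred compilation doesn’t address it, the usual parameter sniffing remedies still apply. One of the simplest is recompiling the statement on each execution, at the cost of extra compile time; here is a hedged sketch (my addition) applied to the article’s test procedure:

USE WideWorldImportersDW;
GO
CREATE OR ALTER PROC GetOrders(@CityKey int)
AS
DECLARE @MyCities TABLE ([City Key] int not null);
INSERT INTO @MyCities
  SELECT [City Key] FROM Dimension.City
  WHERE [City Key] < @CityKey;
SELECT *
FROM Fact.[Order] as O INNER JOIN @MyCities as TV
ON O.[City Key] = TV.[City Key]
OPTION (RECOMPILE);  -- re-estimate for each parameter value
GO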

Which Editions Support Deferred Compilation for Table Variables?

Like many cool new features in past releases of SQL Server, new functionality is often first introduced in Enterprise edition only, and then over time it might become available in other editions. You will be happy to know that the Deferred Compilation for Table Variables feature doesn’t follow this typical pattern. As of the RTM release of SQL Server 2019, the deferred compilation feature is available in all editions of SQL Server, as documented here.
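If you want to confirm what your instance is running before relying on the feature, a quick check (my addition):

-- Edition and version of the current instance (15.x = SQL Server 2019)
SELECT SERVERPROPERTY('Edition')        AS Edition,
       SERVERPROPERTY('ProductVersion') AS ProductVersion;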

Improve Performance of Code using Table Variables without Changing Any Code

TSQL code that contains a table variable has been known not to perform well when the variable contains lots of rows. This is because the code that declares the table variable is compiled before the table has been populated with any rows of data. That has all changed when TSQL code is executed in SQL Server 2019 or Azure SQL DB with the database running under compatibility level 150. When using a database in compatibility level 150, the optimizer defers the compilation of code using a table variable until the first time the table variable is used in a query. By deferring the compilation, SQL Server can obtain a more accurate estimate of the number of rows in the table variable. When the optimizer has better cardinality estimates for a table variable, it can pick more appropriate operators for the execution plan, which leads to better performance. Therefore, if you have found code where table variables don’t scale well when they contain a lot of rows, then version 15.x of SQL Server might help. By running TSQL code under compatibility level 150, you can improve the performance of code using table variables without changing any code.

SQL – Simple Talk

Reduce CPU of Large Analytic Queries Without Changing Code

March 27, 2020   BI News and Info

When Microsoft came out with columnstore in SQL Server 2012, they introduced a new way to process data called Batch Mode. Batch mode processes a group of rows together as a batch, instead of processing the data row by row. By processing data in batches, SQL Server uses less CPU than row-by-row processing. To take advantage of batch mode, a query had to reference a table that contained a columnstore index. If your query only involved tables that contained data in rowstores, then your query would not use batch mode. That has now changed. With the introduction of version 15.x of SQL Server, aka SQL Server 2019, Microsoft introduced a new feature called Batch Mode on Rowstore.

Batch Mode on Rowstore is one of many new features introduced in Azure SQL Database and SQL Server 2019 to help speed up rowstore queries that don’t involve a columnstore. The new Batch Mode on Rowstore feature can improve the performance of large analytic queries that scan many rows, where these queries aggregate, sort, or group selected rows. Microsoft included this new batch mode feature in the Intelligent Query Processing (IQP) family. See Figure 1 for a diagram from Microsoft’s documentation that shows all the IQP features introduced in Azure SQL Database and SQL Server 2019. It also shows the features that were originally part of Adaptive Query Processing in the older generation of Azure SQL Database and SQL Server 2017.


Figure 1: Intelligent Query Processing

Batch Mode on Rowstore can help speed up your big data analytic queries but might not kick in for smaller OLTP queries (more on this later). Batch mode has been around for a while and supports columnstore operators, but it wasn’t until SQL Server version 15.x that batch mode worked on Rowstores without performing a hack. Before seeing the new Batch Mode on Rowstore feature in action, let me first explain how batch mode processing works.

How Batch Mode Processing Works

When the database engine processes a Transact-SQL statement, the underlying data is processed by one or more operators. These operators can process the data using two different modes: row or batch. At a high level, row mode can be thought of as processing rows of data one row at a time, whereas batch mode processes multiple rows of data together in a batch. Processing batches of rows at a time, rather than row by row, can reduce CPU usage.

When batch mode is used for rowstore data, the rows of data are scanned and loaded into a vector storage structure, known as a batch. Each batch is a 64K internal storage structure. This storage structure can contain between 64 and 900 rows of data, depending on the number of columns involved in the query. Each column used by the query is stored in a continuous column vector of fixed size elements, where the qualifying rows vector indicates which rows are still logically part of the batch (see Figure 2 which came from a Microsoft Research paper).

Rows of data can be processed very efficiently when an operation uses batch mode, as compared to row mode processing. For instance, when a batch mode filter operation needs to qualify rows that meet a given column filter criteria, all that is needed is to scan the vector that contains the filtered column and mark the row appropriately in the qualifying rows vector, based on whether or not the column value meets the filter criteria.


Figure 2: A row batch is stored column-wise and contains one vector for each column plus a bit vector indicating qualifying rows

SQL Server executes fewer instructions per row when using batch mode over row mode. By reducing the number of instructions when using batch mode, queries typically use less CPU than row mode queries. Therefore, if a system is CPU bound, then batch mode might help reduce the environment’s CPU footprint.

In a given execution plan, SQL Server might use both batch and row mode operators, because not all operators can process data in batch mode. When mixed-mode operations are needed, SQL Server needs to transition between batch mode and row mode processing. This transition comes at a cost. Therefore, SQL Server tries to minimize the number of transitions to help optimize the processing of mixed-mode execution plans.

For the engine to consider batch mode for a rowstore, the database compatibility level must be set to 150. With the compatibility level set to 150, the database engine performs a few heuristic checks to make sure the query qualifies to use batch mode. One of the checks is to make sure the rowstore contains a significant number of rows. Currently, the magic number appears to be 131,072. Dmitry Pilugin wrote an excellent post on this magic number. I also verified that this is still the magic number for the RTM release of SQL Server 2019. That means that batch mode doesn’t kick in for smaller tables (fewer than 131,072 rows), even if the database is set to compatibility level 150. Another heuristic check verifies that the rowstore is using either a b-tree or a heap for its storage structure. Batch mode doesn’t kick in if the table is an in-memory table. The cost of the plan is also considered. If the database optimizer finds a cheaper plan that doesn’t use Batch Mode on Rowstore, then the cheaper plan is used.
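Assuming the 131,072-row heuristic still holds on your build, you can at least see which tables in a database clear it; the following sketch (my addition) sums heap and clustered-index row counts per user table:

-- User tables large enough to clear the reported 131,072-row heuristic
SELECT OBJECT_SCHEMA_NAME(ps.[object_id]) AS schema_name,
       OBJECT_NAME(ps.[object_id])        AS table_name,
       SUM(ps.row_count)                  AS approx_rows
FROM sys.dm_db_partition_stats AS ps
WHERE ps.index_id IN (0, 1)   -- 0 = heap, 1 = clustered index
  AND OBJECTPROPERTY(ps.[object_id], 'IsUserTable') = 1
GROUP BY ps.[object_id]
HAVING SUM(ps.row_count) >= 131072
ORDER BY approx_rows DESC;

Clearing the threshold doesn’t guarantee batch mode, since the other checks (storage type, plan cost) still apply.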

To see how this new batch mode feature works on a rowstore, I set up a test that ran a couple of different aggregate queries against the WideWorldImportersDW database.

Batch Mode on Rowstore In Action

This section demonstrates running a simple test aggregate query to summarize a couple of columns of a table that uses heap storage. The example runs the test aggregate query twice. The first execution uses compatibility level 140, so the query must use row mode operators to process the test query. The second execution runs under compatibility mode 150 to demonstrate how batch mode improves the query processing for the same test query.

After running the test query, I’ll explain how the graphical execution plans show the different operators used between the two test query executions. I’ll also compare the CPU and Elapsed time used between the two queries to identify the performance improvement using batch mode processing versus row mode processing. Before showing my testing results, I’ll first explain how I set up my testing environment.

Setting up Testing Environment

I used the WideWorldImportersDW database as a starting point for my test data. To follow along, you can download the database backup for this DB here. I restored the database to an instance of SQL Server 2019 RTM running on my laptop. Since the Fact.[Order] table in this database isn’t that big, I ran the code in Listing 1 to create a bigger fact table named Fact.OrderBig. The test query aggregates data using this newly created fact table.

Listing 1: Code to create the test table Fact.OrderBig

USE WideWorldImportersDW;
GO
CREATE TABLE Fact.[OrderBig](
    [Order Key] [bigint],
    [City Key] [int] NOT NULL,
    [Customer Key] [int] NOT NULL,
    [Stock Item Key] [int] NOT NULL,
    [Order Date Key] [date] NOT NULL,
    [Picked Date Key] [date] NULL,
    [Salesperson Key] [int] NOT NULL,
    [Picker Key] [int] NULL,
    [WWI Order ID] [int] NOT NULL,
    [WWI Backorder ID] [int] NULL,
    [Description] [nvarchar](100) NOT NULL,
    [Package] [nvarchar](50) NOT NULL,
    [Quantity] [int] NOT NULL,
    [Unit Price] [decimal](18, 2) NOT NULL,
    [Tax Rate] [decimal](18, 3) NOT NULL,
    [Total Excluding Tax] [decimal](18, 2) NOT NULL,
    [Tax Amount] [decimal](18, 2) NOT NULL,
    [Total Including Tax] [decimal](18, 2) NOT NULL,
    [Lineage Key] [int] NOT NULL);
GO
INSERT INTO Fact.OrderBig
   SELECT * FROM Fact.[Order];
GO 100

The code in Listing 1 created the Fact.OrderBig table that is 100 times the size of the original Fact.[Order] table with 23,141,200 rows.
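A quick sanity check (my addition) after the load, which also shows the table is far past the 131,072-row heuristic discussed earlier:

-- Expect 100 x 231,412 = 23,141,200 rows
SELECT COUNT_BIG(*) AS OrderBigRows
FROM Fact.OrderBig;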

Comparison Test Script

To do a comparison test between batch mode and row mode, I ran two different test queries found in Listing 2.

Listing 2: Test script

USE WideWorldImportersDW;
GO
-- Turn on time statistics
SET STATISTICS IO, TIME ON;
-- Clean buffers so cold start performed
DBCC DROPCLEANBUFFERS
GO
-- Prepare Database Compatibility level for Test #1
ALTER DATABASE WideWorldImportersDW SET COMPATIBILITY_LEVEL = 140;
GO
-- Test #1
SELECT [Customer Key],
       SUM(Quantity) AS TotalQty,
       AVG(Quantity) AS AvgQty,
       AVG([Unit Price]) AS AvgUnitPrice
FROM Fact.[OrderBig]
WHERE [Customer Key] > 10 and [Customer Key] < 100
GROUP BY [Customer Key]
ORDER BY [Customer Key];
GO
-- Clean buffers so cold start performed
DBCC DROPCLEANBUFFERS
GO
-- Prepare Database Compatibility level for Test #2
ALTER DATABASE WideWorldImportersDW SET COMPATIBILITY_LEVEL = 150;
GO
-- Test #2
SELECT [Customer Key],
       SUM(Quantity) AS TotalQty,
       AVG(Quantity) AS AvgQty,
       AVG([Unit Price]) AS AvgUnitPrice
FROM Fact.[OrderBig]
WHERE [Customer Key] > 10 and [Customer Key] < 100
GROUP BY [Customer Key]
ORDER BY [Customer Key];
GO

The code in Listing 2 executes two different tests, collects some performance statistics, and cleans the data buffer cache between each test. Both tests run the same simple aggregate query against the Fact.OrderBig table. Test #1 runs the aggregate SELECT statement using compatibility level 140, whereas Test #2 runs the same aggregate SELECT statement using compatibility level 150. By setting the compatibility level to 140, Test #1 uses row mode processing, while Test #2, under compatibility level 150, allows batch mode to be considered for the test query. Additionally, I turned on the TIME statistics so I could measure performance (CPU and elapsed time) for each test. By doing this, I can validate the performance note shown in Figure 3, which comes from this Microsoft documentation.


Figure 3: Documentation Note on Performance

When I ran my test script in Listing 2, I executed it from a SQL Server Management Studio (SSMS) query window. In that query window, I enabled the Include Actual Execution Plan option so that I could compare the execution plans created for both of my tests. Let me review the execution artifacts created when I ran my test script in Listing 2.

Review Execution Artifacts

When I ran my test script, I collected CPU and Elapsed Time statistics as well as the actual execution plans for each execution of my test aggregate query. In this section, I’ll review the different execution artifacts to compare the differences between row mode and batch mode processing.

The CPU and Elapsed Time statistics, as well as the actual execution plan, for my first test query, which ran under compatibility level 140, can be found in Figure 4 and Figure 5, respectively.


Figure 4: CPU and Elapsed Time Statistics for Test #1


Figure 5: Actual Execution Plan under Compatibility Level 140 for Query 1

Figures 6 and 7 below show the time statistics and the actual execution plan when I ran my test query under compatibility level 150.

Figure 6: Execution Statistics for Test #2


Figure 7: Execution Plan for Test #2

The first thing to note is that the plan that ran under compatibility level 150 (Figure 7) is more streamlined than the one that ran under compatibility level 140 (Figure 5). From just looking at the execution plan for the second test query, I can’t tell whether the query (which ran under compatibility level 150) uses batch mode or not. To find out, you must right-click on the SELECT icon in the execution plan for the Test #2 query (Figure 7) and then select the Properties item from the context menu. Figure 8 shows the properties of this query.


Figure 8: Properties for Compatibility Level 150 Query (Test #2)

Notice that the property BatchModeOnRowstoreUsed is True. This property is a new showplan attribute that Microsoft added in SSMS version 18. When this property is true, it means that some of the operators used in processing Test #2 did use a batch mode operation on the Rowstore Fact.OrderBig table.
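If you would rather scan the plan cache for plans that used the feature than open each plan by hand, a rough sketch (my addition) follows; it is a plain string search on the showplan XML rather than a proper XQuery, so treat it as a diagnostic convenience only:

-- Cached plans whose showplan XML reports BatchModeOnRowstoreUsed="true"
SELECT TOP (20)
       cp.usecounts,
       SUBSTRING(st.text, 1, 100) AS query_text
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle)   AS st
WHERE CONVERT(nvarchar(max), qp.query_plan) LIKE N'%BatchModeOnRowstoreUsed="true"%';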

To review which operators used Batch Mode on Rowstore, you must review the properties of each operator. Figure 9 has some added annotations to the execution plan that shows which operators used batch mode processing and which ones used row mode processing.


Figure 9: Execution Plan for Batch Mode query with Operator property annotations

If you look at the Table Scan (Heap) operator, you can see that the Fact.OrderBig table is a RowStore by reviewing the Storage Property. You can also see that this operation used batch mode by looking at the Actual Execution Mode property. All the other operators ran in batch mode, except the Parallelism operator, which used row mode.

The test table (Fact.OrderBig) contains 23,141,200 rows and the test query referenced 3 different columns. The query didn’t need all those rows because it was filtered to include only the rows where [Customer Key] was greater than 10 and less than 100. To determine the number of batches the query created, look at the properties of the table scan operator in the execution plan, which is shown in Figure 10.


Figure 10: Number of batches used for Test #2.

The Actual Number of Batches property in Figure 10 shows that the table scan operator of the Test #2 query created 3,587 batches. To determine the number of rows in each batch, divide the Actual Number of Rows by the Actual Number of Batches. Using this formula, I got, on average, 899.02 rows per batch (roughly 3.2 million qualifying rows spread across the 3,587 batches).

The cost estimate for each of the queries is the same, 50%. Therefore, to measure performance between batch mode and row mode, I’ll have to look at the TIME statistics.

Comparing Performance of Batch Mode and Row Mode

To compare performance between running batch mode and row mode queries, I ran my test script in Listing 2 ten different times. I then averaged the CPU and Elapsed times between my two different tests and then graphed the results in the chart found in Figure 11.


Figure 11: CPU and Elapsed time Comparison between Row Mode and Batch Mode

The chart in Figure 11 shows that the row mode test query used a little more than 30% more CPU than the batch mode test query. Both the batch and row mode queries ran in about the same elapsed time. Just as the note (Figure 3) above suggested, this first test showed that considerable CPU improvement can be gained when a simple aggregate query uses batch mode processing. But not all queries are created equal when it comes to performance improvements using Batch Mode versus Row Mode.

Not All Queries are Created Equal When It Comes to Performance

The previous test showed a 30% improvement in CPU but little improvement in Elapsed Time. The resource (CPU and Elapsed Time) improvements using Batch Mode operations versus Row mode depend on the query. Here is another contrived test that shows some drastic improvements in Elapsed Time, using the new Batch Mode on Rowstore feature. The test script I used for my second performance test can be found in Listing 3.

Listing 3: Stock Item Key Query Test Script

-- Turn on time statistics
SET STATISTICS IO, TIME ON;
-- Clean buffers so cold start performed
DBCC DROPCLEANBUFFERS
GO
-- Prepare Database Compatibility level for Test #1
ALTER DATABASE WideWorldImportersDW SET COMPATIBILITY_LEVEL = 140;
GO
SELECT [Stock Item Key],[City Key],[Order Date Key],[Salesperson Key],
    AVG(Quantity) OVER(PARTITION BY [Stock Item Key]) AS StockAvgQty,
    AVG(Quantity) OVER(PARTITION BY [Stock Item Key],[City Key])
        AS StockCityAvgQty,
    AVG(Quantity) OVER(PARTITION BY [Stock Item Key],[City Key],
        [Order Date Key]) AS StockCityDateAvgQty,
    AVG(Quantity) OVER(PARTITION BY [Stock Item Key],[City Key],
        [Order Date Key],[Salesperson Key])
        AS StockCityDateSalespersonAvgQty
FROM Fact.OrderBig
WHERE [Customer Key] > 10 and [Customer Key] < 100
-- Clean buffers so cold start performed
DBCC DROPCLEANBUFFERS
GO
-- Prepare Database Compatibility level for Test #2
ALTER DATABASE WideWorldImportersDW SET COMPATIBILITY_LEVEL = 150;
GO
SELECT [Stock Item Key],[City Key],[Order Date Key],[Salesperson Key],
    AVG(Quantity) OVER(PARTITION BY [Stock Item Key]) AS StockAvgQty,
    AVG(Quantity) OVER(PARTITION BY [Stock Item Key],[City Key])
        AS StockCityAvgQty,
    AVG(Quantity) OVER(PARTITION BY [Stock Item Key],[City Key],
        [Order Date Key]) AS StockCityDateAvgQty,
    AVG(Quantity) OVER(PARTITION BY [Stock Item Key],[City Key],
        [Order Date Key],[Salesperson Key])
        AS StockCityDateSalespersonAvgQty
FROM Fact.OrderBig
WHERE [Customer Key] > 10 and [Customer Key] < 100

In Listing 3, I used the OVER clause to create four different aggregations, where each aggregation had a different PARTITION specification. To gather the performance statistics for the Listing 3 queries, I ran this script ten different times. Figure 12 shows the CPU and Elapsed Time numbers graphically.


Figure 12: CPU and Elapsed Time comparison for Window Function Query test

As you can see, by creating the different aggregations in Listing 3, I once again saw a big performance improvement in CPU (around 72%). This time, I also got a big improvement in Elapsed Time (a little more than 45%) when batch mode was used. My testing showed that not all queries are created equal when it comes to performance. For this reason, I recommend you test all the queries in your environment to determine how each query performs using the new Batch Mode on Rowstore feature. If you happen to find some queries that perform worse using batch mode, then you can either rewrite the queries to perform better or consider disabling batch mode for those problem queries.

Disabling Batch Mode on Rowstore

If you find you have a few queries that don’t benefit from using batch mode, and you don’t want to rewrite them, then you might consider turning off the Batch Mode on Rowstore feature with a query hint.

The DISALLOW_BATCH_MODE hint disables the Batch Mode on Rowstore feature for a given query. The code in Listing 4 shows how I disabled batch mode for the first test query I used in this article.

Listing 4: Using the DISALLOW_BATCH_MODE hint to disable batch mode for a single query

SELECT [Customer Key],
       SUM(Quantity) AS TotalQty,
       AVG(Quantity) AS AvgQty,
       AVG([Unit Price]) AS AvgUnitPrice
FROM Fact.[OrderBig]
WHERE [Customer Key] > 10 and [Customer Key] < 100
GROUP BY [Customer Key]
ORDER BY [Customer Key]
OPTION(USE HINT('DISALLOW_BATCH_MODE'));

When I ran the query in Listing 4 against the WideWorldImportersDW database running under compatibility level 150, the query didn't invoke any batch mode operations. I verified this by reviewing the properties of each operator: they all processed using row mode operations. The value of the DISALLOW_BATCH_MODE hint is that I can disable the batch mode feature for a single query. This means it's possible to be selective about which queries will not consider batch mode when your database is running under compatibility level 150.
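If you want to check the execution mode without clicking through each operator, you can capture the actual plan as XML. The following is a minimal sketch (it is not part of my original test harness and trims the query down to one aggregate): run it in SSMS and look at the Actual Execution Mode property reported in each operator's runtime information.

-- A sketch: return the actual plan so the execution mode can be inspected
SET STATISTICS XML ON;
GO
SELECT [Customer Key],
       SUM(Quantity) AS TotalQty
FROM Fact.[OrderBig]
WHERE [Customer Key] > 10 and [Customer Key] < 100
GROUP BY [Customer Key]
OPTION(USE HINT('DISALLOW_BATCH_MODE'));
GO
SET STATISTICS XML OFF;
-- With the hint in place, each operator should report an Actual Execution Mode
-- of Row; remove the hint and rerun to see which operators switch to Batch.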

Alternatively, you could disable the Batch Mode on Rowstore feature at the database level, as shown in Listing 5.

Listing 5: Disabling Batch Mode at the database level

-- Disable batch mode on rowstore
ALTER DATABASE SCOPED CONFIGURATION SET BATCH_MODE_ON_ROWSTORE = OFF;

Disabling the batch mode feature at the database level still allows queries to take advantage of the other new 15.x features. This might be an excellent option if you want to move to version 15.x of SQL Server while you complete testing of all of your large aggregation queries to see how they are affected by the batch mode feature. Once testing is complete, re-enable batch mode by running the code in Listing 6.

Listing 6: Enabling Batch Mode at the database level

-- Enable batch mode on rowstore
ALTER DATABASE SCOPED CONFIGURATION SET BATCH_MODE_ON_ROWSTORE = ON;

By using the query hint or the database scoped configuration method to disable batch mode, I have control over how this new feature affects the performance of my row mode query operations. It is great that the team at Microsoft provides these different ways to enable and disable the Batch Mode on Rowstore feature, because they give me more flexibility in how I roll the feature out across a database.
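If you want to confirm how the database-scoped switch is currently set before or after toggling it, a quick check like the sketch below works. The BATCH_MODE_ON_ROWSTORE setting appears in sys.database_scoped_configurations on SQL Server 2019; a value of 1 means the feature is on.

USE WideWorldImportersDW;
GO
-- Check the current state of the Batch Mode on Rowstore scoped configuration
SELECT [name], [value]
FROM sys.database_scoped_configurations
WHERE [name] = N'BATCH_MODE_ON_ROWSTORE';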

Which Editions Support Batch Mode?

Before you get too excited about how this feature might help the performance of your large analytic queries, I have to tell you the bad news: Batch Mode on Rowstore is not available in every edition of SQL Server. Like many cool new features that have come out in the past, it is introduced in Enterprise Edition first, and then over time it might become available in other editions. As of the RTM release of SQL Server 2019, the Batch Mode on Rowstore feature is only available in Enterprise Edition (and in Azure SQL Database), as documented here. Developer Edition also supports Batch Mode on Rowstore, but of course it cannot be used for production work, so be careful when performance testing this new feature on the Developer Edition of SQL Server 2019 if you plan to roll your code out to any production environment other than Enterprise. If you want to reduce your CPU footprint using this new feature, then you had better get out your checkbook and upgrade to Enterprise Edition, or just wait until Microsoft rolls this feature out to other editions of SQL Server.

Reduce CPU of Large Analytic Queries Without Changing Code

If you have large analytic queries that perform aggregations, you might find that the new Batch Mode on Rowstore feature improves CPU and Elapsed time without changing any code, provided your query environment meets a few requirements. The first requirement is that your query needs to be running on SQL Server version 15.x (SQL Server 2019) or later. The second requirement is that you need to be running an edition of SQL Server that supports the Batch Mode on Rowstore feature. Additionally, the table being queried needs to have at least 131,072 rows and be stored in a b-tree or heap before batch mode is considered for the table.
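If you want to verify those last two requirements up front, a quick sketch like the one below works against the objects used in this article (sys.databases exposes the compatibility level, and sys.partitions supplies the row count for a heap or clustered index):

-- Confirm the database compatibility level (expect 150 or higher)
SELECT [name], compatibility_level
FROM sys.databases
WHERE [name] = N'WideWorldImportersDW';

-- Confirm the table is large enough for batch mode to be considered
SELECT SUM(p.rows) AS TotalRows
FROM sys.partitions AS p
WHERE p.object_id = OBJECT_ID(N'Fact.OrderBig')
  AND p.index_id IN (0, 1);   -- 0 = heap, 1 = clustered b-tree
-- Batch mode on rowstore is only considered when the table holds at least 131,072 rows.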

I am impressed by how much less CPU and Elapsed time was used for my test aggregation queries. If you have a system that runs lots of aggregate queries, then migrating to SQL Server 2019 might be able to eliminate your CPU bottlenecks and get some of your queries to run faster at the same time.


SQL – Simple Talk


Write the code to display a given plot [closed]

February 23, 2020   BI News and Info
Closed. This question is off-topic. It is not currently accepting answers.


[Target plot image not reproduced]

Write the code to display the exact image shown above.



Recent Questions – Mathematica Stack Exchange


Get Your Scalar UDFs to Run Faster Without Code Changes

February 13, 2020   BI News and Info

Over the years, you have probably experienced or heard that user-defined functions (UDFs) do not scale well as the number of rows processed gets larger and larger. This is too bad, because we have all heard that encapsulating code into modules promotes reuse and is good programming practice. Now the Microsoft SQL Server team has added a new feature to the database engine in Azure SQL Database and SQL Server 2019 that allows UDF performance to scale when processing large recordsets. This new feature is known as T-SQL Scalar UDF Inlining.

T-SQL Scalar UDF Inlining is one of many new performance features introduced in Azure SQL Database and SQL Server 2019. It is part of the Intelligent Query Processing (IQP) feature set. Figure 1, from Intelligent Query Processing in SQL Databases, shows all the IQP features introduced in Azure SQL Database and SQL Server 2019, as well as the features that were originally part of the Adaptive Query Processing feature set included in the older generation of Azure SQL Database and SQL Server 2017.


Figure 1: Intelligent Query Processing

The T-SQL Scalar UDF Inlining feature automatically scales UDF code without requiring any coding changes. All that is needed is for your UDF to run against a database in Azure SQL Database or SQL Server 2019 with the compatibility level set to 150. Let me dig into the details of this new inlining feature a little more.

T-SQL Scalar UDF Inlining

The new T-SQL Scalar UDF Inlining feature will automatically change the way the database engine interprets, costs, and executes T-SQL queries when a scalar UDF is involved. Microsoft incorporated the FROID framework into the database engine to improve the way scalar UDFs are processed. This new framework refactors the imperative scalar UDF code into relational algebraic expressions and incorporates these expressions into the calling query automatically.

By refactoring the scalar UDF code, the database engine can improve the cost-based optimization of the query and perform set-based optimization that allows the UDF code to go parallel if needed. Refactoring of scalar UDFs is done automatically when a database is running under compatibility level 150. Before I dig into the new scalar UDF inlining feature, let me review why scalar UDFs are inherently slow and discuss the differences between imperative and relational equivalent code.

Why Are Scalar UDFs Inherently Slow?

When you run a scalar UDF on a database with a compatibility level below 150, it just doesn't scale well. By scale, I mean it works fine for a few rows but runs slower and slower as the number of rows processed gets larger and larger. Here are some of the reasons why scalar UDFs don't work well with large recordsets.

  • When a T-SQL statement uses a scalar function, the database engine optimizer doesn't look at the code inside the scalar function to determine its cost. This is because scalar operators are not costed, whereas relational operators are. The optimizer treats a scalar function as a black box that uses minimal resources. Because scalar operations are not costed appropriately, the optimizer is notorious for creating very bad plans when scalar functions perform expensive operations.
  • A scalar function is evaluated as a batch of statements where each statement is run sequentially. Because of this, each statement has its own execution plan and is run in isolation from the other statements in the UDF, and therefore can't take advantage of cross-statement optimization.
  • The optimizer will not allow queries that use a scalar function to go parallel. Keep in mind that parallelism may not improve all queries, but when a scalar UDF is used in a query, that query's execution plan will not go parallel.

Imperative and Relational Equivalent Code

Scalar UDFs are a great way to modularize your code and promote reuse, but all too often they contain procedural code. Procedural code might contain imperative constructs such as variable declarations, IF/ELSE structures, and WHILE loops. Imperative code is easy to write and read, which is why it is so widely used when developing application code.

The problem with imperative code is that it is hard to optimize, and therefore query performance suffers when imperative code is executed. The performance of imperative code is fine when a small number of rows is involved, but as the row count grows, the performance starts to suffer. Because of this, you should not use imperative UDFs against larger recordsets if they are executed on a database running with a compatibility level below 150. With the introduction of version 15.x of SQL Server, the scaling problem associated with UDFs has been solved by refactoring imperative code using a new optimization technique known as the FROID framework.

The FROID framework refactors imperative code into a single relational equivalent query. It does this by analyzing the scalar UDF imperative code and then converts blocks of imperative code into relational equivalent algebraic expressions. These relational expressions are then combined into a single T-SQL statement using APPLY operators. Additionally, the FROID framework looks for redundant or unused code and removes it from the final execution plan of the query. By converting the imperative code in a scalar UDF into re-factored relational expressions, the query optimizer can perform set-based operations and use parallelism to improve the scalar UDF performance. To further understand the difference between imperative code and relational equivalent code, let me show you an example.

Listing 1 contains some imperative code. By reviewing this listing, you can see it includes a couple of DECLARE statements and some IF/ELSE logic.

Listing 1: Imperative Code Example

DECLARE @Sex varchar(10) = 'Female';
DECLARE @SexCode int;
IF @Sex = 'Female'
    SET @SexCode = 0
ELSE
    IF @Sex = 'Male'
        SET @SexCode = 1;
    ELSE
        SET @SexCode = 2;
SELECT @SexCode AS SexCode;

I then refactored the code in Listing 1 into a single relationally equivalent SELECT statement in Listing 2, much as the FROID framework might do when compiling a scalar UDF.

Listing 2: Relational Code Example

SELECT B.SexCode FROM (SELECT 'Female' AS Sex) A
OUTER APPLY
  (SELECT CASE WHEN A.Sex = 'Female' THEN 0
               WHEN A.Sex = 'Male' THEN 1
               ELSE 2
          END AS SexCode) AS B;

By looking at these two examples, you can see how easy it is to read the imperative code in Listing 1 and see what is going on, whereas Listing 2, which contains the relational equivalent code, requires a little more analysis to determine exactly what is happening.

Currently, the FROID framework is able to rewrite the following scalar UDF coding constructs into relational algebraic expressions:

  • Variable declarations and assignments using DECLARE or SET statements
  • Multiple variable assignments in a SELECT statement
  • Conditional testing using IF/ELSE logic
  • Single or multiple RETURN statements
  • Nested/recursive function calls in a UDF
  • Relational operations such as EXISTS and ISNULL

The two listings in this section only logically demonstrate how the FROID framework might convert imperative UDF code into relational equivalent code. For more detailed information on FROID, I suggest you read this technical paper.

In order to see FROID optimization in action, let me show you an example that compares the performance of a scalar UDF running with and without FROID optimization.

Comparing Performance of Scalar UDF with and Without FROID Optimization

To test how a scalar UDF performs with and without FROID optimization, I will run a test using the sample WideWorldImportersDW database (download here). In that database, I'll create a scalar UDF called GetRating. The code for this UDF can be found in Listing 3.

Listing 3: Scalar UDF that contains imperative code

CREATE OR ALTER FUNCTION dbo.GetRating(@CityKey int)
RETURNS VARCHAR(13)
AS
BEGIN
   DECLARE @AvgQty DECIMAL(5,2);
   DECLARE @Rating VARCHAR(13);
   SELECT @AvgQty = AVG(CAST(Quantity AS DECIMAL(5,2)))
   FROM Fact.[Order]
   WHERE [City Key] = @CityKey;
   IF @AvgQty / 40 >= 1
      SET @Rating = 'Above Average';
   ELSE
      SET @Rating = 'Below Average';
   RETURN @Rating
END

By reviewing the code in Listing 3, you can see that I am creating the scalar UDF that I will be using for testing. This function calculates a rating for a [City Key] value: the rating returned is either "Above Average" or "Below Average", depending on whether the average order quantity for the city is at least 40. Note that this UDF contains imperative code.
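Before running the full test, you can sanity-check the function against a single key. The one-liner below is just an illustration; the key value 45 is an arbitrary example, not one taken from the article.

SELECT dbo.GetRating(45) AS CityRating;   -- returns 'Above Average' or 'Below Average'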

In order to test how scalar inlining can improve performance I will be running the code in Listing 4.

Listing 4: Code to test performance of scalar UDF

-- Turn on Time Statistics
SET STATISTICS TIME ON;
GO
USE WideWorldImportersDW;
GO
-- Set Compatibility level to 140
ALTER DATABASE WideWorldImportersDW SET COMPATIBILITY_LEVEL = 140;
GO
-- Test 1
SELECT DISTINCT ([City Key]), dbo.GetRating([City Key]) AS CityRating
FROM Dimension.[City]
-- Set Compatibility level to 150
ALTER DATABASE WideWorldImportersDW SET COMPATIBILITY_LEVEL = 150;
GO
-- Test 2
SELECT DISTINCT ([City Key]), dbo.GetRating([City Key]) AS CityRating
FROM Dimension.[City]
GO

The code in Listing 4 runs two tests. The first test (Test 1) calls the scalar UDF dbo.GetRating under compatibility level 140 (SQL Server 2017). For the second test (Test 2), I only changed the compatibility level to 150 (SQL Server 2019) and ran the same query as Test 1 without making any coding changes to the UDF.

When I run Test 1 in Listing 4, I get the execution statistics shown in Figure 2 and the execution plan shown in Figure 3.


Figure 2: Execution Statistics for Test 1

Figure 3: Execution plan when using compatibility level 140 for Test 1

Prior to reviewing the time statistics and execution plan for Test 1, let me run Test 2. The time statistics and execution plan for Test 2 can be found in Figure 4 and Figure 5, respectively.


Figure 4: Execution Statistics for Test 2


Figure 5: Execution plan when using compatibility level 150 using Test 2

Performance Comparison between Test 1 and Test 2

The only change I made between Test 1 and Test 2 was to change the compatibility level from 140 to 150. Let me review how the FROID optimization changed the execution plan and improved the performance when I executed my test using compatibility level 150.

Before running the two tests, I turned on time statistics. Figure 6 compares the time statistics between them.

Figure 6: CPU and Elapsed Time Comparison Between Test 1 and Test 2

As you can see, when I executed the Test 1 SELECT statement in Listing 4 using compatibility level 140, the CPU and elapsed time were each a little over 30 seconds, whereas when I changed the compatibility level to 150 and ran the Test 2 SELECT statement, CPU and elapsed time each came in at just over 1 second. Test 2, which used compatibility level 150 and the FROID framework, ran orders of magnitude faster than Test 1, which ran under compatibility level 140 without FROID optimization, and it achieved this without changing a single line of code in my test scalar UDF. To better understand why the timings were so drastically different between these two executions of the same SELECT statement, let me review the execution plans produced by each test query.

If you look at Figure 3, you will see a simple execution plan for the SELECT statement run under compatibility level 140. This execution plan didn't go parallel and only includes two operators. All the work related to calculating the city rating in the UDF using the data in the Fact.[Order] table is not included in this execution plan. To get the rating for each city, my scalar function had to run multiple times, once for every [City Key] value found in the Dimension.[City] table. You can't see this in the execution plan, but if you monitor the query using an Extended Event, you can verify it. Each time the database engine invoked my UDF in Test 1, a context switch had to occur. The cost of the row-by-row nature of calling my UDF over and over again is what causes the query in Test 1 to run slowly.
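If you want to see those repeated invocations for yourself, a lightweight Extended Events session is one way to do it. The sketch below is my own illustration rather than part of the original test: the module_end event fires once per completed function call, so running the Test 1 query while this session is active should produce one event per [City Key] value.

-- Count UDF invocations with a simple Extended Events session
CREATE EVENT SESSION TrackUdfCalls ON SERVER
ADD EVENT sqlserver.module_end
    (WHERE (sqlserver.database_name = N'WideWorldImportersDW'))
ADD TARGET package0.event_counter;
GO
ALTER EVENT SESSION TrackUdfCalls ON SERVER STATE = START;
-- ...run the Test 1 query from Listing 4, then inspect the event_counter target...
ALTER EVENT SESSION TrackUdfCalls ON SERVER STATE = STOP;
DROP EVENT SESSION TrackUdfCalls ON SERVER;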

If you look at the execution plan in Figure 5, which is for Test 2, you see a very different plan compared to Test 1. The SELECT statement in Test 2 ran under compatibility level 150, which allowed the scalar function to be inlined. By inlining the scalar function, FROID optimization converted my scalar UDF into relational operations, which allowed the UDF logic to be included in the execution plan of the calling SELECT statement. The database engine was then able to calculate the rating value for each [City Key] using a set-based operation and join the rating values to the cities in the Dimension.[City] table using a nested loop inner join. By performing this set-based operation, the Test 2 query runs considerably faster and uses fewer resources than the row-by-row nature of the Test 1 query.
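To make the difference concrete, here is a hand-written rough equivalent of what inlining accomplishes. This is only an illustration of the set-based shape of the work, not the exact rewrite FROID produces: the per-city average is computed once in a correlated subquery and joined back, rather than invoking the UDF once per row.

-- Hand-rolled approximation of the inlined logic (illustration only)
SELECT DISTINCT c.[City Key],
       CASE WHEN a.AvgQty / 40 >= 1 THEN 'Above Average'
            ELSE 'Below Average'
       END AS CityRating
FROM Dimension.[City] AS c
OUTER APPLY (SELECT AVG(CAST(o.Quantity AS DECIMAL(5,2))) AS AvgQty
             FROM Fact.[Order] AS o
             WHERE o.[City Key] = c.[City Key]) AS a;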

Not all Scalar Functions Can be Inlined

Not all scalar functions can be inlined. If a scalar function contains coding constructs that the FROID framework cannot convert to relational algebraic expressions, then your UDF will not be inlined. For instance, if a scalar UDF contains a WHILE loop, the function will not be inlined. To demonstrate this, I'm going to modify my original UDF code so it contains a dummy WHILE loop. My new UDF is called dbo.GetRating_Loop and can be found in Listing 5.

Listing 5: Scalar UDF containing a WHILE loop

CREATE OR ALTER FUNCTION dbo.GetRating_Loop(@CityKey int)
RETURNS VARCHAR(13)
AS
BEGIN
   DECLARE @AvgQty DECIMAL(5,2);
   DECLARE @Rating VARCHAR(13);
   -- Dummy code to support WHILE loop
   DECLARE @I INT = 0;
   WHILE @I < 1
   BEGIN
      SET @I = @I + 1;
   END
   SELECT @AvgQty = AVG(CAST(Quantity AS DECIMAL(5,2)))
   FROM Fact.[Order]
   WHERE [City Key] = @CityKey;
   IF @AvgQty / 40 >= 1
      SET @Rating = 'Above Average';
   ELSE
      SET @Rating = 'Below Average';
   RETURN @Rating
END

By reviewing the code in Listing 5, you can see I added a dummy WHILE loop at the top of my original UDF. When I run this new UDF using the code in Listing 6, I get the execution plan shown in Figure 7.

Listing 6: Code to run dbo.GetRating_Loop

USE WideWorldImportersDW;
GO
-- Set Compatibility level to 150
ALTER DATABASE WideWorldImportersDW SET COMPATIBILITY_LEVEL = 150;
GO
-- Test UDF With WHILE Loop
SELECT DISTINCT ([City Key]),
    dbo.GetRating_Loop([City Key]) AS CityRating
FROM Dimension.[City]
GO


Figure 7: Execution plan created while executing Listing 6

By looking at the execution plan in Figure 7, you can see that my new UDF didn't get inlined. The execution plan for this test looks very similar to the execution plan I got when I ran my original UDF from Listing 3 under database compatibility level 140. This example shows that not all scalar UDFs will be inlined; only those that use functionality supported by the FROID framework will be.
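One way to find out ahead of time which of your scalar UDFs qualify is to query sys.sql_modules, which exposes an is_inlineable column in SQL Server 2019. The following sketch checks the two functions used in this article:

-- Check inlining eligibility for the two test UDFs
SELECT OBJECT_SCHEMA_NAME(object_id) AS SchemaName,
       OBJECT_NAME(object_id) AS FunctionName,
       is_inlineable
FROM sys.sql_modules
WHERE object_id IN (OBJECT_ID(N'dbo.GetRating'), OBJECT_ID(N'dbo.GetRating_Loop'));
-- Expect is_inlineable = 1 for dbo.GetRating and 0 for dbo.GetRating_Loop
-- because of the WHILE loop.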

Disabling Scalar UDF Inlining

With this new version of SQL Server, the design team wanted to make sure you could disable any new feature at the database level or statement level. Therefore, you can use the code in Listing 7 or 8 to disable scalar UDF inlining. Listing 7 shows how to disable scalar UDF inlining at the database level.

Listing 7: Disabling inlining at the database level

ALTER DATABASE SCOPED CONFIGURATION SET TSQL_SCALAR_UDF_INLINING = OFF;

Listing 8 shows how to disable scalar UDF inlining when the scalar UDF is created.

Listing 8: Disabling inlining when defining the UDF

CREATE FUNCTION dbo.MyScalarUDF (@Parm int)
RETURNS INT
WITH INLINE = OFF
...
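For completeness, inlining can also be disabled for a single statement with a query hint, which pairs with the statement-level control mentioned above. The sketch below is my own addition rather than one of the article's tests; the hint name DISABLE_TSQL_SCALAR_UDF_INLINING is the one SQL Server 2019 accepts.

-- Disable scalar UDF inlining for just this query
SELECT DISTINCT ([City Key]), dbo.GetRating([City Key]) AS CityRating
FROM Dimension.[City]
OPTION (USE HINT('DISABLE_TSQL_SCALAR_UDF_INLINING'));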

Make Your Scalar UDFs Run Faster by Using SQL Server Version 15.x

If you want to make your scalar UDFs run faster without making any coding changes, then SQL Server 2019 is for you. With this new version of SQL Server, the FROID framework was added. This framework refactors a scalar UDF into relationally equivalent code that can be placed directly into the calling statement's execution plan. By doing this, a scalar UDF is turned into a set-based operation instead of being called for every candidate row. All it takes to have a scalar UDF refactored is to set your database to compatibility level 150.


SQL – Simple Talk


CRM and customer-centricity are code words for customer engagement and customer experience

January 22, 2020   CRM News and Info

While I labor mightily over the Watchlist submissions (with a few surprises so far — more than last year), I am continuing to keep you informed with weekly guest posts from prominent thinkers in multiple venues. This week, another really insightful one from one of my favorite world-class analysts, thought leaders, and friends: Brian Solis.  

Brian dropped a piece on us a couple weeks ago on voice in general and Einstein Voice in particular. This week, he addresses a fundamental transformation that is going on in the world of business and customers — engagement and experience dominating and how to address it with Salesforce 360 Truth as a lynchpin of his discussion. 

So, Mr. Solis, take it away!


“I’m going to let you in on a little secret, we’re really not customer centric, we just say it because everyone else does,” said no credible executive ever.

What does customer-centricity even mean today? Is it a state? Is it an intention? Is it plain old marketing? Putting the customer at the center of any business has been aspirational to date, an evolving pursuit of IT, sales/commerce, marketing, and service. But along the way, existing organizational models, lines of business, and systems struggled to collaborate. Add to that, the continuous clash between legacy and emerging technologies made the physical acts of placing the customer at the center of the business onerous and even elusive.  

Heading into a new decade, however, businesses are now gaining access to incredible arrays of innovative technologies to accelerate growth and transformation. CRM is still one of the most important keys to customer-centricity. And still, even with strides in innovation, CRM faces some of its most relentless hurdles. It’s incredibly difficult to be customer-centric if you’re not actually centered around the customer. Business and data silos, incomplete or duplicate customer data records, incongruent touchpoints, disconnected apps, and incompatible systems and services — and, to be honest, a lack of unified leadership driving toward strategic integration — remain as common issues that require prioritization and escalation.

The Quest for a 360 View of Customers

Let’s remember that CRM stands for customer relationship management. This is about optimizing relationships and using advancing technology to get closer to customers. The question is, what kind of relationship do you want to have with your customers? The answer starts with knowing the customer, defining the experience you wish to deliver and then building an architecture for meaningful engagement.

Organizations need a 360 view of customers to get to their truth. And this “360 view,” too, has been a long sought after “Holy Grail” in its own right. But to get there requires more than technology and data.

Enterprise vendors, which help business customers modernize operational models, as well as infrastructure, and executive mindsets to organize around integrated customer data and insights, will ultimately create a new blueprint for customer-centricity.

In my research over the years, I’ve learned that the pursuit of customer-centricity was a leading driver for digital transformation. And at the heart of all this is customer experience (CX). In fact, year over year, I found that a majority of digital transformation initiatives were focused on CX. Companies that prioritized customer experience investments evolved much faster across what I defined as “The Six Stages of Digital Transformation.”

The New Blueprint for 360 Customer Views and Unified Customer Experiences

As a digital analyst, I try to keep up with all of the technology advancements from leading vendors pushing forward cloud, AI, voice, real-time and predictive analytics, sentiment analysis, data integration, 5G, cloud migration, et al. Every day, it seems that CRM, BPM, and ERP platforms are only becoming more and more awesome. But also, as someone who interviews business executives as part of ongoing research efforts, I also hear the very human struggles trying to also keep up with everything while leading difficult modernization and change management initiatives within.

This is why vendors need to apply customer-centricity to their go-to-market initiatives. Perhaps it’s a shift in enterprise sales and marketing from B2B to B2B2C or maybe just P2P (people to people.) We need to complement conversations about innovation and capabilities and platforms with empathy that takes into account very real challenges within the enterprise and also the very real challenges their customers face traversing today’s customer journeys.

Maybe it’s time we humanize the CRM and CX lexicon. (Paul note: Sorry to interrupt. I am behind this 1,000%.)

When we talk about CRM, what we’re really talking about is technology that allows enterprise customers to facilitate more meaningful, productive, and loyalty-building customer engagement. When we talk about customer-centricity, we really need to emphasize efforts around unified customer experience. Actually, we need to call attention more directly to the “customer’s experience.” Adding that “‘s” changes the dynamic of planning and strategy toward customer-centric systems thinking, holistic systems of engagement, and operational innovation.

This is important because customer experience is defined by the sum of all engagements a customer has with your business. Every touchpoint counts. Anything that isn’t helping customers toward their standard of experience may, in fact, be taking away from the experience you’re aiming to deliver.

When it comes to CRM or any enterprise technology for that matter, engagement and experiences must be humanized for executive decision-makers as well as how platforms help them deliver the integrated experiences customers seek.

Rapid Advancements in CRM Technology and User-Centered Narratives Will Help Enterprise Customers Adapt and Innovate

Leading enterprise vendors such as Zoho, Pega, SAP, Oracle, Microsoft, Adobe, and more, are starting to make strides on this front. In my work with each vendor, I’ll start to explore this topic of CRM to CX evolution in more detail over time.

One vendor that recently caught my attention was Salesforce through its recent announcement of Customer 360 Truth. Originally announced at Dreamforce 19, I didn’t get a chance to really process the news and its approach until now. It was this Salesforce hosted Q&A with Patrick Stokes, executive vice president of platform shared services at Salesforce, that really impressed me. In it, he explains how Salesforce views the integration of sales, service, marketing, commerce, and communities as well as third-party and legacy systems as a single source of “truth.” But it was his transparency in explaining the opportunity and the challenges facing the harnessing of customer truth, that hit home.

Stokes explained, “Seeking a single source of truth for each customer isn’t a new idea, but it’s been difficult to achieve. Buying a car, or even selling a CRM solution as we do, can involve hundreds of touchpoints across a variety of systems that need to be tracked and managed. Ultimately, you want to have a holistic graph of past and present customer engagement so you can better serve the customer and predict future needs.”

It’s not just Salesforce solving for this. The difficulties in achieving a 360 customer view explained earlier in this article is what every company faces and what every vendor, and also enterprise digital transformations, are striving to answer.

I also appreciated Stokes’ expanded perspective beyond technology and technical hurdles that made the conversation much more relatable.

"You can't up-level customer experiences if you can't overcome organizational barriers that prevent deeper integration," he continued. "When you have a more customer-centric culture, those barriers tend to break down."

Customer 360 Truth Aims to Unite Disparate Data for a 360 Customer View


We are finally arriving at the moment when CRM technology and data can power integrated and value-added customer engagement and experiences. According to Salesforce research, the demand is certainly there, “70 percent of customers say they expect connected experiences in which their preferences are known across touchpoints.” That number will only go higher as customers start to realize the fruits of next-generation CRM solutions.

Customer 360 Truth introduces a new set of data and identity capabilities. It promises the ability for companies to connect, authenticate, and govern customer data and identity across the Salesforce platform to build “a single source of truth across all of their customer relationships.”

Customer 360 Truth as a service is divided into four parts (per Salesforce’s press release):

Customer 360 Data Manager: Delivers the ability to access, connect and resolve a customer’s data across Salesforce and other systems, using a canonical data model and a universal Salesforce ID that represents each customer.

Salesforce Identity for Customers: Removes friction from the login experience and enables a single, authenticated and secure relationship between a customer and all of a company’s websites, e-commerce stores, mobile apps, and connected products.

Customer 360 Audiences: Builds unified customer profiles across known data such as email addresses and first-party IDs and unknown data such as website visits and device IDs. It then creates customer segments and marketing engagement journeys from those profiles and delivers AI-powered insights, like lifetime value and likelihood to churn.

Privacy and Data Governance: Enables companies to collect and respect customer data use and privacy preferences, as well as apply data classification labels to all data in Salesforce.

Additionally, Customer 360 Truth is powered by the Cloud Information Model (CIM), enabled by Mulesoft’s open-source modeling technology. CIM is an open-source data model that standardizes data interoperability across cloud apps.

As Stokes described it, “…When you are connecting data sources to Customer 360, you are mapping data from the source to a canonical data model, CIM. The Customer 360 source of truth is not only the identity of who your customer is, but it is the corresponding keys for where that data lives within Salesforce.”

Salesforce admins can then establish connections between data sources to prepare, match, reconcile and update a customer profile.

“The reconciled profile across apps enable employees to pull up relevant data at the time of need from any connected system, such as when a service agent may need to pull a list of past purchases from an order system in order to better assist in solving a problem,” Stokes explained.

What’s important is that this approach connects data but leaves the information in the systems that manage it. Add to this, Salesforce’s recently upgraded Einstein platform (now with voice), businesses have access to AI-powered recommendations and insights to take immediate action.

Other approaches are still siloed and require incredible patchwork. And other promising solutions are creating separate AI-powered layers to process information from disparate systems (even sourcing from multiple vendors) in real-time and then feeding results back to business users across platforms.

With Salesforce, a map is created on top of the data in each cloud to know where it is, how to retrieve it in time of need and then use it to personalize a variety of experiences at the moment.

What I appreciate about Customer 360 Truth goes beyond its capabilities. It’s also how Salesforce is communicating to customers that it’s going after root problems to deliver what their customers ultimately want, more personalized, integrated experiences. In doing so, Salesforce is also walking the walk by delivering a more human, relatable customer experience for businesses, which helps them deliver better experiences to their customers.

The evolution of CRM, after all of these years, is finally making customer-centricity a reality. The ability to place the customer (and their data) at the center of “their experience” and empower businesses to deliver more personal engagement, integrated journeys, and better outcomes is what adds up to memorable and sought-after customer experiences. And this is just the beginning of a new genre of experience innovation (and market narratives). When human-centered innovation and market conversations win, the customer and the customer’s customer also win.



ZDNet | crm RSS
