Business Intelligence Info

Tag Archives: Without

Check if polynomial is in factored form, without factoring

March 16, 2021   BI News and Info


1 Answer


Recent Questions – Mathematica Stack Exchange

Read More

“Without Data, Nothing” — Building Apps That Last With Data

January 20, 2021   Sisense

Every company is becoming a data company. Data-Powered Apps delves into how product teams are infusing insights into applications and services to build products that will delight users and stand the test of time.

In philosophy, a “sine qua non” is something without which the phenomenon under consideration cannot exist. For modern apps, that “something” is data and analytics. No matter what a company does, a brilliant app concept alone is insufficient. You have to deftly integrate data and analytics into your product to succeed. 

Whatever your audience, your users are getting more and more used to seeing data and analytics infused throughout apps, products, and services of all kinds. We’ll dig into ways companies can use data and analytics to succeed in the modern app marketplace and look at some now-extinct players that might have thrived with the right data in their platforms.


Sentiment analysis in customer messages

Yik Yak was an anonymous chat app that looked promising initially but failed because of problems that could have been resolved with data and analytics. What made Yik Yak popular was the exotic feature that enabled members to chat anonymously with others in the same geographic vicinity. Unfortunately, that feature was also the cause of the app’s demise: Yik Yak was capitalized as a startup with about $75 million and grew to a valuation of $400 million before uncontrolled cyberbullying ruined its reputation. After Yik Yak’s name was spoiled as a result of abusive chat, the company could not sell ads on its platform, meaning it could no longer monetize its innovative concept.

How could Yik Yak have used data and analytics to avert disaster? Luma Health showed how message data can be analyzed for mood and meaning by using AI/ML methods on a data lake of chat messages. Yik Yak could have tagged message content with the originating IP address and then quickly blocked messages from that IP after abusive language was detected. This hindsight can now become foresight for other enterprising companies.
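
As a rough illustration of that idea, the sketch below flags messages whose abuse score crosses a threshold and blocks the originating IP from posting again. Everything here is hypothetical: the scoring function is a stand-in for a real ML model, and the word list, threshold, and IP handling are invented for the example rather than anything Yik Yak or Luma Health actually built.

ABUSIVE_TERMS = {"idiot", "loser"}   # toy word list standing in for a real ML classifier

def score_toxicity(text: str) -> float:
    # Placeholder scorer: fraction of words found in the toy block list.
    words = text.lower().split()
    if not words:
        return 0.0
    return sum(w in ABUSIVE_TERMS for w in words) / len(words)

blocked_ips = set()

def handle_message(ip: str, text: str, threshold: float = 0.3) -> bool:
    # Returns True if the message may be posted, False if it is dropped.
    if ip in blocked_ips:
        return False
    if score_toxicity(text) >= threshold:
        blocked_ips.add(ip)   # block all future messages from this IP
        return False
    return True

print(handle_message("203.0.113.7", "you are an idiot loser"))   # False: flagged, IP blocked
print(handle_message("203.0.113.7", "hello neighborhood"))       # False: IP already blocked
print(handle_message("203.0.113.9", "anyone at the library?"))   # True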

The benefits of leveraging collective data

Color Labs was another successful startup whose failure could have been avoided with the right analytics. Although the company’s investment in AI and convolutional neural networks (CNNs) may have been significant, in retrospect, an innovative use of these technologies on the right data could have given it a better shot at survival. The basic service model behind Color Labs’ app was that users would share images and then see images from other users who were posting pictures in the same vicinity (a media-based counterpart to Yik Yak’s concept). The app failed in part for a reason that new dating apps often fail: needing to go live with a million users on day one. Color Labs’ users joined up only to find little or nothing posted in their vicinity, giving them little incentive to post and share and leaving them feeling alone in an empty room. The company ultimately folded.

How could data insights have solved this problem for Color Labs? Leveraging the right collective datasets with CNNs could have identified images tagged to a geographical place already freely shared on the internet. Those images could have been used to populate the app and get the user engagement ball rolling. Using CNNs in that way is expensive but justifiable if it means keeping the company afloat long enough to reach profitability. New dating app startups actually use a similar trick: purchasing a database of names and pictures and then filling in the blanks to create an artificial set of matches to temporarily satisfy new subscribers’ cravings for instant gratification (one such database is marketed as “50,000 profiles”). The gamble is that new subscribers will remain hopeful long enough for enough real subscribers to join up and validate their existence. Color Labs could have benefited from existing, freely shared media at a much lower cost in terms of ethical compromise as well.

Forecasting and modeling business costs

Shyp was an ingenious service app that failed for a number of reasons, but one of those reasons could have been fixed easily with data insights. The basic innovation of Shyp was to package an item for you and then ship it using a standard service like FedEx. The company’s shortcut, which turned out to be a business model error, was to charge a fixed rate of $5 for packaging. Whether the item to ship was a mountain bike or a keychain, the flat rate of $5 for packaging was a hole in Shyp’s hull, one that sank the company in short order.

Shyp’s mistake could have been resolved cleverly by using the wealth of existing data about object volume, weight, fragility, temperature sensitivity, and other factors to create an intelligent packaging price calculator. Such a database could even have included local variations in the price of packing materials such as foam peanuts, tape, boxes, and bubble wrap, and have presented the calculation at time of payment. Flat fees are attractive and can be used as loss leaders when trying to gather new customers or differentiate oneself in a crowded market, but if you aren’t Amazon, then you need to square the circle somehow. A data-driven algorithm for shipping prices (or whatever your service is) doesn’t just make good business sense — it can even be a selling point!
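
A data-driven packaging price could be as simple as combining an item’s measured attributes with per-unit material costs and a regional adjustment. The sketch below is a minimal illustration of that calculation; the rates, surcharge, and regional multipliers are invented numbers, not Shyp’s actual cost data.

from dataclasses import dataclass

# Assumed per-unit material rates (illustrative numbers only).
RATE_PER_CUBIC_INCH = 0.002   # box plus void fill, in dollars
RATE_PER_POUND = 0.15         # handling cost that scales with weight
FRAGILE_SURCHARGE = 2.50      # extra bubble wrap / padding
REGIONAL_MULTIPLIER = {"NYC": 1.2, "SF": 1.25, "default": 1.0}

@dataclass
class Item:
    length_in: float
    width_in: float
    height_in: float
    weight_lb: float
    fragile: bool = False

def packaging_price(item: Item, region: str = "default", base_fee: float = 1.00) -> float:
    volume = item.length_in * item.width_in * item.height_in
    price = base_fee + volume * RATE_PER_CUBIC_INCH + item.weight_lb * RATE_PER_POUND
    if item.fragile:
        price += FRAGILE_SURCHARGE
    price *= REGIONAL_MULTIPLIER.get(region, REGIONAL_MULTIPLIER["default"])
    return round(price, 2)

keychain = Item(2, 1, 0.5, 0.1)
mountain_bike = Item(56, 30, 8, 32, fragile=True)
print(packaging_price(keychain, "NYC"))        # 1.22 -- a fraction of a flat $5 fee
print(packaging_price(mountain_bike, "NYC"))   # 42.22 -- far more than $5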

Social vs. personal networks: Sentiment analysis in data

“Path” fashioned itself as an anti-Facebook: According to its founder, former Facebook developer Dave Morin, Path was a “personal network,” not a social network, where people could share “the story of their lives with their closest friends and family.” And for a moment it almost looked like Path might allow people to do just that. The startup boasted a whopping $500 million valuation with steadfast investor confidence that lasted all the way until it faded into obscurity, ultimately being purchased by a Korean tech firm and then removed from app stores. Path intended to enforce its mission to provide personal networks of true friends by limiting each user’s friend count to 50. The friend limit was perceived as detrimental to Path’s success at a time when Facebook users often had thousands of friends, but this alone did not account for the apparent irrelevance of the novel app. What was the missing piece? Data analysis.

Path could have sustained itself as a stalwart alternative to Facebook for users disenchanted with the endless mill of likes and heart emojis. The key would have lain in sentiment analysis of user message content: By using natural language processing methods to distinguish close friends from distant acquaintances, Path could have offered its users an innovative platform for knowing who their “real friends” were.
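
A crude version of that signal could combine how often two people message with the average sentiment of those messages. The sketch below assumes a placeholder sentiment() function standing in for a real NLP model, plus invented message data; it only illustrates the shape of such a ranking, not Path’s actual approach.

from collections import defaultdict

def sentiment(text: str) -> float:
    # Placeholder for an NLP sentiment model: returns a score in [-1, 1].
    positive = {"love", "thanks", "great", "miss"}
    negative = {"hate", "annoying"}
    words = text.lower().split()
    score = sum(w in positive for w in words) - sum(w in negative for w in words)
    return max(-1.0, min(1.0, score / max(len(words), 1) * 5))

def rank_close_friends(messages, top_n=50):
    # messages: iterable of (contact, text) pairs. Higher score = closer friend.
    totals = defaultdict(lambda: [0, 0.0])      # contact -> [message count, sentiment sum]
    for contact, text in messages:
        totals[contact][0] += 1
        totals[contact][1] += sentiment(text)
    scored = {c: count * (1 + s_sum / count) for c, (count, s_sum) in totals.items()}
    return sorted(scored, key=scored.get, reverse=True)[:top_n]

history = [("mom", "love you, thanks for dinner"), ("mom", "miss you"),
           ("coworker", "meeting moved"), ("spammer", "great deal!!!")]
print(rank_close_friends(history, top_n=2))   # 'mom' ranks first among the contacts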

Data analytics and the competitive future

We have seen that startup apps based on ingenious concepts and with funding levels over $100 million failed for a variety of reasons that could have been ameliorated or averted with savvy, transformative uses of data, analytics, and insights. One of the original e-hailing taxi companies failed for no other reason than the founding designers’ lack of awareness that Yellow cab drivers in New York at that time did not carry mobile phones!

Data is not only useful for calculating and forecasting the future; it’s a must-have for your app. Every company with a novel concept to unleash into the market must face the reality, as these companies did, that a good idea alone won’t guarantee an app’s success. Innovative use of data in concert with that idea is something that no modern app can survive without.

Jack Cieslak is a 10-year veteran of the tech world. He’s written for Amazon, CB Insights, and others, on topics ranging from ecommerce and VC investments to crazy product launches and top-secret startup projects.


Blog – Sisense

Read More

New York City Council votes to prohibit businesses from using facial recognition without public notice

December 11, 2020   Big Data


New York City Council today passed a privacy law for commercial establishments that prohibits retailers and other businesses from using facial recognition or other biometric tracking without public notice. If signed into law by NYC Mayor Bill de Blasio, the bill would also prohibit businesses from selling biometric data to third parties.

In the wake of the Black Lives Matter movement, an increasing number of cities and states have expressed concerns about facial recognition technology and its applications. Oakland and San Francisco, California, and Somerville, Massachusetts, are among the metros where law enforcement is prohibited from using facial recognition. In Illinois, companies must get consent before collecting biometric information of any kind, including face images. New York recently passed a moratorium on the use of biometric identification in schools until 2022, and lawmakers in Massachusetts have advanced a suspension of government use of any biometric surveillance system within the commonwealth. More recently, Portland, Maine, approved a ballot initiative banning the use of facial recognition by police and city agencies.

The New York City Council bill, which was sponsored by Bronx Councilman Ritchie Torres, doesn’t outright ban the use of facial recognition technologies by businesses. However, it does impose restrictions on the ways brick-and-mortar locations like retailers, which might use facial recognition to prevent theft or personalize certain services, can deploy it. Businesses that fail to post a warning about collecting biometric data must pay $500. Businesses found selling data will face fines of $5,000.

In this aspect, the bill falls short of Portland, Oregon’s recently-passed ordinance regarding biometric data collection, which bans all private use of biometric data in places of “public accommodation,” including stores, banks, restaurants, public transit stations, homeless shelters, doctors’ offices, rental properties, retirement homes, and a variety of other types of businesses (excepting workplaces). It’s scheduled to take effect starting January 1, 2021.

“I commend the City Council for protecting New Yorkers from facial recognition and other biometric tracking. No one should have to risk being profiled by a racist algorithm just for buying milk at the neighborhood store,” Fox Cahn, executive director of the Surveillance Technology Oversight Project, said. “While this is just a first step towards comprehensively banning biometric surveillance, it’s a crucial one. We shouldn’t allow giant companies to sell our biometric data simply because we want to buy necessities. Far too many companies use biometric surveillance systems to profile customers of color, even though they are biased. If companies don’t comply with the new law, we have a simple message: ‘we’ll see you in court.’”

Numerous studies and VentureBeat’s own analyses of public benchmark data have shown facial recognition algorithms are susceptible to bias. One issue is that the data sets used to train the algorithms skew white and male. IBM found that 81% of people in the three face-image collections most widely cited in academic studies have lighter-colored skin. Academics have found that photographic technology and techniques can also favor lighter skin, including everything from sepia-tinged film to low-contrast digital cameras.

“Given the current lack of regulation and oversight of biometric identifier information, we must do all we can as a city to protect New Yorkers’ privacy and information,” said Councilman Andrew Cohen, who chairs the Committee on Consumer Affairs. Crain’s New York reports that the committee voted unanimously in favor of advancing Torres’ bill to the full council hearing earlier this afternoon.

The algorithms are often misused in the field, as well, which tends to amplify their underlying biases. A report from Georgetown Law’s Center on Privacy and Technology details how police feed facial recognition software flawed data, including composite sketches and pictures of celebrities who share physical features with suspects. The New York Police Department and others reportedly edit photos with blur effects and 3D modelers to make them more conducive to algorithmic face searches. And police in Minnesota have been using biometric technology from vendors including Cognitec since 2018, despite a denial issued that year, according to the Star Tribune.

Amazon, IBM, and Microsoft have self-imposed moratoriums on the sale of facial recognition systems. But some vendors, like Rank One Computing and Los Angeles-based TrueFace, are aiming to fill the gap with customers, including the City of Detroit and the U.S. Air Force.


Big Data – VentureBeat

Read More

Get a Microsoft Dynamics 365/CRM Estimate without Engaging a Salesperson

November 30, 2020   Microsoft Dynamics CRM

There’s never been a better time to investigate the versatility, power, and advanced features of Microsoft Dynamics 365. Now more than ever, businesses are looking for tools to make their workers, both onsite and remote, more efficient, accurate, productive, and secure. Reading posts by our expert members on the CRM Software Blog will answer a lot of your questions about Dynamics 365’s great features, along with suggestions about how it can help your business.

But one thing you’ll never see in a post is a price quote. That’s because there are so many variables, even for businesses within the same industry. Naturally, a competent partner will want to sit down with you and discuss your particular business needs and goals.

Perhaps you’re not ready to sit down with a salesperson just yet. Maybe you’d like an estimate to see if Dynamics 365/CRM will fit into your budget. Good news! We have a tool for that: The CRM Software Blog’s Quick Quote Tool.

The Quick Quote Tool

We developed The CRM Software Blog’s Quick Quote tool years ago.  As the industry has evolved and technology progressed, we’ve adjusted the tool to keep pace. The Quick Quote tool now provides a working estimate for Microsoft Dynamics 365, Microsoft’s solution that integrates ERP and CRM. You’ll get an estimated price for the total cost of software, implementation, training, and ongoing expenses. The Quick Quote tool is a hassle-free way to determine if Microsoft Dynamics 365 is a fit for your business and your budget.

The Quick Quote tool takes only a few minutes and is completely free. Find it on the right side of any page of The CRM Software Blog. Click on the orange bar labeled “Request Instant Quote Dynamics 365/CRM”. Fill out the Microsoft Dynamics 365 Quick Quote request form to let us know whether you’re interested in the Business Edition or Enterprise Edition, what level you want (Basic, Basic Plus, or Advanced), how many users you anticipate, and your contact information. It’s as easy as that. Click submit, and within a couple of minutes, a personalized proposal will appear in your inbox.

The proposal will contain a detailed budgetary estimate, as well as information about setup and training, client testimonial videos, and a dozen or so links to helpful information so you can learn all about how Microsoft Dynamics 365 can be used to the greatest advantage at your company. Your contact information will be forwarded to just one of our CRM Software Blog members who will be glad to answer any questions you have and work with you on the installation if you choose. Of course, both the estimate and the partner referral are non-binding. They are provided for your convenience.

So, why not try the Quick Quote tool? It’s fast, and it’s free. Get your Microsoft Dynamics Quick Quote estimate now!

By CRM Software Blog Writer, www.crmsoftwareblog.com


CRM Software Blog | Dynamics 365

Read More

Why AI can’t move forward without diversity, equity, and inclusion

November 12, 2020   Big Data


The need to pursue racial justice is more urgent than ever, especially in the technology industry. The far-reaching scope and power of machine learning (ML) and artificial intelligence (AI) means that any gender and racial bias at the source is multiplied to the nth power in businesses and out in the world. The impact those technology biases have on society as a whole can’t be underestimated.

When decision-makers in tech companies simply don’t reflect the diversity of the general population, it profoundly affects how AI/ML products are conceived, developed, and implemented. Evolve, presented by VentureBeat on December 8th, is a 90-minute event exploring bias, racism, and the lack of diversity across AI product development and management, and why these issues can’t be ignored.

“A lot has been happening in 2020, from working remotely to the Black Lives Matter movement, and that has made everybody realize that diversity, equity, and inclusion is much more important than ever,” says Huma Abidi, senior director of AI software products and engineering at Intel – and one of the speakers at Evolve. “Organizations are engaging in discussions around flexible working, social justice, equity, privilege, and the importance of DEI.”

Abidi, in the workforce for over two decades, has long grappled with the issue of gender diversity, and was often the only woman in the room at meetings. Even though the lack of women in tech remains an issue, companies have made an effort to address gender parity and have made some progress there.

In 2015, Intel allocated $300 million toward an initiative to increase diversity and inclusion in its ranks, from hiring to onboarding to retention. The goal the company set in 2020 is to increase the number of women in technical roles to 40% by 2030 and to double the number of women and underrepresented minorities in senior leadership.

“Diversity is not only the right thing to do, but it’s also better for business,” Abidi says. “Studies from researchers, including McKinsey, have shown data that makes it increasingly clear that companies with more diverse workforces perform better financially.”

The proliferation of cases in which alarming bias is showing up in AI products and solutions has also made it clear that DEI is a broader and more immediate issue than had previously been assumed.

“AI is pervasive in our daily lives, being used for everything from recruiting decisions to credit decisions, health care risk predictions to policing, and even judicial sentencing,” says Abidi. “If the data or the algorithms used in these cases have underlying biases, then the results could be disastrous, especially for those who are at the receiving end of the decision.”

We’re hearing about cases more and more often, beyond the famous Apple credit check fiasco, and the fact that facial recognition still struggles with dark skin. There’s Amazon’s secret recruiting tool that avoided hiring qualified women because of the data set that was used to train the model. It showed that men were more qualified, because historically that’s been the case for that company.

An algorithm used by hospitals was shown to prioritize the care of healthier white patients over sicker Black patients who needed more attention. In Oakland, an AI-powered software piloted to predict areas of high crime turned out to be actually tracking areas with high minority populations, regardless of the crime rate.

“Despite great intentions to build technology that works for all and serves all, if the group that’s responsible for creating the technology itself is homogenous, then it will likely only work for that particular specific group,” Abidi says. “Companies need to understand that if your AI solution is not implemented in a responsible, ethical manner, then the results can cause, at best, embarrassment, but it could also lead to potentially having legal consequences, if you’re not doing it the right way.”

This can be addressed with regulation and with the inclusion of AI ethics principles in research and development, covering responsible AI, fairness, accountability, transparency, and explainability, she says.

“DEI is well established — it makes business sense and it’s the right thing to do,” she says. “But if you don’t have it as a core value in your organization, that’s a huge problem. That needs to be addressed.”

Especially when it comes to AI, companies have to think about who their target population is and whether their data is representative of that population. The people who first notice biases are the users from the specific minority community that the algorithm is ignoring or targeting; maintaining a diverse AI team can therefore help mitigate unwanted AI biases.
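
One concrete, if simplified, way to ask that question is to compare the demographic mix of a training set against the demographic mix of the intended user population. The sketch below uses invented group labels and numbers purely for illustration; a real audit involves far more care about how groups are defined and measured.

# Compare a training set's demographic shares against the target population's.
# The group labels, counts, and tolerance here are invented for illustration.

target_population = {"group_a": 0.60, "group_b": 0.25, "group_c": 0.15}
training_counts   = {"group_a": 9_000, "group_b": 800, "group_c": 200}

def representation_gaps(counts, target, tolerance=0.05):
    total = sum(counts.values())
    gaps = {}
    for group, expected in target.items():
        actual = counts.get(group, 0) / total
        if abs(actual - expected) > tolerance:
            gaps[group] = {"expected": expected, "actual": round(actual, 3)}
    return gaps

print(representation_gaps(training_counts, target_population))
# Flags all three groups: group_a is overrepresented, group_b and group_c underrepresented.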

And then, she says, companies need to ask if they have the right interdisciplinary team, including personnel such as AI ethicists and experts in ethics and compliance, law, policy, and corporate responsibility. Finally, you have to have a measurable, actionable de-biasing strategy that contains a portfolio of technical, operational, and organizational actions to establish a workplace where these metrics and processes are transparent.

“Add DEI to your core mission statement, and make it measurable and actionable — is your solution in line with the mission of ethics and DEI?” she says. “Because AI has the power to change the world, the potential to bring enormous benefit, to uplift humanity if done correctly. Having DEI is one of the key components to make it happen.”


The 90-minute Evolve event is divided into two distinct sessions on December 8th:

  1. The Why, How & What of DE&I in AI
  2. From ‘Say’ to ‘Do’: Unpacking real world case studies & how to overcome real world issues of achieving DE&I in AI

Register for free right here.


Big Data – VentureBeat

Read More

Today, Unca Donald may get the Money Bin opened up without the Beagle Boys.

July 12, 2020   Humor

“Huey, Dewey, and Louie are the nephews of Donald Duck and the great-nephews of Scrooge McDuck. Like their uncles, the boys are anthropomorphic white ducks with yellow-orange bills and feet. Scrooge is an elderly Scottish anthropomorphic Pekin duck with a yellow-orange bill, legs, and feet. He typically wears a red or blue frock coat, top hat, pince-nez glasses, and spats. He is portrayed in animations as speaking with a Scottish accent. The son of Hortense and Quackmore Duck, this Donald is very similar to his main universe counterpart — a hot-tempered duck with a good heart and a quacky voice (which has somehow not stopped him from earning a major in public speaking.)”

Cartoon characters without pants might have to drop drawers as SCOTUS finally rules on Trump’s financial records today. There will be an over/under and perhaps even a split difference, but regardless there will be strong, powerful, or even strongly or powerfully Trump distractions. Bigly matters will at least be referenced since niece Mary Trump reminds us of the massive tax fraud committed by the Trump family. Yet would we be surprised if they eviscerated Congressional oversight and let Trump off scot free. 716-671.


The Supreme Court has announced that tomorrow will be the last day of opinions for the term. That’s when we’ll have rulings on Trump financial records.

— Kyle Griffin (@kylegriffin1) July 8, 2020




Tomorrow we find out if SCOTUS will require that Trump’s taxes be turned over to Congress and/or Manhattan DA Cy Vance. It’s sort of a big moment for our country.

— Joyce Alene (@JoyceWhiteVance) July 8, 2020

On related matters:

The U.S. Supreme Court all but guaranteed House Democrats won’t get pre-election access to confidential materials from Special Counsel Robert Mueller’s Russia investigation, agreeing to hear a Trump administration appeal likely to extend a legal fight into next year.

The justices said they will review a lower court order that would require the Justice Department to turn over redacted parts of Mueller’s 448-page report, along with underlying grand jury transcripts and exhibits. The Supreme Court will consider the case in the nine-month term that starts in October.

The House Judiciary Committee sought the records as part of its impeachment inquiry last year. President Donald Trump was impeached on different grounds by the Democratic-controlled House before being acquitted by the Republican-controlled Senate.

Democrats say the materials would help them determine whether Trump committed impeachable offenses by obstructing the FBI’s and Mueller’s investigations into Russian interference in the 2016 election. Mueller found 10 instances of possible obstruction of justice but stopped short of determining whether Trump had engaged in obstruction.

Trump and Attorney General William Barr “are continuing to try to run out the clock on any and all accountability,” House Judiciary Chairman Jerrold Nadler of New York said in an emailed statement. “While I am confident their legal arguments will fail, it is now all the more important for the American people to hold the president accountable at the ballot box in November.”

www.bloomberg.com/…


“This is a book that’s been written from pain and is designed to hurt.” https://t.co/wR4vJw6gGb

— Maggie Haberman (@maggieNYT) July 9, 2020


Watch the Kushner episode of Dirty Money on Netflix. It’s called “Slumlord Millionaire,” and it’ll make your blood boil. That family is trash.

— (((Josh Malina))) (@JoshMalina) July 9, 2020


“It’s patently clear that some of the people who’re involved in current politics…are borrowing some of the tactics of the 1920s and 1930s.”

Yale History professor Timothy Snyder says some of today’s politicians have learned propaganda techniques from twentieth century fascists. pic.twitter.com/YXMjePdlHc

— Channel 4 News (@Channel4News) October 9, 2019


Nothing SCOTUS decides tomorrow — either way — will result in the public release of the tax information Trump has been so desperate to conceal from voters in time for the 2020 election. So the thing to look for is whether SCOTUS upholds or rejects Trump’s “I am the law” boast.

— Laurence Tribe (@tribelaw) July 8, 2020


moranbetterDemocrats

Read More

Google’s federated analytics method could analyze end user data without invading privacy

May 28, 2020   Big Data

In a blog post today, Google laid out the concept of federated analytics, a practice of applying data science methods to the analysis of raw data that’s stored locally on edge devices. As the tech giant explains, it works by running local computations over a device’s data and making only the aggregated results — not the data from the particular device — available to authorized engineers.

While federated analytics is closely related to federated learning, an AI technique that trains an algorithm across multiple devices holding local samples, it only supports basic data science needs. It’s “federated learning lite” — federated analytics enables companies to analyze user behaviors in a privacy-preserving and secure way, which could lead to better products. Google for its part uses federated techniques to power Gboard’s word suggestions and Android Messages’ Smart Reply feature.

“The first exploration into federated analytics was in support of federated learning: how can engineers measure the quality of federated learning models against real-world data when that data is not available in a data center? The answer was to re-use the federated learning infrastructure but without the learning part,” Google research scientist Daniel Ramage and software engineer Stefano Mazzocchi said in a statement. “In federated learning, the model definition can include not only the loss function that is to be optimized, but also code to compute metrics that indicate the quality of the model’s predictions. We could use this code to directly evaluate model quality on phones’ data.”

As an example, in a user study, Gboard engineers measured the overall quality of word prediction models against raw typing data held on phones. Participating phones downloaded a candidate model, locally computed a metric of how well the model’s predictions matched words that were actually typed, and then uploaded the metric without any adjustment to the model itself or any change to the Gboard typing experience. By averaging the metrics uploaded by many phones, engineers learned a population-level summary of model performance.
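
In spirit, that evaluation loop is straightforward: each device scores the candidate model on its own data, only the resulting number leaves the device, and the server averages those numbers. The sketch below is a toy illustration of that flow with a placeholder model and accuracy metric; it does not use any real Gboard or federated learning infrastructure.

# Toy federated evaluation: each phone computes a metric locally and uploads
# only that number; the server sees per-device metrics, never the raw data.

def local_accuracy(model, typed_words):
    # Runs on the device: fraction of words the candidate model predicts correctly.
    correct = sum(1 for w in typed_words if model(w[:-1]) == w)
    return correct / max(len(typed_words), 1)

def candidate_model(prefix):
    # Placeholder "model": naively guesses the prefix plus 's'.
    return prefix + "s"

# Each inner list stands for typing data that never leaves one phone.
devices = [["cats", "dogs", "hello"], ["runs", "walks"], ["thanks", "hello"]]

per_device_metrics = [local_accuracy(candidate_model, words) for words in devices]
population_quality = sum(per_device_metrics) / len(per_device_metrics)

print(per_device_metrics)                 # [0.666..., 1.0, 0.5]
print(round(population_quality, 3))       # 0.722 -- population-level summary of model quality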


In a separate study, Gboard engineers wanted to discover words commonly typed by users and add them to dictionaries for spell-checking and typing suggestions. They trained a character-level recurrent neural network on phones, using only the words typed on these phones that weren’t already in the global dictionary. No typed words ever left the phones, but the resulting model could then be used in the datacenter to generate samples of frequently typed character sequences — i.e., the new words.

Beyond model evaluation, Google uses federated analytics to support the Now Playing feature on its Pixel phones, which shows what song might be playing nearby. Under the hood, Now Playing taps an on-device database of song fingerprints to identify music near a phone without the need for an active network connection.

When it recognizes a song, Now Playing records the track name into the on-device history, and when the phone is idle and charging while connected to Wi-Fi, Google’s federated learning and analytics server sometimes invites it to join a “round” of computation with hundreds of phones. Each phone in the round computes the recognition rate for the songs in its Now Playing history and uses a secure aggregation protocol to encrypt the results. The encrypted rates are sent to the federated analytics server, which doesn’t have the keys to decrypt them individually; when combined with the encrypted counts from the other phones in the round, the final tally of all song counts can be decrypted by the server.
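
The essence of that secure aggregation step can be illustrated with additive masking: each pair of phones agrees on a random mask, one adds it and the other subtracts it, so every individual upload looks random while the masks cancel out in the sum. The sketch below is a deliberately simplified illustration of the idea, ignoring key exchange, dropouts, and the other machinery of Google’s actual protocol.

import random

# Simplified additive-masking secure aggregation over a prime modulus:
# the server only ever sees masked values, yet their sum equals the true sum.

MOD = 2**31 - 1  # large prime modulus so masked values wrap around

def mask_counts(true_counts):
    # true_counts: list of per-phone song counts. Returns the masked uploads.
    n = len(true_counts)
    masked = [c % MOD for c in true_counts]
    for i in range(n):
        for j in range(i + 1, n):
            pairwise = random.randrange(MOD)       # shared secret between phone i and phone j
            masked[i] = (masked[i] + pairwise) % MOD
            masked[j] = (masked[j] - pairwise) % MOD
    return masked

phone_counts = [3, 0, 7, 1]            # how often each phone heard the song
uploads = mask_counts(phone_counts)

print(uploads)                          # looks random; reveals nothing about any one phone
print(sum(uploads) % MOD)               # 11 == sum(phone_counts); the pairwise masks cancel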

The result enables Google’s engineers to improve the song database without any phone revealing which songs were heard, for example, by making sure the database contains truly popular songs. Google claims that in its first improvement iteration, federated analytics resulted in a 5% increase in overall song recognition across all Pixel phones globally.

“We are also developing techniques for answering even more ambiguous questions on decentralized datasets like ‘what patterns in the data are difficult for my model to recognize?’ by training federated generative models. And we’re exploring ways to apply user-level differentially private model training to further ensure that these models do not encode information unique to any one user,” wrote Ramage and Mazzocchi. “It’s still early days for the federated analytics approach and more progress is needed to answer many common data science questions with good accuracy … [B]ut federated analytics enables us to think about data science differently, with decentralized data and privacy-preserving aggregation in a central role.”


Big Data – VentureBeat

Read More

Improve Row Count Estimates for Table Variables without Changing Code

May 27, 2020   BI News and Info

You probably have heard that table variables work fine when a table variable contains only a small number of records, but that when it contains a large number of records it doesn’t perform all that well. A solution for this problem has been implemented in version 15.x of SQL Server (Azure SQL Database and SQL Server 2019) with the rollout of a feature called Table Variable Deferred Compilation.

Table Variable Deferred Compilation is one of many new performance features introduced in Azure SQL Database and SQL Server 2019. This new feature is part of Intelligent Query Processing (IQP). See Figure 1 for a diagram that shows all the IQP features introduced in Azure SQL Database and SQL Server 2019, as well as the features that originally were part of the Adaptive Query Processing feature set included in the older generation of Azure SQL Database and SQL Server 2017.


Figure 1: Intelligent Query Processing

In releases of SQL Server prior to 15.x, the database engine made a wrong assumption about the number of rows in a table variable. Because of this bad assumption, the execution plan that was generated didn’t work too well when a table variable contained lots of rows. With the introduction of SQL Server 2019, the database engine now defers the compilation of a query that uses a table variable until the table variable is used the first time. By doing this, the database engine can more accurately identify cardinality estimates for table variables. By having more accurate cardinality numbers, queries that have large numbers of rows in a table variable will perform better. Those queries will need to be running against a database with a database compatibility level set to 150 (version 15.x of SQL Server) to take advantage of this feature. To better understand how deferred compilation improves the performance of table variables that contain a large number of rows, I’ll run through an example, but first, I’ll discuss what the problem is with table variables in versions of SQL Server prior to version 15.x.

What is the Problem with Table Variables?

A table variable is defined using a DECLARE statement in a batch or stored procedure. Table variables don’t have distribution statistics and don’t trigger recompiles. Because of this, SQL Server is not able to estimate the number of rows in a table variable like it does for normal tables. When the optimizer compiles code that contains a table variable, prior to 15.x, it assumes the table is empty. This assumption causes the optimizer to compile the query using an expected row count of 1 for the cardinality estimate for a table variable. Because the optimizer only thinks a table variable contains a single row, it picks operators for the execution plan that work well with a small set of records, like the NESTED LOOPS operator for a JOIN operation. The operators that work well on a small number of records do not always scale well when a table variable contains a large number of rows. Microsoft documented this problem and recommends that temp tables might be a better choice than a table variable that contains more than 100 rows. Additionally, Microsoft recommends that if you are joining a table variable with other tables, you consider using the query hint RECOMPILE to make sure that table variables get the correct cardinality estimates. Without the proper cardinality estimates, queries with large table variables are known to perform poorly.

With the introduction of version 15.x and the Table Variable Deferred Compilation feature, the optimizer delays the compilation of a query that uses a table variable until just before it is used the first time. This allows the optimizer to know the correct cardinality estimates of a table variable. When the optimizer has an accurate cardinality estimate, it has a good chance at picking execution plan operators that perform well for the number of rows in a table variable. In order for the optimizer to defer the compilation, the database must have its compatibility level set to 150. To show how deferred compilation of table variables work, I’ll show an example of this new feature in action.

Table Variable Deferred Compilation in Action

To understand how deferred compilation works, I will run through some sample code that uses a table variable in a JOIN operation. That sample code can be found in Listing 1.

Listing 1: Sample Test Code that uses Table Variable in JOIN operation

USE WideWorldImportersDW;
GO
DECLARE @MyCities TABLE ([City Key] int not null);
INSERT INTO @MyCities
  SELECT [City Key] FROM Dimension.City;
SELECT O.[Order Key], TV.[City Key]
FROM Fact.[Order] as O INNER JOIN @MyCities as TV
ON O.[City Key] = TV.[City Key];

As you can see, this code uses the WideWorldImportersDW database, which can be downloaded here. In this script, I first declare my table variable @MyCities and then insert 116,295 rows from the Dimension.City table into the variable. That variable is then used in an INNER JOIN operation with the Fact.[Order] table.

To show the deferred compilation in action, I will need to run the code in Listing 1 twice. The first execution will be run against the WideWorldImportersDW database using compatibility level 140, and the second execution will run against this same database using compatibility level 150. The script I will use to compare how table variables work under the two different compatibility levels can be found in Listing 2.

Listing 2: Comparison Test Script

USE WideWorldImportersDW;
GO
-- Turn on time statistics
SET STATISTICS TIME ON;
GO
---------------------------------------------------
-- Test #1 - Using SQL Server 2017 compatibility --
---------------------------------------------------
ALTER DATABASE WideWorldImportersDW SET COMPATIBILITY_LEVEL = 140;
GO
DECLARE @MyCities TABLE ([City Key] int not null);
INSERT INTO @MyCities
  SELECT [City Key] FROM Dimension.City;
SELECT O.[Order Key], TV.[City Key]
FROM Fact.[Order] as O JOIN @MyCities as TV
ON O.[City Key] = TV.[City Key]
---------------------------------------------------
-- Test #2 - Using SQL Server 2019 compatibility --
---------------------------------------------------
ALTER DATABASE WideWorldImportersDW SET COMPATIBILITY_LEVEL = 150;
GO
USE WideWorldImportersDW;
GO
DECLARE @MyCities TABLE ([City Key] int not null);
INSERT INTO @MyCities
  SELECT [City Key] FROM Dimension.City
SELECT O.[Order Key], TV.[City Key]
FROM Fact.[Order] as O JOIN @MyCities as TV
ON O.[City Key] = TV.[City Key];
GO

When I run the code in Listing 2, I run it from a query window in SQL Server Management Studio (SSMS), with the Include Actual Execution Plan query option turned on. The execution plans I get when I run the Test #1 and Test #2 queries can be found in Figure 2 and Figure 3, respectively.


Figure 2: Execution Plan for Test #1 code in Listing 2, using compatibility level 140


Figure 3: Execution Plan for Test #2 code in Listing 2, using compatibility level 150

If you compare the execution plans in Figures 2 and 3, you will see they are a little different. When compatibility mode 140 was used, my test query used a NESTED LOOPS operation to join the table variable to the Fact.[Order] table, whereas when using compatibility mode 150, the optimizer picked a HASH MATCH operator for the join operation. This occurred because the Test #1 query used an estimated row count of 1 for the table variable @MyCities, whereas the Test #2 query was able to use the deferred table variable compilation feature, which allowed the optimizer to use an estimated row count of 116,295 for the table variable. These estimated row count numbers can be verified by looking at the Table Scan operator properties for each execution plan, which are shown in Figure 4 and Figure 5, respectively.


Figure 4: Table Scan properties when Test #1 query ran under compatibility level 140


Figure 5: Table Scan properties when Test #2 query ran under compatibility level 150

By reviewing the table scan properties, you can see that the optimizer used the correct estimated row count when compatibility level 150 was used, whereas under compatibility level 140 it estimated a row count of 1. Also note that my query that ran under compatibility level 150 used BATCH mode for the TABLE SCAN operation, whereas the compatibility mode 140 query ran using ROW mode. You may be asking yourself now: how much faster does my test code run under compatibility level 150 than under the older compatibility level 140?

Comparing Performance between Compatibility Mode 140 and 150

In order to compare the performance of running my test query under both compatibility levels, I executed the script in Listing 1 ten different times under each of the two compatibility levels. I then calculated the average CPU and elapsed time for the two different compatibility levels and finally graphed the average performance numbers in Figure 6.


Figure 6: Performance Comparison between Compatibility Mode 140 and 150.

When the test query was run under compatibility mode 150, it used a fraction of the CPU it used under compatibility level 140, and its Elapsed Time was 4.6 times faster than when using compatibility level 140. This is a significant performance improvement. But since batch mode processing was used for the compatibility level 150 test, I can’t assume all of this improvement was associated with only the Deferred Table Variable Compilation feature.

In order to remove the batch mode from my performance test, I’m going to run my test query under compatibility mode 150 one more time. But this time my test will run with a query hint to disable the batch mode feature. The script I will use for this additional test can be found in Listing 3.

USE WideWorldImportersDW;
GO
DECLARE @MyCities TABLE ([City Key] int not null);
INSERT INTO @MyCities
  SELECT [City Key] FROM Dimension.City
SELECT O.[Order Key], TV.[City Key]
FROM Fact.[Order] as O JOIN @MyCities as TV
ON O.[City Key] = TV.[City Key]
OPTION(USE HINT('DISALLOW_BATCH_MODE'));
GO 10

Listing 3: Test #2 query with Batch Mode disabled

The graph in Figure 7 shows the new performance comparison results using deferred compilation and row mode features when my test ran under compatibility level 150.


Figure 7: Table Variable Deferred Compilation Comparison with Batch Mode disabled

With the Batch Mode feature disabled, CPU time went up significantly from my previous test when batch mode was enabled, but the Elapsed Time was only slightly different. Deferred compilation seems to provide a significant performance improvement by delaying the compilation of a query until the table variable is used the first time. I have to wonder if the deferred compilation feature will improve the cardinality estimate issue caused by parameter sniffing with a parameterized query.

Does Deferred Compilation Help with Parameter Sniffing?

Parameter sniffing has been known to cause performance issues when a compiled execution plan is executed multiple times using different parameter values. But does the deferred table variable compilation feature in 15.x solve this parameter sniffing issue? To determine whether or not it does, let me create a stored procedure named GetOrders to test this out. That stored procedure’s CREATE statement can be found in Listing 4.

Listing 4: Code to test out parameter sniffing

USE WideWorldImportersDW;
GO
CREATE OR ALTER PROC GetOrders(@CityKey int)
AS
DECLARE @MyCities TABLE ([City Key] int not null);
INSERT INTO @MyCities
  SELECT [City Key] FROM Dimension.City
  WHERE [City Key] < @CityKey;
SELECT *
FROM Fact.[Order] as O INNER JOIN @MyCities as TV
ON O.[City Key] = TV.[City Key]
GO

The number of rows returned by the stored procedure in Listing 4 is controlled by the value passed in the parameter @CityKey. To test if the deferred compilation feature solves the parameter sniffing issue, I will run the code in Listing 5.

Listing 5: Code to see if deferred compilation resolves parameter sniffing issue

USE WideWorldImportersDW;
GO
SET STATISTICS IO ON;
DBCC FREEPROCCACHE;
-- First Test
EXEC GetOrders @CityKey = 10;
-- Second Test
EXEC GetOrders @CityKey = 231412;

The code in Listing 5 first runs the test stored procedure using a value of 10 for the parameter. The second execution uses the value 231412 for the parameter. These two different parameters cause the stored procedure to process drastically different numbers of rows. After I run the code in Listing 5, I will explore the execution plan for each execution of the stored procedure. I will look at the properties of the TABLE SCAN operation to see what the optimizer thinks are the estimated and actual row counts for the table variable in each execution. The table scan properties for each execution can be found in Figure 8 and Figure 9, respectively.


Figure 8: Table Scan Statistics for the first execution of the test stored procedure


Figure 9: Table Scan Statistics for the second execution of the test stored procedure

Both executions got the same estimated row count but considerably different actual row counts. This means that the deferred table compilation feature of version 15.x doesn’t resolve the parameter sniffing problem of a stored procedure.

What Editions Supports the Deferred Compilations for Table Variables?

Like many cool new features that have come out with past releases of SQL Server, new features are often first introduced in Enterprise edition only, and then over time they might become available in other editions. You will be happy to know that the Deferred Compilation for Table Variables feature doesn’t follow this typical pattern. As of the RTM release of SQL Server 2019, the deferred compilation feature is available in all editions of SQL Server, as documented here.

Improve Performance of Code using Table Variables without Changing Any Code

TSQL code that contains a table variable has been known not to perform well when the variable contains lots of rows. This is because the code that declares the table variable is compiled before the table has been populated with any rows of data. That has all changed when TSQL code is executed in SQL Server 2019 or Azure SQL DB against a database running under compatibility level 150. When using a database set to compatibility level 150, the optimizer defers the compilation of code using a table variable until the first time the table variable is used in a query. By deferring the compilation, SQL Server can obtain a more accurate estimate of the number of rows in the table variable. When the optimizer has better cardinality estimates for a table variable, it can pick more appropriate operators for the execution plan, which leads to better performance. Therefore, if you have found code where table variables don’t scale well when they contain a lot of rows, then version 15.x of SQL Server might help. By running TSQL code under compatibility level 150, you can improve the performance of code using table variables without changing any code.


SQL – Simple Talk

Read More

Reduce CPU of Large Analytic Queries Without Changing Code

March 27, 2020   BI News and Info

When Microsoft came out with columnstore in SQL Server 2012, it introduced a new way to process data called Batch Mode. Batch mode processes a group of rows together as a batch, instead of processing the data row by row. By processing data in batches, SQL Server uses less CPU than with row by row processing. To take advantage of batch mode, a query had to reference a table that contained a columnstore index. If your query only involved tables that contain data in rowstores, then your query would not use batch mode. That has now changed. With the introduction of version 15.x of SQL Server, aka SQL Server 2019, Microsoft introduced a new feature called Batch Mode on Rowstore.

Batch Mode on Rowstore is one of many new features introduced in Azure SQL Database and SQL Server 2019 to help speed up rowstore queries that don’t involve a columnstore. The new Batch Mode on Rowstore feature can improve the performance of large analytic queries that scan many rows, where those queries aggregate, sort, or group the selected rows. Microsoft included this new batch mode feature in Intelligent Query Processing (IQP). See Figure 1 for a diagram from Microsoft’s documentation that shows all the IQP features introduced in Azure SQL Database and SQL Server 2019. It also shows the features that originally were part of Adaptive Query Processing, included in the older generation of Azure SQL Database and SQL Server 2017.


Figure 1: Intelligent Query Processing

Batch Mode on Rowstore can help speed up your big data analytic queries but might not kick in for smaller OLTP queries (more on this later). Batch mode has been around for a while and supports columnstore operators, but it wasn’t until SQL Server version 15.x that batch mode worked on Rowstores without performing a hack. Before seeing the new Batch Mode on Rowstore feature in action, let me first explain how batch mode processing works.

How Batch Mode Processing Works

When the database engine processes a Transact-SQL statement, the underlying data is processed by one or more operators. These operators can process the data using two different modes: row or batch. At a high level, row mode can be thought of as processing rows of data one row at a time, whereas batch mode processes multiple rows of data together in a batch. Processing batches of rows at a time, rather than row by row, can reduce CPU usage.

When batch mode is used for rowstore data, the rows of data are scanned and loaded into a vector storage structure, known as a batch. Each batch is a 64K internal storage structure. This storage structure can contain between 64 and 900 rows of data, depending on the number of columns involved in the query. Each column used by the query is stored in a continuous column vector of fixed size elements, where the qualifying rows vector indicates which rows are still logically part of the batch (see Figure 2 which came from a Microsoft Research paper).

Rows of data can be processed very efficiently when an operation uses batch mode, as compared to row mode processing. For instance, when a batch mode filter operation needs to qualify rows that meet a given column filter criteria, all that is needed is to scan the vector that contains the filtered column and mark the row appropriately in the qualifying rows vector, based on whether or not the column value meets the filter criteria.


Figure 2: A row batch is stored column-wise and contains one vector for each column plus a bit vector indicating qualifying rows
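
To make that concrete, here is a small Python sketch (a toy model, not SQL Server’s internals) of a column-organized batch with a qualifying-rows bit vector. A filter touches only the one column vector it needs and flips bits; rows whose bit is cleared are logically excluded without any data being moved.

# Toy model of a row batch stored column-wise, with a qualifying-rows bit vector.
# A filter scans just one column vector and updates the bits; rows whose bit is 0
# are logically excluded from the batch without moving any data.

class RowBatch:
    def __init__(self, columns):
        self.columns = columns                       # dict: column name -> list of values
        n = len(next(iter(columns.values())))
        self.qualifying = [1] * n                    # 1 = row is still logically part of the batch

    def filter(self, column, predicate):
        vector = self.columns[column]
        for i, bit in enumerate(self.qualifying):
            if bit and not predicate(vector[i]):
                self.qualifying[i] = 0               # mark the row as disqualified

    def qualifying_rows(self):
        return [i for i, bit in enumerate(self.qualifying) if bit]

batch = RowBatch({
    "Customer Key": [5, 12, 42, 99, 150],
    "Quantity":     [10, 3, 7, 1, 25],
})
batch.filter("Customer Key", lambda v: 10 < v < 100)   # keep rows where 10 < [Customer Key] < 100
print(batch.qualifying_rows())                          # [1, 2, 3]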

SQL Server executes fewer instructions per row when using batch mode over row mode. By reducing the number of instructions when using batch mode, queries typically use less CPU than row mode queries. Therefore, if a system is CPU bound, then batch mode might help reduce the environment’s CPU footprint.

In a given execution plan, SQL Server might use both batch and row mode operators, because not all operators can process data in batch mode. When mixed-mode operations are needed, SQL Server needs to transition between batch mode and row mode processing. This transition comes at a cost. Therefore, SQL Server tries to minimize the number of transitions to help optimize the processing of mixed-mode execution plans.

For the engine to consider batch mode for a rowstore, the database compatibility level must be set to 150. With the compatibility level set to 150, the database engine performs a few heuristic checks to make sure the query qualifies to use batch mode. One of the checks is to make sure the rowstore contains a significant number of rows. Currently, the magic number appears to be 131,072. Dmitry Pilugin wrote an excellent post on this magic number. I also verified that this is still the magic number for the RTM release of SQL Server 2019. That means that batch mode doesn’t kick in for smaller tables (fewer than 131,072 rows), even if the database is set to compatibility mode 150. Another heuristic check verifies that the rowstore is using either a b-tree or heap for its storage structure. Batch mode doesn’t kick in if the table is an in-memory table. The cost of the plan is also considered. If the database optimizer finds a cheaper plan that doesn’t use Batch Mode on Rowstore, then the cheaper plan is used.

To see how this new batch mode feature works on a rowstore, I set up a test that ran a couple of different aggregate queries against the WideWorldImportersDW database.

Batch Mode on Rowstore In Action

This section demonstrates running a simple test aggregate query to summarize a couple of columns of a table that uses heap storage. The example runs the test aggregate query twice. The first execution uses compatibility level 140, so the query must use row mode operators to process the test query. The second execution runs under compatibility mode 150 to demonstrate how batch mode improves the query processing for the same test query.

After running the test query, I’ll explain how the graphical execution plans show the different operators used between the two test query executions. I’ll also compare the CPU and Elapsed time used between the two queries to identify the performance improvement using batch mode processing versus row mode processing. Before showing my testing results, I’ll first explain how I set up my testing environment.

Setting up Testing Environment

I used the WideWorldImportersDW database as a starting point for my test data. To follow along, you can download the database backup for this DB here. I restored the database to an instance of SQL Server 2019 RTM running on my laptop. Since the Fact.[Order] table in this database isn’t that big, I ran the code in Listing 1 to create a bigger fact table named Fact.OrderBig. The test query aggregates data using this newly created fact table.

Listing 1: Code to create the test table Fact.OrderBig

USE WideWorldImportersDW;
GO
CREATE TABLE Fact.[OrderBig](
  [Order Key] [bigint],
  [City Key] [int] NOT NULL,
  [Customer Key] [int] NOT NULL,
  [Stock Item Key] [int] NOT NULL,
  [Order Date Key] [date] NOT NULL,
  [Picked Date Key] [date] NULL,
  [Salesperson Key] [int] NOT NULL,
  [Picker Key] [int] NULL,
  [WWI Order ID] [int] NOT NULL,
  [WWI Backorder ID] [int] NULL,
  [Description] [nvarchar](100) NOT NULL,
  [Package] [nvarchar](50) NOT NULL,
  [Quantity] [int] NOT NULL,
  [Unit Price] [decimal](18, 2) NOT NULL,
  [Tax Rate] [decimal](18, 3) NOT NULL,
  [Total Excluding Tax] [decimal](18, 2) NOT NULL,
  [Tax Amount] [decimal](18, 2) NOT NULL,
  [Total Including Tax] [decimal](18, 2) NOT NULL,
  [Lineage Key] [int] NOT NULL);
GO
INSERT INTO Fact.OrderBig
   SELECT * FROM Fact.[Order];
GO 100

The code in Listing 1 creates the Fact.OrderBig table, which is 100 times the size of the original Fact.[Order] table and contains 23,141,200 rows.
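As a quick sanity check (a sketch, not part of the original article), you can confirm the row count right after the load:

-- Confirm the Fact.OrderBig row count after the 100 INSERT executions
SELECT COUNT_BIG(*) AS TotalRows
FROM Fact.OrderBig;   -- expect 23,141,200 rows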

Comparison Test Script

To do a comparison test between batch mode and row mode, I ran two different test queries found in Listing 2.

Listing 2: Test script

USE WideWorldImportersDW;
GO
-- Turn on time statistics
SET STATISTICS IO, TIME ON;
-- Clean buffers so cold start performed
DBCC DROPCLEANBUFFERS
GO
-- Prepare Database Compatibility level for Test #1
ALTER DATABASE WideWorldImportersDW SET COMPATIBILITY_LEVEL = 140;
GO
-- Test #1
SELECT [Customer Key],
       SUM(Quantity) AS TotalQty,
       AVG(Quantity) AS AvgQty,
       AVG([Unit Price]) AS AvgUnitPrice
FROM Fact.[OrderBig]
WHERE [Customer Key] > 10 and [Customer Key] < 100
GROUP BY [Customer Key]
ORDER BY [Customer Key];
GO
-- Clean buffers so cold start performed
DBCC DROPCLEANBUFFERS
GO
-- Prepare Database Compatibility level for Test #2
ALTER DATABASE WideWorldImportersDW SET COMPATIBILITY_LEVEL = 150;
GO
-- Test #2
SELECT [Customer Key],
       SUM(Quantity) AS TotalQty,
       AVG(Quantity) AS AvgQty,
       AVG([Unit Price]) AS AvgUnitPrice
FROM Fact.[OrderBig]
WHERE [Customer Key] > 10 and [Customer Key] < 100
GROUP BY [Customer Key]
ORDER BY [Customer Key];
GO

The code in Listing 2 executes two different tests, collects some performance statistics, and cleans the data buffer cache between each test. Both tests run the same simple aggregate query against the Fact.OrderBig table. Test #1 runs the aggregate SELECT statement using compatibility level 140, whereas Test #2 runs the same aggregate SELECT statement using compatibility level 150. By setting the compatibility level to 140, Test #1 uses row mode processing, whereas Test #2, running under compatibility level 150, allows batch mode to be considered for the test query. Additionally, I turned on the TIME statistics so I could measure performance (CPU and Elapsed time) for each test. By doing this, I can validate the performance note shown in Figure 3, which was found in this Microsoft documentation.


Figure 3: Documentation Note on Performance

When I ran my test script in Listing 2, I executed it from a SQL Server Management Studio (SSMS) query window. In that query window, I enabled the Include Actual Execution Plan option so that I could compare the execution plans created for both of my tests. Let me review the execution artifacts created when I ran my test script in Listing 2.

Review Execution Artifacts

When I ran my test script, I collected CPU and Elapsed Time statistics as well as the actual execution plans for each execution of my test aggregate query. In this section, I’ll review the different execution artifacts to compare the differences between row mode and batch mode processing.

The CPU and Elapsed time statistics, as well as the actual execution plan for my first test query, which ran under compatibility level 140, can be found in Figures 4 and 5, respectively.


Figure 4: CPU and Elapsed Time Statistics for Test #1


Figure 5: Actual Execution Plan under Compatibility Level 140 for Query 1

Figures 6 and 7 below show the time statistics and the actual execution plan when I ran the test query under compatibility level 150.

Figure 6: Execution Statistics for Test #2


Figure 7: Execution Plan for Test #2

The first thing to note is that the plan that ran under compatibility level 150 (Figure 7) is more streamlined than the one that ran under compatibility level 140 (Figure 5). From just looking at the execution plan for the second test query, I can't tell whether the query (which ran under compatibility level 150) used batch mode. To find out, you must right-click on the SELECT icon in the execution plan for the Test #2 query (Figure 7) and then select the Properties item from the context menu. Figure 8 shows the properties of this query.


Figure 8: Properties for Compatibility Level 150 Query (Test #2)

Notice that the property BatchModeOnRowstoreUsed is True. This property is a new showplan attribute that Microsoft added in SSMS version 18. When this property is true, it means that some of the operators used in processing Test #2 performed batch mode operations on the rowstore Fact.OrderBig table.
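If you would rather not open each plan by hand, a query like the following (a hedged sketch, not from the original article) searches the plan cache for plans whose showplan XML carries the BatchModeOnRowstoreUsed="true" attribute:

-- Find cached plans that report BatchModeOnRowstoreUsed="true"
SELECT TOP (20) st.text AS query_text, qp.query_plan
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp
WHERE CONVERT(nvarchar(max), qp.query_plan)
      LIKE N'%BatchModeOnRowstoreUsed="true"%';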

To review which operators used Batch Mode on Rowstore, you must review the properties of each operator. Figure 9 has some added annotations to the execution plan that shows which operators used batch mode processing and which ones used row mode processing.


Figure 9: Execution Plan for Batch Mode query with Operator property annotations

If you look at the Table Scan (Heap) operator, you can see that the Fact.OrderBig table is a RowStore by reviewing the Storage Property. You can also see that this operation used batch mode by looking at the Actual Execution Mode property. All the other operators ran in batch mode, except the Parallelism operator, which used row mode.

The test table (Fact.OrderBig) contains 23,141,200 rows, and the test query referenced 3 different columns. The query didn't need all those rows because it was filtered to include only the rows where the [Customer Key] value was greater than 10 and less than 100. To determine the number of batches the query created, look at the properties of the table scan operator in the execution plan, which are shown in Figure 10.


Figure 10: Number of batches used for Test #2.

The Actual Number of Batches property in Figure 10 shows that the table scan operator of the Test #2 query created 3,587 batches. To determine the average number of rows in each batch, divide the Actual Number of Rows by the Actual Number of Batches. Using this formula, I got, on average, 899.02 rows per batch.
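As a minimal sketch of that arithmetic (the batch count comes from Figure 10; the row count below is a hypothetical placeholder standing in for whatever Actual Number of Rows reports in your run):

-- Rows per batch = Actual Number of Rows / Actual Number of Batches
DECLARE @ActualNumberOfRows bigint = 3225000;   -- hypothetical placeholder value
DECLARE @ActualNumberOfBatches int = 3587;      -- value reported in Figure 10
SELECT CAST(@ActualNumberOfRows AS decimal(18, 2))
       / @ActualNumberOfBatches AS AvgRowsPerBatch;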

The cost estimate for each of the queries is the same, 50%. Therefore, to measure performance between batch mode and row mode, I’ll have to look at the TIME statistics.

Comparing Performance of Batch Mode and Row Mode

To compare performance between running batch mode and row mode queries, I ran my test script in Listing 2 ten different times. I then averaged the CPU and Elapsed times between my two different tests and then graphed the results in the chart found in Figure 11.


Figure 11: CPU and Elapsed time Comparison between Row Mode and Batch Mode

The chart in Figure 11 shows that the row mode test query used a little more than 30% more CPU than the batch mode test query. Both the batch and row mode queries ran for about the same elapsed time. Just as the documentation note (Figure 3) suggested, this first test showed that considerable CPU improvement can be gained when a simple aggregate query uses batch mode processing. But not all queries are created equal when it comes to performance improvements using batch mode versus row mode.

Not All Queries are Created Equal When It Comes to Performance

The previous test showed a 30% improvement in CPU but little improvement in Elapsed Time. The resource (CPU and Elapsed Time) improvements of batch mode operations over row mode depend on the query. Here is another contrived test that shows some drastic improvements in Elapsed Time using the new Batch Mode on Rowstore feature. The test script I used for my second performance test can be found in Listing 3.

Listing 3: Stock Item Key Query Test Script

-- Turn on time statistics
SET STATISTICS IO, TIME ON;
-- Clean buffers so cold start performed
DBCC DROPCLEANBUFFERS
GO
-- Prepare Database Compatibility level for Test #1
ALTER DATABASE WideWorldImportersDW SET COMPATIBILITY_LEVEL = 140;
GO
SELECT [Stock Item Key],[City Key],[Order Date Key],[Salesperson Key],
    AVG(Quantity) OVER(PARTITION BY [Stock Item Key]) AS StockAvgQty,
    AVG(Quantity) OVER(PARTITION BY [Stock Item Key],[City Key])
        AS StockCityAvgQty,
    AVG(Quantity) OVER(PARTITION BY [Stock Item Key],[City Key],
        [Order Date Key]) AS StockCityDateAvgQty,
    AVG(Quantity) OVER(PARTITION BY [Stock Item Key],[City Key],
        [Order Date Key],[Salesperson Key])
        AS StockCityDateSalespersonAvgQty
FROM Fact.OrderBig
WHERE [Customer Key] > 10 and [Customer Key] < 100
-- Clean buffers so cold start performed
DBCC DROPCLEANBUFFERS
GO
-- Prepare Database Compatibility level for Test #2
ALTER DATABASE WideWorldImportersDW SET COMPATIBILITY_LEVEL = 150;
GO
SELECT [Stock Item Key],[City Key],[Order Date Key],[Salesperson Key],
    AVG(Quantity) OVER(PARTITION BY [Stock Item Key]) AS StockAvgQty,
    AVG(Quantity) OVER(PARTITION BY [Stock Item Key],[City Key])
        AS StockCityAvgQty,
    AVG(Quantity) OVER(PARTITION BY [Stock Item Key],[City Key],
        [Order Date Key]) AS StockCityDateAvgQty,
    AVG(Quantity) OVER(PARTITION BY [Stock Item Key],[City Key],
        [Order Date Key],[Salesperson Key])
        AS StockCityDateSalespersonAvgQty
FROM Fact.OrderBig
WHERE [Customer Key] > 10 and [Customer Key] < 100

In Listing 3, I used the OVER clause to create four different aggregations, where each aggregation has a different PARTITION BY specification. To gather the performance statistics for the Listing 3 queries, I ran this script ten different times. Figure 12 shows the CPU and Elapsed Time numbers graphically.


Figure 12: CPU and Elapsed Time comparison for Window Function Query test

As you can see, by creating the different aggregations in Listing 3, I once again saw a big performance improvement in CPU (around 72%). This time, I also got a big improvement in Elapsed Time (a little more than 45%) when batch mode was used. My testing showed that not all queries are created equal when it comes to performance. For this reason, I recommend you test the queries in your environment to determine how each one performs using the new Batch Mode on Rowstore feature. If you happen to find some queries that perform worse using batch mode, you can either rewrite them to perform better or consider disabling batch mode for those problem queries.

Disabling Batch Mode on Row Store

If you find you have a few queries that don’t benefit from using batch mode, and you don’t want to rewrite them, then you might consider turning off the Batch Mode on Rowstore feature with a query hint.

Using the DISALLOW_BATCH_MODE hint, you can disable the Batch Mode on Rowstore feature for a given query. The code in Listing 4 shows how I disabled batch mode for the first test query used in this article.

Listing 4: Using the DISALLOW_BATCH_MODE hint to disable batch mode for a single query

SELECT [Customer Key],
       SUM(Quantity) AS TotalQty,
       AVG(Quantity) AS AvgQty,
       AVG([Unit Price]) AS AvgUnitPrice
FROM Fact.[OrderBig]
WHERE [Customer Key] > 10 and [Customer Key] < 100
GROUP BY [Customer Key]
ORDER BY [Customer Key]
OPTION(USE HINT('DISALLOW_BATCH_MODE'));

When I ran the query in Listing 4 against the WideWorldImportersDW database running under compatibility level 150, the query didn't invoke any batch mode operations. I verified this by reviewing the properties of each operator; they all processed using row mode operations. The value of the DISALLOW_BATCH_MODE hint is that I can disable the batch mode feature for a single query. This makes it possible to be selective about which queries will not consider batch mode when the database is running under compatibility level 150.

Alternatively, you could disable the Batch Mode on Rowstore feature at the database level, as shown in Listing 5.

Listing 5: Disabling Batch Mode at the database level

-- Disable batch mode on rowstore
ALTER DATABASE SCOPED CONFIGURATION SET BATCH_MODE_ON_ROWSTORE = OFF;

Disabling the batch mode feature at the database level still allows other queries to take advantage of the other new 15.x features. This might be an excellent option if you want to move to version 15.x of SQL Server while you complete testing of all of your large aggregation queries to see how they are impacted by the batch mode feature. Once testing is complete, re-enable batch mode by running the code in Listing 6.

Listing 6: Enabling Batch Mode at the database level

-- Enable batch mode on rowstore
ALTER DATABASE SCOPED CONFIGURATION SET BATCH_MODE_ON_ROWSTORE = ON;

By using the hint or the database scoped configuration method to disable batch mode, I have control over how this new feature affects the performance of my row mode query operations. Having both methods available gives me more flexibility in how I roll out the Batch Mode on Rowstore feature across a database.
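To confirm the current database-level setting after either change, a quick check like this (a sketch, not from the original article) reads the database scoped configuration directly:

-- Check whether Batch Mode on Rowstore is currently enabled for this database
SELECT name, value        -- value 1 = ON, 0 = OFF
FROM sys.database_scoped_configurations
WHERE name = N'BATCH_MODE_ON_ROWSTORE';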

Which Editions Support Batch Mode?

Before you get too excited about how this feature might help the performance of your large analytic queries, I have to share the bad news: Batch Mode on Rowstore is not available in every edition of SQL Server. Like many new features that have come out in the past, it is first being introduced in Enterprise edition only, and over time it might become available in other editions. As of the RTM release of SQL Server 2019, the Batch Mode on Rowstore feature is only available in Enterprise edition (and in Azure SQL Database), as documented here. Developer edition also supports Batch Mode on Rowstore, but of course it cannot be used for production work, so be careful when doing performance testing of this feature on Developer edition if you plan to roll your code out to any production environment other than Enterprise. If you want to reduce your CPU footprint using this new feature, you will need to either get out your checkbook and upgrade to Enterprise edition or wait until Microsoft rolls the feature out to other editions of SQL Server.

Reduce CPU of Large Analytic Queries Without Changing Code

If you have large analytic queries that perform aggregations, you might find that the new Batch Mode on Rowstore feature improves CPU and Elapsed time without changing any code, provided your environment meets a few requirements. The first requirement is that your queries need to be running on SQL Server version 15.x (SQL Server 2019) or later. The second requirement is that you need to be running an edition of SQL Server that supports the Batch Mode on Rowstore feature. Additionally, the table being queried needs to have at least 131,072 rows and be stored in a b-tree or heap before batch mode is considered.

I am impressed by how much less CPU and Elapsed time was used for my test aggregation queries. If you have a system that runs lots of aggregate queries, then migrating to SQL Server 2019 might be able to eliminate your CPU bottlenecks and get some of your queries to run faster at the same time.


SQL – Simple Talk

Read More

Get Your Scalar UDFs to Run Faster Without Code Changes

February 13, 2020   BI News and Info

Over the years, you have probably experienced or heard that user-defined functions (UDFs) do not scale well as the number of rows processed gets larger and larger. That is too bad, because we have all heard that encapsulating your code into modules promotes code reuse and is a good programming practice. Now the Microsoft SQL Server team has added a new feature to the database engine in Azure SQL Database and SQL Server 2019 that allows UDF performance to scale when processing large recordsets. This new feature is known as T-SQL Scalar UDF Inlining.

T-SQL Scalar UDF Inlining is one of many new performance features introduced in Azure SQL Database and SQL Server 2019 as part of the Intelligent Query Processing (IQP) feature set. Figure 1, from Intelligent Query Processing in SQL Databases, shows all the IQP features introduced in Azure SQL Database and SQL Server 2019, as well as the features that were originally part of the Adaptive Query Processing feature set included in the earlier generation of Azure SQL Database and SQL Server 2017.


Figure 1: Intelligent Query Processing

The T-SQL Scalar UDF Inlining feature will automatically scale UDF code without requiring any coding changes. All that is needed is for your UDF to run against a database in Azure SQL Database or SQL Server 2019 with the compatibility level set to 150. Let me dig into the details of the new inlining feature a little more.
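Before digging in, a quick way to confirm that a database qualifies is to check its compatibility level and the database-scoped inlining switch. The script below is a sketch, not from the original article:

-- Check the compatibility level of the current database
SELECT name, compatibility_level
FROM sys.databases
WHERE name = DB_NAME();

-- Check whether scalar UDF inlining is enabled at the database scope (1 = ON)
SELECT name, value
FROM sys.database_scoped_configurations
WHERE name = N'TSQL_SCALAR_UDF_INLINING';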

T-SQL Scalar UDF Inlining

The new T-SQL Scalar UDF Inlining feature will automatically change the way the database engine interprets, costs, and executes T-SQL queries when a scalar UDF is involved. Microsoft incorporated the FROID framework into the database engine to improve the way scalar UDFs are processed. This new framework refactors the imperative scalar UDF code into relational algebraic expressions and incorporates these expressions into the calling query automatically.

By refactoring the scalar UDF code, the database engine can improve the cost-based optimization of the query and perform set-based optimization that allows the UDF code to go parallel if needed. Refactoring of scalar UDFs is done automatically when a database is running under compatibility level 150. Before digging further into the new scalar UDF inlining feature, let me review why scalar UDFs are inherently slow and discuss the differences between imperative and relational equivalent code.

Why are Scalar UDFs Inherently Slow?

When running a scalar UDF on a database with a compatibility level lower than 150, it just doesn't scale well. By scale, I mean it works fine for a few rows but runs slower and slower as the number of rows processed gets larger and larger. Here are some of the reasons why scalar UDFs don't work well with large recordsets.

  • When a T-SQL statement uses a scalar function, the database engine optimizer doesn't look at the code inside the function to determine its cost. This is because scalar operators are not costed, whereas relational operators are. The optimizer treats a scalar function as a black box that uses minimal resources. Because scalar operations are not costed appropriately, the optimizer is notorious for creating very bad plans when scalar functions perform expensive operations.
  • A scalar function is evaluated as a batch of statements that run sequentially, one after another. Each statement has its own execution plan and runs in isolation from the other statements in the UDF, and therefore can't take advantage of cross-statement optimization.
  • The optimizer will not allow queries that use a scalar function to go parallel. Keep in mind that parallelism may not improve all queries, but when a scalar UDF is used in a query, that query's execution plan will not go parallel.

Imperative and Relational Equivalent Code

Scalar UDFs are a great way to modularize your code and promote reuse, but all too often they contain procedural code. Procedural code might contain imperative constructs such as variable declarations, IF/ELSE structures, and WHILE loops. Imperative code is easy to write and read, which is why it is so widely used when developing application code.

The problem with imperative code is that it is hard to optimize, and therefore query performance suffers when imperative code is executed. The performance of imperative code is fine when a small number of rows is involved, but as the row count grows, the performance starts to suffer. Because of this, you should not use imperative scalar UDFs against larger record sets when they execute on a database running with a compatibility level lower than 150. With the introduction of version 15.x of SQL Server, the scaling problem associated with UDFs has been addressed by refactoring imperative code using a new optimization technique known as the FROID framework.

The FROID framework refactors imperative code into a single relational equivalent query. It does this by analyzing the scalar UDF imperative code and then converts blocks of imperative code into relational equivalent algebraic expressions. These relational expressions are then combined into a single T-SQL statement using APPLY operators. Additionally, the FROID framework looks for redundant or unused code and removes it from the final execution plan of the query. By converting the imperative code in a scalar UDF into re-factored relational expressions, the query optimizer can perform set-based operations and use parallelism to improve the scalar UDF performance. To further understand the difference between imperative code and relational equivalent code, let me show you an example.

Listing 1 contains some imperative code. By reviewing this listing, you can see it includes a couple of DECLARE statements and some IF/ELSE logic.

Listing 1: Imperative Code Example

DECLARE @Sex varchar(10) = 'Female';
DECLARE @SexCode int;
IF @Sex = 'Female'
    SET @SexCode = 0
ELSE
    IF @Sex = 'Male'
        SET @SexCode = 1;
    ELSE
        SET @SexCode = 2;
SELECT @SexCode AS SexCode;

I have then re-factored the code in Listing 1 into a relationally equivalent single SELECT statement in Listing 2, much like the FROID framework might do when compiling a scalar UDF.

Listing 2: Relational Code Example

SELECT B.SexCode FROM (SELECT 'Female' AS Sex) A
OUTER APPLY
  (SELECT CASE WHEN A.Sex = 'Female' THEN 0
               WHEN A.Sex = 'Male' THEN 1
               ELSE 2
          END AS SexCode) AS B;

By looking at these two examples, you can see how easy it is to read the imperative code in Listing 1 and tell what is going on, whereas the relationally equivalent code in Listing 2 requires a little more analysis to determine exactly what is happening.

Currently, the FROID framework is able to rewrite the following scalar UDF coding constructs into relational algebraic expressions:

  • Variable declarations and assignments using DECLARE or SET statements
  • Multiple variable assignments in a SELECT statement
  • Conditional testing using IF/ELSE logic
  • Single or multiple RETURN statements
  • Nested/recursive function calls in a UDF
  • Relational operations such as EXISTS and ISNULL

The two listings in this section only demonstrate logically how the FROID framework might convert imperative UDF code into relationally equivalent code. For more detailed information on the FROID framework, I suggest you read this technical paper.

In order to see FROID optimization in action, let me show you an example that compares the performance of a scalar UDF running with and without FROID optimization.

Comparing Performance of Scalar UDF with and Without FROID Optimization

To test how a scalar UDF performs with and without FROID optimization, I ran a test using the sample WideWorldImportersDW database (download here). In that database, I created a scalar UDF called GetRating. The code for this UDF can be found in Listing 3.

Listing 3: Scalar UDF that contains imperative code

CREATE OR ALTER FUNCTION dbo.GetRating(@CityKey int)
RETURNS VARCHAR(13)
AS
BEGIN
   DECLARE @AvgQty DECIMAL(5,2);
   DECLARE @Rating VARCHAR(13);
   SELECT @AvgQty = AVG(CAST(Quantity AS DECIMAL(5,2)))
   FROM Fact.[Order]
   WHERE [City Key] = @CityKey;
   IF @AvgQty / 40 >= 1
      SET @Rating = 'Above Average';
   ELSE
      SET @Rating = 'Below Average';
   RETURN @Rating
END

By reviewing the code in Listing 3, you can see the scalar UDF I will be using for testing. This function calculates a rating for a [City Key] value. The rating returned is either "Above Average" or "Below Average", depending on whether the city's average order quantity is at least 40. Note that this UDF contains imperative code.

To test how scalar UDF inlining can improve performance, I ran the code in Listing 4.

Listing 4: Code to test performance of scalar UDF

-- Turn on Time Statistics
SET STATISTICS TIME ON;
GO
USE WideWorldImportersDW;
GO
-- Set Compatibility level to 140
ALTER DATABASE WideWorldImportersDW SET COMPATIBILITY_LEVEL = 140;
GO
-- Test 1
SELECT DISTINCT ([City Key]), dbo.GetRating([City Key]) AS CityRating
FROM Dimension.[City]
-- Set Compatibility level to 150
ALTER DATABASE WideWorldImportersDW SET COMPATIBILITY_LEVEL = 150;
GO
-- Test 2
SELECT DISTINCT ([City Key]), dbo.GetRating([City Key]) AS CityRating
FROM Dimension.[City]
GO

The code in Listing 4 runs two tests. The first test (Test 1) calls the scalar UDF dbo.GetRating using compatibility level 140 (SQL Server 2017). For the second test (Test 2), I only changed the compatibility level to 150 (SQL Server 2019) and ran the same query as Test 1 without making any coding changes to the UDF.

When I run Test 1 in Listing 4, I get the execution statistics shown in Figure 2 and the execution plan shown in Figure 3.


Figure 2: Execution Statistics for Test 1
Figure 3: Execution plan for Test 1 under compatibility level 140

Prior to reviewing the time statistics and execution plan for Test 1, let me run Test 2. The time statistics and execution plan for Test 2 can be found in Figures 4 and 5, respectively.


Figure 4: Execution Statistics for Test 2


Figure 5: Execution plan for Test 2 under compatibility level 150

Performance Comparison between Test 1 and Test 2

The only change I made between Test 1 and Test 2 was the compatibility level, from 140 to 150. Let me review how FROID optimization changed the execution plan and improved performance when I executed the test using compatibility level 150.

Before running the two different tests, I turned on statistics time. Figure 6 compares the time statistics between the two different tests.

Figure 6: CPU and Elapsed Time Comparison Between Test 1 and Test 2

As you can see, when I executed the Test 1 SELECT statement in Listing 4 using compatibility level 140, the CPU and elapsed time each came in at a little over 30 seconds. When I changed the compatibility level to 150 and ran the Test 2 SELECT statement in Listing 4, the CPU and elapsed time each came in at just over 1 second. Test 2, which used compatibility level 150 and the FROID framework, ran orders of magnitude faster than Test 1, which ran under compatibility level 140 without FROID optimization. This performance improvement was achieved without changing a single line of code in my test scalar UDF. To better understand why the time comparisons were so drastically different between these two executions of the same SELECT statement, let me review the execution plans produced by each of the test SELECT queries.

If you look at Figure 3, you will see the simple execution plan produced when the SELECT statement was run under compatibility level 140. This execution plan didn't go parallel and only includes two operators. All the work related to calculating the city rating in the UDF using the data in the Fact.[Order] table is not included in this execution plan. To get the rating for each city, my scalar function had to run multiple times, once for every [City Key] value found in the Dimension.[City] table. You can't see this in the execution plan, but if you monitor the query using an extended event, you can verify it. Each time the database engine invokes my UDF in Test 1, a context switch has to occur. The cost of the row-by-row nature of calling my UDF over and over again causes the Test 1 query to run slowly.
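For anyone who wants to observe those repeated invocations, the following Extended Events session is a minimal sketch (not from the original article; it assumes the sqlserver.module_end event exposes object_name as a filterable field, and the session name is arbitrary) that counts each completion of the GetRating UDF:

-- Create a lightweight session that fires each time the GetRating module completes
CREATE EVENT SESSION [TrackGetRatingCalls] ON SERVER
ADD EVENT sqlserver.module_end
    (WHERE ([object_name] = N'GetRating'))
ADD TARGET package0.event_counter;   -- just count invocations
GO
ALTER EVENT SESSION [TrackGetRatingCalls] ON SERVER STATE = START;
GO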

If you look at the execution plan in Figure 5, which is for Test 2, you see a very different plan compared to Test 1. When the SELECT statement in Test 2 was run, it ran under compatibility level 150, which allowed the scalar function to be inlined. By inlining the scalar function, FROID optimization converted my scalar UDF into relational operations, which allowed the UDF logic to be included in the execution plan of the calling SELECT statement. This let the database engine calculate the rating value for each [City Key] using a set-based operation and then join the rating values to all the cities in the Dimension.[City] table using a nested loops inner join. Because of this set-based operation, the Test 2 query runs considerably faster and uses fewer resources than the row-by-row nature of the Test 1 query.

Not all Scalar Functions Can be Inlined

Not all scalar functions can be inlined. If a scalar function contains coding constructs that the FROID framework cannot convert to relational algebraic expressions, the UDF will not be inlined. For instance, if a scalar UDF contains a WHILE loop, it will not be inlined. To demonstrate this, I modified my original UDF code so it contains a dummy WHILE loop. My new UDF is called dbo.GetRating_Loop and can be found in Listing 5.

Listing 5: Scalar UDF containing a WHILE loop

CREATE OR ALTER FUNCTION dbo.GetRating_Loop(@CityKey int)
RETURNS VARCHAR(13)
AS
BEGIN
   DECLARE @AvgQty DECIMAL(5,2);
   DECLARE @Rating VARCHAR(13);
   -- Dummy code to support WHILE loop
   DECLARE @I INT = 0;
   WHILE @I < 1
   BEGIN
      SET @I = @I + 1;
   END
   SELECT @AvgQty = AVG(CAST(Quantity AS DECIMAL(5,2)))
   FROM Fact.[Order]
   WHERE [City Key] = @CityKey;
   IF @AvgQty / 40 >= 1
      SET @Rating = 'Above Average';
   ELSE
      SET @Rating = 'Below Average';
   RETURN @Rating
END

By reviewing the code in Listing 5, you can see I added a dummy WHILE loop at the top of my original UDF. When I run this function using the code in Listing 6, I get the execution plan in Figure 7.

Listing 6: Code to run dbo.GetRating_Loop

USE WideWorldImportersDW;
GO
-- Set Compatibility level to 150
ALTER DATABASE WideWorldImportersDW SET COMPATIBILITY_LEVEL = 150;
GO
-- Test UDF With WHILE Loop
SELECT DISTINCT ([City Key]),
    dbo.GetRating_Loop([City Key]) AS CityRating
FROM Dimension.[City]
GO


Figure 7: Execution plan created while executing Listing 6

By looking at the execution plan in Figure 7, you can see that my new UDF didn't get inlined. The execution plan for this test looks very similar to the execution plan I got when I ran my original UDF from Listing 3 under database compatibility level 140. This example shows that not all scalar UDFs will be inlined; only those that use functionality supported by the FROID framework will be.
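Rather than inferring this from the plan shape, you can also ask the engine directly. The query below is a hedged sketch (not part of the original article) that uses the is_inlineable column added to sys.sql_modules in SQL Server 2019 to check both test UDFs:

-- is_inlineable = 1 means FROID can inline the module; 0 means it cannot
SELECT OBJECT_NAME(object_id) AS UdfName,
       is_inlineable,
       inline_type            -- 1 = inlining is currently enabled for the module
FROM sys.sql_modules
WHERE object_id IN (OBJECT_ID(N'dbo.GetRating'),
                    OBJECT_ID(N'dbo.GetRating_Loop'));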

Disabling Scalar UDF Inlining

With this new version of SQL Server, the design team wanted to make sure you could disable any new feature at the database level or statement level. Therefore, you can use the code in Listing 7 or 8 to disable scalar UDF inlining. Listing 7 shows how to disable scalar UDF inlining at the database level.

Listing 7: Disabling inlining at the database level

ALTER DATABASE SCOPED CONFIGURATION SET TSQL_SCALAR_UDF_INLINING = OFF;

Listing 8 shows how to disable scalar UDF inlining when the UDF is created.

Listing 8: Disabling inlining when defining a UDF

CREATE FUNCTION dbo.MyScalarUDF (@Parm int)

RETURNS INT

WITH INLINE=OFF

...
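For completeness, here is a hypothetical, fully formed version of that fragment (the function name, parameter, and body are placeholders, not from the original article) showing where the INLINE = OFF option sits in a real definition:

CREATE OR ALTER FUNCTION dbo.MyScalarUDF (@Parm int)
RETURNS INT
WITH INLINE = OFF     -- opt this one UDF out of scalar UDF inlining
AS
BEGIN
    RETURN @Parm * 2; -- placeholder logic for illustration only
END;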

Make Your Scalar UDFs Run Faster by Using SQL Server Version 15.x

If you want to make your scalar UDFs run faster without making any coding changes, then SQL Server 2019 is for you. With this new version of SQL Server, the FROID framework was added. This framework refactors a scalar UDF into relationally equivalent code that can be placed directly into the calling statement's execution plan. By doing this, a scalar UDF is turned into a set-based operation instead of being called once for every candidate row. All it takes to have a scalar UDF refactored is to set your database to compatibility level 150.


SQL – Simple Talk

Read More