Category Archives: Business Intelligence

Adding Visual Flare In Power BI With Background Colors


Hello P3 Nation! Today’s “post” is going to be a video link. Some subjects, concepts, or post ideas just don’t translate well to the written word, and especially to screenshots. So this post will be in moving-picture form! We’ve posted it to our YouTube channel, which you can visit here. Besides today’s video, we have TONS of content on that channel, so please take a look at our other awesome videos as well.

Today’s topic covers how to apply background colors in Power BI Desktop. An important component of any report is artistic design that still lends itself to being easily understood and used by clients. So in this video, we’ll learn how to apply background colors to both the report and the visualizations to help make them pop! I’ve included below the video link, download links for both the .pbix file and the background-colors .ppt, and links to other articles explaining the various features & design techniques I’ve applied in this report. Otherwise, enjoy the video!

My Top 5 Power BI Practices: Transforming Good to GREAT: Article talks about a lot of the Formatting and Design Practices you see above, plus the DAX Formulas table.

Power BI: Transforming Good to GREAT: Video that is an updated discussion & walkthrough of the article above. Discusses Formatting, Design Practices, What If Scenarios, and Forecasting.

You can download the file here.

We like to think that’s no accident. We’re different. First of a new breed – the kind who can speak tech, biz, and human all at the same time.

Want this kind of readily-absorbable, human-oriented Power BI instruction for your team? Hire us for a private training at your facility, OR attend one of our public workshops!



GE Aviation and Teradata Expand Partnership


GE Aviation and Teradata (NYSE: TDC) today announced that GE Aviation will become the exclusive provider of Teradata products and services for commercial aviation markets, providing the world’s biggest airlines with a single, comprehensive framework that combines high-performance analytics in the cloud from Teradata with edge-connectivity services from GE Aviation.
Teradata and GE Aviation previously announced a partnership to jointly provide products and services to commercial aviation markets. The two companies are extending this relationship to provide airlines with wide-ranging operational insights that power impactful business decisions.
“We can provide the best analytic environment for airline customers by adding Teradata’s powerful analytic solutions with built-in support for hybrid cloud environments,” said John Mansfield, Chief Digital Officer of GE Aviation. “This partnership enables us to bring a holistic framework of enterprise data and business solutions to airlines.”
An example of this would be GE Aviation’s FlightPulse application, which will include a connection to Teradata software to enable access to high-performance analytics enterprise-wide. FlightPulse automatically merges complex aircraft data with crew schedules, allowing commercial pilots to visualize personal analytics and conduct their own exploration. Empowered to conduct their own analysis and peer comparisons, pilots discover areas to optimize operations and efficiency, while reducing risk, fuel consumption and carbon emissions. By combining this operational data with passenger-based data within the Teradata system, an airline is able to create a holistic enterprise view of its data and extract meaningful business outcomes.
“Operationalizing the analytics that provide high-impact, trusted business outcomes is one of the hardest things to pull off in an enterprise-scale environment,” said Martyn Etherington, Chief Marketing Officer at Teradata. “The first step is often making sure various data silos can communicate, and our partnership with GE Aviation is designed to make this happen almost automatically for the airline industry. Ensuring that these large, global customers have quick access to a powerful, yet flexible, analytics solution like the Teradata Analytics Platform is the second step, and we are delighted that GE Aviation will be offering this product to its customers.”
“By expanding our partnership, the aviation market will gain an analytic solution that delivers comprehensive insights. We could not be more excited for the insights we are about to unlock for our customers,” said Andrew Coleman, Chief Commercial Officer for GE Aviation’s Digital Solutions business. “It became clear that, for data and analytics, Teradata excels at the size and scale required by a global airline.”
The Teradata Analytics Platform is a key offering from Teradata, the cloud-based data and analytics leader. It delivers powerful analytic functions and engines that can be used with multiple data types. Coupled with the tools and languages that each user individually prefers, the Teradata Analytics Platform is the future of analytics, delivering secure, scalable, high-performance analytics in the cloud, on-premises, or both.
For airline customers that are already using GE Aviation for operations, assets, and network management, adding the Teradata Analytics Platform would deliver enterprise-wide insights with results ranging from improved flight operations and predictive maintenance to increased operational efficiencies and higher customer satisfaction.
Via their extended partnership, Teradata and GE Aviation are particularly well positioned to enhance the aviation analytics market, which Markets and Markets Research estimates is a $2.16 billion industry that will grow to an estimated $4.23 billion by 2021.

About GE Aviation:
GE Aviation is part of GE (NYSE: GE), the world’s Digital Industrial Company, transforming industry with software-defined machines and solutions that are connected, responsive and predictive. With people, services, technology and scale, GE delivers better outcomes for customers by speaking the language of industry.


Teradata United States

Amazon’s Matt Wood on the major takeaways from AWS Summit 2018


On Wednesday, just days ahead of Google’s Cloud Next conference, Amazon hosted its annual Amazon Web Services cloud computing conference, AWS Summit, at the Jacob K. Javits Convention Center in New York City. It didn’t hold back.

SageMaker, the Seattle company’s full-stack machine learning platform, got two major updates: SageMaker Streaming Algorithms and SageMaker Batch Transform. The former, which is available for neural network models created with Google’s TensorFlow, lets customers stream data from AWS’ Simple Storage Service (S3) directly into SageMaker GPU and CPU instances. The latter allows them to transfer large training datasets without having to break them up with an API call.

In terms of hardware, Amazon added Elastic Compute Cloud (EC2) to its Snowball Edge system, an on-premises Intel Xeon-based platform for data processing and collection. And it enhanced its local storage, compute, data caching, and machine learning inference capabilities via AWS Greengrass, AWS Lambda, and Amazon S3, enabling new categories of virtualized applications to run remotely in work environments with limited connectivity.

On the services front, Amazon Transcribe’s new Channel Synthesis tool merges call center audio from multiple channels into a single transcription, and Amazon Translate now supports Japanese, Russian, Italian, Traditional Chinese, Turkish, and Czech. Amazon Comprehend, Amazon’s natural language processing (NLP) service, now boasts improved text analysis thanks to syntax identification.

Finally, Amazon revealed a slew of new and extended partnerships with major clients. Fortnite developer Epic Game said it’s building “new games [and] experiences” on AWS; 21st Century Fox will use Amazon’s cloud service for the “vast majority” of on-demand content delivery; Major League Baseball and Formula 1 are planning to tap AWS’ AI tools for real-time data analytics; and Celgene will leverage Amazon’s machine learning platform to expedite drug analysis and validation.

It’s a lot to take in. For a bit of context around this week’s announcements, I spoke with Dr. Matt Wood, general manager of artificial intelligence at AWS, who shed light on Amazon’s momentum in cloud computing, overarching trends in AI, and the problem of bias in machine learning models and datasets.

Here’s a transcript of our interview, which has been edited for length and clarity.

VentureBeat: Today, you announced SageMaker Streaming Algorithms, which allows AWS customers to train machine learning models more quickly. What was the motivation? Was this something for which customers expressed a deep desire?

Matt Wood: There are certain things across AWS that we want to invest in, and they’re the things that we think aren’t going to change over time. We’re building a business not for one year, 10 years, or 50 years, but 100 years — far in excess of when I’m going to be around and in charge of it. When you take that long-term view, you tend to put money not into the things you think are going to change, but into the things you think are going to stay the same.

For infrastructure, and for AWS — and this is true for machine learning as well — cost is really a big driver of that … It’s impossible for us to imagine our customers saying that they want the service to be more expensive, so we go out of our way to drive down costs.

A really good example is something we announced a couple of years ago that we call Trusted Advisor. Trusted Advisor is a feature you can turn on inside your AWS account that automatically, without you having to do anything, makes recommendations about how to reduce your AWS bill. We delivered over $300 million in annual savings to customers that way.

These are some of the advantages that the cloud provides, and they’re advantages that we want to maintain.

VentureBeat: On the client side of things, you announced a lot of strategic partnerships with Epic, Major League Baseball, and others, almost all of which said they’ll be using AWS as their exclusive cloud platform of choice. So what’s the movement there? What’s the feedback been like so far?

Wood: We see a lot of usage in sports analytics. Formula 1 chose AWS as their machine learning platform, Major League Baseball chose AWS as their machine learning platform, and the National Football League chose AWS as their machine learning platform. The reason for that is they want to drive better experiences for their viewers, and they see machine learning as a key piece of the advanced next-generation statistics they want to bring into their production environment — everything from route prediction [to] stat prediction.

That’s just one big area. Other areas are pharmaceuticals and health care. We have HIPAA compliance, which allows [our] customers to work with health care workloads, so we see a lot of momentum in disease prediction. We do diabetic retinopathy prediction, readmission prediction — all those sorts of things.

To that end, we announced [this week that] Bristol Myers Squibb is using SageMaker to accelerate the development of the innovative medicines they build. Celgene is another really good example — Celgene actually runs Gluon, which is our machine learning library, on top of SageMaker, and they take advantage of the P3 GPUs with the Nvidia Volta under the hood. So, you know, that’s a really good example of a customer that has materially accelerated its ability to bring drugs to market more quickly and more safely.

VentureBeat: Amazon offers a lot of machine learning services to developers, like Rekognition — your computer vision platform — and Amazon Translate. But you have a lot of competition in the space from Google, Microsoft, and others. So how are you differentiating your APIs and services from the rest out there?

Wood: Candidly, we don’t spend a [lot of] time thinking about what our competitors are up to — we tend to be way more customer-focused. We’ve launched 100 new services and features since re:Invent 2017, and no other provider has done more than half of that. I would say 90-95 percent of what we’ve launched has been directly driven by customer feedback, and the other 5-10 percent is driven by our attempts to read between the lines and figure out what customers don’t quite know to ask for yet.

SageMaker is really helpful in cases where customers have data which they believe has differentiating value. Then, there are application developers who may not have a lot of training data available or who just want to add some level of intelligence to their application quickly — that’s where Rekognition, Rekognition Video, Transcribe, Comprehend, Polly, Lex, and Translate come in.

We joke about this, but our broader mission is really to make machine learning boring and totally vanilla, just part of the course of doing business and another tool in the tool chest. Machine learning, we kind of forget, used to be a huge investment requirement in the hundreds of millions of dollars to get up and running. It was completely out of reach, and I think we’ve made huge progress in a very, very short amount of time.

We have a saying in Amazon: It’s still day one for the internet. And for machine learning, we haven’t even woken up and had our first cup of coffee yet. But there’s a ton of excitement and momentum. We have tens of thousands of active developers on the platform and 250 percent growth year over year. Eight out of 10 machine learning workloads run on AWS — twice as many as any other provider. And customers really value that focus on continuous platform improvement. I’m excited about where we’re headed.

VentureBeat: Voice recognition and natural language processing, in particular, are extremely competitive spaces right now. I know you said you don’t think too much about what your competitors are doing, but what kind of gains have you made relative to the market?

Wood: These services are off to a great start, and we see contact centers being a really big area.

A lot of customers use Amazon Lex as their first point of contact. The National Health Service (NHS) in the U.K. ran a pilot where they introduced a Lex chatbot, and it was able to handle 40 percent of their call volume. This is the centralized health provider in all of the U.K., so that’s really meaningful in terms of patients getting to talk to somebody more quickly, or NHS being able to operate its contact center more efficiently.

[This week] we announced Channel Splitting, where we were able to take call center recordings — two recordings, one of the agent and one of the customer — in the same file, split out the channels, transcribe them both independently, and merge the transcripts together. You get a single file out, and then you can take that and pass it off to Comprehend to find out what’s going on in the conversation and what people were talking about. You can also run compliance checks to see if contact center agents are saying scripts exactly as they’re designed to be said.

From an efficiency perspective, large contact centers are expensive and difficult for most organizations to run, and from Lex through to the management, compliance, analytics, and insight you can get from the data there, we think they’re a really compelling AWS use case.

VentureBeat: Shifting gears a bit. You mentioned inclusion a bit earlier, and as you probably know, with respect to computer vision, we’ve got a long way to go — facial recognition is an especially difficult thing for developers and infrastructure providers to get right. So how do you think it might be tackled? How can we improve these algorithms that, for example, appear to be biased against people of color and certain ethnicities and races?

Wood: It’s the classic example of garbage in, garbage out. If you’re not really careful about where you get your data from and if you accidentally with good intentions introduce some selection criteria on the data in the perfect representative set, you’re going to introduce inaccuracies. The good news is that with machine learning, you can identify, measure, and systematically reduce those inaccuracies.

One of the key benefits of services like SageMaker is that the quicker you can train and retrain models, the quicker you can identify areas of accuracy and start to narrow down the inaccuracies. So in that respect, any investment that we make, such as SageMaker Streaming Algorithms, contributes to spinning that flywheel faster and allows developers to iterate and build more sophisticated models that overcome some of the noise inside the data.

Basically, investment in our frameworks allows developers to build more sophisticated models, train models more quickly, and operate more efficiently in a production environment. All of it helps.


Big Data – VentureBeat

How Will Travel Look In A Digital World?

Part 1 in the “Advanced Integration” series

This blog is the first of a series that will drill down into technologies for advanced integration and enhanced intelligence and potential applications. We will highlight these capabilities, along with details on architectural design and evolutions in available underlying products and technology components. Here, we are exploring how an integrated platform could support a personal travel assistant.

The digital transformation journey for travel

The significant ongoing growth in global travel has triggered the need for more advanced travel-planning and execution tools to supplement existing solutions, which individually cover only certain aspects of the overall itinerary.

Today’s platform technology makes it possible to build a highly automated travel-planning and execution-monitoring solution. The technology can combine features from existing applications with intelligent microservices like satellite services, Internet of Things (IoT), machine learning, and blockchain technologies and data sources.

The result is an intelligent personal travel assistant, offering the traveler all data required in a fully automated way, in real time – from planning to execution, including routing, bookings, scheduling, checkouts, and final cost settlements. Imagine how a personal travel assistant could simplify the life of the frequent traveler, whether for business or leisure. The additional benefits are obvious in terms of administrative cost savings, and also for better alignment with various carriers and optimization of occupancies. For energy-intensive transportation providers, the result could also translate into more environmentally sustainable practices through resource optimization.

Travel-planning solutions today

Today, organizing a trip requires a lot of human judgment and manual steps during scheduling and execution. Multiple data sources need to be consulted. Today’s travel management solutions cover only the needs of the traveler in specific areas, like route planning, booking, or expense management; they offer little integration across the various areas. For example, a route planner provides travel schedule alternatives, but limited functionality in reservation and ticketing. Likewise, booking systems have limited functionality in automatic rebooking or subsequent payment adjustments, requiring lots of manual intervention.

Access to all relevant travel-data sources in real time is a prerequisite to produce qualified and updated travel schedules throughout the travel journey. These include carrier schedules (flights, train); lodging, dining, and entertainment (hotels, restaurants, performance venues); travelers’ profiles (route preferences, loyalty programs); and access to supporting services (hotlines, insurance). This prerequisite applies to existing travel management solutions as well.

Elevating automation in travel planning

However, using historical traveler patterns, machine learning can now identify one or more routes, as well as lodging and entertainment suggestions, and propose these through the personal travel assistant, with the corresponding time and cost implications. The traveler can then select the most suitable option and, by so doing, continually update the machine learning engine. The engine can also release bookings and reservations in due time and issue ticketing, considering time-window restrictions, penalty clauses, soft/hard booking judgments, etc. Contracts with the various providers are used as a source.

Satellite services and IoT allow the location of the traveler to be monitored throughout the journey, as well as the location of each carrier and deviations from the original schedule. The machine learning engine can anticipate potential conflicts and reschedule the trip to a best alternative route going forward, making all necessary adjustments in bookings and reservations.

Ticketing and other required verification documents can be pushed to the personal travel assistant upon confirmation of the various carriers. Payment settlement follows through various channels (such as bank transfer, credit card, blockchain) upon confirmation of carrier usage, either detected through satellite and IoT data sources or confirmed manually. Final settlement of all costs and expenses can be fully automated, sharing the relevant data with standard travel and expense applications.

The foundation: a consolidated data platform

The foundation of the new solution is a data platform consolidating all relevant data sources, as well as offering required security capabilities and mobile access. For example, the future state could include:

  • Booking of travel and lodging followed automatically by route scheduling
  • Issuance of tickets and other documents upon confirmation with contractual best alternatives
  • Automatic settlement of payments and expense declarations
  • Planning and booking of transfers to and from the airport

Making this a reality, however, will require meaningful change to existing compliance and authorization barriers. Feasibility requires the flexibility to change bookings in an automated way, as well as semi-authorized payment settlements and adjustments.

The feasibility

However, if those barriers to entry could be overcome through policy change, the improvements in the travel experience are considerable, including:

  • Time and cost saving during planning, rescheduling, and execution of trips, eliminating all paperwork and phone calls
  • Minimized hiccups and waiting times during travel, since all data related to availabilities, schedules, and calamities will be up-to-date and available in real time
  • Up-to-date information provided to the traveler, allowing for a smooth trip with minimal disruption or unexpected delays
  • Better visibility for all stakeholders, including travel agencies, carriers, hotels, and employers, on travel costs and faster settlement without errors

Connect with Frank on LinkedIn.

Connect with John on LinkedIn.


Digitalist Magazine

Voigt notation in Mathematica


In computational mechanics software (Abaqus, Ansys, Comsol, etc.), Voigt notation is commonly used to represent a symmetric tensor by reducing its order.

My question: how can we get the Voigt notation from a second-order or fourth-order tensor in a very efficient way in Mathematica?

e.g. second order tensor: Array[Subscript[a, ## & @@ Sort[{##}]] &, {6, 6}] // MatrixForm

PS: Writing it out manually by hand is not a good way.
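For reference, the index bookkeeping behind Voigt notation can be sketched language-agnostically. The following is a Python/NumPy illustration (not Mathematica, and not an answer to the question itself); the function names and the 1-based pair ordering 11, 22, 33, 23, 13, 12 are the conventional choice, but check your target software's convention:

```python
import numpy as np

# Voigt index pairs (0-based): 11->1, 22->2, 33->3, 23->4, 13->5, 12->6
VOIGT_PAIRS = [(0, 0), (1, 1), (2, 2), (1, 2), (0, 2), (0, 1)]

def to_voigt(t):
    """Reduce a symmetric 3x3 second-order tensor to a length-6 Voigt vector."""
    t = np.asarray(t)
    assert np.allclose(t, t.T), "tensor must be symmetric"
    return np.array([t[i, j] for i, j in VOIGT_PAIRS])

def to_voigt_4(c):
    """Reduce a (minor-symmetric) 3x3x3x3 fourth-order tensor to a 6x6 matrix."""
    c = np.asarray(c)
    return np.array([[c[i, j, k, l] for (k, l) in VOIGT_PAIRS]
                     for (i, j) in VOIGT_PAIRS])
```

In Mathematica the same idea would amount to `Part`-extracting the tensor at those six index pairs.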

Reference Links:


Recent Questions – Mathematica Stack Exchange

7/19/18 Webinar: Next Generation Location and Data Analysis using Mapbox and Power BI

Join Charles Sterling and Sam Gehret as they walk through how Mapbox and Power BI can use location data to tell your story using next generation maps.


When: 7/19/18 10AM PST

If you are not familiar with Mapbox, it is the location data platform for mobile and web applications. They provide building blocks to add location features like maps, search, and navigation into any experience you create.


Sam Gehret is a Solutions Engineer for BI and Data Viz at Mapbox. He currently manages the development and roadmap for the Mapbox custom visual for Power BI. Sam has over 7 years of Business Intelligence experience working in both product and sales at another large BI company. He holds a BA from Dartmouth College and is a graduate of the General Assembly JavaScript bootcamp.


Microsoft Power BI Blog | Microsoft Power BI

How Olay used AI to double its conversion rate


Olay, the popular skin care brand, started using AI to make recommendations to its millions of users almost two years ago, and says it has doubled the company’s sales conversion rate overall.

It’s just the latest retail company that has turned to AI to boost its engagement with users to increase its top line. The traction confirms surveys that show an increasing number of businesses are putting AI investments at the head of their agenda.

True, Olay has an advantage over most companies. The billion-dollar brand is owned by giant Procter & Gamble, and has been using AI in its core product for some time. It has 25 years of expertise in image recognition, which helps it identify skin problems and improvement areas for its users.

In 2016, with renewed excitement growing around the potential of AI in marketing products, Olay leveraged the technology in a new marketing push, launching the Olay Skin Advisor, an online tool that gives women an accurate skin-age estimate and recommendations for care.

The product is based on a single selfie, and leverages Olay’s image expertise. Skin Advisor offers up a personalized product regimen, taking into account problem areas it sees, as well as what the user tells it they are most concerned about (wrinkles, crow’s feet, dry skin, etc.).

It incorporates an AI-powered matching engine built by Nara Logics, a Boston company that specializes in content matching and also serves the CIA, among others. Its technology decides exactly which of Olay’s 100 or so products to recommend, and in what combinations.

We talked with the CEO of Nara Logics, Jana Eggers (see video below), about how Olay doubled its conversion rate with the Skin Advisor product, which now has engaged more than four million customers. Skin Advisor also increased the average basket size, for example increasing it by 40 percent in China alone, and cut the bounce rate of visitors to a third of what it was previously. While P&G doesn’t break out Olay results in its earnings, it recently cited demand for Olay products as a reason for exceeding expected sales.

It’s one of the series of cases we’ve been writing about in the run-up to our Transform event on August 21-22, where we are showcasing real examples of companies using AI to drive their business results. Our motto for the event has become “You can do it too!,” because it’s not just the big tech companies — Google, Amazon, Facebook — that can use AI.

Here are my six take-aways from the interview:

  • AI approaches are customized per industry. Nara Logics uses the same machine learning algorithm for Olay as it does for the U.S. government’s intelligence community. But it generates unique “knowledge graphs” for each industry. For Olay, the algorithm accommodates two requirements: First, rules track individual product features and ingredients, to ensure they’re matched to customers’ focus areas and complement each other when offered in suites. Second, it gauges what products are popular, from reviews, transactions and other sources: Moisturizers may be healthy, but women like light hydration moisturizers, not sticky ones. This incorporates a collaborative filtering approach similar to the recommendations from Amazon or Netflix.
  • You don’t have to hire Ph.D.s. Eggers says that while the giant AI-platform companies like Google, Amazon, and Microsoft are hiring data science Ph.D.s, most companies don’t have to hire these expensive employees. “Hire some great software engineers,” she says, and they’ll be excited about using these technologies.
  • Neural nets may be hyped, but they’re still useful. Eggers agrees that neural nets, a deep learning approach, had become overhyped last year. She says she’s seeing more balance now; some companies are moving away from that hype. That said, Nara Logics does use neural nets for collaborative filtering analysis or natural language processing. It also uses proprietary algorithms to filter out noise.
  • Retail, financial, and B2B sectors are ripe for AI. Eggers sees the retail and financial industries moving quickly to adopt AI. She’s also seeing a lot of traction at B2B companies. These companies have discovered they’re not selling their services to other companies so much as they are selling them to individuals within those companies. This requires AI that makes recommendations based on what those people need in their specific roles.
  • It’s all about personalization. Skin Advisor serves recommendations to tens of thousands of people every week, and yet 94 percent of users receive recommendations unique to them — meaning no one else has received the same recommendation.
  • Men: Use it inside, or shave or something. Not sure if it was a bug or not, but when I first tried Skin Advisor, I was sitting outside, and it thought I was 59. I’m only 51. A couple of hours later, I tried it indoors, and it guessed 51. Bingo. (Later, I was told Skin Advisor doesn’t like men’s facial hair, so maybe that was it.)

This is just part of the story. Join us at Transform (ticket link here), where Jana Eggers sits down with Procter & Gamble’s Christi Putman, R&D Associate Director, and Damon Frost, CIO of Beauty, to hear more about how Olay is harnessing the power of AI.

Thanks to all of our sponsors whose support makes Transform possible: Samsung, Worldpay, IBM, Helpshift, PullString, Yva, TiE Inflect and Alegion.


Big Data – VentureBeat

Know When to Hold ‘Em – Part I. Fun With Lists


Last month I saw a puzzle posted about poker hand probabilities and, as with many things these days, wondered how I could solve it with Power BI. I had been curious to learn more about the M list functions, and this seemed like a good application. I didn’t finish until well after the deadline, but I did end up going way down the rabbit hole, so I thought I’d share what I did and learned. This post describes how one can generate every possible combination of five cards (2,598,960 rows) and seven cards (133,784,560 rows) entirely from scratch (i.e., no “Get Data” step). For the latter, there are 21 ways to pull 5 cards from 7, and to score it all in DAX we need each card on its own row, resulting in a table with >14 billion rows! While Power BI can handle that many rows, my computer (and hence my calendar) could not, so a way to reduce the data up to 700-fold is also described. In Part 2 of this post, we’ll use DAX to score all the possible hands generated below and do some analytics.

For this post, we’ll use Texas Hold ‘Em as the seven-card game, which means we need a deck of cards, a table for the two cards held by each player, and another for the five cards shared by all players. These two are then combined to make all the seven-card combinations. For each of the Two, Five, and Seven tables described below, the expected number of rows can be calculated using the same equation (i.e., 52!/((52-n)!*n!), where n is the number of cards). Fortunately for us, the order of the cards doesn’t matter for what we are doing, so as we build we’ll be able to eliminate hands with the same cards in different orders. Of course, the order the cards come out makes all the difference while playing poker, so this model isn’t helpful to experienced poker players.
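All of the row counts quoted in this post follow from that one combinations formula. As a quick sanity check (in Python rather than M, purely for illustration):

```python
from math import comb, factorial

# C(52, n) = 52! / ((52 - n)! * n!), the number of ways to pull n cards
# from a 52-card deck when order doesn't matter.
def n_hands(n):
    return factorial(52) // (factorial(52 - n) * factorial(n))

assert n_hands(2) == 1_326           # two-card combos
assert n_hands(5) == 2_598_960       # five-card hands
assert n_hands(7) == 133_784_560     # seven-card hands
assert comb(7, 5) == 21              # ways to pull 5 cards from 7

# One row per card of every 5-of-7 subset: the >14 billion rows mentioned above.
assert n_hands(7) * comb(7, 5) * 5 == 14_047_378_800
```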

You can download the reduced-size version of the file to follow along here

Deck of Cards

In M, lists are the things inside curly brackets, and one could quickly make a list with 14 billion values in one line – {1..14000000000} – but that isn't very useful.  Instead, we begin building our deck of cards with a much shorter list – {2..14} – representing cards 2 through 10, J, Q, K, A.  We convert that to a table with the menu button in Power Query and add a custom column with another list for the four suits – ={"C", "D", "H", "S"}.  We expand that list to make the cross join of ranks and suits: a deck of 52 cards.  To simplify things later for both the M and DAX functions, the rank is prefixed with a "." (for example, the two of clubs shows as ".2C").  The "." can easily be removed later with Replace Values, but it avoids errors in the search/find M and DAX functions we'll end up using (e.g., "12C" contains "2C," but ".12C" does not contain ".2C").
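The deck construction is easy to sanity-check outside of Power Query.  Here's a small Python sketch (illustrative only; the post builds this in M) that performs the same cross join and demonstrates why the "." prefix matters for substring searches:

```python
# Build a 52-card deck as the cross join of ranks {2..14} and four suits,
# prefixing each card with "." to make substring searches unambiguous.
ranks = range(2, 15)            # 2..10, J(11), Q(12), K(13), A(14)
suits = ["C", "D", "H", "S"]
deck = [f".{r}{s}" for r in ranks for s in suits]

print(len(deck))                # 52 cards
print("2C" in "12C")            # True  - raw ranks cause false matches
print(".2C" in ".12C")          # False - the "." prefix prevents them
```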

Two Cards Table

The query to make the table of two-card combos starts with a reference to the Deck query ("=Deck"); we then simply add a custom column that also references the Deck query and expand it as new rows (52 × 52 rows).  If these two columns are named "C1" and "C2", we can now make a new list in a third custom column called "TwoList" with "={[C1], [C2]}".  Because we can't be dealt the same card twice, and because the order of cards doesn't matter, we can clean this up with the following steps using a couple of the M list functions:

  • Sort the TwoList above with "=List.Sort([TwoList])"
  • Remove duplicates from that column (yes, Remove Duplicates works on lists too)
  • Add a column with "=List.Count(List.Distinct([TwoList]))" to find the distinct count of cards in the list
  • Filter out (remove) the rows where the distinct count is 1
  • Add an Index column ("TwoIndex") to be used for relationships/filtering

The above results in a table with the expected 1326 rows – 52!/(50!*2!).
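The same sort/dedupe logic can be mirrored in Python to confirm that row count (a sketch of the logic, not the M query itself):

```python
from itertools import product

ranks, suits = range(2, 15), "CDHS"
deck = [f".{r}{s}" for r in ranks for s in suits]

# Cross join the deck with itself (52 x 52 = 2704 rows), sort each pair so
# order doesn't matter, drop duplicate pairs, then drop pairs that repeat a card.
pairs = {tuple(sorted((c1, c2))) for c1, c2 in product(deck, deck)}
pairs = [p for p in pairs if len(set(p)) == 2]

print(len(pairs))   # 1326 = 52! / (50! * 2!)
```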

Five Cards Table

Start with a reference to the Deck query and add four more custom columns, each referencing the Deck query, expanding each one as you go.  Again, eliminate duplicate lists and lists containing duplicate cards with the same steps as above.  In practice, as this table grows large quickly, I also repeated those steps after 4 cards were added to reduce the number of rows sooner.  We create another sorted list with "=List.Sort({[C1], [C2], [C3], [C4], [C5]})".  Then we remove the card columns, add an index, duplicate the list column, and extract values from the duplicate.  This gives us a table like the one below, with the expected number of rows (2,598,960).

[Screenshot: the resulting Five table]
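That expected count is quick to verify with the formula from earlier (illustrative Python):

```python
from math import comb, factorial

# Expected five-card row count from 52!/((52-n)!*n!), written both ways
five_card_hands = comb(52, 5)
print(five_card_hands)                                      # 2598960
print(factorial(52) // (factorial(47) * factorial(5)))      # the long-hand version
```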

In Part 2 we'll use DAX to score each hand, but DAX doesn't have list functions, so we need to create another table ("FiveExp") in which we expand each card to its own row.  To do this, we start with a reference to the Five query (note how all the queries ultimately feed off the Deck query, so you can downsize the model easily for storage/troubleshooting just by changing the initial lists used to make the deck).  We then expand the FiveList to new rows and split that column at the rightmost character into "Rank" and "Suit" columns that we'll use for the DAX analyses.  The resulting FiveExp table looks as follows:

[Screenshot: the FiveExp table with Rank and Suit columns]
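The split-at-the-rightmost-character step can be sketched like this (`split_card` is a hypothetical helper for illustration, not part of the report):

```python
def split_card(card: str) -> tuple[str, str]:
    # Everything up to the last character is the rank; the last character is the suit.
    return card[:-1], card[-1]

print(split_card(".12C"))   # ('.12', 'C')
print(split_card(".2S"))    # ('.2', 'S')
```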

Seven Cards Table

For this one, we start with a reference to the Two table, remove all but the "TwoIndex" and "TwoList" columns, and add the "FiveIndex" and "FiveList" columns from the Five table.  We can then combine the next two steps by nesting two list functions, creating the sorted list of seven cards as follows:

=List.Sort(List.Combine({[TwoList], [FiveList]}))

From here we again remove duplicate lists and lists with duplicates (and add an index) to get the expected number of rows (133,784,560).  It's easy to type this all out, but by now these queries are getting very slow to load (many rows with many transformation steps).  Fortunately, as the standard deck of playing cards hasn't changed in a while, we'll probably never have to refresh the data.

We're not done yet, though: to score things in DAX later, we need to create all the ways to pull 5 cards from 7 (21 ways, so rows × 21) and then expand each card to a new row (rows × 5), so we can score each five-card hand and find the max (the best possible hand).
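The arithmetic behind those numbers can be checked in a few lines (illustrative Python):

```python
from math import comb

seven_hands = comb(52, 7)                       # every seven-card combination
ways_5_of_7 = comb(7, 5)                        # ways to pull 5 cards from 7
expanded_rows = seven_hands * ways_5_of_7 * 5   # one row per card after expansion

print(seven_hands)     # 133784560
print(ways_5_of_7)     # 21
print(expanded_rows)   # 14047378800, i.e. more than 14 billion
```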

Scoring Table

We are going to use a second list to extract the 5 cards from the list of 7.  Items in a list are indexed starting from zero, so we need a table of lists of all the possible 5-index combinations from the list of the first seven index values ({0..6}).  As with the Two table, we create every combination of "0" to "6" and make a list from them.  We then add the {0..6} list again and use the List.Difference() function to remove the two values in the "TwoLeaveOut" list from those in the "List0to6Index" list.  Below is what the query looks like at this point, with an example difference list shown (first two index values removed).  We then add an index column ("ScoringIndex") and keep just it and the "ScoringList" column.

[Screenshot: the Scoring query, with an example difference list expanded]
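In Python terms, building the Scoring table amounts to the following sketch (the variable names are mine, not from the report):

```python
from itertools import combinations

index_0_to_6 = list(range(7))

# Every pair of index values to leave out, then a List.Difference-style removal
scoring_lists = [
    [i for i in index_0_to_6 if i not in leave_out]
    for leave_out in combinations(index_0_to_6, 2)
]

print(len(scoring_lists))   # 21 ways to keep 5 of 7
print(scoring_lists[0])     # [2, 3, 4, 5, 6] (first two index values removed)
```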

SevenExp Table

For our last and biggest table, we start with a reference to the Seven table and remove all but the index and list columns.  We then add a column with the Scoring table and expand it (133M rows × 21).  Now we use the List.Transform() function, which is probably the most powerful/versatile of the list functions.  To reference an individual value in a list, you can use the syntax {List}{indexposition}, so we can pull the five index positions from a list of seven with the following M code in the Add Custom Column popup:




= let templist = [SevenList] in
    List.Transform([ScoreList], each templist{_})

// the "_" is where "each" of the index list values goes when it's their turn

In the above, we first create a variable out of SevenList (so we can reference it inside the List.Transform()), which iterates through each of the ScoreList values (the 5 index values), extracting that index position from the variable "templist" and placing the results in a new list.  One can do simple transforms or complex formulae, which is what makes List.Transform() so powerful.  Thanks to one of Imke's posts for showing me that one.
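List.Transform's index extraction maps directly onto a Python comprehension; here's an illustrative sketch with a made-up hand:

```python
templist = [".2C", ".5D", ".9H", ".11S", ".13C", ".14D", ".3S"]  # a made-up 7-card hand
score_list = [2, 3, 4, 5, 6]   # one of the 21 index lists (positions 0 and 1 left out)

# Equivalent of List.Transform([ScoreList], each templist{_}):
# pull each index position out of templist into a new list
five_cards = [templist[i] for i in score_list]
print(five_cards)   # ['.9H', '.11S', '.13C', '.14D', '.3S']
```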

For the final step, we extract the new lists of 5 cards into rows, resulting in a table with more than 14 billion rows!

To be honest, I never did hit that button on the Wonkavator (i.e., I never loaded the full SevenExp table); I didn't have time to wait, and my computer couldn't have handled the DAX expressions (see Part 2) on such big tables even if I had.  At first, I thought I'd let the cloud do the work and publish it, but we can't modify the query once it's published (and I couldn't publish without first loading the queries).  I also tried replacing the initial {2..14} in the Deck query with a SharePoint Online list holding those values, one I could make small ({2..5}) for my computer to refresh, then publish, change the SP list, and let it auto-refresh.  However, the cloud timed out.  I even set up a virtual machine with my free Azure trial, to no avail.  Note: once you've set up a VM, it's time to consider that data has become more than a hobby.  Fortunately, I was able to come up with a way to reduce the dataset significantly, avoiding all that extra processing and file size.  Note: the "Seven_Full" and "SevenExp_Full" queries are in the provided file, but I added a Keep Rows step to keep them small instead of disabling load (this keeps my relationships and calculated columns intact and avoids measure errors).

Reduced Dataset

When scoring poker hands, the suit only matters for flushes, and having 4 equivalent suits results in many variants of equivalent hands in the Seven table above.  I was able to simplify things by having only two suits – "S" (for suited, or spades) and "N" (for not suited, or not spades).  I built separate queries for the flush hands ("S" cards; the "FiveFlush" and "SevenFlush" queries) and non-flush hands ("N" cards; the "FiveNot" and "SevenNot" queries).  In each, a column with the number of hands represented is also calculated.  For example, the hand ".2S|.3S|.4S|.5S|.6S" represents 4 hands (the 6-high straight flushes in each suit), while ".2N|.3N|.4N|.5N|.6N" represents 1024 suit assignments (4^5); to avoid double counting, we subtract the 4 flush hands from that to get 1020 equivalent hands.  This one is simple, but it gets more complicated when you have multiple "N" cards of the same rank, and for 7-card hands with 5 "S" cards and 0-2 "N" cards.  I won't go through every step and will just mention some of the more interesting M expressions.  You can see all the steps in this file.
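Those equivalent-hand counts can be reproduced with a couple of lines (illustrative Python):

```python
# ".2S|.3S|.4S|.5S|.6S": one suited pattern stands in for the 4 straight flushes
suited_equiv = 4

# ".2N|.3N|.4N|.5N|.6N": each of the 5 unsuited cards can take any of 4 suits,
# minus the 4 assignments that would actually be flushes
unsuited_equiv = 4 ** 5 - 4

print(suited_equiv, unsuited_equiv)   # 4 1020
```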

While this approach greatly reduces the dataset, it complicates both the M and DAX expressions.   We build the queries just like before, but a few steps are different/more complex.  For example:

There can be only one of each S card in a hand, so we first have to filter each list to just the S cards before we find the distinct count.  For this, we use List.FindText() as follows, followed by a filter step:

=List.Distinct(List.FindText([FiveList], "S"))

Next, while we can have multiple N cards of each rank, we can have no more than 4 of them (e.g., 4 ".7N"s would be 4 of a kind).  The M code below makes a list of the frequency with which each N-card rank is found in the list and then takes the max of that list.  We then filter out the rows where that max is >4.

= let
    templist = [SevenList]  // the hand's list of N cards
in
    // make a list of how many times each card is found in the list, then take the max with List.Max()
    List.Max(List.Transform(List.Distinct(templist), each List.Count(List.FindText(templist, _))))
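The same max-multiplicity check looks like this in Python (a simplified sketch using exact matches rather than List.FindText's substring search):

```python
def max_rank_multiplicity(n_cards: list[str]) -> int:
    # How many times does the most frequent N card appear?
    return max(n_cards.count(card) for card in set(n_cards))

hand = [".7N", ".7N", ".7N", ".7N", ".9N", ".10N", ".11N"]
print(max_rank_multiplicity(hand))   # 4 (allowed; anything greater is filtered out)
```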

To tie our model back to a full standard deck with four suits, we can calculate for each hand the number of equivalent hands it represents (depending on how many N cards it has and how many of those share a rank).  We can then use the sum of the equivalent-hands column in our DAX measures where we would otherwise have used a COUNTROWS() of the full table.  For the flush hands in the Seven table (hands containing 5-7 "S" cards and 0-2 "N" cards, as the hands with the 6th and/or 7th card not suited need to be represented too), the number of equivalent hands can be calculated with:

[Equiv Hands] (flushes)

= 4 * Number.Power(3, [TwoNDC])

// where [TwoNDC] is a column with the distinct count of N cards in the hand, calculated in a previous step with List.Count(List.Distinct(List.FindText([TwoList.1], "N"))) + 0

For example, a hand with 5 "S" cards and two different "N" cards would represent 36 hands (3 is used in the Number.Power since one of the suits is taken by the "S" cards, so each "N" card picks from the remaining 3 suits).  When the two "N" cards share a rank, only 3 possibilities exist for each of the 4 flush hands (12 total).
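The flush formula is easy to spot-check (illustrative Python; `flush_equiv_hands` is a hypothetical helper):

```python
def flush_equiv_hands(two_ndc: int) -> int:
    # 4 choices of flush suit; each distinct off-suit (N) rank then picks
    # one of the remaining 3 suits
    return 4 * 3 ** two_ndc

print(flush_equiv_hands(2))   # 36 - five S cards plus two different N cards
print(flush_equiv_hands(1))   # 12 - the two N cards share a rank
print(flush_equiv_hands(0))   # 4  - all seven cards suited
```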

For the non-flush hands, we need to calculate the total possible hands and then subtract the number of flush hands possible.  To calculate the number of flush hands, it helps to introduce the concept of a Rank Pattern.  We'll use it a lot in Part 2 for scoring each hand, but it helps now too.  The Rank Pattern is a way to encode the frequency of each rank in a given hand, and it can be calculated as follows in M (you'll see the DAX equivalent in Part 2).  Just look at the last line, in which we raise 10 to the power of the count of each rank and then sum those values.  For example, a full house has a Rank Pattern of 1100 (10^3 + 10^2), while 5 different ranks give a pattern of 50 (5 × 10^1).



= let
    ranklist = List.Transform([SevenList], each Text.Select(_, {"0".."9", "."})),
    countslist = List.Transform(List.Distinct(ranklist), each List.Count(List.FindText(ranklist, _)))
in
    List.Sum(List.Transform(countslist, each Number.Power(10, _)))
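The Rank Pattern calculation translates to Python as follows (an illustrative sketch, not the report's code):

```python
def rank_pattern(hand: list[str]) -> int:
    # Strip the suit character, count each distinct rank,
    # raise 10 to each count, and sum
    ranks = [card[:-1] for card in hand]
    return sum(10 ** ranks.count(r) for r in set(ranks))

print(rank_pattern([".14S", ".14N", ".14N", ".9S", ".9N"]))   # 1100 (full house)
print(rank_pattern([".2N", ".5N", ".7S", ".9N", ".13S"]))     # 50 (five different ranks)
```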

There are only 4 Rank Patterns of 7-card "N" hands that can result in flush hands, and the number of possible flushes is found with a nested if expression (not shown).  To calculate the number of possible hands (from which to subtract the possible flushes), we use the following expression:

[Equiv for N Hands]

= let
    Nlist = [SevenList],
    NCount = List.Count(List.Distinct(Nlist)) + 0,
    equiv = List.Product(List.ReplaceMatchingItems(List.Transform(List.Distinct(Nlist), each List.Count(List.FindText(Nlist, _))), {{1, 4}, {2, 6}, {3, 4}, {4, 1}}))
in
    equiv - [PossibleFlushes]

In the interest of wrapping things up, I won't explain in great detail why this calculation gets us the equivalent number of hands.  Just know that it takes the product of a list generated by replacing the frequency with which each rank is found in the list of N cards with the number of possible suit combinations for that frequency (e.g., there are 4 versions of a hand with a single ".7N" card, but only one version with four ".7N" cards; there are no "S" cards in this query).

Go ahead, read that sentence again; there is a lot going on.
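The product-of-suit-combinations idea can be sketched in Python; the counts 4, 6, 4, 1 in the replacement list are just C(4, count), which is why it works.  (`equiv_n_hands` is a hypothetical helper, and it omits the flush subtraction.)

```python
from math import comb, prod

def equiv_n_hands(n_cards: list[str]) -> int:
    # Count each distinct rank's copies, replace the count with the number of
    # suit combinations C(4, count) - i.e. 1->4, 2->6, 3->4, 4->1 - and multiply.
    ranks = [card[:-1] for card in n_cards]
    return prod(comb(4, ranks.count(r)) for r in set(ranks))

print(equiv_n_hands([".2N", ".5N", ".7N", ".9N", ".13N"]))   # 1024 (4^5, before subtracting flushes)
print(equiv_n_hands([".7N", ".7N", ".7N", ".7N", ".9N"]))    # 4 (the four 7s lock their suits)
```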


With this reduced approach, the Five table has 7,462 rows (348× less than the full 2.6M!), and the Seven table has 190,346 rows (703× less than the full 133M!).  The expanded SevenExp table now has just under 20M rows (far fewer than the original 14B).  Fortunately, the sums of the equivalent-hands columns exactly match the expected "full" table versions, as shown below:

[Screenshot: equivalent-hands sums matching the full-table row counts]

We'll score all this in Part 2, but the image below shows that we get the same probability for each type of hand using both the "full" and "reduced" approaches for the five-card hands (we'll deal with seven-card hands in Part 2).

[Screenshot: probability of each hand type, full vs. reduced approach]

I told you up front that I went way down this rabbit hole, so thanks to anyone who read through it all.  The fact that we can do stuff like this (both the brute-force full tables and the advanced calculations needed for the reduced approach) shows the power of Power BI (#StuffTableauCan'tDo).  Stay tuned for Part 2, where we'll score these hands with DAX (and show an M-based alternative) and build a dashboard to see the best possible hand given the two cards you are dealt versus what the other players might be holding.

We “give away” business-value-creating and escape-the-box-inspiring content like this article in part to show you that we’re not your average “tools” consulting firm. We’re sharp on the toolset for sure, but also on what makes businesses AND human beings “go.”

In three days’ time imagine what we can do for your bottom line. You should seriously consider finding out 🙂

* – unless, of course, you have two turntables and a microphone.  We hear a lot of things are also located there.



Importance of Data Quality: How to Explain it To Your Boss

As an IT professional, you know what data quality means and why it’s important. But does your boss? If not, it’s time to explain the importance of data quality to the higher-ups in your company, in order to ensure that you have the tools and support you need to manage data most effectively.

For the purposes of this article, we’ll assume that your boss lacks special technical expertise related to data management. We’ll also assume that his or her main goal is to drive business value and cut costs.

We know: Those may be somewhat stereotypical assumptions. To be sure, not all bosses are technical know-nothings, and not all of them see the world only in terms of dollars and cents.

Still, to a greater or lesser extent, these are the core challenges that many IT teams face when trying to gain support from management for the tools and processes they need to manage data quality effectively. Even if your boss has an above-average level of technical expertise, that skillset may or may not extend to the nuances of data quality.

That’s why it’s important to develop a strategy for communicating the importance of data quality to your boss. The following pointers can help.


Explain the Importance of Data Quality in Real-World Terms

You might think of data quality issues in terms of database index problems or dirty disk partition tables. But unless your boss works alongside you in the IT trenches on a daily basis, he or she probably doesn’t understand these technical concepts.

That’s why you should talk about the importance of data quality using real-life examples that are easy for someone without deep technical knowledge to understand. For instance, you might say that a data quality problem occurs when a database contains multiple entries for the same person. That’s a pretty simple problem to understand.

Similarly, you could discuss the example of foreign or special characters within words that are formatted incorrectly. Chances are that your boss has seen this problem in action, and can understand why this issue could cause data management challenges.

Emphasize that Data Quality Matters to Customers, Not Just You

Bosses might be sympathetic when you tell them how data quality problems make your job harder. But if they think that the problem stops there, they are less likely to be sympathetic, because a harder job for you does not necessarily mean a problem for the business.

But when you explain that data quality problems can also impact customers — by, for example, leading to lost records or making it hard for support staff to reach clients — bosses are more likely to recognize the imperative of protecting data quality.


Explain that Data Quality Management Requires People, Tools, and Processes

You want your boss to understand that when you talk about the importance of data quality, you’re not just trying to get him or her to sign off on a purchase order for a new tool or server. You instead want holistic support for all of the people, tools, and processes that you need to get data quality right.

Depending on your circumstances, improving data quality may require hiring new IT staff. It may involve purchasing new tools. Or it might require implementing new company-wide data management best practices. No matter what you need, you want to ensure that your boss is ready to help you get it.

Present Data Quality Assurance as a Continuous Process

For similar reasons, it’s important to convey that achieving data quality is not a one-and-done type of task. You don’t just set up a new tool or process and consider your data quality problems permanently solved.

Instead, data quality requires an ongoing commitment, as well as continuous monitoring and improvement. This means you’ll need your boss to be on board with data quality for the long haul. Revisiting the data quality conversation on a routine basis with your boss, and presenting data to show how data quality is improving over time — and could be improved further — is one way to help him or her appreciate the continuous nature of data quality.

Stress that Data Quality Challenges are Getting Harder, Not Easier

Finally, you want to make sure that your boss understands that data quality problems are not something that will go away on their own, or something you'll just learn to cope with. On the contrary, they tend to grow in scope, due to the ever-increasing volume of data that companies collect, as well as the need to integrate data across diverse IT infrastructures composed of different types of legacy and modern technologies.

We also have a new eBook focused on Strategies for Improving Big Data Quality available for download. Take a look!


Syncsort Blog

Coast Autonomous’ self-driving shuttles are boring, and that’s by design

Between 47th and 48th street in the heart of Times Square, Coast Autonomous, a startup based in Pasadena, California, today showed off the fruit of its six-year research project: a slow-moving, self-driving shuttle designed to ferry folks from destination to destination at speeds of around 25 miles per hour.

I stopped by and hitched a ride down the block.

It wasn’t the most exciting demo — concrete planters separated the featureless P-1 shuttle, which looks sort of like a miniature bus, from Manhattan’s rush hour traffic and curious onlookers — and the shuttle moved only up and down 47th street. But that was sort of the point.

“Self-driving cars should be boring,” chief technology officer Pierre Lefèvre told me in an interview. “Nobody really wants the alternative.”

Just because it's boring doesn't mean it's uncomfortable. The air-conditioned P-1 trades wheel axles for hubs with electric motors, and lacks a steering wheel, pedals, and dashboard, allowing it to accommodate a wider-than-average cabin. It also boasts a reconfigurable, nontraditional seating arrangement that has passengers sitting across from one another in a semicircle, opposite the shuttle's doorway.

Coast Autonomous claims it can fit a maximum of 14 seated passengers and six standing, but it felt a bit cozy with five (four journalists and Pierre).

For the purposes of the demo, Pierre started and stopped the P-1 with an Xbox controller paired wirelessly to a console embedded in the ceiling. (In the future, the console's screen will display route information.) He didn't drive it, though — lidar sensors, wireless transceivers, GPS, cameras, and an AI software platform developed in-house helped the shuttle traverse the geofenced area, recognizing road signs and traffic lights and communicating with V2I (vehicle-to-infrastructure) sensors as it went.

Still, Coast Autonomous isn't taking any chances. Before it deploys a shuttle in a city, it uses a car-mounted sensor array to map the route, constructing a 3D model of the surroundings. Shuttles come to an immediate stop when they encounter pedestrians or objects in their way, and as they drive, remote operators monitor their progress, ready to step in and take control in the event of an emergency.

The end goal is to minimize the impact on car and pedestrian traffic around campuses, airports, business parks, theme parks, resorts, and city centers, Pierre said. To that end, the P-1 lasts up to five hours on a charge with air conditioning (and ten hours without); it's stored in containers and charged wirelessly when not in use, and it's programmed to run on a fixed loop during peak hours and on demand as streets become less congested. When they're deployed commercially, passengers will be able to use Coast Autonomous' mobile app to specify pickup locations and destinations.


The Times Square demo wasn’t Coast Autonomous’ first rodeo. It’s run over 60 self-driving demonstrations in seven countries, moving over 120,000 passengers.

The numbers are impressive, but it's a cutthroat industry. Mercedes-Benz maker Daimler recently announced that it'll deploy self-driving shuttles in San Francisco by 2019. Another competitor, French driverless shuttle maker Navya, is already testing vehicles in Las Vegas, Ann Arbor, Austin, and elsewhere.

And that's just the autonomous shuttle sector. Google subsidiary Waymo's more than 600 Fiat Chrysler Pacifica minivans have driven more than seven million road miles; General Motors plans to launch an autonomous car ridesharing service next year; and a self-driving startup raised $102 million last week to test self-driving cars in Beijing.

But despite the momentum, it hasn't exactly been smooth sailing for the autonomous car industry. The National Highway Traffic Safety Administration put a temporary halt to demonstrations last year while it investigated an accident involving one of Navya's Las Vegas shuttles. And in March, an Uber-developed driverless car collided with a pedestrian, killing her.

Still, Coast Autonomous is confident that its technology is ready for public roads. The P-1 uses off-the-shelf parts, which makes it less expensive to produce and maintain than similar solutions on the market. And because it travels at low speeds and drives in a comparatively controlled environment, Pierre claims that it’s inherently safer than the competition.

“We are convinced that the deployment of driverless vehicles in low-speed environments, like our P-1 Shuttle and autonomous golf cart, are much closer to commercialization than self-driving vehicles designed to travel at highway speeds,” Adrian Sussmann, managing director at Coast Autonomous, said in a statement. “This is mainly because operating at low speeds is much safer, requires less sensors, and is therefore much more cost effective. We are already seeing significant interest and expect to deploy our first fleets in 2019.”


Big Data – VentureBeat