Category Archives: Data Mining

Why You Should Already Have a Data Governance Strategy

Garbage in, garbage out. This motto has been true ever since punched cards and teletype terminals. Today’s sophisticated IT systems depend just as much on good quality data to bring value to their users, whether in accounting, production, or business intelligence. However, data doesn’t automatically format itself properly, any more than it proactively tells you where it’s hiding or how it should be used. No, data just is. If you want your business data to satisfy criteria of availability, usability, integrity, and security, you need a data governance strategy.

Data governance in general is an overarching strategy for organizations to ensure the data they use is clean, accurate, usable, and secure. Data stakeholders from business units, the compliance department, and IT are best positioned to lead data governance, although the matter is important enough to warrant CEO attention too. Some organizations go as far as appointing a Data Governance Officer to take overall charge. The high-level goal is to have consistent, reliable data sets to evaluate enterprise performance and make management decisions.

Ad-hoc approaches are likely to come back to haunt you. Data governance has to become systematic, as big data multiplies in type and volume, and users seek to answer more complex business questions. Typically, that means setting up standards and processes for acquiring and handling data, as well as procedures to make sure those processes are being followed. If you’re wondering whether it’s all worth it, the following five reasons may convince you.


Reason 1: Ensure data availability

Even business intelligence (BI) systems won’t look very smart if users cannot find the data needed to power them. In particular, self-service BI means that the data must be easy enough to locate and to use. After years of hearing about the sinfulness of organizational silos, it should be clear that even if individual departments “own” data, the governance of that data must be done in the same way across the organization. Authorization to use the data may be restricted, as in the case of sensitive customer data, but users should not be left unaware of its existence when it could help them in their work.

Availability is also a matter of having appropriate data that is easy enough to use. With a trend nowadays to store unstructured data from different sources in non-relational databases or data lakes, it can be difficult to know what kind of data is being acquired and how to process it. Data governance is therefore a matter of first setting up data capture to acquire what your enterprise and its different departments need, rather than everything under the sun. Governance then also ensures that data schemas are applied to organize data when it is stored, or that tools are available for users to process data, for example to run business analytics from non-relational (NoSQL) databases.

Reason 2: Ensure users are working with consistent data

When the CFO and the COO work from different sets of data and reach different conclusions about the same subjects, things are going to be difficult. The same is true at all other levels in an enterprise. Users must have access to consistent, reliable data, so that comparisons make sense and conclusions can be checked. This is already a good reason for making sure that data governance is driven across the organization, by a team of executives, managers, and data stewards with the knowledge and authority to make sure the same rules are followed by all.

Global data governance initiatives may also grow out of attempts to improve data quality at departmental levels, where individual systems and databases were not planned for information sharing. The data governance team must deal with such situations, for instance, by harmonizing departmental information resources. Increased consistency in data means fewer arguments at executive level, less doubt about the validity of data being analyzed, and higher confidence in decision making.

Reason 3: Determining which data to keep and which to delete

The risks of data hoarding are the same as those of physical hoarding. IT servers and storage units full of useless junk make it hard to locate any data of value or to do anything useful with it afterwards. Users use stale or irrelevant data as the basis for important business decisions, IT department expenses mushroom, and vulnerability to data breaches increases. The problem is unfortunately common. 33% of the data stored by organizations is simply ROT (redundant, obsolete, or trivial), according to the Veritas Data Genomics Index 2017 survey.

Yet things don’t have to be that way. Most data does not have to be kept for decades, “just in case.” As an example, retailing leader Walmart uses only the last four weeks’ transactional data for its daily merchandising analytics. It is part of good data governance strategy to carefully consider which data is important to the organization and which should be destroyed. Data governance also includes procedures for employees to make sure data is not unnecessarily duplicated, as well as policies for systematic data retirement (for instance, for archiving or destruction) according to age or other pertinent criteria.
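As a rough illustration of retirement "according to age", here is a minimal Python sketch. The `Record` type, the one-year archive threshold, and the seven-year destruction threshold are all hypothetical; a real policy would take its thresholds from legal and business requirements.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Record:
    name: str
    last_used: date


def retirement_action(record, today, archive_after_days=365, destroy_after_days=365 * 7):
    """Return 'keep', 'archive', or 'destroy' based on how long the data has sat unused."""
    age_days = (today - record.last_used).days
    if age_days >= destroy_after_days:
        return "destroy"
    if age_days >= archive_after_days:
        return "archive"
    return "keep"


# Hypothetical data sets with different ages.
today = date(2018, 6, 1)
records = [
    Record("daily_sales", today - timedelta(days=10)),
    Record("2016_campaign", today - timedelta(days=700)),
    Record("legacy_export", today - timedelta(days=3000)),
]
actions = {r.name: retirement_action(r, today) for r in records}
```

Running a check like this on a schedule, and logging each decision, is one way to make the retirement policy auditable rather than ad hoc.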

Reason 4: Resolve analysis and reporting issues

An important dimension in data governance is the consistency across an organization of its metrics, as well as the data driving them. Without clearly recorded standards for metrics, people may use the same word, yet mean different things. Business analytics are a case in point, when analytics tools vary from one department to another. Self-service analytics or business intelligence can be a boon to an enterprise, but only if people interpret metrics and reports in a consistent way.

When reports lack clarification, the temptation is often to blame technology. The root cause, however, is often the mis-configuration of the tools and systems involved. It may even be in their faulty application, as in the case of reporting tools being wrongly applied to production databases, triggering problems in performance that mean that neither transactions nor analytics are satisfactorily accomplished. Ripping out and replacing fundamentally sound systems is not the solution. Instead, improved data governance brings more benefit, faster, and for far less cost.

Reason 5: Security and compliance with laws concerning data governance

Consequences for non-compliance with data regulations can be enormous, especially where private individuals’ information is concerned. A case in point: the European General Data Protection Regulation (GDPR), in force since May 2018, sets non-compliance fines of up to €20 million (some $22 million) or four percent of the offender’s worldwide annual turnover, whichever is higher, for data misuse or breaches affecting European citizens.
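The "whichever is higher" rule is simple arithmetic. The sketch below assumes the regulation's own cap of EUR 20 million (which the dollar figure above approximates) and uses a hypothetical function name; it shows only the upper bound, not how an actual fine would be set.

```python
def max_gdpr_fine_eur(worldwide_turnover_eur, fixed_cap_eur=20_000_000, turnover_percent=4):
    """Upper bound of the fine: the larger of the fixed cap and a share of turnover."""
    turnover_based = worldwide_turnover_eur * turnover_percent // 100
    return max(fixed_cap_eur, turnover_based)


small_company = max_gdpr_fine_eur(100_000_000)    # 4% is 4 million, so the fixed cap applies
large_company = max_gdpr_fine_eur(1_000_000_000)  # 4% is 40 million, which exceeds the cap
```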

Effective data governance helps an organization to avoid such issues, by defining how its data is to be acquired, stored, backed up, and secured against accidents, theft, or misuse. These definitions also include provision for audits and controls to ensure that the procedures are followed. Realistically, organizations will also conduct suitable awareness campaigns to make sure that all employees working with confidential company, customer, or partner data understand the importance of data governance and its rules. Education and awareness campaigns will become increasingly important as user access to self-service solutions increases, as will the levels of data security already inherent in those solutions.


If you think about data as a strategic asset, the idea of governance becomes natural. Company finances must be kept in order with the necessary oversight and audits, workplace safety must be guaranteed and respect the relevant regulations, so why should data – often a key differentiator and a confidential commodity – be any different? As IT self-service and end-user empowerment grow, the importance of good data governance increases too. Business user autonomy in spotting trends and taking decisions can help an enterprise become more responsive and competitive, but not if it is founded on data anarchy.

Effective data governance is also a continuing process. Policy definition, review, adaptation, and audit, together with compliance reviews and quality control, are all regularly effected or repeated as a data governance life cycle. As such, data governance is never finished, because new sources, uses, and regulations about data are never finished either. For contexts such as business intelligence, especially in a self-service environment, good data governance helps users to use the right data in the right way, to generate business insights correctly and take sound business decisions.


Blog – Sisense

Data Science and Visual Analytics for Operations in the Energy Sector


In recent years, oil and gas companies have been challenged to adapt to lower crude prices. With the recent crude price increase, there has never been a better time for energy companies to transform their operations.

From upstream exploration and production to logistics, downstream refining, energy trading, and portfolio investments, there are opportunities for optimization. All of these areas benefit from today’s advances in data science and visual analytics. Over the past few years, many companies were forced to reduce costs or consolidate; it was a period of survival. Now, the successful companies of the future are digitizing smarter.

Driving business operations from analytic insights applies to many facets of the digital energy business including:

Modernized Grids and Smarter Oilfields

With TIBCO Systems of Insight:

  • Analysts can create self-service analytic apps to deliver insights into all aspects of a process, quality, and costs.
  • Data scientists can develop machine learning intelligence into sensors, processes, and equipment to reduce data bottlenecks and take action at the point of impact.  
  • Operations and IT developers can empower more users and scale complex, computationally intensive workloads in the cloud.

Asset Portfolio Value Optimization

Using Spotfire, analysts can invoke smart data wrangling, data science, and advanced geoanalytics to develop accurate valuation of assets and resource plays for optimal capital allocation. Spotfire community templates for decline curve analysis and geoanalytics enable these sophisticated calculations to run with point-click configuration, invoking Spotfire’s powerful inbuilt TIBCO Runtime R engine.

Predictive Maintenance, Process Control, and Process Optimization

Spotfire and TIBCO Statistica can readily analyze large amounts of data from internal and external IoT data sources. The combination of your industry expertise with TIBCO’s latest visual, predictive, and prescriptive analytics techniques enables you to address all of your process and equipment surveillance challenges.

Business Operations and Supply Chain Management

Provide managers, engineers, and business users self-service access to data, visualizations, and analytics for visibility across the entire value chain. Respond to evolving needs and deliver actionable insights that enable people and systems to make smarter decisions. Reduce time spent on compliance reporting and auditing.

Energy Trading

Develop insights faster and bring clarity to business issues in a way that gets all the traders, managers, and financial decision-makers on the same page quickly. For companies trading in multiple commodities, TIBCO Connected Intelligence can be deployed as a single analytics platform that brings a consolidated view of risks and positions, compliance, and results. Read more about it.

Learn More Firsthand

Listen to TIBCO’s Chief Analytics Officer Michael O’Connell explain how companies are leveraging the latest Spotfire innovations, optimizing exploration and production efforts and investments, and gaining a decisive advantage. And hear Stephen Boyd from Chevron present a real-world case study on TIBCO Connected Intelligence. Register now for the quarterly Houston area TIBCO Spotfire® User Group Meeting taking place on Thursday, June 14th, at the Hilton Garden Inn. Or find a Spotfire Meetup near you.


The TIBCO Blog

Why Low-Code is a Good Fit for Marketing


Digital Transformation


The TIBCO Blog

Pentaho 8.1 is available


The team has once again over-delivered on a dot release! Below are what I think are the many highlights of Pentaho 8.1, as well as a long list of additional updates.
If you don’t have time to read to the end of my very long blog, just save some time and download it now. Go get your Enterprise Edition or trial version from the usual places.

For CE, you can find it on the community home!


One of the biggest themes of the release: increased support for Cloud. A lot of vendors are fighting to become the best providers, and what we do is try to make sure Pentaho users watch all that comfortably sitting on their chairs, having a glass of wine, and really not caring about the outcome. Like in a lot of areas, we want to be agnostic – which is not to say that we won’t leverage the best of each – and really focus on logic and execution.
It’s hard to do this as a one-time effort, so we’ve been adding support as needed (and by “as needed” I really mean based on the prioritization given by the market and our customers). A big focus of this release was Google and AWS:

Google Storage (EE)

Google Cloud Storage is a RESTful unified storage service for storing and accessing data on Google’s infrastructure. PDI support for importing and exporting data to and from Cloud Storage is now done through a new VFS driver (gs://). You can use it in the several steps that support VFS, as well as browse its contents.
These are the roles required on Google Storage for this to work:
     Storage Admin
     Storage Object Admin
     Storage Object Creator
     Storage Object Viewer
In terms of authentication, you’ll need the following environment variable defined:
From this point on, just treat it as a normal VFS source.


 Google BigQuery – JDBC Support  (EE/CE)

BigQuery is Google’s serverless, highly scalable, low cost enterprise data warehouse. Fancy name for a database, and that’s how we treat it.
In order to connect to it first we need the appropriate drivers. Steps here are pretty simple:
2.      Copy google*.* files from Simba driver to /pentaho/design-tools/data-integration/libs folder
Host Name will default to but your mileage may vary.
Unlike the previous item, authentication doesn’t use the environment variable that Google VFS relies on. Authentication here is done at the JDBC driver level, through a driver option, OAuthPvtKeyPath, set in the Database Connection options, which needs to point to the Google Storage certificate in the P12 key format.
The following Google BigQuery roles are required:
1.      BigQuery Data Viewer
2.      BigQuery User

Google BigQuery – Bulk Loader  (EE)

While you can use a regular table output to insert data into BigQuery that’s going to be slow as hell (who said hell was slow? This expression makes no sense at all!). So we’ve added a step for that: Google BigQuery Loader.
This step leverages Google’s loading abilities, and is processed on Google, not on PDI. So the data, which has to be in Avro, JSON, or CSV format, has to be copied to Google Storage beforehand. From that point on it’s pretty straightforward. Authentication is done via the GOOGLE_APPLICATION_CREDENTIALS environment variable pointing to the Google JSON file.

Google Drive  (EE/CE)

While Google Storage will probably be seen more frequently in production scenarios, we also added support for Google Drive, a file storage and synchronization service that allows users to store files on its servers, synchronize files across devices, and share files.
This is also done through a VFS driver, but given it’s a per user authentication a few steps need to be fulfilled to leverage this support:
     Copy your Google client_secret.json file into the credentials directory listed below and restart. The Google Drive option will not appear as a Location until you do:
o   Spoon: data-integration/plugins/pentaho-googledrive-vfs/credentials directory, then restart Spoon.
o   Pentaho Server: pentaho-server/pentaho-solutions/system/kettle/plugins/pentaho-googledrive-vfs/credentials directory, then restart the server.
     Select Google Drive as your Location. You are prompted to login to your Google account.
     Once you have logged in, the Google Drive permission screen displays.
     Click Allow to access your Google Drive Resources.
     A new file called StoredCredential will be added to the same place where you had the client_secret.json file. This file will need to be added to the Pentaho Server credential location and that authentication will be used

Analytics over BigQuery  (EE/CE, depending on the tool used)

This JDBC connectivity to Google BigQuery, as defined previously for Spoon, can also be used throughout all the other Business Analytics browser and client tools – Analyzer, CTools, PIR, PRD, modeling tools, etc. Some care has to be taken here, though, as BigQuery’s pricing is related to 2 factors:
     Data stored
     Data queried
While the first one is relatively straightforward, the second one is harder to control, as you’re charged according to total data processed in columns selected. For instance, a ‘select *’ query should be avoided if only specific columns are needed. To be absolutely clear, this has nothing to do with Pentaho, these are Google BigQuery pricing rules.
So ultimately, and a bit like we need to do on all databases / data warehouses, we need to be smart and work around the constraints (usually speed and volume, in this case price as well) to best leverage what these technologies have to offer. Some examples are given here:
     By default, there is BigQuery caching and cached queries are free. For instance, if you run a report in Analyzer, clear the Mondrian cache, and then reload the report, you will not be charged (thanks to the BigQuery caching)
     Analyzer: Turn off auto refresh, i.e, this way you design your report layout first, including calculations and filtering, without querying the database automatically after each change
     Analyzer: Drag in filters before levels to reduce data queried (i.e. filter on state = California BEFORE dragging city, year, sales, etc. onto canvas)
     Pre-aggregate data in BigQuery tables so they are smaller in size where possible (to avoid queries across all raw data)
     GBQ administrators can set query volume limits by user, project, etc. (quotas)
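To make the "data queried" point concrete, here is a small back-of-the-envelope sketch in plain Python. It does not call BigQuery; the table, column names, and column sizes are made up, and the only assumption taken from the text above is that a query is charged for the full size of every column it selects.

```python
# Hypothetical per-column sizes (in bytes) for a single table.
COLUMN_BYTES = {
    "order_id": 8 * 10**9,
    "state": 2 * 10**9,
    "city": 4 * 10**9,
    "sales": 8 * 10**9,
    "notes": 200 * 10**9,  # one wide free-text column dominates the table
}


def bytes_processed(selected_columns):
    """Approximate bytes scanned: the full size of every selected column."""
    if selected_columns == ["*"]:
        selected_columns = list(COLUMN_BYTES)
    return sum(COLUMN_BYTES[column] for column in selected_columns)


select_star = bytes_processed(["*"])           # scans all five columns
narrow = bytes_processed(["state", "sales"])   # scans only what the report needs
```

In this made-up table the narrow query touches 10 GB against 222 GB for `select *`, which is why pre-aggregating and selecting only the needed columns matters for the bill.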

AWS S3 Security Improvements (IAM) (EE/CE)

PDI is now able to get IAM security keys from the following places (in this order):
1.      Environment Variables
2.      Machine’s home directory
3.      EC2 instance profile
This added flexibility helps accommodate different AWS security scenarios, such as integration with S3 data via federated SSO from a local workstation, by providing secure PDI read/write access to S3 without making users provide hardcoded credentials.
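The lookup order above can be pictured as a chain of providers tried one after another. The following is only an illustrative Python sketch, not PDI's implementation: the function names are hypothetical, the dictionaries stand in for the real environment, credentials file, and EC2 metadata, and the two environment variable names follow the usual AWS SDK convention rather than anything confirmed in this post.

```python
def from_environment(env):
    # Assumed variable names (AWS SDK convention); not taken from this post.
    key, secret = env.get("AWS_ACCESS_KEY_ID"), env.get("AWS_SECRET_ACCESS_KEY")
    return (key, secret) if key and secret else None


def from_home_directory(profiles):
    # 'profiles' stands in for a parsed credentials file in the home directory.
    return profiles.get("default")


def from_instance_profile(metadata):
    # 'metadata' stands in for what an EC2 instance profile would supply.
    return metadata.get("role_credentials")


def resolve_credentials(env, profiles, metadata):
    """Try each source in order and return the first one that yields credentials."""
    providers = [
        ("environment", lambda: from_environment(env)),
        ("home directory", lambda: from_home_directory(profiles)),
        ("instance profile", lambda: from_instance_profile(metadata)),
    ]
    for source, provider in providers:
        credentials = provider()
        if credentials is not None:
            return source, credentials
    raise LookupError("no credentials found in any source")


# No environment variables set, so the home directory wins over the instance profile.
source, creds = resolve_credentials(
    env={},
    profiles={"default": ("key-from-file", "secret-from-file")},
    metadata={"role_credentials": ("role-key", "role-secret")},
)
```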
The IAM user secret key and access key can be stored in one place so they can be leveraged by PDI without repeated hardcoding in Spoon. These are the environment variables that point to them:


Big Data / Adaptive Execution Layer (AEL) Improvements


Bigger and Better (EE/CE)

AEL provides spectacular scale out capabilities (or is it scale up? I can’t cope with these terminologies…) by seamlessly allowing a very big transformation to leverage a clustered processing engine.
Currently we have support for Spark through the AEL layer, and throughout the latest releases we’ve been improving it in 3 distinct areas:
     Performance and resource optimizations
o   Added Spark Context Reuse that, under certain circumstances, can speed up startup by up to 5x, proving especially useful during development
o   Spark History Server integration, providing centralized administration, auditing, and performance reviews of the transformations executed in Spark
o   Ability to pass customized Spark properties down to the cluster, allowing finer-grained control of the execution process
     Increased support for native steps (eg, leveraging the spark specific group by instead of the PDI engine one)
     Adding support for more cloud vendors – and we just did that for EMR 5.9 and MapR 5.2
This is the current support matrix for Cloud Vendors:


Sub Transformation support (EE/CE)

This one is big, as it was the result of a big and important refactor of the kettle engine. AEL now supports executing sub transformations through the Transformation Executor step, a long-standing request since the times of good-old PMR (Pentaho Map Reduce).

Big Data formats: Added support for Orc (EE/CE)

This is not directly related to AEL, but in most of the use cases where we want AEL execution we’ll need to read data in a big data specific format. In previous releases we added support for Parquet and Avro, and we have now added support for ORC (Optimized Row Columnar), a format favored by Hortonworks.
Like the others, ORC will be handled natively when transformations are executed in AEL.

Worker Nodes (EE)


Jumping from scale-out to scale-up (or the opposite, like I mentioned, I never know), we continue to do lots of improvements on the Worker Nodes project. This is an extremely strategic project for us as we integrate with the larger Hitachi Vantara portfolio.
Worker nodes allow you to execute Pentaho work items, such as PDI jobs and transformations, with parallel processing and dynamic scalability with load balancing in a clustered environment. It operates easily and securely across an elastic architecture, which uses additional machine resources as they are required for processing, operating on premise or in the cloud.
It uses the Hitachi Vantara Foundry project, which leverages popular technologies under the hood such as Docker (Container Platform), Chronos (Scheduler) and Mesos/Marathon (Container Orchestration).
For 8.1 there are several other improvements:
     Improvements in monitoring, with accurate propagation of Work Item status
     Performance improvements by optimizing the startup times for executing the work items
     Customizations are now externalized from docker build process
     Job clean up functionality


In Pentaho 8.0 we introduced a new paradigm to handle streaming datasources. The fact that it’s a permanently running transformation required a different approach: The new streaming steps define the windowing mode and point to a sub transformation that will then be executed on a micro batch approach.
That works not only for ETL within the kettle engine but also in AEL, enabling spark transformations to feed from Kafka sources.

New Streaming Datasources: MQTT, and JMS (Active MQ / IBM MQ) (EE/CE)

Leveraging the new streaming approach, there are 2 new steps available – well, one new and one (two, actually) refreshed.
The new one is MQTT – Message Queuing Telemetry Transport – an ISO standard publish-subscribe-based messaging protocol that works on top of the TCP/IP protocol. It is designed for connections with remote locations where a “small code footprint” is required or the network bandwidth is limited. Alternative IoT centric protocols include AMQP, STOMP, XMPP, DDS, OPC UA, and WAMP.


There are 2 new steps – MQTT Input and MQTT Output, that connect with the broker for consuming and publishing back the results.
Other than this new, IoT centered streaming source, there are 2 new steps, JMS Input and JMS Output. These steps replace the old JMS Consumer/Producer and the IBM Websphere MQ steps, supporting, in the new mode, the following message queue platforms:
     IBM MQ

Safe Stop (EE/CE)

This new paradigm to handle streaming sources introduced a new challenge that we never had to face. Usually, when we triggered jobs and transformations, they had a well-defined start and end; our stop functionality was used when we wanted to basically kill a running process because something was not going well.
However, in these streaming use cases, a transformation may never finish. So stopping a transformation the way we’ve always done – by stopping all steps at the same time – could have unwanted results.
So we implemented a different approach – We added a new option to safe stop a transformation implemented within Spoon, Carte and the Abort step, that instead of killing all the step threads, stops the input steps and lets the other steps gracefully finish the processing, so no records currently being processed are lost.
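The idea can be sketched in a few lines of Python. This is not kettle code; it is a toy pipeline with hypothetical step functions, where the "safe stop" only signals the input step and then waits for the downstream step to drain whatever is already in flight.

```python
import queue
import threading
import time

SENTINEL = object()  # marks the end of the stream for downstream steps


def input_step(out_q, stop_event, produced):
    """Emit rows until asked to stop, then signal end-of-stream."""
    n = 0
    while not stop_event.is_set():
        out_q.put(n)        # blocks while the downstream queue is full
        produced.append(n)
        n += 1
    out_q.put(SENTINEL)


def processing_step(in_q, processed):
    """Keep working until the input step says there is nothing more to read."""
    while True:
        row = in_q.get()
        if row is SENTINEL:
            break
        processed.append(row * 2)


def run_with_safe_stop(run_seconds=0.05):
    rows = queue.Queue(maxsize=10)
    stop_event = threading.Event()
    produced, processed = [], []
    threads = [
        threading.Thread(target=input_step, args=(rows, stop_event, produced)),
        threading.Thread(target=processing_step, args=(rows, processed)),
    ]
    for t in threads:
        t.start()
    time.sleep(run_seconds)
    stop_event.set()    # safe stop: only the input step is told to stop
    for t in threads:
        t.join()        # the downstream step drains everything already queued
    return produced, processed


produced, processed = run_with_safe_stop()
```

Every row the input step emitted ends up processed, which is exactly what killing all step threads at once cannot guarantee.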


This is especially useful in real-time scenarios (for example reading from a message bus). It’s one of those things that when we look back seems pretty dumb that it wasn’t there from the start. It actually makes a lot of sense, so we went ahead and made this the default behavior.

Streaming results (EE/CE)

When we launched streaming in Pentaho 8.0 we focused on the processing piece. We could launch the sub transformation but we could not get results back. Now we have the ability to define which step on the sub-transformation will send back the results to follow the rest of the flow.


Why is this important? Because of what comes next…

Streaming Dataservices (EE/CE)

There’s a new option to run a data service in streaming mode. This will allow the consumers (in this case CTools Dashboards) to get streaming data from this dataservice.


Once defined, we can test these options within the test dataservices page and see the results as they come.


This screen exposes the functionality as it would be called from a client. It’s important to know that the windows that we define here are not the same as the ones we defined for the micro batching service. The window properties are the following:
     Window Size – The number of rows that a window will have (row based), or the time frame that we want to capture new rows to a window (time based).
     Every – Number of rows (row based), or milliseconds (time based) that should elapse before creating a new window.
     Limit – Maximum number of milliseconds (row based) or rows (time based) which will be used to wait for a new window to be generated.
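To get a feel for how Window Size and Every interact in row based mode, here is a tiny Python sketch. It is one plausible reading of the properties above applied to a finite list of rows, not the data service's actual implementation; the function name is made up and the Limit property is deliberately left out.

```python
def row_based_windows(rows, window_size, every):
    """Start a new window every `every` rows; each window holds `window_size` rows."""
    windows = []
    for start in range(0, len(rows) - window_size + 1, every):
        windows.append(rows[start:start + window_size])
    return windows


rows = list(range(6))                               # six incoming rows: 0..5
overlapping = row_based_windows(rows, window_size=3, every=2)
tumbling = row_based_windows(rows, window_size=3, every=3)
```

With `every` smaller than `window_size` the windows overlap (`[0, 1, 2]` then `[2, 3, 4]`); with the two equal, each row lands in exactly one window.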

CTools and Streaming Visualizations (EE/CE)

We took a holistic approach to this feature. We want to make sure we can have a real time / streaming dashboard leveraging what was set up before. And this is where the CTools come in. There’s a new datasource in CDE available to connect to streaming dataservices:


Then the configuration of the component will select the kind of query we want: time based or number-of-records based, window size, frequency, and limit. This gives us good control for a lot of use cases.


This will allow us to then connect to a component the usual way. While this will probably be more relevant for components like tables and charts, ultimately all of them will work.
It is possible to achieve a level of multi-tenancy by passing a user name parameter from the PUC session (via CDE) to the transformation as a data services push-down parameter. This will enable restriction of the data viewed on a user-by-user basis.
One important note is that the CTools streaming visualizations do not yet operate on a ‘push’ paradigm – this is on the current roadmap. In 8.1, the visualizations poll the streaming data service on a constant interval which has a lower refresh limit of 1 second. But then again… if you’re doing a dashboard of this type and need a refresh of 1 second, you’re definitely doing something wrong…

Time Series Visualizations (EE/CE)

One of the biggest use cases for streaming, from a visualization perspective, is time series. We improved the support for CCC for timeseries line charts, so now data trends over time will be shown without needing workarounds.
This applies not only to CTools but also to Analyzer.


Data Exploration Tool Updates (EE)

We’re keeping on our path of improving our Data Exploration Tool. It’s no secret that we want to make it feature complete so that it can become the standard data analysis tool for the entire portfolio.
This time we worked on adding filters to the Stream view.
We’ll keep improving this. Next on the queue, hopefully, will be filters on the model view and date filters!

Additional Updates

As usual, there were several additional updates that did not make it to my highlights above. So for the sake of your time and not creating a 100 page blog – here are even more updates in Pentaho 8.1.
Additional updates:
     Salesforce connector API update (API version 41)
     Splunk connection updated to version 7
     Mongo version updated to 3.6.3 driver (supporting 3.4 and 3.6)
     Cassandra version updated to support version 3.1 and Datastax 5.1
     PDI repository browser performance updates, including lazy loading
     Improvements on the Text and Hadoop file outputs, including limit and control file handling
     Improved logging by removing auto-refresh from the kettle logging servlet
     Admin can empty trash folder of other users on PUC
     Clear button in PDI step search in spoon
     Override JDBC driver class and URL for a connection
     Suppressed the Pentaho ‘session expired’ pop-up on SSO scenarios, redirecting to the proper login page
     Included the possibility to schedule generation of reports with a timestamp to avoid overwriting content
In summary (and wearing my marketing hat) with Pentaho 8.1 you can:
      Deploy in hybrid and multi-cloud environments with comprehensive support for Google Cloud Platform, Microsoft Azure and AWS for both data integration and analytics
      Connect, process and visualize streaming data, from MQTT, JMS, and IBM MQ message queues and gain insights from time series visualizations
      Get better platform performance and increase user productivity with improved logging, additional lineage information, and faster repository access


Pedro Alves on Business Intelligence

F1 Race Recap: Second Place for the Silver Arrows in Canada


It was a beautiful day in Montreal as ten Formula One teams arrived at the Circuit Gilles Villeneuve for the Canadian Grand Prix. After a hard-fought qualifying round, Mercedes-AMG Petronas Motorsport drivers Valtteri Bottas and Lewis Hamilton landed P2 and P4 starting positions on the grid. With Red Bull’s Max Verstappen in front of Hamilton and Ferrari’s Sebastian Vettel taking P1, would Hamilton be able to bring home his seventh Canadian GP win from the second row?

As the cars warmed up in the formation lap, the drivers prepared for a fight around the first several turns.

Despite going wheel-to-wheel with Verstappen, Bottas managed to avoid the overtake and hang on to his 2nd place position. Shortly after the race began, Hamilton began sensing an issue.

Fortunately the brilliant technicians at Mercedes-AMG Petronas Motorsport were able to diagnose and repair his issue, getting him sorted out in seconds and back on the track.

Just as Bottas was making progress to close the gap on Vettel, the Flying Finn drifted off course, costing valuable time.

As the final laps flew by, Bottas held on to 2nd place, as Hamilton settled for 5th.

Despite the setbacks and challenges, Bottas and Hamilton fought it out for 70 laps and managed to bring home some valuable points to continue their lead in the Constructors’ World Championship standings.

Up next: the Grand Prix de France on June 24!

Learn more about TIBCO’s partnership with Mercedes-AMG Petronas Motorsport that provides the team with a data-driven competitive edge.

The TIBCO Blog

Your Services are Decomposing – How Will You Manage Them?


The TIBCO Blog

[Infographic] Everything You Ever Wanted to Know About Donut Charts

The donut chart. Often seen as just another version of a pie chart, the donut chart doesn’t often get much praise and recognition. And that’s fair. There are only a few very specific scenarios in which using a donut chart is the best way to visualize data. And, by the way, I recognize many people would argue there’s never a good time to use a donut chart.

However, today is National Donut Day, and if there was ever a day to let the donut chart take center stage this has got to be it.

So, Happy National Donut Day! Here’s everything you ever wanted to know about donut charts.

And for the record, I’d take a donut over a pie any day of the week.

Blog – Sisense

Collaboratively Innovate with Version Control in Spotfire Analytics

Version control (also known as revision control or source control) for DevOps and software development is a well-adopted practice to eliminate the risk of manual errors and enable teams to work together to develop code. However, many BI analysts and developers aren’t taking advantage of the benefits of version control when it comes to developing analytics reports. Typically, analytics reports are made up of a group of individually-created versions, which are saved and merged manually without source control as a backup. So, how should BI analysts and developers adopt version control and what challenges could this best practice address?

Learning from Software Development Best Practices

One of the most important best practices for software development is “commit early, often, and with clear messages to your future self or others.” In other words, take baby steps and commit after even small chunks of valid code, so that the history is easy to follow and comes with a detailed changelog. This comprehensive history serves as the basis for deciding on the final version of the code. It also matters in an emergency, when you need to discover what went wrong compared to previous versions.

Here are several other best practices for maintaining version control when developing code:

Write Descriptive Commit Messages

In the moment, it’s easy to generate a creative abbreviation. However, these abbreviations need to have meaning now and in the future to both yourself and others. Writing descriptive commit messages helps provide enough information that anyone can understand.
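As a rough illustration of this practice, a team could run a small check on each commit message before accepting it. The sketch below is an assumption made for illustration, not part of any particular tool: the function name `commit_message_problems` and both thresholds are hypothetical, and a real team would tune them to its own conventions.

```python
from typing import List

# Hypothetical thresholds; adjust them to your team's conventions.
MIN_SUBJECT_WORDS = 3
MAX_SUBJECT_LENGTH = 72


def commit_message_problems(message: str) -> List[str]:
    """Return a list of problems found in the subject line of a commit message."""
    lines = message.strip().splitlines()
    subject = lines[0].strip() if lines else ""
    if not subject:
        return ["subject line is empty"]

    problems = []
    # Very short subjects ("fix", "wip") rarely mean anything a month later.
    if len(subject.split()) < MIN_SUBJECT_WORDS:
        problems.append(f"subject has fewer than {MIN_SUBJECT_WORDS} words")
    # Long subjects are hard to scan in a one-line history view.
    if len(subject) > MAX_SUBJECT_LENGTH:
        problems.append(f"subject is longer than {MAX_SUBJECT_LENGTH} characters")
    return problems


if __name__ == "__main__":
    print(commit_message_problems("fix"))
    print(commit_message_problems("Add regional sales filter to revenue dashboard"))
```

A check like this only catches the most obvious cases. Whether a message is truly descriptive still depends on the author.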

Keep File Naming Conventions Consistent

An easy-to-recognize file structure encourages teamwork and establishes a single point of understanding between team members. Elements should be recognizable at a glance to accelerate collaboration and eliminate time that could be wasted trying to identify files in the repository.
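One way to keep names consistent is to agree on a pattern and validate files against it. The following sketch assumes a purely hypothetical convention of `<team>_<report-name>_v<major>.<minor>.dxp`. The pattern, the function name `parse_report_name`, and the example file names are all illustrative assumptions rather than an established standard.

```python
import re
from typing import Optional

# Hypothetical convention: <team>_<report-name>_v<major>.<minor>.dxp
# e.g. "sales_quarterly-revenue_v1.2.dxp"
NAME_PATTERN = re.compile(
    r"^(?P<team>[a-z]+)"
    r"_(?P<report>[a-z0-9]+(?:-[a-z0-9]+)*)"
    r"_v(?P<major>\d+)\.(?P<minor>\d+)"
    r"\.dxp$"
)


def parse_report_name(filename: str) -> Optional[dict]:
    """Return the parts of a conforming file name, or None if it does not conform."""
    match = NAME_PATTERN.match(filename)
    if match is None:
        return None
    return {
        "team": match.group("team"),
        "report": match.group("report"),
        "version": (int(match.group("major")), int(match.group("minor"))),
    }


if __name__ == "__main__":
    for name in ["sales_quarterly-revenue_v1.2.dxp", "Final report (2).dxp"]:
        print(name, "->", parse_report_name(name))
```

Because the name itself carries the team, the report, and the version, a colleague can tell what a file is without opening it.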

Merge Versions Seamlessly

Merging is useful for combining the work of several team members whose code was previously kept separate, and it allows for accelerated delivery cycles.

Version Control for BI Analysts & Developers

BI analysts and developers face these common challenges when building analytical reports:

– Coordinating development, changing enterprise analytics workflows and managing multiple report versions, especially with several contributors, is risky due to the possibility of human error.

– Development cycles and production releases without appropriate governance can lead to mistakes in the final release.

– A lack of standardized development processes can cause conflicts, bottlenecks to collaboration and wasted time due to manual verification.

With these challenges in mind, let’s take a look at how leveraging the best practices of software development can improve your BI and analytics practice. Imagine a team that consists of three business analysts is just about to kick off an Agile project where quick and accurate deliverables are key.

Peter from Germany, Maria from Spain, and Paul from the UK

Although Peter, Maria, and Paul are all very talented Spotfire experts in data analytics and setting KPI targets, they all have different working habits and work in different parts of the world across various time zones. Under these circumstances, it is even more important to align their versions transparently and consistently by integrating version control best practices into their daily routine, including:

– Governing production releases

– Restoring mission-critical deployments in production

– Tracking and coordinating changes by Report Template Authors

– Merging work efficiently

By adopting these best practices, BI analysts and developers can achieve several business benefits:

Accelerated Quality Report Delivery through Better Collaboration

With history tracking, change log generation, and detailed report comparison, multiple BI developers can work on the same report at the same time. With this type of collaboration, customers can receive a quality report at an accelerated rate.

Safeguarded Brainstorming & Elimination of Human Mistakes

With the ability to restore previous versions, users can brainstorm and try ideas safely while building dashboards. BI and analytics developers can eliminate costly human errors by adopting version control.

Enhanced Teamwork across Different Locations and Time Zones

Merging independently developed versions of an analytical report can be risky. With version control, all changes are tracked and users can share and edit the same document at the same time. Ultimately, version control takes collaboration to the next level and makes work across different locations and time zones more effective.

These are just some of the benefits that BI analysts and developers can take advantage of by adopting a software development best practice such as version control. To learn more about this topic and EPAM’s BI Version Control Accelerator for TIBCO Spotfire, a turn-key solution that accelerates the delivery of Spotfire reports by 4-5x, join our webinar series or contact us to request a free demo!

The TIBCO Blog

How T-Mobile Moved from Monoliths to Microservices with TIBCO and HCL


The TIBCO Blog