Thoughts on the 2017 KDNuggets Poll on Data Science Tools

May 27, 2017   BI News and Info
Are you a data scientist who does not know KDNuggets.com?  How is this possible?  OK, go there right now, add a bookmark, and make it part of your daily reading list.  But don’t forget to come back here afterwards to read the rest of this post.

KDNuggets is one of the most popular portals for data science, and is a great source for news and information.  It probably will not be winning a design award any time soon.  But the rich, deep content is why you will go back over and over, and that’s what really matters.

I spend more time on KDNuggets than usual in May, because that’s when the annual KDNuggets poll, “What data science solution did you use in the past 12 months?”, comes out. Gregory Piatetsky-Shapiro, the editor of KDNuggets and one of the best-known data scientists in the world, has been running this poll for 18 years.

Gregory just published the results for 2017, and about 2,900 people have shared their software preferences for data science tools. And as always, there is a lot to learn from those results.

What’s new in data science in 2017?

First things first: RapidMiner was again voted the most popular general data science platform, and this is all thanks to our user community!  33% of all voters said that they are using RapidMiner, which is an amazing result. Many thanks to all of you!

But we know that data scientists use up to 6 different tools in parallel, so besides RapidMiner, what other tools are people using?

Let’s start with the programming languages. It should not come as a surprise that R and Python are the two leading languages for data science.  This year, Python got slightly more votes than R, although the difference may not be significant.  But in general Python has shown the higher growth rates in previous years, and I would not be surprised to see Python take the lead over R in the future.  And then there is of course SQL, which took third place among the programming languages.  SQL will never die, so no surprise here.

Connected to Python’s growth is Anaconda, a Python distribution with package management. Big shout-out to our friends at Anaconda for growing so quickly!

On an infrastructure level, Apache Spark was used by 23% of all data scientists, but Hadoop by only 7%.  And while we are talking about big data, the MLlib library was used by only 5%, far less than many other options.  To be honest, this was a bit of a surprise to me.

Deep Learning is all the rage

Yes, I am guilty of not playing along with the crazy deep learning hype of the past few years.  After all, the technology is much less innovative than most people believe. But I will admit that there is a strong growth trend around deep learning in our field.

This year, more than 32% of all data scientists said that they are using deep learning, up from 18% in 2016 and 9% in 2015.  Roughly doubling every year is impressive growth indeed.
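For anyone who wants to check the growth claim, here is a minimal sketch in Python. It uses only the three percentages quoted above; the function name `yearly_growth` is made up for this illustration, and the 2017 value is treated as exactly 32% even though the poll reports "more than 32%".

```python
# Deep learning usage shares quoted in this post, in percent of respondents.
# Assumption: 2017 is taken as exactly 32, although the poll says "more than 32%".
usage = {2015: 9, 2016: 18, 2017: 32}


def yearly_growth(shares):
    """Map each year to the ratio of its share to the previous year's share."""
    years = sorted(shares)
    return {
        year: shares[year] / shares[prev]
        for prev, year in zip(years, years[1:])
    }


growth = yearly_growth(usage)
for year, factor in growth.items():
    print(f"{year}: {factor:.2f}x the previous year")
```

With these inputs, 2016 comes out at exactly 2x the previous year and 2017 at about 1.78x, so the second step is a little short of a full doubling.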

There are now a dozen or so deep learning libraries.  The most widely used one is of course Google’s TensorFlow, now used by 20% of all data scientists.

RapidMiner’s history with the KDNuggets poll

I view this poll a bit like a sporting event. It won’t make or break a vendor, but I at least take it seriously. I think all vendors should, and it looked like more vendors did this year.

The history of RapidMiner in the poll is interesting as well.  In 2006, our co-founder Ralf Klinkenberg was already asking why YALE was not an option in the poll (YALE was the former name of RapidMiner, and an acronym for “Yet Another Learning Environment”).  Who could have known that only 11 years later machine learning would be all the hype?

RapidMiner was first included in the poll in 2007, and YALE was the most widely used open source platform from the start.  Some of our commercial competitors like SAS and SPSS were ahead of us back then, but thanks to our loyal community and user base this changed quite quickly.  In 2008, we ended up just shy of SPSS Clementine (which later became SPSS Modeler).  We remained in the top 3 for a couple of years, and during that time other open source solutions like R started to gain more traction in the poll.

In 2011, RapidMiner took over first place among all data science platform tools, and we have been able to keep this position since then.  One of the great things, however, is that data scientists now have many different approaches and often mix and match the different solutions.  There are clearly leading data science platforms like RapidMiner, and in addition we have two great programming languages for data science, namely R and Python.

And then there are dozens of libraries like MLlib or TensorFlow, most of them accessible through RapidMiner as well.  So you will be able to find the right tool for your problem, and this is a wonderful situation for data scientists to be in.  Compare this to the software offerings in the earlier years of this poll (check out the links above).

It’s a great time to be a data scientist indeed!

RapidMiner

2017, data, KDNuggets, Poll, Science, thoughts, tools