
How IBM builds an effective data science team

December 23, 2017   Big Data

Data science is a team sport. This sentiment rings true not only in our experience within IBM but also with our enterprise customers, who often ask us for advice on how to structure data science teams within their own organizations.

Before that can be done, however, it’s important to remember that the various skills required to execute a data science project are both rare and distinct. That means we need to make sure that each team member can focus on what he or she does best.

Consider this breakdown of a data science project, along with the skills required for each role:

While each role is certainly distinct, each team member does need to have T-shaped skills — meaning they’ll need to have depth in their own role but also a cursory understanding of the adjacent roles.

Let’s explore each role from the chart in a little more depth.

Product owners

Product owners are the subject matter experts, with a deep understanding of the particular business sector and its concerns. In some instances, the primary role of the product owner will be on the business side, while they work periodically with the data science team to address a specific data science problem or set of problems before cycling back into the broader role.

In fact, cycling back to the normal role is a benefit to the data science team. It means the product owner acts as the ultimate end user of the models and can offer concrete feedback and requests. It also means the product owner can advocate for data science from within the business units themselves.

Product owners are most often responsible for:

  • Defining the business problem and working with data scientists to define the working hypothesis
  • Helping to locate data and data stewards as necessary
  • Brokering and resolving data quality issues

Data engineers

Data engineers are the wizards who move all the data to the center of gravity and connect that data via services and message queues. They also build APIs to make the data generally available to the enterprise, and they’re responsible for engineering the data onto the platform that best fits the needs of the team. With data engineers, we look for these top three skills:

  • Proficient in at least three of the following: Python, Scala, Java, Ruby, SQL
  • Proficient at consuming and building REST APIs
  • Proficient at integrating predictive and prescriptive models into applications and processes

Data scientists

Data scientists tend to fill one of two distinct roles: machine learning engineers and decision optimization engineers. Because market conditions have made “data scientist” such a hot role, the title gets applied loosely, and making this distinction removes some of that confusing wiggle room. (For our detailed thoughts on this, see our recent article on VentureBeat.)

Machine learning engineers

Machine learning engineers build the machine learning models, which means identifying the important data elements and features to use in each model. They determine which types of models to use, and they test the accuracy and precision of those models. They’re also responsible for the long-term monitoring and maintenance of the models. They need these top three skills:

  • Training and experience applying probability and statistics
  • Experience in data modeling and evaluation and a deep understanding of supervised and unsupervised machine learning
  • Experience programming in at least two of the following: Python, R, Scala, Julia, or Java, with a preference for Python expertise
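
As a minimal sketch of what testing the accuracy and precision of a model can mean for a binary classifier, the Python below computes both from true and predicted labels. The function name `accuracy_and_precision` and the sample labels are illustrative assumptions, not part of any IBM tooling described in this article.

```python
def accuracy_and_precision(y_true, y_pred, positive=1):
    """Return (accuracy, precision) for two equal-length label sequences.

    Hypothetical helper for illustration only.
    accuracy  = correct predictions / all predictions
    precision = true positives / predicted positives
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one label is required")

    correct = sum(t == p for t, p in zip(y_true, y_pred))
    true_pos = sum(t == positive and p == positive
                   for t, p in zip(y_true, y_pred))
    pred_pos = sum(p == positive for p in y_pred)

    accuracy = correct / len(y_true)
    # If the model never predicts the positive class, precision is
    # undefined; this sketch reports 0.0 in that case.
    precision = true_pos / pred_pos if pred_pos else 0.0
    return accuracy, precision


# Made-up labels: 1 = positive class, 0 = negative class.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]
acc, prec = accuracy_and_precision(y_true, y_pred)
print(f"accuracy={acc:.2f} precision={prec:.2f}")
```

The two numbers answer different questions: accuracy counts every correct prediction, while precision asks how often a positive prediction was right, which matters more when false alarms are costly.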

Decision optimization engineers

The skills and experience of decision optimization engineers overlap with those of machine learning engineers, but the differences are important. Decision optimization engineers need these top three skills:

  • Experience applying mathematical modeling and/or constraint programming to a range of industry problems
  • Proficient programming skills in Python and the ability to apply predictive models as input into decision optimization problems
  • Experience building Monte Carlo simulation/optimization for what-if scenario analysis
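
To make the last skill concrete, here is a small sketch of a Monte Carlo what-if analysis in Python. It assumes a made-up inventory problem: demand is uncertain, and we compare candidate order quantities by simulated average profit. The demand distribution, prices, and function names are all hypothetical and chosen only for illustration.

```python
import random


def simulate_profit(order_qty, n_trials=10_000, seed=0,
                    mean_demand=100.0, sd_demand=20.0,
                    unit_cost=6.0, unit_price=10.0):
    """Estimate the average profit of ordering `order_qty` units.

    Demand is drawn from a normal distribution truncated at zero;
    this is an assumption for the sketch, not a recommendation.
    """
    rng = random.Random(seed)  # fixed seed keeps runs reproducible
    total = 0.0
    for _ in range(n_trials):
        demand = max(0.0, rng.gauss(mean_demand, sd_demand))
        sold = min(order_qty, demand)  # cannot sell more than was ordered
        total += unit_price * sold - unit_cost * order_qty
    return total / n_trials


def best_order(candidates, **scenario):
    """Return the candidate order quantity with the highest simulated profit."""
    return max(candidates, key=lambda q: simulate_profit(q, **scenario))


# What-if: compare the best order under low and high demand uncertainty.
candidates = range(60, 141, 10)
print("low uncertainty :", best_order(candidates, sd_demand=5.0))
print("high uncertainty:", best_order(candidates, sd_demand=40.0))
```

In practice the demand forecast would come from a predictive model rather than a fixed distribution, which is the hand-off between machine learning and decision optimization described above.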

Data journalists

That brings us to data journalists, the team members who help represent the output of the model in the context of the data that drove it and who can clearly articulate the business problem at hand. With data journalists, we look for these top three skills:

  • Coding skills in Python, Java, or Scala
  • Experience integrating data and the output of predictive and prescriptive models within the context of a business problem
  • Proficiency with data parsing, scraping, and wrangling

If you can gather together a team with these essential skills — and if you can ensure they collaborate well and maintain a meaningful understanding of one another’s work — you’ll be well on your way to uncovering the insights and understanding that can supercharge whatever organization you’re leading.

Without them, you could be flying blind.

Seth Dobrin is vice president and chief data officer at IBM Analytics.


Big Data – VentureBeat
