Learn Things So You Can Build Things — A Data Analyst’s Opinion

This blog post is guest written by Tomi Mester of data36.com.
When it comes to data science, it’s not about what you learn. It’s about what you are able to build with what you’ve learned.
The field of data science has been growing rapidly—especially in the last few years. We see exciting new tools and methods emerge all the time. And while these can be great, I feel that these can cause some confusion as well. Why? Because they make data professionals think about the wrong questions.
Asking the wrong questions
What do I mean by asking the wrong questions?
Examples of wrong questions might be:
- What are the coolest new tools to try out?
- What are the most exciting data science problems nowadays?
- How can we fit these into our business (to experiment with them)?
Instead, we want to ask better questions like:
- What business problems (or opportunities) do we have right now?
- How can data help with this?
- Why and how will our data project be useful for the company?
- What should I learn to start building it?
Within data science, there is enormous hype around new tools every time a new machine learning algorithm is released. Or a new cloud-based solution is available. Or a new module is implemented for this or that programming language. And so on.
But aren’t these new tools important? Well, yes, but…
Tools are important, but with a caveat
Let’s take an example from cooking. You can’t cook soup without a spoon. But when eating the soup, very few people will say: “Hmmm, you have a pretty nice wooden spoon.” Instead, most of them will say: “Yum, this food tastes really good!”
And that’s because, at the end of the day, tools are just tools. You have to learn how to use them…
But that’s not the full sentence. It’s rather:
You have to learn how to use them so you can build useful things with them…
And that’s still not quite all.
You have to learn how to use them so you can build useful things with them that will have a positive impact on your business’s bottom line.
Maybe it sounds obvious written down. And if it is for you, that’s great. But I see many data professionals choose to focus on fancy data science solutions over the data science solutions they actually need. And then they hit a wall.
Unpopular opinion: most data scientists won’t need to know anything about deep learning
Let me give you just one example: deep learning.
I run a data science blog where I publish tutorials for aspiring data scientists on topics like the basics of Python and the basics of SQL.
And I get this question every week from someone: “When will you publish a tutorial on deep learning?”
And the answer is always the same: never.
Okay, I have to admit, I toyed with the idea of quickly drafting an introductory article on the topic… But it was tempting for only one reason: I know I’d get a lot of clicks for that article.
Most people want to learn about deep learning only because it’s popular. Why is it popular? Because it’s used for cool stuff, like self-driving cars at Tesla—and for that reason it gets a huge amount of media attention. That makes people excited and suddenly everyone wants to apply deep learning in their own projects.
But (at least in my opinion) it doesn’t work that way! A data science project should always start by defining the problem you want to solve. And once you have that, then you can choose the best tool to get the job done!
The naked reality is that, in most data science projects, there is a much higher demand for more traditional tools, like:
- descriptive analytics and reporting
- data cleaning and data wrangling
- automating your processes
- simple predictions and forecasting
- simple classification methods
I know, at first, these sound less cool than deep learning… But believe me, when you are working on a real project, they are just as exciting (if not more so)! Why? Because they get you useful information a lot more quickly than trying to tackle a project with something complicated like deep learning.