A question data scientists might ask is: How are variables distributed?
Predictive Analytics Questions
The next type of question is predictive. Predictive analytics should be somewhat familiar to you, because we've talked about it several times before. I think it's also a very obtainable type of question, and one that's in demand: a lot of business people want to know what's going to happen in the future, which is what predictive analytics questions are really all about.
How can we predict sales moving forward? If we want to know how broom sales are going to move for a given store, we're asking a predictive question, and we would answer it with a predictive solution.
In this example, we have historical broom sales up to today, and the predictive question is, "What are sales going to be in the future?"
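To make the broom example a little more concrete, here is a minimal sketch of one way a predictive question could be answered: fit a straight-line trend to past sales and extend it forward. The sales numbers and the names `historical_sales`, `fit_linear_trend`, and `forecast` are made up for illustration, and a simple linear trend is only an assumption here. Real sales data usually calls for a richer model.

```python
# Hypothetical monthly broom sales for one store (made-up numbers, oldest first).
historical_sales = [120, 125, 131, 128, 140, 145, 150, 149, 158, 163, 170, 172]


def fit_linear_trend(values):
    """Least-squares fit of values[i] ~ intercept + slope * i."""
    n = len(values)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def forecast(values, periods_ahead):
    """Extend the fitted trend line past the last observed period."""
    intercept, slope = fit_linear_trend(values)
    n = len(values)
    return [intercept + slope * (n + k) for k in range(periods_ahead)]


# "Today" is the end of the list; ask what the next three months might look like.
future_sales = forecast(historical_sales, 3)
print([round(v, 1) for v in future_sales])
```

The historical list plays the role of everything up to today, and the forecast is the part that answers "what happens next?"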
That takes us to causal questions, and I would like to first introduce you to Fire Dude. Fire Dude is the effect of something. What is it the effect of? We know that it's the effect of heat, fuel, and oxygen. If we take away fuel, Fire Dude's gone, but we really like Fire Dude, so we'll keep fuel here, for now.
This is a pretty idealized view of what causal analysis looks like. In reality, there's usually an unknown number of causes for a particular effect. Not only is the number of causes unknown, we also don't know how much each one impacts the effect. If we take away one cause, it might minimally impact the effect. If we take away a different cause, it might have a huge impact. The point is that this is a huge puzzle to solve: it's very complicated to identify what your causes are, how many you have, and what impact each one actually has on the effect. So this type of question is very important and really powerful in helping to drive action for business people, but it's also very difficult to obtain.
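Here is a toy sketch of the idea that different causes can carry very different weight. Everything in it is hypothetical: the cause names, the weights, and the assumption that causes simply add up. In a real causal problem we would not know these weights in advance, and figuring them out is the hard part.

```python
# Hypothetical causes and how much each contributes to the effect (made-up weights).
weights = {"cause_a": 50.0, "cause_b": 2.0, "cause_c": 10.0}


def effect(present, weights):
    """Toy additive model: the effect is the sum of the weights of the causes present."""
    return sum(weights[name] for name, is_present in present.items() if is_present)


all_present = {name: True for name in weights}
baseline = effect(all_present, weights)


def impact_of_removing(name):
    """How much the effect drops when a single cause is taken away."""
    without = dict(all_present)
    without[name] = False
    return baseline - effect(without, weights)


impacts = {name: impact_of_removing(name) for name in weights}
for name, drop in impacts.items():
    print(f"Removing {name} lowers the effect by {drop}")
```

Taking away `cause_b` barely moves the effect, while taking away `cause_a` removes most of it, which is the kind of difference a causal question is trying to uncover.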
That's a good mix of the kinds of data science questions we generally tackle. As data scientists, we work with exploratory, inferential, predictive, and causal questions.
Posted by Gage Peake