So, you’ve started using xAPI and you’re collecting lots and lots of data. (Yay!)… And you’ve got some important strategic questions which you hope to answer. (Excellent!)  But now you have to actually start getting information out of your data… (Oh!)

If you’ve done data analysis before, this might be a metaphorical walk in the park. But if you haven’t worked with data before, you might be feeling more like you’re facing a scene from The Revenant.

And you’re wondering: How do I even get started?

The short answer: you just start.   

No, really. Dive into your data; it’s not like you are going to break anything. You could make mistakes – in fact, you probably will – but if you go into analysis with an open mind and a few pointers, you’ll catch those mistakes and improve the odds that you gain meaningful insights.

What’s the Least I Need to Know? 4 Key Questions

You don’t have to be a statistician, but you do need to know a bit about some descriptive statistics, basic things like:

1. What is the center of your data?

Mean, Median and Mode – they each give a different kind of insight, and they are super easy to calculate on a spreadsheet.

2. What’s the Spread?

Remember back in math class, learning about Standard Deviation? (Quick reminder: basically, it’s a measure of how much the values in a data set vary from the center.) You can have the same mean (average) test score for two cohorts, but if one cohort’s results are spread across a wide range of numbers, that’s a different situation from having most of the scores clustered fairly near the average.
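If you'd rather try this outside a spreadsheet, here is a minimal sketch in Python using only the standard library. The two cohorts and their scores are made up for illustration, and `summarize` is a hypothetical helper name. It shows two groups with the same mean, median and mode but very different spreads.

```python
from statistics import mean, median, mode, stdev

# Made-up test scores for two hypothetical cohorts.
cohort_a = [73, 74, 75, 75, 76, 77]    # clustered near the center
cohort_b = [50, 60, 75, 75, 90, 100]   # spread across a wide range


def summarize(scores):
    """Return the three centers and the sample standard deviation."""
    return {
        "mean": mean(scores),
        "median": median(scores),
        "mode": mode(scores),
        "stdev": stdev(scores),  # sample standard deviation
    }


summary_a = summarize(cohort_a)
summary_b = summarize(cohort_b)

for name, summary in (("A", summary_a), ("B", summary_b)):
    print(
        f"Cohort {name}: mean={summary['mean']}, median={summary['median']}, "
        f"mode={summary['mode']}, stdev={summary['stdev']:.2f}"
    )
```

Both cohorts report a center of 75 on all three measures, yet the standard deviation for cohort B is more than ten times that of cohort A. Looking at the center alone would hide that difference.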

One of the most important things to understand about your data, if you’re going to interpret it well, is something more practical than statistical: Where does your data come from? How do the numbers come into being?

To answer this we need to look at a couple of elements:

3. How do we define our data – what parameters did we use when we pulled our data?

This can be anything from how we address timestamps to how we define criteria for success.  For example, saying 80% of people passed an assessment doesn’t mean much if you don’t know what was required in terms of content and mastery to pass.

4. What are outside factors that will affect our data?

It might be something as simple as measuring course participation – if we see significant change and want to know why, we might want to be aware of factors which influence participation but have nothing to do with the course content itself.
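To make question 3 concrete, here is a small Python sketch of how the same set of scores produces very different pass rates depending on how "pass" is defined. The scores and both pass marks are invented for illustration, and `pass_rate` is a hypothetical helper, not part of any xAPI tool.

```python
# Made-up assessment scores for ten hypothetical learners.
scores = [55, 62, 68, 71, 74, 79, 81, 85, 90, 96]


def pass_rate(scores, pass_mark):
    """Fraction of learners scoring at or above pass_mark."""
    passed = sum(1 for score in scores if score >= pass_mark)
    return passed / len(scores)


# Same data, two different definitions of success.
rate_lenient = pass_rate(scores, pass_mark=65)
rate_strict = pass_rate(scores, pass_mark=80)

print(f"Pass mark 65: {rate_lenient:.0%} passed")
print(f"Pass mark 80: {rate_strict:.0%} passed")
```

With a pass mark of 65, 80% of these learners pass; with a pass mark of 80, only 40% do. The headline number only means something once you know which definition was used.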

Getting Started With Data

Now you’ve got your basic questions, it’s time to start exploring your data.

When you first get your data, set it up in your spreadsheet of choice and check it for errors – a quick visual look to be sure it makes sense and that formats are as you expected.  
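A quick visual check works well for small data sets. For larger ones, a few automated checks can help. The sketch below is one possible approach in Python; the rows, field names, and the 0 to 100 score range are all assumptions made for illustration, so adjust them to match whatever your own export looks like.

```python
from datetime import datetime

# Hypothetical exported rows; the field names are assumptions.
rows = [
    {"learner": "a", "score": "82", "timestamp": "2016-03-01T09:15:00+00:00"},
    {"learner": "b", "score": "", "timestamp": "2016-03-01T09:20:00+00:00"},
    {"learner": "c", "score": "140", "timestamp": "2016-03-01T09:25:00+00:00"},
    {"learner": "d", "score": "77", "timestamp": "01/03/2016"},
]


def find_problems(row, low=0, high=100):
    """Return a list of human-readable problems found in one row."""
    problems = []
    try:
        score = float(row["score"])
        if not low <= score <= high:
            problems.append("score out of range")
    except ValueError:
        problems.append("score missing or not a number")
    try:
        # Raises ValueError if the text is not an ISO 8601 date/time.
        datetime.fromisoformat(row["timestamp"])
    except ValueError:
        problems.append("timestamp not in ISO format")
    return problems


# Keep only the rows that have at least one problem.
flagged = {}
for row in rows:
    problems = find_problems(row)
    if problems:
        flagged[row["learner"]] = problems

for learner, problems in flagged.items():
    print(learner, "->", ", ".join(problems))
```

The checks here don't fix anything. They just point you at the rows worth a second look before you start calculating.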

Then start to dig in a little. Look at the centers and standard deviation; graph the data just to see what the distribution looks like; do a frequency distribution…  Just get familiar with it, so you can see if the numbers make sense.
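As one way to get that first look at the distribution, here is a minimal Python sketch that groups scores into bins of ten and prints a rough text histogram. The scores and the bin width are made up for illustration; a spreadsheet chart would do the same job.

```python
from collections import Counter

# Made-up scores for illustration.
scores = [55, 62, 68, 71, 74, 79, 81, 85, 90, 96]
BIN_WIDTH = 10  # assumed bin size; pick whatever suits your data


def frequency_distribution(scores, bin_width=BIN_WIDTH):
    """Count how many scores fall into each bin, keyed by the bin's lower edge."""
    counts = Counter((score // bin_width) * bin_width for score in scores)
    return dict(sorted(counts.items()))


dist = frequency_distribution(scores)

# Print one row per bin, with one '#' per score in that bin.
for start, count in dist.items():
    print(f"{start}-{start + BIN_WIDTH - 1}: {'#' * count}")
```

Even a crude picture like this makes it easier to spot gaps, clusters, or a lopsided shape that the mean and standard deviation alone would not reveal.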

Scratching the Surface

Once you start exploring you will begin to notice trends or maybe something unexpected; the insights which come from your initial exploration of your data will often help you formulate new and better questions.  These ideas for getting started are just the tip of a very big iceberg – insights lead to new questions and new opportunities for analysis.  

Statistics and analytics are fields that are both deep and wide, and we are barely scratching the surface. We don’t need to be experts, but we do need to take care that we understand any analyses we undertake; that is something which can be learned over time.

For now, a bit of thoughtful data exploration is a starting point which can help you quickly get some real value out of the data you are collecting. So, what’s stopping you?