Speaking The Language Of Information: Data Literacy

Data Literacy: The Language Of Information

I'm in the last stretch of the Harvard edX online data science program certification. The certification includes eight courses and ends with a capstone exam. The content covers statistics, probability, data wrangling, linear regression, modeling and inference, machine learning, etc. But this is not a typical "watch this video and answer a quiz question" challenge. It is hands-on coding in RStudio and DataCamp online. It has been 4 months now...

When I posted on this (as I finished another course in the program), someone asked me a good question about why I am doing this whole thing. Do we (L&D professionals) need data science? Is the course designed well or is it just me with strong self-motivation?

Does L&D Need To Turn Into A Data Scientist?

The short answer is no. Most likely your organization has dedicated roles for this, people with a strong background in mathematics, statistics, computer science, etc. (Or they're actively looking for one as we now have a shortage.) It is unlikely that an average L&D professional would work in R to build machine learning models, doing sentiment analysis of comments and discussions around a topic. However, we do need to be able to speak the language of information: data. We need basic data literacy to ask the right questions and understand the answers.

I like Gartner's approach to this issue [1]:

Imagine an organization where the marketing department speaks French, the product designers speak German, the analytics team speaks Spanish and no one speaks a second language. Even if the organization was designed with digital in mind, communicating business value and why specific technologies matter would be impossible.

Speaking of foreign languages, when you travel to a foreign country and decide to learn the basics of its language, there are two things you should focus on:

  1. Not to be sold
  2. Not to be critically misunderstood

That's the point of data literacy for L&D. Point 1 is making sure what we present is valid. Just because a chart looks convincing, it doesn't mean it's true. Point 2 is about being able to articulate our questions, concerns, ideas, creative thoughts in terms of data. Standard deviation, anyone?
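To make the standard deviation remark concrete, here is a minimal sketch in Python. The two cohorts and their quiz scores are made-up numbers for illustration only, not data from any real program. It shows how two groups can share the same average while telling very different stories.

```python
from statistics import mean, pstdev

# Hypothetical quiz scores for two cohorts (invented for this illustration).
cohort_a = [78, 80, 82, 79, 81]
cohort_b = [50, 100, 60, 95, 95]

# Both cohorts have the same average score...
mean_a = mean(cohort_a)
mean_b = mean(cohort_b)

# ...but the population standard deviation shows how spread out the scores are.
spread_a = pstdev(cohort_a)
spread_b = pstdev(cohort_b)

print(f"Cohort A: mean {mean_a:.1f}, standard deviation {spread_a:.1f}")
print(f"Cohort B: mean {mean_b:.1f}, standard deviation {spread_b:.1f}")
```

A chart that reports only the averages would make these cohorts look identical. The spread is what tells you that some people in cohort B are struggling.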

Problem-Solving As A Business Value

To support the business, to be part of the problem solving, to be part of a data-driven organization, we need to speak the basic language of information. Gartner states, "If no one outside the department understands what is being said, it doesn’t matter if data and analytics offer immense business value and are a required component of digital business."

As for the Harvard edX course program, it is challenging. I wouldn't call it a typical MOOC experience; unlike other programs, it doesn't feel like a cohort. There's one discussion per chapter, at the end. It is heavy on content (if you're not already familiar with stats, R, probability, etc.). The hands-on coding is what makes the course worth going through. Otherwise, I would have a course completion that means nothing.

Why am I doing this? Personally, I don't talk about or write about things I don't actually do. That doesn't mean everyone in L&D who wants to learn basic data literacy would need to walk down the path of data science.

In fact, let's clear up a couple of questions:

  • What is data? Is data the same as information? Or insight? How about wisdom?

Without falling into oversimplification, I like this explanation from GURU99 [2] using the DIKW (Data Information Knowledge Wisdom) model. I added a couple of things to make it more illustrative:

Data: 100
Information: 100 miles
Knowledge: 100 miles is quite a long distance. Driving at an average of 65 miles per hour, it would take approximately 1 hour and 32 minutes.
Wisdom: While the speed limit is 65 on Route 476, today it is raining, and a stretch of that route always gets slippery. The actual time would be closer to two hours.
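
A quick way to check the arithmetic behind the knowledge and wisdom steps is to compute travel time as distance divided by speed. The sketch below is a hypothetical illustration: the function name and the 50 miles per hour rainy-day average are my assumptions, not part of the GURU99 example.

```python
def travel_minutes(distance_miles: float, speed_mph: float) -> float:
    """Return the travel time in minutes for a distance and an average speed."""
    return distance_miles / speed_mph * 60


distance_miles = 100  # information: the number 100 with its unit attached

# Knowledge: apply what we know about an average speed of 65 mph.
dry_minutes = travel_minutes(distance_miles, 65)

# Wisdom: adjust for context. Assumed here: rain lowers the average to 50 mph.
rainy_minutes = travel_minutes(distance_miles, 50)

print(f"Dry day: about {dry_minutes:.0f} minutes")
print(f"Rainy day: about {rainy_minutes:.0f} minutes")
```

The data point never changes. What changes is how much context we bring to it.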

Raw data becomes information through human or machine interpretation. Depending on what question you're trying to answer, the same data can be gold or trash, or most likely a mix of the two. That's the job of people working with data: digging for gold.

Critics of the DIKW model argue that it is an archaic, oversimplified pyramid. We've seen this with many models in L&D when a complex subject is reduced to a pyramid, cone, or another pretty, simple infographic [3]. Be skeptical about pyramids! At the same time, it is a good idea not to make assumptions about what we mean by data, information, knowledge, insight, and wisdom when we communicate. The good news is that L&D professionals have been dealing with the dilemma of knowledge versus information for a long time. What might be new to us is a deeper understanding of how to dig for gold in data.

It is also a common misconception that data is a spreadsheet with numbers. Yes, that is a common form, but don't limit your thinking to numbers in Excel. Think of unstructured data in videos, recorded webinars, forum discussions, etc. (See also qualitative vs. quantitative data research.)

What Is Data Literacy?

Gartner defines data literacy as the ability to read, write and communicate data in context, including an understanding of data sources and constructs, analytical methods and techniques applied — and the ability to describe the use case, application and resulting value.

What's The Difference Between Data Analytics And Data Science?

Finally, if you look at job requirements for data analysts and data scientists, you'll find some overlap in skills and tools. At the same time, their role in working with data is different [4].

The responsibility of data analysts can vary across industries and companies, but fundamentally, data analysts utilize data to draw meaningful insights and solve problems. They analyze well-defined sets of data using an arsenal of different tools to answer tangible business needs (e.g. why sales dropped in a certain quarter, why a marketing campaign fared better in certain regions, how internal attrition affects revenue, etc.).

If you're into data visualization, crafting the story behind data, fiddling with well-defined data sets, then analytics is for you. A subset of data analytics is learning analytics, focusing on data derived from learning-related activities. As an L&D professional, you're most likely to work with data analysts to gain insights into what's happening in the workforce and how and when learning (or training) can help (or has helped). In short, analytics deals with the insights you can gain from known data.

Data scientists, on the other hand, are dealing with the unknown. These are the people creating models for the analysts to use. This is a more code-heavy job.

Drew Conway, data science expert and founder of Alluvium, created a Venn diagram that describes a data scientist as someone who has mathematical and statistical knowledge, hacking skills, and substantive expertise.

What's Next?

I'll be speaking at Learning Solutions 2020 on this very topic: Data Literacy for L&D. Come and let's explore this new language together! Am I done after the course? No way, it's a long road. I've been studying Hungarian for four decades, English for three decades, and yet, I'm still learning every day. However, you won't be able to sell me anymore.

And, if you know me, you're wondering where "games and gamification" fit in. Well, games and gamification can provide more meaningful data points than any MCQ or static click-through. More to come!


[1] A Data and Analytics Leader’s Guide to Data Literacy

[2] Difference between Information and Data

[3] The Problem with the Data-Information-Knowledge-Wisdom Hierarchy


Originally published at www.linkedin.com.