What Is Knowledge Mining And Why Data Is Not Knowledge
In this article I will explain what knowledge mining is, the difference between knowledge and data, and finally the design principles to which companies should pay attention in order to succeed.
Data is accumulating at an exponential rate and it needs your help to become knowledge. IBM has estimated that once the Internet of Things and Services gets fully underway, data will be doubling every twelve hours. And the rate of change is speeding up.
This does not mean that knowledge will be doubling every half-day. Something has to happen to data for it to yield knowledge, and that’s where Instructional Design comes in.
It’s the electronic-digital age of course that is responsible for this explosion of data. What was not born digital is being digitized –one of Google’s missions– and data that is born digital stays that way. Both are waiting to be made useful.
The comparison to oil or other forms of carbon is compelling. Oil is basically inert and stored in the earth’s crust until it is mined and refined into a useful –albeit ultimately lethal– energy source.
Data should be considered our latest natural resource: useless until mined and refined and, like other natural resources, open to both ethical and unethical uses.
Data is just that: Data. It has no intrinsic meaning or usefulness. Like buried coal or oil it is inert; it is static and useless to us. It does however have enormous potential; it has incalculable possibilities. These possibilities are only actualized when conditions become favorable for the potential to become realized, valuable, and useful.
If we are to make data useful we have to change the conditions in which it exists. In its “natural state” data is simply a digital item of code; it is a representation or symbol of a fact or statistic. For it to become useful instructional designers (and in reality that means everybody) must change the conditions in which the data exists. We must make the inert active. We must release the potential of this natural resource. We must make meaning from stasis.
The first thing to understand is that we now have 3 kinds of data: Structured, unstructured, and more recently data not directly created by humans.
- Structured data is things like spreadsheets or Word docs; alphanumeric keystrokes recorded on a disk somewhere.
- Unstructured data is animations, graphics, sounds and videos; different from spreadsheets and memos but still digital representations of human ideas.
- Machine data is data that is created by a machine and sent to another machine for its use; think robots and machine learning.
All of these different kinds of data still remain only potential sources of knowledge. For their potential to be changed into action these data must have the conditions in which they exist –their native state– changed. And this can only be done by the application of human intellectual capital. Who else but instructional designers can transform these data by the application of human minds? It’s modern-day alchemy and requires intelligence: and you are that intelligence. Artificial intelligence is catching up, but I do not see it performing the functions I am going to describe in this post in the foreseeable future.
It is important to understand that the background against which this transformative activity is taking place is daunting. Never in the history of the world has data grown, and kept accelerating, at such a rate. In some areas, such as the bio-sciences, the doubling is taking place every few months. Imagine what decoding the human genome did to our information base.
Data and information are also spreading at an ever-increasing pace, thanks in large part to the internet and other forms of media, the increasing bandwidth of communications, and the clock speed of our computers.
Let me put it this way: In a single K12 school generation, data doubles at least eight times. That means that 256 times more information is available to today’s 12th graders than when they began school as kindergartners.
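The arithmetic behind that figure is simple compounding. Here is a minimal sketch in Python; the function name is illustrative, and the 13-year school span (kindergarten plus twelve grades) is my assumption rather than a figure from the estimate above.

```python
def growth_factor(doublings: int) -> int:
    """Return how many times larger a quantity is after the given number of doublings."""
    return 2 ** doublings


# Eight doublings, as stated in the paragraph above.
factor = growth_factor(8)
print(factor)  # 256

# Assumption: a K12 career spans 13 school years (kindergarten plus grades 1-12).
SCHOOL_YEARS = 13
years_per_doubling = SCHOOL_YEARS / 8
print(years_per_doubling)  # 1.625
```

Under that assumption, eight doublings across a school career works out to one doubling roughly every year and a half.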
We are in the latest Industrial Revolution and it is unlike any revolution that has gone before. Klaus Schwab, the founder of the World Economic Forum in Davos, calls it the Fourth Industrial Revolution, where speed, new skills and rapid globalization are the underlying characteristics. Others call it Industry 4.0 and point to six design principles to which companies should pay attention in order to succeed.
- First, interoperability: Smart factories and the products they make continue to communicate via the Internet of Things and Services.
- Second, virtualization, where a virtual copy of the plant/factory is linked by sensors to its products and customers in order to simulate any contingencies.
- Third, decentralization of decision-making made possible by the cyber-physical systems described above. In many cases the cyber system will make the decisions autonomously.
- Fourth, the real-time collection of data –and its analysis– makes insights available immediately.
- Fifth, services will now be ordered and delivered via the Internet of Services – the fridge knows it is low on a product, and the next thing you know a drone delivers it.
- Sixth, flexibility and modularity. As conditions change –competition, customer preferences etc.– a cyber-physical set up is able to re-configure itself rapidly.
I would humbly suggest a Seventh. I call it Knowledge Mining and it is the substance of this article. It is all about questioning: The relentless interrogation of raw data until it gives up its secrets and becomes useful knowledge or intelligence.
So, I maintain that it is the job of instructional designers to teach others how to become Infoliterate by developing their ability to discriminately and sensibly transform data into useful consumer-knowledge: Knowledge Mining.
Here’s the best news in all this; humans are born problem-solvers, and it’s only when we are in that natural state –looking for answers– that we understand when we don’t know something, why we do not know it and why we must know it. And that leads me to the HOW.
The quick answer to the how is a series of filters, and these filters are in fact questions and answers. Note the plural. A single interrogation –asking a question and then answering it– is never enough. An answer must lead to another question – that’s called Critical Thinking. Or Jeopardy!
Toyota adopted the 5 Whys technique, usually credited to Sakichi Toyoda, to address quality problems in its manufacturing. It is an iterative interrogation technique used to explore the cause-and-effect relationships underlying a particular problem, and to get quickly and simply to the root cause. Each ‘why’ question forms the basis of the next question. The "5" in the name derives from an empirical observation on the number of iterations typically required to resolve the problem.
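To make the iteration concrete, here is a minimal sketch in Python of the 5 Whys loop, where each answer becomes the subject of the next ‘why’. The cause-and-effect chain is a hypothetical example of my own, not Toyota data, and the function name is only illustrative.

```python
# Hypothetical cause-and-effect chain: each problem maps to the answer to "why?".
CAUSES = {
    "The car will not start": "The battery is dead",
    "The battery is dead": "The alternator is not charging it",
    "The alternator is not charging it": "The alternator belt has broken",
    "The alternator belt has broken": "The belt was worn beyond its service life",
    "The belt was worn beyond its service life": "The car was not serviced on schedule",
}


def five_whys(problem, causes, max_whys=5):
    """Follow the chain of 'why' answers, stopping after max_whys or when no answer is known."""
    chain = [problem]
    current = problem
    for _ in range(max_whys):
        answer = causes.get(current)
        if answer is None:
            break  # No further answer: the last statement is the root cause we can reach.
        chain.append(answer)
        current = answer  # The answer forms the basis of the next question.
    return chain


chain = five_whys("The car will not start", CAUSES)
for depth, statement in enumerate(chain):
    print(f"{depth}: {statement}")
print("Root cause:", chain[-1])
```

The cap of five is a default, not a rule; in practice the loop ends whenever the next ‘why’ has no useful answer.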
Filtering data through questioning to solve problems is a pathway to knowledge, wisdom and power. In other words, it is the making of meaning from raw data, and it can and should be learned.
The key is to turn Data –the raw, unfiltered, accumulating noise of our society– first into somewhat usable information, and then into knowledge –basic literacy about the subject matter– then into an informed opinion or wisdom, upon which one can begin making a judgment. The final step is turning wisdom into power, with the idea that this power should be shared.
A Pyramid Of Understanding
Look at the diagram for a moment. Think of the process of achieving knowledge through inquiry, and hence power, as a pyramid of human understanding. We should constantly climb this pyramid as each new piece of data comes to our attention; looking to make sense out of all the sounds and sights that bombard us.
This pyramid has six levels, separated by five filters. At the bottom is a mine full of raw data –sights, sounds, smells and touches– which is largely unusable, unfiltered, unsorted, and therefore unavailable and unintelligible. Like coal buried miles underground, it is full of potential. But that potential cannot be unlocked without certain processes being undertaken. Knowing where the coal is, and how to get at it and process it, literally turns that data about coal into power.
Filter #1: Value And Relevance
The first filter is where we retrieve the data, evaluate it, retain some, and discard a great deal. This is called the value phase and it is human nature to do this automatically. Think of it like this. If you hear brakes screeching, your automatic pilot –instinct– gives the sound a value: The closer it is, the higher the value it has. It helps to know this so we can recreate this action consciously when evaluating data coming our way and take appropriate action.
In discussing the value and relevance of information with others an excellent exercise is to get them to ask themselves the following: Do these data help? How do these data help? Do these data add to the process of learning, or solving a problem? Which pieces of data clarify, and which are irrelevant? Then ask them to discard some data and retain other data based on this filtering process.
At this stage you are looking for relevance. Our human nature gives us the skill to do this almost instantly, without it ever rising to the level of the conscious part of the brain. We ask and answer the same questions over and again: Does this piece of data (it’s not information yet) help? Does this piece of data help clarify my situation? Does it help solve my immediate problem? Does it have value?
Having established the value of the data because of its relevance, we retain it. It is now somewhat usable information and is beginning the process of changing its nature from data to information.
Filter #2: Source And Bias
This information is now ready for its second filter. In this stage, you should be looking at the source of the information, and importantly be on the lookout for bias. You should be asking skeptical questions about authenticity, accuracy, and checking to see if the information is current. Information that is biased, inaccurate, or inauthentic is not valuable information.
Once information has passed through these two filters it can be deemed accurate, authentic, unbiased information on a particular subject –any subject– of interest. You started with raw data; say an advertising proposal. You checked the raw data in the proposal, such as the number and demographics of people whom the agency claims read a particular blog on which they propose placing ads, and you were satisfied. Now it is no longer data, just numbers and letters; it is knowledge of the subject matter.
Filter #3: Ethics
The third filter –and one that is always necessary, even with ad agency proposals– is the one where the first ethical factor comes into play. Are you using the information for nefarious purposes or righteous ones? Did you or the ad agency come into possession of the information lawfully?
Passing through the ethical filter takes knowledge and transforms it into wisdom, which is the realm of opinion and judgment. If an opinion has been arrived at using the concrete steps outlined, it has a much better chance of being a wise judgment based on the facts. Remember there are always ethical implications of any opinion, judgment, or decision.
Filter #4: Conduct
The fourth filter also has to do with conduct. It is where questions of cultural awareness, respect for confidentiality, and the rights of others are asked and answered. Having passed the tests of the fourth filter, one should be in real possession of the power to make a wise judgment or hold a valued opinion. Data has now been transformed from useless, random fragments that are nothing but potential into instruments of personal power.
Filter #5: Purpose
The fifth filter has to do with purpose or use and keeping on the right side of the law. How is the power –systematically gained by mining data, turning it into information and then knowledge, then wisdom and finally power– going to be used? For good or evil? The final filter has to do with right conduct; using this power only for the benefit of all. This filter is where questions of cultural sensitivity, respect for confidentiality and private property, and good conduct are asked again.
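For readers who think in code, the five filters can be pictured as a chain of yes/no questions, where an item climbs one level of the pyramid for each filter it passes. The sketch below is only an illustration under my own assumptions: the field names, the example items, and the name of the sixth level (which the text above does not name) are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Item = Dict[str, bool]

# Level names follow the article; the sixth name is an assumption, since the
# text does not name the level reached after the fifth filter.
LEVELS = ["data", "information", "knowledge", "wisdom", "power", "power put to good use"]

# Each filter is a question answered with True or False. Field names are hypothetical.
FILTERS: List[Tuple[str, Callable[[Item], bool]]] = [
    ("value and relevance", lambda item: item.get("relevant", False)),
    ("source and bias", lambda item: item.get("accurate", False) and not item.get("biased", True)),
    ("ethics", lambda item: item.get("lawfully_obtained", False)),
    ("conduct", lambda item: item.get("respects_confidentiality", False)),
    ("purpose", lambda item: item.get("benefits_others", False)),
]


def climb(item: Item) -> str:
    """Return the highest pyramid level the item reaches before failing a filter."""
    level = 0
    for _name, passes in FILTERS:
        if not passes(item):
            break  # The item stays at the level it has already reached.
        level += 1
    return LEVELS[level]


# Hypothetical examples: a fully vetted ad proposal and an unverified rumor.
ad_proposal = {
    "relevant": True,
    "accurate": True,
    "biased": False,
    "lawfully_obtained": True,
    "respects_confidentiality": True,
    "benefits_others": True,
}
rumor = {"relevant": True, "accurate": False}

print(climb(ad_proposal))  # power put to good use
print(climb(rumor))        # information
```

Real judgments about ethics and conduct are of course not booleans; the point of the sketch is only the order of the questions and the fact that failing any one of them stops the climb.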
Power only has a value when it is used. If you were the only person in the world, you would automatically have immense power, but in the absence of others it would be sterile and impotent; not really power at all. Another way of saying this is that knowledge has much less value if it is not shared. The whole idea of gaining power is to further the cause of humanity; otherwise wielding power becomes a purely selfish act, and does not advance the cause of humanity, but only the cause of an individual. Selfishness, although promoted by scientists as the best way forward from a biological, evolutionary standpoint, or as the only way for market forces to work, is at its heart the opposite of the teachings of all major faiths.
You decide which works for you when wielding power: altruism or selfishness; this is the classical conundrum of human nature, remembering of course that we all have free will.
Data at the bottom of the pyramid of human understanding is a resource which, when undifferentiated from its surroundings, has no value. Power, at the top of the pyramid, without wisdom, and without sharing its benefits, is of value only to its holder.
In a study cited recently by Sir Ken Robinson, a prominent education theorist, 98% of children between the ages of 3 and 5 exhibit a capacity for divergent thinking: making connections between seemingly unrelated facts and scenarios and using metaphors. At age 15, only 10% retained this capacity and at age 25, the percentage dropped to 5%. Could this be because we train children to take tests, not to uncover and interpret facts and come up with novel ideas?
How else can we teach people to be creative if not by teaching them about data, information, knowledge, and power? The more disciplined an employee becomes about finding, interpreting, and correlating information, the more they will be able to focus on divergent thinking, the cornerstone of creativity, which includes metaphors and analogies.
So, the lesson on infoliteracy is clear: We can either be overwhelmed by the tidal wave of data or we can choose to use it to our advantage.
I hope that all this has led you to ask how you can begin to implement knowledge mining at your organization. I have an answer: I designed a methodology with all the above as my design criteria.