I was, quite rightly, brought up on a couple of points I made in this article. So let me say right here and now:
I’m no expert on this, but…
One of the things I’m very interested in is information and its presentation. Over my years in IT and as a trainer teaching Apple’s Mac OS X courses, I’ve noticed a few things about the subject. Now, this is by no means a definitive treatise; in fact it’s really rather lacking in depth. With that caveat, read on…
I’m a very visual person, and reams of numbers in a spreadsheet really don’t make much sense to me without spending hours trying to… well, visualise the relationships between them. The same goes for concepts. I’m known for always reverting to a diagram when trying to explain something outside the other person’s skill set. It really does help. At work I like to have a whiteboard on a wall near my desk; it’s a lot cheaper than a flip-chart given how much I use it! True, people sneak in and draw silly pictures on it when I’m not there… but I digress.
There’s a part of one of Apple’s courses where I talk about the Keychain and its password: how the password is usually the same as the user’s account (login) password, but doesn’t have to be. I go on to talk about the various ways of changing the account password and the effect each has on the Keychain password. By the time I’ve finished, most people’s eyes are nicely glazed over. Then I draw a simple three-column table and, using different coloured pens, go over it all again, reinforcing what I’m saying with a table and diagram that show the relationships. Their eyes unglaze, and the most common phrase uttered is “Oh, that’s actually quite simple.”
Which it is. But here, in this case, language gets in the way. What I was describing is not open to interpretation; it’s a description of fact. Take a long description in literature, for instance the one Tolkien gives of the Battle of Helm’s Deep (OK, the Battle of the Hornburg for those who are going to pull me up on it), where thousands of words paint a picture conveying ideas we’re free to interpret: colour, sound, movement. The shade of colour, tone of sound and exactitude of movement are unimportant. What’s important is the whole. A well-written description can even fool you into smelling or tasting things. And this is a clue to how it’s working. When you read a long descriptive passage that ‘really takes you there’, it’s acting on memory, emotion and imagination. This doesn’t have much of a chance to happen during learning, for the simple reason that the material is new – a whole new concept each time. You can’t evoke memory, emotion and imagination for something you’ve never come across. Yes, you can link the new information to information already stored in memory to act as a trigger and, to a lesser extent, you can provoke an emotional or imaginative response. But it’s very much harder to do using only words, especially if the subject matter is rather dry.
What works better, then, is a combination of the two: use words to describe what a diagram is about, then let the diagram explain itself. If the diagram is well constructed, incredibly complex concepts can be got across in a fraction of the time, and with much greater retention, than using words alone. This is because the relationships within the data are given to you, as well as relationships to already-known data outside the set. This is the linking of new information to current memory that helps the individual recall the new data via information that already exists in memory. The trick here is the ‘well constructed diagram’ part. Without that, the whole thing falls apart. Another thing to remember is that one size does not fit all. While my table diagram worked for most people, there were a couple of instances where it didn’t clarify things at all. In which case, use a different diagram (in both of those instances a mind-map/flowchart conveyed the information where the table layout didn’t).
It’s not just in learning situations where we need to make sense of information. We have to do it all day, every day. Read any newspaper and you’ll be told the economy is doing this or that. For example, Instagram was bought for $1bn, Apple is worth $532.50bn (at time of writing), there are x number of y doing z. Sure, fine, thank you, but what does it all mean? An excellent proponent of the easy-to-understand diagram of the everyday is David McCandless. I saw his TED Talk and was amazed that here was a man who went out of his way to try to make sense of the deluge of information we’re bombarded with; a man who could make a diagram of seemingly meaningless data and show the meaning behind it. I do urge you to get hold of a copy of his book if any of this interests you. Yes, I know Edward Tufte got there first, but I’ll be honest and say I didn’t know about him until a few months ago and haven’t had a chance to read his books yet, so I can’t really comment on him. But I’m led to believe (and from what I’ve seen so far) he’s the godfather of data visualisation.
Unfortunately a lot of people think that data presented graphically somehow holds less weight than an Excel spreadsheet of numbers. Nothing could be further from the truth. Far from dumbing-down the data, you’re actually doing the exact opposite by making that self-same data more understandable. It’s not dumbing-down, it’s intelligenting-up.
So far we’ve seen that presenting new information in the classroom, and apparently mundane data in everyday life, in a visually relatable way is important to our understanding and retention of it. But it’s not just how the data is presented that’s important. The veracity of the data being presented is essential. Without that, everything becomes meaningless again. Why? Because if the data does not tell the truth it’s worse than a lie.
A simple demonstration of this is a joke I heard aeons ago, bear with me here:
It was decreed that every doctor in the land would advertise their skill by flying a flag for each patient who died under their care. That way everyone would know who the best doctor was as he’d have the fewest flags flying.
One day a man is taken ill and his wife looks to all the surgeries in the local vicinity, all were flying 30 flags or more. She heard tell of a doctor a bit further away who only flew 20 flags, and wanting the best for her husband went to him.
“Doctor!” she cried. “You must come at once, my husband is gravely ill!”
“I’d love to come, but I’m inundated with work. I can’t quite believe it as I’ve only been open for a couple of days!” replies the doctor.
This illustrates extremely well, I think, how statistics have been given a bad name over the years. In the story the woman jumped to the conclusion that fewer flags automatically meant a better doctor. She ignored all the other possible factors surrounding the data. Take two doctors: the one in our story and another, closer to home. Imagine the closer doctor had been practising medicine for 30 years and had 30 flags. That averages to one death a year. Then take the doctor she went to, who’d been open two days and had 20 flags: that’s 10 deaths a day, or potentially 3650 times more deaths a year! Which doctor would you go to given that information? But even that doesn’t give you the whole picture. There are other factors. What if, in the two days he’d been open, because of the zero flags he would have had on the first day, all the terminally ill people around had gone to him out of desperation? Then the deaths would not be directly attributable to him, regardless of his skill as a healer, and our data is skewed again. And it can go on ad infinitum.
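The arithmetic behind the joke can be sketched in a few lines of Python (the numbers are the ones from the story; the point is the normalisation by time in practice, which the woman skipped):

```python
# Normalise each doctor's flag count (deaths) by time in practice,
# instead of comparing the raw counts the way the woman did.

def deaths_per_year(flags, days_open):
    """Convert a raw flag count into an annualised death rate."""
    return flags / days_open * 365

# The nearby doctor: 30 flags over 30 years of practice.
nearby = deaths_per_year(30, 30 * 365)

# The doctor she chose: 20 flags over just 2 days.
chosen = deaths_per_year(20, 2)

print(nearby)           # 1.0 death per year
print(chosen)           # 3650.0 deaths per year
print(chosen / nearby)  # 3650.0 times worse, once normalised
```

Same raw data, opposite conclusion, purely because the counts were put in context.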
A year or so ago, a good friend of mine was studying for her Master’s in Geography and Education – two fields in which statistics play a prominent role. She was helping on a project to show how well school principals were performing based on how well pupils did in exams. Simple, right? You get the data from each school and say school A had 100 pupils pass English and school B had 200 pupils pass English. School B > School A.
Wrong. What if there were only 100 pupils in total at School A and 1000 at School B? Then 100% of school A passed compared to only 20% of school B. School A > School B.
Again, wrong. There are so many other factors to take into consideration that I got a bit lost. The mere act of cleaning up the data took months; the final statistical analysis of the cleaned-up data probably took an hour.
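The first two steps of that school comparison can be sketched in Python (the figures are the ones from the example above; as noted, a real analysis would have to control for far more than this):

```python
# Compare two schools first by raw pass counts, then by pass rate,
# using the figures from the example: same data, opposite rankings.

schools = {
    "A": {"passed": 100, "pupils": 100},
    "B": {"passed": 200, "pupils": 1000},
}

for name, s in schools.items():
    rate = s["passed"] / s["pupils"] * 100
    print(f"School {name}: {s['passed']} passes, {rate:.0f}% pass rate")

# Raw counts rank B above A (200 > 100).
# Pass rates rank A above B (100% > 20%).
# Neither ranking is the whole story, which is the point.
```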
And this is important. Really, really important. Your data must be clean. You have to understand the factors that could affect your results and compensate for them. You have to make sure that the results you give are a true representation of what it is you’re actually looking at. If the results are wildly unexpected, it doesn’t mean they’re wrong. It just means they’re wildly unexpected. Go back and check. Better yet, give the raw data to someone else, get them to clean it up, and see what results they get. If they’re the same, they’re probably right, for a given value of right.
I think this is why statistics have a bad name, and probably why they’ve a reputation for being boring. The bad name comes from the annoying truth that statistics are used time and again to ‘prove’ one thing over another in cases where the data isn’t quite as clean as it could be. (The classic case being politicians showing us how much better off we’d be under their government. Again, we have varying values of ‘better off’.) And the fact that so much time has to be spent cleaning up the data before you get to the actual problem solving is maybe where the ‘boring’ epithet comes from. But please, give me statistics over accountancy any day! Remember, statistics are used to prove when accountants are being fraudulent…
Veracity of data – or rather, the lack of truth in what raw numbers tell us – is one of the reasons I’m dead against stats displays on helpdesks. They’re nothing but a morale-drain. All they show is the length of time spent on each call and how many calls have been taken. They show nothing about what goes into each call. It’s the equivalent of showing that a particular engineer has ‘20 flags’. And what did we just learn about that? Add to that the fact that management tend not to understand this lack of veracity, and use the raw numbers as a weapon. But that’s a subject for another article.
The good statistics, and good use of statistics, prize has to go to Professor Hans Rosling. His TED Talks are fantastic, taking (what I am assured are very clean) data and using them in all the best ways I’ve described throughout this article. He is also the weight behind Gapminder.org, which shows relationships in statistical public health data from around the world. He also presented a free-to-view documentary about statistics. Another statistician who deserves a mention, for his humour alone, is Sebastian Wernicke. Again, excellent use of data cleanup. It’ll make sense when you watch the video.
Let’s recap, then:
- Showing the relationships between data is better than words on their own
- One diagram doesn’t fit all
- Your data has to be clean
- Once you can show meaningful data in a visually appealing way that helps people make sense of, and remember, what it is you’re trying to say, then you have it cracked
And yes, I am aware there’s not one picture in this whole article.