Data visualisation: Seeing the full picture
First of all, I would like to say that seven years of scientific training have almost rendered me unable to write in the first person, so please be patient with the following and I will do my best to leave out the P-values.
Ironically, I would like to write about data visualisation without presenting any graphs to you. This may be a challenge, as I can already feel myself tempted to reach for the mouse to open my stats package. This topic might at first seem a tad on the academic side, but the truth is we all know this subject well, as we have all long been trained in drawing the ‘x’ and ‘y’ axes.
Handling the data becomes easier
‘Data’ is the most fashionable word in science at the moment, and those who are truly able to work with it are considered something of a unicorn in the job market. The technology that generates the numbers, along with the infrastructure and IT capability behind it, has progressed quickly, and most industries have been left lagging behind when it comes to the skills to handle the output. The same can even be said of academia, with some researchers relying on a statistician for help with their analyses. Consider agriculture at the farm level: dairy parlours are able to measure and churn out all kinds of interesting numbers, from milk yields to lameness scores and even electrical conductivity readings of milk, but what are we doing with them? I once spent months collecting data from dairy parlour software and putting it into Excel. ‘Time consuming’ was an understatement, and I really developed a taste for coffee during that period of my life, between milking in the morning and data farming in the afternoon. However, the possibilities with such large data sets are really quite exciting, and the good news is that the technology continues to be one step ahead of us: no longer do we have to copy hundreds of numbers from one file to another.
Milking robots are generating large amounts of data. But how to deal with it and what to conclude from large datasets? Photo: Ton Kastermans
Take a second thought
Often the first thing that a person does when presented with a large set of numbers is summarise the descriptive statistics. We go straight for the average, the min, the max, and the standard deviation. In doing so we are potentially missing a trick. Of course these descriptive numbers are important, as they always will be, but the larger data sets that we have access to today can often have characteristics that are hidden by our usual approach. What I am suggesting here is not that everyone should study for a master's in statistics, but that you take a second thought. The human brain instinctively looks for patterns, so play to your strengths: put the quality data on a control chart, map the variation by country and check the shape of the distribution. If you cannot, then find someone who can, or who has the time to do so, as a growing number of people are able. Who knows what decisions we would make differently if we could see the full picture?
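To make the point concrete without reaching for a chart, here is a minimal Python sketch. The two herds and their milk yields are invented purely for illustration, and the function names are hypothetical. Both herds share the same mean and standard deviation, yet a simple text tally suggests they are distributed quite differently.

```python
from collections import Counter
from statistics import mean, pstdev

# Hypothetical daily milk yields (litres per cow) for two made-up herds.
herd_a = [26, 30, 30, 30, 30, 30, 30, 34]
herd_b = [28, 28, 28, 28, 32, 32, 32, 32]


def summarise(values):
    """Return the usual descriptive statistics as a dict."""
    return {
        "mean": mean(values),
        "sd": pstdev(values),  # population standard deviation
        "min": min(values),
        "max": max(values),
    }


def tally(values):
    """Return a text tally: one line per distinct value, one '#' per observation."""
    counts = Counter(values)
    return "\n".join(f"{v:>3} {'#' * counts[v]}" for v in sorted(counts))


for name, herd in (("Herd A", herd_a), ("Herd B", herd_b)):
    print(name, summarise(herd))
    print(tally(herd))
```

In this made-up example, herd A is mostly cows at 30 litres with one low and one high yielder, while herd B splits into two distinct groups. The mean and standard deviation alone would not tell them apart.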