Posts

Showing posts from April, 2017

Just a little glimpse of a data story I’ll tell ’Bout a North Country singer that you all know well

Screenshot of the Tableau Dashboard. Available [here] and at the end of this post.

A dream of Bob Dylan

When I was a kid, my parents wanted to ensure that I’d do as well as I possibly could in exams and get a place in University. To this end they hired a personal tutor to give me additional tuition in a variety of subjects that I kinda sucked at.* That’s why they hired JS. The routine was always the same … JS would come to my house and I’d attempt to feign interest and proficiency in my coursework for an hour or so, a couple of times a week.

One evening I was listening to the radio when he came over and his first words were ‘What’s that on the radio?’ … well, no, actually … he didn’t say that at all … his language was peppered with obscenities and, when I said it was just something from the charts, he only got more agitated. Lessons were abandoned for the evening as I was treated to a tirade on the poor quality of what made the charts (it was the mid 80s … he wasn’t wrong)...

OpenRefine – an experiment in data cleaning

Photo by r2hox (Flickr/Creative Commons)

In a recent blog post on Northern Ireland’s Renewable Heat Incentive (RHI) scandal [here] I spent quite a bit of time recording all of the changes, tweaks, and decisions I had to make to get the data into a usable format. With any dataset it is important to understand the transformations that went into bringing it to its final form. If other researchers are unable to follow your process and consistently achieve the same results from the same dataset, it brings your analysis into question. Beyond that, it brings the whole endeavour of data science and data analysis into disrepute. If you can’t rely on the figures to tell a consistent story, you can’t make consistent decisions, and you can’t gain reliable insights. You certainly can’t trust the folks who are furnishing you with this flawed and unreliable nonsense. If you can’t rely on the information you’re seeing on your dashboard, what is it other than a collection of interesting, but meaningles...