Image by Abron on Flickr. Some rights reserved
Data journalism is not a new phenomenon. Speaking at the International Journalism Festival in Perugia, Steve Doig of the Walter Cronkite School of Journalism illustrated this by describing how the rise of the personal computer in the early 1980s helped journalists track “patterns” in the data they were getting hold of.
Before this technology arrived, such reporting was “often based simply on anecdotes”, he said. Giving the example of covering “the problem of drunk driving”, he explained that journalists would previously have had to cite a “bad example of such an accident” before moving on to discuss the “larger problem”.
The nice thing about data journalism is it lets you go beyond anecdotes to evidence.
His workshop ran through some of the key features of Excel to help journalists sort, filter, “transform” and “summarise” data.
Below is a summary of some of the key points he raised – the full tutorial is available online.
- Sorting, filtering, transforming and summarising data with Excel
Doig said data most commonly arrives in a format that “tends to be alphabetical”, which does not make it immediately clear to a journalist what the story, or stories, behind the data are.
Journalists therefore want to arrange the data in a way that is “more journalistically interesting”, Doig said. As an example, he demonstrated how journalists can sort numbers from highest to lowest, or the reverse.
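The workshop used Excel, but the same idea can be sketched in a few lines of Python. This is only an illustration: the place names and burglary counts below are invented, and the variable names are hypothetical.

```python
# Hypothetical rows: (city, reported burglaries). All names and figures are invented.
rows = [
    ("Ashford", 120),
    ("Brinton", 45),
    ("Carley", 310),
    ("Dunmere", 87),
]

# The rows start in alphabetical order by city.
# Re-sort them by the count, highest first, so the extremes stand out.
by_count_desc = sorted(rows, key=lambda row: row[1], reverse=True)

for city, burglaries in by_count_desc:
    print(f"{city}: {burglaries}")
```

Dropping `reverse=True` would sort from lowest to highest instead.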
When it comes to filtering data, he described some particularly large datasets as “forests”, adding that journalists “only want to see the trees that we’re interested in”.
Using Excel, journalists can hide the data they are less interested in and keep their work area tidy.
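As a rough Python equivalent of an Excel filter, the sketch below keeps only the rows that match one value and leaves the full dataset untouched. The records, column names and the `filter_rows` helper are all hypothetical, made up for this example.

```python
# Hypothetical records; every name and figure here is invented.
records = [
    {"city": "Ashford", "offence": "burglary", "count": 120},
    {"city": "Ashford", "offence": "robbery", "count": 30},
    {"city": "Brinton", "offence": "burglary", "count": 45},
    {"city": "Carley", "offence": "robbery", "count": 64},
]


def filter_rows(rows, column, value):
    """Return only the rows whose `column` equals `value`.

    The original list is not modified, much as an Excel filter
    hides rows rather than deleting them.
    """
    return [row for row in rows if row[column] == value]


burglaries = filter_rows(records, "offence", "burglary")

for row in burglaries:
    print(row["city"], row["count"])
```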
Journalists can also use Excel to “transform data using functions and formulas”. For example, he showed the delegates how to create new variables, such as working out a crime rate per 100,000 people when you already have statistics on population and crime. This then helps the journalist “make fair comparisons between places of different size”.
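The rate calculation is simple arithmetic: divide the number of crimes by the population and multiply by 100,000. A minimal Python sketch follows, using two invented places to show why the rate matters. The figures and the `rate_per_100k` helper are assumptions for illustration, not numbers from the workshop.

```python
def rate_per_100k(crimes, population):
    """Return the number of crimes per 100,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return crimes * 100_000 / population


# Hypothetical places; all figures are invented.
places = [
    {"place": "Ashford", "crimes": 500, "population": 250_000},
    {"place": "Brinton", "crimes": 90, "population": 30_000},
]

# Add the new variable to each row, as a formula column would in Excel.
for place in places:
    place["rate"] = rate_per_100k(place["crimes"], place["population"])
    print(f"{place['place']}: {place['rate']:.1f} per 100,000")
```

In this made-up example the larger place records more crimes in total (500 against 90), but the smaller place has the higher rate (300 per 100,000 against 200).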
Finally, you can “collapse your data down by categories”. This can be achieved by using pivot tables, which enable users to select certain variables and bring them together.
For example, if you wanted to look at the number of murders by region, but the data is also broken down into smaller geographic areas, you could build a pivot table, select the ‘region’ variable in ‘row labels’ and put the column stating the number of murders in ‘values’. This would total the number of murders for each region.
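What the pivot table does in that example is group the rows by region and sum the murders within each group. The Python sketch below shows that step under stated assumptions: the regions, districts and counts are invented, and this is not how Excel implements pivot tables internally.

```python
from collections import defaultdict

# Hypothetical district-level rows; all names and figures are invented.
rows = [
    {"region": "North", "district": "A", "murders": 3},
    {"region": "North", "district": "B", "murders": 5},
    {"region": "South", "district": "C", "murders": 2},
    {"region": "South", "district": "D", "murders": 0},
    {"region": "South", "district": "E", "murders": 4},
]

# 'region' plays the part of the row label; 'murders' is the value being summed.
totals = defaultdict(int)
for row in rows:
    totals[row["region"]] += row["murders"]

murders_by_region = dict(totals)

for region, murders in murders_by_region.items():
    print(f"{region}: {murders}")
```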
- Data stories are not only for economics or business journalism
Here is just a selection of the different types of data story subjects Doig highlighted:
– Budgets and taxes
– Crime patterns
– School test scores
– Auto accidents
– Demographic change
– Pet licences
– Air quality
– Sports statistics
- A simple toolbox can get you far when you are starting out
Highlighting some of the key tools for working with datasets, Doig said Excel lets journalists do the majority of the work they would need to, supported by database software like Access, mapping tools like ArcMap, a text editor and social network analysis plug-ins such as NodeXL.
And when it comes to visualising the data, he pointed to data journalism staple Google Fusion Tables, as well as coding languages and frameworks such as Ruby, Django, Perl and Python.
- Tap into industry resources
Doig recommended a number of outlets and online platforms offering industry expertise on data journalism:
– Data Journalism Handbook
– EJC
– NICAR
– Investigative Reporters and Editors
– SKUP
– Global Investigative Journalism Network