#iweu: The web data revolution – a new future for journalism?

November 15th, 2010 | 1 Comment | Posted by in Data, Events, Investigative journalism


David McCandless, excited about data

Rounding off Internet Week Europe on Friday afternoon, the Guardian put on a panel discussion in its Scott Room on journalism and data: ‘The web data revolution – a new future for journalism’.

Taking part were Simon Rogers, David McCandless, Heather Brooke, Simon Jeffery and Richard Pope, with Dr Aleks Krotoski moderating.

McCandless, a leading designer and author of data visuals book Information is Beautiful, made three concise, important points about data visualisations:

  • They are relatively easy to process;
  • They can have a high and fast cognitive impact;
  • They often circulate widely online.

Large, unwieldy datasets share none of those traits: they are extremely difficult and slow to process and pretty unlikely to go viral. So, as McCandless’ various graphics showed – from a light-hearted graph charting when couples are most likely to break up to a powerful demonstration of the extent to which the US military budget dwarfs health and aid spending – visualisations are an excellent way to make information accessible and understandable. Not a new way, as the Guardian’s data blog editor Simon Rogers demonstrated with a graphically-assisted report by Florence Nightingale, but one that is proving more and more popular as a means to tell a story.

David McCandless: Peak break-up times, according to Facebook status updates

But, as one audience member pointed out, large datasets are vulnerable to very selective interpretation. As McCandless’ own analysis showed, there are several different ways to measure and compare the world’s armies, with dramatically different results. So, Aleks Krotoski asked the panel, how can we guard against confusion, or our own prejudices interfering, or, worse, wilful misrepresentation of the facts?

McCandless’ solution is three-pronged: first, he publishes drafts and works-in-progress; second, he keeps himself accountable by test-driving his latest visualisations on a 25-strong group he created from his strongest online critics; third, and most important, he publishes all the raw data behind his work using Google Docs.

Access to raw data was the driving force behind Heather Brooke’s first foray into FOI requests and data, she told the Scott Room audience. Distressed at the time it took her local police force to respond to 999 calls, she began examining the stats in order to build up a better picture of response times. She said the discrepancy between the facts and the police claims emphasised the importance of access to government data.

Prior to the Afghanistan and Iraq war logs release that catapulted WikiLeaks into the headlines – and undoubtedly saw the Guardian data team come on in leaps and bounds – founder Julian Assange called for the publishing of all raw data alongside stories to be standard journalistic practice.

“You can’t publish a paper on physics without the full experimental data and results, that should be the standard in journalism. You can’t do it in newspapers because there isn’t enough space, but now with the internet there is.”

As Simon Rogers pointed out, the journalistic process can no longer afford to be about simply “chucking it out there” to “a grateful public”. There will inevitably be people out there able to bring greater expertise to bear on a particular dataset than you.

But opening up access to vast swathes of data is one thing, and knowing how to interpret that data is another. In all likelihood, simple, accessible interfaces for organising and analysing data will become more and more commonplace. For the release of the 400,000-document Iraq war logs, OWNI.fr worked with the Bureau of Investigative Journalism to create a program to help people analyse the extraordinary amount of data available.

Simply knowing where to look and what to trust is perhaps the first problem for amateurs. Looking forward, Brooke suggested aggregating some data about data: for example, a resource that could tell people where to look for certain information, what data is relevant and up to date, and how to interpret the numbers properly.

So does data – ‘the new oil’ – signal a “revolution” or a “new future” for journalism? I am inclined to agree with Brooke’s remark that data will become simply another tool in the journalist’s armoury, rather than reshape things entirely. As she said, nobody talks about ‘telephone-assisted reporting’, which was completely new once upon a time; it’s just called reporting. Soon enough, the ‘computer-assisted reporting’ course she teaches now at City University will just be ‘reporting’ too.

See also:

Guardian information architect Martin Belam has a post up about the event on his blog, currybetdotnet.

Digital journalist Sarah Booker liveblogged presentations by Heather Brooke, David McCandless and Simon Rogers.

Editor&Publisher: New AP regional investigative teams will boost CAR and data journalism

April 6th, 2010 | No Comments | Posted by in Editors' pick, Jobs, Journalism

The Associated Press (AP) is creating four regional investigative teams to support its staff across the US with “reporting and presentation resources”, in particular by using journalists with expertise in computer-assisted reporting (CAR), Flash interactives and access to public records.

“Now, any reporter in a region who has an idea for a story that requires high-level data analysis will have a partner. If an editor has an idea for a project that lends itself to an interactive map or another data-driven multimedia project, they can work with the team. When a big, breaking story happens anywhere in the country, we’ll tap the region’s I-team [the name given to the newly created teams] to begin digging into public records and inspection reports while the story is still developing, not days after the fact.”

Full story at this link…

Video: Why the Guardian is pushing for more open data

March 12th, 2010 | No Comments | Posted by in Handy tools and technology

Stephen Dunn, who heads up the Guardian’s technology strategy, talks to Beet.tv in the video below about how opening up and making better use of data can provide journalistic and business opportunities for publishers:


#Tip of the day from Journalism.co.uk – inspiration for data journalism

March 9th, 2010 | No Comments | Posted by in Top tips for journalists

Data: Looking for ways of incorporating data into your journalism? You should find this GapMinder presentation very inspirational. Tipster: Judith Townend.

To submit a tip to Journalism.co.uk, use this link – we will pay a fiver for the best ones published.

Hacks and Hackers play with data-driven news

Last Friday’s London-based Hacks and Hackers Day, run by ScraperWiki (a new data tool set to launch in beta soon), provided some excellent inspiration for journalists and developers alike.

In groups, the programmers and journalists paired up to combine journalistic and data knowledge, resulting in some innovative projects: a visualisation showing the average profile of Conservative candidates standing in safe seats for the General Election (the winning project); graphics showing the most common words used for each horoscope sign; and an attempt to tackle the various formats used by data.gov.uk.

One of the projects, ‘They Write For You’, was an attempt to illustrate the political mix of articles by MPs for British newspapers and broadcasters. Using byline data combined with MP name data, the journalists and developers created this pretty mashup, which can be viewed at this link.

The team took the 2008-2010 data from Journalisted and used ScraperWiki, Python, Ruby and JavaScript to create the visualisation: each newspaper shows a byline breakdown by party. By hovering over a coloured box, users can see which MPs wrote for which newspaper over the same two-year period.

The exact statistics, however, should be treated with some caution, as the information has not yet been cross-checked with other data sets. It would appear, for example, that the Guardian newspaper published more stories by MPs than any other title, but this could be because Journalisted holds more information about the Guardian than about its counterparts.

While this analysis is not yet ready to be transformed into a news story, it shows the potential for employing data skills to identify media and political trends.
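
For readers curious how a byline-by-party breakdown of this kind might be put together, here is a minimal Python sketch. It is not the team’s code: the records, names and party labels are hypothetical placeholders invented for illustration, and the functions `byline_breakdown` and `party_shares` are assumed names. It simply joins byline records to a table of MP names and counts articles per newspaper and party.

```python
from collections import Counter, defaultdict

# Hypothetical sample records, invented purely for illustration.
# They are not taken from Journalisted or from the project itself.
BYLINES = [
    {"author": "Alice Example", "newspaper": "Paper A"},
    {"author": "Bob Sample", "newspaper": "Paper A"},
    {"author": "Alice Example", "newspaper": "Paper B"},
    {"author": "Carol Reporter", "newspaper": "Paper B"},  # not an MP
]

# Hypothetical lookup from MP name to party.
MPS = {
    "Alice Example": "Party X",
    "Bob Sample": "Party Y",
}


def byline_breakdown(bylines, mp_parties):
    """Count MP-written articles per newspaper, grouped by party."""
    breakdown = defaultdict(Counter)
    for row in bylines:
        party = mp_parties.get(row["author"])
        if party is None:
            continue  # byline does not match a known MP name
        breakdown[row["newspaper"]][party] += 1
    return {paper: dict(counts) for paper, counts in breakdown.items()}


def party_shares(breakdown):
    """Convert raw counts into each party's share within a newspaper."""
    shares = {}
    for paper, counts in breakdown.items():
        total = sum(counts.values())
        shares[paper] = {party: n / total for party, n in counts.items()}
    return shares


counts = byline_breakdown(BYLINES, MPS)
for paper, by_party in sorted(counts.items()):
    print(paper, by_party)
```

Comparing shares within each title, rather than raw counts across titles, is one way to soften the coverage problem noted above, since a newspaper that is better represented in the source data would otherwise look more MP-heavy than it is.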

#Tip of the day from Journalism.co.uk – data books and resources

January 7th, 2010 | No Comments | Posted by in Top tips for journalists

Like data and nifty online tools? Check out these suggested ‘gifts’ from flowingdata.com. Some very handy data resources in the list at this link… Tipster: Judith Townend.

David McCandless: Odds of dying from blogging?

November 3rd, 2009 | No Comments | Posted by in Editors' pick, Multimedia

It’s 35,000,000 to 1, according to a set of graphics from InformationIsBeautiful.net (hat tip to @fionacullinan).

Screengrab of David McCandless infographic

While the blogging comparison might be slightly irreverent (particularly when viewed alongside the very real threat to bloggers in countries with limited press freedom), Google is cited as the source for this stat and the whole set gives some interesting ideas for visualising data.

Full graphics at this link…

#Tip of the day from Journalism.co.uk – ideas for data mashups

October 30th, 2009 | No Comments | Posted by in Top tips for journalists

Ideas for mashing local data: check out Kent County Council’s initiative for ideas in your local area. Tipster: Judith Townend.

#Tip of the day from Journalism.co.uk – data inspiration for stories

October 21st, 2009 | No Comments | Posted by in Top tips for journalists

Data journalism: Want to make more use of data in your stories, but don’t know where to start? Fantastic blog Flowing Data has a list of 30 starting points for finding the data you need on a range of basic subjects. Tipster: Laura Oliver.

#DataJourn: Royal Mail cracks down on unofficial postcode database

A campaign to release UK postcode data that is currently the commercial preserve of the Royal Mail (prices at this link) has been gathering pace for a while. And not so long ago in July, someone uploaded a set to Wikileaks.

How useful was this? Some wondered, the Guardian’s Charles Arthur among them.

In an era of grassroots, crowd-sourced accountability journalism, this could be a powerful tool for journalists and online developers when creating geo-data based applications and investigations.

But the unofficial release made this a little hard to assess. After all, the data goes out of date very fast, so unless someone kept leaking it, it wouldn’t be all that helpful. Furthermore, using it would be in defiance of the Royal Mail’s copyright, and so legally risky.

At the forefront of the ‘Free Our Postcodes’ campaign is Ernest Marples, the site named after the British postmaster general who introduced the postcode. Marples is otherwise known as Harry Metcalfe and Richard Pope, who – without disclosing their source – opened an API which could power sites such as PlanningAlerts.com and Jobcentre Pro Plus.

“We’re doing the same as everyone’s been doing for years, but just being open about it,” they said at the time of launch earlier this year.

But now they have closed the service. Last week they received cease and desist letters from the Royal Mail demanding that they stop publishing information from the database (see letters on their blog).

“We are not in a position to mount an effective legal challenge against the Royal Mail’s demands and therefore have closed the ErnestMarples.com API, effective immediately,” Harry Metcalfe told Journalism.co.uk.

“We’re very disappointed that Royal Mail have chosen to take this course. The service was supporting numerous socially useful applications such as Healthwhere, JobcentreProPlus.com and PlanningAlerts.com. We very much hope that the Royal Mail will work with us to find a solution that allows us to continue to operate.”

A Royal Mail spokesman said: “We have not asked anyone to close down a website. We have simply asked a third party to stop allowing unauthorised access to Royal Mail data, in contravention of our intellectual property rights.”

© Mousetrap Media Ltd.