Category Archives: Data

News organisations should get ready for data, says Martin Moore

While the individual newspapers involved in WikiLeaks’ latest military document release may be considering lessons for next time, Martin Moore from the Media Standards Trust says all news organisations should be preparing for future waves of data from such sources.

Writing on the PBS MediaShift Idea Lab, he says the ‘data dump’ process is likely to become an increasingly common method of information release as reporters and sources become more experienced in handling such material.

Soon every news organization will have its own “bunker” — a darkened room where a hand-picked group of reporters hole up with a disk/memory stick/laptop of freshly opened data, some stale pizza and lots of coffee.

He proposes five questions for news outlets to consider in preparation for processing leaked material in the best way for the reader, including how to use public intelligence to generate the most stories from material, how to personalise data for their own specific audiences and how to ensure transparency and trust in the publication of documents.

The expenses files, the Afghan logs, the COINs database (a massive database of U.K. government spending released last month) are all original documents that can be tagged, referenced and linked to. They enable journalists not only to refer back to the original source material, but to show an unbroken narrative flow from original source to final article. This cements the credibility of the journalism and gives the reader the opportunity to explore the context within the original source material. Plus, if published in linked data, the published article can be directly linked to the original data reference.

He adds that preparation will be key to securing future scoops, as “organizations that become known for handling big data sets will have more whistleblowers coming to them”.

See his full post here…

HelpMeInvestigate.com looks at campaign expenses after Goldsmith case

Crowdsourcing website HelpMeInvestigate.com has launched probes into MPs’ campaign expenses. The move follows Channel 4’s investigation into Zac Goldsmith, who is alleged to have exceeded the spending limit set for his Richmond constituency.

So far, the focus has fallen on the closely fought Edgbaston race, where Labour’s Gisela Stuart held her seat with a reduced majority of 1,274, but investigations have also begun in other Birmingham constituencies and in Brighton.

Posting on the HelpMeInvestigate.com blog, the site’s founder Paul Bradshaw said he was undertaking the investigation after Goldsmith and the Conservative Party claimed they were justified in accounting only for election materials that were used in the campaign, and not for materials that went unused because they had become out of date.

“We want to see if this is true. Are other candidates not claiming for the expense of ‘unused’ materials? Or is Goldsmith an exception?” writes Bradshaw.

“We’ve started one investigation in Birmingham but would really welcome sister investigations in other towns and cities.”

The website is currently in beta testing, meaning new users can only access the site after requesting an invite.

HelpMeInvestigate on campaign expenses at this link.

Survey attempts to track the changing skills of online journalists

We know that many journalists today aim to have a finger in every multimedia pie – a ‘print’ journalist wants to understand how to communicate by video or audio, while online reporters should be prepared to build and manage online communities.

The Online Journalism Review is running a simple survey to measure this changing skillset of modern-day online journalists.

A few points, before we get to the vote: First, I’m just going to assume that everyone’s got basic reporting, text writing and copy editing, so those aren’t listed as options. Next, I do not wish to infer that everyone needs to develop all of these skills. Many journalists continue to work in newsrooms where they are expected to specialize. And even independent journalists often can rely on networks, contractors, vendors and open source solutions to cover many of their publishing needs. So if you don’t want help with a particular skill, just leave the box next to it blank.

But the more skills you develop, the more freedom and flexibility you have as a journalist in the online publishing market. I know personally OJR readers who’ve mastered each of the skills listed below, so if you do want to add more to your journalism repertoire, your fellow readers have the capacity to help.

The results already make for interesting reading, with the growing importance of good images and strong communities online reflected in the statistics – so far these are rated the top two skills mastered by journalists during their careers.

See the full post here…

BBC News redesign architect gets technical about changes

If you are more interested in the cogs and wheels behind the BBC News site’s redesign than the end product, a post by their chief technical architect John O’Donovan this week should be of interest.

The BBC has one of the oldest and largest websites on the internet and one of the goals of the update to the News site was to also update some of the core systems that manage content for all our interactive services.

O’Donovan first outlines the reasoning behind keeping a Content Production System (CPS), rather than moving over to a Content Management System (CMS), before giving a detailed look at the latest model – version 6 – that they have opted for.

The CPS has been constantly evolving and we should say that, when looking at the requirements for the new news site and other services, we did consider whether we should take a trip to the Content Management System (CMS) Showroom and see what shiny new wheels we could get.

However there is an interesting thing about the CPS – most of our users (of which there are over 1,200) think it does a pretty good job [checks inbox for complaints]. Now I’m not saying they have a picture of it next to their kids on the mantelpiece at home, but compared to my experience with many organisations and their CMS, that is something to value highly.

The main improvements afforded by the new version, according to O’Donovan, include a more structured approach, an improved technical quality of content produced and an ability to use semantic data to define content and improve layouts.

See his full post here…

Hacks and Hackers look at health, education and leisure

Online journalism expert Paul Bradshaw has written a detailed post on his experiences of a recent Hacks and Hackers day in Birmingham organised by ScraperWiki, experiences which he claims will “challenge the way you approach information as a journalist”.

Talking through the day’s events, Bradshaw observes how journalists had to adapt their traditional skills for finding stories.

Developers and journalists are continually asking each other for direction as the project develops: while the developers are shaping data into a format suitable for interpretation, the journalist might be gathering related data to layer on top of it or information that would illuminate or contextualise it.

This made for a lot of hard journalistic work – finding datasets, understanding them, and thinking of the stories within them, particularly with regard to how they connected with other sets of data and how they might be useful for users to interrogate themselves.

It struck me as a different skill to that normally practised by journalists – we were looking not for stories but for ‘nodes’: links between information such as local authority or area codes, school identifiers, and so on. Finding a story in data is relatively easy when compared to a project like this, and it did remind me more of the investigative process than the way a traditional newsroom works.

His team’s work led to the creation of a map pinpointing all 8,000 GP surgeries around the UK, which they then layered with additional data, enabling them to view issues geographically.
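The ‘nodes’ Bradshaw describes are shared identifiers, such as an area code, that let one dataset be layered on top of another. As a rough illustration only, here is a minimal Python sketch of that idea. The records, field names and the layer_by_key helper are all invented for this example and are not the team’s actual data or code.

```python
def layer_by_key(base_records, extra_by_key, key):
    """Attach extra fields to each base record whose key value has a match.

    Records with no match are kept unchanged, so gaps in the extra data
    stay visible instead of silently dropping rows.
    """
    layered = []
    for record in base_records:
        merged = dict(record)  # copy so the input list is not modified
        merged.update(extra_by_key.get(record.get(key), {}))
        layered.append(merged)
    return layered


# Hypothetical surgery records, each carrying an area code (the 'node').
surgeries = [
    {"name": "Surgery A", "area_code": "X1"},
    {"name": "Surgery B", "area_code": "X2"},
    {"name": "Surgery C", "area_code": "X9"},
]

# Hypothetical second dataset, indexed by the same area code.
area_data = {
    "X1": {"authority": "Authority One"},
    "X2": {"authority": "Authority Two"},
}

layered = layer_by_key(surgeries, area_data, "area_code")
for row in layered:
    print(row)
```

Here the third surgery has no matching area code, so it passes through without an authority field. Missing links like that are exactly the kind of thing a journalist would then need to chase up by hand.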

See his full post here…

‘If you could see my desk, you’d weep’: Santa Fe reporter trawls data for wealth story

Here’s a great example of the value in data for journalists.

The Reynolds Center for Business Journalism ran a feature on Corey Pein, a journalist for the Santa Fe Reporter, who spent two weeks working through raw data to compile a list of the wealthiest residents on his patch.

His resources included property records, nonprofit tax returns, donor lists, private aircraft registrations and court records.

If you could see my desk, you’d weep over the messes of paper I create for those feature-length stories.

His final story won him first place in the AltWeekly Awards for Innovation/Format Buster.

The publisher has used the public database site Socrata to create five searchable online databases from the information Pein’s work uncovered.

Is Facebook falling out of favour?

Newspaper sites are more popular with internet users than social networks such as Facebook and Myspace, according to the annual American Customer Satisfaction Index (ACSI).

The statistics, which were also quoted in a Washington Times post, show online news outlets topped the tables with satisfaction scores of 82 per cent (FoxNews.com), 77 per cent (USAToday.com) and 76 per cent (NYTimes.com).

By comparison, social networking site Facebook achieved just 64 per cent, while Myspace was even lower at 63 per cent.

According to the report, this puts Facebook in the bottom 5 per cent of all measured private sector companies.

Wikipedia claimed the highest social media rating with 77 per cent, while YouTube achieved 73 per cent. Search engines also outperformed social media, with Google receiving 80 per cent, closely followed by Bing at 77 per cent.

This was the first time social media websites have been measured by ACSI, which pulls data from interviews with around 70,000 customers.

I’m a journalist – should I learn programming?

Many reporters are starting to move on from the world of HTML or CSS coding and getting to grips with more technical programming knowledge.

But web development isn’t for everyone, so how do you know if it will be right for you? Using some trusty know-how and specially selected questions, digital journalist Mark Luckie has tried to help reporters answer that very question.

His flowchart is hosted on his 10,000 Words blog.

Online innovator to leave university post after ‘complicated decision’

Online journalism innovator Paul Bradshaw has taken voluntary redundancy from his post as course leader for the online journalism MA at Birmingham City University, in what he says was a “complicated decision”.

Bradshaw, who is also founder of the Online Journalism Blog, hopes he can now invest more time in his own projects, with immediate plans to develop his Help Me Investigate site.

“It was a very complicated decision,” he told Journalism.co.uk. “There are a lot of opportunities around data journalism that I want to explore and I want to spend more time on Help Me Investigate. I felt it was probably the right time to dive in to more of those opportunities and now I have time to accept offers I have been made. But I am wary of taking too much work on. Part of the point is to invest more time in Help Me Investigate. I plan to start some development work and explore business models soon.”

Bradshaw is also working on two books: his own on magazine editing, which is set to be completed by the end of the year, and another on online journalism, to which he is contributing alongside former FT.com news editor Liisa Rohumaa and which is likely to be out by early next year.

On top of all that, he admits he may keep his toes in the teaching pool.

“I will certainly miss parts of teaching,” he told Journalism.co.uk. “I absolutely, enormously enjoyed teaching the students this year. Some of their work has been the best so far. I may still do a bit of teaching, but I think I have always wanted to keep growing and developing. The students say they are gutted, but they were quite excited and positive about what I am doing. I am experiencing a huge jumble of emotions. I am excited about the possibilities but I am really going to miss the students and staff.”

CJR and the Texas Tribune: Is data both journalism and a business?

The Columbia Journalism Review takes an in-depth look at news start-up the Texas Tribune, which launched in November last year “billing itself not only as an antidote to the dwindling capitol press corps but also as a new force in Texas political life”. CJR considers how sustainable the venture is editorially and commercially:

The Tribune’s biggest magnet by far has been its more than three dozen interactive databases, which collectively have drawn three times as many page views as the site’s stories (…) The Tribune publishes or updates at least one database per week, and readers e-mail these database links to each other or share them on Facebook, scouring their neighborhood’s school rankings or their state rep’s spending habits. Through May, the databases had generated more than 2.3 million page views since the site’s launch

Full story on CJR…