Getstats: 12 ‘number hygiene’ rules for journalists in full

February 1st, 2012 | No Comments | Posted in Data, Training

A campaign launched by the Royal Statistical Society has proposed 12 “rules of thumb for journalists” in order to encourage a better understanding of numbers in news.

Getstats is also calling for numeracy and statistics to be taught in journalism schools.

More details and a 12-point summary are at this link.

The full 12 rules of “number hygiene” for journalists are below:

1. You come across a number in a story or press release. Buyer beware. Before making it your own, ask who cooked it up; what are their credentials; are they selling something. What other evidence do we have (what numbers are they not showing us?); why this number, now? If the number comes from a study or research, has anyone reputable said it is any good?

2. Sniff around. Do the numbers refer to a whole group of people or things or a sample of them? If it’s a sample, are the people being questioned or the things being referred to a fair representation of the wider group? Say a company is claiming something applies to the population at large. If it is basing the story on a sample, such as a panel of internet users the company goes back to time and again, then beware: the panel may not be representative.

3. More probing. What was the sample asked? The wording of a question can hugely influence the answer you get. People’s understanding of what it means to ‘be employed’ or the nature of ‘violent crime’ may differ. What the public understands may not match the survey researcher’s idea. In government surveys bigamy was till recently classed as a violent crime. Might researchers’ choice of words have led people into a particular response?

4. One number is often used to sum up the group being measured, the average. But different averages measure different things. The mean is extremely sensitive to highs and lows: the very fact of Bill Gates coming to live in the UK would push up mean wealth. The median tells us, for example, the income of an average person – half the population get less, half more. When comparing earnings, the mode tells us the most common salary.

5. There is a lot of uncertainty about. We need to be sure the number on offer is a genuine result and not just due to chance. With a sample, check the margin of error, the plus or minus 3 per cent figure, usually stated by reputable polling companies. A poll saying 52 per cent of people are in favour of something is not definitively saying more than half are in favour: it could be 49 per cent. Beware league tables, except in sports reports. Chelsea is higher than Arsenal for a simple and genuine reason: the side has collected more points. With hospitals or schools, a single score is never likely to be a valid basis for comparison (a teaching hospital may appear to have a worse score, but only because sicker patients are referred to it). Comparisons between universities or police forces are unreliable if the scores fall within margins of error. Midshires scores 650 on the ranking and Wessex 669: they could be performing at the same level or their respective positions could be reversed.

6. The numbers you are given show a big increase or sharp decrease. Yet a single change does not mean a trend. Blips happen often. Blips go away, so we have to ask whether the change in the numbers is just a recovery or return to normal after a one-off rise or fall (what statisticians refer to as ‘regression to the mean’). The numbers may come from a survey, like (say) ONS figures for household spending or migration. Is the change bigger than the margin of error?

7. Unless researchers carried out a controlled experiment (such as a trial of a new drug, based on a randomly chosen group, some of whom don’t know they are getting a placebo), it’s very difficult to state confidently that a causes b. Instead, the numbers may show an association (a correlation) between two things, say obesity and cancer. Beware spurious connections, which may be explained by a third or background factor. If use of mobile phones by children is associated with later behavioural disorders, the connection could be the parents, and the way their behaviour affects both things. If the numbers suggest an association, the important thing is to assess its plausibility, on the back of other evidence. Finding a link can stimulate further study, but can’t itself be the basis for some new government policy. Recommendations for changing daily behaviour such as eating should not be based on speculative associations between particular food and medical conditions.

8. A key question for any number is ‘out of how many?’ Some events are rare, such as the death of a child. That’s why they are news, but that’s also why they deserve to be put in context. Noting scarcity value is the way to report the significance of an event. An event’s meaning for an individual or family has to be distinguished from its public importance.

9. Billions and millionths are too big and too small to grasp. We take figures in if they are humanized. One way is comparing with, say, the whole UK; another is to plot the effect on an individual. Colourful comparisons can make risk intelligible: the risk of dying while being operated on under a general anaesthetic is on average the same as the risk of being killed while travelling 60 miles on a motorbike.

10. Good reporting gives a balanced view of the size of the numbers being reported. Better to focus on the most likely number rather than the most extreme, for example in stories about the effects of a flu pandemic. ‘Could be as high as’ points to an extreme; better to say ‘unlikely to be greater than’. Numbers may be misperceived so try to eliminate bias.

11. Risk is risky. ‘Eating bacon daily increases an individual’s lifetime risk of bowel cancer by 20 per cent.’ Another way of saying that is: out of 100 people eating a bacon sandwich every day one extra person will get bowel cancer. Using the first without noting the second tells a story that is both alarmist and inaccurate. If the information is available, express changes in risk in terms of the risks experienced by 100 or 100,000 people.

12. The switch from print to digital brings opportunities to present numbers more dynamically and imaginatively, for example in scatter plots. Graphics can show a trend. Stacked icons in graphs can show effects on 100 people. But the same rules of thumb apply whatever the medium: is the graphic clear; does it tell the story that is in the text?
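
Rules 4, 5 and 11 involve arithmetic that is easy to check. The Python sketch below is an illustration rather than part of the Getstats guidance: the salary figures and the poll sample size of 1,000 are invented, the margin-of-error formula assumes a simple random sample at 95 per cent confidence, and the baseline risk of 5 in 100 is inferred from the bacon example (a 20 per cent rise that adds one case per 100 people implies a starting risk of about 5 in 100).

```python
import math
import statistics

# Rule 4: three "averages" of the same made-up salaries (in £ thousands).
salaries = [18, 22, 22, 25, 30, 41, 250]
mean_salary = statistics.mean(salaries)      # pulled up by the one very high earner
median_salary = statistics.median(salaries)  # half earn less, half earn more
mode_salary = statistics.mode(salaries)      # the most common salary


# Rule 5: approximate 95% margin of error for a poll share,
# assuming a simple random sample (real polls are usually weighted).
def margin_of_error(share, sample_size, z=1.96):
    return z * math.sqrt(share * (1 - share) / sample_size)


poll_moe = margin_of_error(0.52, 1000)  # a sample of 1,000 is an assumption


# Rule 11: turn a relative increase in risk into extra cases per 100 people.
def extra_cases_per_100(baseline_per_100, relative_increase):
    return baseline_per_100 * relative_increase


extra = extra_cases_per_100(5, 0.20)  # baseline of 5 in 100 is inferred, not stated

print(f"mean {mean_salary:.1f}, median {median_salary}, mode {mode_salary}")
print(f"52% poll, n=1000: plus or minus {poll_moe * 100:.1f} points")
print(f"extra cases per 100 people: {extra:.0f}")
```

With these made-up salaries the single high earner lifts the mean to about 58 while the median stays at 25, and a 52 per cent poll result from 1,000 respondents carries a margin of roughly three points either way.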


Tool of the week for journalists: Tableau Public, for data visualisations

Tool of the week: Tableau Public

What is it? A data visualisation tool, allowing you to create interactive graphs, charts and maps.

How is it of use to journalists? Tableau Public is a free tool that allows journalists to upload an Excel spreadsheet or text file and turn the data into an interactive visualisation that you can embed on your news site or blog.

Here are five examples of how Tableau has been used by news sites to tell stories. A quick browse will give you a sense of how the tool can be used to explain news stories.

One of Tableau’s real strengths is giving the reader the opportunity to move a slider or select from a drop-down menu and see how the visualisation alters when a variable changes.

In order to create a visualisation you will need a PC (or a Windows environment on your Mac) and to download the free software.

I was able to upload an Excel file and in less than two minutes had produced a map showing what are predicted to be the most populous countries in 2100.

I had previously used this data set to create a visualisation in Google Fusion Tables and Tableau was equally easy to navigate.

For those who have not tried creating data visualisations, Tableau requires no technical ability and is easier to use than the wizard options that allow you to create graphs in Excel.

There are options for sorting and reordering data, plus changing the colours and view options.

Tableau also has a paid-for option. The difference between the free tool and the premium option is that Tableau Public requires you to publish your visualisation to the web.

Tableau launched version 7.0 a couple of weeks ago and will soon be adding functionality allowing you to create a map using UK postcodes, according to Ross Perez, data analyst at the US-based company.

Disclaimer: Tableau Public is a sponsor of the Journalism.co.uk-organised conference news:rewired. This relationship did not influence this review.


#Tip of the day from Journalism.co.uk – publishing data online

December 15th, 2011 | No Comments | Posted in Data, Top tips for journalists

On the Help Me Investigate blog, founder Paul Bradshaw outlines four ways data can be published online, which he says can be done “either for others to see the raw material, or to invite them to help you explore it”. His tips include using platforms such as Google Docs or BuzzData. Read the full post here.

Tipster: Rachel McAthy

If you have a tip you would like to submit to us at Journalism.co.uk email us using this link – we will pay a fiver for the best ones published.


#Tip of the day from Journalism.co.uk – using spreadsheets for data stories

December 7th, 2011 | No Comments | Posted in Data

Poynter has a helpful lesson in Excel and other spreadsheet software for journalists dealing with data.

The post explains how to split names in a single column into two columns, for example.

Poynter’s post on how journalists can use Excel to organise data for stories is at this link.
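
For readers who would rather script the name-splitting step than do it in a spreadsheet, here is a minimal Python sketch. It is not taken from Poynter's post: the function name `split_name` and the sample names are hypothetical, and it assumes each cell holds a name with the surname last.

```python
# Split "First Last" strings into two fields, much as a spreadsheet
# "text to columns" step would. The names are examples only.
def split_name(full_name):
    # Split on the last space so middle names stay with the first name.
    first, _, last = full_name.strip().rpartition(" ")
    if not first:  # single-word name: keep it in the first column
        return last, ""
    return first, last


names = ["Jane Smith", "Mary Ann Jones", "Cher"]
rows = [split_name(name) for name in names]

for first, last in rows:
    print(f"{first!r:15} {last!r}")
```

Splitting on the last space is a design choice that suits most English-language bylines; names with multi-word surnames or suffixes would still need checking by hand.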

There will be a workshop on data journalism – led by Simon Rogers, editor of the Guardian’s Datastore and Datablog – at Journalism.co.uk’s news:rewired – media in motion conference for journalists. The news:rewired agenda is at this link.

Tipster: Sarah Marshall


Guardian study finds just 22.6% of journalists are female

December 6th, 2011 | No Comments | Posted in Data, Journalism

The New York Times newsroom in 1942. By Marjory Collins [Public domain], via Wikimedia Commons

The Guardian today published the findings from its research into gender in the press, based on “a simple count of newspaper bylines” and those appearing on the Today programme on Radio 4.

The bylines were said to have been taken from articles published in a total of seven newspapers from 13 June to 8 July. The Guardian reports that the research, led by Kira Cochrane, found that women journalists accounted for just 22.6 per cent of bylines, as opposed to 77.4 per cent for male reporters.

National papers were all shown to have large gender gaps in byline averages. The Daily Mail and the Guardian recorded the lowest male dominance at 68 per cent male and 72 per cent male respectively.

In its ever-open approach to data the Guardian has made all the data available as a downloadable spreadsheet and is asking its audience to get involved by posing the question: “What can you do with this data?”

Read more here.

Research published earlier this year, commissioned by the Women in Journalism group, found that almost three quarters of journalists working in the national press were male.


#Tip of the day from Journalism.co.uk – try ScraperWiki’s new screencasts

December 5th, 2011 | No Comments | Posted in Data, Top tips for journalists

Data journalists and anyone interested in the field should take a look at ScraperWiki’s new screencasts.

There’s a link to them on Nicola Hughes’ DataMinerUK blog.

Tipster: Sarah Marshall


What’s happening to mark open data day

December 2nd, 2011 | No Comments | Posted in Data, Events

The use of open data in our newsrooms has been growing in the past few years and many people believe that the future of data journalism relies on the collaboration between developers, designers and journalists to create better ways of extracting information from open datasets.

Tomorrow (3 December) is International Open Data Day and there is a series of worldwide events set up to gather coders, programmers and journalists around “live hacking” challenges.

International Open Data Hackathon

Where? The Barbican in London and around the world

When? Saturday, 3 December from 11am

Better tools. More Data. Bigger Fun. That’s how the 2011 Open Data Day Hackathon describes this year’s global event, taking place in more than 32 countries this weekend.

For journalists, it’s an occasion to give hacking a go and meet people from the world of data.

The past year has seen open data continue to gain traction around the world with new open data catalogues launched in Europe, North America and Africa and more data available from organisations such as the World Bank.

Open Data Day is a gathering of citizens in cities around the world to write applications, liberate data, create visualisations and publish analyses using open public data. Its aim is to show support for and encourage the adoption of open data policies by the world’s local, regional and national governments.

Join the Open Knowledge Foundation and CKAN at the Barbican tomorrow (Saturday, 3 December) as they assemble a “crack-team” of coders to break data out of its internet prisons and load it into the Data Hub.

For details about the event, see this blog post, and sign up on the event’s meetup page or by filling out the event’s Google form.

Participants will be on IRC and will also be using the hashtags #seizedata and #odhdLDN on Twitter. All journalists, data scrapers, coders and #opendata enthusiasts can join.

David Eaves, the organiser of this year’s Open Data Hackathon, believes this event is a great opportunity to teach journalists, as well as the general public, how to tackle data on a day-to-day basis:

It’s a Maker Faire-like opportunity for people to celebrate open data by creating visualisations, writing up analyses, building apps or doing whatever they want with data.

What I do want is for people to have fun, to learn, and to engage those who are still wrestling with the opportunities around open data … And we’ve got better tools. With a number of governments using Socrata there are more APIs out there for us to leverage. ScraperWiki has gotten better and new tools like BuzzData, the Data Hub and Google’s Fusion Tables are emerging every day.

Who’s it for? Everyone. David Eaves says:

If you have an idea for using open data, want to find an interesting project to contribute towards, or simply want to see what’s happening, then definitely come along.

You can also check out the HackFest 2011 topic page on BuzzData.

London “Random Hacks of Kindness” event

Where? @Forward in London, and around the world

When? 3-4 December 2011, from 9am Saturday until 6pm Sunday

Starting on the same day as the Open Data Hackathon, the Random Hacks of Kindness’ Codesprint will gather thousands of experts in 25 countries to develop open tech solutions over two days of hacking challenges.

The unprecedented gatherings in collaboration with Google, Microsoft, Yahoo!, NASA, HP and the World Bank will bring together some of the world’s most innovative social enterprises and volunteer technologists.

London’s event promises to be exciting as over 100 tech heads will gather to tackle one issue: financial exclusion and illiteracy. It will be the first ever hack day addressing this theme.

Financial and enterprise education group MyBnk will head a panel of CEOs and IT specialists from LSE, Morgan Stanley, Fair Finance, Three Hands, Toynbee Hall and the Forward Foundation to make major advances in helping young people master money management.

Mike Mompi, head of strategy and innovation at MyBnk and the organiser of the London RHoK event, says:

The main objectives of the weekend are problem solving, capacity building, partnerships, and impact.

A £500 cash prize will be given at the end of Sunday for the winning solution (among other prizes) and several media organisations, including The Huffington Post, will be joining in.

People from RHoK have hosted three global events to date, in 31 cities around the globe with over 3,000 participants. Past events resulted in apps and alert systems to warn people of bushfires in Australia and to direct recipients of food stamps to sources of fresh produce in Philadelphia.

The RHoK community is open for anyone to join.

If you want to get an idea of what’s in store for this weekend, check out last year’s hackathon videos.

You will be able to follow the event on Twitter @RHoKLondon and the hashtag #rhokLDN. It is still possible to sign up for this weekend’s free event via this link.


How open data has changed journalism

December 2nd, 2011 | 1 Comment | Posted in Comment, Data

Tomorrow (Saturday 3 December) is International Open Data Day. We have been asking what the open data movement has done for journalism.

Simon Rogers, editor of the Guardian’s Datablog and Datastore – @smfrogers

It’s only been a couple of years and you could argue that open data has changed the world: Wikileaks, government spending, what we know about the riots… The irony is that the governments behind much of this data have only contributed the numbers; the hard work has been done by an army of developers and data journalists who have created stories and new ways of telling them. When we started the Datablog in 2009, we thought it would be popular only with developers; now everyone wants to know the facts behind the news.

Nicola Hughes – @DataMinerUK

It’s the knowing how to use it that’s vital. It’s a (re)source.

Borja Bergareche – @borjabergareche

It’s helped us do the best journalism of the 20th century in the 21st century.

Lucy Chambers, community coordinator, Open Knowledge Foundation – @lucyfedia

Evidence-based journalism. Journalists will back up stories, readers will expect to be able to verify facts.

Andrew Gregory – @andrew__gregory

Open data is useful. But original journalism also requires good human sources.

Rune Ytreberg – @ytreberg

The obvious: Open data has made journalism more transparent.

Megan Cunningham – @megancunningham

Open data has accelerated the opportunities for crowd sourced investigative journalism. But the potential hasn’t been realised.

Harriet Minter – @Harriet_Minter

It’s forced journalists to embrace spreadsheets, brought interactives to the forefront and given us many bad infographics.

Greg Hadfield – @GregHadfield (who is organising the UK’s first open-data cities conference)

Data – whether open or not – has always fuelled journalism. Data that is increasingly “open” (in the fullest sense of the term) will transform journalism.

Ironically, a lot of the best revelatory journalism of the past has depended on journalists unearthing data (ie “stuff”) that others want to keep locked. Ideally, by lawful means. Therefore, openness may remove some of the mystique that journalists delight in, as people who know things that only those “in the know” know.

An open-data tsunami will mean that more journalism will be about interpreting – and putting into context – data that is open to all, at least in its rawest, unrefined form.

To an even greater degree, journalism will be about adding value to data by transforming it into information. The best journalism will be to add value to information, to provide insight, even wisdom.

Openness of data will change the behaviour of individuals and organisations. But not immediately and not in every case. Would MPs have played fast and loose with their expenses if they knew data about each claim would be published openly and in real time? Sad to say, it is quite possible some would.

Much good journalism has involved shedding light on data that was routinely (although not widely) available and which was only rarely studied or analysed.

Importantly, some of the best journalism has involved making connections and spotting patterns. I’m thinking of earlier parliamentary abuses, such as the “cash-for-questions” scandal of the mid-1990s, before Hansard was on the web, and when it was rarely read in print by journalists.

Those were the days when typewriters and telephones – rather than computers and the internet – were the primary journalistic tools. When bars and restaurants – rather than offices and desktops – were the venues for journalistic enterprise.

With more data openly available – along with more tools easily available for mining, sifting and interpreting it (as in the case of the Wikileaks material) – there are many more needles to be found in the burgeoning haystacks of unstructured data.

But even when every day is #Opendata Day, the best stories may remain hidden in full public view – until one of the new generation of journalists stumbles expertly across them.


UCLan project awarded £64,000 from Google to support ‘news entrepreneurs’

November 30th, 2011 | No Comments | Posted in Awards, Data

The University of Central Lancashire’s Journalist Leaders Programme has secured €75,000 (£64,000) of Google funding to support “news entrepreneurs” after being named as one of three winners of the International Press Institute’s News Innovation Contest.

The programme, founded by researcher, academic and consultant on newsroom and digital business innovation François Nel (pictured), will develop a project called Media and Digital Enterprise (MADE), to offer an “innovative training, mentoring and research programme”.

The funding awarded by IPI will be spent by the UCLan programme on working “to create sustainable news enterprises – whether for social or commercial purposes – by helping innovators”.

Nel told Journalism.co.uk MADE will “support the entire news ecosystem as we need innovation across the sector”.

He is now looking for people with entrepreneurial ideas who are interested in news innovation.

The other two winners of the contest are Internews Europe, a European non-profit organisation created in 1995 to help developing countries establish and strengthen independent media organisations to support freedom of expression and freedom of access to information, alongside the World Wide Web Foundation, a Swiss public charity founded by Sir Tim Berners-Lee, the inventor of the world wide web.

In February Google announced it was awarding $2.7 million to the Vienna-based IPI for its contest.

There were around 300 applicants, reduced first to 74 and then to 26 before the three winners were selected by a panel of seven judges, including journalism professor and commentator Jeff Jarvis.

The winners of the total fund of $600,000 were announced yesterday; Nel heard this morning how much the MADE project is being allocated, telling Journalism.co.uk “it’s fantastic to have support for news innovations”.

Nel and others working on the Leaders Programme have been working with news organisations, including Johnston Press, Trinity Mirror and the Guardian Media Group, looking at digital processes and innovative business models.

MADE allows us to pull those strands together and work directly with news entrepreneurs. And we’re really excited about the possibility of putting this to the test.

Nel explained that MADE will “deliver good skills for a whole range of news start-ups” and he is now “looking to work with individuals, groups and companies, who are interested in news innovation” to get involved.

The project will help develop new skills and test the business plans, offering bespoke support to those with entrepreneurial ideas.

We’re looking to support five good people and good ideas for at least three months so that we can give those ideas legs.

The project includes various partners that were part of the bid, including one to build content and one to build communities.

Developers at ScraperWiki will be working with the project to develop innovations in data journalism and build content. Another partner is Sarah Hartley who is now working on the Guardian’s social, local, mobile project n0tice, with this area of the project focusing on building communities.

MADE will also involve Nel’s colleagues at Northern Lights, an award-winning business incubation space at UCLan.

The project also has an international element, involving groups in Turkey, drawing on Nel’s connections in the country.

Nel explained why the funding and ongoing support from IPI is vital:

In the digital news media space the cyber world is littered with start-ups. The corpses of news start-ups are everywhere. What we really need to do is help news entrepreneurs stay up and that’s what we are trying to do here.


Tool of the week for journalists – Playground, to monitor social media analytics

Tool of the week: Playground, by PeopleBrowsr.

What is it? A social analytics platform which contains over 1,000 days of tweets (all 70 billion of them), Facebook activity and blog posts.

How is it of use to journalists? “Journalists can easily develop real-time insights into any story from Playground,” PeopleBrowsr UK CEO Andrew Grill explains.

Complex keyword searches can be divided by user influence, geolocation, sentiment, and virtual communities of people with shared interests and affinities.

These features – and many more – let reporters and researchers easily drill down to find the people and content driving the conversation on social networks on any subject.

Playground lets you use the data the way you want to use it. You can either export the graphs and tables that the site produces automatically or export the results in a CSV file to create your own visualisations, which could potentially make it the next favourite tool of data journalists.

Grill added:

The recent launch of our fully transparent Kred influencer platform will make it faster and easier for journalists to find key influencers in a particular community.

You can give Playground a try for the first 14 days before signing up for one of their subscriptions ($19 a month for students and journalists, $149 for organisations and companies).

Jodee Rich, the founder of PeopleBrowsr, gave an inspiring speech at the Strata Summit in September on how a TV ratings system such as Nielsen could soon be replaced by social media data thanks to the advanced online analytics that PeopleBrowsr offers.

Playground’s development is based on feedback from its community of users, which has been very responsive. Ideas can be sent to contact[@]peoplebrowsr.com or by tweeting @peoplebrowsr.


© Mousetrap Media Ltd.