Follow the Guardian Hack Day 2011

Yesterday and today, staff at the Guardian have been having a get-together that sums up the kind of thing the organisation is really good at.

The Guardian Hack Day is about getting its developers in a room and getting them to build stuff, with helpful advice from staff in editorial, commercial, or anywhere else, I think.

Information architect Martin Belam probably describes it better:

I suppose we should explain a bit more about what a “hack day” is at the Guardian. Essentially for two working days our tech team puts aside their normal work, and gets to work on a project of their own choosing. Sometimes they will work as teams, sometimes as individuals. (And sometimes I think they have been secretly coding the things for months in advance anyway). Other people, like the design and UX team, and commercial & editorial staff, are also encouraged to take part if they can spare the time.

This is certainly not the first hack day, but they are liveblogging this one, and it makes for interesting reading. It is coming to a close now, and I got sidetracked from posting something about it yesterday, but you can still follow the day two liveblog here, and you can look back on the goings-on from yesterday here.

A nice hack from someone outside the Guardian also appeared today: http://latertodayguardian.appspot.com/

Created by Chris Thorpe, who used to work for the Guardian’s Open API Platform team, it uses a Guardian JSON feed to turn the news organisation’s new experimental open newslist into a great looking column-based page, with links to reporters’ Twitter accounts and a Guardian API search to try and match the newslist to published stories.


Currybet: There is a lot of data journalism to be done on riots

August 12th, 2011 | Posted in Data, Editors' pick

In a blog post today (12 August), information architect at the Guardian, Martin Belam, calls on journalists to make the most of the data now available in relation to the riots which took place this week.

He says using the data is “vital” and the resulting journalism will have the power to “help us untangle the truth from those prejudiced assumptions”. But he also stresses the importance of ensuring the data is not misinterpreted in time to come.

The impact of the riots is going to be felt in data-driven stories for months and years to come. I’ve no doubt that experienced data crunchers like Simon Rogers or Conrad Quilty-Harper will factor it into their work, but I anticipate that in six months time we’ll be seeing stories about a sudden percentage rise in crime in Enfield or Central Manchester, without specific reference to the riots. The journalists writing them won’t have isolated the events of the last few days as exceptions to the general trend.

… There can be genuine social consequences to the misinterpretation of data. If the postcodes in Enfield become marked as a place where crime is now more likely as a result of one night of violence, then house prices could be depressed and insurance costs will rise, meaning the effects of the riots will still be felt long after broken windows are replaced. It is the responsibility of the media to use this data in a way that helps us understand the riots, not in a way that prolongs their negative impact.

Read his full post here…

This followed a blog post by digital strategist Kevin Anderson back on Sunday, when he discussed how the circumstances provide an opportunity for data journalists to work with social scientists and use data to test speculated theories, with reference to the data journalism which took place after the 1967 riots in Detroit.

… I’m sure that we’ll see hours of speculation on television and acres of newsprint positing theories. However, theories need to be tested. The Detroit riots showed that a partnership amongst social scientists, foundations, the local community and journalists can prove or disprove these theories and hopefully provide solutions rather than recriminations.


Currybet: Michael Blastland on ‘designing for doubt’

June 6th, 2011 | Posted in Data, Editors' pick, Events, Journalism

Guardian lead information architect Martin Belam has got his excellent Currybet blog back up and running after a short break. He has a post up today about April’s London IA event, featuring writer and statistician Michael Blastland.

Martin and I saw Michael speak at a Media Standards Trust event in March, where he spoke about the potential pitfalls in reporting crime statistics. At the London IA event he gave a talk entitled “designing for doubt”, continuing to argue that journalists, and politicians, make a very poor job of working with numbers.

He illustrated his talk with several case studies, showing how easy it was to manipulate numbers. One was the impact of an education programme on the rate of teenage pregnancies in the Orkney Islands. A selective graph seemed to show dramatic results, with the incidence of youth pregnancy slashed. A more detailed look at the numbers revealed the fundamental truth of Michael Blastland’s simple but common sense message:

“Numbers go up and down. And sometimes stay the same.”

Women are not, he pointed out, queuing up on the Orkneys to get pregnant at a nicely regular rate to please statisticians. With a low sample size there are always likely to be wide fluctuations in the numbers of pregnant teenagers from year to year.
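
As a rough illustration of that last point (my own sketch, not something from the talk): if yearly counts behave roughly like a Poisson process, which is an assumption, the typical year-to-year swing relative to the average is the square root of the average divided by the average. The place labels and average counts below are invented for illustration and are not real figures for Orkney or anywhere else.

```python
import math


def relative_swing(mean_count):
    """Typical year-to-year swing as a fraction of the average count.

    Assumes yearly counts follow a Poisson distribution, whose standard
    deviation is the square root of its mean.
    """
    if mean_count <= 0:
        raise ValueError("mean_count must be positive")
    return math.sqrt(mean_count) / mean_count


# Hypothetical averages, chosen only to contrast a small and a large population.
for label, mean_count in [("small island", 4), ("large city", 400)]:
    swing = relative_swing(mean_count)
    print(f"{label}: typical swing of about {swing:.0%} of the average")
```

Under that assumption, an average of four cases a year swings by about 50 per cent from one year to the next, while an average of 400 swings by about 5 per cent, so a dramatic-looking drop on a small base may be nothing more than noise.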

See the full post on Currybet.net at this link.

I blogged on another session at the MST event, about crowdsourcing: From alpha users to a man in Angola: Adventures in crowdsourcing and journalism


#bbcsms: A round-up of the best blogs on the BBC Social Media Summit

Various delegates from the BBC Social Media Summit last week have spent the weekend writing blog posts reflecting on the two-day event.

If you are looking for a concise round-up of the main points of the day, go to Martin Belam’s notes from the BBC Social Media Summit.

He explains Al Jazeera’s defence against the criticism it received for being part of the story of the Arab uprisings, not just reporting it. He also reports that the New York Times is to experiment with its Twitter feed so that it becomes “a fully human experience without the automated headlines being pumped through it”.

If you want more detail, see Adam Tinworth’s series of live blogs, like this one on the session on technology and innovation.

Dave Wyllie also provides a good session-by-session summary in his core values post. He also reflects:

I left with the feeling that journalism is moving at great speed with some promising entrepreneurs and future figures emerging in their own startups. The rest are working in established businesses or broadcasting.

It’s UK based print I’m worried about, many didn’t even turn up. Maybe they didn’t get the invite or maybe they thought we were full of shit.

The most thought-provoking post is from Mary Hamilton: #bbcsms: what I learned about ego, opinion, art and commerce. She takes up the repeated use of the term ‘mainstream’.

Perhaps a more honest hashtag would be #bbcmsmsms. But it’s also telling: those who were invited to participate, and thus set the agenda and drive change, were not social media people from the Sun, or from Archant’s local divisions, or from the Financial Times. Of course it’s easier for organisations working with likeminded people to reach a consensus, but in doing so we miss the chance to learn from people outside the echo chamber.

So, like Wyllie, Hamilton also notes the absence of the UK regional news organisations. She goes on to say that issues raised may have been different if they had been there.

Esra Dogramaci of Al Jazeera faced some very hostile questioning on the topic of training people to use citizen journalism tools. Will Perrin of Talk About Local did not. Of course there are hundreds of reasons why the responses were different – not least the potential harm that people in Arabic dictatorships can come to as a result of doing journalism – but one of them is territory. Al Jazeera is invading the “mainstream”. Talk About Local is invading the regional space. If there had been many Archant, Johnston or Trinity Mirror folks there, I think Will would have faced some tricky interrogation too.

She makes some interesting points on the ‘fight to be first’:

There’s still significant opposition to this notion from both individual journalists and news organisations. We fear being scooped. Outside the financial trade press, where being first by a few seconds can move markets, the business model of being first is largely an illusion. In fact, the business model is in being the most widely read, and being first is no longer a guarantee that you will gather the most eyeballs for your effort.

The fight to be first stifles innovation, because it erases partner contributions. Traditional media have always done this with stories. Now we are seeing it with innovations, too – even with innovative ways of using familiar tools. The NYT can commit to their experiment of turning off the auto-feed on their Twitter account; this isn’t new, and it’s in part because other news organisations have succeeded that the NYT can experiment without too much fear of failure.

At the end of the day, Alan Rusbridger claimed that the Guardian invented live-blogging. That stakes a claim, draws a line around an innovation that is simply a new way of using a tool, that has existed for nearly as long as the tool has existed. And suddenly, we are fighting over the origin of the thing, rather than celebrating its existence and finding new ways to use it. Suddenly it’s all about the process, about who scooped who, not about the meaning of the events themselves.

Round and round we go.

In his post #bbcsms and the ethic of the link Joseph Stashko discusses circular arguments. He says that one of the sessions adopted the wrong starting point:

So when the session titled ‘Can startups compete with mainstream media?’ began I was somewhat puzzled.

The discussion that followed was very good, but the question was framed in the wrong way. It attempted to compare two different things. They shouldn’t be looking to compete with each other, because it takes us back to a bloggers vs journalists style debate again – the two should look to complement each other rather than compete.

It’s a mindset which seemed to be uncomfortably pervasive throughout the day. As someone remarked to me afterwards “I thought we were over that sort of debate…apparently not”.

He goes on to say:

In 2011 I don’t think we should be asking the questions that are based around what the roles of startups and mainstream media are. Mainstream media have recognisable brands, huge manpower, contacts, prestige and reach. Startups are more nimble, can specialise easily and can get things done quicker.

When I want to start work on a new project, I don’t identify someone who can do things that I can’t and then try and learn all their skills myself – I ask them to come and help me. It’s madness that we’re still having to debate this, but possibly appropriate given that it was held at the BBC.

He asks three questions about the point of such conferences:

How many more case studies of Twitter do we really need?
How many more examples of how you can harness the wisdom of crowds?
And how many more discussions about the futility of mainstream media building their own versions of existing services rather than employing the ethic of the link to connect people to knowledge?

The Media Blog also asks a question in its post journalism, is it ever ‘just a numbers game’? Here it’s worth noting Wyllie’s summary of the session which explains that “the room seemed to divide into two camps: live by your stats to influence your content OR ignore stats for they are perverse and influence you in the wrong ways”.

The Media Blog takes the example of the Daily Mail’s website.

And while it is difficult to cast either extreme of the Mail’s split personality as quality journalism, it is clear that simply chasing clicks with pics and key words is not. For example, a Google search for US socialite and ‘home movie’ star “Kim Kardashian” on the Daily Mail website returns 186,000 results. A search for “Kim Kardashian”+”bikini” returns just 1,000 fewer – 185,000 results – which is still more than results for “David Cameron” and “Gordon Brown” put together.

But in asking whether journalism and web traffic are ‘just a numbers game’, the post acknowledges that not all stories generate hundreds – let alone hundreds of thousands – of clicks. It questions the “business sense” of editorial decisions that select only stories which generate hits, which “is to assume that all important news would also have the good grace to be popular news”.

Publishers just need to remember the subtle differences between getting more readers to their content and producing content purely to bring in more readers. Somewhere between the two lies a dividing line marked ‘quality journalism’.

So what about the future? Mary Hamilton suggests an opening up:

We need people who take elements not just from journalism but also from other areas: user experience design, anthropology, web culture, psychology, history, games, literature, art, statistics. We need to interrogate journalism with tools outside the journalistic sphere; we need not just to borrow from other disciplines but exchange with them.

And a comment below Hamilton’s post expands this further:

Your last point is a valid, and reflects what I took out of the day; innovators and non-mainstream thinkers are looking to be involved, traditional outlets are sitting back and waiting for invites. They should be the ones sending out the innovations.

“With capability comes responsibility”, I believe was one of the finer quotes of the day.


From alpha users to a man in Angola: Adventures in crowdsourcing and journalism

Yesterday’s Media Standards Trust data and news sourcing event presented a difficult decision early on: Whether to attend “Crowdsourcing and other innovations in news sourcing” or “Open government data, data mining, and the semantic web”. Both sessions looked good.

I thought about it for a bit and then plumped for crowdsourcing. The Guardian’s Martin Belam did this:

Belam may have then defied a 4-0 response in favour of the data session, but it does reflect the effect of networks like Twitter in encouraging journalists – and others – to seek out the opinion or knowledge of crowds: crowds of readers, crowds of followers, crowds of eyewitnesses, statisticians, or anti-government protestors.

Crowdsourcing is nothing new, but tools like Twitter and Quora are changing the way journalists work. And with startups based on crowdsourcing and user-generated content becoming more established, it’s interesting to look at the way that they and other news organisations make use of this amplified door-to-door search for information.

The MST assembled a pretty good team to talk about it: Paul Lewis, special projects editor, the Guardian; Paul Bradshaw, professor of journalism, City University and founder of helpmeinvestigate.com; Turi Munthe, founder, Demotix; and Bella Hurrell, editor, BBC online specials team.

From the G20 protests to an oil field in Angola

Lewis is perhaps best known for his investigation into the death of Ian Tomlinson following the G20 protests, during which he put a call out on Twitter for witnesses to a police officer pushing Tomlinson to the ground. Lewis had only started using the network two days before and was, he recalled, “just starting to learn what a hashtag was”.

“It just seemed like the most remarkable tool to share an investigation … a really rich source of information being chewed over by the people.”

He ended up with around 20 witnesses that he could plot on a map. “Only one of which we found by traditional reporting – which was me taking their details in a notepad on the day”.

“I may have benefited from the prestige of breaking that story, but many people broke that story.”

Later, investigating the death of deportee Jimmy Mubenga aboard an airplane, Lewis again put a call out via Twitter and somehow found a man “in an oil field in Angola, who had been three seats away from the incident”. Lewis had the fellow passenger send a copy of his boarding pass and cross-checked details about the flight with him for verification.

But the pressure of the online, rolling, tweeted and liveblogged news environment is leading some to make compromises when it comes to verifying information, he claimed.

“Some of the old rules are being forgotten in the lure of instantaneous information.”

The secret to successful crowdsourcing

From the investigations of a single reporter to the structural application of crowdsourcing: Paul Bradshaw and Turi Munthe talked about the difficulties of basing a group or running a business around the idea.

Among them were keeping up interest in long-term investigations and ensuring sufficient diversity among your crowd. There is a recognised difficulty, now commonly associated with the trouble WikiLeaks had in its early days getting the general public to crowdsource the verification and analysis of its huge datasets, in getting people to engage with large, unwieldy data dumps or painstaking investigations in which progress can be agonisingly slow.

Bradshaw suggested five qualities for a successful crowdsourced investigation on his site, helpmeinvestigate.com:

1. Alpha users: One or a small group of active, motivated participants.

2. Momentum: Results along the way that will keep participants from becoming frustrated.

3. Modularisation: That the investigation can be broken down into small parts to help people contribute.

4. Publicness: Publicity via social networks and blogs.

5. Expertise/diversity: A non-homogenous group who can balance the direction and interests of the investigation.

The wisdom of crowds?

The expression “the wisdom of crowds” tends to make an appearance in crowdsourcing discussions. Establishing just how wise – and how balanced – those crowds were became an important part of the session. Number 5 on Bradshaw’s list, it seems, can’t be taken for granted.

Bradshaw said that helpmeinvestigate.com had tried to seed expert voices into certain investigations from the beginning, and encouraged people to cross-check and question information, but acknowledged the difficulty of ensuring a balanced crowd.

Munthe reiterated the importance of “alpha-users”, citing a pyramid structure that his citizen photography agency follows, but stressed that crowds would always be partial in some respect.

“For Wikipedia to be better than the Encyclopaedia Britannica, it needs a total demographic. Everybody needs to be involved.”

That won’t happen. But as social networks spring up left, right, and centre and, along with the internet itself, become more and more pervasive, knowing how to seek out and filter information from crowds looks set to become a more and more important part of the journalist’s toolkit.

I want to finish with a particularly good example of Twitter crowdsourcing from last month, in case you missed it.

Local government press officer Dan Slee (@danslee) was sat with colleagues who said they “didn’t get Twitter”. So instead of explaining, he tweeted the question to his followers. Half an hour later: hey presto, he had a whole heap of different reasons why Twitter is useful.


A look at the Guardian Hacks SXSW event

The Guardian played host to designers, developers and journalists at the weekend for its “Guardian Hacks SXSW” event. (The raw data reveals that there were 82 developers, 12 girls and 12 ‘full beards’, among other things.)

Guardian information architect Martin Belam takes a look at some of the day’s hacks on his blog:

The hack that appeared to draw the most gasps from the assembled journalists in the room, and consequently won, was Articlr, which was presented by Jason Grant. It was a back-end tool for easily monitoring social media and rival coverage of a story in real-time, and then simply dragging-and-dropping elements from external sites into a story package. With a bit of geo-location goodness thrown in. I fully expect the feature request to be on my Guardian desk by about 11am this morning…

Plus you can see full coverage from the Guardian at this link and related Twitter goings on using the #gsxsw hashtag.


Martin Belam: My favourite comment spam

We get a lot of spam comments here at Journalism.co.uk. Some of them are uplifting:

This is really good stuff for me. Must admit that you are one of the coolest bloggers I’ve ever seen.

That kind of comment makes for a great start to the morning.

Some of them are confusing, but still basically positive:

A person essentially help to make seriously articles I would state. This is the first time I frequented your web page and thus far? I surprised with the research you made to create this particular publish amazing. Wonderful job!

We did do a lot of research to make that particular publish amazing, so it’s nice to have it recognised.

Some of them contain constructive criticism:

I have read your article, and I think that it’s a little bit biased. (maybe its just me.) Hmm… Maybe next time try be more objective, I know it’s hard to be good journalist, but it worth it.

That kind of thing reminds us about trying to be good journalists every day.

We are, of course, not the only ones to benefit from such feedback and guidance.

Guardian information architect Martin Belam has written in the past about this kind of spam, which uses these inane comments to try and sneak through links for SEO purposes. Belam posted some of his own favourite spam today with some responses, and it’s pretty funny.


“Alistair conditioning” is my favourite concealed keyword ever.


See the full post on currybetdotnet at this link.


Martin Belam: The death of RSS? Not at the Guardian

January 5th, 2011 | Posted in Data, Editors' pick, Online Journalism

In this post on his Currybet.net blog Martin Belam responds to discussions about the future of RSS feeds. While feeds may remain a niche tool, the latest CMS release at the Guardian, where Belam works as an information architect, makes links to RSS feeds much easier to find, he says.

Previously we didn’t automatically link to an RSS feed from an individual article page. This was because articles could ‘belong’ to various different areas of the site, and so it wasn’t always obvious which RSS feed should be chosen as the parent. This blog post of mine, for example, ‘appeared’ on the Open Platform blog, the Datablog, and in the Technology and Politics sections.

We’ve just changed that in release 103 of our CMS, in response to a request on our new Developer Blog. Now in the <HEAD> of our articles you’ll get an auto-discovery link to all of the related keyword feeds.
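
For anyone who has not met feed auto-discovery: it is a convention in which a page lists its feeds as `<link rel="alternate">` tags in the `<head>`, so that browsers and feed readers can find them without hunting for an RSS icon. Below is a minimal Python sketch of how a reader might pick those links out using only the standard library. The sample markup, section titles and example.com URLs are invented for illustration and are not the Guardian's actual markup.

```python
from html.parser import HTMLParser

# MIME types conventionally used for feed auto-discovery links.
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkFinder(HTMLParser):
    """Collect the href of every feed auto-discovery <link> tag."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        # Attribute values can be None for bare attributes, so default to "".
        attr = {name: (value or "") for name, value in attrs}
        rels = attr.get("rel", "").lower().split()
        if "alternate" in rels and attr.get("type", "").lower() in FEED_TYPES:
            if attr.get("href"):
                self.feeds.append(attr["href"])


def find_feeds(html):
    """Return the feed URLs advertised in an HTML document, in page order."""
    finder = FeedLinkFinder()
    finder.feed(html)
    return finder.feeds


# Hypothetical article page with two keyword feeds in its <head>.
SAMPLE = """<html><head>
<title>Example article</title>
<link rel="stylesheet" href="/style.css">
<link rel="alternate" type="application/rss+xml" title="Politics" href="https://example.com/politics/rss">
<link rel="alternate" type="application/rss+xml" title="Technology" href="https://example.com/technology/rss">
</head><body></body></html>"""

print(find_feeds(SAMPLE))
```

The stylesheet link is ignored because it is not marked as an alternate feed; only the two keyword feeds are returned.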


Currybet: What open government data giveth, closed state data taketh away

November 25th, 2010 | Posted in Data, Editors' pick

Guardian information architect Martin Belam has an interesting post about some of the limitations of the recent government data release, particularly the difficulty of – and cost associated with – cross-referencing the data with Companies House records.

Using the Guardian’s data explorer tool, you can get a comprehensive list of suppliers. Wouldn’t it be wonderful if you could instantly cross-reference that with the records at Companies House?

I’d love to be able to get an instant snapshot of how many of these companies are large, medium or small enterprises. Over time you could use that to measure whether the intention to open up Government service tendering to wider competition was on track or not.

Full post at this link…


Why the US and UK are leading the way on semantic web

Following his involvement in the first Datajournalism meetup in Berlin earlier this week, Martin Belam, the Guardian’s information architect, looks at why the US and UK may have taken the lead in semantic web, as one audience member suggested on the day.

In an attempt to answer the question, he puts forward four themes on his currybet.net blog that he feels may play a part. In summary, they are:

  • The sharing of a common language which helps both nations access the same resources and be included in comparative datasets.
  • Competition across both sides of the pond driving innovation.
  • Successful business models already being used by the BBC and even more valuably being explained on their internet blogs.
  • Open data and a history of freedom of information court cases which makes official information more likely to be made available.

In his full post here he also has tips for how to follow the UK’s lead, such as getting involved in hacks and hackers type events.


© Mousetrap Media Ltd.