Category Archives: Handy tools and technology

Editors’ Weblog: Auto-translation problems at La Tribune

The cost-saving measure of automatic translation is producing ‘some confusing results’ for La Tribune, the Editors’ Weblog reports (via the AP).

“In a bid to increase its international audience, the French business newspaper La Tribune has begun using software to translate its website into English, German, Spanish and Italian. Unfortunately for the paper, the cost-saving measure of automatic translation produces some confusing results.

“A current headline on the English-language site reads: ‘The United States: confidence of the consumers in Bern, reduced trade deficit,’ which appears to make a serious error in geography. What do American consumers have to do with the Swiss capital?”

Full story at this link…

Bill Thompson (@billt) on two cultures: those literate in code and everyone else

Bill Thompson, well known for the BBC World Service programme Digital Planet and for his pieces for the BBC, gave a version of his ‘Two Cultures’ speech [which he first made in Cambridge on May 27] at OpenTech in London last Saturday. It was billed like this:

“It’s fifty years since CP Snow’s famous lecture on the Two Cultures – science and literature. We seem to have a different divide these days, between ‘people like us’ and the rest. What might be done about this?”

Thompson (@billt on Twitter) believes that computer literacy should mean more than word processing, a sentiment that seemed to go down well in the hall. You can read more about his views in this BBC article: “We don’t need a nation of programmers, but we do need to be confident that everyone knows what programmers do and what programs look like.”

Richard Elen (@Brideswell) filmed it, and has helpfully shared this video on the Brideswell Associates blog. So if you weren’t there, sit back and enjoy some glorious geekery; even the intro includes a joke about writing in binary (his title for the speech is the ’10 cultures’)…

Bill Thompson on “The Two Cultures Problem”: OpenTech 2009 from Richard Elen on Vimeo.

A guide to newspapers on Twitter

National newspapers have a total of 1,068,898 followers across their 120 official Twitter accounts – with the Guardian, Times and FT the only three papers represented in the top ten.

The Guardian’s the clear winner, as @GuardianTech’s place on Twitter’s Suggested User List means it has 831,935 followers – 78 per cent of the total. @GuardianNews is 2nd with 25,992, @TimesFashion 3rd with 24,762 and @FinancialTimes 4th with 19,923.

Complete list of national newspaper Twitter accounts

Other findings:

  • Glorified RSS. Out of 121 accounts, just 19 do something other than running as a glorified RSS feed. The rest do no retweeting, no replying to other tweets etc. (The 19 are the ones with a blue background in their URL and a yes in the last column).
  • No following. They don’t do much following. Leaving GuardianTech out of it, there are 236,963 followers of these accounts, but they follow just 59,797. Are newspapers bringing their no-linking-out approach to Twitter? Or is it just because they’re pumping RSS feeds straight to Twitter, and therefore see no reason to engage with the community?
  • Rapid drop-off. There are only six Twitter accounts with more than 10,000 followers. I suspect many of these accounts are invisible to most people as the newspapers aren’t engaging much – no RTing of other people’s tweets means those other people don’t have an obvious way to realise the newspaper accounts exist.
  • Sun and Mirror are laggards. The Sun and Mirror have a lot of work to do – they have few accounts with any followers. And they don’t promote their Twitter accounts on their sites. The Mail only seems to have one account but it is the 20th largest in terms of followers.
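
The headline figures above can be re-derived from the numbers quoted in this post. A minimal Python sketch, using only those numbers; the variable names are ours, and we assume the 59,797 'following' figure excludes @GuardianTech, as the wording suggests:

```python
# Figures quoted in the post; variable names are illustrative only.
total_followers = 1_068_898   # followers across all official accounts
guardian_tech = 831_935       # followers of @GuardianTech alone
following = 59_797            # accounts followed (assumed to exclude @GuardianTech)

# @GuardianTech's share of all followers, rounded to a whole percentage.
share = round(100 * guardian_tech / total_followers)

# Followers left once @GuardianTech is set aside.
rest = total_followers - guardian_tech

# Followers gained per account followed, for the remaining accounts.
ratio = rest / following

print(f"@GuardianTech share: {share}%")
print(f"Followers excluding @GuardianTech: {rest:,}")
print(f"Followers per account followed: {ratio:.1f}")
```

On these figures the remaining accounts have roughly four followers for every account they follow, which is the imbalance the 'No following' point describes.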

More on newspaper Twitter accounts:

Some papers publish lists of their Twitter accounts:

Other useful places:

  • Newspaper people on Twitter from mediaUK
  • Newspaper titles on Twitter (inc local) from mediaUK
  • Twitian – a list of people at the Guardian who use Twitter (and their latest tweets), created by Paul Carvill.
  • #followjourn – a daily recommendation service from Journalism.co.uk.

This post originally appeared on MalcolmColes.co.uk.

4iP Blog: 4iP invests in Newspaper Club

4iP, the investment fund for public service media backed by Channel 4, has announced new funding for Newspaper Club – a service which allows users to create their own printed edition from any rights-cleared content online.

How will it make money? By asking for a small portion of the printing price and selling bigger packages to corporate clients – the BBC is already using the system for an internal newsletter.

Full post at this link…

PageSuite lands 40 title publishing deal with bizjournals

UK-based digital publishers PageSuite have landed a deal with America’s largest publisher of metropolitan business newspapers, American City Business Journals. The company will launch all 40 of their bizjournal titles online, using PageSuite as their provider.

Bizjournals cover 40 industries, distributed across 41 cities, and their websites have more than eight million unique visitors per month. “Our production teams have found the software to be very flexible and user friendly,” commented Eric Mick from bizjournals, in a release.

PageSuite already publishes digital editions for some of the world’s biggest newspapers including Metro Canada, Metro UK & Ireland, San Francisco Examiner, Express Newspapers, The Guardian Weekly & Brazil’s largest daily newspaper Zero Hora.

ReportingOn – end of Knight News funding and the next stage

Back in February last year, Journalism.co.uk caught up with Ryan Sholin, director of News Innovation at Publish2, about his project ReportingOn, which had received funding from the Knight News Challenge.

“I call ReportingOn ‘the backchannel for your beat’,” Sholin told us.

“This isn’t about the craft of journalism – this is about the nuts and bolts of finding angles, sources, and data to bolster local news reporting.”

Today the funding from the initiative comes to an end, but that doesn’t mean ReportingOn will – in fact Sholin is gearing up for the launch of version 2.0, which will see the base of the platform, built using Django, go open source.

You can follow developments with the project on its blog or Twitter account. But we thought it was time for an update from the man himself:

What changes are being made that will affect the user?
[Sholin] It’s an absolute re-imagining of the network.  The first time out, I built it to be quite Twitter-esque in the hopes that journalists would use it like Twitter, asking questions of their followers and sharing ideas about stories they were working on.  That didn’t happen organically, or if it was going to, it was going to take years.

So, with the help of a professional development and design team, we’ve rebuilt the site from the ground up, framed around the act of asking and answering questions.  There’s no 140-character limit, but what you will find are lots of basic features that make sense in this sort of social network.  You can ‘watch’ users, beats, or a particular question, viewing everything in an activity feed that brings you the latest questions and answers from the journalists, topics, and particular issues you’re interested in. [See Sholin’s demo of the service as it stood on June 17 below]
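
To make the ‘watch’ idea concrete, here is a minimal Python sketch of an activity feed filtered by watched users, beats and questions. It is an illustration only: the class names, field names and sample posts are hypothetical and are not taken from the ReportingOn codebase.

```python
from dataclasses import dataclass, field


@dataclass
class Activity:
    """One question or answer posted to the network (hypothetical model)."""
    author: str
    beat: str
    question_id: int
    text: str
    posted_at: int  # simple increasing timestamp, enough for this sketch


@dataclass
class Watcher:
    """What one user has chosen to watch: users, beats or single questions."""
    users: set = field(default_factory=set)
    beats: set = field(default_factory=set)
    questions: set = field(default_factory=set)

    def feed(self, activities):
        """Return every watched activity, newest first."""
        matches = [
            a for a in activities
            if a.author in self.users
            or a.beat in self.beats
            or a.question_id in self.questions
        ]
        return sorted(matches, key=lambda a: a.posted_at, reverse=True)


# Made-up sample posts, purely for illustration.
activities = [
    Activity("alice", "education", 1, "Who tracks school budget data?", 1),
    Activity("bob", "transport", 2, "Any sources on bus route cuts?", 2),
    Activity("carol", "education", 1, "Try the district finance office.", 3),
    Activity("dave", "health", 3, "Where are clinic wait times published?", 4),
]

me = Watcher(users={"bob"}, beats={"education"})
for item in me.feed(activities):
    print(item.posted_at, item.author, item.text)
```

An activity appears in the feed if it matches any one of the three watch lists, so watching a beat picks up both the questions and the answers posted under it.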

Why was it necessary to make these changes?
Although the first version of ReportingOn was a great proof of concept, a fun experiment, and a solid first iteration of the network, doing all the development myself didn’t produce a feature-complete, extensible codebase that I could open-source and let the community build on.  I wanted to take the next step to develop a backchannel for beat reporters that could be used as is, or reproduced as a question & answer tool for any purpose, especially by a news organization.

Has this involved significant amounts of back-end work/technological change?
Most definitely.  The site has been completely rebuilt.  It’s still built on the Django platform, but rather than me teaching myself this style of programming in the middle of the night and at the crack of dawn to demonstrate what one curious journalist might be capable of, it was built by the professional team at Lion Burger, who are also responsible for tools like Snipt.net and recently built afeedapart.com for the popular ‘An Event Apart’ series of Web design conferences in the US.

BBC executives’ expenses: the links. Now play!

You’ll find the information tucked away on the BBC Freedom of Information site at this link.

Update: Journalism.co.uk wasted a little bit of its time getting the annoyingly inaccessible BBC PDFs into spreadsheet format, but knew that the Guardian’s data people would be doing that too. So we’ll get back to other duties while you have fun with this from the Guardian’s DataBlog: ‘DATA: download the full spreadsheet of BBC executive expenses’.

More to follow from Journalism.co.uk, but in the meantime here are the links you’ll need if you want to play yourself. Some files are missing – BBC Information informs us that there are more to come.

And the individuals:

Mark Thompson’s expenses 2008/09 PDF (63KB)
Mark Thompson’s expenses 2007/08 PDF (47KB)
Mark Thompson’s expenses 2006/07 PDF (48KB)
Mark Thompson’s expenses 2005/06 PDF (44KB)
Mark Thompson’s expenses 2004/05 PDF (42KB)

Mark Byford’s expenses 2008/09 PDF (41KB)
Mark Byford’s expenses 2007/08 PDF (41KB)
Mark Byford’s expenses 2006/07 PDF (42KB)
Mark Byford’s expenses 2005/06 PDF (41KB)
Mark Byford’s expenses 2004/05 PDF (40KB)

Jana Bennett’s expenses 2008/09 PDF (47KB)
Jana Bennett’s expenses 2007/08 PDF (48KB)
Jana Bennett’s expenses 2006/07 PDF (48KB)
Jana Bennett’s expenses 2005/06 PDF (48KB)
Jana Bennett’s expenses 2004/05 PDF (51KB)

Tim Davie’s expenses 2008/09 PDF (47KB)
Tim Davie’s expenses 2007/08 PDF (42KB)
Tim Davie’s expenses 2006/07 PDF (47KB)
Tim Davie’s expenses 2005/06 PDF (41KB)

Erik Huggers’ expenses 2008/09 PDF (39KB)

  • Lucy Adams, Director, BBC People – biography to be published shortly

Zarin Patel’s expenses 2008/09 PDF (42KB)
Zarin Patel’s expenses 2007/08 PDF (42KB)
Zarin Patel’s expenses 2006/07 PDF (45KB)
Zarin Patel’s expenses 2005/06 PDF (40KB)
Zarin Patel’s expenses 2004/05 PDF (37KB)

John Smith’s expenses 2008/09 PDF (44KB)
John Smith’s expenses 2007/08 PDF (44KB)
John Smith’s expenses 2006/07 PDF (44KB)
John Smith’s expenses 2005/06 PDF (46KB)
John Smith’s expenses 2004/05 PDF (46KB)

Caroline Thomson’s expenses 2008/09 PDF (50KB)
Caroline Thomson’s expenses 2007/08 PDF (51KB)
Caroline Thomson’s expenses 2006/07 PDF (51KB)
Caroline Thomson’s expenses 2005/06 PDF (45KB)
Caroline Thomson’s expenses 2004/05 PDF (50KB)

And while we’re about it – here’s the link to the Audit Committee standing orders:

And the Register of interests:

Let the expenses data war commence: Telegraph begins its document drip feed

Andy Dickinson from the Department of Journalism at UCLAN sums up today’s announcement in this tweet: ‘Telegraph to drip-publish MP expenses online’.

[Update #1: Editor of Telegraph.co.uk, Marcus Warren, responded like this: ‘Drip-publish? The whole cabinet at once….that’s a minor flood, I think’]

Yes, let the data war commence. The Guardian yesterday released its ‘major crowdsourcing tool’ as reported by Journalism.co.uk at this link. As described by one of its developers, Simon Willison, on his own blog, the Guardian is ‘crowdsourcing the analysis of the 700,000+ scanned [official] MP expenses documents’. It’s the Guardian’s ‘first live Django-powered application’. It’s also the first time the news site has hosted something on Amazon EC2, he says. Within 90 minutes of launch, 1700 users had ‘audited’ its data, reported the editor of Guardian.co.uk, Janine Gibson.

The Telegraph was keeping mum, save a few teasing tweets from Telegraph.co.uk editor Marcus Warren. A version of its ‘uncensored’ data was coming, but they would not say what and how much.

Now we know a bit more. As well as printing its data in a supplement with Saturday’s newspaper, the Telegraph will gradually release the information online. So far, copies of claim forms have been published using Issuu software, underneath each cabinet member’s name. See David Miliband’s 2005-6 expenses here, for example. From the Telegraph’s announcement:

  • “Complete records of expense claims made by every Cabinet minister have been published by The Telegraph for the first time.”
  • “In the coming weeks the expense claims of every MP, searchable by name and constituency, will be published on this website.”
  • “There will be weekly releases region by region and a full schedule will be published on Tuesday.”
  • “Tomorrow [Saturday], the Daily Telegraph will publish a comprehensive 68-page supplement setting out a summary of the claims of every sitting MP.”

Details of what the Telegraph’s files include that the official data does not are at this link. “Sensitive information, such as precise home addresses, phone numbers and bank account details, has been removed from the files by the Telegraph’s expenses investigation team,” the Telegraph reports.

So who is winning in the data wars? Here’s what Paul Bradshaw had to say earlier this morning:

“We may see more stories, we may see interesting mashups, and this will give The Guardian an edge over the newspaper that bought the unredacted data – The Telegraph. When – or if – they release their data online, you can only hope the two sets of data will be easy to merge.”

Update #2: Finally, Martin Belam’s post on open and closed journalism (published Thursday 18th) ended like this:

“I think the Telegraph’s bunkered attitude to their scoop, and their insistence that they alone determined what was ‘in the public interest’ from the documents is a marked contrast to the approach taken by The Guardian. The Telegraph are physically publishing a selection of their data on Saturday, but there is, as yet, no sign of it being made online in machine readable format.

“Both are news organisations passionately committed to what they do, and both have a strategy that they believe will deliver their digital future. As I say, I have a massive admiration for the scoop that The Telegraph pulled off, and I’m a strong believer in media plurality. As we endlessly debate ‘the future of news™’ I think both approaches have a role to play in our media landscape. I don’t expect this to be the last time we end up debating the pros and cons of the ‘closed’ and ‘open’ approaches to data driven journalism.”

It has provoked an interesting comment from Ian Douglas, the Telegraph’s head of digital production.

“I think you’re missing the fundamental difference in source material. No publisher would have released the completely unredacted scans for crowdsourced investigation, there was far too much on there that could never be considered as being in the public interest and could be damaging to private individuals (contact details of people who work for the MPs, for example, or suppliers). The Guardian, good as their project is, is working solely with government-approved information.”

“Perhaps you’ll change your mind when you see the cabinet expenses in full on the Telegraph website today [Friday], and other resources to come.”

Buzzmachine: Could Google’s Wave be new reporting tool?

Jeff Jarvis ponders the potential of Wave – Google’s next generation email product announced last week (see video below) – as a tool for journalists:

“In Wave, I see more than a new generation of email cum wikis cum Twitter cum groupware. Because it can feed blog and web pages and Twitter, I see a new way to create content, collaborative and live. I see a new way to make news,” he writes.

“Imagine a team of reporters – together with witnesses on the scene – able to contribute photos and news to the same Wave (formerly known as a story or a page). One can write up what is known; a witness can add facts from the scene and photos; an editor or reader can ask questions. And it is all contained under a single address – a permalink for the story – that is constantly updated from a collaborative team.”

Full post at this link…

ReadWriteWeb: CNET signs up for Open Calais

CNET.com will now share data from its technology reviews, news and blog posts using Thomson Reuters’ Open Calais platform, allowing other publishers to use the information.

According to this report, CNET will publish certain sets of editorial data and some commercial information, for example data on its software download services, using the semantic API.

Signing up to Open Calais will also enable CNET to generate topic pages.

Full story at this link…