Tag Archives: the New York Times

Linking data and journalism: what’s the future?

On Wednesday (September 9), Paul Bradshaw, course director of the MA Online Journalism at Birmingham City University and founder of HelpMeInvestigate.com, chaired a discussion on data and the future of journalism at the first London Linked Data Meetup. This post originally appeared on the OnlineJournalismBlog.

The panel included: Martin Belam (information architect, the Guardian; blogger, Currybet); John O’Donovan (chief architect, BBC News Online); Dan Brickley (Friend of a Friend project; VU University, Amsterdam; SpyPixel Ltd; ex-W3C); and Leigh Dodds (Talis).

“Linked Data is about using the web to connect related data that wasn’t previously linked, or using the web to lower the barriers to linking data currently linked using other methods.” (http://linkeddata.org)

I talked about how 2009 was, for me, a key year in data and journalism – largely because it has been a year of crisis in both publishing and government. The seminal point in all of this has been the MPs’ expenses story, which demonstrated both the power of data in journalism and the need for transparency from government. The response has included the government’s appointment of Sir Tim Berners-Lee, its search for developers to suggest things to do with public data, and the imminent launch of Data.gov.uk around the same issue.

Even before then, the New York Times and the Guardian had both launched APIs at the beginning of the year; MSN Local and the BBC have both been working with Wikipedia; and we’ve seen the launch of a number of startups and mashups around data, including Timetric, Verifiable, BeVocal, OpenlyLocal, MashTheState, the open source release of Everyblock, and Mapumental.

Q: What are the implications of paywalls for Linked Data?
The general view was that Linked Data – specifically standards like RDF [Resource Description Framework] – would allow users and organisations to access information about content even if they couldn’t access the content itself. To give a concrete example: instead of a link leading to a ‘wall’ that simply demands payment, it would be clear what the content beyond that wall related to (e.g. key people, organisations, author, etc.).

Leigh Dodds felt that using standards like RDF would allow organisations to more effectively package content in commercially attractive ways, e.g. ‘everything about this organisation’.
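As a rough illustration of the panel’s point, the sketch below models article metadata as subject-predicate-object triples, the basic shape of RDF statements, kept separate from the paid-for body text. The URLs, property names and both helper functions are hypothetical, and plain Python tuples stand in for a real RDF library.

```python
# Hypothetical metadata for two paywalled articles, stored as
# (subject, predicate, object) triples. All URLs and property names
# are invented for illustration; no article body is included.
TRIPLES = [
    ("http://example.org/articles/1", "title", "Bank reports record losses"),
    ("http://example.org/articles/1", "author", "A. Reporter"),
    ("http://example.org/articles/1", "mentions", "http://example.org/orgs/example-bank"),
    ("http://example.org/articles/1", "access", "subscription"),
    ("http://example.org/articles/2", "title", "Regulator opens inquiry"),
    ("http://example.org/articles/2", "mentions", "http://example.org/orgs/example-bank"),
    ("http://example.org/articles/2", "mentions", "http://example.org/orgs/example-regulator"),
    ("http://example.org/articles/2", "access", "subscription"),
]


def describe(article):
    """Return the public metadata for one article as predicate -> list of values."""
    description = {}
    for subject, predicate, obj in TRIPLES:
        if subject == article:
            description.setdefault(predicate, []).append(obj)
    return description


def about(organisation):
    """Return every article that mentions the organisation, in first-seen order."""
    found = []
    for subject, predicate, obj in TRIPLES:
        if predicate == "mentions" and obj == organisation and subject not in found:
            found.append(subject)
    return found


if __name__ == "__main__":
    # What a non-subscriber could still learn about article 1.
    print(describe("http://example.org/articles/1"))
    # An 'everything about this organisation' package.
    print(about("http://example.org/orgs/example-bank"))
```

Here `describe` shows what a reader outside the wall could still see, while `about` is one possible reading of the ‘everything about this organisation’ package Dodds described.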

Q: What can bloggers do to tap into the potential of Linked Data?
This drew some blank responses, but Leigh Dodds was most forthright, arguing that the onus lay with developers to do things that would make it easier for bloggers to, for example, visualise data. He also pointed out that currently if someone does something with data it is not possible to track that back to the source and that better tools would allow, effectively, an equivalent of pingback for data included in charts (e.g. the person who created the data would know that it had been used, as could others).

Q: Given that the problem for publishing lies in advertising rather than content, how can Linked Data help solve that?
Dan Brickley suggested that OAuth technologies (where you use a single login identity for multiple sites that contains information about your social connections, rather than creating a new ‘identity’ for each) would allow users to specify more precisely how they experience content, for instance: ‘I only want to see article comments by users who are also my Facebook and Twitter friends.’

The same technology would allow for more personalised, and therefore more lucrative, advertising. John O’Donovan felt the same could be said about content itself – more accurate data about content would allow for more specific selling of advertising.

Martin Belam quoted James Cridland on radio: ‘[The different operators] agree on technology but compete on content’. The same was true of advertising but the advertising and news industries needed to be more active in defining common standards.

Leigh Dodds pointed out that semantic data was already being used by companies serving advertising.

Other notes
I asked members of the audience who they felt were the heroes and villains of Linked Data in the news industry. The Guardian and BBC came out well – The Daily Mail were named as repeat offenders who would simply refer to ‘a study’ and not say which, nor link to it.

Martin Belam pointed out that the Guardian is increasingly asking itself ‘how will that look through an API?’ when producing content, representing a key shift in editorial thinking. If users of the platform are swallowing up significant bandwidth or driving significant traffic then that would probably warrant talking to them about more formal relationships (either customer-provider or partners).

A number of references were made to the problem of provenance – being able to identify where a statement came from. Dan Brickley specifically spoke of the problem with identifying the source of Twitter retweets.

Dan also felt that the problem of journalists not linking would be solved by technology. In conversation previously, he also talked of ‘subject-based linking’ and the impact of SKOS [Simple Knowledge Organisation System] and linked data style identifiers. He saw a problem in that, while new articles might link to older reports on the same issue, older reports were not updated with links to the new updates. Tagging individual articles was problematic in that you then had the equivalent of an overflowing inbox.
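One way to picture the ‘subject-based linking’ idea: if each article carries subject identifiers, related links can be computed at display time instead of being stored inside each article, so an older report picks up newer coverage without anyone editing it. The sketch below assumes this reading; the article URLs, concept URIs and function names are hypothetical, and it does not use SKOS itself.

```python
from collections import defaultdict

# Hypothetical articles, each tagged with subject identifiers in the
# style of concept URIs. Dates are ISO strings so they sort correctly.
ARTICLES = [
    {"url": "http://example.org/news/a", "date": "2009-05-08",
     "subjects": ["http://example.org/concepts/mps-expenses"]},
    {"url": "http://example.org/news/b", "date": "2009-06-18",
     "subjects": ["http://example.org/concepts/mps-expenses",
                  "http://example.org/concepts/freedom-of-information"]},
    {"url": "http://example.org/news/c", "date": "2009-07-01",
     "subjects": ["http://example.org/concepts/freedom-of-information"]},
]


def build_subject_index(articles):
    """Map each subject identifier to the articles tagged with it."""
    index = defaultdict(list)
    for article in articles:
        for subject in article["subjects"]:
            index[subject].append(article)
    return index


def related_links(url, articles):
    """Return URLs of other articles sharing a subject, oldest first.

    Links are derived from the subject index on every call, so older
    articles automatically point to newer ones and vice versa.
    """
    index = build_subject_index(articles)
    current = next(a for a in articles if a["url"] == url)
    related = {}
    for subject in current["subjects"]:
        for other in index[subject]:
            if other["url"] != url:
                related[other["url"]] = other["date"]
    return [link for link, _ in sorted(related.items(), key=lambda item: item[1])]


if __name__ == "__main__":
    # The oldest article now links forward to the later report.
    print(related_links("http://example.org/news/a", ARTICLES))
```

Because the links come from shared subjects and not from hand-placed hyperlinks, adding a new article to `ARTICLES` updates the related links of every older article on the same subject.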

Finally, here’s a bit of video from the very last question addressed in the discussion (filmed with thanks by @countculture):

Linked Data London 090909 from Paul Bradshaw on Vimeo.

Journalism Online paid content venture to take 20 per cent commission

An update on Journalism Online, the venture started by Steve Brill, Gordon Crovitz, and Leo Hindery with the aim of helping news organisations charge for content.

  • The document [PDF] submitted to the Newspaper Association of America reveals the plans and is published by the Nieman Journalism Lab (NJL).
  • The Associated Press reports how IBM Corp., Microsoft Corp., Oracle Corp. and Google Inc. ‘responded to a request by the Newspaper Association of America for proposals on ways to easily, unobtrusively charge for news on the web’.

Seeking Alpha: Why you should invest in newspaper stocks

‘Newspapers: Not as Bad as Advertised’ proclaims the headline of Glenn Rogers’ Seeking Alpha post.

Succinctly summarising the problems facing the news industry, Rogers then goes on to recommend buying newspaper stocks.

If you believe that some of these companies can adapt and survive, there are reasons to invest, he says:

  • The New York Times and Gannett (for example) ‘have both been cutting costs dramatically for the past several months and they are well-positioned digitally to benefit from the online consumption of news’;
  • “[E]ven if they are not successful in attracting subscriber income they are well-positioned to benefit from what I believe will be a gradual recovery in the advertising market in general over the next several months.”
  • Gannett in particular offers a number of spin-off technology solutions to large companies; while the Times has a number of businesses outside of the newspaper.

Sound investment advice or newspaperman sentimentality? Either way, Rogers’ post does look at some of the non-traditional revenue streams and business elements that could help existing media companies weather the economic and structural storms.

Full post at this link…

Nieman Journalism Lab: NYTimes’ pulled post lives on

An incident at the New York Times shows that news lives on even when it’s taken offline.

The Nieman Journalism Lab tells the story of two NYT posts: one, which named the alleged blogger behind NYTPick.com, now removed; and another, updated with the journalist David Blum’s denial.

But at least part of the piece was easily recoverable via Google News and RSS readers (including the NYT’s own Times Wire).

NJL’s Zachary M Seward comments that ‘this is a lesson that removing content from the web is a futile task, particularly for big news sites’.

“And if a story needs to be retracted, if that’s the case here (update: it is), then we need better ways to do it than just pulling content off the web.”

Full post at this link…

British journalist rescued from Taliban but interpreter died; reports suggest British soldier also killed

Stephen Farrell, a British-Irish journalist working for the New York Times, was rescued from Taliban captivity on Wednesday morning, according to global news reports.

His Afghan interpreter, Sultan Munadi, was killed during the operation, the Telegraph reports.

According to as yet unconfirmed reports by the Associated Press, a British commando was also killed during the raid.

The Guardian reports:

“Military officials in Kabul told the Associated Press a British soldier was killed in the raid. The Ministry of Defence was unable to confirm the reports this morning.”


Google’s Spotlight – highlighting journalism of ‘lasting value’

A new feature, Spotlight, has been added to Google News. According to a very brief explanation by Google, it is:

“(…) section of Google News [that] is updated periodically with news and in-depth pieces of lasting value. These stories, which are automatically selected by our computer algorithms, include investigative journalism, opinion pieces, special-interest articles, and other stories of enduring appeal.”

By looking at both the search engine’s own explanation of Google Spotlight and the selection of stories it has flagged up so far, Nieman Journalism Lab’s Zachary M. Seward suggests, “Spotlight shines on longer features that have bounced around blogs for a few days.”

According to Seward, lifestyle and opinion pieces fare well, while the New York Times is a frequent source. He does see potential for the new section, however, as a way of using people’s online activity to highlight interesting and important material.

[Laura Oliver adds: The usefulness of Spotlight will perhaps be greater for those who use Google News as their first port of call for the day’s headlines – but what portion of Google News’ users behave in this way (figures welcome) needs to be taken into account.]

New York Times: New investigation into murder of Anna Politkovskaya

Russia’s supreme court has cancelled the retrial of four men accused of involvement in the murder of the investigative journalist Anna Politkovskaya and ordered prosecutors to begin a new investigation, reports the New York Times.

Full story at this link…

Mashable: Wikipedia’s new editorial layer

“Now a core feature, perhaps a core principle, of ‘the free encyclopedia anyone can edit’ is about to become restricted,” writes Ben Parr at Mashable.

“According to The New York Times, editing articles about living people on Wikipedia will require approval from an experienced editor first.”

Full story at this link…

Reasons to be cheerful? Seattle paper, Roanoke Times and magazine publishers turning a profit

In addition to reporting on plummeting profits for some newspaper groups, Journalism.co.uk thought it was about time we shared some better news or at least some examples of titles that aren’t making a loss.

  1. As the city’s only surviving daily newspaper since the decline of the Post-Intelligencer, the Seattle Times posted a rise in daily circulation of around 30 per cent for June. According to the New York Times, publisher Frank Blethen says the title is operating ‘in the black’ on a month-to-month basis now.
  2. “We are a profitable, debt-free enterprise,” says Debbi Meade, publisher of the US’ Roanoke Times, in this letter to readers.
  3. New figures from the US’ Publishers Information Bureau (PIB) suggest that 12 titles managed to attract more ad pages in the first six months of this year than in the same period in 2008. Newsweek looks at which titles are managing to buck the trend in this way.