Tag Archives: James Cridland

Tweet, Like and Google +1 buttons: lessons in privacy

There are two articles that are essential reading for anyone who has a news site or blog, and interesting to anyone who cares about the data they are sharing online.

It is something we have written about in the past: Like and Tweet buttons – what news sites need to know about dropped cookies.

The first is this excellent article by James Cridland, managing director of Media UK. In his post “It’s a matter of privacy” he explains why his site has stripped out code and moved away from the official Twitter and Facebook buttons.

Whenever you see a tweet button, that means that site owner has added a small piece of code from Twitter onto their page. Load the page, and, whether you like it or not, Twitter is aware that someone has just loaded that page. If you’re signed in to Twitter, Twitter know that you’ve visited it. You don’t have to hit the tweet button or do anything else.

The same goes for the Facebook like button. Any page which uses it loads code from Facebook: and if you’re logged in (or even if you’re not), Facebook knows that you’ve seen that page – regardless of whether you click on the like button.

And the same goes for the Google +1 button. While there’s no evidence that Google Analytics knows who you are even if you are signed into your Google Account, Google +1 certainly does. Once more, simply by loading a page with a Google +1 button on it, you signal back to Google that you’ve looked at that page.

Cridland also points out that the collection of data slows page loading time.

Privacy is also a theme taken up by the Guardian in an article, which first appeared on developer Adrian Short’s blog, headlined “Why Facebook’s new Open Graph makes us all part of the web underclass”.

Short argues that by relying on social media sites, businesses, including news sites, are poor tenants ruled by the whims of rich landlords. He too discusses how all social media sites pose privacy questions for the sites that use them, and illustrates why Facebook, which launched a new type of Open Graph app last week, is worth studying.

Facebook’s abuse of its Like button to invade people’s privacy is much less publicised. We all think we know how it works. We’re on a website reading an interesting page and we click the Like button. A link to the page gets posted to our wall for our friends to see and Facebook keeps this data and data about who clicks on it to help it to sell advertising. So far, so predictable.

What most people don’t know is that the Like button tracks your browsing history. Every time you visit a web page that displays the Like button, Facebook logs that data in your account. It doesn’t put anything on your wall, but it knows where you’ve been. This happens even if you log out of Facebook. Like buttons are pretty much ubiquitous on mainstream websites, so every time you visit one you’re doing some frictionless sharing. Did you opt in to this? Only by registering your Facebook account in the first place. Can you turn it off? Only by deleting your account. (And you know how easy that is.)

The article goes on to explain that most users accept the dropping of cookies and the collection of data as a necessary part of browsing. However, Short highlights an important point:

What Facebook is doing is very different. When it records our activity away from the Facebook site it’s a third party to the deal. It doesn’t need this data to run its own services. Moreover, Facebook’s aggregation and centralisation of data across all our disparate fields of activity is a very different thing from our phone company having our phone data and our bank having our finances. Worst of all, the way Facebook collects and uses our data is both unpredictable and opaque. Its technology and policies move so quickly you’d need to be a technical and legal specialist and spend an inordinate amount of time researching Facebook’s activities on an ongoing basis to have any hope of understanding what they’re doing with your data.

Short recognises that businesses – including news sites – rely on social media for their success. And he doesn’t offer any solutions.

Perhaps the first step is to follow BBC News and Media UK in using unofficial Twitter and Facebook buttons.
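One common way to build an unofficial button is to replace the embedded widget with an ordinary link, so the social network is contacted only when a reader actually clicks. The sketch below is an illustration under stated assumptions, not a description of how BBC News or Media UK built theirs: the two share endpoints are the commonly used public ones and should be checked against each service's current documentation, and the function names are hypothetical.

```python
import html
from urllib.parse import urlencode

# Assumed share endpoints; verify against each service's documentation.
TWITTER_SHARE = "https://twitter.com/intent/tweet"
FACEBOOK_SHARE = "https://www.facebook.com/sharer/sharer.php"


def share_links(page_url: str, title: str) -> dict:
    """Build plain share URLs. Nothing is requested from the networks
    until the reader follows one of these links."""
    twitter_query = urlencode({"url": page_url, "text": title})
    facebook_query = urlencode({"u": page_url})
    return {
        "twitter": TWITTER_SHARE + "?" + twitter_query,
        "facebook": FACEBOOK_SHARE + "?" + facebook_query,
    }


def share_html(page_url: str, title: str) -> str:
    """Render the links as ordinary anchors, with no third-party script tag."""
    anchors = []
    for name, href in share_links(page_url, title).items():
        # Escape the URL so '&' in the query string is valid inside an attribute.
        anchors.append(
            '<a href="' + html.escape(href) + '" rel="noopener">Share on '
            + name.title() + "</a>"
        )
    return "\n".join(anchors)


if __name__ == "__main__":
    print(share_html("https://example.com/story", "London riots"))
```

Because the page itself loads nothing from Twitter or Facebook, neither service learns of a visit unless the reader chooses to share. The trade-off is that a plain link cannot display a live share count.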

Update: The Next Web has today (27 September) published a post stating that Facebook has confirmed it collects data from Like buttons.

The post states:

Facebook has confirmed that the way it collects information from its users may result in the transmission of user data from third-party websites, even when they are logged out, but has asked for users to trust the company and will fix a total of three cookie-related issues within the next 24 hours.

London riots: Five ways journalists used online tools

Since riots started in London on Saturday, 6 August, journalists – and many non-journalists, who may or may not think of themselves as citizen reporters – have been using a variety of online tools to tell the story of the riots and subsequent cleanup operation.

Here are five examples:

1. Maps

James Cridland, who is managing director of Media UK, created a Google Map – which has had more than 25,000 views.

Writing on his blog (which is well worth a read), Cridland explains how and why he verified the locations of riots before manually adding reports of unrest to his map one by one.

I realised that, in order for this map to be useful, every entry needed to be verified, and verifiable for others, too. For every report, I searched Google News, Twitter, and major news sites to try and establish some sort of verification. My criteria was that something had to be reported by an established news organisation (BBC, Sky, local newspapers) or by multiple people on Twitter in different ways.

Speaking to Journalism.co.uk, he explained there was much rumour and many unsubstantiated reports on Twitter, particularly about Manchester where police responded by repeatedly announcing they had not had reports of copycat riots.

A lot of people don’t know how to check and verify. It just shows that the editor’s job is still a very safe one.

Hannah Waldram, who is community co-ordinator at the Guardian, “used Yahoo Pipes, co-location community tools and Google Maps to create a map showing tweets generated from postcode areas in London during the riots”. A post on the OUseful blog explains exactly how this is done.

Waldram told Journalism.co.uk how the map she created last night works:

The map picks up on geotagged tweets using the #Londonriots hashtag in a five km radius around four post code areas in London where reports of rioting were coming in.

It effectively gives a snapshot of tweets coming from a certain area at a certain time – some of the tweets from people at home watching the news and some appearing to be eyewitness reports of the action unfolding.
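The filtering step Waldram describes (hashtag plus a five km radius around a set of centre points) can be sketched in a few lines. This is not her Yahoo Pipes setup: it is a stand-alone illustration that swaps in the haversine great-circle formula, and the tweet structure, the function names and the sample coordinates are all hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def tweets_near(tweets, centres, hashtag="#londonriots", radius_km=5.0):
    """Keep geotagged tweets that carry the hashtag and fall within
    radius_km of at least one centre point."""
    matches = []
    for tweet in tweets:
        coords = tweet.get("coords")  # (lat, lon), or None if not geotagged
        if coords is None or hashtag not in tweet["text"].lower():
            continue
        if any(haversine_km(*coords, *centre) <= radius_km for centre in centres):
            matches.append(tweet)
    return matches


if __name__ == "__main__":
    # Illustrative, approximate centre point; not taken from the original map.
    centres = [(51.588, -0.072)]
    sample = [
        {"text": "Shops closing early #Londonriots", "coords": (51.59, -0.07)},
        {"text": "Watching the news #londonriots", "coords": (53.48, -2.24)},
        {"text": "No location on this one #londonriots", "coords": None},
    ]
    for tweet in tweets_near(sample, centres):
        print(tweet["text"])
```

A radius filter like this cannot tell an eyewitness from someone at home watching the news, which is the limitation Waldram notes: it shows where tweets came from, not whether they are first-hand.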

2. Video

Between gripping live reporting on Sky News, reporter Mark Stone uploaded footage from riots in Clapham to YouTube (which seems to have inspired a Facebook campaign to make him prime minister).

3. Blogs

Tumblr has been used to report the Birmingham riots, including photos and a statement from West Midlands Police with the ‘ask a question’ function being put to hugely effective use.

4. Curation tools

Curation tools have been used to bring together photos, tweets, maps and videos. Examples include Storify, used to great effect here by Joseph Stashko to report on Lewisham; Storyful, used here to tell the story of the cleanup; Bundlr, used here to report the Birmingham riots; and Chirpstory, used here to show tweets on the unravelling Tottenham riots.

5. Timelines

Channel 4 News has this (Flash) timeline, clearly showing when the riots were first reported and how unrest spread. Free tools such as Dipity and Google Fusion Tables (see our how to: use Google Fusion Tables guide) can be used to create linear (rather than mapped) timelines.

If you have seen any impressive interactive and innovative coverage of the riots please add a link to the comments below.

James Cridland: TalkSport web traffic soars

Linking data and journalism: what’s the future?

On Wednesday (September 9), Paul Bradshaw, course director of the MA Online Journalism at Birmingham City University and founder of HelpMeInvestigate.com, chaired a discussion on data and the future of journalism at the first London Linked Data Meetup. This post originally appeared on the OnlineJournalismBlog.

The panel included: Martin Belam (information architect, the Guardian; blogger, Currybet); John O’Donovan (chief architect, BBC News Online); Dan Brickley (Friend of a Friend project; VU University, Amsterdam; SpyPixel Ltd; ex-W3C); Leigh Dodds (Talis).

“Linked Data is about using the web to connect related data that wasn’t previously linked, or using the web to lower the barriers to linking data currently linked using other methods.” (http://linkeddata.org)

I talked about how 2009 was, for me, a key year in data and journalism – largely because it has been a year of crisis in both publishing and government. The seminal point in all of this has been the MPs’ expenses story, which both demonstrated the power of data in journalism, and the need for transparency from government. For example: the government appointment of Sir Tim Berners-Lee, the search for developers to suggest things to do with public data, and the imminent launch of Data.gov.uk around the same issue.

Even before then, the New York Times and the Guardian both launched APIs at the beginning of the year; MSN Local and the BBC have both been working with Wikipedia; and we’ve seen the launch of a number of startups and mashups around data, including Timetric, Verifiable, BeVocal, OpenlyLocal, MashTheState, the open source release of Everyblock, and Mapumental.

Q: What are the implications of paywalls for Linked Data?
The general view was that Linked Data – specifically standards like RDF [Resource Description Framework] – would allow users and organisations to access information about content even if they couldn’t access the content itself. To give a concrete example, rather than linking to a ‘wall’ that simply requires payment, it would be clearer what the content beyond that wall related to (e.g. key people, organisations, author, etc.).

Leigh Dodds felt that using standards like RDF would allow organisations to more effectively package content in commercially attractive ways, e.g. ‘everything about this organisation’.
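To make the paywall point concrete, here is a rough sketch of what machine-readable information about an article might look like when it is published separately from the article text. It writes a few RDF statements in N-Triples syntax using Dublin Core terms (`title`, `creator`, `subject`). The function names and every example.com URI are hypothetical, and this is one possible encoding rather than anything the panel specified.

```python
DCTERMS = "http://purl.org/dc/terms/"  # Dublin Core terms namespace


def literal(value: str) -> str:
    """Quote a plain string as an N-Triples literal."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def describe_article(article_uri, title, author_uri, subject_uris):
    """Return N-Triples describing an article without including its text."""
    triples = [
        (article_uri, DCTERMS + "title", literal(title)),
        (article_uri, DCTERMS + "creator", "<" + author_uri + ">"),
    ]
    for subject_uri in subject_uris:
        triples.append((article_uri, DCTERMS + "subject", "<" + subject_uri + ">"))
    # Each statement is: <subject> <predicate> object, ended by " ."
    return "\n".join("<" + s + "> <" + p + "> " + o + " ." for s, p, o in triples)


if __name__ == "__main__":
    # All URIs below are placeholders for illustration only.
    print(
        describe_article(
            "https://example.com/articles/123",
            "Council budget under scrutiny",
            "https://example.com/people/jane-reporter",
            ["https://example.com/org/city-council"],
        )
    )
```

A reader or aggregator that cannot get past the paywall could still see from these statements who wrote the piece and which organisations it concerns, and a publisher could query the same data to assemble packages such as ‘everything about this organisation’.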

Q: What can bloggers do to tap into the potential of Linked Data?
This drew some blank responses, but Leigh Dodds was most forthright, arguing that the onus lay with developers to do things that would make it easier for bloggers to, for example, visualise data. He also pointed out that currently if someone does something with data it is not possible to track that back to the source and that better tools would allow, effectively, an equivalent of pingback for data included in charts (e.g. the person who created the data would know that it had been used, as could others).

Q: Given that the problem for publishing lies in advertising rather than content, how can Linked Data help solve that?
Dan Brickley suggested that OAuth technologies (where you use a single login identity for multiple sites that contains information about your social connections, rather than creating a new ‘identity’ for each) would allow users to specify more precisely how they experience content, for instance: ‘I only want to see article comments by users who are also my Facebook and Twitter friends.’

The same technology would allow for more personalised, and therefore more lucrative, advertising. John O’Donovan felt the same could be said about content itself – more accurate data about content would allow for more specific selling of advertising.

Martin Belam quoted James Cridland on radio: ‘[The different operators] agree on technology but compete on content’. The same was true of advertising but the advertising and news industries needed to be more active in defining common standards.

Leigh Dodds pointed out that semantic data was already being used by companies serving advertising.

Other notes
I asked members of the audience who they felt were the heroes and villains of Linked Data in the news industry. The Guardian and BBC came out well – The Daily Mail were named as repeat offenders who would simply refer to ‘a study’ and not say which, nor link to it.

Martin Belam pointed out that the Guardian is increasingly asking itself ‘how will that look through an API?’ when producing content, representing a key shift in editorial thinking. If users of the platform are swallowing up significant bandwidth or driving significant traffic then that would probably warrant talking to them about more formal relationships (either customer-provider or partners).

A number of references were made to the problem of provenance – being able to identify where a statement came from. Dan Brickley specifically spoke of the problem with identifying the source of Twitter retweets.

Dan also felt that the problem of journalists not linking would be solved by technology. In conversation previously, he also talked of ‘subject-based linking’ and the impact of SKOS [Simple Knowledge Organisation System] and linked data style identifiers. He saw a problem in that, while new articles might link to older reports on the same issue, older reports were not updated with links to the new updates. Tagging individual articles was problematic in that you then had the equivalent of an overflowing inbox.

Finally, here’s a bit of video from the very last question addressed in the discussion (filmed, with thanks, by @countculture):

Linked Data London 090909 from Paul Bradshaw on Vimeo.

Resources:

A Skim-Read Introduction to Linked Data
Linked Data: The Story So Far (PDF) by Tom Heath, Christian Bizer and Tim Berners-Lee
Sir Tim Berners-Lee at TED.

James Cridland: BBC Radio 4 reaching out

Editors Blog | Journalism.co.uk

Online journalism news
