Tag Archives: semantic web

Why the US and UK are leading the way on semantic web

Following his involvement in the first Datajournalism meetup in Berlin earlier this week, Martin Belam, the Guardian’s information architect, looks at why the US and UK may have taken the lead on the semantic web, as one audience member suggested on the day.

In an attempt to answer the question, he puts forward four themes on his currybet.net blog that he feels may play a part. In summary, they are:

  • The sharing of a common language which helps both nations access the same resources and be included in comparative datasets.
  • Competition across both sides of the pond driving innovation.
  • Successful business models already in use at the BBC and, even more valuably, explained on its blogs.
  • Open data and a history of freedom of information court cases which makes official information more likely to be made available.

In his full post here he also offers tips on how to follow the UK’s lead, such as getting involved in ‘hacks and hackers’ type events.

Editors Weblog: Transforming the web with semantic technology

A post on the editorsweblog.org is looking at the way semantic technology could transform online news searches.

The technology is explained as having the ability to change the internet “from a massive searchable text file into a queryable database”, where related media, or facts, are linked together across independent websites.
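To make the ‘queryable database’ idea concrete, here is a minimal sketch in Python. It assumes the SPARQLWrapper library and DBpedia’s public endpoint, both illustrative choices rather than anything named in the post, and asks a structured question instead of running a keyword search:

```python
# A sketch of treating the web as a queryable database rather than a
# text file: ask DBpedia a structured question with SPARQL.
from SPARQLWrapper import SPARQLWrapper, JSON

endpoint = SPARQLWrapper("https://dbpedia.org/sparql")
# "Which newspapers does Guardian Media Group own?" expressed as a
# query over linked facts, not a text search.
endpoint.setQuery("""
    PREFIX dbo: <http://dbpedia.org/ontology/>
    PREFIX dbr: <http://dbpedia.org/resource/>
    SELECT ?paper WHERE { ?paper dbo:owner dbr:Guardian_Media_Group . }
""")
endpoint.setReturnFormat(JSON)
results = endpoint.query().convert()
for row in results["results"]["bindings"]:
    print(row["paper"]["value"])
```

Whether any rows come back depends entirely on what facts DBpedia holds; the point is that the question is posed against linked facts across independent sites, not matched against page text.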

For newspapers, semantic technology improves reader engagement by linking together related articles. For readers, that means more context on each story and a more personalized experience. And for advertisers, it means better demographic data than ever before.

Watch the video below, courtesy of editorsweblog.org, for more information on ‘website rules’ being created to make semantic searches more efficient in the future.

Read the full post here…

#ds10: Ultraknowledge – search and visualising the news

Why does search have to produce the same set of results that we always get?

That was one of the opening questions from Andrew Lyons, commercial director of Ultraknowledge (UKn), at the Digital Storytelling conference last week, as he talked delegates through UKn’s work with the Independent.

The Independent’s NewsWall, launched in January, is a new way of organising stories and navigating through them. It provides a “visual documentation” of a topic and what’s happened in that subject area. (Similar efforts are being made by Daylife’s technology and the Guardian’s News Zeitgeist.)

When searched, the wall will return 30 picture-led stories as results, and figures for dwell time on the wall are proving interesting, said Lyons.

The next step, said Lyons, will be the ability for users to save a search on a topic to their Facebook page and have it update only when it’s relevant to them.

UKn can now start to produce sponsored NewsWalls around events such as the forthcoming World Cup or general election. It will also be opening up the archive of content available through the Independent’s NewsWall from two years to the full 23 years of its history.

UKn has already worked with other publishers to create more intelligent and visually organised search results pages, such as those produced by an initial search on Metro.co.uk.

But the firm wants to take this a step further: helping news organisations build topic pages for breaking news items by cleverly tagging and organising archived work, and through its latest, yet-to-be-launched project, StoryTriggers, a way to help journalists and news organisations find new leads and spot breaking news trends.

Sometimes the story that you’re after isn’t on your beat, so how do you find it? But when you’re dealing with news, it’s changing fast – how do you SEO for this? How do you tag it and relate it to what’s happened in the past and what’s happening in the future? (…) We want to be an innovation lab for publishers.

A history of linked data at the BBC

Martin Belam, information architect for the Guardian and CurryBet blogger, reports from today’s Linked Data meet-up in London, for Journalism.co.uk.

You can read the first report, ‘How media sites can use linked data’ at this link.

There are many challenges when using linked data to cover news and sport, Silver Oliver, information architect in the BBC’s journalism department, told delegates at today’s Linked Data meet-up session at ULU, part of a wider dev8d event for developers.

Initially, newspapers saw the web as just another linear distribution channel, said Silver. That meant we ended up with lots and lots of individually published news stories online that needed information architects to gather them up into useful piles.

He believes we’ve hit the boundaries of that approach, and something like the data-driven approach of the BBC’s Wildlife Finder is the future for news and sport.

But the challenge is to find models for sport, journalism and news.

A linked data ecosystem is built out of a content repository, a structure for that content, and then the user experience that is laid over that content structure.
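As a rough sketch of those three layers, in Python with entirely made-up names (this is not a BBC design, just one way to picture the separation):

```python
from dataclasses import dataclass, field

@dataclass
class ContentItem:
    """The content repository layer: the stories and clips themselves."""
    uri: str
    headline: str

@dataclass
class Concept:
    """The structure layer: a real-world thing the content is about."""
    uri: str
    label: str
    items: list[ContentItem] = field(default_factory=list)

def render_topic_page(concept: Concept) -> str:
    """The user-experience layer, laid over the content structure."""
    lines = [concept.label] + [item.headline for item in concept.items]
    return "\n".join(lines)
```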

But how do you populate these datasets in departments and newsrooms that barely have the resources to manage small taxonomies or collections of external links, let alone populate a huge ‘ontology of news’, asked Silver.

Silver says the BBC has started with sport, because it is simpler. The events and the actors taking part in those events are known in advance. For example, even this far ahead you know the fixture list, venues, teams and probably the majority of the players who are going to take part in the 2010 World Cup.
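Because fixtures, venues and teams are known ahead of time, identifiers for them can be minted before a ball is kicked. A hypothetical sketch with Python’s rdflib, using made-up URIs rather than the BBC’s actual sport model:

```python
from rdflib import Graph, Literal, Namespace, URIRef

# Entirely hypothetical vocabulary and URIs; the BBC's real model differs.
SPORT = Namespace("http://example.org/sport/")

g = Graph()
opener = URIRef("http://example.org/sport/worldcup2010/opening-match")
g.add((opener, SPORT.homeTeam, SPORT["team/south-africa"]))
g.add((opener, SPORT.awayTeam, SPORT["team/mexico"]))
g.add((opener, SPORT.venue, SPORT["venue/soccer-city"]))
g.add((opener, SPORT.kickOff, Literal("2010-06-11T16:00:00")))

print(g.serialize(format="turtle"))
```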

News is much more complicated, because of the inevitable time lag between a breaking news event taking place and canonical identifiers existing for it. Basic building blocks do exist, like Geonames or DBpedia, but there is no definitive database of ‘news events’.

Silver thinks that if all news organisations were using common IDs for a ‘story’, this would allow the BBC to link out more effectively and efficiently to external coverage of the same story.
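A sketch of why shared identifiers matter: if every organisation tagged its coverage with the same concept URI (DBpedia URIs are used here purely for illustration), finding external coverage of the same story becomes a lookup rather than fuzzy text matching. The indexes and article URLs below are hypothetical:

```python
# Hypothetical per-organisation indexes keyed by a shared concept URI.
bbc_index = {
    "http://dbpedia.org/resource/2010_FIFA_World_Cup": [
        "http://news.example.org/bbc/worldcup-story",
    ],
}
guardian_index = {
    "http://dbpedia.org/resource/2010_FIFA_World_Cup": [
        "http://news.example.org/guardian/worldcup-story",
    ],
}

def external_coverage(story_id, *indexes):
    """Gather every organisation's coverage of one identified story."""
    return [url for index in indexes for url in index.get(story_id, [])]

print(external_coverage(
    "http://dbpedia.org/resource/2010_FIFA_World_Cup",
    bbc_index, guardian_index,
))
```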

Silver also presented at the recent news metadata summit, and has blogged about the talk he gave that day, which specifically addressed how the news industry might deal with some of these issues.

How media sites can make use of linked data

Martin Belam, information architect for the Guardian and CurryBet blogger, reports from today’s Linked Data meet-up in London, for Journalism.co.uk.

The morning Linked Data meet-up session at ULU was part of a wider dev8d event for developers, described as ‘four days of 100 per cent pure software developer heaven’. That made it a little bit intimidating for the less technical in the audience – the notices on the rooms to show which workshops were going on were labelled with 3D barcodes, there were talks about programming ‘nanoprojectors’, and a frightening number of abbreviations like RDF, API, SPARQL, FOAF and OWL.

What is linked data?

‘Linked data’ is all about moving from a web of interconnected documents, to a web of interconnected ‘facts’. Think of it like being able to link to and access the relevant individual cells across a range of spreadsheets, rather than just having a list of spreadsheets. It looks a good candidate for being a step-change in the way that people access information over the internet.
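In code, the ‘individual cell’ analogy looks something like this hedged sketch: dereference the machine-readable description of one resource from DBpedia with Python’s rdflib, then read off a single fact (the property chosen is just an example of what the dataset may hold):

```python
from rdflib import Graph, Namespace, URIRef

DBO = Namespace("http://dbpedia.org/ontology/")
london = URIRef("http://dbpedia.org/resource/London")

# Fetch the structured description of one resource: roughly,
# opening one spreadsheet rather than a list of spreadsheets.
g = Graph()
g.parse("http://dbpedia.org/data/London.ttl")

# Read a single "cell": one fact about one thing.
print(g.value(london, DBO.populationTotal))
```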

What are the implications for journalism and media companies?

For a start it is important to realise that linked data can be consumed as well as published. Tom Heath from Talis gave the example of trying to find out about ‘pebbledash’ when buying a house.

At the moment, to learn about this takes a time-consuming exploration of the web as it stands, probably pogo-sticking between Google search results and individual web pages that may or may not contain useful information about pebbledash.

In a linked data web, finding facts about the ‘concept’ of pebbledash would be much easier. Now, replace ‘pebbledash’ as the example with the name of a company or a person, and you can see the potential for journalists in their research processes. A live example of this at work is the sig.ma search engine. Type your name in and be amazed, or horrified, at how much information computers are already able to aggregate about you from the structured data you are scattering around the web.
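Much of that scattered structured data is FOAF, one of the abbreviations from the conference signage. A small sketch of how an aggregator in the sig.ma mould reads it, using Python’s rdflib and a made-up profile:

```python
from rdflib import RDF, Graph, Namespace

FOAF = Namespace("http://xmlns.com/foaf/0.1/")

# A fabricated FOAF fragment of the kind scattered across profile
# pages and blog sidebars; every name and URL here is invented.
profile = """
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
<http://example.org/jane#me> a foaf:Person ;
    foaf:name "Jane Reporter" ;
    foaf:homepage <http://example.org/jane> ;
    foaf:knows [ foaf:name "A. Colleague" ] .
"""

g = Graph()
g.parse(data=profile, format="turtle")

# Walk the person-shaped facts the document exposes.
for person in g.subjects(RDF.type, FOAF.Person):
    print(g.value(person, FOAF.name), g.value(person, FOAF.homepage))
    for contact in g.objects(person, FOAF.knows):
        print("knows:", g.value(contact, FOAF.name))
```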

Tom Heath elaborates on this in a paper he wrote in 2008: ‘How Will We Interact with the Web of Data?’. However, as exciting as some people think linked data is, he struggled to name a ‘whizz-bang’ application that has been built yet.

Linked data at the BBC

The BBC has been the biggest media company in the UK so far to use and publish linked data. Tom Scott talked about its Wildlife Finder, which uses data to build a website that brings together natural history clips, the BBC’s news archive, and the concepts that make up our perception of the natural world.

Simply aggregating the data is not enough, and the BBC hand-builds ‘collections’ of curated items. Scott said ‘curation is the process by which aggregate data is imbued with personalised trust’, citing a collection of David Attenborough’s favourite clips as an example.

Tom Scott argued that it didn’t make sense for the BBC to spend money replicating data sources that are already available on the web, and so Wildlife Finder builds pages using existing sources like Wikipedia, WWF, ZSL and the University of Michigan Museum of Zoology. A question from the floor asked him about the issues of trust around the BBC using Wikipedia content. He said that a review of the content before the project went live showed that it was, on the whole, ‘pretty good’.

As long as the BBC was clear on the page where the data was coming from, he didn’t see there being an editorial issue.

Other presentations during the day are due to be given by John Sheridan and Jeni Tennison from data.gov.uk, Georgi Kobilarov of Uberblic Labs and Silver Oliver from the BBC. The afternoon is devoted to a more practical series of workshops allowing developers to get to grips with some of the technologies that underpin the web of data.

The Media Consortium: Media organisations should share more metadata

Great post here looking at the semantic web and how it will influence the future of online journalism.

The next phase of the semantic web will be “a step beyond aggregation that aims to make information more meaningful and useful”, and journalists and media organisations can aid this development by sharing metadata more broadly and focusing more on users’ long-term experiences of their websites.

Together, such data may be more valuable than if media organisations reserved data for their own purposes. Pooling metadata can help improve artificial intelligence, which drives the automated aspects of discovering new information on the semantic web.
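One concrete form that pooling can take: if publishers describe their articles with a common vocabulary, such as Dublin Core terms, and shared subject identifiers, merging metadata from different organisations is just a graph union. A hedged sketch with Python’s rdflib (all article URIs are invented):

```python
from rdflib import Graph, Literal, Namespace, URIRef

DCT = Namespace("http://purl.org/dc/terms/")

def describe(article_url, title, subject_uri):
    """Describe one article with shared, mergeable metadata."""
    g = Graph()
    article = URIRef(article_url)
    g.add((article, DCT.title, Literal(title)))
    g.add((article, DCT.subject, URIRef(subject_uri)))
    return g

# Two publishers describe their own stories independently...
a = describe("http://example.org/paper-a/story-1", "Semantic web takes off",
             "http://dbpedia.org/resource/Semantic_Web")
b = describe("http://example.org/paper-b/story-7", "Linked data in the newsroom",
             "http://dbpedia.org/resource/Semantic_Web")

# ...and because they used the same vocabulary and subject IDs,
# pooling the metadata is a simple union of the two graphs.
pooled = a + b
print(len(pooled), "triples describing a shared subject")
```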

The benefits for media organisations? Better websites for users, the capacity for news to challenge readers, and bottom-up rather than top-down approaches to journalism and meaning-making, suggests the post.

via The Media Consortium » Radical New Ways of Meaning-Making and Filtering.

[insite] – interview with semantic web expert Brooke Aker

Over at our sister blog insite, the excellent Colin Meek has conducted the second interview in his series on the semantic web.

On the receiving end this time is Brooke Aker, founder of Acuity Software and Cipher Systems, who answers questions on semantic search, the failures of Web 2.0 and uses for Web 3.0.

More on the semantic web can be read in our feature ‘Web 3.0: what it means for journalists’.

Slideshow on ‘Journalists and the Social Web’

Following Colin Meek’s articles for Journalism.co.uk on how journalists can get the most out of the semantic web, below is Colin’s presentation from Saturday’s seminar in Oslo on using the social web.