It’s not enough to copy those numbers into a story; what differentiates reporters from consumers is our ability to analyse data and spot trends. To make data easier to access, reorganise and sort, those figures must be pulled into a spreadsheet or database. The mechanism to do this is called web scraping, and it’s been a part of computer science and information systems work for years.
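To make the idea concrete, here is a minimal sketch of the scraping step described above: it pulls the figures out of an HTML table and rewrites them as CSV, which any spreadsheet can open. It uses only Python's standard library. The sample page and the names `SAMPLE_HTML`, `TableScraper`, `scrape_table` and `rows_to_csv` are assumptions made up for illustration; they do not come from the guide, and a real scraper would first have to download the page and cope with messier markup.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical sample page (invented for illustration); a real scraper
# would download the HTML from a website first.
SAMPLE_HTML = """
<table>
  <tr><th>Region</th><th>Count</th></tr>
  <tr><td>North</td><td>120</td></tr>
  <tr><td>South</td><td>95</td></tr>
</table>
"""


class TableScraper(HTMLParser):
    """Collects the text of every table cell, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # row currently being read, if any
        self._cell = None  # text fragments of the cell being read, if any

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def scrape_table(html):
    """Return the table cells found in `html` as a list of rows."""
    parser = TableScraper()
    parser.feed(html)
    parser.close()
    return parser.rows


def rows_to_csv(rows):
    """Serialise rows as CSV text that a spreadsheet can open."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(rows_to_csv(scrape_table(SAMPLE_HTML)))
```

Once the figures are in rows and columns like this, sorting and reorganising them is a spreadsheet job, not a programming one.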
It often takes a lot of time and effort to produce programs that extract the information, so scraping has tended to be a specialist skill. But what if there were a tool that didn’t require programming?
Using OutWit Hub, an extension for the Firefox browser, Michelle Minkoff offers a simple guide for journalists who want to scrape data from websites but don’t know how to start.
Yesterday Journalism.co.uk attended a Digital Editors Network meeting to discuss data for journalism and journalists – more to follow on Journalism.co.uk…