AAJA Convention Guide: Dan Hill on “Webscraping 101”


AAJA has asked some of this year’s speakers to give a sneak peek into their sessions. Let Dan Hill tell you what to expect during his “Webscraping 101” session.

Data rules everything around journalists, whether we’re out in the field covering our beats, filing reports in the newsroom or engaging our audiences on social media. Collecting and comprehending data are crucial skills as the abundance of online information grows.

If you’ve ever labored over pasting information from a website into your own document or Excel spreadsheet and wondered if there’s an easier way to get online data for your story, I encourage you to participate in the Webscraping 101 workshop.

Presenters Albert Sun, Frank Bi and I have used scrapers, simple scripts written in programming languages, to quickly collect data from websites, and we’re excited to show you how to make a scraper. We’ve set up a website with data on farmers’ markets in New York City, and we’ll walk you line by line through writing a scraper to extract the information.

I’m ready to make a big promise: everyone who attends Webscraping 101 with a laptop will leave the session with a spreadsheet of farmers’ market data. What could be more exciting than structured data?
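For a rough sense of what turning a web page into a spreadsheet involves, here is a minimal sketch in Python. It is not the workshop's code: the sample HTML, the market names, and the helper names `TableScraper`, `scrape_table` and `rows_to_csv` are all made up for illustration, and it assumes the data sits in a plain HTML table. It uses only Python's standard library and parses a string rather than fetching a live page.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical sample page. The workshop's real site and its markup
# are not shown in this post, so this table is invented for illustration.
SAMPLE_HTML = """
<table>
  <tr><th>Market</th><th>Borough</th><th>Day</th></tr>
  <tr><td>Example Market A</td><td>Manhattan</td><td>Saturday</td></tr>
  <tr><td>Example Market B</td><td>Brooklyn</td><td>Sunday</td></tr>
</table>
"""


class TableScraper(HTMLParser):
    """Collects the text of every table cell, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # the row currently being read, if any
        self._cell = None   # text pieces of the cell being read, if any

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def scrape_table(html):
    """Return the table in `html` as a list of rows."""
    parser = TableScraper()
    parser.feed(html)
    parser.close()
    return parser.rows


def rows_to_csv(rows):
    """Format rows as CSV text that a spreadsheet program can open."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


rows = scrape_table(SAMPLE_HTML)
csv_text = rows_to_csv(rows)
print(csv_text)
```

Saving `csv_text` to a file ending in `.csv` would give you something Excel can open. A real scraper would first download the page and would need to match that site's actual markup, which may look nothing like this sample.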

We’ll also talk about when scraping is appropriate in a journalistic context and share tools that help you scrape without writing code. Please bring a computer and come ready to code!

Webscraping 101 runs from 10 a.m. to noon on Thursday, August 22, which means I’ll have to miss “Investigative Reporting With Tight Resources.” If I weren’t leading an awesome workshop on writing a scraper and making CSVs out of online data, I’d listen to panelists including my AAJA Voices mentor Adam Causey talk about executing investigative projects.

I’m excited to see so many web-focused sessions on the schedule, like a social media panel featuring fellow Voices alum Kyle Kim. Code With Me founders and fellow Northwestern Wildcats Tom Giratikanon and Sisi Wei are leading sessions at this year’s convention, and after seeing Chrys Wu at all the nerdy data meetups in New York this summer, I’m excited to hear what she has to say about design and user experience. If you don’t have your own website yet, drop by How To Build An Online Portfolio. Frank Bi, Bobby Boos and I will hook you up.

I’ve had a blast visiting New York for my first time as an intern at The Wall Street Journal this summer, and I look forward to exploring the city with my AAJA friends. Scrape with you soon!

Dan Hill recently graduated from Northwestern University with degrees in journalism and computer science. He has participated in AAJA’s J Camp and Voices programs for student journalists and scraped data on deadline as an intern at The Wall Street Journal, The Washington Post and The Sacramento Bee. Follow him on Twitter: @DanHillReports