Transcript
Page 1: Webscraping for journalists

Webscraping for journalists
CAJ May 13, 2011

“A little Wget magic”

Page 2: Webscraping for journalists

Webscraping

Using software that simulates a web browser to download large quantities of information from a web site.
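As a rough illustration of the definition above, here is a minimal Python sketch of a program fetching a page the way a browser would. The talk's own tool is Wget; Python's standard library is swapped in here only as an example. The user-agent string, the contact address, and the example.org URL are placeholders, not taken from the talk.

```python
import urllib.request

# Hypothetical identification string; replace with your own contact details.
USER_AGENT = "example-newsroom-scraper/0.1 (contact: reporter@example.org)"


def build_request(url: str) -> urllib.request.Request:
    """Prepare a GET request that identifies the scraper, as a browser identifies itself."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


def download(url: str, timeout: float = 10.0) -> bytes:
    """Fetch one page and return its raw bytes."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        return response.read()


if __name__ == "__main__":
    # Placeholder URL; running this part requires network access.
    page = download("https://example.org/")
    print(len(page), "bytes downloaded")
```

Calling download() in a loop over a list of URLs is what turns a single fetch into scraping.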

Page 3: Webscraping for journalists

Why webscrape?

• Assemble your own copy of online data
• Save time otherwise spent pointing and clicking

Page 4: Webscraping for journalists

Why webscrape?

• Data publishers (governments) want you to access data on their terms

Page 5: Webscraping for journalists
Page 6: Webscraping for journalists
Page 7: Webscraping for journalists
Page 8: Webscraping for journalists

Is it legal?

Yes. But.

Do it ethically.
Watch for robots.txt
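One way to check robots.txt before scraping is sketched below with Python's standard urllib.robotparser module. The sample rules, the "example-scraper" agent name, and the example.org URLs are made up for illustration; a real site's file will differ, and this is only one possible approach.

```python
from urllib.robotparser import RobotFileParser

# Made-up robots.txt content for illustration only.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 5
"""


def make_parser(robots_txt: str) -> RobotFileParser:
    """Build a parser from robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = make_parser(SAMPLE_ROBOTS_TXT)

# Ask whether the (hypothetical) agent may fetch each URL.
print(parser.can_fetch("example-scraper", "https://example.org/data/report.csv"))
print(parser.can_fetch("example-scraper", "https://example.org/private/memo.html"))

# Seconds the site asks scrapers to wait between requests, if it states any.
print(parser.crawl_delay("example-scraper"))
```

For a live site, calling set_url() with the site's robots.txt address and then read() on the parser loads the real file instead of the sample text.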

