Consuming the Twitter Streaming API with Ruby and MongoDB
DESCRIPTION
A talk on connecting to the Twitter Streaming API and storing the tweets in MongoDB with Ruby.
TRANSCRIPT
Consuming the Twitter Streaming API with Ruby and MongoDB
Jeff Linwood
Lone Star Ruby Conference V
August 12, 2011
http://www.jefflinwood.com
@jefflinwood
Friday, August 12, 2011
Goals
• Watch for any tweet that contains certain keywords
• Store those tweets into the MongoDB database
Demo (@aplusk)
Twitter Streaming API + MongoDB
• Twitter Streaming API
• MongoDB
• Not a web application
The many APIs of Twitter
• Twitter Streaming API
• User Streams
• Site Streams (for the big boys)
• REST API
• Search API
Twitter Streaming API
• Keywords
• Users
• Locations
Great photo by rachel_thecat
http://www.flickr.com/photos/23209605@N00/2786126623/
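The three filter types above map onto request parameters of the filtered stream. A minimal sketch, assuming the parameter names `track`, `follow`, and `locations`, and assuming each location box is given as longitude/latitude pairs with the south-west corner first. The helper name `filter_params` is made up for illustration.

```ruby
require 'uri'

# Build the form-encoded body for a filtered stream request.
#   keywords -> "track"     (comma-separated terms)
#   user_ids -> "follow"    (comma-separated numeric ids)
#   boxes    -> "locations" (each box: [west_lon, south_lat, east_lon, north_lat])
def filter_params(keywords: [], user_ids: [], boxes: [])
  params = {}
  params['track']     = keywords.join(',')      unless keywords.empty?
  params['follow']    = user_ids.join(',')      unless user_ids.empty?
  params['locations'] = boxes.flatten.join(',') unless boxes.empty?
  URI.encode_www_form(params)
end

filter_body = filter_params(keywords: %w[ruby mongodb], user_ids: [12, 13])
puts filter_body
```

Only the filters that are actually supplied end up in the body, so a keyword-only watcher sends just `track`.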
TweetStream gem by Michael Bleigh
https://github.com/intridea/tweetstream
Connecting to the Twitter Streaming API
• JSON responses
• HTTP Basic Authentication
• One stream per account (dev/prod)
• Leave it open!
• Don’t constantly reconnect, and if you do, back off
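The "back off" advice can be sketched as a retry loop whose wait doubles after each failure. This is only an illustration: the helper name `with_backoff`, the 1-second starting delay, the 240-second cap, and the attempt limit are all assumptions, not values from the talk or from Twitter.

```ruby
# Run the block until it succeeds, sleeping longer after each failure.
# `sleeper` is injectable so the logic can be exercised without waiting.
def with_backoff(max_attempts: 5, base_delay: 1, max_delay: 240, sleeper: method(:sleep))
  attempt = 0
  begin
    attempt += 1
    yield attempt
  rescue StandardError
    raise if attempt >= max_attempts       # give up: re-raise the last error
    delay = [base_delay * (2**(attempt - 1)), max_delay].min
    sleeper.call(delay)                    # wait 1, 2, 4, ... seconds
    retry
  end
end

# Simulated connection that fails three times, then succeeds.
waits = []
outcome = with_backoff(sleeper: ->(seconds) { waits << seconds }) do |attempt|
  raise IOError, 'stream dropped' if attempt < 4
  "connected on attempt #{attempt}"
end
puts outcome
puts waits.inspect
```

In a real collector the block would open the stream and stay inside it for as long as the connection lasts, so the delays only apply when the connection drops.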
Limitations of Twitter Streaming API
• 400 Keywords
• 5,000 User Ids
• 25 Location Boxes
• Can ask Twitter for increased access
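A collector can check its filters against these default limits before it connects. A small sketch using the numbers from this slide; the constant and method names are hypothetical.

```ruby
# Default limits as listed above (accounts with increased access differ).
STREAM_LIMITS = { keywords: 400, user_ids: 5_000, boxes: 25 }.freeze

# Return a list of human-readable problems; empty when all filters fit.
def limit_violations(keywords: [], user_ids: [], boxes: [])
  counts = { keywords: keywords.size, user_ids: user_ids.size, boxes: boxes.size }
  counts.select { |name, count| count > STREAM_LIMITS[name] }
        .map { |name, count| "#{name}: #{count} exceeds #{STREAM_LIMITS[name]}" }
end

puts limit_violations(keywords: Array.new(401, 'term')).inspect
puts limit_violations(keywords: %w[ruby mongodb]).inspect
```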
A Tweet, in JSON
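For orientation, here is a heavily trimmed, made-up tweet parsed with Ruby's standard `json` library. The values are illustrative only, and a real tweet carries many more fields than the handful assumed here (`id_str`, `created_at`, `text`, `user`, `entities`).

```ruby
require 'json'

# Hypothetical sample; real payloads are far more verbose.
sample_json = <<~JSON
  {
    "id_str": "1",
    "created_at": "Fri Aug 12 15:30:00 +0000 2011",
    "text": "Learning about MongoDB at #lsrc",
    "user": { "screen_name": "example_user", "followers_count": 42 },
    "entities": { "hashtags": [ { "text": "lsrc" } ] }
  }
JSON

sample_tweet = JSON.parse(sample_json)
puts sample_tweet['text']
puts sample_tweet['user']['screen_name']
puts sample_tweet['entities']['hashtags'].map { |tag| tag['text'] }.inspect
```

The parsed result is a plain nested Hash, which is why it drops into a document store with so little work.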
Intro to MongoDB
• NoSQL - what does that mean?
• Great fit for JSON-oriented applications
• If you don’t know your schema in advance
• Query language
• Map Reduce
Storing data in MongoDB
• Native format of MongoDB is BSON, similar to JSON
• Connect to a database (similar to MySQL)
• Connect to a collection (created if it doesn’t exist)
• Insert JSON (in our case, a tweet)
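The steps above could look roughly like this. To keep the sketch runnable without a database, `store_tweet` accepts any object that responds to `insert_one`, and a small in-memory stand-in plays the collection. The commented lines show how a real collection might be obtained with the 2.x `mongo` gem; the host, database, and collection names there are assumptions, and the 1.x driver available in 2011 used a different API (`Mongo::Connection` and `insert`).

```ruby
require 'json'

# Parse one raw JSON tweet and insert it as a document.
def store_tweet(collection, raw_json)
  document = JSON.parse(raw_json)
  collection.insert_one(document)
  document
end

# With the mongo gem (2.x), a real collection might come from:
#   require 'mongo'
#   client     = Mongo::Client.new(['127.0.0.1:27017'], database: 'tweets')
#   collection = client[:tweets]   # created on first insert if missing

# Stand-in used here so the example runs without a server.
class InMemoryCollection
  attr_reader :documents

  def initialize
    @documents = []
  end

  def insert_one(document)
    @documents << document
  end
end

memory_collection = InMemoryCollection.new
store_tweet(memory_collection, '{"id_str": "1", "text": "hello #lsrc"}')
puts memory_collection.documents.inspect
```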
MongoDB + Ruby
• mongo gem
• bson_ext gem
• http://www.mongodb.org/display/DOCS/Ruby+Language+Center
Considerations for MongoDB
• Tweets - very verbose JSON
• Date format in Tweets is not the same as MongoDB’s
• May want to pre-process Tweets
• Can use both MongoDB and MySQL in same app if you want
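One possible pre-processing step: turn the tweet's `created_at` string into a Ruby `Time` (which the Ruby driver can store as a native date) and keep only a few fields. The format string assumes timestamps shaped like `Fri Aug 12 15:30:00 +0000 2011`, and the choice of fields to keep is arbitrary.

```ruby
require 'time'

# Assumed shape of created_at: "Fri Aug 12 15:30:00 +0000 2011"
TWITTER_TIME_FORMAT = '%a %b %d %H:%M:%S %z %Y'.freeze

# Hypothetical subset of top-level fields worth keeping.
KEPT_FIELDS = %w[id_str text].freeze

# Return a slimmer copy of the tweet with a real Time in created_at.
def preprocess_tweet(tweet)
  slim = tweet.select { |key, _value| KEPT_FIELDS.include?(key) }
  slim['created_at']  = Time.strptime(tweet['created_at'], TWITTER_TIME_FORMAT).utc
  slim['screen_name'] = tweet.dig('user', 'screen_name')
  slim
end

verbose_tweet = {
  'id_str'     => '1',
  'text'       => 'hello',
  'created_at' => 'Fri Aug 12 15:30:00 +0000 2011',
  'source'     => 'web',
  'user'       => { 'screen_name' => 'example_user', 'followers_count' => 42 }
}
slim_tweet = preprocess_tweet(verbose_tweet)
puts slim_tweet.inspect
```

Storing dates as dates rather than strings is what makes range queries such as "tweets from the last hour" practical later on.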
Here’s the code
• Okay, the whole thing is really done in about three lines.
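A rough sketch of what such a loop might look like. This is not the author's code, which lives in the Tweeter-Keeper repository listed under To Learn More. It assumes a stream client with a `track(*keywords)` method that yields each matching status (the TweetStream client offers a method of that name, though what it yields depends on the gem version) and a collection with `insert_one`. Stand-ins are used for both so the example runs on its own.

```ruby
# The core really is about three lines: track keywords, insert each status.
def keep_tweets(stream, collection, *keywords)
  stream.track(*keywords) do |status|
    collection.insert_one(status.to_h)
  end
end

# Stand-in for a streaming client: replays canned statuses that match.
class FakeStream
  def initialize(statuses)
    @statuses = statuses
  end

  def track(*keywords)
    @statuses.each do |status|
      text = status['text'].downcase
      yield status if keywords.any? { |word| text.include?(word.downcase) }
    end
  end
end

# Stand-in for a MongoDB collection.
FakeCollection = Struct.new(:documents) do
  def insert_one(document)
    documents << document
  end
end

canned = [
  { 'text' => 'Ruby is fun' },
  { 'text' => 'Lunch time' },
  { 'text' => 'MongoDB stores documents' }
]
kept = FakeCollection.new([])
keep_tweets(FakeStream.new(canned), kept, 'ruby', 'mongodb')
puts kept.documents.inspect
```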
Where do you run this?
• Command line
• Your own server
• Heroku + MongoLab
• Other cloud services
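Running the same script locally and on a hosted service usually comes down to where the connection settings come from. A sketch, assuming the hosting environment exposes the database as a `mongodb://` URI in an environment variable; the variable name `MONGOLAB_URI`, the local default, and the helper name are all assumptions.

```ruby
require 'uri'

# Fallback for running from the command line against a local server.
DEFAULT_MONGO_URI = 'mongodb://127.0.0.1:27017/tweets'.freeze

# Split a mongodb:// URI into the pieces a driver needs.
def mongo_settings(env = ENV)
  uri = URI.parse(env.fetch('MONGOLAB_URI', DEFAULT_MONGO_URI))
  {
    host:     uri.host,
    port:     uri.port || 27_017,
    database: uri.path.sub(%r{\A/}, ''),
    user:     uri.user,
    password: uri.password
  }
end

puts mongo_settings({}).inspect
puts mongo_settings('MONGOLAB_URI' => 'mongodb://app:secret@db.example.com:27017/app_db').inspect
```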
MongoLab
Now what?
• Step 1: Collect Tweets
• Step 2: ????
• Step 3: Profit!
To Learn More
• https://dev.twitter.com/docs/streaming-api/concepts
• http://www.mongodb.org/
• https://github.com/jefflinwood/Tweeter-Keeper
• http://www.jefflinwood.com/
• @jefflinwood on Twitter