A small library that downloads web pages, parses them, and saves the needed data in MongoDB.
Uses Hpricot for HTML parsing.
Right now it parses IMDB movies using simple multithreading; it's a quick trial of some functions. The IMDB part was inspired by an article that used the technique. I added threading and MongoDB from there.
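For illustration only, here is a rough sketch of the download, parse, save flow with a simple thread pool. It is not the library's actual code: `crawl` and its `fetch`, `parse`, and `save` parameters are hypothetical names, and the three steps are passed in as callables so the example runs without network access, Hpricot, or a MongoDB server. In real use, `fetch` would presumably do the HTTP request, `parse` would use Hpricot, and `save` would insert into a MongoDB collection.

```ruby
# Hypothetical helper, not part of this library's API.
# Worker threads pull URLs from a shared queue until it is empty.
def crawl(urls, fetch:, parse:, save:, workers: 4)
  queue = Queue.new
  urls.each { |url| queue << url }

  threads = Array.new(workers) do
    Thread.new do
      loop do
        url = begin
          queue.pop(true) # non-blocking pop; raises ThreadError when empty
        rescue ThreadError
          break           # no work left, so this worker stops
        end
        save.call(parse.call(url, fetch.call(url)))
      end
    end
  end
  threads.each(&:join)
end

# Stand-ins for the real steps (assumptions for this sketch).
pages = {
  "http://example.com/movie/1" => "<html><title>First Movie</title></html>",
  "http://example.com/movie/2" => "<html><title>Second Movie</title></html>"
}
fetch = ->(url) { pages.fetch(url) }                     # real code: HTTP GET
parse = ->(url, html) { { url: url, title: html[/<title>(.*?)<\/title>/m, 1] } }
store = []                                               # real code: a MongoDB collection
lock  = Mutex.new
save  = ->(doc) { lock.synchronize { store << doc } }    # guard the shared array

crawl(pages.keys, fetch: fetch, parse: parse, save: save, workers: 2)
```

The queue hands each URL to exactly one worker, and the mutex keeps concurrent saves from interleaving on the shared in-memory store.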