Download and archive all the posts you have liked and all the posts from the blogs you follow on Tumblr, using the Tumblr API.
Requires Python >= 3.5. You can install Python from the official website, or install Anaconda 3 instead.
* You can use virtualenv, conda create or pipenv to create an isolated running environment.
Install the requirements with:
pip install -r requirements.txt
Or install the requirements manually:
Package Name |
---|
git+https://github.com/tumblr/pytumblr.git |
requests |
pyyaml |
beautifulsoup4 |
lxml |
Note that the released pytumblr package may not be the latest version, so it is better to run `pip install git+https://github.com/tumblr/pytumblr.git` to install the latest version of pytumblr.
In some regions you will need a proxy to use this downloader. If the downloader does not go through the proxy, try setting the proxy to Global Mode and retry.
- Go to https://www.tumblr.com/oauth/apps to register an application and get an OAuth key. The downloader authenticates through pytumblr with OAuth 1.0a (a consumer key and secret plus a per-user token) to access the content of your blog via the Tumblr API. (Get to know about OAuth)
Note that the Tumblr API has rate limits, so do not overuse it or share your OAuth key publicly.
Rate Limits
Newly registered consumers are rate limited to 1,000 requests per hour, and 5,000 requests per day. If your application requires more requests for either of these periods, please use the 'Request rate limit removal' link on an app above.
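If you script your own requests against the API, you can count them to stay under the limits quoted above. The following is a minimal sketch and not part of this downloader: `RequestBudget` and its methods are hypothetical names, and the default numbers are the limits quoted above for newly registered consumers.

```python
import time
from collections import deque


class RequestBudget:
    """Track request times and report whether another request fits the limits.

    Hypothetical helper for illustration; not part of pytumblr or this downloader.
    """

    def __init__(self, per_hour=1000, per_day=5000, clock=time.time):
        self.per_hour = per_hour
        self.per_day = per_day
        self.clock = clock      # injectable so the logic can be tested
        self.stamps = deque()   # times of the requests sent in the last day

    def _prune(self, now):
        # Forget requests older than one day; they no longer count.
        while self.stamps and now - self.stamps[0] >= 86400:
            self.stamps.popleft()

    def allowed(self):
        """Return True if one more request stays within both limits."""
        now = self.clock()
        self._prune(now)
        last_hour = sum(1 for t in self.stamps if now - t < 3600)
        return last_hour < self.per_hour and len(self.stamps) < self.per_day

    def record(self):
        """Call once for every request actually sent."""
        self.stamps.append(self.clock())
```

Call `allowed()` before each request and `record()` after sending it; when `allowed()` returns `False`, wait before trying again.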
- After registration you will get an OAuth Consumer Key and a Secret Key.
- Configure the functions you need in main.py. Three functions are currently provided:
Function Name | Explanation |
---|---|
download_likes() | Download all the posts you liked |
download_following() | Download all the posts in the blogs you are following |
download_blog(name or URL of the blog) | Download all the posts in the blog you specified |
- The required parameter of `download_blog` is the name or URL of the blog. Taking the official support blog as an example, the blog name is `support` and the URL is `support.tumblr.com`.
- `download_blog` has two optional parameters:
  - `before_timestamp` is a Unix timestamp. All posts published before this timestamp are downloaded, from newest to oldest. If it is not specified (the default), the present time is used, which means all posts are downloaded. This parameter is useful when the script breaks and you want to resume it.
  - `max_count` limits the number of posts downloaded, in case one blog takes too much time. If it is not specified, all posts are downloaded.
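To resume from a known date you need that date as a Unix timestamp. A minimal sketch using only the standard library follows; `to_unix_timestamp` is a hypothetical helper, and the commented call assumes the optional parameters can be passed as keyword arguments.

```python
from datetime import datetime, timezone


def to_unix_timestamp(year, month, day):
    """Return the Unix timestamp (in seconds) for midnight UTC on the given date."""
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())


ts = to_unix_timestamp(2020, 1, 1)
print(ts)  # 1577836800

# Assumed usage in main.py (keyword form is an assumption):
# download_blog("support", before_timestamp=ts, max_count=100)
```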
- `download_likes` has two optional parameters:
  - `before_timestamp` is described above.
  - `rename` controls whether all files are renamed as `blog-{no.}`. `True` is the default. If set to `False`, the original post's name is used.
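One way to read the `blog-{no.}` pattern is "blog name, a dash, then a running number". The sketch below only illustrates that reading; `numbered_name` is a hypothetical helper, and the exact scheme the downloader uses (numbering, padding, extension handling) is an assumption here.

```python
import os


def numbered_name(blog, number, original_filename):
    """Build '<blog>-<number><ext>', keeping the original file extension.

    Assumed naming scheme for illustration; the downloader may differ.
    """
    _, ext = os.path.splitext(original_filename)
    return "{}-{}{}".format(blog, number, ext)


print(numbered_name("support", 3, "tumblr_abc123_1280.jpg"))  # support-3.jpg
```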
- `download_following` has three optional parameters:
  - `start_blog` specifies which blog to start from. This is useful when the script breaks and you want to resume it.
  - `start_page` is the page number to start from.
  - `max_page` is the maximum page number to download, in case one blog takes too much time. `max_page` cannot be larger than 50, since the downloader cannot access more pages than that via the Tumblr API. The API documentation notes: "When using the offset parameter the maximum limit on the offset is 1000. If you would like to get more results than that use either before or after."
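The 50-page ceiling follows from the offset limit if each page holds 20 posts. The sketch below shows that arithmetic; the page size of 20 and zero-based page numbering are assumptions, and `page_offsets` is a hypothetical helper rather than a function of this downloader.

```python
MAX_OFFSET = 1000  # offset limit quoted above
PAGE_SIZE = 20     # assumption: posts requested per page


def page_offsets(start_page=0, max_page=50, page_size=PAGE_SIZE):
    """Yield (page, offset) pairs, stopping once the offset would exceed the limit."""
    for page in range(start_page, max_page):
        offset = page * page_size
        if offset > MAX_OFFSET:
            break
        yield page, offset


pages = list(page_offsets())
print(len(pages))  # 50
print(pages[-1])   # (49, 980)
```

Asking for more pages than the limit allows simply stops early: `list(page_offsets(48, 60))` ends at page 50, whose offset is exactly 1000.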
- Set the downloader to not download reblog posts by setting `downloader.reblog = False`.
- Set the downloader to skip content that has already been downloaded, say from a previous run, by setting `downloader.redownload = False`.
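The effect of these two switches can be pictured as a filter applied before each file is fetched. This is a sketch of that idea only, not the downloader's actual code: `should_download`, its `is_reblog` argument and the default values are all assumptions.

```python
import os


def should_download(path, is_reblog, reblog=True, redownload=True):
    """Decide whether a file should be fetched.

    path       -- where the file would be saved
    is_reblog  -- whether the post is a reblog (how this is detected is not shown)
    reblog     -- mirrors downloader.reblog; False skips reblog posts
    redownload -- mirrors downloader.redownload; False skips existing files
    """
    if is_reblog and not reblog:
        return False
    if os.path.exists(path) and not redownload:
        return False
    return True
```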
- Run `python main.py` to start the first-time configuration. You will be taken to an interactive console provided by pytumblr.
  - First, enter the OAuth Consumer Key and Secret Key you obtained earlier.
  - The console will return an authorization URL that grants the downloader access to your Tumblr account. Copy and paste it into a web browser and visit it. The page will ask you to authorize the application; allow it. You will then be redirected to another URL, which contains the `oauth_verifier` token. Copy and paste it back into the console.
  - Finally, the downloader will get the `oauth_token` and `oauth_token_secret` and continue to download your blog.
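If the console asks for the verifier alone rather than the full redirected URL, you can pick it out of the query string by hand or with a few lines of Python. The sketch below uses only the standard library; `extract_oauth_verifier` is a hypothetical helper and the example URL is made up.

```python
from urllib.parse import urlparse, parse_qs


def extract_oauth_verifier(redirect_url):
    """Return the oauth_verifier query parameter from a redirected URL."""
    query = parse_qs(urlparse(redirect_url).query)
    values = query.get("oauth_verifier")
    if not values:
        raise ValueError("no oauth_verifier parameter in URL")
    return values[0]


# Made-up example URL; the real one comes from your browser's address bar.
url = "https://example.com/callback?oauth_token=abc&oauth_verifier=xyz123#_=_"
print(extract_oauth_verifier(url))  # xyz123
```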