Weibo Favorites Backup Tool

This is a command-line tool for backing up and restoring Weibo favorites ("我的收藏", i.e. "My Favorites"). It downloads all of your favorited Weibo posts and saves them to your local machine.

Motivation

I had favorited so many posts on Weibo that they became hard to manage. Over time, I found that some of them were no longer available, so I wrote a command-line tool to back them up and restore them.

How to Use

Installation

To install the Weibo Favorites Backup Tool, follow these steps:

Clone the repository:

git clone https://github.com/xujiajiadexiaokeai/get-weibo-favorites.git
cd get-weibo-favorites

Install the package and its dependencies in editable mode:

pip install -e .

Usage

The tool has two main components:

A crawler for backing up your Weibo favorites
A web interface for managing and viewing your backed-up favorites

Getting Cookies

First, you need to get your Weibo cookies:

python -m weibo_favorites.crawler.auth

This will:

Open a Chrome browser window
Navigate to Weibo's login page
Wait for you to manually log in
Save your cookies to data/weibo_cookies.json after successful login
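The saved cookie file can then be turned into a Cookie header for later requests. The following is a minimal sketch, not the project's own code. It assumes the file holds a JSON list of objects with "name" and "value" keys, which is how browser automation tools commonly export cookies; the function name load_cookie_header and the cookie names in the demo are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def load_cookie_header(path):
    """Build a Cookie header string from a saved cookie file.

    ASSUMPTION: the file is a JSON list of {"name": ..., "value": ...}
    objects. The real layout of data/weibo_cookies.json may differ.
    """
    cookies = json.loads(Path(path).read_text(encoding="utf-8"))
    return "; ".join(f"{c['name']}={c['value']}" for c in cookies)


# Demo against a throwaway file so the sketch runs without a real login.
with tempfile.TemporaryDirectory() as tmp:
    cookie_file = Path(tmp) / "weibo_cookies.json"
    sample = [
        {"name": "session_id", "value": "abc"},  # placeholder cookie
        {"name": "token", "value": "xyz"},       # placeholder cookie
    ]
    cookie_file.write_text(json.dumps(sample), encoding="utf-8")
    header = load_cookie_header(cookie_file)

print(header)
```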

Scheduling the Favorites Crawl

Once the cookies are saved, start the scheduler to back up your favorites:

python -m weibo_favorites.scheduler

This will:

Load cookies from data/weibo_cookies.json
Periodically fetch your Weibo favorites page by page
Save the results to data/weibo_favorites.db
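The page-by-page loop described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the scheduler's actual implementation: the crawl function, the fetch_page callback, and the two-column table layout are all hypothetical, and a fake in-memory data source stands in for the Weibo API.

```python
import sqlite3
import time


def crawl(fetch_page, conn, delay=0.0, max_pages=100):
    """Fetch pages until one comes back empty, storing new items.

    fetch_page(page) is a hypothetical callback returning a list of
    dicts with "id" and "text" keys. Returns the number of new rows.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS favorites (id TEXT PRIMARY KEY, text TEXT)"
    )
    saved = 0
    for page in range(1, max_pages + 1):
        items = fetch_page(page)
        if not items:
            break  # an empty page means there is nothing left to fetch
        with conn:  # commit once per page
            for item in items:
                cur = conn.execute(
                    "INSERT OR IGNORE INTO favorites (id, text) VALUES (?, ?)",
                    (item["id"], item["text"]),
                )
                saved += cur.rowcount  # 0 when the id was already stored
        time.sleep(delay)  # pause between requests
    return saved


# Fake data source: two pages with one overlapping item.
PAGES = {
    1: [{"id": "1", "text": "a"}, {"id": "2", "text": "b"}],
    2: [{"id": "2", "text": "b"}, {"id": "3", "text": "c"}],
}

conn = sqlite3.connect(":memory:")
saved = crawl(lambda page: PAGES.get(page, []), conn)
total = conn.execute("SELECT COUNT(*) FROM favorites").fetchone()[0]
print(saved, total)
```

Using the post ID as the primary key with INSERT OR IGNORE keeps repeated runs from storing the same favorite twice, which matters for a job that runs periodically.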

Running the Web Interface

python -m weibo_favorites.web.app

Once started, the web interface is available at http://localhost:5001

It displays:

Crawler running status
Crawler running logs
Backed-up favorites

Configuration

All configuration items are managed in src/weibo_favorites/config.py:

Data file paths: DATA_DIR
Log file paths: LOGS_DIR
API configuration: BASE_URL, REQUEST_DELAY
Logging configuration: LOG_LEVEL, LOG_FORMAT
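For orientation, a configuration module with these names might look like the sketch below. Only the variable names and the data directory come from this README; every value shown (the logs directory, URL, delay, log level, and format string) is an illustrative placeholder, not the project's actual default.

```python
import logging
from pathlib import Path

# Data file paths (the README stores files under data/)
DATA_DIR = Path("data")

# Log file paths (placeholder location)
LOGS_DIR = Path("logs")

# API configuration (placeholder values, not the real endpoint or delay)
BASE_URL = "https://example.com/api"
REQUEST_DELAY = 2.0  # seconds to wait between requests

# Logging configuration (placeholder values)
LOG_LEVEL = logging.INFO
LOG_FORMAT = "%(asctime)s %(levelname)s %(name)s: %(message)s"

# Example of deriving a file path from the settings above.
cookie_path = DATA_DIR / "weibo_cookies.json"
print(cookie_path.as_posix())
```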

Data Format

The crawler saves favorites in JSON format. Each favorite contains:

Basic information:
- id: Weibo ID (string)
- created_at: Creation time of the Weibo post
- collected_at: Time when the post was collected by the crawler
- url: Direct link to the Weibo post
- source: Post source (e.g., "iPhone客户端", "微博 weibo.com")
Content:
- text: Raw text content of the post
- text_html: HTML formatted text content
- is_long_text: Whether the post is a long text post
- links: List of external links in the post
User information:
- user_id: User ID (string)
- user_name: Username (screen name)

Example:

{
    "id": "4884450687058493",
    "created_at": "Tue Mar 28 20:15:47 +0800 2023",
    "url": "https://weibo.com/1727858283/MzqWMtCeF",
    "user_name": "Example User",
    "user_id": "1727858283",
    "is_long_text": true,
    "text": "这是一条微博的原始文本内容",
    "text_html": "这是一条微博的<a href='...'>HTML格式</a>文本内容",
    "source": "iPhone客户端",
    "links": ["https://example.com/link1", "https://example.com/link2"],
    "collected_at": "2023-12-20 15:30:45"
}
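A record in this shape can be loaded into a typed object for further processing. The sketch below is one possible way to do that; the Favorite dataclass and parse_favorite function are hypothetical names, and the two timestamp formats are inferred from the example record above rather than from a documented specification.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Favorite:
    id: str
    created_at: datetime
    url: str
    user_name: str
    user_id: str
    is_long_text: bool
    text: str
    text_html: str
    source: str
    links: list
    collected_at: datetime


def parse_favorite(data):
    """Convert a decoded JSON record into a Favorite.

    ASSUMPTION: timestamps follow the formats seen in the example record.
    """
    fields = dict(data)
    # e.g. "Tue Mar 28 20:15:47 +0800 2023" (timezone-aware)
    fields["created_at"] = datetime.strptime(
        fields["created_at"], "%a %b %d %H:%M:%S %z %Y"
    )
    # e.g. "2023-12-20 15:30:45" (no timezone information)
    fields["collected_at"] = datetime.strptime(
        fields["collected_at"], "%Y-%m-%d %H:%M:%S"
    )
    return Favorite(**fields)


record = {
    "id": "4884450687058493",
    "created_at": "Tue Mar 28 20:15:47 +0800 2023",
    "url": "https://weibo.com/1727858283/MzqWMtCeF",
    "user_name": "Example User",
    "user_id": "1727858283",
    "is_long_text": True,
    "text": "raw text",
    "text_html": "<a href='...'>HTML</a> text",
    "source": "iPhone客户端",
    "links": ["https://example.com/link1"],
    "collected_at": "2023-12-20 15:30:45",
}

fav = parse_favorite(record)
print(fav.created_at.isoformat())
```

Note that the day and month abbreviations are parsed according to the current locale, so this assumes an English or C locale.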

License

Apache License 2.0
