diff --git a/README.md b/README.md
index b1d6faa..f9b0276 100644
--- a/README.md
+++ b/README.md
@@ -3,9 +3,9 @@
 [![PyPI](https://img.shields.io/pypi/v/warcdb.svg)](https://pypi.org/project/warcdb/)
 [![Tests](https://github.com/Florents-Tselai/WarcDB/actions/workflows/run-tests.yaml/badge.svg?branch=main)](https://github.com/Florents-Tselai/WarcDB/actions/workflows/run-tests.yaml)
 [![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/Florents-Tselai/WarcDB/blob/main/LICENSE)
-![GitHub Stars](https://img.shields.io/github/stars/Florents-Tselai/WarcDB)
+[![GitHub Stars](https://img.shields.io/github/stars/Florents-Tselai/WarcDB)
 
-`WarcDB` is a an `SQLite`-based file format that makes web crawl data easier to share and query.
+`WarcDB` is an `SQLite`-based file format that makes web crawl data easier to share and query.
 
 It is based on the standardized [Web ARChive format](https://en.wikipedia.org/wiki/Web_ARChive), used by web archives, and defined in [ISO 28500:2017](https://iipc.github.io/warc-specifications/specifications/warc-format/warc-1.1/).