Commit

Merge pull request #11 from edsu/pyproject
Update to using pyproject.toml
Florents-Tselai committed Oct 18, 2023
2 parents fe4f91c + f37b883 commit e5f7446
Showing 6 changed files with 60 additions and 66 deletions.
5 changes: 3 additions & 2 deletions .github/workflows/run-tests.yaml
@@ -17,8 +17,9 @@ jobs:

  - name: Install test requirements
  run: |
- pip install tox
+ pip install poetry
+ poetry install
  - name: Run tests
  run: |
- tox
+ poetry run pytest
4 changes: 3 additions & 1 deletion .gitignore
@@ -2,4 +2,6 @@ build/
  dist/
  warcio_sqlite.egg-info/
  .idea/
- *.db
+ *.db
+ poetry.lock
+ __pycache__
18 changes: 17 additions & 1 deletion README.md
@@ -18,7 +18,7 @@ warcdb import archive.warcdb ./tests/google.warc ./tests/frontpages.warc.gz "htt

  warcdb enable-fts ./archive.warcdb response payload

- # Saarch for records that mention "stocks" in their response body
+ # Search for records that mention "stocks" in their response body
  warcdb search ./archive.warcdb response "stocks" -c "WARC-Record-ID"
  ```
  As you can see you can use any mix of local/remote and raw/compressed archives.
@@ -107,6 +107,22 @@ where json_extract(h.value, '$.header') like '%Cookie%'
  SQL
  ```
+ ## Develop
+ You can use poetry to install dependencies and run the tests:
+ ```
+ $ git clone https://github.com/Florents-Tselai/WarcDB.git
+ $ cd WarcDB
+ $ poetry install
+ $ poetry run pytest
+ ```
+ Then when you are ready to publish to PyPI:
+ ```
+ $ poetry publish --build
+ ```
  Resources on WARC
  ----------------
37 changes: 37 additions & 0 deletions pyproject.toml
@@ -0,0 +1,37 @@
+ [tool.poetry]
+ name = "warcdb"
+ version = "0.1.0"
+ description = "WarcDB: Web crawl data as SQLite databases"
+ authors = ["Florents Tselai <florents@tselai.com>"]
+ readme = "README.md"
+ license = "Apache License, Version 2.0"
+ repository = "https://github.com/Florents-Tselai/warcdb"
+ classifiers = [
+ "Intended Audience :: Developers",
+ "Intended Audience :: Science/Research",
+ ]
+
+ [tool.poetry.scripts]
+ warcdb = "warcdb:warcdb_cli"
+
+ [tool.poetry.dependencies]
+ python = "^3.9"
+ sqlite-utils = "^3.26"
+ warcio = "^1.7"
+ click = "^8.1"
+ more-itertools = "^10.1"
+ tqdm = "^4.66"
+ requests = "^2.31"
+
+ [tool.poetry.group.test.dependencies]
+ pytest = "^7.4"
+ black = "^23.10"
+
+ [tool.pytest.ini_options]
+ testpaths = [
+ "tests"
+ ]
+
+ [build-system]
+ requires = ["poetry-core"]
+ build-backend = "poetry.core.masonry.api"
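
Note on the version constraints in the new pyproject.toml: every dependency uses Poetry's caret (`^`) form, which allows updates that leave the leftmost non-zero version component unchanged. The sketch below illustrates that rule so the effective ranges are easier to read. It assumes plain numeric versions with at most three components, and `caret_range` is a hypothetical helper written for this note, not part of Poetry or WarcDB.

```python
def caret_range(spec: str) -> tuple[str, str]:
    """Return (inclusive lower bound, exclusive upper bound) for a caret constraint.

    Hypothetical helper: handles only numeric versions such as "^3.26" or "^0.2.3".
    """
    if not spec.startswith("^"):
        raise ValueError(f"not a caret constraint: {spec!r}")
    parts = [int(p) for p in spec[1:].split(".")]
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"expected one to three components: {spec!r}")

    # Pad the lower bound to major.minor.patch.
    lower = parts + [0] * (3 - len(parts))

    # Bump the leftmost non-zero component; if every given component is
    # zero, bump the last one that was written (e.g. "^0.0" -> <0.1.0).
    bump = next((i for i, p in enumerate(parts) if p != 0), len(parts) - 1)
    upper = lower[:bump] + [lower[bump] + 1] + [0] * (2 - bump)

    return ".".join(map(str, lower)), ".".join(map(str, upper))


if __name__ == "__main__":
    # Constraints copied from [tool.poetry.dependencies] above.
    for name, spec in [("python", "^3.9"), ("sqlite-utils", "^3.26"), ("warcio", "^1.7")]:
        low, high = caret_range(spec)
        print(f"{name}: >={low}, <{high}")
```

Under these assumptions, `sqlite-utils = "^3.26"` accepts any 3.x release from 3.26 onward but not 4.0, and `python = "^3.9"` accepts 3.9 and later 3.x interpreters.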
56 changes: 0 additions & 56 deletions setup.py

This file was deleted.

6 changes: 0 additions & 6 deletions tox.ini

This file was deleted.
