Releases: CoLRev-Environment/colrev
Releases · CoLRev-Environment/colrev
Version 0.7.1
Added
- Github action: publish to PyPI
Version 0.7.0
Added
- Add retrieve and pdfs as high-level operations
- Metadata preparation can add records to separate origin feeds
- Initial package manager functionality (registering packages and displaying them in the docs)
- Search: update of records and propagation of changes
- Several SearchSources (including SearchSource query validation)
- Revisions of CLI (verbose mode, user feedback)
- Colrev merge (reconciliation coding when merging git branches)
- dedupe --merge/--unmerge
- Integrated colrev pre-commit hooks
- PRISMA diagram (data endpoint)
- Obsidian (data endpoint)
- Preparation: not-in-toc exception/warning
- Setup of pytests
Changed
- Curated records are now explicitly identified through curation_IDs
- Revise colrev validate (commits, users, properties)
- Detailed advisor (using get_advice() for data endpoints)
- Performance improvements and simplification of status (cli)
- Moved correction functionality to SearchSources (refactored correction path)
- Preparation: simplified preparation rounds (default settings)
- Retrieve TEIs through local_index (if available) instead of recreating it
- Replace pathos by Threadpool
- Revise the documentation
- Revise and extend exceptions
Removed
- Remove persistent colrev-ids
- Remove realtime review
- Dependencies ansiwrap and p-tqdm
Fixed
- **kwargs calls in ReviewManager
- Indexing of non-curated records
- Address special cases in dedupe (active learning)
Version 0.6.0
Added
- Web-based editor for project settings
- Comprehensive architecture refactoring
- Conformance with pylint, mypy, flake8
- Introduced packages
- Updated file and directory structure
- Documentation of modules, classes, and methods
- Github-pages as a data package_endpoint
Changed
- Renamed from colrev_core to colrev (integrated cli)
- Switch to poetry for dependency management
- Renamed scripts to package_endpoints
- PDF-hash generation based on Docker to avoid platform dependency issues
- Switch to Jinja templates (instead of concatenating multiple strings)
Fixed
- Concurrent request session handling
- StatusStats calculations
Version 0.5.0
Added
- Push/pull (including corrections), sync, validate, service operations
- Data provenance model (colrev_data_provenance, colrev_masterdata_provenance)
- Extensible endpoints (search, prep, prescreen, pdf-get, pdf-prep, screen, data)
- Prescreen scope
Changed
- Improvements: prep, dedupe operations
- Performance improvements (e.g., status, bibtexparser > pybtex)
- Extended Record class (e.g., merge and fuse_best_fields)
- LocalIndex: Elasticsearch to Opensearch
- Dedupe: testing and parameter optimization (option to prevent same-source merges)
- Settings.json and validation
- Updated documentation
- Testing and refactoring (e.g., for Windows, prefer keyword arguments in functions, python package type information)
Version 0.4.0
Added
- Extract functionality: ReviewDataset, Process
- Developed LocalIndex, EnvironmentManager, OpenSearch
- Curation model, including Resource installation and a "correction path"
- Search operation (reintegrating paper_feed and local_paper_index)
- Prep exclusion based on languages
Changed
- Object-oriented refactoring of the whole codebase
- Use Zotero translators (instead of bibutils) for imports
- Duplicate identification (add FP safeguards based on LocalIndex, add a procedure for small samples)
- Consistent PDF path handling
- Structured data extraction based on csv
Fixed
- Loggers
- Performance issues in prep and status
Version 0.3.0
Added
- Introduced ReviewManager and integrated hooks/checks
- Fetch metadata from Open Library
- Required fields for misc
- Information on needs_manual_preparation (man_prep_hints)
- Activated mypy hooks
- Introduced custom load scripts
- Documentation
- LocalIndex: hash-table implementation for indexing and retrieval
Changed
- Dedupe: based on active learning (dedupe-io)
- Improved batches
- Pass records instead of BibDatabase
- PDF prep and longer pdf hashes
Removed
- CLI: now in separate colrev repository
Fixed
- Initializing repositories
- Backward search adds two entries to search_details
- Logging (reinitialize after batches/commits)
Version 0.2.0
Added
- Status model (rev_status, md_status, pdf_status)
- Implemented cli interface
- Import formats (bib, ris, endn, pdf, text list of references)
- Docker services for import, ocr, building the paper etc.
- Metadata repositories for record preparation (crossref, dblp, semantic scholar)
- PDF preparation (OCR, metadata validation)
- Commit message reporting
- Check and validation of iteration completeness
- Support for building papers based on pandoc
Changed
- Integrated review process status (including prescreen, screen inclusion vs exclusion) in the references.bib
- Renamed scripts and cli entrypoints
- Refactored code
- Tracing from hash_id to origin links
- Extended and refactored pre-commit hooks
Removed
- R scripts for sample statistics (the goal is to implement them in Python)
- hash_id function, trace_entry, trace_hash_id
Fixed
- Bugs in
analysis/combine_individual_search_results.py
and inanalysis/acquire_pdfs.py
- Catch exceptions and check bad responses in
analysis/acquire_pdfs.py
- Bug in git modification check for
references.bib
inanalysis/utils.py
- Exception in
anaylsis/screen_2.py
(IndexError) - Global constant conflict with
analysis/entry_hash_function.py
(nameparser.config/CONSTANTS)
Version 0.1.0
Added
- First version of the pipeline, including
status
,reformat_bibliography
,trace_entry
,trace_hash_id
,combine_individual_search_results
,cleanse_records
,screen_sheet
,screen_1
,acquire_pdfs
,screen_2
,data_sheet
anddata_pages
- Environment setup including
Dockerfile
andMakefiles