Releases: CoLRev-Environment/colrev

Version 0.7.1

08 Mar 07:35

Added

  • GitHub Action: publish to PyPI

Version 0.7.0

16 Jan 21:05

Added

  • Add retrieve and pdfs as high-level operations
  • Metadata preparation can add records to separate origin feeds
  • Initial package manager functionality (registering packages and displaying them in the docs)
  • Search: update of records and propagation of changes
  • Several SearchSources (including SearchSource query validation)
  • Revisions of CLI (verbose mode, user feedback)
  • Colrev merge (reconciliation coding when merging git branches)
  • dedupe --merge/--unmerge
  • Integrated colrev pre-commit hooks
  • PRISMA diagram (data endpoint)
  • Obsidian (data endpoint)
  • Preparation: not-in-toc exception/warning
  • Setup of pytests

Changed

  • Curated records are now explicitly identified through curation_IDs
  • Revise colrev validate (commits, users, properties)
  • Detailed advisor (using get_advice() for data endpoints)
  • Performance improvements and simplification of status (cli)
  • Moved correction functionality to SearchSources (refactored correction path)
  • Preparation: simplified preparation rounds (default settings)
  • Retrieve TEIs through local_index (if available) instead of recreating it
  • Replace pathos with ThreadPool
  • Revise the documentation
  • Revise and extend exceptions

Removed

  • Remove persistent colrev-ids
  • Remove realtime review
  • Dependencies ansiwrap and p-tqdm

Fixed

  • **kwargs calls in ReviewManager
  • Indexing of non-curated records
  • Address special cases in dedupe (active learning)

Version 0.6.0

12 Oct 19:44

Added

  • Web-based editor for project settings
  • Comprehensive architecture refactoring
  • Conformance with pylint, mypy, flake8
  • Introduced packages
  • Updated file and directory structure
  • Documentation of modules, classes, and methods
  • GitHub Pages as a data package_endpoint

Changed

  • Renamed from colrev_core to colrev (integrated cli)
  • Switch to poetry for dependency management
  • Renamed scripts to package_endpoints
  • PDF-hash generation based on Docker to avoid platform dependency issues
  • Switch to Jinja templates (instead of concatenating multiple strings)

Fixed

  • Concurrent request session handling
  • StatusStats calculations

Version 0.5.0

28 Jun 09:25

Added

  • Push/pull (including corrections), sync, validate, service operations
  • Data provenance model (colrev_data_provenance, colrev_masterdata_provenance)
  • Extensible endpoints (search, prep, prescreen, pdf-get, pdf-prep, screen, data)
  • Prescreen scope

Changed

  • Improvements: prep, dedupe operations
  • Performance improvements (e.g., status, bibtexparser > pybtex)
  • Extended Record class (e.g., merge and fuse_best_fields)
  • LocalIndex: Elasticsearch to Opensearch
  • Dedupe: testing and parameter optimization (option to prevent same-source merges)
  • Settings.json and validation
  • Updated documentation
  • Testing and refactoring (e.g., for Windows, prefer keyword arguments in functions, python package type information)

Version 0.4.0

06 Apr 20:57

Added

  • Extract functionality: ReviewDataset, Process
  • Developed LocalIndex, EnvironmentManager, OpenSearch
  • Curation model, including Resource installation and a "correction path"
  • Search operation (reintegrating paper_feed and local_paper_index)
  • Prep exclusion based on languages

Changed

  • Object-oriented refactoring of the whole codebase
  • Use Zotero translators (instead of bibutils) for imports
  • Duplicate identification (add FP safeguards based on LocalIndex, add a procedure for small samples)
  • Consistent PDF path handling
  • Structured data extraction based on csv

Fixed

  • Loggers
  • Performance issues in prep and status

Version 0.3.0

06 Feb 09:49

Added

  • Introduced ReviewManager and integrated hooks/checks
  • Fetch metadata from Open Library
  • Required fields for misc
  • Information on needs_manual_preparation (man_prep_hints)
  • Activated mypy hooks
  • Introduced custom load scripts
  • Documentation
  • LocalIndex: hash-table implementation for indexing and retrieval

Changed

  • Dedupe: based on active learning (dedupe-io)
  • Improved batches
  • Pass records instead of BibDatabase
  • PDF prep and longer pdf hashes

Removed

  • CLI: now in separate colrev repository

Fixed

  • Initializing repositories
  • Backward search adds two entries to search_details
  • Logging (reinitialize after batches/commits)

Version 0.2.0

12 Nov 10:15

Added

  • Status model (rev_status, md_status, pdf_status)
  • Implemented command-line interface (CLI)
  • Import formats (bib, ris, endn, pdf, text list of references)
  • Docker services for import, ocr, building the paper etc.
  • Metadata repositories for record preparation (crossref, dblp, semantic scholar)
  • PDF preparation (OCR, metadata validation)
  • Commit message reporting
  • Check and validation of iteration completeness
  • Support for building papers based on pandoc

Changed

  • Integrated review process status (including prescreen, screen inclusion vs exclusion) in the references.bib
  • Renamed scripts and cli entrypoints
  • Refactored code
  • Tracing from hash_id to origin links
  • Extended and refactored pre-commit hooks

Removed

  • R scripts for sample statistics (the goal is to implement them in Python)
  • hash_id function, trace_entry, trace_hash_id

Fixed

  • Bugs in analysis/combine_individual_search_results.py and in analysis/acquire_pdfs.py
  • Catch exceptions and check bad responses in analysis/acquire_pdfs.py
  • Bug in git modification check for references.bib in analysis/utils.py
  • Exception in analysis/screen_2.py (IndexError)
  • Global constant conflict with analysis/entry_hash_function.py (nameparser.config/CONSTANTS)

Version 0.1.0

08 May 08:00

Added

  • First version of the pipeline, including status, reformat_bibliography, trace_entry, trace_hash_id, combine_individual_search_results, cleanse_records, screen_sheet, screen_1, acquire_pdfs, screen_2, data_sheet and data_pages
  • Environment setup including Dockerfile and Makefiles