NCTPtools

Tools to assist in the management of data for the Nebraska Crop Testing Program (NCTP).

Configure

Get the variety testing API token, then open config.json in the root of this project and set the api_token field to that token.

Optionally change any of the domains if you need to. Note that the develop and staging domains don't work due to CAS.
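
Before running the CLI it can help to confirm that the token is actually set. The following is a minimal sketch, assuming config.json is a flat JSON object with a top-level api_token field (the field named above). load_api_token is a hypothetical helper for illustration and is not part of this project.

```python
import json
from pathlib import Path


def load_api_token(config_path="config.json"):
    """Return the api_token from config.json, or raise if it is missing or blank."""
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    # Assumption: api_token is a top-level string field in the config file.
    token = str(config.get("api_token") or "").strip()
    if not token:
        raise ValueError(f"api_token is not set in {config_path}")
    return token
```

Calling load_api_token() from the project root raises a ValueError when the field is empty, which is easier to diagnose than a failed request later on.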

Usage

1. Select your global options (-e {env} and -l)

python cli.py <global options> [ payloads | upload ]

Global Options Help

$ python cli.py -h
usage: cli.py [-h] -e {local,valet,prod,develop,staging} [-l] {payloads,upload} ...

options:
  -h, --help            show this help message and exit
  -e {local,valet,prod,develop,staging}, --env {local,valet,prod,develop,staging}
                        Prepare for or upload to the specified environment
  -l, --loud            Log everything about outgoing requests

actions:
  {payloads,upload}     Actions
    payloads            Generate request payloads
    upload              Upload payloads to a variety testing environment

2. Generate your request payloads

Payloads Subcommand

Generates a directory of JSON objects that are used as the POST/PUT request bodies of your upload.

$ python cli.py -e prod -l payloads -h
usage: cli.py payloads [-h] -i INPATH -o OUTPATH

options:
  -h, --help            show this help message and exit
  -i INPATH, --inpath INPATH
                        Path to trial csv data provided by PREC
  -o OUTPATH, --outpath OUTPATH
                        Location to dump the payloads

3. Upload your payloads

Uploads the payloads generated from the Payloads action.

Upload Subcommand

$ python cli.py -e local -l upload -h
usage: cli.py upload [-h] -i INPATH [-y YEAR] [-r] {all,sites,results,varieties} ...

positional arguments:
  {all,sites,results,varieties}

options:
  -h, --help            show this help message and exit
  -i INPATH, --inpath INPATH
                        Path to json payloads generated by the payloads command
  -y YEAR, --year YEAR  The trial harvest year of this data
  -r, --rewrite         Rewrite i.e. delete existing and write again

The rewrite option is dangerous: it deletes the existing records before writing them again. Avoid it unless you are developing this tool.

If you have to make corrections after an upload

If something was already uploaded and you want to make a correction, verify that the corrected payload has an id and re-run the upload action.

If a request payload has an id, the upload request is a PUT rather than a POST, so it updates the existing record instead of creating a new one.
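
The rule above can be sketched in a few lines. This is an illustration only: request_method is a hypothetical name, and treating a null id the same as a missing id is an assumption rather than confirmed behavior of the tool.

```python
def request_method(payload):
    """Pick the HTTP method for a payload: PUT if it has an id, else POST."""
    # Assumption: an id of None counts as "no id" and results in a POST.
    return "PUT" if payload.get("id") is not None else "POST"
```

For example, request_method({"id": 123, "name": "Amplify SF"}) gives "PUT", while the same payload without the id gives "POST".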

Updating Variety Characteristics from Previous Years Rather Than Creating New Ones

Adding the previous year's id to the characteristic payloads will update those older records rather than creating new ones and will ensure that all newly added performance results will reference the existing records.

Simply find a variety and add an "id" field with the id:

{
  "id": 123, <~added manually
  "published_at": "2023-03-01",
  "name": "Amplify SF",
  "brand": "PlainsGold",
  "origin": "CSU/CWRF",
  "family": "",
  "alternate_name": "",
  "ne_variety": "0",
  "maturity": "3",
  ...
}
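
If many varieties carry over from a previous year, adding ids by hand gets tedious. Below is a rough sketch of doing it in bulk. It assumes you already have a mapping from variety name to existing id (how you obtain it is up to you) and that matching on the exact name is good enough. attach_previous_ids is a hypothetical helper, and it works on already-loaded dictionaries because the layout of the payload files on disk is not described here.

```python
def attach_previous_ids(payloads, previous_ids):
    """Return copies of the variety payloads with known ids added."""
    updated = []
    for payload in payloads:
        payload = dict(payload)  # copy so the caller's dict is left untouched
        known_id = previous_ids.get(payload.get("name"))
        # Only add an id when one is known and the payload does not have one yet.
        if known_id is not None and "id" not in payload:
            payload = {"id": known_id, **payload}
        updated.append(payload)
    return updated
```

Payloads whose name is not in the mapping are returned unchanged, so they would still be uploaded as new records.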

Logging / Manifest

  • Manifest: Generated every time you run the upload action. The manifest is located in the root of the payloads directory and lists data that failed to upload. After uploading, check through it, make any corrections, and then rerun the upload.
  • Capturing Logs: If you need logs, redirect the command's output to a log file by appending > log.txt to the command. For example: python cli.py [...] upload > log.txt && less log.txt

Notes on Input File Structure and Content

Directory Structure

Example structure of the inpath directory:

2022
├── characteristics.csv
├── intensive_management
│   └── 31033.csv
├── irrigated
├── rainfed
│   ├── 31007.csv
│   ├── 31033.csv
│   ├── 31095.csv
│   ├── 31101.csv
│   ├── 31105.csv
│   ├── 31109.csv
│   ├── 31111.csv
│   ├── 31135.csv
│   └── 31155.csv
└── trial_sites.csv
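
To check a year's folder against this layout before generating payloads, something like the sketch below may help. The three site type folder names are taken from the example tree above; whether the tool accepts other folder names is not known, so treat SITE_TYPES as an assumption. list_result_files is a hypothetical helper.

```python
from pathlib import Path

# Assumption: these are the only site type folders, as in the example tree.
SITE_TYPES = ("intensive_management", "irrigated", "rainfed")


def list_result_files(inpath):
    """Map each site type to the sorted FIPS codes that have a results CSV."""
    root = Path(inpath)
    found = {}
    for site_type in SITE_TYPES:
        folder = root / site_type
        if folder.is_dir():
            # The file stem is the FIPS code, e.g. rainfed/31007.csv -> "31007".
            found[site_type] = sorted(p.stem for p in folder.glob("*.csv"))
        else:
            found[site_type] = []
    return found
```

An empty folder, like irrigated in the example, simply maps to an empty list.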

File Content

Characteristics file header:

characteristics.csv

- published_at (if null will automatically be set to today's date)
- name
- brand
- origin
- family
- alternate_name
- ne_variety
- maturity
- winter_hardiness
- straw_strength
- plant_height
- coleoptile
- hessian_fly
- leaf_rust
- stem_rust
- stripe_rust
- tan_spot
- soil_borne_mosaic
- wheat_streak_mosaic
- fusarium_head_blight
- wheat_stem_sawfly
- target_ne_environment
- characteristic_notes

Sites file header:

trial_sites.csv

- published_at (if null will automatically be set to today's date)
- collaborator
- planting_date
- harvest_date
- fertilization
- herbicide
- fungicide
- soil_type
- tilled
- irrigated
- irrigation
- crop_rotations
- latitude
- longitude
- trial_notes
- fips
- status_trial_succeeded
- status_notes
- organic
- intensive_management

Results files header:

{site_type}/{fips}.csv

- grain_yield
- bushel_weight
- plant_height
- protein
- kernel_weight
- name (must exactly match the name in the characteristics file)
- variety_name (will be ignored)
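
Because the name column in each results file has to match the characteristics file exactly, a quick pre-flight check can catch typos before payloads are generated. This is a sketch under stated assumptions: every CSV has a header row using the column names listed above, files are UTF-8, and matching is an exact string comparison (case and whitespace included). read_names and unmatched_names are hypothetical helpers, not part of the tool.

```python
import csv
from pathlib import Path


def read_names(csv_path):
    """Return the values of the name column from a CSV with a header row."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return [row["name"] for row in csv.DictReader(handle)]


def unmatched_names(inpath):
    """Map each results file to the names missing from characteristics.csv."""
    root = Path(inpath)
    known = set(read_names(root / "characteristics.csv"))
    problems = {}
    # Results files live one level down: {site_type}/{fips}.csv
    for results_file in sorted(root.glob("*/*.csv")):
        missing = sorted({n for n in read_names(results_file) if n not in known})
        if missing:
            problems[results_file.relative_to(root).as_posix()] = missing
    return problems
```

An empty dictionary from unmatched_names("2022") means every name in the results files was found in characteristics.csv.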