Evaluation Declaration Example 4

James Brown edited this page Jul 19, 2024 · 1 revision
# NOTE: The observed and predicted data, below, are available in the repository in
# systests/smalldata. Either download or clone the repository and modify the source paths
# accordingly.

# An example of an evaluation of single-valued, operational flow forecasts that applies a
# time-series metric, such as the time to peak error. Such evaluations differ from most
# other evaluations, which yield results per lead time or other temporal pooling window.
# In this case, each forecast time series is treated as a whole: the peak is identified,
# the time to peak is computed, and that value is compared against the equivalent
# computation based on observations.
label: Example 4

# Observations are provided in a PI-timeseries file as hourly streamflow. Though the 
# path below is relative, for most executions of WRES, it will need to be an absolute
# path starting from the directory /mnt/wres_share/... on the WRES Deployment
# Platform. 
observed:
  sources: 
    - smalldata/25510317T00_FAKE3_observations.xml
  variable: DISCHARGE

# The forecasts are provided in multiple PI-timeseries XML files, each specified
# separately.
predicted:
  sources:
    - smalldata/25510317T12_FAKE3_forecast.xml
    - smalldata/25510318T00_FAKE3_forecast.xml
    - smalldata/25510318T12_FAKE3_forecast.xml
    - smalldata/25510319T00_FAKE3_forecast.xml
  variable: STREAMFLOW

# The measurement unit.
unit: CMS

# Thresholds restrict the data for which timing errors will be computed. Here,
# the thresholds select those portions of the paired time series whose
# "observed" and "predicted" values are both greater than 183 CMS and,
# separately, both greater than 184 CMS. Thus, three sets of results should be
# expected: one for all data, which is the default threshold, and one for each
# of the declared thresholds of 183 CMS and 184 CMS. The measurement unit is
# defined in the "unit" declaration, above.
thresholds:
  values: [183.0, 184.0]
  apply_to: observed and predicted
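# Illustration with hypothetical pairs of (observed, predicted) values in CMS,
# which are not taken from the data above. It assumes, as described in the
# comment above, that a pair is selected only when both values exceed the
# threshold:
#   (182.5, 185.0): not selected at 183 or at 184, as the observed value is too low
#   (183.5, 183.2): selected at 183 only
#   (190.0, 186.0): selected at 183 and at 184
# All three pairs contribute to the results for "all data".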

metrics:
  # A time-series metric operates on a time-ordered list of pairs, i.e., a
  # paired time series. The "time to peak error" is one example of a
  # time-series metric. It measures the duration, in decimal hours, between
  # the time of the "observed" peak value and the time of the "predicted" peak
  # value. A negative value indicates that the predicted peak occurs too early,
  # i.e., before the observed peak. There is one time to peak error for each
  # paired time series. Additionally, summary statistics may be computed from
  # the collection of errors. For example, if there are ten paired time series,
  # there will be ten time to peak errors, and these errors may be summarized
  # with a mean, a median or another statistic. The required summary statistics
  # are declared as "summary_statistics".
  - name: time to peak error
    summary_statistics:
      - median
      - minimum
      - maximum
      - mean absolute
      - mean
      - standard deviation
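  # Illustration with hypothetical times, which are not taken from the data
  # above. If the observed peak of a paired time series falls at 12:00Z and the
  # predicted peak falls at 09:00Z on the same day, the time to peak error is
  # 09:00 - 12:00 = -3.0 hours, i.e., the predicted peak is three hours early.
  # If three paired time series produced errors of -3.0, 0.0 and 6.0 hours, the
  # "mean" would be 1.0 hours, the "median" 0.0 hours and the "mean absolute"
  # 3.0 hours.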

  # This is another example of a time-series metric. In this case, each
  # timing error is expressed as a fraction of the duration between
  # the start of the paired time series and the time of the observed
  # peak. Thus, a timing error of a given size carries more weight when the
  # observed peak occurs at an early forecast lead time than when it occurs
  # at a late forecast lead time.
  - name: time to peak relative error
    summary_statistics:
      - median
      - minimum
      - maximum
      - mean
      - mean absolute
      - standard deviation
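  # Illustration with hypothetical times, which are not taken from the data
  # above. Suppose a paired time series starts at 00:00Z, the observed peak
  # falls at 12:00Z and the predicted peak falls at 09:00Z on the same day. The
  # time to peak error is -3.0 hours and the duration from the start of the
  # series to the observed peak is 12.0 hours, so the time to peak relative
  # error is -3.0 / 12.0 = -0.25. The same -3.0 hour error with an observed
  # peak 24.0 hours after the start would give -3.0 / 24.0 = -0.125.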

# The decimal format to use when writing numeric outputs.
decimal_format: '#0.000000'

# The output formats to write.
output_formats:
  - csv
  - pairs
  - png