CaSPAr file naming convention and variables
Please find the French version of this page here.
Each downloaded file follows a common naming convention for convenience.
For deterministic products the naming convention is YYYYMMDDHH.nc (e.g. one RDPS file is 2017100200.nc), where YYYYMMDDHH specifies the issue date and hour of the forecast. The time axis in the netCDF file holds the forecast horizons (in hours).
The naming convention for ensemble products is the same, except that each ensemble member is stored in its own file. The convention is YYYYMMDDHH_EEE.nc (e.g. one member of CaLDAS is named 2017100200_002.nc), where 002 indicates the second ensemble member. Member 000 is special: it is the reference ensemble member. Again, YYYYMMDDHH indicates the issue time of the forecast. If you request an ensemble product in CaSPAr, you will receive all members of the ensemble; single members cannot be requested on their own. (If you really think about it, why would you want an ensemble if you didn't want all of the members?)
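The two file name patterns above can be split into their parts with a small helper. This is only a sketch: the function and class names are made up for illustration and are not part of any CaSPAr tooling.

```python
import re
from datetime import datetime
from typing import NamedTuple, Optional


class CasparFile(NamedTuple):
    """Parts of a CaSPAr file name (hypothetical helper type)."""
    issue_time: datetime
    member: Optional[int]  # None for deterministic products


# YYYYMMDDHH.nc or YYYYMMDDHH_EEE.nc
_FILE_PATTERN = re.compile(r"(\d{10})(?:_(\d{3}))?\.nc")


def parse_caspar_filename(name: str) -> CasparFile:
    """Split a file name into issue time and optional ensemble member."""
    match = _FILE_PATTERN.fullmatch(name)
    if match is None:
        raise ValueError(f"not a CaSPAr-style file name: {name!r}")
    issue_time = datetime.strptime(match.group(1), "%Y%m%d%H")
    member = int(match.group(2)) if match.group(2) is not None else None
    return CasparFile(issue_time, member)


deterministic = parse_caspar_filename("2017100200.nc")
ensemble = parse_caspar_filename("2017100200_002.nc")
```

Here `ensemble.member` is the integer 2, and a member value of 0 would correspond to the reference member 000.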
Each CaSPAr data product can have several variables associated with it. Each of these variables is used in the ECCC modelling process, and some are only available at specific forecast horizons. Most variables are available at all forecast horizons. Some variables are used to initialize the forecast and are only available at the very first time step, usually T=0. In other cases a variable may only be available at specific forecast horizons, such as every 4 or every 6 hours.
A netCDF file stores data in a (lat,lon,time) array where time is the forecast horizon. Unfortunately, the variables in a netCDF file must have consistent dimensions along each axis. If a requested variable is not available at all horizons, a missing value is automatically inserted at the horizons where it is absent. The details are largely irrelevant for our purposes here, but the netCDF API replaces and compresses the missing values efficiently, so the difference in file size is negligible.
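As an illustration of what the inserted missing values mean for a user, the sketch below lists the forecast horizons at which a variable actually carries data for one grid cell. It uses plain Python lists so it runs without a netCDF library. The fill value -9999.0 is an assumption made for the example: real files declare their own missing value (typically through the `_FillValue` attribute), and netCDF readers often hand missing entries back as masked values instead.

```python
import math


def available_horizons(horizons, series, fill_value=None):
    """Return the forecast horizons at which `series` holds real data.

    `horizons` and `series` are equally long sequences for one grid cell.
    Entries equal to `fill_value`, NaN, or None count as missing.
    """
    def is_missing(value):
        if value is None:
            return True
        if isinstance(value, float) and math.isnan(value):
            return True
        return fill_value is not None and value == fill_value

    return [h for h, value in zip(horizons, series) if not is_missing(value)]


FILL = -9999.0  # assumed fill value, for illustration only
hours = list(range(9))
# A variable that is only available every 4 hours.
values = [1.2, FILL, FILL, FILL, 0.8, FILL, FILL, FILL, 0.5]
every_four = available_horizons(hours, values, fill_value=FILL)
```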
A simple variable naming convention is applied, i.e. PPPP_T_VV_LLLLL, where:
- PPPP - product name. If you need the product name, use the global netCDF attribute called product rather than extracting it from the variable name, since the length of PPPP varies and it may include underscores.
- T - type of product. This is P for prediction (forecast) and A for analysis.
- VV - an internal variable name used by ECCC that CaSPAr has kept for consistency. In the netCDF file the long_name attribute is also available as a further description of the variable.
- LLLLL - product level indicator.
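Because the product part may itself contain underscores, a variable name is easiest to split from the right. The sketch below assumes that the type, ECCC name, and level parts never contain an underscore, which the text above does not guarantee. The variable names used are illustrative only, and `MY_PRODUCT` is a made-up product name.

```python
from typing import NamedTuple


class VariableName(NamedTuple):
    """Parts of a PPPP_T_VV_LLLLL variable name (hypothetical helper type)."""
    product: str
    kind: str       # "A" for analysis, "P" for prediction
    eccc_name: str
    level: str


def parse_variable_name(name: str) -> VariableName:
    """Split a variable name from the right into its four parts."""
    parts = name.rsplit("_", 3)
    if len(parts) != 4 or not all(parts):
        raise ValueError(f"expected PPPP_T_VV_LLLLL, got {name!r}")
    product, kind, eccc_name, level = parts
    if kind not in ("A", "P"):
        raise ValueError(f"unknown product type {kind!r} in {name!r}")
    return VariableName(product, kind, eccc_name, level)


simple = parse_variable_name("RDPS_P_TT_09950")
with_underscore = parse_variable_name("MY_PRODUCT_A_PR_SFC")
```

The second call yields the product `MY_PRODUCT`, which a left-to-right split would have cut in two. When the file is at hand, the global product attribute remains the more reliable source for the product name.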
The product level indicator LLLLL can have several meanings. For atmospheric variables it is the percentage of the atmosphere based on pressure elevation: divide the number by 100 to get the percentage. 10000 is the bottom of the atmosphere, 0 is the top of the atmosphere, 09950 is 99.5%, etc. As the name CaSPAr implies, the focus is on surface predictions, so only atmospheric variables near the surface are archived. The level 0 can also indicate some surface variables; in CaSPAr this has been replaced with SFC for convenience. Other variables have integer numbers for levels, which are again replaced for users' convenience. For soil information the level is usually 10cm, meaning a soil layer from 0-10cm depth, or Profile, which is 0-[2]m or the full depth of the modelled soil profile. Note that Profile overlaps with the 0-10cm data. In other cases the integers represent types of land cover. For example, in RDPS 1=Vegetated Land, 2=Glaciers, 3=Open Water. Again, the integers have been replaced in the variable names.
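The divide-by-100 rule for atmospheric levels can be written down directly. The sketch below assumes that any level made up only of digits is an atmospheric level, and that every other level (SFC, 10cm, Profile, land cover names) is a text label; the function name is hypothetical.

```python
from typing import Optional


def level_to_percent(level: str) -> Optional[float]:
    """Return the atmospheric level as a percentage, or None for text labels.

    Assumes digit-only levels are atmospheric (10000 = bottom of the
    atmosphere, 0 = top); labels such as "SFC" or "10cm" return None.
    """
    if level.isdigit():
        return int(level) / 100.0
    return None


near_surface = level_to_percent("09950")   # 99.5
bottom = level_to_percent("10000")         # 100.0
label = level_to_percent("SFC")            # None
```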
Variable names follow the convention <Product>_<Type:A=Analysis,P=Prediction>_<ECCC name>_<Level/Tile/Category>. Variables with level 10000 are at surface level. The height [m] of variables with level 0XXXX needs to be inferred from the corresponding geopotential height fields (GZ_0XXXX - GZ_10000).
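The height inference is a simple difference of two geopotential height values at the same grid cell and time. In the sketch below the function name and the `metres_per_unit` parameter are hypothetical. The default assumes both fields are already in metres; check the units attribute of the GZ variables in your file, and if they turn out to be stored in decametres, pass `metres_per_unit=10.0`.

```python
def height_above_surface(gz_level, gz_surface, metres_per_unit=1.0):
    """Height [m] of a level above the surface: (GZ_0XXXX - GZ_10000).

    `gz_level` and `gz_surface` are geopotential heights for the same grid
    cell and forecast horizon, both in the same unit. `metres_per_unit`
    converts that unit to metres (assumed to be 1.0, i.e. already metres).
    """
    return (gz_level - gz_surface) * metres_per_unit


height_m = height_above_surface(105.0, 100.0)
height_from_dam = height_above_surface(105.0, 100.0, metres_per_unit=10.0)
```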
In a weather prediction model, wind is treated as a two-component vector. People are used to hearing wind reported as a direction on a compass rose (0-360 degrees) and a speed. Output from the NWP model is instead given as the u-component, which runs parallel to the x-axis, and the v-component, which runs parallel to the y-axis. In both cases the vectors and angles are measured from meridians of longitude. Just like in map projections, a model grid must be 'wrapped' around a geoid Earth, which leads to distortion. It is therefore common for the lat-lon grid of a model to be rotated and warped to minimize the distortion over the area of interest. The WRF model uses this approach.
In CaSPAr, the UU and VV components are the raw output from the model, which follows the CaSPAr philosophy of not modifying data provided to users. This gives the user ultimate control over the transformations that are applied. However, for the u and v wind components CaSPAr also provides corrected values based on an unrotated grid. These are the variables UUC and VVC, which should be selected unless you wish to do the correction yourself. CaSPAr also converts the u- and v-components into wind speed (UVC) and wind direction (WDC).
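For readers who want to derive speed and direction themselves, the sketch below converts one pair of u and v values. It assumes the usual meteorological convention, in which direction is the compass bearing the wind blows from, measured clockwise from north; the page does not state how WDC is defined, so treat that as an assumption. The inputs should be components on an unrotated grid (such as UUC and VVC), and the speed comes out in the same unit as the inputs.

```python
import math


def wind_speed_direction(u, v):
    """Convert u (eastward) and v (northward) components to speed and direction.

    Direction is in degrees, 0-360, clockwise from north, and gives where
    the wind blows FROM (assumed meteorological convention).
    """
    speed = math.hypot(u, v)
    if speed == 0.0:
        # Direction is undefined for calm conditions; 0.0 is a placeholder.
        return 0.0, 0.0
    direction = math.degrees(math.atan2(-u, -v)) % 360.0
    return speed, direction


# Wind blowing towards the south, i.e. coming from the north.
speed, direction = wind_speed_direction(0.0, -10.0)
```

With this convention a wind with u = -10 and v = 0 blows towards the west and is therefore reported as coming from 90 degrees (east).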
Some variables are labelled as surface level in their names (i.e., showing SFC; ECCC IP1=12000) even though they are in fact at a height above the surface. Those variables are:
- variables related to wind (UU, VV, UV, WD, UUC, VVC, UVC, WDC) are at 10m rather than SFC
- temperature (TT) is at 1.5m rather than SFC
- dew point (TD) is at 1.5m rather than SFC
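When processing many variables it can help to keep these exceptions in a small lookup table. The sketch below encodes only the heights listed above; the names are hypothetical, and returning 0.0 for every other SFC variable is an assumption that such variables really are at the surface.

```python
# Actual heights [m] of variables that are labelled SFC in their names.
SFC_LABEL_HEIGHT_M = {
    name: 10.0
    for name in ("UU", "VV", "UV", "WD", "UUC", "VVC", "UVC", "WDC")
}
SFC_LABEL_HEIGHT_M.update({"TT": 1.5, "TD": 1.5})


def sfc_variable_height_m(eccc_name):
    """Height [m] of an SFC-labelled variable, given its ECCC name.

    Variables not listed as exceptions are assumed to be at 0.0 m.
    """
    return SFC_LABEL_HEIGHT_M.get(eccc_name, 0.0)


wind_height = sfc_variable_height_m("UUC")
temperature_height = sfc_variable_height_m("TT")
```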
© 2017-2023 - Canadian Surface Prediction Archive - caspar.data@uwaterloo.ca
Funded under the Floodnet program.