Standard (ASCII) format for station data
Weather station data are most often stored as text/csv files rather than NetCDF. The following describes the standard format for observational datasets used by loadeR, which is the same one defined within the COST Action VALUE. The VALUE ECA&D dataset, which contains weather data from 86 stations spread over Europe, serves as the example and is available for download:
# Download the VALUE ECA&D example dataset to a temporary zip file
value <- tempfile(fileext = ".zip")
download.file("http://www.value-cost.eu/sites/default/files/VALUE_ECA_86_v2.zip",
              destfile = value)
# Data inventory
di <- dataInventory(dataset = value)
NOTE: To see examples of available station data go to section 3.2. Accessing and loading station data.
To explore the data format in detail, the file "VALUE_ECA_86_v2.zip" is decompressed next:
valuefiles <- tempdir()
unzip(value, exdir = valuefiles)
Station data and metadata are stored as a collection of csv files strictly following this structure:
The stations.txt file contains the information about the weather stations. The columns station_id, longitude and latitude are the minimum information required to define a station dataset, so they are mandatory, and their names must match exactly the ones shown in this example. The remaining columns (such as altitude and source in this case) are examples of optional metadata that can also be included in this file. A dataset can carry as many metadata columns as needed.
head(read.table(paste0(valuefiles, "/VALUE_ECA_86_v2/stations.txt"), sep = ",", header = TRUE))
## station_id, name, longitude, latitude, altitude, source
## 1 000012, GRAZ, 15.450000, 47.083100, 366, ECA&D
## 2 000013, INNSBRUCK, 11.400000, 47.266700, 577, ECA&D
## 3 000014, SALZBURG, 13.000000, 47.800000, 437, ECA&D
## 4 000015, SONNBLICK, 12.950000, 47.050000, 3106, ECA&D
## 5 000016, WIEN, 16.350000, 48.233100, 198, ECA&D
## 6 000017, UCCLE, 4.366400, 50.800000, 100, ECA&D
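The mandatory-column rule can be checked before loading a dataset. The following is a minimal sketch in base R, run on a small stand-in file built from the first two rows shown above. The helper check_stations is hypothetical and not part of loadeR. Reading station_id as character is an assumption made here so that leading zeros in the identifiers are kept.

```r
# Stand-in for stations.txt, using the first two rows of the example above
stations_file <- tempfile(fileext = ".txt")
writeLines(c(
  "station_id,name,longitude,latitude,altitude,source",
  "000012,GRAZ,15.450000,47.083100,366,ECA&D",
  "000013,INNSBRUCK,11.400000,47.266700,577,ECA&D"
), stations_file)

# Hypothetical helper (not a loadeR function): stops if a mandatory
# column is absent, otherwise returns the station table
check_stations <- function(path) {
  header <- names(read.csv(path, nrows = 1, strip.white = TRUE,
                           check.names = FALSE))
  required <- c("station_id", "longitude", "latitude")
  missing_cols <- setdiff(required, header)
  if (length(missing_cols) > 0) {
    stop("stations.txt lacks mandatory column(s): ",
         paste(missing_cols, collapse = ", "))
  }
  # station_id is read as text so that codes such as 000012 keep their zeros
  read.csv(path, strip.white = TRUE, stringsAsFactors = FALSE,
           colClasses = c(station_id = "character"))
}

stations <- check_stations(stations_file)
```

Column names are compared exactly, so a header such as Latitude or lat would be reported as missing.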
The variables.txt file contains the information about the variables in the dataset: their identifier (variable_id), variable name (longname), units of measure (unit), the code used to identify missing data (missing_code), and other information that can optionally be included (e.g. type).
NOTE: The use of special characters in 'variable_id' codes is discouraged.
head(read.table(paste0(valuefiles, "/VALUE_ECA_86_v2/variables.txt"), sep = ",", header = TRUE))
## variable_id, longname, unit, missing_code, type
## 1 precip, Total_precipitation_accumulated_in_24_hours, mm, NaN, observation
## 2 tmean, Daily_mean_temperature, degC, NaN, observation
## 3 tmin, Daily_minimum_temperature, degC, NaN, observation
## 4 tmax, Daily_maximum_temperature, degC, NaN, observation
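The variable metadata can be queried with a few lines of base R, for instance to find the missing-data code of one variable. This is only a sketch on a two-row stand-in file. The helper get_missing_code is hypothetical and not part of loadeR, and treating anything other than letters, digits and underscores as a "special character" is an assumption made here for illustration.

```r
# Stand-in for variables.txt with two of the rows shown above
vars_file <- tempfile(fileext = ".txt")
writeLines(c(
  "variable_id,longname,unit,missing_code,type",
  "precip,Total_precipitation_accumulated_in_24_hours,mm,NaN,observation",
  "tmin,Daily_minimum_temperature,degC,NaN,observation"
), vars_file)

# Read every column as text so that the missing code stays a literal string
vars <- read.csv(vars_file, strip.white = TRUE, colClasses = "character")

# Hypothetical helper: returns the missing-data code of one variable
get_missing_code <- function(vars, id) {
  row <- vars[vars$variable_id == id, ]
  if (nrow(row) != 1) stop("variable_id not found or duplicated: ", id)
  row$missing_code
}

tmin_code <- get_missing_code(vars, "tmin")

# Identifiers with characters outside letters, digits and underscores
# (assumed definition of "special characters")
bad_ids <- vars$variable_id[!grepl("^[A-Za-z0-9_]+$", vars$variable_id)]
```

An empty bad_ids vector means that all identifiers pass this (assumed) check.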
Each variable is stored in a separate text file named after its variable_id field in the variables.txt file (e.g. tmin.txt). The first column of the file contains the observation dates, in the format YYYYMMDD. For sub-daily data, which are less common in downscaling applications, time records can be given in the format YYYYMMDDHH. The remaining columns (2 to n) contain the observed series at each station, in the order defined in the first line of the file. This is a (truncated) example file for the minimum daily temperature data of this dataset:
head(read.table(paste0(valuefiles, "/VALUE_ECA_86_v2/tmin.txt"), sep = ",", header = TRUE))
## YYYYMMDD, X000012, X000013, X000014, X000015, X000016, X000017, X000021,
## 1 19610101, -4.7, -4.8, -5.2, -13.7, -1.7, 1.2, -1.4,
## 2 19610102, -1.2, -2, -1.2, -13.2, -0.3, 2, 0.3,
## 3 19610103, -4.3, -2.7, -3.2, -13.2, -1.5, 2, -0.6,
## 4 19610104, 0.8, -2.5, -7.5, -14.6, 0.2, 2.6, 5.5,
## 5 19610105, 0.8, -5, -6.1, -17.4, 0.5, 1.1, 3.7,
## 6 19610106, -4.4, -7, -6.9, -18.4, -1.4, 2, 1.2,
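A variable file with this layout can be parsed in base R roughly as follows. This is a minimal sketch on a small stand-in file, not the reader used by loadeR. The values are copied from the example above except for one NaN entry, which is made up here to show the missing code being converted to NA. The UTC time zone in the sub-daily example is also an assumption.

```r
# Stand-in for tmin.txt: first column is the date, the rest are stations
tmin_file <- tempfile(fileext = ".txt")
writeLines(c(
  "YYYYMMDD,000012,000013,000014",
  "19610101,-4.7,-4.8,-5.2",
  "19610102,-1.2,NaN,-1.2",   # NaN inserted for illustration only
  "19610103,-4.3,-2.7,-3.2"
), tmin_file)

# check.names = FALSE keeps the station ids as written in the header;
# na.strings maps the missing code from variables.txt to NA
tmin <- read.csv(tmin_file, strip.white = TRUE, check.names = FALSE,
                 na.strings = "NaN",
                 colClasses = c(YYYYMMDD = "character"))

dates <- as.Date(tmin$YYYYMMDD, format = "%Y%m%d")  # daily records
station_ids <- names(tmin)[-1]                      # header order
values <- as.matrix(tmin[, -1])                     # dates x stations

# Sub-daily records (YYYYMMDDHH) can be parsed the same way
subdaily <- as.POSIXct("1961010106", format = "%Y%m%d%H", tz = "UTC")
```

With the default check.names = TRUE, R would prefix the numeric station codes with an X (X000012, and so on), as in the output shown above, and they would no longer match the station_id values in stations.txt.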
print(sessionInfo())
## R version 3.2.3 (2015-12-10)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 14.04.3 LTS
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=es_ES.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=es_ES.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=es_ES.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=es_ES.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## loaded via a namespace (and not attached):
## [1] magrittr_1.5 formatR_1.2 tools_3.2.3 htmltools_0.2.6
## [5] yaml_2.1.13 stringi_0.4-1 rmarkdown_0.6.1 knitr_1.10.5
## [9] stringr_1.0.0 digest_0.6.8 evaluate_0.7