
These are the column headers for CSV-formatted external data files. The files should be UTF-8 encoded.

Missing values should be supplied as blank cells, not as NA or some other code.

Other columns can also be supplied, but will typically be ignored.
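
The conventions above (UTF-8 encoding, blank cells for missing values, extra columns tolerated) can be illustrated with a short Python sketch. It is not part of the package: `KNOWN_COLUMNS` and `read_rows` are hypothetical names, the column list is a small subset chosen for the example, and the sample data are invented.

```python
import csv
import io

# Hypothetical subset of recognised column names, for illustration only.
KNOWN_COLUMNS = {"country", "station_code", "year", "determinand", "value", "censoring"}


def read_rows(handle):
    """Read CSV rows, mapping blank cells to None and dropping unknown columns."""
    rows = []
    for raw in csv.DictReader(handle):
        row = {}
        for name, cell in raw.items():
            if name not in KNOWN_COLUMNS:
                continue  # extra columns are tolerated but ignored
            cell = (cell or "").strip()
            row[name] = cell if cell != "" else None  # blank cell = missing value
        rows.append(row)
    return rows


# For a real file, open it explicitly as UTF-8:
#   with open(path, encoding="utf-8", newline="") as handle: ...
sample = (
    "country,station_code,year,determinand,value,censoring,my_note\n"
    "Norway,ST01,2022,CD,0.12,,first grab\n"
    "Norway,ST01,2022,HG,0.05,Q,\n"
)
rows = read_rows(io.StringIO(sample))
```

In this sketch a cell containing the text NA would be kept as the string "NA" rather than treated as missing, which is why blank cells are the safer choice.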

Contaminant data

The data file has one row for each measurement.

| column name | type | mandatory | comments |
| --- | --- | --- | --- |
| country | character | yes | identifies the source of the data; for international assessments this is typically the country of origin, but for national assessments it could be a local monitoring authority |
| | | | must match country in station file |
| | | | no missing values |
| station_code | alphanumeric | yes | the station (code) where the sample was collected |
| | | | must match station_code in station file |
| | | | no missing values |
| station_name | alphanumeric | yes | the station (name) where the sample was collected; this is often more intuitive to a user than station_code |
| | | | must match station_name in station file |
| | | | no missing values |
| sample_latitude | numeric (decimal degrees) | | need not match station_latitude in station file |
| sample_longitude | numeric (decimal degrees) | | need not match station_longitude in station file |
| year | integer | yes | monitoring year |
| | | | doesn’t necessarily match date since a sampling season running from e.g. November 2021 to May 2022 might all be considered the 2022 monitoring year |
| | | | no missing values |
| date | date: use ISO 8601 standard e.g. 2023-06-28 | | sampling date |
| depth | numeric (m) | | sediment: assumed to be a surface sediment sample with depth being the lower depth of the grab |
| | | | water: assumed to be a surface water sample with depth being the upper depth of the sample |
| | | | biota: not used, so can supply whatever is useful (or omit) |
| species | character | yes (biota) | Latin name which must match a submitted_species in the species reference table |
| sex | character | | see ICES reference codes for SEXCO |
| | | | required for EROD assessments |
| | | | desirable if sex is used to subdivide timeseries (see subseries) |
| n_individual | integer | | number of pooled individuals in the sample |
| | | | required for imposex assessments |
| subseries | character | | used to split up timeseries by e.g. sex or age |
| | | | for example: juvenile, adult_male, adult_female |
| | | | missing values indicate that all records in a timeseries will be considered together (no subdivision) |
| sample | alphanumeric | yes | links measurements made on the same individuals (biota), in the same sediment grab or in the same water sample |
| | | | no missing values |
| | | | don’t use the same value for samples collected in different years, at different stations or in different species |
| determinand | character | yes | must match values in determinand reference table |
| | | | most will be in ICES reference codes for PARAM but can provide own values |
| | | | no missing values |
| matrix | character | yes | see ICES reference codes for MATRX |
| basis | character | yes (biota & sediment) | W, D or L |
| | | | no missing values for chemical measurements in biota or sediment |
| | | | not mandatory for water where basis is always taken to be W |
| unit | character | yes | see ICES reference codes for MUNIT |
| | | | no missing values |
| value | numeric | yes | no missing values |
| censoring | character | | typically D, Q or < indicating a value less than the limit of detection, less than the limit of quantification, or some other (non-specified) less than |
| | | | a missing value indicates that the measurement is not a less-than (i.e. is uncensored) |
| limit_detection | numeric | | same unit as value |
| limit_quantification | numeric | | same unit as value |
| uncertainty | numeric | | analytical uncertainty in the measurement |
| | | | same unit as value |
| unit_uncertainty | character | | SD, U2 or % |
| | | | if uncertainty is present, unit_uncertainty must also be present |
| method_pretreatment | character | | use ICES reference codes for METPT |
| method_analysis | character | | use ICES reference codes for METOA |
| | | | required for bile metabolite measurements |
| method_extraction | character | | use ICES reference codes for METCX |
| | | | required for sediment normalisation (typically for metals) |
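
A few of the rules in the table can be checked before submitting a file. The following Python sketch is only an illustration under stated assumptions: `check_row` is a hypothetical helper, not a package function; it covers only the unconditional mandatory columns, the ISO 8601 date format and the uncertainty/unit_uncertainty pairing; and the example values (station, matrix and unit codes) are placeholders rather than verified reference codes. Conditional requirements such as species for biota or basis for biota and sediment are not checked.

```python
from datetime import date

# Columns marked "yes" in the table without a compartment condition.
MANDATORY = ("country", "station_code", "station_name", "year", "sample",
             "determinand", "matrix", "unit", "value")
UNCERTAINTY_UNITS = {"SD", "U2", "%"}


def check_row(row):
    """Return a list of problems found in one measurement row (a dict of strings)."""
    problems = []
    for name in MANDATORY:
        if row.get(name) in (None, ""):
            problems.append(f"{name}: missing value")
    if row.get("date"):
        try:
            date.fromisoformat(row["date"])  # accepts e.g. 2023-06-28
        except ValueError:
            problems.append("date: not an ISO 8601 date")
    if row.get("uncertainty") not in (None, ""):
        if row.get("unit_uncertainty") not in UNCERTAINTY_UNITS:
            problems.append("unit_uncertainty: must be SD, U2 or % when uncertainty is given")
    return problems


# Placeholder values, invented for the example.
good = {"country": "Norway", "station_code": "ST01", "station_name": "Station one",
        "year": "2022", "sample": "S1", "determinand": "CD", "matrix": "SED",
        "unit": "mg/kg", "value": "0.12", "date": "2022-03-14"}
bad = dict(good, unit="", date="14/03/2022", uncertainty="0.01")

good_problems = check_row(good)
bad_problems = check_row(bad)
```

Returning a list of messages rather than raising on the first failure lets every problem in a row be reported in one pass.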

Station data

The station file has one row for each station.

| column name | type | mandatory | comments |
| --- | --- | --- | --- |
| OSPAR_region | character | | the regional columns can be called anything (and are optional) |
| | | | for OSPAR assessments, use OSPAR_region and OSPAR_subregion |
| | | | for HELCOM assessments use HELCOM_subbasin, HELCOM_L3 and HELCOM_L4 |
| | | | for other assessments any regional columns must be explicitly identified when calling read_data using the control argument |
| OSPAR_subregion | character | | see above |
| country | character | yes | no missing values |
| station_code | alphanumeric | yes | no missing values |
| station_name | character | yes | no missing values |
| station_longname | character | | typically a more intuitive name for the station than station_name |
| station_latitude | numeric (decimal degrees) | yes | no missing values |
| station_longitude | numeric (decimal degrees) | yes | no missing values |
| station_type | character | | see ICES reference codes for MSTAT |
| | | | typically B (baseline), RH (representative) or IH (impacted) |
| waterbody_type | character | | see ICES reference codes for WLTYP |
| | | | typically a code indicating transitional (estuarine) waters, coastal waters or open sea |
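
Because country, station_code and station_name in the contaminant file must match the station file, it can help to cross-check the two files before importing them. The Python sketch below is one way to do that. It assumes the three columns are matched together as a combination, which the tables do not state explicitly. `unmatched_stations` is a hypothetical helper and the records are invented.

```python
KEYS = ("country", "station_code", "station_name")


def unmatched_stations(measurements, stations):
    """Return the station keys used in the measurements but absent from the station file."""
    known = {tuple(s[k] for k in KEYS) for s in stations}
    found = {tuple(m[k] for k in KEYS) for m in measurements}
    return sorted(found - known)


# Invented example records; only the key columns are shown.
stations = [
    {"country": "Norway", "station_code": "ST01", "station_name": "Station one"},
]
measurements = [
    {"country": "Norway", "station_code": "ST01", "station_name": "Station one"},
    {"country": "Norway", "station_code": "ST02", "station_name": "Station two"},
]
missing = unmatched_stations(measurements, stations)
```

Here `missing` lists the one station (ST02) that appears in the measurements without a row in the station file.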