Add stations to contaminant data from an ICES extraction
Source: R/import_functions.R
Adds the station name and station code to the contaminant data from an ICES extraction. This is done by matching the station names submitted with the data to the station dictionary, by matching the sample coordinates to the station dictionary, or by a combination of both.
Arguments
- data
A data frame with the contaminant data from an ICES extraction
- stations
A data frame with the ICES station dictionary
- info
A HARSAT information list which must contain the elements purpose, compartment, and add_stations. The latter is a list of control parameters, supplied through control_default or control_modify, which control how the station matching is achieved. See Details.
Value
A data frame containing the contaminant data augmented by variables containing the station code and the station name
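As a rough illustration of the name-based route, the following base R sketch attaches a station code and station name to toy data using match(). The column names submitted_station, station_code and station_name, and all values, are assumptions made for this example; the real matching also applies the controls described under Details.

```r
# Toy contaminant data; column names and values are invented for illustration
data <- data.frame(
  submitted_station = c("Alpha", "Beta", "Gamma"),
  value = c(1.2, 0.7, 3.4),
  stringsAsFactors = FALSE
)

# Toy station dictionary; column names and values are invented for illustration
stations <- data.frame(
  station_code = c("101", "102"),
  station_name = c("Alpha", "Beta"),
  stringsAsFactors = FALSE
)

# Position of each submitted name in the dictionary (NA when there is no match)
idx <- match(data$submitted_station, stations$station_name)

# Augment the data; every row is kept, unmatched rows get NA
data$station_code <- stations$station_code[idx]
data$station_name <- stations$station_name[idx]
```

In this sketch unmatched measurements are kept with missing station information; whether the real function keeps or drops such rows is not stated here.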
Details
info$add_stations is a list of control parameters that modify the
station matching process. The default values depend on info$purpose and
are given at the end of this section. The elements of info$add_stations
are:
- method: a string specifying whether the stations are matched by "name", "coordinates", or "both". If info$purpose is "custom", method is restricted to either "name" (the default) or "coordinates". If info$purpose is "OSPAR", "HELCOM" or "AMAP", then method is set to "both" by default and stations are matched by name or coordinates according to rules specified by OSPAR, HELCOM or AMAP data providers. Currently, stations are matched by name for Denmark, France (biota and water - all years; sediment 2009 onwards), Germany (biota HELCOM - all years; biota OSPAR, biota AMAP, sediment, water 2023 onwards), Ireland, Norway, Portugal, Spain (2005 onwards), Sweden, The Netherlands (2007 onwards) and the United Kingdom. All other stations are matched by coordinates.
- area: a vector of strings containing one or more of "OSPAR", "HELCOM" and "AMAP"; this restricts the stations to those in the corresponding convention area(s); NULL matches to all stations in the station dictionary.
- datatype: a logical specifying whether the stations should be restricted to those with an appropriate datatype. If TRUE, a contaminant measurement in biota (for example) will only be matched to stations with station_datatype containing the string "CF". Similarly, a biological effect measurement in biota will only be matched to stations with station_datatype containing the string "EF".
- temporal: a logical with TRUE indicating that stations should be restricted to those with station_purpm containing the string "T".
- governance_type: a string, one of "none", "data", "stations" or "both", which controls how data and station governance are used in station matching. Data governance information is found in data$is_amap_monitoring, data$is_helcom_monitoring and data$is_ospar_monitoring, which are based on the monitoring programme (MPROG) information provided in submissions to the ICES database. Station governance information is found in stations$station_programgovernance, which is provided by data submitters to the ICES station dictionary. governance_type is used in conjunction with governance_id (see below) as follows:
  - "none" means that data and station governance are both ignored.
  - "data" means that matching will be restricted by data governance but not by station governance. For example, if governance_id == "HELCOM", then data will only be matched to a station if is_helcom_monitoring == TRUE (with all stations considered regardless of station governance). If governance_id takes multiple values, e.g. c("OSPAR", "AMAP"), then data will only be matched to a station if either is_ospar_monitoring == TRUE or is_amap_monitoring == TRUE.
  - "stations" means that matching will be restricted by station governance but not by data governance. For example, if governance_id == "HELCOM", then the stations will be restricted to those where station_programgovernance contains "HELCOM" (with all data considered regardless of data governance). If governance_id takes multiple values, e.g. c("OSPAR", "AMAP"), then the stations will be restricted to those where station_programgovernance contains either "OSPAR" or "AMAP".
  - "both" uses both data and station governance. For example, if governance_id == "HELCOM", then data will only be matched if is_helcom_monitoring == TRUE and the candidate stations will be restricted to those where station_programgovernance contains "HELCOM". If governance_id contains multiple values, e.g. c("OSPAR", "AMAP"), then data with is_ospar_monitoring == TRUE and is_amap_monitoring == FALSE are matched to stations where station_programgovernance contains "OSPAR"; data with is_ospar_monitoring == FALSE and is_amap_monitoring == TRUE are matched to stations where station_programgovernance contains "AMAP"; and data with is_ospar_monitoring == TRUE and is_amap_monitoring == TRUE are matched to stations where station_programgovernance contains either "OSPAR" or "AMAP".
- governance_id: used in conjunction with governance_type. If governance_type == "none", then governance_id should be set to NULL. Otherwise, governance_id must be a vector of strings containing one or more of "AMAP", "HELCOM" and "OSPAR".
- grouping: a logical with TRUE indicating that stations will be grouped into meta-stations as specified by stations$station_asmtmimeparent, which is provided by data submitters to the ICES station dictionary. Defaults to FALSE except when info$purpose == "OSPAR".
- check_coordinates: a logical with TRUE indicating that, when stations are matched by name, the sample coordinates must also be within the station geometry. Not implemented yet, so defaults to FALSE.
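The governance rules can be easier to follow as code. The base R sketch below is one possible reading of them, not the package's implementation: candidate_stations is a hypothetical helper, the station values and the "~" separator are invented, and only the column names station_programgovernance and is_*_monitoring come from the text above.

```r
# Toy station dictionary. The governance column name follows the text above;
# the values and the "~" separator are invented for illustration.
stations <- data.frame(
  station_code = c("S1", "S2", "S3"),
  station_programgovernance = c("OSPAR", "OSPAR~AMAP", "HELCOM"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of harsat): returns the codes of the stations
# that one measurement (a named list of flags) may be matched to.
candidate_stations <- function(row, stations, governance_type,
                               governance_id = NULL) {
  governance_type <- match.arg(
    governance_type, c("none", "data", "stations", "both")
  )
  if (governance_type == "none") {
    return(stations$station_code)  # governance ignored on both sides
  }

  # Data governance flags, e.g. is_ospar_monitoring for "OSPAR"
  flag_cols <- paste0("is_", tolower(governance_id), "_monitoring")
  flagged <- vapply(
    flag_cols, function(col) isTRUE(row[[col]]), logical(1),
    USE.NAMES = FALSE
  )

  # "data" and "both" require at least one matching data governance flag
  if (governance_type %in% c("data", "both") && !any(flagged)) {
    return(character(0))
  }
  if (governance_type == "data") {
    return(stations$station_code)  # station governance ignored
  }

  # "stations" uses every id; "both" uses only the ids flagged on the data
  ids <- if (governance_type == "both") governance_id[flagged] else governance_id
  hits <- lapply(
    ids, grepl, x = stations$station_programgovernance, fixed = TRUE
  )
  stations$station_code[Reduce(`|`, hits)]
}

# A measurement submitted under AMAP monitoring only
row_amap <- list(is_ospar_monitoring = FALSE, is_amap_monitoring = TRUE)
candidate_stations(row_amap, stations, "both", c("OSPAR", "AMAP"))
```

With governance_type = "both", the AMAP-only measurement is offered only the station whose governance string contains "AMAP", whereas "stations" would offer every OSPAR or AMAP station.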
The default values of info$add_stations depend on info$purpose as follows:
- "AMAP": list(method = "both", area = "AMAP", datatype = FALSE, temporal = FALSE, governance_type = "none", governance_id = NULL, group = FALSE, check_coordinates = FALSE)
- "HELCOM": list(method = "both", area = "HELCOM", datatype = FALSE, temporal = FALSE, governance_type = "none", governance_id = NULL, group = FALSE, check_coordinates = FALSE)
- "OSPAR": list(method = "both", area = "OSPAR", datatype = TRUE, temporal = TRUE, governance_type = "both", governance_id = c("OSPAR", "AMAP"), group = TRUE, check_coordinates = FALSE)
- "custom": list(method = "name", area = NULL, datatype = FALSE, temporal = FALSE, governance_type = "none", governance_id = NULL, group = FALSE, check_coordinates = FALSE)
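To show how a set of defaults might be combined with user overrides, here is a small base R sketch. It copies the "custom" defaults listed above and merges in hypothetical overrides with modifyList, which is only a stand-in: how control_default and control_modify actually merge values is not described here.

```r
# Defaults for info$purpose == "custom", copied from the list above
custom_defaults <- list(
  method = "name", area = NULL, datatype = FALSE, temporal = FALSE,
  governance_type = "none", governance_id = NULL, group = FALSE,
  check_coordinates = FALSE
)

# Hypothetical overrides chosen for illustration
overrides <- list(method = "coordinates", area = "HELCOM")

# keep.null = TRUE stores NULL overrides instead of deleting the element
settings <- modifyList(custom_defaults, overrides, keep.null = TRUE)

# For "custom", method is restricted to "name" or "coordinates";
# match.arg raises an error for any other value such as "both"
settings$method <- match.arg(settings$method, c("name", "coordinates"))

str(settings)
```

Elements that are not overridden, including those set to NULL, are carried through unchanged.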