Title: | Tidy Water Quality Data and Calculate Thresholds for British Columbia |
---|---|
Description: | Tidies water quality data and calculates water quality thresholds for British Columbia. |
Authors: | Joe Thorley [aut, ctr] , Colin Millar [aut, ctr], Andy Teucher [aut, cre], Sebastian Dalgarno [ctb] , Wendy Wang [ctb, ctr], Stephanie Hazlitt [ctb], Robyn Irvine [ctb, ctr], Province of British Columbia [cph], Ayla Pearson [ctb] |
Maintainer: | Andy Teucher <[email protected]> |
License: | Apache License (== 2.0) | file LICENSE |
Version: | 0.3.1.9003 |
Built: | 2024-12-16 03:44:56 UTC |
Source: | https://github.com/poissonconsulting/wqbc |
Calculates the approved "short" or "long"-term or the "long-daily"
upper water quality thresholds for freshwater life in British Columbia.
The water quality data is automatically cleaned using clean_wqdata
prior to calculating the limits to ensure: all variables are recognised,
all values are non-negative and in the standard units, divergent replicates
are filtered and all remaining replicates are averaged. Only limits whose
conditions are met are returned.
calc_limits( x, by = NULL, term = "long", dates = NULL, keep_limits = TRUE, delete_outliers = FALSE, estimate_variables = FALSE, clean = TRUE, limits = wqbc::limits, messages = getOption("wqbc.messages", default = TRUE), use = "Freshwater Life" )
x |
A data.frame of water quality readings to calculate the limits for. |
by |
An optional character vector of the columns in x to calculate the limits by. |
term |
A string indicating whether to calculate the "long" or "short"-term or "long-daily" limits. |
dates |
An optional date vector indicating the start of 30 day long-term periods. |
keep_limits |
A flag indicating whether to keep values with user supplied upper or lower limits. |
delete_outliers |
A flag indicating whether to delete outliers or merely flag them. |
estimate_variables |
A flag indicating whether to estimate total hardness, total chloride and pH for all dates. |
clean |
Should the data be run through |
limits |
A data frame of the limits table to use. |
messages |
A flag indicating whether to print messages. |
use |
A string indicating the Use. |
If a limit depends on another variable
such as pH, Total Chloride, or Total Hardness and no value was recorded for the date of interest
then the pH, Total Chloride or Total Hardness value is assumed to be the
average recorded value over the 30 day period.
The one exception is if estimate_variables = TRUE
in which case a parametric model
is used to predict the pH, Total Chloride and Total Hardness for all dates with a value of any variable.
Existing values are replaced.
If, in every year, there are fewer than 12 pH/Total Chloride/Total Hardness values then an average value is taken.
Otherwise, if there is only one year with 12 or more values a simple seasonal smoother is used.
If there are two years with 12 or more values then a seasonal smoother with a trend is fitted.
Otherwise a model with trend and a dynamic seasonal component is fitted.
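The model selection rule above can be sketched in base R. This is only an illustration of the decision logic, not the package's implementation: the helper name `choose_model` and the returned labels are hypothetical, and no model is actually fitted.

```r
# Hypothetical helper: pick the model described above from the reading dates
# of one conditional variable (for example pH).
choose_model <- function(dates) {
  years <- format(as.Date(dates), "%Y")
  n_full <- sum(table(years) >= 12) # number of years with 12 or more values
  if (n_full == 0) {
    "average"
  } else if (n_full == 1) {
    "seasonal smoother"
  } else if (n_full == 2) {
    "seasonal smoother with trend"
  } else {
    "trend with dynamic seasonal component"
  }
}

# Illustrative dates only: one reading per month for two years.
monthly <- seq(as.Date("2019-01-15"), by = "month", length.out = 24)
choose_model(monthly[1:6])  # fewer than 12 values in every year
choose_model(monthly[1:12]) # one year with 12 values
choose_model(monthly)       # two years with 12 values
```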
When considering long-term limits there must be at least 5 values spanning 21 days. As replicates are averaged prior to calculating the limits, each of the 5 values must be on a separate day. The first 30 day period begins at the date of the first reading, while the next 30 day period starts at the date of the first reading after the previous period, and so on. The only exception to this is if the user provides dates, in which case each period extends for 30 days or until a provided date is reached. It is important to note that the averaging of conditional variables, the 5 in 30 rule and the assignment of 30 day periods occur independently for each combination of factor levels in the columns specified by by.
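A minimal base R sketch of the period assignment and the 5 in 30 rule follows. The helper names `assign_periods` and `meets_rule` are hypothetical, user-supplied dates are not handled, and "spanning 21 days" is assumed to mean that the last reading is at least 21 days after the first; the package may define it differently.

```r
# Hypothetical helper: assign each reading date to a 30 day period.
assign_periods <- function(dates) {
  dates <- sort(as.Date(dates))
  period <- integer(length(dates))
  start <- dates[1]
  id <- 1L
  for (i in seq_along(dates)) {
    # The first reading 30 or more days after the current start opens a new period.
    if (dates[i] >= start + 30) {
      start <- dates[i]
      id <- id + 1L
    }
    period[i] <- id
  }
  data.frame(Date = dates, Period = period)
}

# Hypothetical check: at least 5 separate days whose range is 21 days or more
# (assumed interpretation of "spanning 21 days").
meets_rule <- function(dates) {
  days <- unique(as.Date(dates))
  length(days) >= 5 && as.numeric(diff(range(days))) >= 21
}

# Illustrative dates only.
dates <- as.Date("2020-01-01") + c(0, 6, 12, 18, 24, 35, 40)
periods <- assign_periods(dates)
ok <- vapply(split(periods$Date, periods$Period), meets_rule, logical(1))
ok
```

With `by` columns, the same steps would be run separately within each combination of factor levels.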
If the user wishes to consider the long-term thresholds without the above requirements (at least 5 values spanning 21 days, etc.) then they should set term = "long-daily".
clean_wqdata
and lookup_limits
## Not run:
demo(fraser)
## End(Not run)
The Canadian Council of Ministers of the Environment (CCME) Water Quality Index 1.0 User's Manual example dataset in tidy format.
ccme
A data frame with 120 rows and 7 columns:
The date of the reading.
The name of the variable.
The value of the reading.
The detection limit.
The minimum permitted value.
The maximum permitted value.
The units of the value, detection limit and lower and upper limits.
Cleans water quality data. After standardization using standardize_wqdata, replicates (two or more readings for the same variable on the same date) are averaged using the mean function. Readings for the same variable on the same date but at different levels of the columns specified in by are not considered replicates. The clean_wqdata function is automatically called by calc_limits prior to calculating limits.
clean_wqdata( x, by = NULL, max_cv = Inf, sds = 10, ignore_undetected = TRUE, large_only = TRUE, delete_outliers = FALSE, remove_blanks = FALSE, messages = getOption("wqbc.messages", default = TRUE), FUN = mean )
x |
The data.frame to clean. |
by |
A character vector of the columns in x to perform the cleaning by. If you have multiple stations specify the column name that contains the station IDs. |
max_cv |
A number indicating the maximum permitted coefficient of variation for replicates. |
sds |
The number of standard deviations above which a value is considered an outlier. |
ignore_undetected |
A flag indicating whether to ignore undetected values when calculating the average deviation and identifying outliers. |
large_only |
A flag indicating whether only large values which exceed the sds should be identified as outliers. |
delete_outliers |
A flag indicating whether to delete outliers or merely flag them. |
remove_blanks |
Should blanks be removed? Blanks are assumed to be denoted by
a value of |
messages |
A flag indicating whether to print messages. |
FUN |
The function to use for summaries, e.g. |
If there are three or more replicates with a coefficient of variation (CV) in excess of max_cv then the replicate with the highest absolute deviation is dropped until the CV is less than or equal to max_cv or only two values remain. By default all values are averaged.
A max_cv value of 1.29 is exceeded by two zeros and one positive value (CV = 1.73) or by two identical positive values and a third value an order of magnitude greater (CV = 1.30). It is not exceeded by one zero and two identical positive values (CV = 0.87).
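The quoted CV values can be checked with a few lines of base R. The sketch below assumes the CV is the sample standard deviation divided by the mean and that deviations are measured from the mean; `cv` and `drop_divergent` are hypothetical helpers, not functions exported by wqbc.

```r
# Coefficient of variation (assumed definition: sd / mean).
cv <- function(x) {
  if (mean(x) == 0) return(0) # assumption: identical zeros have no variation
  sd(x) / mean(x)
}

# Drop the most divergent replicate until the CV is acceptable
# or only two values remain.
drop_divergent <- function(x, max_cv = Inf) {
  while (length(x) > 2 && cv(x) > max_cv) {
    x <- x[-which.max(abs(x - mean(x)))]
  }
  x
}

round(cv(c(0, 0, 1)), 2)  # two zeros and one positive value
round(cv(c(1, 1, 10)), 2) # third value an order of magnitude greater
round(cv(c(0, 1, 1)), 2)  # one zero and two identical positive values

# With max_cv = 1.29 the value 10 is dropped before averaging.
mean(drop_divergent(c(1, 1, 10), max_cv = 1.29))
```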
calc_limits
and standardize_wqdata
clean_wqdata(wqbc::dummy, messages = TRUE)
The standard variables and codes recognised by the wqbc package with their standard units and the R function to use when averaging multiple samples.
codes
A data frame with 4 variables:
The name of the variable.
The EMS code in expanded form.
The standard units for the variable.
The Variable Code in the Environment Canada data.
Compresses EMS codes by removing EMS_ from the start and replacing all '_' with '-'. This function is provided because wqbc stores EMS codes in expanded form.
compress_ems_codes(x)
x |
A character vector of codes to compress. |
compress_ems_codes(c("EMS_0014", "EMS_KR-P", "0-15"))
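The string manipulation described for compression can be sketched in base R. `compress_sketch` is a hypothetical stand-in that follows the description above; the package function may validate its input differently.

```r
# Hypothetical stand-in for the compression rule described above.
compress_sketch <- function(x) {
  x <- sub("^EMS_", "", x)          # remove a leading EMS_ prefix
  gsub("_", "-", x, fixed = TRUE)   # replace every remaining '_' with '-'
}

compress_sketch(c("EMS_0014", "EMS_KR-P", "0-15", "EMS_ZN_T"))
```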
Convert values to different units
convert_values(x, from, to, messages)
x |
a numeric vector of values to convert |
from |
units to convert from |
to |
units to convert to |
messages |
should messages be printed when |
Currently supported units for from and to are:
c("ng/L", "ug/L", "mg/L", "g/L", "kg/L", "pH", "degC", "C", "CFU/dL", "MPN/dL", "CFU/100mL", "MPN/100mL", "CFU/g", "MPN/g", "CFU/mL", "MPN/mL", "Col.unit", "Rel", "NTU", "m", "uS/cm")
a numeric vector of the converted values
convert_values(1, "ug/L", "mg/L", messages = FALSE)
df <- data.frame(
  value = c(1.256, 5400000, 12300, .00098),
  units = c("mg/L", "ng/L", "ug/L", "g/L"),
  stringsAsFactors = FALSE
)
df
df$units_mg_L <- convert_values(df$value, from = df$units, to = "mg/L", messages = FALSE)
df
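For the mass concentration units, the arithmetic behind such a conversion can be illustrated with a lookup table of factors. This is a sketch under the assumption that a simple multiplier table is enough; `to_mg_per_l` and `convert_mass_conc` are hypothetical names and the package may handle units differently.

```r
# Multipliers that express one unit of each entry in mg/L.
to_mg_per_l <- c("ng/L" = 1e-6, "ug/L" = 1e-3, "mg/L" = 1, "g/L" = 1e3, "kg/L" = 1e6)

# Hypothetical helper: convert via mg/L as the common base unit.
convert_mass_conc <- function(x, from, to) {
  x * unname(to_mg_per_l[from]) / unname(to_mg_per_l[to])
}

converted <- convert_mass_conc(
  c(1.256, 5400000, 12300, 0.00098),
  from = c("mg/L", "ng/L", "ug/L", "g/L"),
  to = "mg/L"
)
converted
```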
A dummy data set to illustrate various data cleaning functions.
dummy
A data frame with 4 columns:
The date of the reading.
The name of the variable.
The value of the reading.
The units of the value.
demo(dummy)
The standard variables and codes stored in the EMS database.
ems_codes
A tibble with 4 variables:
The name of the variable.
The EMS code.
Throws an error without the call as part of the error message.
error(...)
... |
zero or more objects which can be coerced to character (and which are pasted together with no separator) or a single condition object. |
base::stop
Expands EMS codes by adding EMS_ to the start if absent and replacing all '-' with '_'. This function is provided because wqbc stores EMS codes in expanded form.
expand_ems_codes(x)
x |
A character vector of codes to expand |
expand_ems_codes(c("0014", "KR-P", "0_15", "EMS_ZN_T"))
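The expansion rule can be sketched the same way in base R. `expand_sketch` is a hypothetical stand-in that follows the description above rather than the package source.

```r
# Hypothetical stand-in for the expansion rule described above.
expand_sketch <- function(x) {
  x <- gsub("-", "_", x, fixed = TRUE)                # replace every '-' with '_'
  ifelse(startsWith(x, "EMS_"), x, paste0("EMS_", x)) # add the prefix if absent
}

expand_sketch(c("0014", "KR-P", "0_15", "EMS_ZN_T"))
```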
Surface freshwater quality monitoring in the Fraser River Basin is carried out under the Canada-British Columbia Water Quality Monitoring Agreement. Monitoring is conducted to assess water quality status and long-term trends, detect emerging issues, establish water quality guidelines and track the effectiveness of remedial measures and regulatory decisions.
fraser
A data frame with 8 columns:
The unique water quality station number.
The date of the reading.
The name of the variable.
The value of the reading.
The units of the value.
The full name of the station.
The latitude of the station in decimal degrees.
The longitude of the station in decimal degrees.
http://open.canada.ca/data/en/dataset/9ec91c92-22f8-4520-8b2c-0f1cce663e18
## Not run:
demo(fraser)
## End(Not run)
Calculates the geometric mean by adding 1 before logging and subtracting 1 after exponentiating so that it provides results even with zero counts. Not used by any wqbc functions but provided as it may be helpful if averaging bacterial counts.
geomean1(x, na.rm = FALSE)
x |
A numeric vector of non-negative numbers. |
na.rm |
A flag indicating whether to remove missing values. |
mean(0:9)
geomean1(0:9)
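The shifted geometric mean can be written in one line of base R. The sketch below assumes the 1 is added before taking logs and subtracted after exponentiating the mean of the logs; `geomean1_sketch` is a hypothetical name and the package function may add input checks.

```r
# Hypothetical stand-in: geometric mean of (x + 1), shifted back by 1.
geomean1_sketch <- function(x, na.rm = FALSE) {
  exp(mean(log(x + 1), na.rm = na.rm)) - 1
}

geomean1_sketch(0:9)      # defined even though the data contain a zero
geomean1_sketch(c(0, 0))  # all-zero counts give 0
```

A plain geometric mean of 0:9 would be 0 because of the single zero count, which is what the shift avoids.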
The short and long term water quality limits for British Columbia recognised by the wqbc package.
limits
A data frame with 6 variables:
The name of the variable.
The name of the Use.
The term of the limit i.e. "Short" versus "Long".
A logical R expression to test for the required condition.
The upper limit or an R expression defining the upper limit.
The units of the upper limit.
R function to calculate statistic of value.
Returns compressed recognised water quality EMS codes.
If variables = NULL the function returns all recognised codes. Otherwise it first substitutes the provided variables for recognised variables using substitute_variables and then looks up the matching codes from codes.
lookup_codes( variables = NULL, messages = getOption("wqbc.messages", default = TRUE) )
variables |
An optional character vector of variables to lookup codes. |
messages |
A flag indicating whether to print messages. |
lookup_limits
and expand_ems_codes
lookup_codes()
lookup_codes(
  c("Aluminum", "Arsenic Total", "Boron Something", "Kryptonite"),
  messages = TRUE
)
Looks up the long or short-term water quality limits for BC. If the limits depend on the pH, total hardness (CaCO3), total chloride or the concentration of methyl mercury and site-specific values are not provided then the dependent limits are returned as missing values.
lookup_limits( ph = NULL, hardness = NULL, chloride = NULL, methyl_mercury = NULL, term = "long", use = "Freshwater Life" )
ph |
A number indicating the pH in pH units at the site of interest. |
hardness |
A number indicating the total hardness (CaCO3) in mg/L at the site of interest. |
chloride |
A number indicating the total chloride concentration in mg/L at the site of interest. |
methyl_mercury |
A number indicating the total concentration of methyl mercury in ug/L at the site of interest. |
term |
A string indicating whether to lookup the "long" or "short"-term limits. |
use |
A string indicating the Use. |
lookup_limits(ph = 8, hardness = 100, chloride = 50, methyl_mercury = 2)
lookup_limits(term = "short")
Returns a character vector of the recognised units.
lookup_units()
lookup_units()
Returns a character vector of the recognised uses.
lookup_use()
lookup_use()
Returns recognised water quality variables.
If codes = NULL the function returns all recognised variable names. Otherwise it looks up the matching variables from codes. Whether the codes are compressed or expanded is unimportant.
lookup_variables( codes = NULL, messages = getOption("wqbc.messages", default = TRUE) )
codes |
An optional character vector of codes to look up variables. |
messages |
A flag indicating whether to print messages. |
lookup_limits
and expand_ems_codes
lookup_variables()
lookup_variables(c("AL-D", "EMS_AS_T", "B--T", "KRYP"), messages = TRUE)
If by = NULL
plot_timeseries returns a ggplot object.
Otherwise it returns a list of ggplot objects.
plot_timeseries( data, by = NULL, y0 = TRUE, size = 1, messages = getOption("wqbc.messages", default = TRUE) )
data |
A data frame of the data to plot. |
by |
A character vector of the columns to plot the time series by. |
y0 |
A flag indicating whether to expand the y-axis limits to include 0. |
size |
A number specifying the point size. |
messages |
A flag indicating whether to print messages. |
plot_timeseries(ccme[ccme$Variable == "As", ])
plot_timeseries(ccme, by = "Variable")
Set a value where the actual value of a measurement is less than the method detection limit (MDL)
set_non_detects( value, mdl_flag = NULL, mdl_value = NULL, mdl_action = c("zero", "mdl", "half", "na") )
value |
a numeric vector of measured values |
mdl_flag |
a character vector the same length as |
mdl_value |
a numeric vector the same length as |
mdl_action |
What to do with values below the detection limit. Options
are |
You must supply either mdl_flag or mdl_value, or both. When only mdl_flag is supplied, it is assumed that the original value has been set to the MDL, and will be adjusted according to the mdl_action. When only mdl_value is supplied then any value less than that will be adjusted appropriately using the corresponding mdl_value. When both mdl_flag and mdl_value are supplied, any value with a corresponding < in the mdl_flag vector will be adjusted appropriately using the corresponding mdl_value.
a numeric vector the same length as value with non-detects adjusted accordingly
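The three cases described above can be sketched in base R. This is an illustration only: `adjust_non_detects` is a hypothetical helper, and the meaning of each action ("zero" gives 0, "mdl" gives the MDL, "half" gives half the MDL, "na" gives a missing value) is assumed from the option names.

```r
# Hypothetical helper mirroring the flag-only, value-only and combined cases.
adjust_non_detects <- function(value, mdl_flag = NULL, mdl_value = NULL,
                               mdl_action = c("zero", "mdl", "half", "na")) {
  mdl_action <- match.arg(mdl_action)
  if (is.null(mdl_flag) && is.null(mdl_value)) {
    stop("supply mdl_flag, mdl_value, or both")
  }
  # Flag only: the recorded value is taken to be the MDL itself.
  if (is.null(mdl_value)) mdl_value <- value
  # No flag: a value below its MDL is treated as a non-detect.
  below <- if (is.null(mdl_flag)) value < mdl_value else mdl_flag == "<"
  below <- !is.na(below) & below
  replacement <- switch(mdl_action,
    zero = rep(0, length(value)),
    mdl = mdl_value,
    half = mdl_value / 2,
    na = rep(NA_real_, length(value))
  )
  value[below] <- replacement[below]
  value
}

# Illustrative values only.
adjust_non_detects(c(0.1, 2, 0.5), mdl_flag = c("<", NA, "<"), mdl_action = "half")
adjust_non_detects(c(0.1, 2), mdl_value = c(0.5, 0.5), mdl_action = "mdl")
```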
The standard variables, VMV codes and Variable Codes provided by Environment Canada.
site_limits
A tibble
The name of the station as a character.
The variable name as character.
The term ('Short' or 'Long') as a character.
The condition as a character.
The upper limit as a character.
The units as a character.
The EMS codes as a character.
The station EMS ID as a character.
The use name as character.
Standardizes a water quality data set using substitute_variables and substitute_units so that all remaining values have the recognised codes and variables in codes and the standard units. If column Code is present then a Variable column is created using lookup_variables. The standardize_wqdata function is called by clean_wqdata prior to cleaning.
standardize_wqdata( x, strict = TRUE, messages = getOption("wqbc.messages", default = TRUE) )
standardize_wqdata(wqbc::dummy, messages = TRUE)
The water quality stations for British Columbia with their coordinates.
stations
A tibble with 4 variables:
The station ID (chr).
The EMS name of the station (chr).
The station latitude in decimal degrees (dbl).
The station longitude in decimal degrees (dbl).
Substitutes provided unit names for recognised units. Before matching all spaces and "units" or "UNITS" are removed. The case is not important. Where there are no matches missing values are returned.
substitute_units(x, messages = getOption("wqbc.messages", default = TRUE))
x |
The character vector of units to substitute. |
messages |
A flag indicating whether to print messages. |
substitute_units(c("mg/L", "MG/L", "mg /L ", "Kg/l", "gkl"), messages = TRUE)
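The matching rule described above (drop spaces and the word "units", ignore case, return NA when nothing matches) can be sketched in base R. The helper name `substitute_units_sketch` is hypothetical and the `recognised` vector below is only a subset of the units the package supports.

```r
# A subset of recognised units, for illustration only.
recognised <- c("ng/L", "ug/L", "mg/L", "g/L", "kg/L")

# Hypothetical stand-in for the matching rule described above.
substitute_units_sketch <- function(x) {
  key <- gsub(" ", "", x, fixed = TRUE)              # remove all spaces
  key <- gsub("units", "", key, ignore.case = TRUE)  # remove "units" or "UNITS"
  recognised[match(tolower(key), tolower(recognised))] # case-insensitive match
}

substitute_units_sketch(c("mg/L", "MG/L", "mg /L ", "Kg/l", "gkl"))
```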
Substitutes provided variable names for recognised names. The case is not important. Where there are no matches missing values are returned. When strict = TRUE all words in a recognised variable must be present in x but when strict = FALSE (soft-deprecated) the only requirement is that the first word is present. When strict = FALSE recognised variables with the same first word such as "Iron Dissolved" and "Iron Total" are excluded from matches. In both cases the only requirement is that all words or just the first word are present in x. The order of the words does not matter nor does the presence of other words. This means that a value such as "Total Fluoride Hardness" matches two recognised variables which causes an error. The code also considers Aluminium to be a match with Aluminum.
substitute_variables( x, strict = TRUE, messages = getOption("wqbc.messages", default = TRUE) )
substitute_variables(c( "ALUMINIUM SOMETHING", "ALUMINUM DISSOLVED", "dissolved aluminium", "BORON Total", "KRYPTONITE", "Total Fluoride Hardness" ), messages = TRUE)
Compute annual summaries of water quality observations.
summarise_for_trends( data, breaks = NULL, FUN = "median", messages = getOption("wqbc.messages", default = TRUE) )
data |
The data.frame to analyse. |
breaks |
A numeric vector used to create groups of consecutive months. If NULL, the full year is used. |
FUN |
The function to use for yearly summaries, e.g. median, mean, or max. |
messages |
A flag indicating whether to print messages. |
The data must contain the columns Station, Date, Variable, Value, and Units.
A tibble data.frame with rows for each Station, Variable, Year and month grouping.
# select one station
data(yuepilon)
data <- yuepilon[yuepilon$Station == "02EA005", ]
# estimate trend (using simple sen slope)
trend <- test_trends(data, messages = TRUE)
# get the data used in the test
datasum <- summarise_for_trends(data)
plot(datasum$Year, datasum$Value,
  main = paste("p-value =", round(trend$significance, 3)),
  ylab = "Value", xlab = "Year", las = 1
)
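As a rough picture of what an annual summary involves when breaks = NULL, the base R sketch below takes the median Value per Station, Variable and Year. The data are made up for illustration and the grouping shown is an assumption; the package may add further steps.

```r
# Made-up readings with the required columns.
readings <- data.frame(
  Station = "A",
  Date = as.Date(c("2020-01-10", "2020-06-10", "2020-11-10", "2021-02-01", "2021-08-01")),
  Variable = "Flow",
  Value = c(1, 5, 3, 4, 8),
  Units = "m3/s",
  stringsAsFactors = FALSE
)

# Derive the year and summarise each Station/Variable/Year group with the median.
readings$Year <- as.integer(format(readings$Date, "%Y"))
annual <- aggregate(
  Value ~ Station + Variable + Units + Year,
  data = readings, FUN = median
)
annual
```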
Calculates summary statistics for water quality data using log-normal maximum-likelihood models.
summarise_wqdata( x, by = NULL, censored = FALSE, na.rm = FALSE, conf_level = 0.95, quan_range = 0.5 )
x |
The data.frame to summarise. |
by |
A character vector specifying the columns in x to independently summarise by. |
censored |
A flag specifying whether to account for non-detects. |
na.rm |
A flag specifying whether to exclude missing Value values when summarising. |
conf_level |
A number between 0 and 1 specifying confidence limits. By default calculates 95% confidence intervals. |
quan_range |
A number between 0 and 1 specifying the quantile range. By default calculates the inter-quartile range. |
The data set must include a numeric 'Value' and a character or factor 'Variable' column.
By default the summary statistics are independently calculated for each Variable. The user can specify additional columns to independently calculate the statistics by using the by argument.
If the user wishes to account for non-detects using left-censored maximum-likelihood (by setting censored = TRUE) the data set must also include a numeric DetectionLimit column.
Missing values in the DetectionLimit column are assumed to indicate that the Values are not censored. Missing values in the Value column are always considered to be missing values. If the user wishes to exclude missing values in the Value column they should set na.rm = TRUE.
A tibble of the summary statistics.
data.frame(Variable = "var", Value = 1:5, stringsAsFactors = FALSE)
Analyses time series using the Theil-Sen estimate of slope. It requires at least 6 years of data.
test_trends( data, breaks = NULL, FUN = "median", messages = getOption("wqbc.messages", default = TRUE) )
data |
The data.frame to analyse. |
breaks |
A numeric vector used to create groups of consecutive months. If NULL, the full year is used. |
FUN |
The function to use for yearly summaries, e.g. median, mean, or max. |
messages |
A flag indicating whether to print messages. |
The data must contain the columns Station, Date, Variable, Value, and Units.
A tibble data.frame with rows for each Station, Variable, and month grouping, and additional columns for the Sen slope estimate, 95\
data <- wqbc::yuepilon
trend <- test_trends(data, breaks = 6, messages = TRUE)
## Not run:
demo(test_trends)
## End(Not run)
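For readers unfamiliar with the Theil-Sen estimator, the slope is the median of the slopes between all pairs of points. The base R sketch below shows that calculation only; `theil_sen_slope` is a hypothetical helper, the data are made up, and the significance test and confidence intervals reported by test_trends are not reproduced.

```r
# Median of all pairwise slopes (Theil-Sen estimate of slope).
theil_sen_slope <- function(x, y) {
  pairs <- combn(length(x), 2) # every pair of observation indices
  slopes <- (y[pairs[2, ]] - y[pairs[1, ]]) / (x[pairs[2, ]] - x[pairs[1, ]])
  median(slopes[is.finite(slopes)]) # ignore pairs with identical x values
}

# Made-up annual values with one outlying year.
year <- 2001:2008
flow <- c(10, 12, 11, 15, 14, 18, 17, 40)
theil_sen_slope(year, flow)
```

Because the median is used, a single extreme year has far less influence on the slope than it would in an ordinary least-squares fit.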
Tidies water quality data downloaded from the Environment Canada website. It is recommended to obtain the data via canwqdata::wq_site_data() or canwqdata::wq_basin_data(). It retains and renames required columns and sets the timezone to PST.
tidy_ec_data( x, cols = character(0), mdl_action = c("zero", "mdl", "half", "na", "none") )
x |
The data to tidy. |
cols |
additional columns from the EMS data to retain specified as a character vector of column names that exist in the data. The default columns retained are:
|
mdl_action |
What to do with values below the detection limit. Options
are |
A tibble of the tidied rems data.
Tidies water quality data downloaded from EMS database using the bcgov/rems package. It retains and renames required columns and sets the timezone to PST.
tidy_ems_data( x, cols = character(0), mdl_action = c("zero", "mdl", "half", "na", "none") )
x |
The data to tidy. |
cols |
additional columns from the EMS data to retain specified as a character vector of column names that exist in the data. The default columns retained are:
|
mdl_action |
What to do with values below the detection limit. Options
are |
It sets values that are flagged as being less than the detection limit to zero. It does not alter values that are flagged as being greater than the detection limit - that is left up to the user.
A tibble of the tidied rems data.
The standard variables, VMV codes and Variable Codes provided by Environment Canada.
vmv_codes
A tibble
The name of the variable.
The VMV code.
The Variable Code in the Environment Canada data.
There can be more than one VMV Code for a variable!
A crosswalk table linking EMS codes to Environment and Climate Change Canada VMV codes
vmv_ems
A tibble
EMS Code for the variable
EMS name for the variable
EMS name for the unit
EMS code for the unit
EMS code for the method
EMS name for the method
EMS method detection limit
VMV code (unique for variable, method, and unit)
VMV code for the variable
VMV name for the variable
VMV name for the variable type
VMV name for the unit
VMV name for the unit
VMV code for the method
VMV name for the method
Hydrometric data are collected and compiled by Water Survey of Canada's eight regional offices. The information is housed in two centrally-managed databases: HYDEX and HYDAT.
yuepilon
A data frame with 5 columns:
Unique 7-character station identification code.
The year of the data stored as if it was taken on the 1st Jan.
The name of the variable.
The value of the reading.
The units of the value.
The full name of the station.
The latitude of the station in decimal degrees.
The longitude of the station in decimal degrees.
HYDAT is a relational database that contains the actual computed data for the stations listed in HYDEX. These data include: daily and monthly means of flow, water levels and sediment concentrations (for sediment sites). For some sites, peaks and extremes are also recorded.
WSC now offers hydrometric data and station information in a single downloadable file, either in Microsoft Access Database format or in SQLite format, updated on a quarterly basis.
This database was used to derive the yuepilon dataset, which is a table of annual mean river flows for four sites: 02FB007, 02KB001, 02EA005 and 02GA010.
http://www.ec.gc.ca/rhc-wsc/default.asp?lang=En&n=9018B5EC-1