Title: | Detect Heatwaves and Cold-Spells |
---|---|
Description: | The different methods for defining, detecting, and categorising the extreme events known as heatwaves or cold-spells, as first proposed in Hobday et al. (2016) <doi: 10.1016/j.pocean.2015.12.014> and Hobday et al. (2018) <https://www.jstor.org/stable/26542662>. The functions in this package work on both air and water temperature data. These detection algorithms may be used on non-temperature data as well. |
Authors: | Robert W. Schlegel [aut, cre, ctb]
|
Maintainer: | Robert W. Schlegel <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.5.3.9005 |
Built: | 2025-02-20 06:02:14 UTC |
Source: | https://github.com/robwschlegel/heatwaver |
A dataset containing the daily maximum and minimum air temperatures (in degrees Celsius) and date for Algiers, Algeria for the period 1961-01-01 to 2005-12-31.
Algiers
Algiers
A data frame with 16436 rows and 3 variables:
date, as.Date() format
daily max. temperature, in degrees Celsius
daily min. temperature, in degrees Celsius
...
lon/lat:
Mr. Haouari Mahmoud, IHFR, Algeria
Calculate yearly means for event metrics.
block_average(data, x = t, y = temp, report = "full", returnDF = TRUE)
block_average(data, x = t, y = temp, report = "full", returnDF = TRUE)
data |
Accepts the data returned by the |
x |
This column is expected to contain a vector of dates as per the
specification of |
y |
This is a column containing the measurement variable. If the column
name differs from the default (i.e. |
report |
Specify either |
returnDF |
The default ( |
This function needs to be provided with the full output from the detect_event
or exceedance
functions. Note that the yearly averages are calculated only for
complete years (i.e. years that start/end part-way through the year at the beginning
or end of the original time series are removed from the calculations).
This function differs from the python implementation of the function of the
same name (i.e., blockAverage
, see https://github.com/ecjoliver/marineHeatWaves)
in that we only provide the ability to calculate the average (or aggregate)
event metrics in 'blocks' of one year, while the python version allows
arbitrary (integer) block sizes.
Note that if this function is used on the output of exceedance
, all of the metrics
(see below) with relThresh
in the name will be returned as NA
values.
The function will return a data frame of the averaged (or aggregate) metrics. It includes the following:
year |
The year over which the metrics were averaged. |
count |
The number of events per year. |
duration |
The average duration of events per year [days]. |
duration_max |
The maximum duration of an event in each year [days]. |
intensity_mean |
The average event "mean intensity" in each year [deg. C]. |
intensity_max |
The average event "maximum (peak) intensity" in each year [deg. C]. |
intensity_max_max |
The maximum event "maximum (peak) intensity" in each year [deg. C]. |
intensity_var |
The average event "intensity variability" in each year [deg. C]. |
intensity_cumulative |
The average event "cumulative intensity" in each year [deg. C x days]. |
rate_onset |
Average event onset rate in each year [deg. C / days]. |
rate_decline |
Average event decline rate in each year [deg. C / days]. |
total_days |
Total number of events days in each year [days]. |
total_icum |
Total cumulative intensity over all events in each year [deg. C x days]. |
intensity_max_relThresh
, intensity_mean_relThresh
,
intensity_var_relThresh
, and intensity_cumulative_relThresh
are as above except relative to the threshold (e.g., 90th percentile) rather
than the seasonal climatology.
intensity_max_abs
, intensity_mean_abs
, intensity_var_abs
, and
intensity_cumulative_abs
are as above except as absolute magnitudes
rather than relative to the seasonal climatology or threshold.
Albertus J. Smit, Eric C. J. Oliver, Robert W. Schlegel
Hobday, A.J. et al. (2016), A hierarchical approach to defining marine heatwaves, Progress in Oceanography, 141, pp. 227-238, doi: 10.1016/j.pocean.2015.12.014
ts <- ts2clm(sst_WA, climatologyPeriod = c("1983-01-01", "2012-12-31")) res <- detect_event(ts) out <- block_average(res) summary(glm(count ~ year, out, family = "poisson")) library(ggplot2) ggplot(data = out, aes(x = year, y = count)) + geom_point(colour = "salmon") + geom_line() + labs(x = NULL, y = "Number of events")
ts <- ts2clm(sst_WA, climatologyPeriod = c("1983-01-01", "2012-12-31")) res <- detect_event(ts) out <- block_average(res) summary(glm(count ~ year, out, family = "poisson")) library(ggplot2) ggplot(data = out, aes(x = year, y = count)) + geom_point(colour = "salmon") + geom_line() + labs(x = NULL, y = "Number of events")
Calculates the categories of MHWs or MCSs produced by detect_event
in
accordance with the naming scheme proposed in Hobday et al. (2018).
category( data, y = temp, S = TRUE, name = "Event", climatology = FALSE, MCScorrect = FALSE, MCSice = FALSE, season = "range", roundVal = 4, lat_col = FALSE )
category( data, y = temp, S = TRUE, name = "Event", climatology = FALSE, MCScorrect = FALSE, MCSice = FALSE, season = "range", roundVal = 4, lat_col = FALSE )
data |
The function receives the full (list) output from the
|
y |
The column containing the measurement variable. If the column
name differs from the default (i.e. |
S |
This argument informs the function if the data were collected in the
southern hemisphere (TRUE, default) or the northern hemisphere (FALSE) so that it may correctly
output the |
name |
If a character string (e.g. "Bohai Sea") is provided here it will be used
to name the events in the |
climatology |
The default setting of |
MCScorrect |
When calculating marine cold-spells (MCSs) it may occur in some areas that the bottom thresholds for the more intense categories will be below -1.8C, this is physically impossible on Earth, so if one wants to correct the bottom thresholds to not be able to exceed -1.8C, set this argument to TRUE (default is FALSE). |
MCSice |
Sensu Schlegel et al. (2021; Marine cold-spells), it is advisable to classify a MCS with an event threshold below -1.7°C as a 'V Ice' category event. |
season |
This argument allows the user to decide how the season(s) of occurrence for
the MHWs are labelled. The default setting of |
roundVal |
This argument allows the user to choose how many decimal places
the outputs will be rounded to. Default is 4. To
prevent rounding set |
lat_col |
The user may set |
An explanation for the categories is as follows:
Events that have been detected, but with a maximum intensity that does not double the distance between the seasonal climatology and the threshold value.
Events with a maximum intensity that doubles the distance from the seasonal climatology and the threshold, but do not triple it.
Events that triple the aforementioned distance, but do not quadruple it.
Events with a maximum intensity that is four times or greater than the aforementioned distance.
If 'MCSice = T', a MCS with an event threshold below -1.7°C will be classified here.
The function will return a data.frame with results similar to those seen in
Table 2 of Hobday et al. (2018). This provides the information necessary to
appraise the extent of the events in the output of detect_event
based on the
category ranking scale. The category thresholds are calculated based on the difference
between the given seasonal climatology and threshold climatology. The four category levels
are then the difference multiplied by the category level.
The definitions for the default output columns are as follows:
event_no |
The number of the event as determined by |
event_name |
The name of the event. Generated from the |
peak_date |
The date (day) on which the maximum intensity of the event was recorded. |
category |
The maximum category threshold reached/exceeded by the event. |
i_max |
The maximum intensity of the event above the threshold value. |
duration |
The total duration (days) of the event. Note that this includes
any possible days when the measurement value |
p_moderate |
The proportion of the total duration (days) spent at or above the first threshold, but below any further thresholds. |
p_strong |
The proportion of the total duration (days) spent at or above the second threshold, but below any further thresholds. |
p_severe |
The proportion of the total duration (days) spent at or above the third threshold, but below the fourth threshold. |
p_extreme |
The proportion of the total duration (days) spent at or above the fourth and final threshold. |
season |
The season(s) during which the event occurred. If the event
occurred across two seasons this will be displayed as e.g. "Winter/Spring".
Across three seasons as e.g. "Winter-Summer". Events lasting across four or more
seasons are listed as "Year-round". December (June) is used here as the start of
Austral (Boreal) summer. If "start", "peak", or "end" was given to the |
If climatology = TRUE
, this function will output a list of two dataframes.
The first dataframe, climatology
, will contain the following columns:
t |
The column containing the daily date values. |
event_no |
The numeric event number label. |
intensity |
The daily exceedance (default is degrees C) above the seasonal climatology. |
category |
The category classification per day. |
The second dataframe, event
, contains the default output of this function,
as detailed above.
Robert W. Schlegel
Hobday et al. (2018). Categorizing and Naming Marine Heatwaves. Oceanography 31(2).
Schlegel et al. (2021). Marine cold-spells. Progress in Oceanography 198(102684).
res_WA <- detect_event(ts2clm(sst_WA, climatologyPeriod = c("1983-01-01", "2012-12-31"))) # Note that the name argument expects a character vector cat_WA <- category(res_WA, name = "WA") tail(cat_WA) # If the data were collected in the northern hemisphere # we must let the function know this, as seen below res_Med <- detect_event(ts2clm(sst_Med, climatologyPeriod = c("1983-01-01", "2012-12-31"))) cat_Med <- category(res_Med, S = FALSE, name = "Med") tail(cat_Med) # One may also choose to have this function output the daily # category classifications as well by setting: climatology = TRUE cat_WA_daily <- category(res_WA, name = "WA", climatology = TRUE) head(cat_WA_daily$climatology)
res_WA <- detect_event(ts2clm(sst_WA, climatologyPeriod = c("1983-01-01", "2012-12-31"))) # Note that the name argument expects a character vector cat_WA <- category(res_WA, name = "WA") tail(cat_WA) # If the data were collected in the northern hemisphere # we must let the function know this, as seen below res_Med <- detect_event(ts2clm(sst_Med, climatologyPeriod = c("1983-01-01", "2012-12-31"))) cat_Med <- category(res_Med, S = FALSE, name = "Med") tail(cat_Med) # One may also choose to have this function output the daily # category classifications as well by setting: climatology = TRUE cat_WA_daily <- category(res_WA, name = "WA", climatology = TRUE) head(cat_WA_daily$climatology)
Applies the Hobday et al. (2016) marine heat wave definition to an input time
series of a given value (usually, but not necessarily limited to, temperature)
along with a daily date vector and pre-calculated seasonal and threshold
climatologies, which may either be created with ts2clm
or some
other means.
detect_event( data, x = t, y = temp, seasClim = seas, threshClim = thresh, threshClim2 = NA, minDuration = 5, minDuration2 = minDuration, joinAcrossGaps = TRUE, maxGap = 2, maxGap2 = maxGap, coldSpells = FALSE, protoEvents = FALSE, categories = FALSE, roundRes = 4, returnDF = TRUE, ... )
detect_event( data, x = t, y = temp, seasClim = seas, threshClim = thresh, threshClim2 = NA, minDuration = 5, minDuration2 = minDuration, joinAcrossGaps = TRUE, maxGap = 2, maxGap2 = maxGap, coldSpells = FALSE, protoEvents = FALSE, categories = FALSE, roundRes = 4, returnDF = TRUE, ... )
data |
A data frame with at least four columns. In the default setting
(i.e. omitting the arguments |
x |
This column is expected to contain a vector of dates as per the
specification of |
y |
This is a column containing the measurement variable. If the column
name differs from the default (i.e. |
seasClim |
The dafault for this argument assumes that the seasonal
climatology column is called |
threshClim |
The threshold climatology column should be called
|
threshClim2 |
If one wishes to provide a second climatology threshold filter for the more rigorous detection of events, a vector or column containing logical values (i.e. TRUE FALSE) should be provided here. By default this argument is ignored. It's primary purpose is to allow for the inclusion of tMin and tMax thresholds. |
minDuration |
The minimum duration for acceptance of detected events.
The default is |
minDuration2 |
The minimum duration for acceptance of events after
filtering by |
joinAcrossGaps |
Boolean switch indicating whether to join events which
occur before/after a short gap as specified by |
maxGap |
The maximum length of gap allowed for the joining of MHWs. The
default is |
maxGap2 |
The maximum gap length after applying both thresholds.
By default |
coldSpells |
Boolean specifying if the code should detect cold events
instead of warm events. The default is |
protoEvents |
The default, |
categories |
Rather than using |
roundRes |
This argument allows the user to choose how many decimal places
the MHW metric outputs will be rounded to. Default is 4. To
prevent rounding set |
returnDF |
The default ( |
... |
Other arguments that will be passed internally to |
This function assumes that the input time series consists of continuous
daily values with few missing values. Time ranges which start and end
part-way through the calendar year are supported. The accompanying function
ts2clm
aids in the preparation of a time series that is
suitable for use with detect_event
, although this may also be accomplished
'by hand' as long as the criteria are met as discussed in the documentation
to ts2clm
.
The calculation of onset and decline rates assumes that the events
started a half-day before the start day and ended a half-day after the
end-day. This is consistent with the duration definition as implemented,
which assumes duration = end day - start day + 1. An event that is already
present at the beginning of a time series, or an event that is still present
at the end of a time series, will report the rate of onset or the rate of
decline as NA
, as it is impossible to know what the temperature half a
day before or after the start or end of the event is.
For the purposes of event detection, any missing temperature values not
interpolated over (through optional maxPadLength
in ts2clm
)
will be set equal to the seasonal climatology. This means they will trigger
the end/start of any adjacent temperature values which satisfy the event
definition criteria.
If the code is used to detect cold events (coldSpells = TRUE
),
then it works just as for heat waves except that events are detected as
deviations below the (100 - pctile)th percentile (e.g., the 10th instead of
90th) for at least 5 days. Intensities are reported as negative values and
represent the temperature anomaly below climatology.
The original Python algorithm was written by Eric Oliver, Institute for Marine and Antarctic Studies, University of Tasmania, Feb 2015, and is documented by Hobday et al. (2016). The marine cold spell option was implemented in version 0.13 (21 Nov 2015) of the Python module as a result of our preparation of Schlegel et al. (2017), wherein the cold events receive a brief overview.
The function will return a list of two data.frames,
climatology
and event
, which are, surprisingly, the climatology
and event results, respectively. The climatology contains the full time series of
daily temperatures, as well as the the seasonal climatology, the threshold
and various aspects of the events that were detected. The software was
designed for detecting extreme thermal events, and the units specified below
reflect that intended purpose. However, various other kinds of extreme
events may be detected according to the specifications, and if that is the
case, the appropriate units need to be determined by the user.
The climatology
results will contain the same column produced by
ts2clm
as well as the following:
threshCriterion |
Boolean indicating if |
durationCriterion |
Boolean indicating whether periods of consecutive
|
event |
Boolean indicating if all criteria that define an extreme event are met. |
event_no |
A sequential number indicating the ID and order of occurrence of the events. |
intensity |
The difference between |
category |
The category classification per day. Only added
if |
The event
results are summarised using a range of event metrics:
event_no |
A sequential number indicating the ID and order of the events. |
index_start |
Start index of event. |
index_end |
End index of event. |
duration |
Duration of event [days]. |
date_start |
Start date of event [date]. |
date_end |
End date of event [date]. |
date_peak |
Date of event peak [date]. |
intensity_mean |
Mean intensity [deg. C]. |
intensity_max |
Maximum (peak) intensity [deg. C]. |
intensity_var |
Intensity variability (standard deviation) [deg. C]. |
intensity_cumulative |
Cumulative intensity [deg. C x days]. |
rate_onset |
Onset rate of event [deg. C / day]. |
rate_decline |
Decline rate of event [deg. C / day]. |
event_name |
The name of the event. Generated from the |
category |
The maximum category threshold reached/exceeded by the event.
Only added if |
p_moderate |
The proportion of the total duration (days) spent at or above
the first threshold, but below any further thresholds. Only added if |
p_strong |
The proportion of the total duration (days) spent at or above
the second threshold, but below any further thresholds. Only added if |
p_severe |
The proportion of the total duration (days) spent at or above
the third threshold, but below the fourth threshold. Only added if |
p_extreme |
The proportion of the total duration (days) spent at or above
the fourth and final threshold. Only added if |
season |
The season(s) during which the event occurred. If the event
occurred across two seasons this will be displayed as e.g. "Winter/Spring".
Across three seasons as e.g. "Winter-Summer". Events lasting across four or more
seasons are listed as "Year-round". December (June) is used here as the start of
Austral (Boreal) summer. If "start", "peak", or "end" was given to the |
intensity_max_relThresh
, intensity_mean_relThresh
,
intensity_var_relThresh
, and intensity_cumulative_relThresh
are as above except relative to the threshold (e.g., 90th percentile) rather
than the seasonal climatology.
intensity_max_abs
, intensity_mean_abs
, intensity_var_abs
, and
intensity_cumulative_abs
are as above except as absolute magnitudes
rather than relative to the seasonal climatology or threshold.
Note that rate_onset
and rate_decline
will return NA
when the event begins/ends on the first/last day of the time series. This
may be particularly evident when the function is applied to large gridded
data sets. Although the other metrics do not contain any errors and
provide sensible values, please take this into account in its
interpretation.
Albertus J. Smit, Robert W. Schlegel, Eric C. J. Oliver
Hobday, A.J. et al. (2016). A hierarchical approach to defining marine heatwaves, Progress in Oceanography, 141, pp. 227-238, doi:10.1016/j.pocean.2015.12.014
Schlegel, R. W., Oliver, C. J., Wernberg, T. W., Smit, A. J. (2017). Nearshore and offshore co-occurrences of marine heatwaves and cold-spells. Progress in Oceanography, 151, pp. 189-205, doi:10.1016/j.pocean.2017.01.004
data.table::setDTthreads(threads = 1) res_clim <- ts2clm(sst_WA, climatologyPeriod = c("1983-01-01", "2012-12-31")) out <- detect_event(res_clim) # show a portion of the climatology: out$climatology[1:10, ] # show some of the heat waves: out$event[1:5, 1:10] # Or if one wants to calculate MCSs res_clim <- ts2clm(sst_WA, climatologyPeriod = c("1983-01-01", "2012-12-31"), pctile = 10) out <- detect_event(res_clim, coldSpells = TRUE) # show a portion of the climatology: out$climatology[1:10, ] # show some of the cold-spells: out$event[1:5, 1:10] # It is also possible to calculate the categories of events directly # See the \code{\link{category}} documentation for more functionality res_clim <- ts2clm(sst_WA, climatologyPeriod = c("1983-01-01", "2012-12-31")) out_event <- detect_event(res_clim, categories = TRUE) out_list <- detect_event(res_clim, categories = TRUE, climatology = TRUE) # It is also possible to give two separate sets of threshold criteria # To use a second static threshold we first use the exceedance function thresh_19 <- exceedance(sst_Med, threshold = 19, minDuration = 10, maxGap = 0)$threshold # Then we use that output when detecting our events events_19 <- detect_event(ts2clm(sst_Med, climatologyPeriod = c("1982-01-01", "2011-12-31")), threshClim2 = thresh_19$exceedance, minDuration2 = 10, maxGap2 = 0) # If we want to use two different percentile thresholds we use detect_event thresh_95 <- detect_event(ts2clm(sst_Med, pctile = 95, climatologyPeriod = c("1982-01-01", "2011-12-31")), minDuration = 2, maxGap = 0)$climatology # Then we use that output when detecting our events events_95 <- detect_event(ts2clm(sst_Med, climatologyPeriod = c("1982-01-01", "2011-12-31")), threshClim2 = thresh_95$event, minDuration2 = 2, maxGap2 = 0)
data.table::setDTthreads(threads = 1) res_clim <- ts2clm(sst_WA, climatologyPeriod = c("1983-01-01", "2012-12-31")) out <- detect_event(res_clim) # show a portion of the climatology: out$climatology[1:10, ] # show some of the heat waves: out$event[1:5, 1:10] # Or if one wants to calculate MCSs res_clim <- ts2clm(sst_WA, climatologyPeriod = c("1983-01-01", "2012-12-31"), pctile = 10) out <- detect_event(res_clim, coldSpells = TRUE) # show a portion of the climatology: out$climatology[1:10, ] # show some of the cold-spells: out$event[1:5, 1:10] # It is also possible to calculate the categories of events directly # See the \code{\link{category}} documentation for more functionality res_clim <- ts2clm(sst_WA, climatologyPeriod = c("1983-01-01", "2012-12-31")) out_event <- detect_event(res_clim, categories = TRUE) out_list <- detect_event(res_clim, categories = TRUE, climatology = TRUE) # It is also possible to give two separate sets of threshold criteria # To use a second static threshold we first use the exceedance function thresh_19 <- exceedance(sst_Med, threshold = 19, minDuration = 10, maxGap = 0)$threshold # Then we use that output when detecting our events events_19 <- detect_event(ts2clm(sst_Med, climatologyPeriod = c("1982-01-01", "2011-12-31")), threshClim2 = thresh_19$exceedance, minDuration2 = 10, maxGap2 = 0) # If we want to use two different percentile thresholds we use detect_event thresh_95 <- detect_event(ts2clm(sst_Med, pctile = 95, climatologyPeriod = c("1982-01-01", "2011-12-31")), minDuration = 2, maxGap = 0)$climatology # Then we use that output when detecting our events events_95 <- detect_event(ts2clm(sst_Med, climatologyPeriod = c("1982-01-01", "2011-12-31")), threshClim2 = thresh_95$event, minDuration2 = 2, maxGap2 = 0)
detect_event3
is a data.table version of the earlier detect_event
.
It applies the Hobday et al. (2016) marine heat wave definition to an input time
series of a given value (usually, but not necessarily limited to, temperature)
along with a daily date vector and pre-calculated seasonal and threshold
climatologies, which may either be created with ts2clm3
or some
other means.
detect_event3( data, x = t, y = temp, seasClim = seas, threshClim = thresh, threshClim2 = NA, minDuration = 5, minDuration2 = minDuration, joinAcrossGaps = TRUE, maxGap = 2, maxGap2 = maxGap, coldSpells = FALSE, protoEvents = FALSE, categories = FALSE, roundRes = 4, ... )
detect_event3( data, x = t, y = temp, seasClim = seas, threshClim = thresh, threshClim2 = NA, minDuration = 5, minDuration2 = minDuration, joinAcrossGaps = TRUE, maxGap = 2, maxGap2 = maxGap, coldSpells = FALSE, protoEvents = FALSE, categories = FALSE, roundRes = 4, ... )
data |
A data frame with at least four columns. In the default setting
(i.e. omitting the arguments |
x |
This column is expected to contain a vector of dates as per the
specification of |
y |
This is a column containing the measurement variable. If the column
name differs from the default (i.e. |
seasClim |
The default for this argument assumes that the seasonal
climatology column is called |
threshClim |
The threshold climatology column should be called
|
threshClim2 |
If one wishes to provide a second climatology threshold filter for the more rigorous detection of events, a vector or column containing logical values (i.e. TRUE FALSE) should be provided here. By default this argument is ignored. It's primary purpose is to allow for the inclusion of tMin and tMax thresholds. |
minDuration |
The minimum duration for acceptance of detected events.
The default is |
minDuration2 |
The minimum duration for acceptance of events after
filtering by |
joinAcrossGaps |
Boolean switch indicating whether to join events which
occur before/after a short gap as specified by |
maxGap |
The maximum length of gap allowed for the joining of MHWs. The
default is |
maxGap2 |
The maximum gap length after applying both thresholds.
By default |
coldSpells |
Boolean specifying if the code should detect cold events
instead of warm events. The default is |
protoEvents |
Boolean. With the default setting of |
categories |
Boolean. Rather than using |
roundRes |
This argument allows the user to choose how many decimal places
the MHW metric outputs will be rounded to. Default is 4. To
prevent rounding set |
... |
Allows unused arguments to pass through the functions. |
This function assumes that the input time series consists of continuous
daily values with few missing values. Time ranges which start and end
part-way through the calendar year are supported. The accompanying function
ts2clm3
aids in the preparation of a time series that is
suitable for use with detect_event3
, although this may also be accomplished
'by hand' as long as the criteria are met as discussed in the documentation
to ts2clm3
.
The calculation of onset and decline rates assumes that the events
started a half-day before the start day and ended a half-day after the
end-day. This is consistent with the duration definition as implemented,
which assumes duration = end day - start day + 1. An event that is already
present at the beginning of a time series, or an event that is still present
at the end of a time series, will report the rate of onset or the rate of
decline as NA
, as it is impossible to know what the temperature half a
day before or after the start or end of the event is.
For the purposes of event detection, any missing temperature values not
interpolated over (through optional maxPadLength
in ts2clm3
)
will be set equal to the seasonal climatology. This means they will trigger
the end/start of any adjacent temperature values which satisfy the event
definition criteria.
If the code is used to detect cold events (coldSpells = TRUE
),
then it works just as for heat waves except that events are detected as
deviations below the (100 - pctile)th percentile (e.g., the 10th instead of
90th) for at least 5 days. Intensities are reported as negative values and
represent the temperature anomaly below climatology.
The original Python algorithm was written by Eric Oliver, Institute for Marine and Antarctic Studies, University of Tasmania, Feb 2015, and is documented by Hobday et al. (2016). The marine cold spell option was implemented in version 0.13 (21 Nov 2015) of the Python module as a result of our preparation of Schlegel et al. (2017), wherein the cold events receive a brief overview.
The function will return a list of two data.frames,
climatology
and event
, which are, surprisingly, the climatology
and event results, respectively. The climatology contains the full time series of
daily temperatures, as well as the the seasonal climatology, the threshold
and various aspects of the events that were detected. The software was
designed for detecting extreme thermal events, and the units specified below
reflect that intended purpose. However, various other kinds of extreme
events may be detected according to the specifications, and if that is the
case, the appropriate units need to be determined by the user.
Note that the exact content of the output depends on specific combinations
of the arguments protoEvents
, categories
, and climatology
:
threshCriterion |
Boolean indicating if |
durationCriterion |
Boolean indicating whether periods of consecutive
|
event |
Boolean indicating if all criteria that define an extreme event are met. |
event_no |
A sequential number indicating the ID and order of occurrence of the events. |
intensity |
The difference between |
category |
The category classification per day. Only added
if |
The event
results are summarised using a range of event metrics:
event_no |
A sequential number indicating the ID and order of the events. |
index_start |
Start index of event. |
index_end |
End index of event. |
duration |
Duration of event [days]. |
date_start |
Start date of event [date]. |
date_end |
End date of event [date]. |
date_peak |
Date of event peak [date]. |
intensity_mean |
Mean intensity [deg. C]. |
intensity_max |
Maximum (peak) intensity [deg. C]. |
intensity_var |
Intensity variability (standard deviation) [deg. C]. |
intensity_cumulative |
Cumulative intensity [deg. C x days]. |
rate_onset |
Onset rate of event [deg. C / day]. |
rate_decline |
Decline rate of event [deg. C / day]. |
intensity_max_relThresh
, intensity_mean_relThresh
,
intensity_var_relThresh
, and intensity_cumulative_relThresh
are as above except relative to the threshold (e.g., 90th percentile) rather
than the seasonal climatology.
intensity_max_abs
, intensity_mean_abs
, intensity_var_abs
, and
intensity_cumulative_abs
are as above except as absolute magnitudes
rather than relative to the seasonal climatology or threshold.
Note that rate_onset
and rate_decline
will return NA
when the event begins/ends on the first/last day of the time series. This
may be particularly evident when the function is applied to large gridded
data sets. Although the other metrics do not contain any errors and
provide sensible values, please take this into account in its
interpretation.
Albertus J. Smit, Robert W. Schlegel, Eric C. J. Oliver
Hobday, A.J. et al. (2016). A hierarchical approach to defining marine heatwaves, Progress in Oceanography, 141, pp. 227-238, doi:10.1016/j.pocean.2015.12.014
Schlegel, R. W., Oliver, C. J., Wernberg, T. W., Smit, A. J. (2017). Nearshore and offshore co-occurrences of marine heatwaves and cold-spells. Progress in Oceanography, 151, pp. 189-205, doi:10.1016/j.pocean.2017.01.004
data.table::setDTthreads(threads = 1) # optimise for your code and local computer res_clim <- ts2clm3(sst_WA, climatologyPeriod = c("1983-01-01", "2012-12-31")) out <- detect_event3(res_clim) # show a portion of the climatology: out$climatology[1:10, ] # show some of the heat waves: out$event[1:5, 1:10] # Or if one wants to calculate MCSs res_clim <- ts2clm3(sst_WA, climatologyPeriod = c("1983-01-01", "2012-12-31"), pctile = 10) out <- detect_event3(res_clim, coldSpells = TRUE) # show a portion of the climatology: out$climatology[1:10, ] # show some of the cold-spells: out$event[1:5, 1:10] # It is also possible to give two separate sets of threshold criteria # To use a second static threshold we first use the exceedance function thresh_19 <- exceedance(sst_Med, threshold = 19, minDuration = 10, maxGap = 0)$threshold # Then we use that output when detecting our events events_19 <- detect_event3(ts2clm3(sst_Med, climatologyPeriod = c("1982-01-01", "2011-12-31")), threshClim2 = thresh_19$exceedance, minDuration2 = 10, maxGap2 = 0) # If we want to use two different percentile thresholds thresh_95 <- detect_event3(ts2clm3(sst_Med, pctile = 95, climatologyPeriod = c("1982-01-01", "2011-12-31")), minDuration = 2, maxGap = 0)$climatology # Then we use that output when detecting our events events_95 <- detect_event3(ts2clm3(sst_Med, climatologyPeriod = c("1982-01-01", "2011-12-31")), threshClim2 = thresh_95$event, minDuration2 = 2, maxGap2 = 0)
data.table::setDTthreads(threads = 1) # optimise for your code and local computer res_clim <- ts2clm3(sst_WA, climatologyPeriod = c("1983-01-01", "2012-12-31")) out <- detect_event3(res_clim) # show a portion of the climatology: out$climatology[1:10, ] # show some of the heat waves: out$event[1:5, 1:10] # Or if one wants to calculate MCSs res_clim <- ts2clm3(sst_WA, climatologyPeriod = c("1983-01-01", "2012-12-31"), pctile = 10) out <- detect_event3(res_clim, coldSpells = TRUE) # show a portion of the climatology: out$climatology[1:10, ] # show some of the cold-spells: out$event[1:5, 1:10] # It is also possible to give two separate sets of threshold criteria # To use a second static threshold we first use the exceedance function thresh_19 <- exceedance(sst_Med, threshold = 19, minDuration = 10, maxGap = 0)$threshold # Then we use that output when detecting our events events_19 <- detect_event3(ts2clm3(sst_Med, climatologyPeriod = c("1982-01-01", "2011-12-31")), threshClim2 = thresh_19$exceedance, minDuration2 = 10, maxGap2 = 0) # If we want to use two different percentile thresholds thresh_95 <- detect_event3(ts2clm3(sst_Med, pctile = 95, climatologyPeriod = c("1982-01-01", "2011-12-31")), minDuration = 2, maxGap = 0)$climatology # Then we use that output when detecting our events events_95 <- detect_event3(ts2clm3(sst_Med, climatologyPeriod = c("1982-01-01", "2011-12-31")), threshClim2 = thresh_95$event, minDuration2 = 2, maxGap2 = 0)
Creates a graph of warm or cold events as per the second row of Figure 3 in Hobday et al. (2016).
event_line( data, x = t, y = temp, metric = intensity_cumulative, min_duration = 5, spread = 150, start_date = NULL, end_date = NULL, category = FALSE, x_axis_title = NULL, x_axis_text_angle = NULL, y_axis_title = NULL, y_axis_range = NULL, line_colours = NULL )
event_line( data, x = t, y = temp, metric = intensity_cumulative, min_duration = 5, spread = 150, start_date = NULL, end_date = NULL, category = FALSE, x_axis_title = NULL, x_axis_text_angle = NULL, y_axis_title = NULL, y_axis_range = NULL, line_colours = NULL )
data |
The function receives the full (list) output from the
|
x |
This column is expected to contain a vector of dates as per the
specification of |
y |
This is a column containing the measurement variable. If the column
name differs from the default (i.e. |
metric |
This tells the function how to choose the event that should be
highlighted as the 'greatest' of the events in the chosen period.
Partial name matching is currently not supported so please specify the metric
name precisely. The default is |
min_duration |
The minimum duration (days) the event must be for it to qualify as a heatwave or cold-spell. |
spread |
The number of days leading and trailing the largest event
(as per |
start_date |
The start date of a period of time within which the largest
event (as per |
end_date |
The end date of a period of time within which the largest
event (as per |
category |
A boolean choice of TRUE or FALSE. If set to FALSE (default) event_line() will
produce a figure as per the second row of Figure 3 in Hobday et al. (2016). If set to TRUE a
figure showing the different categories of the MHWs in the chosen period, highlighted as
seen in Figure 3 of Hobday et al. (in review), will be produced. If |
x_axis_title |
If one would like to add a title for the x-axis it may be provided here. |
x_axis_text_angle |
If one would like to change the angle of the x-axis text, provide the angle here as a single numeric value. |
y_axis_title |
Provide text here if one would like a title for the y-axis other than "Temperature °C" (default) |
y_axis_range |
If one would like to control the y-axis range, provide the desired limits here as two numeric values (e.g. c(20, 30)). |
line_colours |
Provide a vector of colours here for the line geoms on the plot.
The default for the base plot is c("black", "blue", "darkgreen"), and for categories
it is: c("black", "gray20", "darkgreen", "darkgreen", "darkgreen", "darkgreen"). Note that
three ( |
The function will return a line plot indicating the climatology,
threshold and temperature, with the hot or cold events that meet the
specifications of Hobday et al. (2016) shaded in as appropriate. The plotting
of hot or cold events depends on which option is specified in detect_event
.
The top event detect during the selected time period will be visible in a
brighter colour. This function differs in use from geom_flame
in that it creates a stand alone figure. The benefit of this being
that one must not have any prior knowledge of ggplot2 to create the figure.
Robert W. Schlegel
Hobday, A.J. et al. (2016), A hierarchical approach to defining marine heatwaves, Progress in Oceanography, 141, pp. 227-238, doi: 10.1016/j.pocean.2015.12.014
ts <- ts2clm(sst_WA, climatologyPeriod = c("1983-01-01", "2012-12-31")) res <- detect_event(ts) event_line(res, spread = 100, metric = duration, start_date = "2010-12-01", end_date = "2011-06-30") event_line(res, spread = 100, start_date = "2010-12-01", end_date = "2011-06-30", category = TRUE) event_line(res, spread = 100, start_date = "2010-12-01", end_date = "2011-06-30", category = TRUE, line_colours = c("black", "blue", "gray20", "gray20", "gray20", "gray20"))
ts <- ts2clm(sst_WA, climatologyPeriod = c("1983-01-01", "2012-12-31")) res <- detect_event(ts) event_line(res, spread = 100, metric = duration, start_date = "2010-12-01", end_date = "2011-06-30") event_line(res, spread = 100, start_date = "2010-12-01", end_date = "2011-06-30", category = TRUE) event_line(res, spread = 100, start_date = "2010-12-01", end_date = "2011-06-30", category = TRUE, line_colours = c("black", "blue", "gray20", "gray20", "gray20", "gray20"))
Detect consecutive days in exceedance above or below of a given threshold.
exceedance( data, x = t, y = temp, threshold, below = FALSE, minDuration = 5, joinAcrossGaps = TRUE, maxGap = 2, maxPadLength = FALSE, roundRes = 4, returnDF = TRUE )
exceedance( data, x = t, y = temp, threshold, below = FALSE, minDuration = 5, joinAcrossGaps = TRUE, maxGap = 2, maxPadLength = FALSE, roundRes = 4, returnDF = TRUE )
data |
A data frame with at least the two following columns:
a |
x |
This column is expected to contain a vector of dates as per the
specification of |
y |
This is a column containing the measurement variable. If the column
name differs from the default (i.e. |
threshold |
The static threshold used to determine how many consecutive days are in exceedance of the temperature of interest. |
below |
Default is |
minDuration |
Minimum duration that temperatures must be in exceedance
of the |
joinAcrossGaps |
A TRUE/FALSE statement that indicates whether
or not to join consecutive days of temperatures in exceedance of the
|
maxGap |
The maximum length of the gap across which to connect
consecutive days in exceedance of the |
maxPadLength |
Specifies the maximum length of days over which to
interpolate (pad) missing data (specified as |
roundRes |
This argument allows the user to choose how many decimal places
the exceedance metric outputs will be rounded to. Default is 4. To
prevent rounding set |
returnDF |
The default ( |
This function assumes that the input time series consists of continuous
daily temperatures, with few missing values. The accompanying function
make_whole_fast
aids in the preparation of a time series that is
suitable for use with exceedance
, although this may also be accomplished
'by hand' as long as the criteria are met as discussed in the documentation
to make_whole_fast
.
Future versions seek to accommodate monthly and annual time series, too.
The calculation of onset and decline rates assumes that exceedance of the
threshold
started a half-day before the start day and ended a half-day
after the end-day. This is consistent with the duration definition as implemented,
which assumes duration = end day - start day + 1.
For the purposes of exceedance detection, any missing temperature values not
interpolated over (through optional maxPadLength
) will remain as
NA
. This means they will trigger the end of an exceedance if the adjacent
temperature values are in exceedance of the threshold
.
If the function is used to detect consecutive days of temperature under
the given theshold
, these temperatures are then taken as being in
exceedance below the threshold
as there is no antonym in the English
language for 'exceedance'.
This function is based largely on the detect_event
function found in this
package, which was ported from the Python algorithm that was written by Eric
Oliver, Institute for Marine and Antarctic Studies, University of Tasmania,
Feb 2015, and is documented by Hobday et al. (2016).
The function will return a list of two data.frames.
The first being threshold
, which shows the daily temperatures and on which
specific days the given threshold
was exceeded. The second component of the
list is exceedance
, which shows a medley of statistics for each discrete
group of days in exceedance of the given threshold
. Note that any additional
columns left in the data frame given to this function will be output in the
threshold
component of the output. For example, if one uses
ts2clm
to prepare a time series for analysis and leaves
in the doy
column, this column will appear in the output.
The information shown in the threshold
component is:
t |
The date of the temperature measurement. This variable may named
differently if an alternative name is supplied to the function's |
temp |
Temperature on the specified date [deg. C]. This variable may
named differently if an alternative name is supplied to the function's |
thresh |
The static |
thresh_criterion |
Boolean indicating if |
duration_criterion |
Boolean indicating whether periods of consecutive
|
exceedance |
Boolean indicting if all criteria that define a discrete
group in exceedance of the |
exceedance_no |
A sequential number indicating the ID and order of occurrence of exceedances. |
The individual exceedances are summarised using the following metrics:
exceedance_no |
The same sequential number indicating the ID and
order of the exceedance as found in the |
index_start |
Row number on which exceedance starts. |
index_peak |
Row number on which exceedance peaks. |
index_end |
Row number on which exceedance ends. |
duration |
Duration of exceedance [days]. |
date_start |
Start date of exceedance [date]. |
date_peak |
Date of exceedance peak [date]. |
date_end |
End date of exceedance [date]. |
intensity_mean |
Mean intensity [deg. C]. |
intensity_max |
Maximum (peak) intensity [deg. C]. |
intensity_var |
Intensity standard deviation [deg. C]. |
intensity_cumulative |
Cumulative intensity [deg. C x days]. |
rate_onset |
Onset rate of exceedance [deg. C / day]. |
rate_decline |
Decline rate of exceedance [deg. C / day]. |
intensity_max_abs
, intensity_mean_abs
, intensity_var_abs
,
and intensity_cum_abs
are as above except as absolute magnitudes rather
than relative to the threshold.
Robert W. Schlegel, Albertus J. Smit
res <- exceedance(sst_WA, threshold = 25) # show first ten days of daily data: res$threshold[1:10, ] # show first five exceedances: res$exceedance[1:5, ]
res <- exceedance(sst_WA, threshold = 25) # show first ten days of daily data: res$threshold[1:10, ] # show first five exceedances: res$exceedance[1:5, ]
This function will create polygons between two lines. If given a
temperature and theshold time series, like that produced by
detect_event
, the output will meet the specifications
of Hobday et al. (2016) shown as 'flame polygons.' If one wishes to
plot polygons below a given threshold, and not above, switch the values
being fed to the y
and y2
aesthetics. This function differs
in use from event_line
in that it must be created as a
ggplot
'geom' object. The benefit of this being that one may add
additional information to the figure as geom layers to ggplot2 graphs
as may be necessary.
geom_flame( mapping = NULL, data = NULL, stat = "identity", position = "identity", ..., n = 0, n_gap = 0, na.rm = FALSE, show.legend = NA, inherit.aes = TRUE )
geom_flame( mapping = NULL, data = NULL, stat = "identity", position = "identity", ..., n = 0, n_gap = 0, na.rm = FALSE, show.legend = NA, inherit.aes = TRUE )
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If NULL, the default, the data is inherited from the plot data as specified
in the call to A data.frame, or other object, will override the plot data. All objects will
be fortified to produce a data frame. See A function will be called with a single argument, the plot data. The return
value must be a |
stat |
The statistical transformation to use on the data for this layer, as a string. |
position |
Position adjustment, either as a string, or the result of a call to a position adjustment function. |
... |
other arguments passed on to |
n |
The number of steps along the x-axis (i.e. in a daily time series this
would be days) required before the area between |
n_gap |
The number of steps along the x-axis (i.e. in a daily time series this
would be days) within which to allow |
na.rm |
If |
show.legend |
Logical. Should this layer be included in the legends? |
inherit.aes |
If |
geom_flame
understands the following aesthetics (required aesthetics
are in bold):
x
y
y2
colour
fill
linewidth
alpha
linetype
Robert W. Schlegel
Hobday, A.J. et al. (2016), A hierarchical approach to defining marine heatwaves, Progress in Oceanography, 141, pp. 227-238, doi: 10.1016/j.pocean.2015.12.014
event_line
for a non-ggplot2 based flame function.
ts <- ts2clm(sst_WA, climatologyPeriod = c("1983-01-01", "2012-12-31")) res <- detect_event(ts) mhw <- res$clim mhw <- mhw[10580:10690,] library(ggplot2) ggplot(mhw, aes(x = t, y = temp)) + geom_flame(aes(y2 = thresh)) + geom_text(aes(x = as.Date("2011-02-01"), y = 28, label = "That's not a heatwave.\nThis, is a heatwave.")) + xlab("Date") + ylab(expression(paste("Temperature [", degree, "C]")))
ts <- ts2clm(sst_WA, climatologyPeriod = c("1983-01-01", "2012-12-31")) res <- detect_event(ts) mhw <- res$clim mhw <- mhw[10580:10690,] library(ggplot2) ggplot(mhw, aes(x = t, y = temp)) + geom_flame(aes(y2 = thresh)) + geom_text(aes(x = as.Date("2011-02-01"), y = 28, label = "That's not a heatwave.\nThis, is a heatwave.")) + xlab("Date") + ylab(expression(paste("Temperature [", degree, "C]")))
The function will return a graph of the intensity of the selected
metric along the *y*-axis versus a time variable along the *x*-axis.
The number of top events (n
) from the chosen metric may be highlighted
in a brighter colour with the aesthetic value colour_n
.
This function differs in use from lolli_plot
in that it must be created as a ggplot2 'geom' object. The benefit of this being
that one may add additional information layer by layer to the figure as
geoms as necessary.
geom_lolli( mapping = NULL, data = NULL, ..., n = 0, na.rm = FALSE, show.legend = NA, inherit.aes = TRUE )
geom_lolli( mapping = NULL, data = NULL, ..., n = 0, na.rm = FALSE, show.legend = NA, inherit.aes = TRUE )
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options:
|
... |
other arguments passed on to |
n |
The number of top events to highlight as based on the value provided
to |
na.rm |
If |
show.legend |
Logical. Should this layer be included in the legends? |
inherit.aes |
If |
geom_lolli
understands the following aesthetics (required aesthetics
are in bold):
x
y
alpha
color
linetype
size
shape
stroke
fill
colour_n
While this value may be used as an aesthetic, it works
better as a parameter for this function because it is set to use discrete values.
One may provide continuous values to colour_n
but remember that one may
not provide multiple continuous or discrete scales to a single ggplot2 object.
Therefore, if one provides a continuous value to aes(colour)
, the values
supplied to colour_n
must be discrete. ggplot2
will attempt to
do this automatically.
Robert W. Schlegel
lolli_plot
for a non-geom based lolliplot function.
ts <- ts2clm(sst_WA, climatologyPeriod = c("1983-01-01", "2012-12-31")) res <- detect_event(ts) mhw <- res$event library(ggplot2) # Height of lollis represent event durations and their colours # are mapped to the events' cumulative intensity: ggplot(mhw, aes(x = date_peak, y = duration)) + geom_lolli(aes(colour = intensity_cumulative)) + scale_color_distiller(palette = "Spectral", name = "Cumulative \nintensity") + xlab("Date") + ylab("Event duration [days]") # Height of lollis represent event durations and the top three (longest) # lollis are highlighted in red: ggplot(mhw, aes(x = date_peak, y = duration)) + geom_lolli(n = 3, colour_n = "red") + scale_color_distiller(palette = "Spectral") + xlab("Peak date") + ylab("Event duration [days]") # Because this is a proper geom, any number of ill-advised things # may be done with it: ggplot(mhw, aes(x = event_no, y = intensity_max)) + geom_lolli(shape = 5, aes(colour = rate_onset), linetype = "dotted") + scale_color_distiller(palette = "RdYlGn", name = "Rate \nonset") + xlab("Event number") + ylab("Max intensity [degree C]")
ts <- ts2clm(sst_WA, climatologyPeriod = c("1983-01-01", "2012-12-31")) res <- detect_event(ts) mhw <- res$event library(ggplot2) # Height of lollis represent event durations and their colours # are mapped to the events' cumulative intensity: ggplot(mhw, aes(x = date_peak, y = duration)) + geom_lolli(aes(colour = intensity_cumulative)) + scale_color_distiller(palette = "Spectral", name = "Cumulative \nintensity") + xlab("Date") + ylab("Event duration [days]") # Height of lollis represent event durations and the top three (longest) # lollis are highlighted in red: ggplot(mhw, aes(x = date_peak, y = duration)) + geom_lolli(n = 3, colour_n = "red") + scale_color_distiller(palette = "Spectral") + xlab("Peak date") + ylab("Event duration [days]") # Because this is a proper geom, any number of ill-advised things # may be done with it: ggplot(mhw, aes(x = event_no, y = intensity_max)) + geom_lolli(shape = 5, aes(colour = rate_onset), linetype = "dotted") + scale_color_distiller(palette = "RdYlGn", name = "Rate \nonset") + xlab("Event number") + ylab("Max intensity [degree C]")
Visualise a timeline of several possible event metrics as 'lollipop' graphs.
lolli_plot(data, xaxis = date_peak, metric = intensity_max, event_count = 3)
lolli_plot(data, xaxis = date_peak, metric = intensity_max, event_count = 3)
data |
Output from the |
xaxis |
The name of a column from the |
metric |
The name of a column from the |
event_count |
The number of top events to highlight, as determined by the
column given to |
The function will return a graph of the intensity of the selected
metric
along the y-axis and the chosen xaxis
value.
The number of top events as per event_count
will be highlighted
in a brighter colour. This function differs in use from geom_lolli
in that it creates a stand-alone figure. The benefit of this being
that one does not need any prior knowledge of ggplot2
to create the figure.
Albertus J. Smit and Robert W. Schlegel
ts <- ts2clm(sst_WA, climatologyPeriod = c("1983-01-01", "2012-12-31")) res <- detect_event(ts) library(ggplot2) # The default output lolli_plot(res)
ts <- ts2clm(sst_WA, climatologyPeriod = c("1983-01-01", "2012-12-31")) res <- detect_event(ts) library(ggplot2) # The default output lolli_plot(res)
A dataset containing the sea surface temperature (in degrees Celsius) and date for the Mediterranean region from 1982-01-01 to 2022-12-31.
sst_Med
sst_Med
A dataframe with 14975 rows and 2 variables:
date, as.Date() format
SST, in degrees Celsius
...
lon/lat: 9.125/43.625
https://www.ncei.noaa.gov/products/optimum-interpolation-sst
A dataset containing the sea surface temperature (in degrees Celsius) and date for the Northwest Atlantic region from 1982-01-01 to 2022-12-31.
sst_NW_Atl
sst_NW_Atl
A dataframe with 14975 rows and 2 variables:
date, as.Date() format
SST, in degrees Celsius
...
lon/lat: -66.875/43.125
https://www.ncei.noaa.gov/products/optimum-interpolation-sst
A dataset containing the sea surface temperature (in degrees Celsius) and date for the Western Australian region from 1982-01-01 to 2022-12-31.
sst_WA
sst_WA
A dataframe with 14975 rows and 2 variables:
date, as.Date() format
SST, in degrees Celsius
...
lon/lat: 112.625/-29.375
https://www.ncei.noaa.gov/products/optimum-interpolation-sst
Creates a daily climatology from a time series of daily temperatures using a user-specified sliding window for the mean and threshold calculation, followed by an optional moving average smoother as used by Hobday et al. (2016).
ts2clm( data, x = t, y = temp, climatologyPeriod, maxPadLength = FALSE, windowHalfWidth = 5, pctile = 90, smoothPercentile = TRUE, smoothPercentileWidth = 31, clmOnly = FALSE, var = FALSE, roundClm = 4, returnDF = TRUE )
ts2clm( data, x = t, y = temp, climatologyPeriod, maxPadLength = FALSE, windowHalfWidth = 5, pctile = 90, smoothPercentile = TRUE, smoothPercentileWidth = 31, clmOnly = FALSE, var = FALSE, roundClm = 4, returnDF = TRUE )
data |
A data frame with two columns. In the default setting (i.e. omitting
the arguments |
x |
This column is expected to contain a vector of dates. If a column
headed |
y |
This is a column containing the measurement variable. If the column
name differs from the default (i.e. |
climatologyPeriod |
Required. To this argument should be passed two values (see example below). The first value should be the chosen date for the start of the climatology period, and the second value the end date of said period. This chosen period (preferably 30 years in length) is then used to calculate the seasonal cycle and the extreme value threshold. Note that these values are always provided as dates, even if hourly data are being input into the function. |
maxPadLength |
Specifies the maximum length of days over which to
interpolate (pad) missing data (specified as |
windowHalfWidth |
Width of sliding window about day-of-year (to one
side of the center day-of-year) used for the pooling of values and
calculation of climatology and threshold percentile. Default is |
pctile |
Threshold percentile (%) for detection of events (MHWs).
Default is |
smoothPercentile |
Boolean switch selecting whether to smooth the
climatology and threshold percentile time series with a moving average of
|
smoothPercentileWidth |
Full width of moving average window for smoothing
climatology and threshold. The default is |
clmOnly |
Choose to calculate and return only the climatologies.
The default is |
var |
This argument has been introduced to allow the user to choose if
the variance of the seasonal signal per doy should be calculated. The default of
|
roundClm |
This argument allows the user to choose how many decimal places
the |
returnDF |
The default ( |
This function assumes that the input time series consists of continuous daily values with few missing values. Time ranges which start and end part-way through the calendar year are supported.
It is recommended that a period of at least 30 years is specified in
order to produce a climatology that smooths out any decadal thermal
periodicities that may be present. When calculated over at least 30 years of
data, such a climatology is called a 'climatological normal.' It is further
advised that full the start and end dates for the climatology period result
in full years, e.g. "1982-01-01" to "2011-12-31" or "1982-07-01" to
"2012-06-30"; if not, this may result in an unequal weighting of data
belonging with certain months within a time series. A daily climatology will
be created; that is, the climatology will be comprised of one mean
temperature for each day of the year (365 or 366 days, depending on how
leap years are dealt with), and the mean will be based on a sample size that
is a function of the length of time determined by the start and end values
given to climatologyPeriod
and the width of the sliding window
specified in windowHalfWidth
.
This function supports leap years. This is done by ignoring Feb 29s for the initial calculation of the climatology and threshold. The values for Feb 29 are then linearly interpolated from the values for Feb 28 and Mar 1.
Previous versions of ts2clm()
tested to see if some rows are
duplicated, or if replicate temperature readings are present per day, but
this has now been disabled. Should the user be concerned about such repeated
measurements, we suggest that the necessary checks and fixes are implemented
prior to feeding the time series to ts2clm()
.
The original Python algorithm was written by Eric Oliver, Institute for Marine and Antarctic Studies, University of Tasmania, Feb 2015, and is documented by Hobday et al. (2016).
The function will return a tibble (see the tidyverse
) with the
input time series and the newly calculated climatology. The climatology contains
the daily climatology and the threshold for calculating MHWs. The software was
designed for creating climatologies of daily temperatures, and the units
specified below reflect that intended purpose. However, various other kinds
of climatologies may be created, and if that is the case, the appropriate
units need to be determined by the user.
doy |
Julian day (day-of-year). For non-leap years it runs 1...59 and 61...366, while leap years run 1...366. |
t |
The date vector in the original time series supplied in |
temp |
The measurement vector as per the the original |
seas |
Daily climatological cycle [deg. C]. |
thresh |
Daily varying threshold (e.g., 90th
percentile) [deg. C]. This is used in |
var |
Daily varying variance (standard deviation) [deg. C]. This
column is not returned if |
Should clmOnly
be enabled, only the 365 or 366 day climatology will be
returned.
Albertus J. Smit, Robert W. Schlegel, Eric C. J. Oliver
Hobday, A.J. et al. (2016). A hierarchical approach to defining marine heatwaves, Progress in Oceanography, 141, pp. 227-238, doi:10.1016/j.pocean.2015.12.014
res <- ts2clm(sst_WA, climatologyPeriod = c("1983-01-01", "2012-12-31")) res[1:10, ] # Or if one only wants the 366 day climatology res_clim <- ts2clm(sst_WA, climatologyPeriod = c("1983-01-01", "2012-12-31"), clmOnly = TRUE) res_clim[1:10, ] # Or if one wants the variance column included in the results res_var <- ts2clm(sst_WA, climatologyPeriod = c("1983-01-01", "2012-12-31"), var = TRUE) res_var[1:10, ]
res <- ts2clm(sst_WA, climatologyPeriod = c("1983-01-01", "2012-12-31")) res[1:10, ] # Or if one only wants the 366 day climatology res_clim <- ts2clm(sst_WA, climatologyPeriod = c("1983-01-01", "2012-12-31"), clmOnly = TRUE) res_clim[1:10, ] # Or if one wants the variance column included in the results res_var <- ts2clm(sst_WA, climatologyPeriod = c("1983-01-01", "2012-12-31"), var = TRUE) res_var[1:10, ]
This is the fully data.table-based version of ts2clm
. The function creates
a daily climatology from a time series of daily temperatures using a
user-specified sliding window for the mean and threshold calculation, followed
by an optional moving average smoother as used by Hobday et al. (2016).
ts2clm3( data, x = t, y = temp, climatologyPeriod, maxPadLength = FALSE, windowHalfWidth = 5, pctile = 90, smoothPercentile = TRUE, smoothPercentileWidth = 31, clmOnly = FALSE, var = FALSE, roundClm = 4, ... )
ts2clm3( data, x = t, y = temp, climatologyPeriod, maxPadLength = FALSE, windowHalfWidth = 5, pctile = 90, smoothPercentile = TRUE, smoothPercentileWidth = 31, clmOnly = FALSE, var = FALSE, roundClm = 4, ... )
data |
A data frame with at least two columns. In the default setting
(i.e. omitting the arguments |
x |
This column is expected to contain a vector of dates. If a column
headed |
y |
This is a column containing the measurement variable. If the column
name differs from the default (i.e. |
climatologyPeriod |
Required. To this argument should be passed two values (see example below). The first value should be the chosen date for the start of the climatology period, and the second value the end date of said period. This chosen period (preferably 30 years in length) is then used to calculate the seasonal cycle and the extreme value threshold. |
maxPadLength |
Specifies the maximum length of days over which to apply
linear interpolation (padding) across the missing values (specified as |
windowHalfWidth |
Width of sliding window about day-of-year (to one
side of the center day-of-year) used for the pooling of values and
calculation of climatology and threshold percentile. Default is |
pctile |
Threshold percentile (%) for detection of events (MHWs).
Default is |
smoothPercentile |
Boolean. Select whether to smooth the climatology
and threshold percentile time series with a moving average of
|
smoothPercentileWidth |
Full width of moving average window for smoothing
the climatology and threshold. The default is |
clmOnly |
Boolean. Choose to calculate and return only the climatologies.
The default is |
var |
Boolean. This argument has been introduced to allow the user to
choose if the variance of the seasonal signal per doy should be calculated.
The default of |
roundClm |
This argument allows the user to choose how many decimal places
the |
... |
Allows unused arguments to pass through the functions. |
This function assumes that the input time series consists of continuous daily values with few missing values. Time ranges which start and end part-way through the calendar year are supported.
It is recommended that a period of at least 30 years is specified in
order to produce a climatology that smooths out any decadal thermal
periodicities that may be present. When calculated over at least 30 years of
data, such a climatology is called a 'climatological normal.' It is further
advised that full the start and end dates for the climatology period result
in full years, e.g. "1982-01-01" to "2011-12-31" or "1982-07-01" to
"2012-06-30"; if not, this may result in an unequal weighting of data
belonging with certain months within a time series. A daily climatology will
be created; that is, the climatology will be comprised of one mean
temperature for each day of the year (365 or 366 days, depending on how
leap years are dealt with), and the mean will be based on a sample size that
is a function of the length of time determined by the start and end values
given to climatologyPeriod
and the width of the sliding window
specified in windowHalfWidth
.
This function supports leap years. This is done by ignoring Feb 29s for the initial calculation of the climatology and threshold. The values for Feb 29 are then linearly interpolated from the values for Feb 28 and Mar 1.
Previous versions of ts2clm()
tested to see if some rows are
duplicated, or if replicate temperature readings are present per day, but
this has now been disabled. Should the user be concerned about such repeated
measurements, we suggest that the necessary checks and fixes are implemented
prior to feeding the time series to ts2clm()
.
The original Python algorithm was written by Eric Oliver, Institute for Marine and Antarctic Studies, University of Tasmania, Feb 2015, and is documented by Hobday et al. (2016).
The function will return a data.table (see the data.table
) with the
input time series and the newly calculated climatology. The climatology contains
the daily climatology and the threshold for calculating MHWs. The software was
designed for creating climatologies of daily temperatures, and the units
specified below reflect that intended purpose. However, various other kinds
of climatologies may be created, and if that is the case, the appropriate
units need to be determined by the user.
doy |
Julian day (day-of-year) returned when |
t |
The date vector in the original time series supplied in |
temp |
The measurement vector as per the the original |
seas |
Daily climatological cycle [deg. C]. |
thresh |
Daily varying threshold (e.g., 90th
percentile) [deg. C]. This is used in |
var |
Daily varying variance (standard deviation) [deg. C]. This
column is not returned if |
Should clmOnly
be enabled, only the 365 or 366 day climatology will be
returned.
Albertus J. Smit, Robert W. Schlegel, Eric C. J. Oliver
Hobday, A.J. et al. (2016). A hierarchical approach to defining marine heatwaves, Progress in Oceanography, 141, pp. 227-238, doi:10.1016/j.pocean.2015.12.014
data.table::setDTthreads(threads = 1) # optimise for your code and local computer res <- ts2clm3(sst_WA, climatologyPeriod = c("1983-01-01", "2012-12-31")) res[1:10, ] # Or if one only wants the 366 day climatology res_clim <- ts2clm3(sst_WA, climatologyPeriod = c("1983-01-01", "2012-12-31"), clmOnly = TRUE) res_clim[1:10, ] # Or if one wants the variance column included in the results res_var <- ts2clm3(sst_WA, climatologyPeriod = c("1983-01-01", "2012-12-31"), var = TRUE) res_var[1:10, ]
data.table::setDTthreads(threads = 1) # optimise for your code and local computer res <- ts2clm3(sst_WA, climatologyPeriod = c("1983-01-01", "2012-12-31")) res[1:10, ] # Or if one only wants the 366 day climatology res_clim <- ts2clm3(sst_WA, climatologyPeriod = c("1983-01-01", "2012-12-31"), clmOnly = TRUE) res_clim[1:10, ] # Or if one wants the variance column included in the results res_var <- ts2clm3(sst_WA, climatologyPeriod = c("1983-01-01", "2012-12-31"), var = TRUE) res_var[1:10, ]