Title: Calculate Loading Data to Tampa Bay
Description: Loading data from major sources to Tampa Bay are calculated on a monthly or annual basis. Major sources include domestic point source (reuse, end of pipe), industrial point source, material losses, non-point sources (MS4), atmospheric deposition, and groundwater.
Authors: Marcus Beck [aut, cre], Ed Sherwood [aut], Ray Pribble [aut]
Maintainer: Marcus Beck <[email protected]>
License: MIT + file LICENSE
Version: 0.0.0.9000
Built: 2024-11-07 05:29:04 UTC
Source: https://github.com/tbep-tech/tbeploads
Data frame of distances from segment locations to National Weather Service (NWS) sites
ad_distance
A data.frame
Used for estimating atmospheric deposition. The data frame contains the following columns:
segment
: Numeric identifier for the segment location
seg_x
: Numeric value for the x-coordinate of the segment location (WGS 84, UTM Zone 17N, CRS 32617)
seg_y
: Numeric value for the y-coordinate of the segment location (WGS 84, UTM Zone 17N, CRS 32617)
matchsit
: Numeric for the NWS site that matches the segment location
distance
: Numeric value for the distance (m) between the segment coordinate and NWS site
invdist2
: Numeric value for the inverse distance squared (1/m^2) between the segment coordinate and NWS site
area
: Numeric value for the area of the segment (ha)
Segment numbers are 1-7 for Old Tampa Bay, Hillsborough Bay, Middle Tampa Bay, Lower Tampa Bay, Boca Ciega Bay, Terra Ceia Bay, and Manatee River.
ad_distance
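The invdist2 column can be checked directly against the column definitions above (a quick consistency check for ad_distance, assuming distance is stored in meters as documented):
data(ad_distance)
all.equal(ad_distance$invdist2, 1 / ad_distance$distance^2)  # should be TRUE given the definitions above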
Data frame of daily rainfall data from NOAA NCDC National Weather Service (NWS) sites from 2017 to 2023
ad_rain
A data.frame
Used for estimating atmospheric deposition and created using the util_ad_getrain function. The data frame contains the following columns:
station
: Character string for the station id
date
: Date for the observation
Year
: Numeric value for the year of the observation
Month
: Numeric value for the month of the observation
Day
: Numeric value for the day of the observation
rainfall
: Numeric value for the amount of rainfall in inches
ad_rain
Calculate AD loads and summarize
anlz_ad(ad_rain, vernafl, summ = c("segment", "all"), summtime = c("month", "year"))
ad_rain | data frame of daily rainfall data from NOAA NCDC, obtained using util_ad_getrain
vernafl | character vector of file path to Verna Wellfield atmospheric concentration data
summ | chr string indicating how the returned data are summarized, see details
summtime | chr string indicating how the returned data are summarized temporally (month or year), see details
Loads from atmospheric deposition (AD) for bay segments in the Tampa Bay watershed are calculated using rainfall data and atmospheric concentration data from the Verna Wellfield site. Rainfall data must be obtained using the util_ad_getrain function before calculating loads. For convenience, daily rainfall data from 2017 to 2023 at sites in the watershed are included with the package in the ad_rain object. The Verna Wellfield data must also be obtained from https://nadp.slh.wisc.edu/sites/ntn-FL41/ as monthly observations. This file is also included with the package and can be found using system.file as in the examples below. Internally, the Verna data are converted to total nitrogen and total phosphorus from ammonium and nitrate concentration data (see util_ad_prepverna for additional information).
The function first estimates the total hydrologic load for each bay segment using daily estimates of rainfall at NOAA NCDC sites in the watershed. This is done as a weighted mean of rainfall at the measured sites relative to grid locations in each sub-watershed for the bay segments. The weights are based on the distance of the grid cells from the closest site as inverse distance squared. Total hydrologic load for a bay segment is then estimated by converting inches/month to m3/month using the segment area. The distance data and bay segment areas are contained in the ad_distance object included with the package.
The total nitrogen and phosphorus loads are then estimated for each bay segment by multiplying the total hydrologic load by the total nitrogen and phosphorus concentrations in the Verna data. The loading calculations also include a wet/dry deposition conversion factor to account for differences in loading during the rainy and dry seasons.
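The hydrologic load step can be made concrete with a small sketch of the inverse-distance-squared weighting and unit conversion for a single segment and month (a simplified illustration only; the rainfall, distances, and area below are hypothetical, and the internal calculation operates on the full ad_rain and ad_distance objects):
# toy example: two NWS sites contributing to one bay segment for one month
site_rain <- c(site_a = 3.1, site_b = 2.4)                 # monthly rainfall, inches (hypothetical)
invd2     <- c(site_a = 1 / 5000^2, site_b = 1 / 12000^2)  # inverse distance squared weights (1/m^2)
seg_area  <- 23000                                         # segment area, ha (hypothetical)
rain_seg <- weighted.mean(site_rain, w = invd2)   # weighted mean rainfall for the segment, inches
hyd_m3   <- rain_seg * 0.0254 * seg_area * 1e4    # inches -> m, ha -> m2, gives m3/month
hyd_m3 / 1e6                                      # hydrologic load, million m3/month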
A data frame with nitrogen and phosphorus loads in tons/month, hydrologic load in million m3/month, and segment, year, and month as columns if summ = 'segment' and summtime = 'month'. Total load to all segments can be returned if summ = 'all', and annual summaries can be returned if summtime = 'year'. In the former case, the total excludes the northern portion of Boca Ciega Bay that is not included in the reasonable assurance boundaries. In the latter case, loads are the sum of monthly estimates such that output is tons/yr for TN and TP and million m3/yr for hydrologic load.
util_ad_getrain, util_ad_prepverna
vernafl <- system.file('extdata/verna-raw.csv', package = 'tbeploads')
data(ad_rain)
anlz_ad(ad_rain, vernafl)
Calculate DPS reuse and end of pipe loads and summarize
anlz_dps(fls, summ = c("entity", "facility", "segment", "all"), summtime = c("month", "year"))
fls | vector of file paths to raw entity data, one to many
summ | chr string indicating how the returned data are summarized, see details
summtime | chr string indicating how the returned data are summarized temporally (month or year), see details
Input data files in fls are first processed by anlz_dps_facility to calculate DPS reuse and end of pipe loads for each facility and outfall. The data are summarized differently based on the summ and summtime arguments. All loading data are summed based on these arguments, e.g., by bay segment (summ = 'segment') and year (summtime = 'year').
data frame with loading data for TP, TN, TSS, and BOD as tons per month/year and hydro load as million cubic meters per month/year
fls <- list.files(system.file('extdata/', package = 'tbeploads'), pattern = 'ps_dom', full.names = TRUE)
anlz_dps(fls)
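The summ and summtime arguments control the output grouping; for example, annual loads by bay segment can be returned using the fls object from the example above:
anlz_dps(fls, summ = 'segment', summtime = 'year')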
Calculate DPS reuse and end of pipe loads from raw facility data
anlz_dps_facility(fls)
fls | vector of file paths to raw facility data, one to many
Input data should include flow as million gallons per day and concentration (conc) as mg/L. Steps include (a worked sketch follows the list):
Multiply flow by days in month to get million gallons per month
Multiply flow by 3785.412 to get cubic meters per month
Multiply conc by flow and divide by 1000 to get kg of each variable per month
Multiply m3 by 1000 to get L, then divide by 1e6 to convert mg to kg, same as dividing by 1000
TN, TP, TSS, and BOD loads for DPS reuse are multiplied by an attenuation factor for land application (varies by location)
Hydrologic load (m3/mo) for reuse is also attenuated, multiplied by 0.6 (40% attenuation)
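A worked sketch of these conversions for a single month of reuse flow (hypothetical inputs; the land application attenuation varies by location, so the 0.9 used here is only a placeholder, and tons are assumed to be US short tons):
flow_mgd <- 2.5                        # reported reuse flow, million gallons per day (hypothetical)
tn_mgl   <- 3.0                        # TN concentration, mg/L (hypothetical)
days     <- 31                         # days in the month
flow_mgm <- flow_mgd * days            # million gallons per month
flow_m3  <- flow_mgm * 3785.412        # cubic meters per month
tn_kg    <- tn_mgl * flow_m3 / 1000    # kg TN per month (mg/L x m3 / 1000)
tn_kg_att  <- tn_kg * 0.9              # land application attenuation (placeholder value)
hyd_m3_att <- flow_m3 * 0.6            # reuse hydrologic load attenuation (40%)
tn_tons    <- tn_kg_att / 907.1847     # kg to tons (US short tons assumed)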
data frame with loading data for TP, TN, TSS, and BOD as tons per month and hydro load as million cubic meters per month. Information for each entity, facility, and outfall is retained.
fls <- list.files(system.file('extdata/', package = 'tbeploads'), pattern = 'ps_dom', full.names = TRUE)
anlz_dps_facility(fls)
Calculate IPS loads and summarize
anlz_ips(fls, summ = c("entity", "facility", "segment", "all"), summtime = c("month", "year"))
fls | vector of file paths to raw entity data, one to many
summ | chr string indicating how the returned data are summarized, see details
summtime | chr string indicating how the returned data are summarized temporally (month or year), see details
Input data files in fls are first processed by anlz_ips_facility to calculate IPS loads for each facility and outfall. The data are summarized differently based on the summ and summtime arguments. All loading data are summed based on these arguments, e.g., by bay segment (summ = 'segment') and year (summtime = 'year').
data frame with loading data for TP, TN, TSS, and BOD as tons per month/year and hydro load as million cubic meters per month/year
fls <- list.files(system.file('extdata/', package = 'tbeploads'), pattern = 'ps_ind_', full.names = TRUE)
anlz_ips(fls)
Calculate IPS loads from raw facility data
anlz_ips_facility(fls)
fls | vector of file paths to raw facility data, one to many
Input data should include flow as million gallons per day and concentration (conc) as mg/L. Steps include:
Multiply flow by days in month to get million gallons per month
Multiply flow by 3785.412 to get cubic meters per month
Multiply conc by flow and divide by 1000 to get kg of each variable per month
Multiply m3 by 1000 to get L, then divide by 1e6 to convert mg to kg, same as dividing by 1000
data frame with loading data for TP, TN, TSS, and BOD as tons per month and hydro load as million cubic meters per month. Information for each entity, facility, and outfall is retained.
fls <- list.files(system.file('extdata/', package = 'tbeploads'), pattern = 'ps_ind_', full.names = TRUE)
anlz_ips_facility(fls)
Calculate material loss (ML) loads and summarize
anlz_ml(fls, summ = c("entity", "facility", "segment", "all"), summtime = c("month", "year"))
fls | vector of file paths to raw entity data, one to many
summ | chr string indicating how the returned data are summarized, see details
summtime | chr string indicating how the returned data are summarized temporally (month or year), see details
Input data files in fls are first processed by anlz_ml_facility to calculate ML loads for each facility. The data are summarized differently based on the summ and summtime arguments. All loading data are summed based on these arguments, e.g., by bay segment (summ = 'segment') and year (summtime = 'year').
data frame with loading data for TN as tons per month/year. Columns for TP, TSS, BOD, and hydrologic load are also returned with zero load for consistency with other point source load calculation functions.
fls <- list.files(system.file('extdata/', package = 'tbeploads'), pattern = 'ps_indml', full.names = TRUE)
anlz_ml(fls)
Calculate material loss (ML) loads from raw facility data
anlz_ml_facility(fls)
fls | vector of file paths to raw facility data, one to many
Input data should be one row per year per facility, where the row shows the total tons per year of total nitrogen loss. Input files are often created by hand based on reported annual tons of nitrogen shipped at each facility. The material losses as tons/yr are estimated from the tons shipped using an agreed upon loss rate. Values reported in the example files represent the estimated loss as the total tons of N shipped each year multiplied by 0.0023 and divided by 2000. The total N shipped at a facility each year can be obtained using a simple back-calculation (multiply by 2000, divide by 0.0023).
data frame that is nearly identical to the input data except results are shown as monthly load as the annual loss estimate divided by 12. This is for consistency of reporting with other loading sources.
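A short worked example of the loss estimate, monthly reporting, and back-calculation described above (hypothetical tonnage):
n_shipped_tons <- 50000                           # hypothetical tons of N shipped in a year
loss_tons_yr   <- n_shipped_tons * 0.0023 / 2000  # estimated material loss, tons/yr
loss_tons_mo   <- loss_tons_yr / 12               # monthly load as reported by anlz_ml_facility
loss_tons_yr * 2000 / 0.0023                      # back-calculation recovers tons of N shipped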
fls <- list.files(system.file('extdata/', package = 'tbeploads'), pattern = 'ps_indml', full.names = TRUE)
anlz_ml_facility(fls)
Basin information for coastal subbasin codes
dbasing
A data.frame
Used for domestic point source summaries; bay segments are as follows:
1: Hillsborough Bay
2: Old Tampa Bay
3: Middle Tampa Bay
4: Lower Tampa Bay
5: Boca Ciega Bay
6: Terra Ceia Bay
7: Manatee River
55: Boca Ciega Bay South
dbasing
Domestic and industrial point source facilities, including industrial facilities with material losses
facilities
A data.frame
facilities
Get rainfall data at NOAA NCDC sites
util_ad_getrain(yrs, station = NULL, noaa_key, ntry = 5, quiet = FALSE)
yrs | numeric vector for the years of data to retrieve
station | numeric vector of station numbers to retrieve, see details
noaa_key | character for the NOAA API key
ntry | numeric for the number of times to try to download the data
quiet | logical to print progress in the console
This function is used to retrieve a long-term record of rainfall for estimating AD loads. It is used to create an input data file for load calculations and it is not used directly by any other functions due to download time. A NOAA API key is required to use the function.
By default, rainfall data is retrieved for the following stations:
228
: ARCADIA
478
: BARTOW
520
: BAY LAKE
940
: BRADENTON EXPERIMENT
945
: BRADENTON 5 ESE
1046
: BROOKSVILLE CHIN HIL
1163
: BUSHNELL 2 E
1632
: CLEARWATER
1641
: CLERMONT 7 S
2806
: ST PETERSBURG WHITTD
3153
: FORT GREEN 12 WSW
3986
: HILLSBOROUGH RVR SP
4707
: LAKE ALFRED EXP STN
5973
: MOUNTAIN LAKE
6065
: MYAKKA RIVER STATE P
6880
: PARRISH
7205
: PLANT CITY
7851
: ST LEO
7886
: ST PETERSBURG WHITTD
8788
: TAMPA INTL ARPT
8824
: TARPON SPNGS SWG PLT
9176
: VENICE
9401
: WAUCHULA 2 N
a data frame with the following columns:
station
: numeric, the station id
date
: Date, the date of the observation
Year
: numeric, the year of the observation
Month
: numeric, the month of the observation
Day
: numeric, the day of the observation
rainfall
: numeric, the amount of rainfall in inches
## Not run:
noaa_key <- Sys.getenv('NOAA_KEY')
util_ad_getrain(2021, 228, noaa_key)
## End(Not run)
Prep Verna Wellfield data for use in AD calculations
util_ad_prepverna(fl, fillmis = T)
fl | text string for the file path to the Verna Wellfield data
fillmis | logical indicating whether to fill missing data with monthly means
Raw data can be obtained from https://nadp.slh.wisc.edu/sites/ntn-FL41/ as monthly observations. Total nitrogen and phosphorus concentrations are estimated from ammonium and nitrate concentrations (mg/L) in two steps: the first corrects for the percentage of the ammonium and nitrate ions that is nitrogen, and the second applies a regression relationship between TBADS TN and TP to the Verna data.
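The nitrogen fraction of each ion follows from molar masses (roughly 14/18 for ammonium and 14/62 for nitrate). The sketch below illustrates that first step only; the TBADS TN-TP regression coefficients are not reproduced here, and the helper function and example values are illustrative rather than the package's internal code:
# fraction of ammonium and nitrate mass that is nitrogen, from molar masses
verna_tn <- function(nh4, no3) {
  nh4 * (14.007 / 18.039) + no3 * (14.007 / 62.004)  # mg/L of N contributed by each ion
}
verna_tn(nh4 = 0.5, no3 = 1.2)  # example concentrations in mg/L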
A data frame with total nitrogen and phosphorus estimates as mg/l for each year and month of the input data
fl <- system.file('extdata/verna-raw.csv', package = 'tbeploads')
util_ad_prepverna(fl)
Add column names for point source from raw entity data
util_ps_addcol(dat)
dat | data frame of raw entity data, e.g., imported from the file at pth with read.table (see examples)
The function checks for TN, TP, TSS, and BOD. If any of these are missing, the columns are added with empty values including a column for units. If BOD is missing but CBOD is present, the CBOD column is renamed to BOD.
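A minimal sketch of the column handling described above (the helper and the unit column naming are illustrative assumptions, not the package's implementation):
# rename CBOD to BOD if needed, then add empty value and unit columns for missing parameters
add_missing_cols <- function(dat) {
  if (!'BOD' %in% names(dat) && 'CBOD' %in% names(dat))
    names(dat)[names(dat) == 'CBOD'] <- 'BOD'
  for (prm in c('TN', 'TP', 'TSS', 'BOD')) {
    if (!prm %in% names(dat)) {
      dat[[prm]] <- NA_real_                        # empty values
      dat[[paste0(prm, '_unit')]] <- NA_character_  # empty unit column (name is an assumption)
    }
  }
  dat
}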
Input data frame from pth as is if column names are correct, otherwise additional columns are added as needed.
pth <- system.file('extdata/ps_dom_hillsco_falkenburg_2019.txt', package = 'tbeploads')
dat <- read.table(pth, skip = 0, sep = '\t', header = TRUE)
util_ps_addcol(dat)
Create a data frame of formatting issues with point source input files
util_ps_checkfls(fls)
fls | vector of file paths to raw facility data, one to many
The chk column indicates the issue with the file and will indicate "ok" if no issues are found, "read error" if the file cannot be read, and "check columns" if the column names are not as expected. Any file not showing "ok" should be checked for issues.
All files are checked with util_ps_checkuni if a file does not have a read error.
The function cannot be used with files for material losses.
A data.frame with three columns indicating name for the file name, chk for the file issue, and nms for a concatenated string of column names for the file
fls <- system.file('extdata/ps_dom_hillsco_falkenburg_2019.txt', package = 'tbeploads')
util_ps_checkfls(fls)
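In practice the output can be filtered to files needing attention using the chk column described above, building on the example:
chk <- util_ps_checkfls(fls)
chk[chk$chk != 'ok', ]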
Check units for point source from raw entity data
util_ps_checkuni(dat)
dat | data frame of raw entity data, e.g., imported from the file at pth with read.table (see examples)
Input data should include flow as million gallons per day, and concentration as mg/L.
Input data frame from pth with relevant data and columns renamed, otherwise an error is returned if units are not correct. Only year, month, outfall, flow, TN, TP, TSS, and BOD are returned.
pth <- system.file('extdata/ps_dom_hillsco_falkenburg_2019.txt', package = 'tbeploads')
dat <- read.table(pth, skip = 0, sep = '\t', header = TRUE)
util_ps_checkuni(dat)
Get point source entity information from file name
util_ps_facinfo(pth, asdf = FALSE)
pth | path to raw entity data
asdf | logical, if TRUE the results are returned as a data.frame instead of a list
Bay segment is an integer with values of 1, 2, 3, 4, 5, 6, 7, and 55 for Old Tampa Bay, Hillsborough Bay, Middle Tampa Bay, Lower Tampa Bay, Boca Ciega Bay, Terra Ceia Bay, Manatee River, and Boca Ciega Bay South, respectively.
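For reference, this mapping can be written as a simple named lookup vector (an illustration only, not an object exported by the package):
segs <- c(`1` = 'Old Tampa Bay', `2` = 'Hillsborough Bay', `3` = 'Middle Tampa Bay',
          `4` = 'Lower Tampa Bay', `5` = 'Boca Ciega Bay', `6` = 'Terra Ceia Bay',
          `7` = 'Manatee River', `55` = 'Boca Ciega Bay South')
segs['55']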
A list or data.frame (if asdf = TRUE) with entity, facility, permit, facility id, coastal id, and coastal subbasin code
pth <- system.file('extdata/ps_dom_hillsco_falkenburg_2019.txt', package = 'tbeploads')
util_ps_facinfo(pth)
Fill missing point source data with annual average
util_ps_fillmis(dat)
dat | data frame of raw entity data, e.g., imported from the file at pth with read.table and checked with util_ps_checkuni (see examples)
Missing concentration data are replaced with the average for the outfall in a given year. All flow data are also floored at zero. Rows with missing flow data are assigned 0 for all data. Rows with zero flow are assigned concentration of zero.
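A minimal sketch of the fill rules for one parameter (the column names and grouping below are assumptions based on the output of util_ps_checkuni, not the package's internal code):
library(dplyr)
dat_filled <- dat |>
  dplyr::group_by(outfall, Year) |>
  dplyr::mutate(
    TN   = ifelse(is.na(TN), mean(TN, na.rm = TRUE), TN),  # fill with annual outfall mean
    TN   = ifelse(is.na(flow) | flow <= 0, 0, TN),         # missing or zero flow -> zero concentration
    flow = pmax(ifelse(is.na(flow), 0, flow), 0)           # missing flow -> 0, floor flow at zero
  ) |>
  dplyr::ungroup()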
Input data frame as is if no missing values, otherwise missing data filled as described above.
pth <- system.file('extdata/ps_dom_hillsco_falkenburg_2019.txt', package = 'tbeploads')
dat <- read.table(pth, skip = 0, sep = '\t', header = TRUE)
dat <- util_ps_checkuni(dat)
util_ps_fillmis(dat)
Light edits to the outfall ID column for point source data
util_ps_fixoutfall(dat)
dat | data frame of raw entity data, e.g., imported from the file at pth with read.table (see examples)
The outfall ID column is lightly edited to remove any leading or trailing white space, add a hyphen between letters and numbers if missing, and remove an "Outfall" prefix if present.
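A sketch of the kinds of string edits described above using base R regular expressions (the exact patterns used by the package may differ):
fix_outfall <- function(x) {
  x <- trimws(x)                                  # remove leading/trailing white space
  x <- gsub('^Outfall\\s*', '', x)                # drop an "Outfall" prefix
  x <- gsub('([A-Za-z])\\s*(\\d)', '\\1-\\2', x)  # add hyphen between letters and numbers
  x
}
fix_outfall(c(' Outfall D-001', 'D001'))          # both become "D-001"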
Input data frame as is, with any edits to the outfall ID column.
pth <- system.file('extdata/ps_ind_busch_busch_2020.txt', package = 'tbeploads')
dat <- read.table(pth, skip = 0, sep = '\t', header = TRUE)
util_ps_fixoutfall(dat)
Summarize point source load estimates
util_ps_summ(dat, summ = c("entity", "facility", "segment", "all"), summtime = c("month", "year"))
dat | pre-processed data frame of point source load estimates, see examples
summ | chr string indicating how the returned data are summarized, see details
summtime | chr string indicating how the returned data are summarized temporally (month or year), see details
The data are summarized differently based on the summ and summtime arguments. All loading data are summed based on these arguments, e.g., by bay segment (summ = 'segment') and year (summtime = 'year').
Data frame with summarized loading data based on user-supplied arguments
fls <- list.files(system.file('extdata/', package = 'tbeploads'), pattern = 'ps_ind_', full.names = TRUE)
ipsbyfac <- anlz_ips_facility(fls)

# add bay segment and source, there should only be loads to hills, middle, and lower tampa bay
ipsld <- ipsbyfac |>
  dplyr::arrange(coastco) |>
  dplyr::left_join(dbasing, by = "coastco") |>
  dplyr::mutate(
    segment = dplyr::case_when(
      bayseg == 1 ~ "Old Tampa Bay",
      bayseg == 2 ~ "Hillsborough Bay",
      bayseg == 3 ~ "Middle Tampa Bay",
      bayseg == 4 ~ "Lower Tampa Bay",
      TRUE ~ NA_character_
    ),
    source = 'IPS'
  ) |>
  dplyr::select(-basin, -hectare, -coastco, -name, -bayseg)

util_ps_summ(ipsld, summ = 'entity', summtime = 'year')