| Title: | Calculate Loading Data to Tampa Bay |
|---|---|
| Description: | Loading data from major sources to Tampa Bay are calculated on a monthly or annual basis. Major sources include domestic point source (reuse, end of pipe), industrial point source, material losses, non-point sources (MS4), atmospheric deposition, and groundwater. |
| Authors: | Marcus Beck [aut, cre] (ORCID: <https://orcid.org/0000-0002-4996-0059>), Ed Sherwood [aut] (ORCID: <https://orcid.org/0000-0001-5330-302X>), Ray Pribble [aut] |
| Maintainer: | Marcus Beck |
| License: | MIT + file LICENSE |
| Version: | 0.0.0.9000 |
| Built: | 2026-06-03 13:27:39 UTC |
| Source: | https://github.com/tbep-tech/tbeploads |
Allocation assessment TN load corrections by bay segment and entity
aa_corrections
A data.frame with 43 rows and 4 columns:
Integer bay segment identifier
MS4 jurisdiction or entity name
Atmospheric deposition TN offset (tons/yr)
Net permitted project TN offset (tons/yr); negative values indicate a load credit
TN load offsets applied before hydrologic normalization in
anlz_aa for the 2022-2024 TBNMC assessment period. Values are
sourced from the SAS script 7_Basin_assessment2224.sas and cover two
correction types: atmospheric deposition (AD) loads apportioned to each
entity jurisdiction, and net permitted project (AP) load credits. FDACS
agriculture entries (entity = "All") carry irrigation AP reductions
only (ad_tons = 0). Negative project_tons values reflect
project credits that increase the allowable load.
Future package updates should revise these numbers to account for changes in atmospheric deposition concentrations (i.e., large changes that have occurred at the Bayside and Apollo Beach power plants) and for the many projects completed since the late 2000s. The numbers herein have been used to date for the allocation assessment tables, but they likely do not represent current conditions.
See data-raw/aa_corrections.R for construction.
aa_corrections
Data frame of distances of segment locations to National Weather Service (NWS) sites
ad_distance
A data.frame
Used for estimating atmospheric deposition. The data frame contains the following columns:
segment: Numeric identifier for the segment location
seg_x: Numeric value for the x-coordinate of the segment location (WGS 84, UTM Zone 17N, CRS 32617)
seg_y: Numeric value for the y-coordinate of the segment location (WGS 84, UTM Zone 17N, CRS 32617)
matchsit: Numeric for the NWS site that matches the segment location
distance: Numeric value for the distance (m) between the segment coordinate and NWS site
invdist2: Numeric value for the inverse distance squared (1/m^2) between the segment coordinate and NWS site
area: Numeric value for the area of the segment (ha)
Segment numbers are 1-7 for Old Tampa Bay, Hillsborough Bay, Middle Tampa Bay, Lower Tampa Bay, Boca Ciega Bay, Terra Ceia Bay, and Manatee River.
ad_distance
Data frame of all flow data used in anlz_nps_gaged and anlz_nps_ungaged
allflo
A data.frame of monthly mean daily flow data for select basins
Monthly flow data at select stations used for estimating non-point source gaged and ungaged loads. Created using the util_nps_getflow function. Includes data from the USGS API using util_nps_getusgsflow and from external sources using util_nps_getextflow. The data frame contains the following columns:
basin: Character string for the basin or gauge location
yr: Year of the observation
mo: Month of the observation
flow_cfs: Numeric value for the average daily flow in cubic feet per second (cfs)
util_nps_getusgsflow, util_nps_getextflow, util_nps_getflow
allflo
Data frame of all water quality data used in anlz_nps_gaged
allwq
A data.frame of monthly mean water quality data for select stations
Monthly water quality data for select stations used for estimating non-point source gaged loads. Created using the util_nps_getwq function. Includes data from Manatee, Pinellas, and Hillsborough (EPCHC) counties. The data frame contains the following columns:
basin: Character string for the basin or station location
yr: Year of the observation
mo: Month of the observation
tn_mgl: Numeric value for Total Nitrogen in mg/L
tp_mgl: Numeric value for Total Phosphorus in mg/L
tss_mgl: Numeric value for Total Suspended Solids in mg/L
bod_mgl: Numeric value for Biochemical Oxygen Demand in mg/L
allwq
Allocation assessment for DPS, IPS, and NPS/MS4 entities
anlz_aa(yrrng, dps_data, ips_data, ml_data, nps_data, tbbase)
| yrrng | Integer vector of length 2, start and end year, e.g., c(2022, 2024) |
|---|---|
| dps_data | Data frame from anlz_dps_facility |
| ips_data | Data frame from anlz_ips_facility |
| ml_data | Data frame from anlz_ml_facility |
| nps_data | Data frame from anlz_nps |
| tbbase | data frame containing polygon areas for the combined data layer of bay segment, basin, jurisdiction, land use data, and soils, see details |
Entities present in the computed loads but absent from the allocation tables
are retained in the output with NA allocation fields so that
unmatched entries are visible for troubleshooting.
DPS path
DPS facility TN loads require no hydrologic normalization. Monthly loads
from dps_data are summed to annual totals per facility, averaged
over yrrng, and compared directly against the dps_allocations
table. The join key is entity + facname + bay_seg + source, where
source distinguishes direct surface water discharge
("DPS - end of pipe") from reclaimed water reuse
("DPS - reuse"). Bay segment 5 (Boca Ciega Bay) is excluded
and bay segments 6 and 7 are remapped to 55.
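The aggregation described for the DPS path can be sketched in base R as follows. This is a minimal illustration on made-up numbers, not the package implementation: the facility names, the loads, the toy allocation table, and the met column name are all assumptions.

```r
# Toy monthly DPS TN loads (hypothetical values, tons/month)
monthly <- data.frame(
  facname = rep(c("Plant A", "Plant B"), each = 24),
  year    = rep(rep(c(2022, 2023), each = 12), times = 2),
  tn_load = rep(c(1.0, 0.5, 2.0, 1.5), each = 12),
  stringsAsFactors = FALSE
)

# Step 1: sum monthly loads to annual totals per facility
annual <- aggregate(tn_load ~ facname + year, data = monthly, FUN = sum)

# Step 2: average the annual totals over the assessment years
mean_load <- aggregate(tn_load ~ facname, data = annual, FUN = mean)
names(mean_load)[2] <- "load_tons"

# Step 3: compare against a hypothetical allocation table
alloc <- data.frame(
  facname    = c("Plant A", "Plant B"),
  alloc_tons = c(10, 20),
  stringsAsFactors = FALSE
)
out <- merge(mean_load, alloc, by = "facname", all = TRUE)
out$met <- out$load_tons <= out$alloc_tons  # assumed column name
out
```

Using all = TRUE in the merge keeps facilities that lack an allocation, which mirrors the documented behavior of retaining unmatched entries with NA allocation fields.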
IPS path
Annual IPS facility TN loads are hydrologically normalized using a ratio
whose denominator, basin_total_h2o, is the annual total water load (NPS + DPS + IPS)
for the same basin and year, matching the SAS ratio1_2224 denominator.
Effective loads are summed across basins per permit per bay segment, then
averaged over yrrng.
ML path
Material loss TN loads require no hydrologic normalization. Monthly loads
from ml_data are summed to annual totals per facility, averaged
over yrrng, and compared against the ml_allocations
table. Facilities with ishared = FALSE are assessed individually on
entity + facname + bay segment. Facilities with ishared = TRUE
(currently the three Mosaic facilities in Hillsborough Bay) have their
loads summed to an entity + bay segment total before comparison to the
single shared allocation.
NPS/MS4 path
TN loads in nps_data are NPS-only; no point-source correction is
applied to the input loads. Basin-level NPS loads are disaggregated to
individual MS4 entities using the output (created internally) from
util_aa_npsfactors
that combines tbbase, rcclucsid, and emc
into:
factor_tn distributes basin TN load among land use classes.
factor_rc distributes each land use class's load among
entities proportional to area × runoff coefficient.
Agricultural land use (category "Agriculture") is attributed to the
aggregate entity "All" regardless of the underlying MS4 jurisdiction.
Before summing across CLUCSIDs, each entity's disaggregated TN load is
scaled by (1 - conserv_frac) using conserv_correction,
which provides entity- and CLUCSID-specific fractions of area times runoff
coefficient attributable to conservation land. This removes the conservation
land contribution that is absent from the tbeploads-built tbbase.
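A rough sketch of the disaggregation arithmetic is shown below. It assumes the basin load is simply multiplied by factor_tn, factor_rc, and (1 - conserv_frac) for each land use class and entity; the entity names, CLUCSID codes, and all numeric values are hypothetical, and the actual joins performed by util_aa_npsfactors are not reproduced here.

```r
# Hypothetical basin TN load (tons/yr)
basin_tn <- 100

# Hypothetical factors for two land use classes and two entities
fac <- data.frame(
  clucsid      = c(1, 1, 2),
  entity       = c("City X", "County Y", "County Y"),
  factor_tn    = c(0.6, 0.6, 0.4),    # share of basin TN by land use class
  factor_rc    = c(0.25, 0.75, 1.0),  # entity share within each class
  conserv_frac = c(0, 0.1, 0.2),      # conservation land fraction
  stringsAsFactors = FALSE
)

# Disaggregate, then remove the conservation land contribution
fac$tn_entity <- basin_tn * fac$factor_tn * fac$factor_rc *
  (1 - fac$conserv_frac)

# Sum across CLUCSIDs to get one load per entity
entity_tn <- aggregate(tn_entity ~ entity, data = fac, FUN = sum)
entity_tn
```

Within each land use class the factor_rc values sum to one, so without the conservation correction the entity loads would add back up to the basin total.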
After disaggregation, loads and 1992-1994 baseline water volumes are summed
across basins to the segment level. TN corrections from aa_corrections
(ad_tons + project_tons) are subtracted before hydrologic normalization.
Bay segments Terra Ceia Bay (6) and Manatee River (7) are merged into
segment 55 (Remaining Lower Tampa Bay) after disaggregation, consistent
with the hydro_baseline encoding and TBNMC reporting.
Boca Ciega Bay (segment 5) is excluded from the allocation framework.
A data frame with one row per entity (NPS/MS4) or facility (IPS) per bay segment:
Integer bay segment identifier
Bay segment name
MS4 entity name or facility operator
Full entity name from nps_allocations
(NPS rows only)
Facility name (IPS, DPS, and non-shared ML rows)
NPDES permit number (IPS rows only)
Allocation type: "MS4",
"Nonpoint Source/MS4", "IPS",
"DPS - end of pipe", "DPS - reuse", or "ML"
Fractional TN allocation (0-1)
Allocation in TN tons per year
Mean hydrologically-normalized TN load (tons/yr),
averaged over yrrng; equals load_tons for DPS and ML
(no normalization applied)
Mean annual TN load (tons/yr) without hydrologic
normalization, averaged over yrrng
Logical: eff_load_tons <= alloc_tons; NA when
allocation or effective load is missing
## Not run:
fls_dps <- list.files(system.file("extdata/", package = "tbeploads"),
  pattern = "ps_dom_", full.names = TRUE)
dps <- anlz_dps_facility(fls_dps)
fls_ips <- list.files(system.file("extdata/", package = "tbeploads"),
  pattern = "ps_ind_", full.names = TRUE)
ips <- anlz_ips_facility(fls_ips)
fls_ml <- list.files(system.file("extdata/", package = "tbeploads"),
  pattern = "ps_indml", full.names = TRUE)
ml <- anlz_ml_facility(fls_ml)
nps <- anlz_nps(
  yrrng = c("2022-01-01", "2024-12-31"),
  tbbase = tbbase,
  rain = rain,
  allwq = allwq,
  allflo = allflo,
  vernafl = system.file("extdata/verna-raw.csv", package = "tbeploads"),
  summ = "basin",
  summtime = "year"
)
anlz_aa(c(2022, 2024), dps, ips, ml, nps, tbbase)
## End(Not run)
Calculate AD loads and summarize
anlz_ad(rain, vernafl, summ = c("segment", "all"), summtime = c("month", "year"))
| rain | data frame of daily rainfall data from NOAA NCDC, obtained using util_getrain |
|---|---|
| vernafl | character vector of file path to Verna Wellfield atmospheric concentration data |
| summ | character string indicating how the returned data are summarized, see details |
| summtime | character string indicating how the returned data are summarized temporally (month or year), see details |
Loads from atmospheric deposition (AD) to bay segments in the Tampa Bay watershed are calculated using rainfall data and atmospheric concentration data from the Verna Wellfield site. Rainfall data must be obtained using the util_getrain function before calculating loads. For convenience, daily rainfall data from 2017 to 2023 at sites in the watershed are included with the package in the rain object. The Verna Wellfield data must also be obtained from https://nadp.slh.wisc.edu/sites/ntn-FL41/ as monthly observations. This file is also included with the package and can be found using system.file as in the examples below. Internally, the Verna data are converted to total nitrogen and total phosphorus from ammonium and nitrate concentration data (see util_prepverna for additional information).
The function first estimates the total hydrologic load for each bay segment using daily estimates of rainfall at NOAA NCDC sites in the watershed. Rainfall is computed as a weighted mean of the measured sites relative to grid locations in each sub-watershed for the bay segments. The weights are the inverse squared distance of the grid cells from the closest site. Total hydrologic load for a bay segment is then estimated by converting inches/month to m3/month using the segment area. The distance data and bay segment areas are contained in the ad_distance data frame included with the package.
The total nitrogen and phosphorus loads are then estimated for each bay segment by multiplying the total hydrologic load by the total nitrogen and phosphorus concentrations in the Verna data. The loading calculations also include a wet/dry deposition conversion factor to account for differences in loading during the rainy and dry seasons.
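The weighting and unit conversion can be illustrated with a small base R sketch. It is a simplification under stated assumptions: the rainfall values, distances, and segment area are invented, a single grid location is weighted against three sites, and the wet/dry deposition factor and concentration step are left out.

```r
# Hypothetical monthly rainfall (inches) at three sites and their
# distances (m) to one grid location in a bay segment
rain_in <- c(4.0, 6.0, 5.0)
dist_m  <- c(1000, 2000, 4000)

# Inverse distance squared weights (cf. the invdist2 column of ad_distance)
w <- 1 / dist_m^2

# Weighted mean rainfall at the grid location (inches/month)
rain_wtd <- sum(w * rain_in) / sum(w)

# Convert rainfall depth to volume: inches -> m, segment area ha -> m2
area_ha <- 20000
hy_m3   <- rain_wtd * 0.0254 * area_ha * 10000  # m3/month
hy_m3 / 1e6                                     # million m3/month
```

Because the weights fall off with the square of distance, the nearest site dominates the estimate in this example.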
A data frame with nitrogen and phosphorus loads in tons/month, hydrologic load in million m3/month, and segment, year, and month as columns if summ = 'segment' and summtime = 'month'. Total load to all segments can be returned if summ = 'all' and annual summaries can be returned if summtime = 'year'. In the former case, the total excludes the northern portion of Boca Ciega Bay that is not included in the reasonable assurance boundaries. In the latter case, loads are the sum of monthly estimates such that output is tons/yr for TN and TP and as million m3/yr for hydrologic load.
vernafl <- system.file('extdata/verna-raw.csv', package = 'tbeploads')
data(rain)
anlz_ad(rain, vernafl)
Calculate DPS reuse and end of pipe loads and summarize
anlz_dps(
  fls,
  summ = c("entity", "facility", "basin", "segment", "all"),
  summtime = c("month", "year")
)
| fls | vector of file paths to raw entity data, one to many |
|---|---|
| summ | character string indicating how the returned data are summarized, see details |
| summtime | character string indicating how the returned data are summarized temporally (month or year), see details |
Input data files in fls are first processed by anlz_dps_facility to calculate DPS reuse and end of pipe loads for each facility and outfall. The data are then summarized according to the summ and summtime arguments. All loading data are summed based on these arguments, e.g., by bay segment (summ = 'segment') and year (summtime = 'year'). Options for summ are 'entity' to summarize by entity, 'facility' to summarize by facility, 'basin' to summarize by drainage basin (retains the basin column for use with anlz_nps_psremove), 'segment' to summarize by bay segment, and 'all' to summarize across all segments.
data frame with loading data for TP, TN, TSS, and BOD as tons per month/year and hydro load as million cubic meters per month/year
fls <- list.files(system.file('extdata/', package = 'tbeploads'),
  pattern = 'ps_dom', full.names = TRUE)
anlz_dps(fls)
Calculate DPS reuse and end of pipe loads from raw facility data
anlz_dps_facility(fls)
| fls | vector of file paths to raw facility data, one to many |
|---|---|
Input data should include flow as million gallons per day, and conc as mg/L. Steps include:
Multiply flow by days in month to get million gallons per month
Multiply flow by 3785.412 to get cubic meters per month
Multiply conc by flow and divide by 1000 to get kg var per month
Multiply m3 by 1000 to get L, then divide by 1e6 to convert mg to kg, same as dividing by 1000
DPS reuse loads for TN, TP, TSS, and BOD are multiplied by an attenuation factor for land application (varies by location)
Hydro load (m3 / mo) is also attenuated for reuse by multiplying by 0.6 (40% attenuation)
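The conversion steps above can be traced with a few lines of R. The flow, concentration, and day count are made-up inputs, and the nutrient attenuation factor is a placeholder because the documented factors vary by location.

```r
# Hypothetical inputs for one outfall and month
flow_mgd <- 2.5   # flow, million gallons per day
conc_mgl <- 3.0   # concentration, mg/L
days     <- 30    # days in the month

flow_mgm <- flow_mgd * days        # million gallons per month
flow_m3  <- flow_mgm * 3785.412    # cubic meters per month
load_kg  <- conc_mgl * flow_m3 / 1000  # kg per month

# Reuse only: attenuate hydrologic load by 40% and nutrient load by a
# location-specific factor (0.7 here is a placeholder, not a package value)
hy_reuse_m3   <- flow_m3 * 0.6
load_reuse_kg <- load_kg * 0.7

c(flow_m3 = flow_m3, load_kg = load_kg, hy_reuse_m3 = hy_reuse_m3)
```

The division by 1000 combines two conversions: cubic meters to liters (times 1000) and milligrams to kilograms (divide by 1e6).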
data frame with loading data for TP, TN, TSS, and BOD as tons per month and hydro load as million cubic meters per month. Information for each entity, facility, and outfall is retained.
fls <- list.files(system.file('extdata/', package = 'tbeploads'),
  pattern = 'ps_dom', full.names = TRUE)
anlz_dps_facility(fls)
Calculate groundwater loads to Tampa Bay segments
anlz_gw(
  pot_dry,
  pot_wet,
  yrrng = c(2022, 2024),
  wqdat = NULL,
  summtime = c("month", "year")
)
| pot_dry | dry season potentiometric surface data, e.g., the contdry dataset or output from util_gw_getcontour |
|---|---|
| pot_wet | wet season potentiometric surface data, e.g., the contwet dataset or output from util_gw_getcontour |
| yrrng | integer vector of length 2, start and end year for the load estimates, e.g., c(2022, 2024) |
| wqdat | data frame of Floridan aquifer TN and TP concentrations (mg/L) as returned by util_gw_getwq |
| summtime | character, temporal summarization: "month" or "year" |
Estimates groundwater loads to each Tampa Bay segment for three aquifer layers following the methodology in Zarbock et al. (1994).
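Floridan aquifer: Flow is computed with Darcy's Law:
$$Q = 7.4805 \times 10^{-6} \cdot T \cdot I \cdot L$$
where $T$ is transmissivity (ft2/day), $I$ is the hydraulic
gradient (ft/mile), and $L$ is the flow zone length (miles).
$Q$ is in MGD. Nutrient loads (kg/month) are computed from $Q$ and $C$,
where $C$ is the TN or TP concentration (mg/L). Hydrologic load
(m3/month) is $Q$ converted from MGD to cubic meters per month.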
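As a worked illustration of the units involved, the sketch below applies Darcy's Law and then converts the flow to monthly loads. The input values are invented and the conversion constants follow from dimensional analysis (7.4805 gallons per cubic foot, 3785.412 cubic meters per million gallons). The day count and the exact constants used inside anlz_gw are assumptions here and may differ.

```r
# Hypothetical inputs for one bay segment
T_ft2d <- 50000  # transmissivity (ft2/day)
I_ftmi <- 2      # hydraulic gradient (ft/mile)
L_mi   <- 10     # flow zone length (miles)
C_mgl  <- 0.5    # TN or TP concentration (mg/L)
days   <- 30     # assumed days in month

# ft2/day * ft/mile * miles = ft3/day, then gallons, then million gallons
Q_mgd <- 7.4805e-6 * T_ft2d * I_ftmi * L_mi

# Hydrologic load (m3/month) and nutrient load (kg/month)
hy_m3   <- Q_mgd * days * 3785.412
load_kg <- C_mgl * hy_m3 / 1000

c(Q_mgd = Q_mgd, hy_m3 = hy_m3, load_kg = load_kg)
```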
Hydraulic gradients:
Gradients are computed once from pot_dry and pot_wet via
util_gw_grad and applied to every year in yrrng.
Update pot_dry and pot_wet with fresh outputs from
util_gw_getcontour when new FDEP potentiometric surface maps
become available. See util_gw_grad for details on search
areas, zero-gradient segments, and benchmark warnings.
Surficial and intermediate aquifers:
Loads are fixed constants per segment. Surficial values are from
gwupdate95-98_final.xls (1995-1998 SWFWMD monitoring data).
Intermediate values are means from SWFWMD monitoring over 1999-2003.
These have not changed since the original analysis.
Season assignment: Months 1-6 and 11-12 are dry season; months 7-10 are wet season.
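The season rule is simple enough to express as a one-line helper. The function name season_from_month is hypothetical and is not part of the package.

```r
# Hypothetical helper: months 7-10 are wet season, all others dry
season_from_month <- function(mo) {
  ifelse(mo %in% 7:10, "wet", "dry")
}

# Season label for each calendar month
data.frame(month = 1:12, season = season_from_month(1:12))
```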
A data frame with columns:
Year: integer
Month: integer (omitted when summtime = 'year')
source: character, "GW"
segment: character, bay segment name
tn_load: numeric, total nitrogen load (tons/month or tons/year)
tp_load: numeric, total phosphorus load (tons/month or tons/year)
hy_load: numeric, hydrologic load (million m3/month or
million m3/year)
Zarbock, H., A. Janicki, D. Wade, D. Heimbuch, and H. Wilson. 1994. Estimates of Total Nitrogen, Total Phosphorus, and Total Suspended Solids Loadings to Tampa Bay, Florida. Technical Publication #04-94. Prepared by Coastal Environmental, Inc. Prepared for Tampa Bay National Estuary Program. St. Petersburg, FL.
# contdry and contwet are pre-computed 2022 package datasets
gw <- anlz_gw(contdry, contwet, yrrng = c(2022, 2024))
head(gw)
# annual totals
anlz_gw(contdry, contwet, yrrng = c(2022, 2024), summtime = 'year')
## Not run:
# update rasters from FDEP for a new year, then compute loads
pot_dry <- util_gw_getcontour("dry", 2025)
pot_wet <- util_gw_getcontour("wet", 2025)
gw <- anlz_gw(pot_dry, pot_wet, yrrng = c(2025, 2025))
# pass concentrations from the Water Atlas API
gw <- anlz_gw(pot_dry, pot_wet, yrrng = c(2025, 2025),
  wqdat = util_gw_getwq())
## End(Not run)
Calculate IPS loads and summarize
anlz_ips(
  fls,
  summ = c("entity", "facility", "basin", "segment", "all"),
  summtime = c("month", "year")
)
| fls | vector of file paths to raw entity data, one to many |
|---|---|
| summ | character string indicating how the returned data are summarized, see details |
| summtime | character string indicating how the returned data are summarized temporally (month or year), see details |
Input data files in fls are first processed by anlz_ips_facility to calculate IPS loads for each facility and outfall. The data are summarized differently based on the summ and summtime arguments. All loading data are summed based on these arguments, e.g., by bay segment (summ = 'segment') and year (summtime = 'year'). Options for summ are 'entity' to summarize by entity, 'facility' to summarize by facility, 'basin' to summarize by drainage basin (retains the basin column for use with anlz_nps_psremove), 'segment' to summarize by bay segment, and 'all' to summarize across all segments.
data frame with loading data for TP, TN, TSS, and BOD as tons per month/year and hydro load as million cubic meters per month/year
fls <- list.files(system.file('extdata/', package = 'tbeploads'),
  pattern = 'ps_ind_', full.names = TRUE)
anlz_ips(fls)
Calculate IPS loads from raw facility data
anlz_ips_facility(fls)
| fls | vector of file paths to raw facility data, one to many |
|---|---|
Input data should include flow as million gallons per day, and conc as mg/L. Steps include:
Multiply flow by days in month to get million gallons per month
Multiply flow by 3785.412 to get cubic meters per month
Multiply conc by flow and divide by 1000 to get kg var per month
Multiply m3 by 1000 to get L, then divide by 1e6 to convert mg to kg, same as dividing by 1000
data frame with loading data for TP, TN, TSS, and BOD as tons per month and hydro load as million cubic meters per month. Information for each entity, facility, and outfall is retained.
fls <- list.files(system.file('extdata/', package = 'tbeploads'),
  pattern = 'ps_ind_', full.names = TRUE)
anlz_ips_facility(fls)
Calculate material loss (ML) loads and summarize
anlz_ml(
  fls,
  summ = c("entity", "facility", "segment", "all"),
  summtime = c("month", "year")
)
| fls | vector of file paths to raw entity data, one to many |
|---|---|
| summ | character string indicating how the returned data are summarized, see details |
| summtime | character string indicating how the returned data are summarized temporally (month or year), see details |
Input data files in fls are first processed by anlz_ml_facility to calculate ML loads for each facility. The data are summarized differently based on the summ and summtime arguments. All loading data are summed based on these arguments, e.g., by bay segment (summ = 'segment') and year (summtime = 'year'). Options for summ are 'entity' to summarize by entity only, 'facility' to summarize by facility only, 'segment' to summarize by bay segment, and 'all' to summarize total load. Options for summtime are 'month' to summarize by month and 'year' to summarize by year. The default is to summarize by entity and month.
data frame with loading data for TN as tons per month/year. Columns for TP, TSS, BOD, and hydrologic load are also returned with zero load for consistency with other point source load calculation functions.
fls <- list.files(system.file('extdata/', package = 'tbeploads'),
  pattern = 'ps_indml', full.names = TRUE)
anlz_ml(fls)
Calculate material loss (ML) loads from raw facility data
anlz_ml_facility(fls)
| fls | vector of file paths to raw facility data, one to many |
|---|---|
Input data should be one row per year per facility, where the row shows the total tons per year of total nitrogen loss. Input files are often created by hand based on reported annual tons of nitrogen shipped at each facility. The material losses as tons/yr are estimated from the tons shipped using an agreed upon loss rate. Values reported in the example files represent the estimated loss as the total tons of N shipped each year multiplied by 0.0023 and divided by 2000. The total N shipped at a facility each year can be obtained using a simple back-calculation (multiply by 2000, divide by 0.0023).
data frame that is nearly identical to the input data except results are shown as monthly load as the annual loss estimate divided by 12. This is for consistency of reporting with other loading sources.
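The loss-rate arithmetic described above, including the back-calculation and the division into monthly loads, can be checked with a short example. The shipped tonnage is a made-up value.

```r
# Hypothetical total tons of N shipped at a facility in one year
shipped_tons <- 1e6

# Estimated material loss (tons/yr) using the rate described above
loss_tons_yr <- shipped_tons * 0.0023 / 2000

# Monthly load reported for consistency with other sources
loss_tons_mo <- loss_tons_yr / 12

# Back-calculate the tons shipped from the reported loss
shipped_back <- loss_tons_yr * 2000 / 0.0023

c(loss_tons_yr = loss_tons_yr, loss_tons_mo = loss_tons_mo,
  shipped_back = shipped_back)
```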
fls <- list.files(system.file('extdata/', package = 'tbeploads'),
  pattern = 'ps_indml', full.names = TRUE)
anlz_ml_facility(fls)
Calculate non-point source (NPS) loads for Tampa Bay
anlz_nps(
  yrrng = c("2021-01-01", "2023-12-31"),
  tbbase,
  rain,
  mancopth = NULL,
  pincopth = NULL,
  lakemanpth = NULL,
  tampabypth = NULL,
  bellshlpth = NULL,
  vernafl,
  allflo = NULL,
  allwq = NULL,
  usgsflow = NULL,
  summ = c("basin", "segment", "all"),
  summtime = c("month", "year"),
  aslu = FALSE,
  verbose = TRUE
)
| yrrng | A vector of two dates in 'YYYY-MM-DD' format, specifying the date range to retrieve flow data. Default is from '2021-01-01' to '2023-12-31'. |
|---|---|
| tbbase | data frame containing polygon areas for the combined data layer of bay segment, basin, jurisdiction, land use data, and soils, see details |
| rain | data frame of rainfall data, see details |
| mancopth | character, path to the Manatee County water quality data file, see details |
| pincopth | character, path to the Pinellas County water quality data file, see details |
| lakemanpth | character, path to the file containing the Lake Manatee flow data, see details |
| tampabypth | character, path to the file containing the Tampa Bypass flow data, see details |
| bellshlpth | character, path to the file containing the Bell Shoals data, see details |
| vernafl | character vector of file path to Verna Wellfield atmospheric concentration data |
| allflo | data frame of flow data, if already available from util_nps_getflow |
| allwq | data frame of water quality data, if already available from util_nps_getwq |
| usgsflow | data frame of USGS flow data, if already available from util_nps_getusgsflow |
| summ | character string indicating how the returned data are summarized, see details |
| summtime | character string indicating how the returned data are summarized temporally (month or year), see details |
| aslu | logical indicating whether to summarize by land use type (ungaged loads only), default is FALSE |
| verbose | logical indicating whether to print verbose output |
The function estimates non-point source (NPS) loads for Tampa Bay by combining ungaged and gaged NPS loads. Ungaged loads are estimated using rainfall, flow, event mean concentration, land use, and soils data, while gaged loads are estimated using water quality data and flow data. The function also incorporates atmospheric concentration data from the Verna Wellfield site.
The data are summarized according to the summ and summtime arguments. All loading data are summed based on these arguments, e.g., by bay segment (summ = 'segment') and year (summtime = 'year'). Options for summ are 'basin' to summarize across sub-basins within bay segments, 'segment' to summarize by bay segment, and 'all' to summarize total load. Setting aslu = TRUE additionally breaks out the loads by land use type within the chosen summ and summtime levels. Land use type summaries only apply to ungaged load estimates. Options for summtime are 'month' to summarize by month and 'year' to summarize by year. The default is to summarize by basin and month.
The following functions are used internally and are provided here for reference on the components used in the calculations:
anlz_nps_ungaged: Estimates ungaged NPS loads.
anlz_nps_gaged: Estimates gaged NPS loads.
util_nps_fillmiswq: Fills missing water quality data with linear interpolation.
util_nps_getflow: Gets flow estimates for NPS gaged and ungaged calculations.
util_nps_getusgsflow: Gets USGS flow data for NPS calculations, used in util_nps_getflow.
util_nps_getextflow: Gets external flow data (Lake Manatee, Tampa Bypass, and Bell Shoals), used in util_nps_getflow.
util_nps_getwq: Gets water quality data for NPS gaged calculations.
util_nps_preprain: Prepares and formats rainfall data.
util_nps_preplog: Prepares land use data for logistic regression modeling.
util_nps_segment: Assigns basins to bay segments.
util_prepverna: Prepares and fills missing data with five-year means for the Verna Wellfield site data.
A data frame of non-point source loads for Tampa Bay. With the default values for the summ and summtime arguments, the columns are year, month, bay segment, basin, and loads for total nitrogen (TN), total phosphorus (TP), total suspended solids (TSS), biochemical oxygen demand (BOD), and hydrology. TN, TP, TSS, and BOD loads are in tons per month or year depending on the summtime argument. Hydrologic loads are in million cubic meters per month or year depending on the summtime argument.
data(tbbase)
data(rain)
data(allwq)
data(allflo)
vernafl <- system.file('extdata/verna-raw.csv', package = 'tbeploads')
nps <- anlz_nps(
  yrrng = c('2021-01-01', '2023-12-31'),
  tbbase = tbbase,
  rain = rain,
  allwq = allwq,
  allflo = allflo,
  vernafl = vernafl
)
head(nps)
Calculate non-point source (NPS) loads for gaged basins
anlz_nps_gaged(
  yrrng = c("2021-01-01", "2023-12-31"),
  mancopth = NULL,
  pincopth = NULL,
  lakemanpth = NULL,
  tampabypth = NULL,
  bellshlpth = NULL,
  allflo = NULL,
  allwq = NULL,
  usgsflow = NULL,
  verbose = TRUE
)
| yrrng | A vector of two dates in 'YYYY-MM-DD' format, specifying the date range to retrieve flow data. Default is from '2021-01-01' to '2023-12-31'. |
|---|---|
| mancopth | character, path to the Manatee County water quality data file, see details |
| pincopth | character, path to the Pinellas County water quality data file, see details |
| lakemanpth | character, path to the file containing the Lake Manatee flow data |
| tampabypth | character, path to the file containing the Tampa Bypass flow data |
| bellshlpth | character, path to the file containing the Bell Shoals data |
| allflo | data frame of flow data, if already available from util_nps_getflow |
| allwq | data frame of water quality data, if already available from util_nps_getwq |
| usgsflow | data frame of USGS flow data, if already available from util_nps_getusgsflow |
| verbose | logical indicating whether to print verbose output |
The function uses util_nps_getflow to retrieve flow data and util_nps_getwq to retrieve water quality data. It then combines these datasets and calculates loads for TN, TP, TSS, BOD, and hydrologic load. See the help files for each function for more details.
Required external data inputs are Lake Manatee, Tampa Bypass, and Alafia River Bell Shoals flow data. These are not available from the USGS API and must be obtained from the contacts listed in util_nps_getextflow. USGS flow data are for stations 02299950, 02300042, 02300500, 02300700, 02301000, 02301300, 02301500, 02301750, 02303000, 02303330, 02304500, 02306647, 02307000, 02307359, and 02307498. The USGS flow data are from the NWIS database as returned by read_waterdata_daily using util_nps_getusgsflow. A preprocessed USGS flow data frame can be provided using the usgsflow argument to avoid re-downloading the data. All inputs for flow can be superseded by providing a complete flow data frame using the allflo argument.
Water quality data are obtained from the FDEP WIN database API, tbeptools, or local files as described in util_nps_getwq. Chosen stations are ER2 and UM2 for Manatee County and station 06-06 for Pinellas County. Environmental Protection Commission (EPC) of Hillsborough County stations retained are 105, 113, 114, 132, 141, 138, 142, and 147. Manatee or Pinellas County data can be imported from local files using the mancopth and pincopth arguments, respectively. If these are not provided, the function will attempt to retrieve data from the FDEP WIN database using read_importwqwin from tbeptools. The EPC data are retrieved using read_importepc from tbeptools. All inputs for water quality can be superseded by providing a complete water quality data frame using the allwq argument.
The function assumes that the water quality data are in mg/L and flow data are in cfs. Missing water quality data are filled with the previous five-year averages for the end months, then linearly interpolated using util_nps_fillmiswq.
A data frame with columns for basin, year, month, TN in mg/L, TP in mg/L, TSS in mg/L, BOD in mg/L, flow in liters/month, hydrologic load in m3/month, TN load in kg/month, TP load in kg/month, TSS load in kg/month, and BOD load in kg/month.
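The unit handling described above can be illustrated with a minimal sketch. It assumes a monthly mean flow in cfs and a monthly concentration in mg/L; the function names and input values are hypothetical and are not part of the package.

```r
# Sketch only: convert a monthly mean flow (cfs) and a concentration (mg/L)
# to a monthly load (kg). Function names and inputs are hypothetical.
cfs_to_liters_month <- function(flow_cfs, days_in_month) {
  liters_per_cubic_foot <- 28.316846592
  # cubic feet per second -> liters per day -> liters per month
  flow_cfs * liters_per_cubic_foot * 86400 * days_in_month
}

load_kg_month <- function(conc_mgl, flow_cfs, days_in_month) {
  liters <- cfs_to_liters_month(flow_cfs, days_in_month)
  conc_mgl * liters / 1e6 # mg -> kg
}

# 100 cfs at 1.2 mg/L TN over a 30-day month
tn_kg <- load_kg_month(conc_mgl = 1.2, flow_cfs = 100, days_in_month = 30)

# hydrologic load: liters -> cubic meters
h2o_m3 <- cfs_to_liters_month(100, 30) / 1000
```

The package may aggregate daily flows differently before applying concentrations; this sketch only shows the unit conversions implied by the output columns.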
data(allwq)
data(allflo)
nps_gaged <- anlz_nps_gaged(
  yrrng = c('2021-01-01', '2023-12-31'),
  allflo = allflo,
  allwq = allwq
)
head(nps_gaged)
Subtract gaged industrial and domestic point source loads from NPS model output to isolate true non-point source loads.
anlz_nps_psremove(
  nps,
  ips,
  dps,
  ad_ap = TRUE,
  summ = c("segment", "basin"),
  summtime = c("month", "year")
)
nps |
data frame of NPS loads from |
ips |
data frame of IPS loads from |
dps |
data frame of DPS loads from |
ad_ap |
logical, whether to apply fixed monthly AD/AP TN reductions
from the 2007 RA allocation analysis. Default |
summ |
character, one of |
summtime |
character, one of |
Gaged NPS loads (estimated from stream gauges) include point source loads discharged upstream of the gauge. This function subtracts IPS and DPS loads in gaged basins from the combined NPS model output so that point source contributions are not double-counted.
Only IPS and DPS records in gaged basins (identified via dbasing)
are subtracted. Nested basin identifiers (02301000, 02301300 → 02301500;
02303000, 02303330 → 02304500; 02299950 → LMANATEE) are reassigned to their
parent basins before summing, consistent with the handling in
anlz_nps.
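The reassignment and subtraction can be sketched in base R as follows. The load values and column layout are hypothetical; only the nested-to-parent basin mapping is taken from the text above, and the package's own implementation may differ.

```r
# Hypothetical monthly TN loads (tons); values are illustrative only
nps <- data.frame(
  basin = c("02301500", "02304500", "LMANATEE"),
  tn_load = c(10, 20, 5)
)
ps <- data.frame(
  basin = c("02301000", "02301300", "02303330", "02299950"),
  tn_load = c(1, 2, 3, 0.5)
)

# Nested basin -> parent basin lookup, as listed above
parent <- c(
  "02301000" = "02301500", "02301300" = "02301500",
  "02303000" = "02304500", "02303330" = "02304500",
  "02299950" = "LMANATEE"
)

# Reassign nested basins, leaving all other basins unchanged
remap <- unname(parent[ps$basin])
ps$basin <- ifelse(is.na(remap), ps$basin, remap)

# Sum point source loads by parent basin and subtract from NPS
ps_sum <- aggregate(tn_load ~ basin, data = ps, FUN = sum)
out <- merge(nps, ps_sum, by = "basin", all.x = TRUE,
             suffixes = c("_nps", "_ps"))
out$tn_load_ps[is.na(out$tn_load_ps)] <- 0
out$tn_load <- out$tn_load_nps - out$tn_load_ps
```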
When ad_ap = TRUE, fixed monthly TN reductions from the 2007 RA
allocation analysis (AD/AP) are subtracted from the segment-level NPS totals.
These values represent the annual reduction divided into monthly increments:
-2.41 short tons/month
-4.31 short tons/month
-2.29 short tons/month
-0.36 short tons/month
-2.74 short tons/month (representing the combined reduction for segments 55, 6, and 7 as applied in the 2022-2024 RA)
Other segments (Boca Ciega Bay, Boca Ciega Bay South, Terra Ceia Bay) receive no AD/AP adjustment.
data frame with columns for Year, Month (if
summtime = 'month'), source (always "NPS"),
segment, basin (if summ = 'basin'), tn_load,
tp_load, tss_load, bod_load, and hy_load.
Loads are in short tons per month or year; hydrologic load is in cubic
meters per month or year.
## Not run: 
nps <- anlz_nps(yrrng = c('2021-01-01', '2023-12-31'), tbbase = tbbase,
                rain = rain, allwq = allwq, allflo = allflo, vernafl = vernafl,
                summ = 'basin', summtime = 'month')
ipsfls <- list.files(system.file('extdata/', package = 'tbeploads'),
                     pattern = 'ps_ind_', full.names = TRUE)
dpsfls <- list.files(system.file('extdata/', package = 'tbeploads'),
                     pattern = 'ps_dom', full.names = TRUE)
ips <- anlz_ips(ipsfls, summ = 'basin', summtime = 'month')
dps <- anlz_dps(dpsfls, summ = 'basin', summtime = 'month')
anlz_nps_psremove(nps, ips, dps)
## End(Not run)
Estimated non-point source (NPS) ungaged loads
anlz_nps_ungaged(
  yrrng = c("2021-01-01", "2023-12-31"),
  tbbase,
  rain,
  lakemanpth = NULL,
  tampabypth = NULL,
  bellshlpth = NULL,
  allflo = NULL,
  usgsflow = NULL,
  verbose = TRUE
)
yrrng |
A vector of two dates in 'YYYY-MM-DD' format, specifying the date range to retrieve flow data. Default is from '2021-01-01' to '2023-12-31'. |
tbbase |
data frame containing polygon areas for the combined data layer of bay segment, basin, jurisdiction, land use data, and soils, see details |
rain |
data frame of rainfall data, see details |
lakemanpth |
character, path to the file containing the Lake Manatee flow data, see details |
tampabypth |
character, path to the file containing the Tampa Bypass flow data, see details |
bellshlpth |
character, path to the file containing the Bell Shoals data, see details |
allflo |
data frame of flow data, if already available from |
usgsflow |
data frame of USGS flow data, if already available from |
verbose |
logical indicating whether to print verbose output |
This function estimates pollutant loads from non-point sources in ungaged (unmonitored) basins within the Tampa Bay watershed. The approach combines spatial land use data, rainfall patterns, hydrologic modeling, and empirical relationships to estimate monthly nutrient and sediment loads. Several steps are followed:
Data Preparation: Processes land use data for logistic regression modeling and calculates inverse distance-weighted rainfall data for each sub-basin.
Flow Estimation: Uses a logistic regression model to predict monthly streamflow in ungaged basins based on:
Current month rainfall and 2-month lagged rainfall
Land use percentages (urban, agriculture, wetlands, forest)
Seasonal patterns (wet season: July-October, dry season: November-June)
Urban development intensity (Group A: <19% urban, Group B: ≥19% urban)
Hydrologic soil group characteristics
Runoff Coefficient Application: Applies land use and soil-specific runoff coefficients to distribute predicted flows across different landscape types within each basin.
Load Calculation: Estimates pollutant loads using Event Mean Concentrations (EMCs) for different land use categories (CLUCSID), calculating:
Total Nitrogen (TN) loads
Total Phosphorus (TP) loads
Total Suspended Solids (TSS) loads
Biochemical Oxygen Demand (BOD) loads
Stormwater-specific loads (with different EMCs for certain categories)
Spatial Framework:
The analysis uses a nested spatial hierarchy:
Bay Segments: Major subdivisions of Tampa Bay (1-7, with 55 for Lower Boca Ciega Bay)
Basins: Hydrologic sub-basins, including both USGS gaged and ungaged areas
Land Use Polygons: Detailed spatial units combining jurisdiction, land use (FLUCCS codes), and soil characteristics
Flow Prediction Models:
The function uses season- and development-specific regression equations:
Separate models for wet vs. dry seasons
Separate models for low-development (Group A) vs. high-development (Group B) areas
Models account for antecedent moisture conditions through lagged rainfall terms
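As a rough illustration of how a basin-month might be matched to one of the four season and development combinations, the sketch below uses the wet season months and the 19% urban threshold stated above. The function name and output labels are hypothetical and the regression coefficients themselves are not shown.

```r
# Sketch only: label basin-months by season and development group.
# Wet season is July-October; Group A is < 19% urban, Group B is >= 19%.
classify_model <- function(month, pct_urban) {
  season <- ifelse(month %in% 7:10, "wet", "dry")
  group <- ifelse(pct_urban < 19, "A", "B")
  paste(season, group, sep = "_")
}

labels <- classify_model(
  month = c(1, 7, 10, 11),
  pct_urban = c(5, 19, 18.9, 40)
)
```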
Load Estimation:
Pollutant loads are calculated as: Load = Flow × EMC × Unit Conversions
Where EMCs vary by land use category (CLUCSID). Special handling is applied for water bodies and certain wetland types (CLUCSIDs 18, 20), which are assigned zero stormwater loads.
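A minimal sketch of this calculation is given below, assuming water load in cubic meters and EMCs in mg/L so that the result is in kg. The function name, EMC values, and data layout are hypothetical; the zeroing of CLUCSIDs 18 and 20 follows the text above.

```r
# Sketch only: load (kg) = water load (m3) x EMC (mg/L) x unit conversions
# m3 * 1000 L/m3 * mg/L / 1e6 mg/kg simplifies to m3 * mg/L / 1000
emc_load_kg <- function(h2o_m3, emc_mgl, clucsid, zero_ids = c(18, 20)) {
  load <- h2o_m3 * emc_mgl / 1000
  load[clucsid %in% zero_ids] <- 0 # no stormwater load for these categories
  load
}

# Hypothetical water loads and TN EMCs by land use category
dat <- data.frame(
  clucsid = c(1, 18, 20),
  h2o_m3 = c(5000, 2000, 1000),
  tn_emc = c(1.5, 0.8, 0.8)
)
dat$tnload <- emc_load_kg(dat$h2o_m3, dat$tn_emc, dat$clucsid)
```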
Requires the following inputs:
tbbase: A data frame containing polygon areas for the combined data layer of bay segment, basin, jurisdiction, land use data, and soils. Stored as tbbase or created (takes an hour or so) with util_nps_tbbase.
rain: A data frame of rainfall data. See rain.
lakemanpth: character, path to the file containing the Lake Manatee flow data. See util_nps_getextflow. Only applies if allflo is not provided.
tampabypth: character, path to the file containing the Tampa Bypass flow data. See util_nps_getextflow. Only applies if allflo is not provided.
bellshlpth: character, path to the file containing the Bell Shoals data. See util_nps_getextflow. Only applies if allflo is not provided.
USGS gaged flows are also used, as returned by util_nps_getusgsflow and combined with the external flow data using util_nps_getextflow and util_nps_getflow.
A data frame with monthly pollutant load estimates containing the following columns:
bay_seg: Bay segment identifier (1: Old Tampa Bay, 2: Hillsborough Bay, 3: Middle Tampa Bay, 4: Lower Tampa Bay, 5: Upper Boca Ciega Bay, 6: Terra Ceia Bay, 7: Manatee River, 55: Lower Boca Ciega Bay)
basin: Basin identifier (USGS gage number or internal code)
yr: Year
mo: Month (1-12)
clucsid: Consolidated Land Use Classification System ID
h2oload: Water load (cubic meters)
tnload: Total nitrogen load (kg)
tpload: Total phosphorus load (kg)
tssload: Total suspended solids load (kg)
bodload: Biochemical oxygen demand load (kg)
area: Land use area (hectares)
bas_area: Total basin area (hectares)
# external flow sources
data(tbbase)
data(rain)
data(allflo)
nps_ungaged <- anlz_nps_ungaged(
  yrrng = c('2021-01-01', '2023-12-31'),
  tbbase = tbbase,
  rain = rain,
  allflo = allflo
)
head(nps_ungaged)
Calculate spring loads to Hillsborough Bay
anlz_spr(
  tbwxlpth,
  wqpth = NULL,
  yrrng = c(2022, 2024),
  summ = c("spring", "basin", "segment"),
  summtime = c("month", "year"),
  sulphurflow = NULL,
  verbose = TRUE
)
tbwxlpth |
character string, file path to the Tampa Bay Water discharge Excel
workbook (.xlsx) for Lithia and Buckhorn springs. The workbook must contain one sheet
per device, named by device ID: 3381 (Lithia Minor), 4586 (Lithia Major),
3388 (Buckhorn Upper), and 3649 (Buckhorn Lower). Each sheet must contain
columns |
wqpth |
character string or |
yrrng |
integer vector of length 2, start and end year for the analysis,
e.g. |
summ |
chr string indicating how the returned data are summarized, see details |
summtime |
chr string indicating how the returned data are summarized temporally (month or year), see details |
sulphurflow |
data frame of daily Sulphur Spring discharge or |
verbose |
logical, if |
Loads are calculated for Lithia, Buckhorn, and Sulphur springs, all of which discharge to Hillsborough Bay (bay segment 2).
Discharge data (Lithia and Buckhorn):
The Excel workbook supplied in tbwxlpth contains one sheet per device.
Device IDs map to sub-springs as follows: 3381 = Lithia Minor, 4586 = Lithia
Major, 3388 = Buckhorn Upper, 3649 = Buckhorn Lower. Flow values in MGD are
converted to CFS (1 MGD = 1.547 CFS); values already in CFS are used as-is.
Lithia total flow is the sum of Minor and Major. Buckhorn total flow is Lower
minus Upper, because the two gauges bracket the spring input on the same
stream reach.
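The conversion and the two spring totals can be sketched as follows. The data frame layout, the units column, and the flow values are hypothetical; the device IDs, the conversion factor, and the sum and difference rules come from the description above.

```r
# Sketch only: convert MGD to CFS and combine device flows by spring
mgd_to_cfs <- function(x) x * 1.547

# Hypothetical single-day values by device ID
flow <- data.frame(
  device = c(3381, 4586, 3388, 3649),
  value = c(2, 20, 5, 12),
  units = c("MGD", "MGD", "CFS", "CFS")
)

# Values already in CFS are used as-is
flow$cfs <- ifelse(flow$units == "MGD", mgd_to_cfs(flow$value), flow$value)

# Lithia = Minor (3381) + Major (4586)
lithia_cfs <- flow$cfs[flow$device == 3381] + flow$cfs[flow$device == 4586]

# Buckhorn = Lower (3649) - Upper (3388)
buckhorn_cfs <- flow$cfs[flow$device == 3649] - flow$cfs[flow$device == 3388]
```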
Contact for gage data is Cathleen Jonas, [email protected]. Device IDs 3381, 4586, 3388, and 3649 should be bundled with requests for Tampa Bypass Canal data (device ID 957) and Bell Shoals data (device ID 4626) used in the NPS workflow.
Discharge data (Sulphur Springs):
Daily CFS values for station 02306000 are retrieved from the USGS NWIS API
via util_nps_getusgsflow. A data frame can also be
supplied via the sulphurflow argument.
Interpolation:
Because springs are assumed never to have zero discharge, all gaps in the
daily discharge record are filled by linear interpolation between observed
values (na.approx with rule = 2). Leading or
trailing gaps are filled with the nearest observed value.
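The gap-filling behavior can be illustrated without the zoo package. In this sketch, base R approx with rule = 2 stands in for na.approx: interior gaps are interpolated linearly and leading or trailing gaps take the nearest observed value. The helper name and flow values are hypothetical.

```r
# Sketch only: fill gaps in a daily series using base R approx()
fill_flow <- function(x) {
  idx <- seq_along(x)
  ok <- !is.na(x)
  # rule = 2 extends the nearest observed value beyond the data range
  approx(idx[ok], x[ok], xout = idx, rule = 2)$y
}

filled <- fill_flow(c(NA, 10, NA, NA, 16, NA))
```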
Water quality data (file path):
When wqpth is supplied, sample concentrations (mg/L) for TN and TP
are read from the CSV. These data are from FDEP's Impaired Waters Rule
dataset available at https://publicfiles.dep.state.fl.us/dear/iwr/.
Annual mean concentrations are computed per spring and joined to monthly flow
estimates. A spring-year is considered complete when its samples span all four
calendar quarters (Jan-Mar, Apr-Jun, Jul-Sep, Oct-Dec). Spring-years that are
entirely missing or whose samples do not cover all four quarters are filled by
carrying forward the most recent complete year's mean. The file should therefore
include data from years prior to the focal period so that every spring has at
least one complete reference year available. If the earliest year in the file
is already incomplete for a spring, there is no prior year to carry forward and
an error is raised.
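The completeness rule and carry-forward step can be sketched for a single spring as shown below. Sample dates and concentrations are hypothetical, and the sketch handles only incomplete years that have at least one sample; years that are entirely absent would need to be added as rows before the carry-forward loop.

```r
# Sketch only: annual means with carry-forward for incomplete years
# Hypothetical TN samples (mg/L) for one spring
wq <- data.frame(
  date = as.Date(c(
    "2021-02-10", "2021-05-12", "2021-08-15", "2021-11-20",
    "2022-03-01", "2022-06-15",
    "2023-01-20", "2023-04-18", "2023-07-25", "2023-10-30"
  )),
  tn = c(2.0, 2.2, 2.4, 2.6, 3.0, 3.2, 2.8, 3.0, 3.2, 3.4)
)
wq$year <- as.integer(format(wq$date, "%Y"))
wq$qtr <- (as.integer(format(wq$date, "%m")) - 1) %/% 3 + 1

# Annual mean and number of distinct calendar quarters sampled
ann <- aggregate(tn ~ year, data = wq, FUN = mean)
nq <- aggregate(qtr ~ year, data = wq, FUN = function(x) length(unique(x)))
ann <- merge(ann, nq, by = "year")

# A year is complete only if all four quarters are sampled
ann$tn[ann$qtr < 4] <- NA

# Carry the most recent complete year's mean forward
if (is.na(ann$tn[1])) stop("earliest year is incomplete")
for (i in seq_along(ann$tn)[-1]) {
  if (is.na(ann$tn[i])) ann$tn[i] <- ann$tn[i - 1]
}
```

Here 2022 has samples in only two quarters, so it receives the 2021 mean.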
Water quality data (API, wqpth = NULL):
When wqpth is NULL, water quality data are obtained using
util_spr_getwq. Lithia (SWFWMD station
17805, Lithia Main Spring) and Buckhorn (SWFWMD station 18276, Buckhorn Main
Spring) concentrations are retrieved from the
Water Atlas API
(WIN_21FLSWFD data source). These are probably the same quarterly SWFWMD
observations included in the FDEP IWR file. Sulphur Spring (EPC station
174) is retrieved via read_importepc, providing
monthly observations from the Environmental Protection Commission of
Hillsborough County.
TSS concentrations:
When wqpth is supplied, TSS concentrations are assigned from a fixed
lookup table derived from the historical SAS-based loading model (SPRMOD2).
When wqpth = NULL, TSS values from the API or EPC source are used
where available. Any spring-year with no observed TSS falls back to the same
fixed values. The fixed values are: Sulphur Springs (02306000) = 4.4 mg/L,
Buckhorn Springs (02301695) = 4.0 mg/L, Lithia Springs (02301600) = 4.0
mg/L.
BOD concentrations: BOD loads are returned as zero because BOD is not measured at the springs.
Load calculation: Monthly mean flows (CFS) are computed from the complete daily discharge series. Loads are then:
hy_load is converted to million m3 in the final output.
Spatial summaries:
The data are summarized differently based on the summ and summtime arguments. All loading data are summed based on these arguments, e.g., by bay segment (summ = 'segment') and year (summtime = 'year'). For springs, valid options for summ are
'spring' (one row per spring per time period), 'basin' (loads
summed within drainage basins, with Lithia and Buckhorn combined into
"Alafia River" and Sulphur into "Hillsborough River"), and
'segment' (all springs summed to bay segment Hillsborough Bay).
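The basin-level summary can be approximated with base R aggregate, as in the sketch below. The per-spring loads and the spring labels are hypothetical; the spring-to-basin assignment follows the description above.

```r
# Sketch only: summarize hypothetical per-spring TN loads (tons) by basin
spr <- data.frame(
  Year = 2022,
  Month = rep(1:2, each = 3),
  spring = rep(c("Lithia", "Buckhorn", "Sulphur"), times = 2),
  tn_load = c(1.0, 0.2, 0.5, 1.1, 0.3, 0.6)
)

# Spring -> drainage basin lookup
basin_lookup <- c(
  Lithia = "Alafia River",
  Buckhorn = "Alafia River",
  Sulphur = "Hillsborough River"
)
spr$basin <- unname(basin_lookup[spr$spring])

# Monthly totals by basin, then annual totals (Month dropped)
by_basin_month <- aggregate(tn_load ~ Year + Month + basin, data = spr, FUN = sum)
by_basin_year <- aggregate(tn_load ~ Year + basin, data = spr, FUN = sum)
```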
A data frame whose structure depends on summ:
'spring': one row per spring per time period, with columns
Year, Month (dropped for annual), source, segment,
spring, tn_load (tons), tp_load (tons),
tss_load (tons), bod_load (tons), and hy_load (1e6 m3).
'basin': one row per drainage basin per time period, with
columns Year, Month (dropped for annual), source,
segment, basin, tn_load (tons), tp_load (tons),
tss_load (tons), bod_load (tons), and hy_load (1e6 m3).
'segment': one row per bay segment per time period, with
columns Year, Month (dropped for annual), source,
segment, tn_load (tons), tp_load (tons),
tss_load (tons), bod_load (tons), and hy_load (1e6 m3).
For annual output (summtime = 'year'), load columns are summed over
months.
tbwxlpth <- system.file('extdata/sprflow2224.xlsx', package = 'tbeploads')
wqpth <- system.file('extdata/sprwq2224.csv', package = 'tbeploads')

# monthly per-spring loads using a local water quality file
anlz_spr(tbwxlpth = tbwxlpth, wqpth = wqpth, yrrng = c(2022, 2024))

# annual basin-level totals
anlz_spr(tbwxlpth = tbwxlpth, wqpth = wqpth, yrrng = c(2022, 2024),
         summ = 'basin', summtime = 'year')

# monthly segment-level totals
anlz_spr(tbwxlpth = tbwxlpth, wqpth = wqpth, yrrng = c(2022, 2024),
         summ = 'segment')

## Not run: 
# retrieve water quality from APIs automatically (no local file needed)
anlz_spr(tbwxlpth = tbwxlpth, yrrng = c(2022, 2024))
## End(Not run)
Lookup table for FLUCCSCODE conversion to CLUCSID and IMPROVED
clucsid
A data frame
Used to create the tbbase combined layer with jurisdictions, land use, soils, and sub-basins, used in util_nps_tbbase.
FLUCCSCODE: Numeric value for the Florida Land Use, Cover and Forms Classification System (FLUCCS) code
CLUCSID: Numeric value for the coastal land use code, from JEI
IMPROVED: Numeric value for whether the code is improved (1) or not (0)
DESCRIPTION: Character description of the CLUCSID code
## Not run: 
clucsid <- read.csv('data-raw/clucsid.csv', stringsAsFactors = FALSE, header = TRUE)
save(clucsid, file = 'data/clucsid.RData', compress = 'xz')
## End(Not run)
clucsid
Conservation land correction fractions for NPS/MS4 allocation assessment
conserv_correction
A data frame with one row per unique bay segment, basin, entity, and CLUCSID combination where conservation land is present:
Integer bay segment identifier (1 = OTB, 2 = HB, 3 = MTB, 4 = LTB, 55 = RALTB).
Character drainage basin identifier.
MS4 jurisdiction name.
Integer Coastal Land Use Classification System identifier.
Fraction of entity area x runoff-coefficient attributable to conservation land within that bay segment / basin / CLUCSID combination. Computed as conservation area x RC divided by total entity area x RC (conservation + non-conservation) for that group.
The tbeploads-built tbbase is derived from routinely updated
GIS sources (land use, soils, jurisdictions) and does not include a
conservation land spatial overlay. The conservation layer was available
only for prior SAS workflows and cannot be reproduced.
conserv_correction provides entity-specific fractions derived from
the SAS land cover file (npsag_3_2224_25Sep25.sas7bdat), which
includes a binary conservation column (0/1) indicating conservation
land while retaining the original MS4 jurisdiction in entity. Within
anlz_aa, each MS4 entity's disaggregated TN load for a given
basin and CLUCSID is scaled by (1 - conserv_frac) to remove the
conservation land contribution before hydrologic normalization.
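The fraction and its use can be sketched for one bay segment, basin, entity, and CLUCSID group. Polygon areas, runoff coefficients, and the TN load below are hypothetical; the ratio and the (1 - conserv_frac) scaling follow the definitions above.

```r
# Sketch only: conservation fraction for one group of polygons
poly <- data.frame(
  area_ha = c(30, 10, 60),
  rc = c(0.2, 0.2, 0.3),
  conservation = c(1, 0, 0) # 1 = conservation land
)

is_cons <- poly$conservation == 1
num <- sum(poly$area_ha[is_cons] * poly$rc[is_cons]) # conservation area x RC
den <- sum(poly$area_ha * poly$rc)                   # total area x RC
conserv_frac <- num / den

# Remove the conservation land contribution from a hypothetical TN load (tons)
tn_load <- 12
tn_adj <- tn_load * (1 - conserv_frac)
```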
Preprocessing matches util_aa_npsfactors: non-contributing
drainage (drnfeat = "NONCON") and water / tidal CLUCSIDs (17, 21, 22)
are excluded, compound hydrologic soil groups are simplified, and nested basins
are remapped. Only entity, basin, CLUCSID combinations with
conserv_frac > 0 are retained.
Derived from data-raw/npsag_3_2224_25Sep25.sas7bdat, the SAS
NPS land cover file. Built by data-raw/conserv_correction.R.
anlz_aa, tbbase,
util_aa_npsfactors
Upper Floridan Aquifer potentiometric surface raster, dry season 2022
contdry
A PackedSpatRaster (see wrap)
representing a 1-mile resolution grid of potentiometric head (ft above
MSL) for the dry season. Unwrap with terra::unwrap(contdry) to
obtain a SpatRaster.
Interpolated from Upper Floridan Aquifer potentiometric surface contour
lines (May 2022) downloaded from the FDEP / Florida Geological Survey
ArcGIS REST service using util_gw_getcontour. The spatial
extent covers the Tampa Bay watershed (tbfullshed) buffered
outward by 40 miles. Interpolation used inverse distance weighting (IDW,
5-mile radius, power = 2) followed by five passes of a 3x3 focal mean gap
fill. Projection is NAD83(2011) / Florida West (ftUS), CRS 6443.
Wet season equivalent is contwet.
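The gap-fill step can be illustrated on a plain matrix. This is a sketch under the assumption that each pass replaces only missing cells with the mean of the non-missing values in their 3x3 neighborhood; the function name is hypothetical and the raster-based implementation used to build the dataset may differ in edge handling.

```r
# Sketch only: one pass of a 3x3 focal mean that fills NA cells only
focal_fill <- function(m) {
  out <- m
  nr <- nrow(m)
  nc <- ncol(m)
  for (i in seq_len(nr)) {
    for (j in seq_len(nc)) {
      if (is.na(m[i, j])) {
        # 3x3 window clipped at the matrix edges
        win <- m[max(1, i - 1):min(nr, i + 1), max(1, j - 1):min(nc, j + 1)]
        if (any(!is.na(win))) out[i, j] <- mean(win, na.rm = TRUE)
      }
    }
  }
  out
}

# Hypothetical 5x5 grid with a single known value in the center
m <- matrix(NA_real_, 5, 5)
m[3, 3] <- 10

one_pass <- focal_fill(m)

# Five passes, as described above
filled <- m
for (pass in 1:5) filled <- focal_fill(filled)
```

Each pass extends the filled area by one cell in every direction, which is why repeated passes close progressively larger gaps.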
## Not run: 
pot_dry <- util_gw_getcontour("dry", 2022)
contdry <- terra::wrap(pot_dry)
save(contdry, file = "data/contdry.RData", compress = "xz")
## End(Not run)
terra::unwrap(contdry)
Upper Floridan Aquifer potentiometric surface raster, wet season 2022
contwet
A PackedSpatRaster (see wrap)
representing a 1-mile resolution grid of potentiometric head (ft above
MSL) for the wet season. Unwrap with terra::unwrap(contwet) to
obtain a SpatRaster.
Interpolated from Upper Floridan Aquifer potentiometric surface contour
lines (September 2022) downloaded from the FDEP / Florida Geological Survey
ArcGIS REST service using util_gw_getcontour. The spatial
extent covers the Tampa Bay watershed (tbfullshed) buffered
outward by 40 miles. Interpolation used inverse distance weighting (IDW,
5-mile radius, power = 2) followed by five passes of a 3x3 focal mean gap
fill. Projection is NAD83(2011) / Florida West (ftUS), CRS 6443.
Dry season equivalent is contdry.
## Not run: 
pot_wet <- util_gw_getcontour("wet", 2022)
contwet <- terra::wrap(pot_wet)
save(contwet, file = "data/contwet.RData", compress = "xz")
## End(Not run)
terra::unwrap(contwet)
Basin information for coastal subbasin codes
dbasing
A data.frame
Used for domestic point source summaries, bay segments are as follows:
1: Old Tampa Bay
2: Hillsborough Bay
3: Middle Tampa Bay
4: Lower Tampa Bay
5: Upper Boca Ciega Bay
6: Terra Ceia Bay
7: Manatee River
55: Lower Boca Ciega Bay
See "data-raw/dbasing.R" for creation.
dbasing
TBNMC TN load allocations for DPS domestic wastewater facilities
dps_allocations
A data.frame
TN load allocations assigned to individual domestic point source (DPS) facilities under the Tampa Bay Nitrogen Management Consortium (TBNMC) framework.
entity: Short entity name matching the facilities
table convention (e.g., "Clearwater", "Hillsborough Co.")
entity_full: Full entity name as listed in the source
allocation file (e.g., "City of Clearwater",
"Hillsborough County")
facname: Facility name matching the facilities
table convention
bay_seg: Integer bay segment identifier (1 = Old Tampa Bay,
2 = Hillsborough Bay, 3 = Middle Tampa Bay, 4 = Lower Tampa Bay,
55 = Remaining Lower Tampa Bay)
source: DPS discharge type; one of "DPS - end of pipe"
(direct surface water discharge) or "DPS - reuse" (reclaimed
water reuse)
alloc_tons: Allocation in tons TN per year
TECO Big Bend and Tropicana are not included: TECO is an industrial reuse
customer rather than a direct discharger, and Tropicana is classified as an
industrial point source in the facilities table. Neither can
be matched to DPS load data from anlz_dps_facility.
dps_allocations
Event Mean Concentration (EMC) data for CLUCSID in Tampa Bay
emc
A data.frame
Used for non-point source (NPS) ungaged estimates summaries.
Values are grouped by CLUCSID and include mean TN, TP, TSS, and BOD.
See "data-raw/emc.R" for creation.
emc
Domestic and industrial point source facilities, including industrial with material losses
facilities
A data.frame
facilities
Historic 1992-1994 mean total water load baseline by bay segment and basin
hydro_baseline
A data.frame
Mean total water load for the 1992-1994 baseline period, used for hydrologic normalization in the allocation assessment. Values are in million cubic meters per year.
bay_seg: Integer bay segment identifier
basin: Drainage basin identifier
mean_h2o_9294: Mean 1992-1994 total water load (million m3/yr)
See "data-raw/hydro_baseline.R" for creation.
hydro_baseline
TBNMC TN load allocations for industrial material loss (ML) facilities
ml_allocations
A data.frame
TN load allocations assigned to industrial material loss facilities under the Tampa Bay Nitrogen Management Consortium (TBNMC) framework.
entity: Entity name matching the facilities
table convention
facname: Facility name matching the facilities
table convention; NA for shared-allocation groups (see below)
bay_seg: Integer bay segment identifier (2 = Hillsborough Bay,
4 = Lower Tampa Bay)
alloc_tons: Allocation in tons TN per year
ishared: Logical; TRUE when the allocation is shared
across multiple facilities. When TRUE, the combined load from all
facilities belonging to the same entity and bay segment is compared to
the single alloc_tons value.
The three Mosaic material loss facilities (Big Bend, Riverview, Tampa Marine)
share a single 3.30 ton/year allocation in Hillsborough Bay; they are
represented by one row (ishared = TRUE, facname = NA).
All other entries are non-shared (ishared = FALSE) with one row per
facility.
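The difference between shared and non-shared comparisons can be sketched as follows. Facility loads, the "Other Co." entry, and the exact entity and facility strings are hypothetical; the 3.30 ton/year shared allocation and the three facility names come from the text above.

```r
# Sketch only: compare facility TN loads (tons/yr) to allocations
loads <- data.frame(
  entity = c("Mosaic", "Mosaic", "Mosaic", "Other Co."),
  facname = c("Big Bend", "Riverview", "Tampa Marine", "Plant X"),
  bay_seg = c(2, 2, 2, 4),
  tn_load = c(0.9, 1.2, 0.8, 0.4)
)
alloc <- data.frame(
  entity = c("Mosaic", "Other Co."),
  facname = c(NA, "Plant X"),
  bay_seg = c(2, 4),
  alloc_tons = c(3.30, 0.50),
  ishared = c(TRUE, FALSE)
)

# Shared allocations: sum loads over entity and bay segment, then compare
tot <- aggregate(tn_load ~ entity + bay_seg, data = loads, FUN = sum)
shared <- alloc[alloc$ishared, c("entity", "bay_seg", "alloc_tons")]
cmp_shared <- merge(shared, tot, by = c("entity", "bay_seg"))
cmp_shared$exceeds <- cmp_shared$tn_load > cmp_shared$alloc_tons

# Non-shared allocations: compare facility by facility
single <- alloc[!alloc$ishared, c("entity", "facname", "bay_seg", "alloc_tons")]
cmp_single <- merge(single, loads, by = c("entity", "facname", "bay_seg"))
cmp_single$exceeds <- cmp_single$tn_load > cmp_single$alloc_tons
```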
ml_allocations
TBNMC TN load allocations for NPS/MS4 entities
nps_allocations
A data.frame
TN load allocations assigned to non-point source and MS4 jurisdictions under the Tampa Bay Nitrogen Management Consortium (TBNMC) framework.
bay_seg: Integer bay segment identifier (1, 2, 3, 4, 55)
entity: Short entity name used for joining
entity_full: Full entity name
type: Allocation type (e.g., MS4, Agriculture, Other)
alloc_pct: Fractional allocation share (0-1)
alloc_tons: Allocation in tons TN per year
nps_allocations
Data frame of distances of drainage basin locations to National Weather Service (NWS) sites
nps_distance
A data.frame
Used for estimating non-point source (NPS) ungaged loads. The data frame contains the following columns:
target: Numeric identifier for the drainage basin
targ_x: Numeric value for the x-coordinate of the drainage basin location (WGS 84, UTM Zone 17N, CRS 32617)
targ_y: Numeric value for the y-coordinate of the drainage basin location (WGS 84, UTM Zone 17N, CRS 32617)
matchsit: Numeric for the NWS site that matches the drainage basin location
distance: Numeric value for the distance (m) between the drainage basin coordinate and NWS site
invdist2: Numeric value for the inverse distance squared (1/m^2) between the drainage basin coordinate and NWS site
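A sketch of how the invdist2 weights could be used to estimate basin rainfall is given below. The distances, station IDs, and rainfall totals are hypothetical, and the weighted mean shown is a generic inverse distance-weighted estimate rather than the package's exact calculation.

```r
# Sketch only: inverse distance-weighted rainfall for one drainage basin
# Hypothetical rows in the layout of nps_distance
dist <- data.frame(
  target = 1,
  matchsit = c(101, 102, 103),
  distance = c(1000, 2000, 4000) # meters
)
dist$invdist2 <- 1 / dist$distance^2

# Hypothetical monthly rainfall (inches) at the matched NWS sites
rain_mo <- data.frame(
  matchsit = c(101, 102, 103),
  rainfall = c(4, 6, 2)
)

# Weighted mean of station rainfall, weights = inverse distance squared
x <- merge(dist, rain_mo, by = "matchsit")
idw_rain <- sum(x$rainfall * x$invdist2) / sum(x$invdist2)
```

Closer stations dominate the estimate because the weights fall off with the square of the distance.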
nps_distance
TBNMC TN load allocations for IPS point source facilities
ps_allocations
A data.frame
TN load allocations assigned to individual industrial point source facilities under the Tampa Bay Nitrogen Management Consortium (TBNMC) framework.
entity: Entity name (owner/operator)
facname: Facility name as used in facilities
permit: NPDES permit number
alloc_pct: Fractional allocation share (0-1)
alloc_tons: Allocation in tons TN per year
ps_allocations
Data frame of daily rainfall data from NOAA NCDC National Weather Service (NWS) sites from 2017 to 2023
rain
A data.frame
Used for estimating atmospheric deposition and non-point source ungaged loads. Created using the util_getrain function. The data frame contains the following columns:
station: Character string for the station id
date: Date for the observation
Year: Numeric value for the year of the observation
Month: Numeric value for the month of the observation
Day: Numeric value for the day of the observation
rainfall: Numeric value for the amount of rainfall in inches
rain
Data frame of historical daily rainfall data
rain_historic
A data.frame
Historical daily rainfall data for 388 stations through 2021. Columns are:
COOPID: Character string for the station id
date: Date for the observation
Prcp: Numeric value for the amount of rainfall in inches
## Not run: 
pth1 <- 'T:/03_BOARDS_COMMITTEES/05_TBNMC/2022_RA_Update/01_FUNDING_OUT/DELIVERABLES/TO-9/'
pth2 <- 'datastick_deliverables/LoadingCodes&Datasets/2021/AtmosphericDeposition2021/'
fl <- 'fl_rain_por_220223v93.sas7bdat'
pth <- file.path(pth1, pth2, fl)
rain_historic <- haven::read_sas(pth)
save(rain_historic, file = 'data/rain_historic.RData', compress = 'xz')
## End(Not run)
Lookup table for CLUCSID runoff coefficients
rcclucsid
A data frame
Used to create the land use runoff coefficient data used in util_nps_landsoilrc.
clucsid: Numeric value for CLUCSID
hsg: Numeric value for the hydrologic soil group
dry_rc: Numeric value for dry weather runoff coefficient
wet_rc: Numeric value for wet weather runoff coefficient
## Not run: 
rcclucsid <- read.csv('data-raw/rc_clucsid.csv', stringsAsFactors = FALSE, header = TRUE)
save(rcclucsid, file = 'data/rcclucsid.RData', compress = 'xz')
## End(Not run)
rcclucsid
Combined spatial data required for non-point source (NPS) ungaged estimate
tbbase
A summarized data frame containing the union of all inputs showing major bay segment, sub-basin (basin), drainage feature (drnfeat), jurisdiction (entity), land use/land cover (FLUCCSCODE), CLUCSID, IMPROVED, hydrologic group (hydgrp), and area in hectares. These represent all relevant spatial combinations in the Tampa Bay watershed.
See "data-raw/tbbase.R" for creation.
tbbase
Simple feature polygons of major drainage basins in the Tampa Bay Estuary Program boundary
tbdbasin
A sf object
Used for estimating ungaged non-point source (NPS) loads. The data includes the following columns.
basin: Numeric value for the basin
drnfeat: Numeric for the drainage feature
geometry: The geometry column
Projection is NAD83(2011) / Florida West (ftUS), CRS 6443.
## Not run: 
prj <- 6443
tbdbasin <- sf::st_read("./data-raw/gis/TBEP_dBasins_FIPS0902_Projection.shp") |>
  sf::st_transform(prj) |>
  sf::st_buffer(dist = 0) |>
  dplyr::group_by(NEWGAGE, DRNFEATURE) |>
  dplyr::summarise(geometry = sf::st_union(geometry), .groups = "drop") |>
  dplyr::select(
    basin = NEWGAGE,
    drnfeat = DRNFEATURE
  ) |>
  dplyr::arrange(basin, drnfeat)
save(tbdbasin, file = 'data/tbdbasin.RData', compress = 'xz')
## End(Not run)
tbdbasin
Simple features polygon for the Tampa Bay Estuary Program boundary
tbfullshed
A sf object
Used for estimating ungaged non-point source (NPS) loads. The data includes the following columns.
Name: Character for the layer name
Hectares: Numeric value for area of the polygon
geometry: The geometry column
Projection is NAD83(2011) / Florida West (ftUS), CRS 6443.
## Not run: 
prj <- 6443
tbfullshed <- sf::st_read("./data-raw/gis/TBEP_Watershed_Correct_Projection.shp") |>
  sf::st_transform(prj) |>
  sf::st_union(by_feature = TRUE) |>
  sf::st_buffer(dist = 0) |>
  dplyr::select(Name, Hectares)
save(tbfullshed, file = 'data/tbfullshed.RData', compress = 'xz')
## End(Not run)
tbfullshed
Simple feature polygons of jurisdictional boundaries in the Tampa Bay Estuary Program boundary
tbjuris
An sf object
Used for estimating ungaged non-point source (NPS) loads. The data includes the following columns.
entity: Character for the entity name
geometry: The geometry column
Projection is NAD83(2011) / Florida West (ftUS), CRS 6443.
## Not run: 
prj <- 6443

tbjuris <- sf::st_read("./data-raw/gis/TB_Juris.shp") |>
  sf::st_transform(prj) |>
  sf::st_buffer(dist = 0) |>
  dplyr::rename(entity = NAME_FINAL) |>
  dplyr::select(entity) |>
  dplyr::group_by(entity) |>
  dplyr::summarise()

save(tbjuris, file = 'data/tbjuris.RData', compress = 'xz')

## End(Not run)
tbjuris
Simple feature polygons of 2023 land use in the Tampa Bay Estuary Program boundary
tblu2023
An sf object
Used for estimating ungaged non-point source (NPS) loads. The data includes the following columns.
FLUCCSCODE: Numeric value for the Florida Land Use, Cover and Forms Classification System (FLUCCS) code
FLUCCSDESC: Character string giving the FLUCCS description
geometry: The geometry column
Projection is NAD83(2011) / Florida West (ftUS), CRS 6443.
## Not run: 
# import from local source
prj <- 6443

tblu2023 <- sf::st_read("T:/05_GIS/SWFWMD/LULC_2023/LANDUSELANDCOVER2023.shp") |>
  sf::st_transform(prj) |>
  sf::st_intersection(tbfullshed) |>
  sf::st_buffer(dist = 0) |>
  dplyr::group_by(FLUCCSCODE, FLUCSDESC) |>
  dplyr::summarise() |>
  dplyr::ungroup()

# or use SWFWMD API
tblu2023 <- util_nps_getswfwmd('lulc2023')

save(tblu2023, file = 'data/tblu2023.RData', compress = 'xz')

## End(Not run)
Simple feature polygons of Tampa Bay segments with shoreline detail
tbsegdetail
An sf object
Detailed shoreline polygons for the major bay segments, including a North/South split for Boca Ciega Bay. Note that the Boca Ciega Bay segment is only the northern portion.
The layer was clipped to retain only the main open-water bay areas, excluding tidal rivers and creek arms. The clipping mask is available in data-raw/tbsegdetail_clip_mask.RData for reproducibility. See data-raw/tbsegdetail_clip.R for the clipping workflow. The data includes the following columns.
bay_seg: Integer, numeric segment identifier matching
geometry: The geometry column
Bay segments included:
1: Old Tampa Bay
2: Hillsborough Bay
3: Middle Tampa Bay
4: Lower Tampa Bay
5: Boca Ciega Bay (North)
6: Terra Ceia Bay
7: Manatee River
55: Boca Ciega Bay South
Projection is NAD83(2011) / Florida West (ftUS), CRS 6443.
tbsegdetail
Simple feature polygons of soil data in the Tampa Bay Estuary Program boundary
tbsoil
An sf object
Used for estimating ungaged non-point source (NPS) loads. The data includes the following columns.
FLUCCSCODE: Numeric value for the Florida Land Use, Cover and Forms Classification System (FLUCCS) code
FLUCCSDESC: Character string giving the FLUCCS description
geometry: The geometry column
Projection is NAD83(2011) / Florida West (ftUS), CRS 6443.
## Not run: 
# use SWFWMD API
tbsoil <- util_nps_getswfwmd('soil')

save(tbsoil, file = 'data/tbsoil.RData', compress = 'xz')

## End(Not run)
Simple feature polygons of sub-watersheds in the Tampa Bay Estuary Program boundary
tbsubshed
An sf object
Used for estimating ungaged non-point source (NPS) loads. The bay_seg column identifies bay segments as follows:
1: Old Tampa Bay
2: Hillsborough Bay
3: Middle Tampa Bay
4: Lower Tampa Bay
5: Boca Ciega Bay
6: Terra Ceia Bay
7: Manatee River
55: Boca Ciega Bay South
Projection is NAD83(2011) / Florida West (ftUS), CRS 6443.
## Not run: 
prj <- 6443

tbsubshed <- sf::st_read("./data-raw/gis/TBEP_Major_Basins_NAD1983_SP_FIPS0902_FT.shp") |>
  sf::st_transform(prj) |>
  sf::st_buffer(dist = 0) |>
  dplyr::mutate(
    bay_seg = dplyr::case_when(
      BASINNAME %in% c('Coastal Old Tampa Bay') ~ 1,
      BASINNAME %in% c('Alafia River', 'Coastal Hillsborough Bay', 'Hillsborough River') ~ 2,
      BASINNAME %in% c('Coastal Middle Tampa Bay', 'Little Manatee River') ~ 3,
      BASINNAME %in% c('Coastal Lower Tampa Bay') ~ 4,
      BASINNAME %in% c('Upper Boca Ciega Bay') ~ 5,
      BASINNAME %in% c('Coastal Terra Ceia Bay') ~ 6,
      BASINNAME %in% c('Manatee River') ~ 7,
      BASINNAME %in% c('Lower Boca Ciega Bay') ~ 55
    )
  ) |>
  dplyr::group_by(bay_seg) |>
  dplyr::summarise(geometry = sf::st_union(geometry), .groups = "drop")

save(tbsubshed, file = 'data/tbsubshed.RData', compress = 'xz')

## End(Not run)
tbsubshed
Data frame of USGS stream flow data from the USGS NWIS database for 2021 to 2023
usgsflow
A data.frame
Daily flow data at select stations used for estimating non-point source gaged and ungaged loads. Created using the util_nps_getusgsflow function. The file is provided to reduce calls to the USGS API. The data frame contains the following columns:
site_no: Character string for the site number
date: Date for the observation
flow_cfs: Numeric value for the daily flow in cubic feet per second (cfs)
usgsflow
Create NPS disaggregation factors for allocation assessment
util_aa_npsfactors(tbbase, rcclucsid, emc)
tbbase |
Data frame returned from |
rcclucsid |
Data frame of runoff coefficients by land use class and
hydrologic soil group. See |
emc |
Data frame of event mean concentrations by land use class. Must
include columns |
These factors are used internally by anlz_aa to disaggregate basin-level
NPS loads to individual MS4 jurisdictions (entities).
Two factors are required because the disaggregation is a two-step process matching the original SAS workflow:
TN factor (factor_tn): Distributes total basin TN load among land use classes, based on area × event mean TN concentration per CLUCSID. Jurisdiction is not required, and the distributed loads sum to the basin total.
RC factor (factor_rc): Distributes each land use class's TN and
water loads among MS4 entities based on each entity's share of that land
use type's weighted runoff (area × runoff coefficient). Jurisdiction is
required here, i.e., tbbase includes the entity overlay.
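The two-step split described above can be illustrated on a toy table. This is a minimal sketch in base R and not the package implementation: the column names area, emc_tn, and rc, the entity names, and all values are hypothetical.

```r
# Hypothetical toy data: one basin, two land use classes, two entities
toy <- data.frame(
  basin   = "B1",
  clucsid = c(1, 1, 2, 2),
  entity  = c("City A", "County B", "City A", "County B"),
  area    = c(10, 30, 20, 20),     # polygon area, any consistent unit
  emc_tn  = c(2, 2, 1, 1),         # event mean TN concentration by CLUCSID
  rc      = c(0.5, 0.5, 0.2, 0.3)  # annual runoff coefficient
)

# Weights used by the two factors
toy$tn_wt <- toy$area * toy$emc_tn
toy$rc_wt <- toy$area * toy$rc

# Step 1: each CLUCSID's share of area x EMC within the basin (no entity needed)
factor_tn <- aggregate(tn_wt ~ basin + clucsid, data = toy, FUN = sum)
factor_tn$factor_tn <- factor_tn$tn_wt /
  ave(factor_tn$tn_wt, factor_tn$basin, FUN = sum)

# Step 2: each entity's share of area x rc within each basin x CLUCSID
factor_rc <- toy[, c("basin", "clucsid", "entity", "rc_wt")]
factor_rc$factor_rc <- factor_rc$rc_wt /
  ave(factor_rc$rc_wt, factor_rc$basin, factor_rc$clucsid, FUN = sum)

factor_tn
factor_rc
```

In this toy case factor_tn sums to 1 within the basin and factor_rc sums to 1 within each basin × CLUCSID combination, matching the properties stated for the returned factors.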
Annual runoff coefficient: rc = (dry_rc * 8 + wet_rc * 4) / 12.
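As a worked instance of the formula above, the snippet below computes an annual runoff coefficient from two seasonal values. The input values are hypothetical, and reading the weights 8 and 4 as the number of dry and wet months is an assumption based on their sum of 12.

```r
# Hypothetical seasonal runoff coefficients for one land use / soil combination
dry_rc <- 0.30  # weighted by 8 (assumed dry-season months)
wet_rc <- 0.45  # weighted by 4 (assumed wet-season months)

# Weighted annual runoff coefficient, as in the formula above
rc <- (dry_rc * 8 + wet_rc * 4) / 12
rc
```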
Basin remapping matches the SAS preprocessing: nested basins 02303000 and 02303330 are assigned to 02304500; 02301000 and 02301300 to 02301500; 02299950 to LMANATEE; basin 206-5 is assigned to bay_seg 55. Basin 02307359 is excluded entirely.
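The nested-basin remapping can be expressed as a lookup vector. The sketch below is illustrative only: the input vector is hypothetical, and the assignment of basin 206-5 to bay_seg 55 is not shown.

```r
# Lookup of nested basins to their parent basin, as listed above
remap <- c(
  "02303000" = "02304500",
  "02303330" = "02304500",
  "02301000" = "02301500",
  "02301300" = "02301500",
  "02299950" = "LMANATEE"
)

# Hypothetical input basins
basin <- c("02303000", "02301300", "02299950", "02307359", "02300500")

# Drop the excluded basin, then remap nested basins and keep the rest unchanged
basin <- basin[basin != "02307359"]
basin_new <- ifelse(basin %in% names(remap), remap[basin], basin)
basin_new
```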
Non-contributing drainage features (drnfeat == "NONCON") and
water/tidal CLUCSIDs (17, 21, 22) are excluded from both factors.
Compound hydrologic soil groups (e.g., "A/D") are simplified to
their primary group ("A") before joining runoff coefficients.
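One way to reduce compound soil groups to their primary group is to drop everything from the slash onward. This is a sketch of the idea, not necessarily how the package does it, and the vector name hydgrp is hypothetical.

```r
# Hypothetical hydrologic soil group values, including compound groups
hydgrp <- c("A", "A/D", "B/D", "C", "C/D", "D")

# Keep only the primary group (text before the slash)
primary <- sub("/.*$", "", hydgrp)
primary
```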
These factors are specific to a land use layer and should be rebuilt
whenever the underlying tbbase changes.
A named list with two elements:
Data frame of RC factors: bay_seg, basin,
entity, category, clucsid, factor_rc.
factor_rc is each entity's fractional share of the weighted area ×
runoff coefficient within each basin × CLUCSID combination. Sums to 1
across all entities for each basin × CLUCSID.
Data frame of TN factors: bay_seg, basin,
clucsid, factor_tn.
factor_tn is each CLUCSID's fractional share of basin TN load,
weighted by area × event mean TN concentration. Sums to 1 across all
CLUCSIDs for each basin.
data(tbbase)
data(rcclucsid)
data(emc)

nps_factors <- util_aa_npsfactors(tbbase, rcclucsid, emc)
Get rainfall data at NOAA NCDC sites for atmospheric deposition and non-point source ungaged calculations
util_getrain(yrs, station = NULL, noaa_key, ntry = 5, quiet = FALSE)
yrs |
numeric vector for the years of data to retrieve |
station |
numeric vector of station numbers to retrieve, see details |
noaa_key |
character for the NOAA API key |
ntry |
numeric for the number of times to try to download the data |
quiet |
logical to print progress in the console |
This function is used to retrieve a long-term record of rainfall for estimating AD and NPS ungaged loads. It is used to create an input data file for load calculations and it is not used directly by any other functions due to download time. A NOAA API key is required to use the function.
By default, rainfall data is retrieved for the following stations:
228: ARCADIA
478: BARTOW
520: BAY LAKE
940: BRADENTON EXPERIMENT
945: BRADENTON 5 ESE
1046: BROOKSVILLE CHIN HIL
1163: BUSHNELL 2 E
1632: CLEARWATER
1641: CLERMONT 7 S
2806: ST PETERSBURG WHITTD
3153: FORT GREEN 12 WSW
3986: HILLSBOROUGH RVR SP
4707: LAKE ALFRED EXP STN
5973: MOUNTAIN LAKE
6065: MYAKKA RIVER STATE P
6880: PARRISH
7205: PLANT CITY
7851: ST LEO
7886: ST PETERSBURG WHITTD
8788: TAMPA INTL ARPT
8824: TARPON SPNGS SWG PLT
9176: VENICE
9401: WAUCHULA 2 N
A data frame with the following columns:
station: numeric, the station id
date: Date, the date of the observation
Year: numeric, the year of the observation
Month: numeric, the month of the observation
Day: numeric, the day of the observation
rainfall: numeric, the amount of rainfall in inches
## Not run: 
noaa_key <- Sys.getenv('NOAA_KEY')
util_getrain(2021, 228, noaa_key)

## End(Not run)
Download and rasterize FDEP Upper Floridan Aquifer potentiometric surface
util_gw_getcontour(
  season = c("dry", "wet"),
  yr,
  max_records = 1000,
  verbose = TRUE
)
season |
character, |
yr |
integer, year for which to retrieve data. Biannual (May/September) observations are available from approximately 2010 onward. |
max_records |
integer, maximum records per paginated API request. Default 1000. |
verbose |
logical, print download and interpolation progress.
Default |
Downloads Upper Floridan Aquifer potentiometric surface contour lines from
the FDEP / Florida Geological Survey ArcGIS REST service
(https://ca.dep.state.fl.us/arcgis/rest/services/OpenData/FGS_PUBLIC/MapServer/8)
and interpolates them to a 1-mile SpatRaster using inverse distance
weighting (IDW).
Spatial extent: The API query covers the Tampa Bay watershed
(tbfullshed) buffered outward by 40 miles (211,200 US Survey
Feet), converted to WGS84. This wider extent captures high potentiometric values outside of the surficial subwatersheds that drive groundwater flow.
Interpolation: Contour line vertices are used as elevation
observations and interpolated to a 1-mile grid via IDW (5-mile radius,
power = 2). Cells more than 5 miles from any contour vertex are left
NA to avoid extrapolation into data-sparse regions. Five passes of a
3x3 focal mean then fill small gaps. The 5-mile radius was chosen to bridge
typical contour spacing in the Tampa Bay region without extrapolating into
the panhandle or coastal areas.
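The radius-limited IDW step can be sketched for a single grid cell in base R. This is an illustration of the technique under stated assumptions, not the package code: the function idw_cell, the coordinates (in miles), and the head values are hypothetical, and the focal-mean gap filling is not shown.

```r
# Hypothetical observations (contour vertices): x, y in miles, head in feet
obs <- data.frame(
  x    = c(0, 4, 0, 20),
  y    = c(0, 0, 3, 20),
  head = c(10, 20, 30, 50)
)

# IDW estimate at one cell, using only observations within `radius`
idw_cell <- function(x0, y0, obs, radius = 5, power = 2) {
  d <- sqrt((obs$x - x0)^2 + (obs$y - y0)^2)
  keep <- d <= radius
  if (!any(keep)) return(NA_real_)  # no vertex in range: leave the cell NA
  hit <- keep & d == 0
  if (any(hit)) return(mean(obs$head[hit]))  # cell coincides with a vertex
  w <- 1 / d[keep]^power
  sum(w * obs$head[keep]) / sum(w)
}

idw_cell(0, 0, obs)    # coincides with an observation
idw_cell(2, 0, obs)    # weighted mean of the three vertices within 5 miles
idw_cell(12, 12, obs)  # more than 5 miles from every vertex
```

Returning NA outside the search radius is what prevents extrapolation into data-sparse regions; the smoothing passes described above would then fill only small gaps.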
Season mapping:
"dry" maps to May of yr
"wet" maps to September of yr
A SpatRaster of potentiometric head (ft above
MSL) at 1-mile resolution in the CRS of tbfullshed (EPSG
6443). Returns NULL with a warning if no features are found for the
requested season/year.
## Not run: 
pot_dry <- util_gw_getcontour("dry", 2022)
pot_wet <- util_gw_getcontour("wet", 2022)

## End(Not run)
Get groundwater quality concentrations for Floridan aquifer segments
util_gw_getwq(sta_ids = NULL, yrrng = NULL, verbose = TRUE)
sta_ids |
character vector of SWFWMD station IDs to query. When
|
yrrng |
integer vector of length 2 specifying the start and end year
for computing concentration means, e.g. |
verbose |
logical, if |
Retrieves TN and TP concentrations (mg/L) from the
Water Atlas API
(GET /api/samplingdata/stream) for Upper Floridan Aquifer monitoring
stations and computes grand-mean concentrations per station. Station means
are then mapped to bay segments:
OTB (segment 1): mean of sta_ids[1] only (default:
CR 581 North Fldn, station 18340).
HB (segment 2): arithmetic mean of the per-station means
across all sta_ids (default: mean of stations 18340 and 18965,
SR 52 and CR 581 Deep).
Segments 3-7: fixed constants carried forward from
gwupdate95-98_final.xls (the original 1995-1998 SWFWMD monitoring
analysis). These values were used unchanged in every loading script from
2012 through 2021 and are not updated from the API.
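The mapping from station means to segments 1 and 2 can be sketched as follows. The concentrations are made-up values for illustration, and the fixed constants for segments 3-7 are omitted because they are not listed here.

```r
# Hypothetical per-station grand means (mg/L); values are illustrative only
sta_means <- data.frame(
  station = c("18340", "18965"),
  tn_mgl  = c(0.20, 0.40),
  tp_mgl  = c(0.10, 0.06)
)

# Segment 1 (OTB): first station only
otb <- sta_means[1, c("tn_mgl", "tp_mgl")]

# Segment 2 (HB): arithmetic mean of the per-station means
hb <- as.data.frame(t(colMeans(sta_means[, c("tn_mgl", "tp_mgl")])))

conc <- cbind(bay_seg = 1:2, rbind(otb, hb))
conc
```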
History: Through the 2021 loading cycle, all seven segments used hardcoded Floridan concentrations sourced from the 1995-1998 spreadsheet (TN: 0.010-0.025 mg/L, TP: 0.097-0.137 mg/L). For the 2022-2024 update, new SWFWMD well data showed substantially higher TN in the Pasco County Floridan aquifer, so segments 1 and 2 were revised using stations 18340 and 18965. Segments 3-7 retained the original values.
TN is taken from the TN_mgl parameter and TP from TP_mgl in
the Water Atlas API response.
A data frame with one row per bay segment (1-7) and columns:
bay_seg: integer, bay segment number
tn_mgl: numeric, mean total nitrogen concentration (mg/L)
tp_mgl: numeric, mean total phosphorus concentration (mg/L)
## Not run: 
# default stations, all available data
conc <- util_gw_getwq()

# restrict to a specific period
conc <- util_gw_getwq(yrrng = c(2020, 2024))

## End(Not run)
Compute hydraulic gradient per bay segment from potentiometric surface raster
util_gw_grad(
  pot_rast,
  season = c("dry", "wet"),
  segs = tbsubshed,
  shoreline = tbsegdetail,
  buf_segs = NULL
)
pot_rast |
|
season |
character, |
segs |
|
shoreline |
|
buf_segs |
named numeric vector mapping bay segment IDs (as character
strings) to omnidirectional buffer distances in US Survey Feet (CRS 6443).
The subwatershed for each listed segment is buffered outward by the given
distance and bay water is removed before the potentiometric high-point
search. Listing a segment here also removes it from the default
zero-gradient set so it is computed dynamically. When |
Computes the Floridan Aquifer hydraulic gradient (ft/mile) per bay
segment using the Darcy's Law framework of Zarbock et al. (1994):
The max potentiometric head within the search area is located in the interpolated raster and the distance is measured from the nearest shoreline crossing along the bay centroid-to-head transect (see below).
Zero-gradient segments (hardcoded): The following segments always receive a gradient of 0 based on the original SAS loading analysis (Zarbock et al., 1994; GWld2224_SASCode.txt):
Boca Ciega Bay (segments 5 and 55), both seasons: the urbanized coastal watershed has no meaningful Floridan Aquifer recharge directed toward the bay.
Lower Tampa Bay (4), Terra Ceia Bay (6), Manatee River (7), dry
season only: the potentiometric gradient is negligible during the dry
season. These segments are computed dynamically in the wet season via
the default buf_segs.
Any segment listed in buf_segs is removed from the zero set and
computed dynamically.
Default buf_segs (calibrated against 2021 SAS reference values):
Dry season: c("1" = 100000) – Old Tampa Bay subwatershed
buffered ~19 miles; captures the potentiometric high north/northeast of
the standard watershed boundary.
Wet season: c("1" = 100000, "4" = 100000, "6" = 100000,
"7" = 100000) – adds LTB, TCB, and MR (each ~19 miles) to unlock
wet-season computation for those segments.
Buffer distances were tuned to produce gradients within ~15% of those derived from the FDEP potentiometric surface values used in the SAS analysis.
Distance calculation: Rather than measuring from the potentiometric high to the nearest shoreline point (which can hit an extreme geographic corner), the function draws a line from the bay segment centroid to the max-head cell. The portion of that line inside the bay polygon is subtracted from the total length, giving the distance from the shoreline crossing point to the high point along a representative transect.
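The arithmetic behind the transect distance can be sketched with plain numbers. The lengths and head below are hypothetical, and computing the gradient as max head divided by distance (that is, treating head at the shoreline as 0 ft) is an assumption consistent with the ft/mile units, not a formula confirmed here.

```r
# Hypothetical transect measurements in US Survey Feet (CRS 6443 units)
total_len_ft  <- 150000  # bay centroid to max-head cell
in_bay_len_ft <- 44400   # portion of the transect inside the bay polygon
max_head_ft   <- 40      # max potentiometric head in the search area

# Land portion of the transect, converted to miles
dist_mi <- (total_len_ft - in_bay_len_ft) / 5280

# Assumed gradient definition: head drop per mile along the transect
grad <- max_head_ft / dist_mi

c(dist_mi = dist_mi, grad = grad)
```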
Hillsborough Bay (segment 2):
Uses a three-zone weighted gradient (Polk County 0.4, Pasco County 0.3,
Alafia River 0.3) following the original flow net analysis. Sub-zones are
constructed from tbdbasin drainage basins as in the original
SAS code.
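The three-zone weighting for Hillsborough Bay can be illustrated as a weighted sum. The zone gradients are hypothetical, and combining them as a simple weighted sum is an assumption about how the weights are applied.

```r
# Hypothetical zone gradients (ft/mile) and the weights listed above
zone_grad <- c(polk = 2.0, pasco = 1.0, alafia = 3.0)
zone_wt   <- c(polk = 0.4, pasco = 0.3, alafia = 0.3)

# Weighted gradient for segment 2; weights are matched to zones by name
hb_grad <- sum(zone_grad * zone_wt[names(zone_grad)])
hb_grad
```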
Benchmark warning:
After computing gradients, each non-zero segment is compared against the
2021 SAS reference values (GWld2224_SASCode.txt). A warning is issued for
any segment whose computed gradient deviates by more than 50% from the
reference, indicating a potentially anomalous potentiometric surface or a
need to revisit the buf_segs configuration.
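A benchmark check of this kind can be sketched as a relative-deviation test. Both vectors below hold hypothetical values (the actual reference values are not listed here), and the warning text is illustrative.

```r
# Hypothetical computed and reference gradients (ft/mile), named by bay segment
computed  <- c("1" = 1.2, "2" = 3.5, "3" = 0.9)
reference <- c("1" = 1.0, "2" = 2.0, "3" = 1.0)

# Relative deviation from the reference for each segment
rel_dev <- abs(computed - reference) / reference

# Flag segments deviating by more than 50%
flagged <- names(rel_dev)[rel_dev > 0.5]
if (length(flagged) > 0) {
  warning(
    "Gradient deviates by more than 50% from reference for segment(s): ",
    paste(flagged, collapse = ", ")
  )
}
```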
A data frame with columns bay_seg (integer) and grad
(numeric, ft/mile; 0 for zero-gradient segments).
Zarbock, H., A. Janicki, D. Wade, D. Heimbuch, and H. Wilson. 1994. Estimates of Total Nitrogen, Total Phosphorus, and Total Suspended Solids Loadings to Tampa Bay, Florida. Technical Publication #04-94. Prepared by Coastal Environmental, Inc. Prepared for Tampa Bay National Estuary Program. St. Petersburg, FL.
## Not run: 
pot_dry <- util_gw_getcontour("dry", 2022)
util_gw_grad(pot_dry, season = "dry")

pot_wet <- util_gw_getcontour("wet", 2022)
util_gw_grad(pot_wet, season = "wet")

## End(Not run)
Visualise the hydraulic gradient for a bay segment
util_gw_showgrad(
  pot_rast,
  season = c("dry", "wet"),
  seg,
  segs = tbsubshed,
  shoreline = tbsegdetail,
  buf_segs = NULL
)
pot_rast |
|
season |
character, |
seg |
integer, bay segment number (1-7). |
segs |
|
shoreline |
|
buf_segs |
named numeric vector of buffer distances (US Survey Feet)
in the same format accepted by |
Returns a ggplot2 map for the requested segment showing:
The potentiometric surface (ft) within the search area, coloured by head value.
The search area boundary (light yellow).
All bay segments (grey background) and the target segment (blue).
A dotted line from the bay centroid to the max-head land cell (showing the full transect used in the distance calculation).
A dashed line for the land portion of that transect (the actual gradient distance).
The max-head point (red dot).
The subtitle reports max head (ft), distance (miles), and gradient (ft/mi).
See util_gw_grad for the distance calculation methodology.
A ggplot object.
## Not run: 
contdry <- util_gw_getcontour("dry", 2022)

## End(Not run)
util_gw_showgrad(contdry, season = "dry", seg = 1)
util_gw_showgrad(contdry, season = "dry", seg = 3)

## Not run: 
contwet <- util_gw_getcontour("wet", 2022)

## End(Not run)
util_gw_showgrad(contwet, season = "wet", seg = 4)
util_gw_showgrad(contwet, season = "wet", seg = 7)
Fill in missing water quality values for non-point source (NPS) data
util_nps_fillmiswq(wq, yrrng = c("2021-01-01", "2023-12-31"))
wq |
A data frame of water quality data returned by |
yrrng |
A vector of two dates in 'YYYY-MM-DD' format, specifying the date range to retrieve flow data. Default is from '2021-01-01' to '2023-12-31'. |
Missing monthly values at the end of the date range are filled with averages from the prior five years. Remaining missing monthly values are then linearly interpolated using na.approx.
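The interpolation step can be illustrated with base R. In this sketch approx() is swapped in for na.approx (from the zoo package) to avoid a dependency, and the monthly series is hypothetical.

```r
# Hypothetical monthly series with interior gaps and a missing final value
mo  <- 1:8
val <- c(1.0, NA, 3.0, NA, NA, 6.0, 5.0, NA)

# Linear interpolation between the non-missing neighbours
ok <- !is.na(val)
filled <- approx(x = mo[ok], y = val[ok], xout = mo)$y
filled
```

The final value stays NA because linear interpolation needs an observation on both sides of a gap, which is why values missing at the end of the record are handled separately.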
Input data frame with missing data filled as described above.
## Not run: 
data(allwq)
wq <- util_nps_fillmiswq(allwq)

## End(Not run)
Get external flow data not from USGS for NPS calculations
util_nps_getextflow(pth, loc, yrrng = c(2021, 2023))
pth |
Path to the external Excel file |
loc |
Location of the external flow data. Options are 'LMANATEE', 'TBYPASS', or '02301500'. |
yrrng |
Numeric vector of length 2 indicating the year range to filter the data. Default is c(2021, 2023). |
This function retrieves and formats external flow data that cannot be obtained from the USGS API. The three required locations are Lake Manatee, Tampa Bypass Canal (s160), and the Alafia River at Bell Shoals.
External data can be obtained as follows:
LMANATEE: Lake Manatee flow for the Manatee River dam, from Manatee County Utilities, input flow is cfs (Manatee County contact is Amanda ShawverKarnitz, [email protected]).
TBYPASS: Tampa Bypass Canal flow from Tampa Bay Water. Input flow is MGD and is converted to cfs (Tampa Bay Water contact is Cathleen Jonas, [email protected], device ID 957). Requests for this data should be bundled with requests for the Lithia and Buckhorn spring discharge data (device IDs 3381, 4586, 3388, and 3649) used in the springs workflow.
02301500: Alafia River Bell Shoals flow data from SWFWMD WMIS Pumpage Reports for Permit 11794 (https://www18.swfwmd.state.fl.us/search/search/searchwupsimple.aspx) or optionally from Tampa Bay Water reported withdrawals for Site 4626 (Cathleen Jonas, [email protected], device ID 4626, request should be bundled with device IDs 3381, 4586, 3388, and 3649 used in the springs workflow). Input flow from the latter is a daily average that is converted to cfs.
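The MGD to cfs conversion mentioned above follows from standard unit definitions. The helper name mgd_to_cfs is hypothetical and this is a sketch of the arithmetic rather than the package code.

```r
# 1 US gallon = 231 cubic inches; 1 cubic foot = 1728 cubic inches
gal_to_ft3 <- 231 / 1728

# Million gallons per day to cubic feet per second (86400 seconds per day)
mgd_to_cfs <- function(mgd) mgd * 1e6 * gal_to_ft3 / 86400

mgd_to_cfs(c(1, 10))
```

One MGD is roughly 1.547 cfs.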
System files are included in the package which can be updated annually.
A data frame of flow data for the location in loc
# lake manatee
pth <- system.file('extdata/nps_extflow_lakemanatee.xlsx', package = 'tbeploads')
extflo <- util_nps_getextflow(pth, loc = "LMANATEE")

# tampa bypass
pth <- system.file('extdata/nps_extflow_tampabypass.xlsx', package = 'tbeploads')
extflo <- util_nps_getextflow(pth, loc = "TBYPASS")

# bell shoals
pth <- system.file('extdata/nps_extflow_bellshoals.xls', package = 'tbeploads')
extflo <- util_nps_getextflow(pth, loc = "02301500")
Get flow data for NPS calculations at gaged sites
util_nps_getflow(
  lakemanpth = NULL,
  tampabypth = NULL,
  bellshlpth = NULL,
  yrrng = c(2021, 2023),
  usgsflow = NULL,
  verbose = TRUE
)
lakemanpth |
character, path to the file containing the Lake Manatee flow data |
tampabypth |
character, path to the file containing the Tampa Bypass flow data |
bellshlpth |
character, path to the file containing the Bell Shoals flow data |
yrrng |
vector of two integers, the year range for which to retrieve flow data. Default is c(2021, 2023). |
usgsflow |
data frame of USGS flow data, if already available from |
verbose |
logical indicating whether to print verbose output |
Missing flow values are linearly interpolated using na.approx. The function combines external and USGS API flow data using the util_nps_getextflow and util_nps_getusgsflow functions.
A preprocessed USGS flow data frame can be provided using the usgsflow argument to avoid re-downloading the data.
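Reducing daily flow to monthly means, the form of the returned data, can be sketched with base R. The daily values are hypothetical and the yr and mo column names are assumptions, not necessarily those used by the function.

```r
# Hypothetical daily flows (cfs) for one site over two months
daily <- data.frame(
  site_no  = "02300500",
  date     = seq(as.Date("2021-01-01"), as.Date("2021-02-28"), by = "day"),
  flow_cfs = c(rep(10, 31), rep(20, 28))
)

# Year and month keys derived from the date
daily$yr <- as.integer(format(daily$date, "%Y"))
daily$mo <- as.integer(format(daily$date, "%m"))

# Monthly mean flow by site
monthly <- aggregate(flow_cfs ~ site_no + yr + mo, data = daily, FUN = mean)
monthly
```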
A data frame of monthly mean flow for fifteen USGS stations and three external flow sites
util_nps_getextflow, util_nps_getusgsflow
## Not run: 
usgsflow <- util_nps_getusgsflow(yrrng = as.Date(c('2021-01-01', '2023-12-31')))

## End(Not run)
lakemanpth <- system.file('extdata/nps_extflow_lakemanatee.xlsx', package = 'tbeploads')
tampabypth <- system.file('extdata/nps_extflow_tampabypass.xlsx', package = 'tbeploads')
bellshlpth <- system.file('extdata/nps_extflow_bellshoals.xls', package = 'tbeploads')

allflo <- util_nps_getflow(lakemanpth, tampabypth, bellshlpth, usgsflow = usgsflow)
Retrieve non-point source (NPS) supporting data from SWFWMD web services
util_nps_getswfwmd(dat, max_records = 1000, verbose = TRUE)
dat |
Character string indicating the type of data to retrieve. Options are 'soil', 'lulc2020', or 'lulc2023'. |
max_records |
Integer specifying the maximum number of records to retrieve in each request. Default is 1000. |
verbose |
Logical indicating whether to print verbose output. Default is TRUE. |
This function retrieves data from the SWFWMD web services for soils and land use/land cover (LULC) for the years 2020 and 2023. Soils data from https://www25.swfwmd.state.fl.us/arcgis12/rest/services/BaseVector and land use data from https://www25.swfwmd.state.fl.us/arcgis12/rest/services/OpenData.
A simple features object for the relevant data, clipped by the Tampa Bay watershed boundary (tbfullshed).
## Not run: 
# Retrieve soil data
soil_data <- util_nps_getswfwmd('soil')

# Retrieve LULC data for 2020
lulc2020_data <- util_nps_getswfwmd('lulc2020')

# Retrieve LULC data for 2023
lulc2023_data <- util_nps_getswfwmd('lulc2023')

## End(Not run)
Get flow data from USGS for NPS calculations
util_nps_getusgsflow(
  site = NULL,
  yrrng = c("2021-01-01", "2023-12-31"),
  verbose = TRUE
)
site |
A character vector of USGS site numbers. If NULL, defaults to a predefined set of sites. Default is NULL, see details. |
yrrng |
A vector of two dates in 'YYYY-MM-DD' format, specifying the date range to retrieve flow data. Default is from '2021-01-01' to '2023-12-31'. |
verbose |
logical indicating whether to print verbose output |
Stations are from the USGS NWIS database and include 02299950, 02300042, 02300500, 02300700, 02301000, 02301300, 02301500, 02301750, 02303000, 02303330, 02304500, 02306647, 02307000, 02307359, and 02307498. Uses the read_waterdata_daily function from the dataRetrieval package.
A data frame of daily flow values in cfs for fifteen stations
## Not run: 
usgsflow <- util_nps_getusgsflow()

## End(Not run)
Get water quality data for NPS gaged flows
util_nps_getwq(
  yrrng = c("2021-01-01", "2023-12-31"),
  mancopth = NULL,
  pincopth = NULL,
  verbose = TRUE
)
yrrng |
A vector of two dates in 'YYYY-MM-DD' format, specifying the date range for retrieving water quality data. Default is from '2021-01-01' to '2023-12-31'. |
mancopth |
character, path to the Manatee County water quality data file, see details |
pincopth |
character, path to the Pinellas County water quality data file, see details |
verbose |
logical indicating whether to print verbose output |
If mancopth or pincopth are NULL, Manatee and Pinellas County data are retrieved from the FDEP WIN database using read_importwqwin. Hillsborough County data is retrieved using read_importwq. If mancopth or pincopth are not NULL, then data are imported from disk using the path specified. Data from the Environmental Protection Commission (EPC) of Hillsborough County are imported using read_importepc from the tbeptools R package.
Local data files can be downloaded from the FDEP WIN database at https://prodenv.dep.state.fl.us/DearWin/public/wavesSearchFilter?calledBy=menu, using filters for 21FLMANA and 21FLPDEM for Manatee and Pinellas County, respectively. Activity start and end dates are bounded by the values in yrrng. Chosen analytes are Nitrate-Nitrite (N), Nitrogen- Total Kjeldahl, Phosphorus- Total, and Residues- Nonfilterable (TSS). Chosen stations are ER2 and UM2 for Manatee County and station 06-06 for Pinellas County. EPC stations retained are 105, 113, 114, 132, 141, 138, 142, and 147.
The data are filtered to include only the following analytes: "Nitrate-Nitrite (N)", "Nitrogen- Total Kjeldahl", "Phosphorus- Total", and "Residues- Nonfilterable (TSS)". The units for all analytes are assumed to be mg/L.
A data frame of water quality data for Manatee, Pinellas, and Hillsborough County
## Not run: 
# import from WIN
wqdat <- util_nps_getwq(c('2021-01-01', '2023-12-31'))

# use system files
mancopth <- system.file('extdata/nps_wq_manco.txt', package = 'tbeploads')
pincopth <- system.file('extdata/nps_wq_pinco.txt', package = 'tbeploads')
wqdat <- util_nps_getwq(c('2021-01-01', '2023-12-31'), mancopth = mancopth, pincopth = pincopth)

## End(Not run)
Utility function for non-point source (NPS) ungaged workflow to create land use and soil data
util_nps_landsoil(tbbase)
tbbase |
Input data frame returned from |
A data frame summarizing land use and soil by bay segment, sub-basin, drainage feature, CLUCSID, hydrologic group, and improved status.
data(tbbase)
util_nps_landsoil(tbbase)
Utility function to create non-point source (NPS) ungaged land use and soil runoff coefficients
util_nps_landsoilrc(tbbase, yrexp = c(2021:2023))
tbbase |
Data frame returned from |
yrexp |
Years to expand the data frame to include all months for each year. |
A data frame with land use (CLUCSID) and soil runoff coefficients by year and month.
data(tbbase)
util_nps_landsoilrc(tbbase, yrexp = c(2021:2023))
Summarize non-point source (NPS) ungaged loads by land use
util_nps_lusumm(
  dat,
  summ = c("basin", "segment", "all"),
  summtime = c("month", "year")
)
dat |
Input data frame as an intermediate result from |
summ |
chr string indicating how the returned data are summarized, see details |
summtime |
chr string indicating how the returned data are summarized temporally (month or year), see details |
The data are summarized differently based on the summ and summtime arguments. All loading data are summed based on these arguments, e.g., by bay segment (summ = 'segment') and year (summtime = 'year').
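The kind of grouping described above can be illustrated with base R. This is a minimal sketch under stated assumptions, not the package's implementation: the input values are hypothetical, only TN load is shown, and no unit conversion is applied.

```r
# Hypothetical input; column names follow the example data for util_nps_lusumm
dat <- data.frame(
  bay_seg = c(1, 1, 2, 2),
  yr = c(2021, 2021, 2021, 2022),
  mo = c(1, 2, 1, 1),
  tnload = c(150, 250, 50, 180)
)

# Sum TN load by bay segment and year,
# analogous to summ = 'segment' and summtime = 'year'
annual <- aggregate(tnload ~ bay_seg + yr, data = dat, FUN = sum)
annual
```

Adding `mo` to the right-hand side of the formula would give the monthly analogue.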
Data frame with summarized loading data based on user-supplied arguments, loading data in tons per month or year depending on the summtime argument, and hydrologic load in million cubic meters per month or year depending on the summtime argument.
dat <- data.frame(
  bay_seg = rep(1:2, each = 6),
  basin = rep(c("02304500", "02306647"), each = 6),
  yr = rep(2021:2022, each = 3, times = 2),
  mo = rep(1:3, times = 4),
  clucsid = rep(1:3, times = 4),
  tnload = c(150, 250, 50, 180, 300, 40, 160, 270, 45, 170, 280, 35),
  tpload = c(15, 35, 8, 18, 42, 6, 16, 38, 7, 17, 40, 5),
  tssload = c(1200, 3500, 400, 1400, 4000, 350, 1300, 3800, 380, 1350, 3900, 320),
  bodload = c(800, 1500, 200, 900, 1800, 180, 850, 1600, 190, 870, 1650, 170),
  h2oload = c(50000, 80000, 25000, 55000, 85000, 22000, 52000, 82000, 23000, 53000, 83000, 21000)
)
util_nps_lusumm(dat, summ = 'basin', summtime = 'month')
Utility function for non-point source (NPS) ungaged workflow to prepare land use and soil data for logistic regression
util_nps_preplog(tbbase)
tbbase |
Input data frame returned from |
A data frame with land use and soil areas in long format with bay segment and basins in the rows.
util_nps_preplog(tbbase)
Prep rain data for non-point source (NPS) ungaged load estimates
util_nps_preprain(rain, yrrng = NULL)
rain |
data frame of rainfall data, see |
yrrng |
numeric vector of a single year or a two year range to filter the data, defaults to NULL (no filtering) |
Rainfall for each drainage basin is estimated by inverse distance weighting of the rain gauge data (using nps_distance). The result is a cumulative monthly rainfall total that includes lags of one and two months.
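The weighting and lag steps can be sketched in base R for a single basin. The gauge names, distances, rainfall values, and the squared-distance exponent are all assumptions for illustration. The text does not state the exponent or how the lagged values are combined, so the lags are shown as separate columns.

```r
# Hypothetical distances from one basin to two gauges (any consistent unit)
dist <- c(g1 = 2, g2 = 4)

# Hypothetical monthly rainfall at each gauge (inches)
rain <- data.frame(
  mo = 1:4,
  g1 = c(2.0, 3.0, 1.0, 4.0),
  g2 = c(1.0, 5.0, 2.0, 2.0)
)

# Inverse distance squared weights, normalized to sum to one
# (the exponent of 2 is an assumption for this sketch)
wts <- (1 / dist^2) / sum(1 / dist^2)

# Weighted basin rainfall for each month
rain$basin_rain <- as.numeric(as.matrix(rain[, names(dist)]) %*% wts)

# Rainfall lagged by one and two months (NA where no earlier month exists)
n <- nrow(rain)
rain$lag1 <- c(NA, rain$basin_rain[-n])
rain$lag2 <- c(NA, NA, rain$basin_rain[-c(n - 1, n)])
rain
```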
A data frame of monthly total rainfall (inches) by drainage basin
util_nps_preprain(rain)
util_nps_preprain(rain, yrrng = c(2021))
Create bay segment column for non-point source (NPS) load data
util_nps_segment(dat)
dat |
data frame with a |
This is a simple helper function used internally with anlz_nps and util_nps_lusumm to create a segment column based on the basin and bay_seg columns.
The same data frame with an additional segment column indicating the major bay segment associated with each row
dat <- data.frame(
  basin = c("LTARPON", "TBYPASS", "02300500", "206-4", "EVERSRES", "UNKNOWN"),
  bay_seg = c(1, 2, 3, 4, 7, NA)
)
util_nps_segment(dat)
Create unioned base layer for non-point source (NPS) ungaged load estimation in the Tampa Bay watershed
util_nps_tbbase(
  tblu,
  tbsoil,
  gdal_path = NULL,
  chunk_size = NULL,
  cast = FALSE,
  verbose = TRUE
)
tblu |
sf object of land use/land cover in the Tampa Bay watershed, currently |
tbsoil |
sf object |
gdal_path |
Character string specifying the path to GDAL binaries (e.g., "C:/OSGeo4W/bin"). If NULL (default), assumes GDAL is in system PATH. |
chunk_size |
Integer. For large datasets, process in chunks of this many features. Set to NULL (default) to process all at once. This applies only to the final union with the soils data. |
cast |
Logical. If TRUE, will cast multipolygon geometries to polygons before processing. Default is FALSE, which keeps multipolygons as is (usually faster). |
verbose |
Logical. If TRUE, will print progress messages. Default is TRUE. |
Relies heavily on util_nps_union to perform the union operations efficiently using GDAL/OGR. All input must have the CRS of NAD83(2011) / Florida West (ftUS), EPSG:6443.
A summarized data frame containing the union of all inputs showing major bay segment, sub-basin (basin), drainage feature (drnfeat), jurisdiction (entity), land use/land cover (FLUCCSCODE), CLUCSID, IMPROVED, hydrologic group (hydgrp), and area in hectares. These represent all relevant spatial combinations in the Tampa Bay watershed.
## Not run:
# Load required data
data(tblu2023)
data(tbsoil)
result <- util_nps_tbbase(tblu2023, tbsoil, gdal_path = "C:/OSGeo4W/bin", chunk_size = 1000)
## End(Not run)
Performs a spatial intersection and union of two sf objects using GDAL's optimized spatial operations. This function is significantly faster than native sf operations for large datasets.
util_nps_union(
  sf1,
  sf2,
  gdal_path = NULL,
  chunk_size = NULL,
  cast = FALSE,
  verbose = TRUE
)
sf1 |
An sf object containing polygons. All non-geometry columns will be preserved in the output |
sf2 |
An sf object containing polygons. All non-geometry columns will be preserved in the output |
gdal_path |
Character string specifying the path to GDAL binaries (e.g., "C:/OSGeo4W/bin"). If NULL (default), assumes GDAL is in system PATH |
chunk_size |
Integer. For large datasets, process in chunks of this many features from sf1. Set to NULL (default) to process all at once |
cast |
Logical. If TRUE, will cast multipolygon geometries to polygons before processing. Default is FALSE, which keeps multipolygons as is (usually faster). |
verbose |
Logical. If TRUE, will print progress messages. Default is TRUE. |
An sf object containing the spatial intersection of sf1 and sf2, with geometries unioned by unique combinations of all attributes from both input objects
This function uses GDAL's ogr2ogr utility to perform spatial intersection operations, which can be much faster than sf's native functions for large datasets. The process:
1. Exports both sf objects to temporary GeoPackage files
2. Combines them into a single file
3. Dynamically builds an SQL query based on the actual column names
4. Uses SQL with spatial functions to find intersections
5. Groups and unions results by all attribute combinations
For very large datasets that cause memory issues, the function can process data in chunks.
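Chunking of this kind can be illustrated by splitting feature indices into groups of at most chunk_size. This is a minimal sketch with hypothetical numbers; how util_nps_union forms and processes its chunks internally is not shown in this text.

```r
# Hypothetical feature count and chunk size
n_features <- 2500
chunk_size <- 1000

# Assign each feature index to a chunk of at most chunk_size features
idx <- seq_len(n_features)
chunks <- split(idx, ceiling(idx / chunk_size))

# Number of features in each chunk
lengths(chunks)
```

Each element of `chunks` could then be used to subset the rows of sf1 before the intersection step, with the partial results combined at the end.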
The function automatically detects all non-geometry columns from both input objects and includes them in the intersection operation.
Requires GDAL/OGR to be installed and accessible. On Windows, this is typically provided by OSGeo4W or QGIS installations, downloadable at https://trac.osgeo.org/osgeo4w/.
## Not run:
data(tbsubshed)
data(tbjuris)
result <- util_nps_union(
  sf1 = tbsubshed,
  sf2 = tbjuris,
  gdal_path = "C:/OSGeo4W/bin"
)
## End(Not run)
Helper function for union operation
util_nps_unionchunk(sf1, sf2, chunk_size, verbose = TRUE)
sf1 |
First sf object |
sf2 |
Second sf object |
chunk_size |
Integer. For large datasets, process in chunks of this many features from sf1. |
verbose |
Logical. If TRUE, will print progress messages. Default is TRUE. |
Used internally by util_nps_union. See the help file for more details.
An sf object containing the spatial intersection of sf1 and sf2, with geometries unioned by unique combinations of all attributes from both input objects.
## Not run:
data(tbjuris)
data(tbsubshed)
# chunk_size has no default and must be supplied
result <- util_nps_unionchunk(tbsubshed, tbjuris, chunk_size = 1000)
## End(Not run)
Helper function for union operation
util_nps_unionnochunk(sf1, sf2)
sf1 |
First sf object |
sf2 |
Second sf object |
Used internally by util_nps_union. See the help file for more details.
An sf object containing the spatial intersection of sf1 and sf2, with geometries unioned by unique combinations of all attributes from both input objects.
## Not run:
data(tbjuris)
data(tbsubshed)
result <- util_nps_unionnochunk(tbsubshed, tbjuris)
## End(Not run)
Prep Verna Wellfield data for use in AD and NPS calculations
util_prepverna(fl, typ, fillmis = T)
fl |
text string for the file path to the Verna Wellfield data |
typ |
character string for the type of data to prepare, either 'AD' for atmospheric deposition or 'NPS' for nonpoint source. Uses different TP calculation for each type. |
fillmis |
logical indicating whether to fill missing data with monthly means, see details |
Raw data can be obtained from https://nadp.slh.wisc.edu/sites/ntn-FL41/ as monthly observations. Total nitrogen and phosphorus concentrations are estimated from ammonium and nitrate concentrations (mg/L) using the following relationships:
The first equation corrects for the % of ions in ammonium and nitrate that is N, and the second is a regression relationship between TBADS TN and TP, applied to Verna for atmospheric deposition estimates. A constant is used for non-point source estimates.
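The nitrogen fraction correction can be illustrated from atomic masses alone. This sketch covers only that stoichiometric step under stated assumptions: the concentrations are hypothetical, the fractions are computed here rather than taken from the package, and the TN to TP regression is not reproduced because its coefficients do not appear in this text.

```r
# Approximate atomic masses (g/mol)
m_n <- 14.007
m_h <- 1.008
m_o <- 15.999

# Mass fraction of nitrogen in ammonium (NH4) and nitrate (NO3)
f_nh4 <- m_n / (m_n + 4 * m_h)  # about 0.78
f_no3 <- m_n / (m_n + 3 * m_o)  # about 0.23

# Hypothetical monthly ion concentrations (mg/L)
nh4 <- c(0.10, 0.25)
no3 <- c(0.50, 0.80)

# Nitrogen contributed by both ions (mg/L as N)
tn <- nh4 * f_nh4 + no3 * f_no3
tn
```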
Missing data (-9 values) can be filled using monthly means from the previous five years where data exist for that month. If there are fewer than five previous years of data for that month, the missing value is not filled.
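The fill rule can be sketched in base R for a single month. The data are hypothetical, and reading "previous five years" as the five calendar years immediately before the missing value is an assumption of this sketch rather than a statement of how util_prepverna is implemented.

```r
# Hypothetical January concentrations by year; -9 marks a missing value
dat <- data.frame(
  yr = 2015:2021,
  mo = 1,
  conc = c(0.20, 0.30, 0.25, 0.35, 0.40, -9, 0.30)
)

# Recode the missing-value flag to NA
dat$conc[dat$conc == -9] <- NA

# Fill each missing value with the mean of the same month in the prior
# five years, but only when all five prior values exist
filled <- dat$conc
for (i in which(is.na(dat$conc))) {
  prior <- dat$conc[dat$mo == dat$mo[i] & dat$yr %in% (dat$yr[i] - 5:1)]
  prior <- prior[!is.na(prior)]
  if (length(prior) == 5) filled[i] <- mean(prior)
}
dat$conc <- filled
dat
```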
Years with incomplete seasonal data will be filled with NA values if fillmis = FALSE or filled with monthly means if fillmis = TRUE.
A data frame with total nitrogen and phosphorus estimates as mg/l for each year and month of the input data
fl <- system.file('extdata/verna-raw.csv', package = 'tbeploads')
util_prepverna(fl, typ = 'AD')
Add column names for point source from raw entity data
util_ps_addcol(dat)
dat |
data frame from raw entity data as |
The function checks for TN, TP, TSS, and BOD. If any of these are missing, the columns are added with empty values including a column for units. If BOD is missing but CBOD is present, the CBOD column is renamed to BOD.
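The column handling described above can be sketched as follows. The column names (`TN`, `TP`, `TSS`, `BOD`, and their `.Unit` companions) are hypothetical placeholders chosen for illustration and may not match the names used in the raw entity files.

```r
# Hypothetical raw data with CBOD instead of BOD and no TSS column
dat <- data.frame(
  Year = 2019, Month = 1:2,
  TN = c(2.1, 1.8), TN.Unit = "mg/L",
  TP = c(0.4, 0.5), TP.Unit = "mg/L",
  CBOD = c(3.0, 2.5), CBOD.Unit = "mg/L"
)

# Rename CBOD to BOD when BOD is absent
if (!"BOD" %in% names(dat) && "CBOD" %in% names(dat)) {
  names(dat) <- sub("^CBOD", "BOD", names(dat))
}

# Add any missing parameter with empty values and an empty unit column
for (par in c("TN", "TP", "TSS", "BOD")) {
  if (!par %in% names(dat)) {
    dat[[par]] <- NA_real_
    dat[[paste0(par, ".Unit")]] <- ""
  }
}
names(dat)
```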
Input data frame from pth as is if column names are correct, otherwise additional columns are added as needed.
pth <- system.file('extdata/ps_dom_hillsco_falkenburg_2019.txt', package = 'tbeploads')
dat <- read.table(pth, skip = 0, sep = '\t', header = TRUE)
util_ps_addcol(dat)
Create a data frame of formatting issues with point source input files
util_ps_checkfls(fls)
fls |
vector of file paths to raw facility data, one to many |
The chk column indicates the issue with the file and will indicate "ok" if no issues are found, "read error" if the file cannot be read, and "check columns" if the column names are not as expected. Any file not showing "ok" should be checked for issues.
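A file check that produces these three outcomes might look like the sketch below. The helper `check_file` and its expected column names are hypothetical; the package's actual checks are not shown in this text and likely differ.

```r
# Hypothetical helper; expected column names are placeholders for illustration
check_file <- function(fl, expected = c("Year", "Month")) {
  dat <- tryCatch(
    suppressWarnings(read.table(fl, sep = "\t", header = TRUE)),
    error = function(e) NULL
  )
  if (is.null(dat)) return("read error")
  if (!all(expected %in% names(dat))) return("check columns")
  "ok"
}

# One well-formed file, one with unexpected columns, one that does not exist
good <- tempfile(fileext = ".txt")
write.table(data.frame(Year = 2019, Month = 1:2), good, sep = "\t", row.names = FALSE)
odd <- tempfile(fileext = ".txt")
write.table(data.frame(a = 1:2, b = 3:4), odd, sep = "\t", row.names = FALSE)
nofile <- tempfile(fileext = ".txt")

fls <- c(good, odd, nofile)
res <- data.frame(
  name = basename(fls),
  chk = vapply(fls, check_file, character(1), USE.NAMES = FALSE)
)
res
```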
Each file that can be read without error is also checked with util_ps_checkuni.
The function cannot be used with files for material losses.
A data.frame with three columns: name for the file name, chk for the file issue, and nms for a concatenated string of column names for the file
fls <- system.file('extdata/ps_dom_hillsco_falkenburg_2019.txt', package = 'tbeploads')
util_ps_checkfls(fls)
Check units for point source from raw entity data
util_ps_checkuni(dat)
dat |
data frame from raw entity data as |
Input data should include flow as million gallons per day, and concentration as mg/L.
Input data frame from pth with relevant data and columns renamed, otherwise an error is returned if units are not correct. Only year, month, outfall, flow, TN, TP, TSS, and BOD are returned.
pth <- system.file('extdata/ps_dom_hillsco_falkenburg_2019.txt', package = 'tbeploads')
dat <- read.table(pth, skip = 0, sep = '\t', header = TRUE)
util_ps_checkuni(dat)
Get point source entity information from file name
util_ps_facinfo(pth, asdf = FALSE)
pth |
path to raw entity data |
asdf |
logical, if |
Bay segment is an integer with values of 1, 2, 3, 4, 5, 6, 7, and 55 for Old Tampa Bay, Hillsborough Bay, Middle Tampa Bay, Lower Tampa Bay, Boca Ciega Bay, Terra Ceia Bay, Manatee River, and Boca Ciega Bay South, respectively.
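For reference, the code-to-name mapping listed above can be held in a named vector for lookups. This is only a convenience sketch; the object name `bay_segs` is not part of the package.

```r
# Bay segment codes and names as listed above
bay_segs <- c(
  "1" = "Old Tampa Bay",
  "2" = "Hillsborough Bay",
  "3" = "Middle Tampa Bay",
  "4" = "Lower Tampa Bay",
  "5" = "Boca Ciega Bay",
  "6" = "Terra Ceia Bay",
  "7" = "Manatee River",
  "55" = "Boca Ciega Bay South"
)

# Look up names for a vector of integer codes
codes <- c(2L, 55L, 7L)
seg_names <- unname(bay_segs[as.character(codes)])
seg_names
```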
A list or data.frame (if asdf = TRUE) with entity, facility, permit, facility id, coastal id, and coastal subbasin code
pth <- system.file('extdata/ps_dom_hillsco_falkenburg_2019.txt', package = 'tbeploads')
util_ps_facinfo(pth)
Fill missing point source data with annual average
util_ps_fillmis(dat)
dat |
data frame from raw entity data as |
Missing concentration data are replaced with the average for the outfall in a given year. All flow data are also floored at zero. Rows with missing flow data are assigned 0 for all data. Rows with zero flow are assigned a concentration of zero.
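These rules can be sketched in base R for one outfall and one parameter. The column names and values are hypothetical, and the order of operations (averaging before zeroing) is an assumption of this sketch.

```r
# Hypothetical monthly data for one outfall in one year
dat <- data.frame(
  outfall = "D-001", yr = 2019, mo = 1:5,
  flow = c(1.2, -0.1, NA, 0.8, 1.0),
  tn = c(3.0, 2.0, 4.0, NA, 5.0)
)

# Fill missing concentrations with the outfall-year average
avg <- ave(dat$tn, dat$outfall, dat$yr, FUN = function(x) mean(x, na.rm = TRUE))
dat$tn <- ifelse(is.na(dat$tn), avg, dat$tn)

# Missing flow becomes zero, and negative flow is floored at zero
dat$flow[is.na(dat$flow)] <- 0
dat$flow <- pmax(dat$flow, 0)

# Zero flow implies zero concentration
dat$tn[dat$flow == 0] <- 0
dat
```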
Input data frame as is if no missing values, otherwise missing data filled as described above.
pth <- system.file('extdata/ps_dom_hillsco_falkenburg_2019.txt', package = 'tbeploads')
dat <- read.table(pth, skip = 0, sep = '\t', header = TRUE)
dat <- util_ps_checkuni(dat)
util_ps_fillmis(dat)
Light edits to the outfall ID column for point source data
util_ps_fixoutfall(dat)
dat |
data frame from raw entity data as |
The outfall ID column is edited lightly: leading and trailing white space is removed, a hyphen is added between letters and numbers if missing, and the "Outfall" prefix is removed if present.
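Edits of this kind can be expressed with a few regular expressions. The patterns below are a sketch under assumed ID formats (letters followed by digits, such as D001) and are not taken from the package source.

```r
# Hypothetical outfall IDs with inconsistent formatting
ids <- c(" D001 ", "Outfall D-002", "R003", "D-004")

# Remove leading and trailing white space
out <- trimws(ids)

# Drop an "Outfall" prefix if present
out <- sub("^Outfall\\s*", "", out)

# Insert a hyphen between letters and numbers when missing
out <- sub("^([A-Za-z]+)([0-9]+)$", "\\1-\\2", out)
out
```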
Input data frame as is, with any edits to the outfall ID column.
pth <- system.file('extdata/ps_ind_busch_busch_2020.txt', package = 'tbeploads')
dat <- read.table(pth, skip = 0, sep = '\t', header = TRUE)
util_ps_fixoutfall(dat)
Downloads and parses monthly Part A Discharge Monitoring Report (DMR) PDFs for the AOC LLC industrial wastewater facility (NPDES permit FL0029653, Polk County, FL) from the Florida Department of Environmental Protection (FDEP) OCULUS public document management system. Returns Average Daily Flow and Total Nitrogen concentration for each available monitoring month.
util_ps_getaoc(yr, search_xlsx, pdf_dir = NULL, out_file = NULL, quiet = FALSE)
yr |
numeric (length 1), the monitoring year to retrieve (e.g., 2025). |
search_xlsx |
character, path to an OCULUS search-results spreadsheet for facility FL0029653. See Details for instructions on generating this file. |
pdf_dir |
character or NULL. Directory in which to save the downloaded
PDFs. If |
out_file |
character or NULL. If provided, the results data frame is
written to this path as an |
quiet |
logical. Suppress progress messages (default |
The search_xlsx file is an Excel export from the FDEP OCULUS public
document portal. To generate it:
1. Navigate to https://depedms.dep.state.fl.us in a web browser.
2. Click Public Oculus Login (no account required).
3. In the search form, set:
   - Catalog: Wastewater
   - Profile: Sampling
   - Facility-Site ID: FL0029653 (this does not go in the Permit Number field)
   - Document Date: From MM-DD-YYYY to MM-DD-YYYY (covering the desired monitoring year)
   - Document Type: Discharge Monitoring Report (DMR)
4. Run the search and export the results to Excel (use the Export to Excel button).
5. Save the exported .xlsx file and pass its path as search_xlsx.
The file must contain HYPERLINK() formulas in column A pointing to
the individual DMR PDFs and document subject lines in column K. Both
are present in any standard OCULUS search export.
The function keeps only monthly (MO) Part A documents for the requested
year. Annual summary (YR), Part B daily tables, and other document types
are excluded automatically. If a month has multiple submissions (e.g., a
revision), the most recently filed document is used.
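Keeping the most recently filed document per month can be sketched as a sort followed by de-duplication. The data frame and its column names (`mo`, `filed`, `url`) are hypothetical and do not reflect the structure of the OCULUS export.

```r
# Hypothetical document index: monitoring month, filing date, and link
docs <- data.frame(
  mo = c(1, 1, 2),
  filed = as.Date(c("2025-02-10", "2025-03-01", "2025-03-12")),
  url = c("doc_a.pdf", "doc_b.pdf", "doc_c.pdf")
)

# Sort by month with the newest filing first, then keep the first row per month
docs <- docs[order(docs$mo, -as.numeric(docs$filed)), ]
latest <- docs[!duplicated(docs$mo), ]
latest
```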
For some facilities the OCULUS cycle label (e.g., "JAN MO") may not align with the calendar month of the monitoring period. The month returned in the output is always derived from the monitoring period dates inside the PDF, not from the OCULUS label.
When the facility reports No Observable Discharge, adf_mgd is set to 0
and tn_mgl is set to NA. A zero flow value therefore implies no
discharge for that month.
Total phosphorus (TP), biochemical oxygen demand (BOD), and total suspended solids (TSS) have not been recorded at this facility in recent years and are not included in the output.
On the initial run, supply a pdf_dir path so the downloaded PDFs are
retained for inspection. Verify that the monitoring months, flow values,
and TN concentrations in the output data frame match those in the PDFs.
A data frame with one row per available monitoring month, sorted by
month. Calendar months for which no Part A document was found are omitted
(a message is printed when quiet = FALSE). Columns:
| Column | Type | Description |
|---|---|---|
| yr | integer | Monitoring year (from the PDF monitoring period). |
| mo | integer | Calendar month (1–12, from the PDF monitoring period). |
| adf_mgd | numeric | Average Daily Flow (MGD). 0 indicates no discharge (NOD). |
| tn_mgl | numeric | Total nitrogen grab-sample concentration (mg/L). NA when no discharge or not reported. |
## Not run:
# Retrieve 2025 AOC DMR data
# (requires an OCULUS search spreadsheet generated as described in Details)
df <- util_ps_getaoc(
  yr = 2025,
  search_xlsx = "AOC_OCULUSSearchData_2025.xlsx"
)
# Keep PDFs and save results to Excel
df <- util_ps_getaoc(
  yr = 2025,
  search_xlsx = "AOC_OCULUSSearchData_2025.xlsx",
  pdf_dir = "~/Desktop/AOC_DMR_2025",
  out_file = "~/Desktop/AOC_DMR_2025_results.xlsx"
)
## End(Not run)
Downloads and parses monthly Discharge Monitoring Report (DMR) PDFs for the MacDill Air Force Base wastewater treatment plant (NPDES permit FLA012124, Hillsborough County, FL) from the FDEP OCULUS public document system. Returns monthly effluent parameters for three discharge outfalls.
util_ps_getmacdill(
  yr,
  search_xlsx,
  pdf_dir = NULL,
  out_file = NULL,
  quiet = FALSE
)
yr |
numeric (length 1), the monitoring year to retrieve (e.g., 2025). |
search_xlsx |
character, path to an OCULUS search-results spreadsheet for facility FLA012124. See Details for instructions on generating this file. |
pdf_dir |
character or NULL. Directory in which to save the downloaded
PDFs. If |
out_file |
character or NULL. If provided, the results data frame is
written to this path as an |
quiet |
logical. Suppress progress messages (default |
The search_xlsx file is an Excel export from the FDEP OCULUS public
document portal. To generate it:
1. Navigate to https://depedms.dep.state.fl.us in a web browser.
2. Click Public Oculus Login (no account required).
3. In the search form, set:
   - Catalog: Wastewater
   - Profile: Sampling
   - Facility-Site ID: FLA012124 (not the Permit Number field)
   - Document Date: From MM-DD-YYYY to MM-DD-YYYY (covering the desired year)
   - Document Type: Discharge Monitoring Report (DMR)
4. Run the search and export the results to Excel (Export to Excel button).
5. Save the exported .xlsx and pass its path as search_xlsx.
The file must contain HYPERLINK() formulas in column A and document
subject lines in column K, both of which are present in any standard
OCULUS search export.
All monthly (MO) documents for the requested year are downloaded and
inspected. Each PDF is then classified by its actual content:
Part A (monthly summary) — contains the official permit-limit table with pre-computed monthly averages and permit compliance results.
Part B (daily sample results) — contains a day-by-day table of flow and effluent quality measurements for a given month.
The OCULUS document labels ("Part A", "Part B") are not always reliable for
this facility, so the function detects content type from the PDF text. Annual
summary (YR) documents are excluded automatically.
Some older submissions are scanned image PDFs with no embedded text layer.
These cannot be parsed and are saved as
macdill_unclassified_{document_subject}.pdf in pdf_dir (when
pdf_dir is supplied) so you can identify them from their OCULUS subject
line and enter the values manually if needed.
Where a Part B daily table is available for a given calendar month, BOD and
TSS are computed as the mean of all observed daily values using the
substitution rules <1 → 0.5 and <2 → 1.0 (consistent with the
2022–2024 reporting methodology). Monthly average flow is also derived from
Part B daily readings when available. For months with no Part B, Part A
monthly-summary values are used for BOD, TSS, and flow.
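The substitution of censored daily values before averaging can be sketched as follows. The daily values are hypothetical, and only the two substitution rules named above are handled.

```r
# Hypothetical daily TSS values as reported, including censored entries
tss_raw <- c("3.2", "<1", "<2", "4.0")

# Apply the substitution rules, then convert to numeric
tss_chr <- tss_raw
tss_chr[tss_chr == "<1"] <- "0.5"
tss_chr[tss_chr == "<2"] <- "1.0"
tss_num <- as.numeric(tss_chr)

# Monthly value as the mean of daily observations
tss_mo <- mean(tss_num)
tss_mo
```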
Total nitrogen (tn_mgl) is always sourced from Part A because Part B
tables do not include a TN column.
The OCULUS document cycle labels (e.g., "JAN MO") do not always align with
the calendar month of the monitoring period. The mo column in the output
is always derived from the monitoring period dates inside the PDF, not
from the OCULUS label.
NOD (No Observable Discharge) is treated as zero flow and NA
concentration. ANC (Acceptable Not Collected) is treated as NA.
A data frame with three rows per available monitoring month (one per
outfall), sorted by month then outfall. Calendar months for which no
usable document was found are omitted (a message is printed when
quiet = FALSE). Columns:
| Column | Type | Description |
|---|---|---|
| yr | integer | Monitoring year (from the PDF monitoring period). |
| mo | integer | Calendar month (1–12). |
| outfall | character | Outfall ID: "R-001", "R-002", or "R-003". |
| flow_mgd | numeric | Average Daily Flow (MGD). 0 when NOD. |
| bod_mgl | numeric | BOD (mg/L). NA when ANC or not collected. |
| tss_mgl | numeric | TSS (mg/L). NA when ANC or not collected. |
| tn_mgl | numeric | Total nitrogen as NO3-N (mg/L), R-003 only. NA for R-001 and R-002. |
| verify | logical | TRUE when one or more concentration values (TSS for R-001/R-003, TN for R-003) are single-day maximums from Part A rather than monthly averages. This occurs when no machine-readable Part B was available. Cross-check these values against the original PDF. |
util_ps_checkfls(), anlz_ips()
## Not run:
# Retrieve 2025 MacDill DMR data
df <- util_ps_getmacdill(
  yr = 2025,
  search_xlsx = "MacDill_OCULUSSearchData_2025.xlsx"
)
# Keep PDFs and save results to Excel
df <- util_ps_getmacdill(
  yr = 2025,
  search_xlsx = "MacDill_OCULUSSearchData_2025.xlsx",
  pdf_dir = "~/Desktop/MacDill_DMR_2025",
  out_file = "~/Desktop/MacDill_DMR_2025_results.xlsx"
)
## End(Not run)
Fill total phosphorus, total suspended solids, and biological oxygen demand for miscellaneous industrial point source records where one or more of these parameters are unmeasured and must be estimated from historical averages.
util_ps_misc(dat)
dat |
data frame for a single facility with columns matching the
standard clean IPS format: |
Unlike Mosaic facilities (see util_ps_mosaic), the facilities
handled here already report some concentration parameters from DMR data.
This function fills only those parameters that are chronically unmeasured,
using historical averages or standard assumptions. Measured values are
preserved unless the fill rule explicitly overrides them (e.g., Trademark
Nitrogen TP and BOD are always set to historical means regardless of any
reported value).
The general fill rule is: when flow is zero or missing, all concentrations
are set to NA; when flow is positive, fill values are applied to
unmeasured parameters and measured values are retained for everything else.
Unit strings are standardised to "mg/L" when a value is present and
"" otherwise.
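The general fill rule can be sketched for a single parameter. The column names and the fill value of 5 are illustrative assumptions; facility-specific overrides (such as always replacing measured values) are not shown.

```r
# Hypothetical records for one facility
dat <- data.frame(
  flow = c(0.78, 0.50, 0, NA),
  TSS = c(NA, 4.2, 3.0, NA)
)
tss_fill <- 5  # illustrative fill value only

# Positive flow: keep measured values and fill unmeasured ones
# Zero or missing flow: concentration is NA
pos <- !is.na(dat$flow) & dat$flow > 0
dat$TSS <- ifelse(pos, ifelse(is.na(dat$TSS), tss_fill, dat$TSS), NA_real_)

# Unit is "mg/L" when a value is present and "" otherwise
dat$TSS.Unit <- ifelse(is.na(dat$TSS), "", "mg/L")
dat
```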
As with util_ps_mosaic, this function may need to be updated if new data
become available or if the fill rules change. The current fill values and
rules are based on historical permit compliance data and may not reflect
future conditions.
The input data frame with Total.P, TSS, BOD,
and their unit columns updated. All other columns are returned unchanged
except that unit strings are standardised and concentrations are set to
NA for zero- or missing-flow months.
TP = 1, TSS = 2, BOD = 6; last recorded BOD and TSS from Dec 2017 minimal discharge; TP from Grizzle-Figg limits
TP = 1.73, TSS = 12.11, BOD = 9.6; TP and TSS from 95-98 averages; BOD from Harper 1994
BOD = 9.6 (Harper 1994); TP and TSS from actual DMR measurements
TSS = 5, BOD = 9.6 (Harper 1994); TP from actual DMR measurements
BOD = 9.6 (Harper 1994); TP and TSS from actual DMR measurements
TP = 0, TSS = 0, BOD = 9.6; no TP or TSS info; BOD same as Winston Yard previously
TP = 0, TSS = 0, BOD = 9.6 (Harper 1994); no TP or TSS info
TP = 0, BOD = 9.6 (Harper 1994); TSS from actual DMR measurements; no TP measurements
BOD = 9.6 (Harper 1994); TP and TSS from actual DMR measurements
BOD = 9.6 (Harper 1994); TP and TSS from actual DMR measurements
TP = 1, TSS = 12.96, BOD = 9.6; TP from Grizzle-Figg limits; TSS avg 2012-2014; BOD Harper 1994
TSS = 5, BOD = 9.6 (Harper 1994); TP from actual DMR measurements; TN and TP for September 2023 filled with adjacent-month means (TN = 0.967, TP = 0.17)
TP = 0.13333, BOD = 1.09833; both filled with means from 1995-1998 loadings regardless of measured values; TSS from actual DMR measurements
util_ps_mosaic for filling missing data for Mosaic facilities
dat <- data.frame(
  Permit.Number = rep('FL0185833', 3),
  Facility.Name = rep('Busch Gardens', 3),
  Outfall.ID = rep('D-002', 3),
  Year = rep(2022L, 3),
  Month = 1:3,
  Average.Daily.Flow..ADF...mgd. = c(0.78, 0.50, 0),
  Total.N = c(0.45, 0.06, NA),
  Total.N.Unit = c('mg/L', 'mg/L', ''),
  Total.P = c(0.08, 0.08, NA),
  Total.P.Unit = c('mg/L', 'mg/L', ''),
  TSS = c(NA, NA, NA),
  TSS.Unit = c('', '', ''),
  BOD = c(NA, NA, NA),
  BOD.Unit = c('', '', '')
)
util_ps_misc(dat)
Fill total phosphorus, total suspended solids, and biological oxygen demand for Mosaic industrial point source records, which contain only flow and total nitrogen.
util_ps_mosaic(dat)
dat |
data frame for a single Mosaic facility with columns
|
Mosaic data contain only average daily flow (MGD) and total nitrogen (TN, mg/L). Total phosphorus (TP), total suspended solids (TSS), and biological oxygen demand (BOD) are not measured and are filled with historical averages derived from earlier permit compliance data (2008-2011 averages unless otherwise noted).
The general fill rule is: if average daily flow is zero or missing, TP, TSS, and
BOD are set to NA; if flow is positive, they are assigned the per-facility
(or per-outfall) historical average. A few facilities always receive
the historical fill regardless of flow: Mosaic Bartow, Green Bay, New Wales, South
Pierce, and Bonnie outfall D-003. Missing flow values are replaced with zero.
A permit number is added for each facility from FDEP permit records.
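The flow-dependent and always-filled cases can be contrasted with a small helper. The function `fill_tp` and the fill value of 1.5 are hypothetical and are not part of the package.

```r
# Hypothetical helper: returns the fill value where the rule applies, NA otherwise
fill_tp <- function(flow, tp_fill, always_fill = FALSE) {
  # Missing flow is treated as zero
  flow[is.na(flow)] <- 0
  # Fill when flow is positive, or for every record if always_fill is TRUE
  ifelse(always_fill | flow > 0, tp_fill, NA_real_)
}

flow <- c(0.57, 0, NA)

# General rule: fill only when flow is positive
tp_general <- fill_tp(flow, tp_fill = 1.5)

# Facilities that are always filled regardless of flow
tp_always <- fill_tp(flow, tp_fill = 1.5, always_fill = TRUE)

tp_general
tp_always
```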
A data frame with columns: Permit.Number, Facility.Name,
Outfall.ID, Year, Month,
Average.Daily.Flow..ADF...mgd., Total.N, Total.N.Unit,
Total.P, Total.P.Unit, TSS, TSS.Unit,
BOD, BOD.Unit.
TP = 1.61, TSS = 8.38, BOD = 9.6, always filled
TP = 0.56, TSS = 8.2, BOD = 2.45
TP = 0.18, TSS = 3.40, BOD = 9.6
TP = 0.85, TSS = 1.63, BOD = 9.6
TP = 0.50, TSS = 2.26, BOD = 9.6
TP = 0.23, TSS = 1.70, BOD = 9.6
TP = 0.73, TSS = 26.46, BOD = 9.6
TP = 2.30, TSS = 6.58, BOD = 9.6, always filled
TP = 1.12, TSS = 12.7, BOD = 9.6
TP = 4.23, TSS = 7.90, BOD = 9.6, always filled
TP = 25.3, TSS = 9.35, BOD = 9.6
TP = 0.016, TSS = 2.4, BOD = 9.6
TP = 2.21, TSS = 5.07, BOD = 9.6
TP = 6.67, TSS = 6.78, BOD = 9.6
TP = 0.27, TSS = 4.9, BOD = 9.6, always filled
TP = 0.21, TSS = 1.95, BOD = 1.85
TP = 0.65, TSS = 12.0, BOD = 9.6
TP = 0.66, TSS = 14.4, BOD = 9.6
TP = 10.65, TSS = 11.49, BOD = 1.8
TP = 10.65, TSS = 8.70, BOD = 1.8
TP = 1.50, TSS = 3.58, BOD = 9.6, always filled
TP = 22.0, TSS = 49.6, BOD = 9.6
TP = 25.3, TSS = 9.33, BOD = 9.6
Mosaic Hookers Prairie and Mosaic Riverview Stack Closure have no established
fill values; TP, TSS, and BOD will be NA for those facilities.
Facilities with named outfall rules (Bonnie, Four Corners, Mulberry,
Mulberry Phospho Stack, Riverview, and Tampa Marine Terminal) require that
every outfall present in dat matches a known entry. An error is returned
for unrecognised outfalls.
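A check of this kind might look like the following sketch. The function name check_outfalls and the outfall identifiers are hypothetical; the actual lookup tables used by util_ps_mosaic are not reproduced here.

```r
# Hypothetical validation helper; outfall IDs are illustrative only.
check_outfalls <- function(outfalls, known) {
  bad <- setdiff(unique(outfalls), known)
  if (length(bad) > 0) {
    stop("Unrecognised outfall(s): ", paste(bad, collapse = ", "))
  }
  invisible(TRUE)
}

known <- c("D-001", "D-002")  # assumed list of known outfalls for one facility
ok <- check_outfalls(c("D-001", "D-001", "D-002"), known)
res <- try(check_outfalls(c("D-001", "D-009"), known), silent = TRUE)
```

Failing early on an unknown outfall avoids silently assigning NA or the wrong historical average to new data.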
Note that this function may need to be updated if new data become available or if there are changes in the fill rules. The current fill values and rules are based on historical permit compliance data and may not reflect future conditions.
util_ps_misc for filling missing data for miscellaneous industrial
point source facilities
dat <- data.frame(
  Facility.Name = rep('Mosaic Bartow', 3),
  Outfall.ID = rep('D-001', 3),
  Year = rep(2022L, 3),
  Month = 1:3,
  Average.Daily.Flow..ADF...mgd. = c(0.57, 0, 0.43),
  Total.N = c(3.94, NA, 2.11)
)
util_ps_mosaic(dat)
Create Pasco Reuse point source input data from external hydrologic volume inputs and a constant TN concentration
util_ps_pascoreuse(
  yr,
  res,
  golf = rep(0, length(yr)),
  ribs = rep(0, length(yr)),
  ag = rep(0, length(yr)),
  tn_conc = 9,
  n_coastal = 2
)
| yr | integer vector of years |
|---|---|
| res | numeric vector of residential reuse volumes (million gallons per year x 1000), one value per year |
| golf | numeric vector of golf course reuse volumes (million gallons per year x 1000), one value per year |
| ribs | numeric vector of rapid infiltration basin volumes (million gallons per year x 1000), one value per year |
| ag | numeric vector of agricultural reuse volumes (million gallons per year x 1000), one value per year |
| tn_conc | numeric, constant TN concentration in mg/L applied to all records (default 9) |
| n_coastal | integer, number of coastal bay segment codes over which flow is divided equally (default 2) |
Pasco County reuse hydrologic inputs are provided externally as annual volumes (in million gallons per year x 1000) broken into
land use categories: residential, golf courses, rapid infiltration basins (RIBs), and agriculture.
These are summed and converted to million gallons (MG) per year, then distributed evenly across
12 months and divided equally across n_coastal coastal bay segment codes to produce average
daily flow in MGD. A constant TN concentration of tn_conc mg/L is assumed. TP, TSS, and
BOD are set to zero.
The output format matches the standard point source input data frame used by anlz_dps_facility.
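The unit conversion described above can be worked through by hand. The sketch below uses the 2022 inputs from the example for this function; converting a monthly volume to MGD by dividing by the number of days in each month is an assumption made for illustration and may differ from what util_ps_pascoreuse does internally.

```r
# Inputs in "million gallons per year x 1000" (2022 values from the example)
res <- 744120; golf <- 0; ribs <- 0; ag <- 169
n_coastal <- 2

# Sum land use categories and convert to million gallons (MG) per year
total_mg <- (res + golf + ribs + ag) / 1000

# Spread evenly over 12 months and split equally across coastal codes
mg_per_month_code <- total_mg / 12 / n_coastal

# ASSUMPTION: monthly volume divided by days in month gives MGD
days <- c(31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)  # 2022, non-leap year
adf_mgd <- mg_per_month_code / days
```

Multiplying adf_mgd by days and by n_coastal and summing recovers the annual total, which is a quick check that no volume is lost in the split.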
A data.frame with one row per year-month combination and columns:
character, "PascoReuse"
character, "Pasco Reuse"
character, "R-001"
integer
integer, 1–12
numeric, average daily flow in MGD
numeric, TN concentration in mg/L
character, "mg/l"
numeric, 0
character, "mg/l"
numeric, 0
character, "mg/l"
numeric, 0
character, "mg/l"
util_ps_pascoreuse(
  yr = 2022:2024,
  res = c(744120, 522273, 344189),
  golf = c(0, 0, 0),
  ribs = c(0, 0, 0),
  ag = c(169, 269, 153)
)
Retrieve spring water quality data from APIs
util_spr_getwq(yrrng, verbose = TRUE)
| yrrng | integer vector of length 2, start and end year, e.g. |
|---|---|
| verbose | logical, if |
Fetches annual mean TN, TP, and TSS concentrations (mg/L) for Lithia, Buckhorn, and Sulphur springs from two external sources.
Lithia and Buckhorn springs are retrieved from the
Water Atlas API
(GET /api/samplingdata/stream), using Southwest Florida Water
Management District (SWFWMD) monitoring stations 17805 (Lithia Main Spring)
and 18276 (Buckhorn Main Spring) from the WIN_21FLSWFD data source.
This is the same underlying dataset as FDEP's Impaired Waters Rule file but
accessed directly via API. TSS is not routinely measured at these
stations and will typically be NA, in which case anlz_spr
substitutes fixed historical values.
Sulphur Spring data are retrieved via
read_importepc, which downloads the Environmental
Protection Commission of Hillsborough County (EPC) monitoring spreadsheet.
Station 174 corresponds to the Sulphur Spring sampling location and provides
monthly TN, TP, and TSS observations.
Annual means are computed across all observations within each calendar year.
TSS values that are NaN (i.e., no valid observations in a year) are
converted to NA so that anlz_spr can apply the fixed
fallback concentrations.
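The NaN handling matters because mean() with na.rm = TRUE returns NaN, not NA, when every observation in a group is missing. A minimal base R sketch with made-up observations (the values and the obs data frame are hypothetical):

```r
# Hypothetical observations for one spring; values are illustrative only
obs <- data.frame(
  yr  = c(2022, 2022, 2023, 2023),
  tn  = c(0.4, 0.6, 0.5, 0.7),
  tss = c(2, 4, NA, NA)          # no valid TSS observations in 2023
)

# Annual means across all observations in each calendar year
ann <- data.frame(
  yr      = sort(unique(obs$yr)),
  tn_mgl  = as.numeric(tapply(obs$tn, obs$yr, mean, na.rm = TRUE)),
  tss_mgl = as.numeric(tapply(obs$tss, obs$yr, mean, na.rm = TRUE))
)

# The 2023 TSS mean is NaN at this point; convert it to NA
ann$tss_mgl[is.nan(ann$tss_mgl)] <- NA
```

After the conversion, downstream code can test for missing TSS with is.na() alone.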
A data frame with columns spring, yr, tn_mgl,
tp_mgl, and tss_mgl (one row per spring per year).
## Not run:
wqdat <- util_spr_getwq(c(2022, 2024))
## End(Not run)
Summarize load estimates
util_summ(
  dat,
  summ = c("entity", "facility", "basin", "segment", "all"),
  summtime = c("month", "year")
)
| dat | Pre-processed data frame of load estimates, see examples |
|---|---|
| summ | chr string indicating how the returned data are summarized, see details |
| summtime | chr string indicating how the returned data are summarized temporally (month or year), see details |
The data are summarized differently based on the summ and summtime arguments. All loading data are summed based on these arguments, e.g., by bay segment (summ = 'segment') and year (summtime = 'year').
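As a rough illustration of what summing by bay segment and year means, the sketch below aggregates a small made-up table with base R. The data frame ld and its column names (segment, Year, Month, tn_load) are hypothetical and are not the columns util_summ expects.

```r
# Hypothetical monthly loads; column names and values are illustrative only
ld <- data.frame(
  segment = c("Hillsborough Bay", "Hillsborough Bay",
              "Middle Tampa Bay", "Middle Tampa Bay"),
  Year    = c(2022, 2022, 2022, 2023),
  Month   = c(1, 2, 1, 1),
  tn_load = c(1.2, 0.8, 0.5, 0.7)
)

# Sum loads within each segment and year, dropping the monthly resolution
ann <- aggregate(tn_load ~ segment + Year, data = ld, FUN = sum)
```

Each segment-year combination present in the input yields one row in the result.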
Data frame with summarized loading data based on user-supplied arguments
fls <- list.files(system.file('extdata/', package = 'tbeploads'),
  pattern = 'ps_ind_', full.names = TRUE)
ipsbyfac <- anlz_ips_facility(fls)

# add bay segment and source
# there should only be loads to Hillsborough, Middle, and Lower Tampa Bay
ipsld <- ipsbyfac |>
  dplyr::arrange(coastco) |>
  dplyr::left_join(dbasing, by = "coastco") |>
  dplyr::mutate(
    segment = dplyr::case_when(
      bayseg == 1 ~ "Old Tampa Bay",
      bayseg == 2 ~ "Hillsborough Bay",
      bayseg == 3 ~ "Middle Tampa Bay",
      bayseg == 4 ~ "Lower Tampa Bay",
      TRUE ~ NA_character_
    ),
    source = 'IPS'
  ) |>
  dplyr::select(-basin, -hectare, -coastco, -name, -bayseg)

util_summ(ipsld, summ = 'entity', summtime = 'year')