Point count surveys were completed across Tennessee looking for grassland birds. Surveyors examined which characteristics impacted the total number of birds detected at point counts. They measured environmental characteristics including litter depth, vegetation height, vegetation density, and percent cover of grass. Additionally, they measured the size of the site in hectares and the start time of the survey (minutes after sunrise).
What does our data look like?
median | mean | min | max |
---|---|---|---|
7 | 7.89 | 0 | 163 |
Will probably have some outliers in the data. Need to keep on eye out for this.
glm.global <- glm(total_individuals ~ litter_avg + veg_ht_avg + vd_vo + pc_grass + site_ha + min_from_sunrise, family = poisson(link = "log"), data = combined.detections)
summary(glm.global)
##
## Call:
## glm(formula = total_individuals ~ litter_avg + veg_ht_avg + vd_vo +
## pc_grass + site_ha + min_from_sunrise, family = poisson(link = "log"),
## data = combined.detections)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 2.3701619 0.0809845 29.267 < 2e-16 ***
## litter_avg 0.0314573 0.0118662 2.651 0.00803 **
## veg_ht_avg -0.0032496 0.0014786 -2.198 0.02797 *
## vd_vo 0.0019937 0.0015833 1.259 0.20795
## pc_grass 0.0098337 0.0046899 2.097 0.03601 *
## site_ha -0.0003578 0.0003780 -0.947 0.34387
## min_from_sunrise -0.0029974 0.0004384 -6.837 8.07e-12 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for poisson family taken to be 1)
##
## Null deviance: 1318.8 on 190 degrees of freedom
## Residual deviance: 1241.1 on 184 degrees of freedom
## AIC: 1926.5
##
## Number of Fisher Scoring iterations: 6
library(easystats)
check_model(glm.global)
There are 2 outliers in the data which are much higher counts
compared to the rest of the dataset. Those outliers are caused by large
flocks of migrating shorebirds. Let’s remove those two points since we
are targeting species that breed in TN.
What does our data look like now that we have removed the outliers?
check_model(glm.global.2)
We have a moderate amount of dispersion.
# check for collinearity
library(GGally)
ggpairs(combined.detections, columns = c(2, 3, 4, 5, 6, 8)) + theme_bw()
Vegetation height and vegetation density appear to be highly correlated. Need to be careful of that in the models.
# check for multicollinearity
library(performance)
performance::check_collinearity(glm.global.2)
## # Check for Multicollinearity
##
## Low Correlation
##
## Term VIF VIF 95% CI adj. VIF Tolerance Tolerance 95% CI
## litter_avg 1.05 [1.00, 2.10] 1.02 0.96 [0.48, 1.00]
## veg_ht_avg 3.08 [2.49, 3.91] 1.76 0.32 [0.26, 0.40]
## vd_vo 2.70 [2.19, 3.41] 1.64 0.37 [0.29, 0.46]
## pc_grass 1.32 [1.16, 1.64] 1.15 0.76 [0.61, 0.86]
## site_ha 1.05 [1.00, 1.91] 1.03 0.95 [0.52, 1.00]
## min_from_sunrise 1.06 [1.00, 1.76] 1.03 0.94 [0.57, 1.00]
VIF values are <5 so we do not need to worry about multicollinearity.
# display list of ranked models
model_comparison <- model.sel(models)
model_comparison
## Model selection table
## (Int) pc_grs veg_ht_avg sit_ha min_frm_snr ltt_avg vd_vo df logLik AICc delta weight
## 5 1.786 0.014400 -0.001663 0.004083 4 -546.395 1101.0 0.00 0.503
## 9 1.775 0.014320 -0.001646 0.009368 0.003981 5 -546.192 1102.7 1.70 0.214
## 8 1.786 0.009952 0.004017 -2.007e-04 -0.001759 5 -546.328 1103.0 1.98 0.187
## 12 1.764 0.011400 0.002275 -2.136e-04 -0.001696 0.009831 0.002243 7 -544.968 1104.6 3.55 0.085
## 10 1.964 -0.001903 0.004756 3 -551.584 1109.3 8.29 0.008
## 1 1.555 0.013260 0.003525 3 -553.924 1114.0 12.97 0.001
## 4 1.539 0.013010 0.003424 0.017080 4 -553.258 1114.7 13.72 0.001
## 11 1.882 0.018270 -0.001587 3 -554.482 1115.1 14.09 0.000
## 7 1.554 0.013270 0.003516 -3.265e-04 0.015790 5 -552.879 1116.1 15.08 0.000
## 6 1.884 0.018370 -6.485e-05 -0.001576 4 -554.468 1117.2 16.15 0.000
## 2 1.683 0.020700 -2.335e-04 3 -560.233 1126.6 25.59 0.000
## 3 2.129 1.652e-04 -0.001897 3 -563.319 1132.8 31.76 0.000
## Models ranked by AICc(x)
# select best supported models
subset(model_comparison, delta <5)
## Model selection table
## (Int) pc_grs veg_ht_avg sit_ha min_frm_snr ltt_avg vd_vo df logLik AICc delta weight
## 5 1.786 0.014400 -0.001663 0.004083 4 -546.395 1101.0 0.00 0.508
## 9 1.775 0.014320 -0.001646 0.009368 0.003981 5 -546.192 1102.7 1.70 0.217
## 8 1.786 0.009952 0.004017 -0.0002007 -0.001759 5 -546.328 1103.0 1.98 0.189
## 12 1.764 0.011400 0.002275 -0.0002136 -0.001696 0.009831 0.002243 7 -544.968 1104.6 3.55 0.086
## Models ranked by AICc(x)
Model 5, 9, and 8 all have delta AICc <2. Model 5 is 2.34 times more likely than model 9. (0.508/0.217=2.34) The global model does have a delta AICc <5.
# calculate variable importance weights
MuMIn::sw(model_comparison)
## min_from_sunrise pc_grass vd_vo litter_avg veg_ht_avg site_ha
## Sum of weights: 1.00 0.99 0.81 0.30 0.27 0.27
## N containing models: 8 10 4 4 5 6
# model averaging
model.avg <- model.avg(model_comparison, revised.var = TRUE)
model.avg
##
## Call:
## model.avg(object = model_comparison, revised.var = TRUE)
##
## Component models:
## '146' '1456' '1234' '123456' '46' '12' '125' '14' '1235' '134' '13' '34'
##
## Coefficients:
## (Intercept) vd_vo pc_grass min_from_sunrise litter_avg veg_ht_avg site_ha
## full 1.782717 0.003135970 0.01318085 -0.001679270 0.002861623 0.0009518711 -5.592074e-05
## subset 1.782717 0.003868789 0.01328673 -0.001681899 0.009518861 0.0034714627 -2.047905e-04
Now we can look at the direction of impact these variables have with the number of bird detections.
summary(model.avg(model_comparison, subset = delta < 2))
##
## Call:
## model.avg(object = model_comparison, subset = delta < 2)
##
## Component model call:
## glm(formula = <3 unique values>, family = poisson(link = "log"), data = combined.detections)
##
## Component models:
## df logLik AICc delta weight
## 146 4 -546.40 1101.01 0.00 0.56
## 1456 5 -546.19 1102.71 1.70 0.24
## 1234 5 -546.33 1102.98 1.98 0.21
##
## Term codes:
## pc_grass veg_ht_avg site_ha min_from_sunrise litter_avg vd_vo
## 1 2 3 4 5 6
##
## Model-averaged coefficients:
## (full average)
## Estimate Std. Error Adjusted SE z value Pr(>|z|)
## (Intercept) 1.783e+00 9.039e-02 9.099e-02 19.599 < 2e-16 ***
## vd_vo 3.213e-03 1.866e-03 1.869e-03 1.719 0.085562 .
## pc_grass 1.346e-02 4.922e-03 4.950e-03 2.720 0.006536 **
## min_from_sunrise -1.679e-03 4.647e-04 4.677e-04 3.589 0.000332 ***
## litter_avg 2.221e-03 8.133e-03 8.174e-03 0.272 0.785816
## veg_ht_avg 8.316e-04 1.688e-03 1.689e-03 0.492 0.622480
## site_ha -4.156e-05 1.911e-04 1.921e-04 0.216 0.828719
##
## (conditional average)
## Estimate Std. Error Adjusted SE z value Pr(>|z|)
## (Intercept) 1.7832395 0.0903929 0.0909880 19.599 < 2e-16 ***
## vd_vo 0.0040524 0.0009962 0.0010028 4.041 5.32e-05 ***
## pc_grass 0.0134619 0.0049218 0.0049500 2.720 0.006536 **
## min_from_sunrise -0.0016787 0.0004647 0.0004677 3.589 0.000332 ***
## litter_avg 0.0093681 0.0145603 0.0146567 0.639 0.522716
## veg_ht_avg 0.0040170 0.0009867 0.0009932 4.044 5.25e-05 ***
## site_ha -0.0002007 0.0003800 0.0003825 0.525 0.599694
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# calculating the exponentiated coefficients of our averaged model
exp(coef(model.avg))
## (Intercept) vd_vo pc_grass min_from_sunrise litter_avg veg_ht_avg site_ha
## 5.9459873 1.0038763 1.0133754 0.9983195 1.0095643 1.0034775 0.9997952
We can expect the following relationships: -Minutes from sunrise: 0.2% decrease in bird detections for every minute after sunrise -Percent grass: 1.3% increase in bird detections per 1% increase in grass cover -Vegetation density: 0.4% increase in bird detections with 1 unit increase in vegetation density -Litter depth: 0.1% increase in bird detections with 1 cm increase in litter depth -Vegetation height: 0.3% increase in bird detections with every 1 cm increase in vegetation height -Site size: 0.02% decrease in bird detections with every 1 ha increase in site size
In conclusion, the factors that impact bird detections include survey start time, the percent cover of grass, and the vegetation density. To maximize detections surveys should be conducted closer to sunrise, in areas with large amounts of grass, and in areas with dense vegetation.