Point count surveys were completed across Tennessee looking for grassland birds. Surveyors examined which characteristics impacted the total number of birds detected at point counts. They measured environmental characteristics including litter depth, vegetation height, vegetation density, and percent cover of grass. Additionally, they measured the size of the site in hectares and the start time of the survey (minutes after sunrise).

What does our data look like?

A summary kable dispaying the median, mean, minimum, and maximum number of detections at surveys.
median mean min max
7 7.89 0 163

Will probably have some outliers in the data. Need to keep on eye out for this.

Global Model

glm.global <- glm(total_individuals ~ litter_avg + veg_ht_avg + vd_vo + pc_grass + site_ha + min_from_sunrise, family = poisson(link = "log"), data = combined.detections)
summary(glm.global) 
## 
## Call:
## glm(formula = total_individuals ~ litter_avg + veg_ht_avg + vd_vo + 
##     pc_grass + site_ha + min_from_sunrise, family = poisson(link = "log"), 
##     data = combined.detections)
## 
## Coefficients:
##                    Estimate Std. Error z value Pr(>|z|)    
## (Intercept)       2.3701619  0.0809845  29.267  < 2e-16 ***
## litter_avg        0.0314573  0.0118662   2.651  0.00803 ** 
## veg_ht_avg       -0.0032496  0.0014786  -2.198  0.02797 *  
## vd_vo             0.0019937  0.0015833   1.259  0.20795    
## pc_grass          0.0098337  0.0046899   2.097  0.03601 *  
## site_ha          -0.0003578  0.0003780  -0.947  0.34387    
## min_from_sunrise -0.0029974  0.0004384  -6.837 8.07e-12 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 1318.8  on 190  degrees of freedom
## Residual deviance: 1241.1  on 184  degrees of freedom
## AIC: 1926.5
## 
## Number of Fisher Scoring iterations: 6
library(easystats)
check_model(glm.global) 


There are 2 outliers in the data which are much higher counts compared to the rest of the dataset. Those outliers are caused by large flocks of migrating shorebirds. Let’s remove those two points since we are targeting species that breed in TN.

What does our data look like now that we have removed the outliers?

check_model(glm.global.2) 

We have a moderate amount of dispersion.

Check for Collinearity

# check for collinearity 
library(GGally)
ggpairs(combined.detections, columns = c(2, 3, 4, 5, 6, 8)) + theme_bw()

Vegetation height and vegetation density appear to be highly correlated. Need to be careful of that in the models.

# check for multicollinearity
library(performance)
performance::check_collinearity(glm.global.2)
## # Check for Multicollinearity
## 
## Low Correlation
## 
##              Term  VIF   VIF 95% CI adj. VIF Tolerance Tolerance 95% CI
##        litter_avg 1.05 [1.00, 2.10]     1.02      0.96     [0.48, 1.00]
##        veg_ht_avg 3.08 [2.49, 3.91]     1.76      0.32     [0.26, 0.40]
##             vd_vo 2.70 [2.19, 3.41]     1.64      0.37     [0.29, 0.46]
##          pc_grass 1.32 [1.16, 1.64]     1.15      0.76     [0.61, 0.86]
##           site_ha 1.05 [1.00, 1.91]     1.03      0.95     [0.52, 1.00]
##  min_from_sunrise 1.06 [1.00, 1.76]     1.03      0.94     [0.57, 1.00]

VIF values are <5 so we do not need to worry about multicollinearity.

Model Selection

# display list of ranked models
model_comparison <- model.sel(models)
model_comparison
## Model selection table 
##    (Int)   pc_grs veg_ht_avg     sit_ha min_frm_snr  ltt_avg    vd_vo df   logLik   AICc delta weight
## 5  1.786 0.014400                         -0.001663          0.004083  4 -546.395 1101.0  0.00  0.503
## 9  1.775 0.014320                         -0.001646 0.009368 0.003981  5 -546.192 1102.7  1.70  0.214
## 8  1.786 0.009952   0.004017 -2.007e-04   -0.001759                    5 -546.328 1103.0  1.98  0.187
## 12 1.764 0.011400   0.002275 -2.136e-04   -0.001696 0.009831 0.002243  7 -544.968 1104.6  3.55  0.085
## 10 1.964                                  -0.001903          0.004756  3 -551.584 1109.3  8.29  0.008
## 1  1.555 0.013260   0.003525                                           3 -553.924 1114.0 12.97  0.001
## 4  1.539 0.013010   0.003424                        0.017080           4 -553.258 1114.7 13.72  0.001
## 11 1.882 0.018270                         -0.001587                    3 -554.482 1115.1 14.09  0.000
## 7  1.554 0.013270   0.003516 -3.265e-04             0.015790           5 -552.879 1116.1 15.08  0.000
## 6  1.884 0.018370            -6.485e-05   -0.001576                    4 -554.468 1117.2 16.15  0.000
## 2  1.683 0.020700            -2.335e-04                                3 -560.233 1126.6 25.59  0.000
## 3  2.129                      1.652e-04   -0.001897                    3 -563.319 1132.8 31.76  0.000
## Models ranked by AICc(x)
# select best supported models 
subset(model_comparison, delta <5)
## Model selection table 
##    (Int)   pc_grs veg_ht_avg     sit_ha min_frm_snr  ltt_avg    vd_vo df   logLik   AICc delta weight
## 5  1.786 0.014400                         -0.001663          0.004083  4 -546.395 1101.0  0.00  0.508
## 9  1.775 0.014320                         -0.001646 0.009368 0.003981  5 -546.192 1102.7  1.70  0.217
## 8  1.786 0.009952   0.004017 -0.0002007   -0.001759                    5 -546.328 1103.0  1.98  0.189
## 12 1.764 0.011400   0.002275 -0.0002136   -0.001696 0.009831 0.002243  7 -544.968 1104.6  3.55  0.086
## Models ranked by AICc(x)

Model 5, 9, and 8 all have delta AICc <2. Model 5 is 2.34 times more likely than model 9. (0.508/0.217=2.34) The global model does have a delta AICc <5.

# calculate variable importance weights 
MuMIn::sw(model_comparison)
##                      min_from_sunrise pc_grass vd_vo litter_avg veg_ht_avg site_ha
## Sum of weights:      1.00             0.99     0.81  0.30       0.27       0.27   
## N containing models:    8               10        4     4          5          6
# model averaging 
model.avg <- model.avg(model_comparison, revised.var = TRUE)
model.avg
## 
## Call:
## model.avg(object = model_comparison, revised.var = TRUE)
## 
## Component models: 
## '146'    '1456'   '1234'   '123456' '46'     '12'     '125'    '14'     '1235'   '134'    '13'     '34'    
## 
## Coefficients: 
##        (Intercept)       vd_vo   pc_grass min_from_sunrise  litter_avg   veg_ht_avg       site_ha
## full      1.782717 0.003135970 0.01318085     -0.001679270 0.002861623 0.0009518711 -5.592074e-05
## subset    1.782717 0.003868789 0.01328673     -0.001681899 0.009518861 0.0034714627 -2.047905e-04

Now we can look at the direction of impact these variables have with the number of bird detections.

summary(model.avg(model_comparison, subset = delta < 2))
## 
## Call:
## model.avg(object = model_comparison, subset = delta < 2)
## 
## Component model call: 
## glm(formula = <3 unique values>, family = poisson(link = "log"), data = combined.detections)
## 
## Component models: 
##      df  logLik    AICc delta weight
## 146   4 -546.40 1101.01  0.00   0.56
## 1456  5 -546.19 1102.71  1.70   0.24
## 1234  5 -546.33 1102.98  1.98   0.21
## 
## Term codes: 
##         pc_grass       veg_ht_avg          site_ha min_from_sunrise       litter_avg            vd_vo 
##                1                2                3                4                5                6 
## 
## Model-averaged coefficients:  
## (full average) 
##                    Estimate Std. Error Adjusted SE z value Pr(>|z|)    
## (Intercept)       1.783e+00  9.039e-02   9.099e-02  19.599  < 2e-16 ***
## vd_vo             3.213e-03  1.866e-03   1.869e-03   1.719 0.085562 .  
## pc_grass          1.346e-02  4.922e-03   4.950e-03   2.720 0.006536 ** 
## min_from_sunrise -1.679e-03  4.647e-04   4.677e-04   3.589 0.000332 ***
## litter_avg        2.221e-03  8.133e-03   8.174e-03   0.272 0.785816    
## veg_ht_avg        8.316e-04  1.688e-03   1.689e-03   0.492 0.622480    
## site_ha          -4.156e-05  1.911e-04   1.921e-04   0.216 0.828719    
##  
## (conditional average) 
##                    Estimate Std. Error Adjusted SE z value Pr(>|z|)    
## (Intercept)       1.7832395  0.0903929   0.0909880  19.599  < 2e-16 ***
## vd_vo             0.0040524  0.0009962   0.0010028   4.041 5.32e-05 ***
## pc_grass          0.0134619  0.0049218   0.0049500   2.720 0.006536 ** 
## min_from_sunrise -0.0016787  0.0004647   0.0004677   3.589 0.000332 ***
## litter_avg        0.0093681  0.0145603   0.0146567   0.639 0.522716    
## veg_ht_avg        0.0040170  0.0009867   0.0009932   4.044 5.25e-05 ***
## site_ha          -0.0002007  0.0003800   0.0003825   0.525 0.599694    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# calculating the exponentiated coefficients of our averaged model
exp(coef(model.avg))
##      (Intercept)            vd_vo         pc_grass min_from_sunrise       litter_avg       veg_ht_avg          site_ha 
##        5.9459873        1.0038763        1.0133754        0.9983195        1.0095643        1.0034775        0.9997952

We can expect the following relationships: -Minutes from sunrise: 0.2% decrease in bird detections for every minute after sunrise -Percent grass: 1.3% increase in bird detections per 1% increase in grass cover -Vegetation density: 0.4% increase in bird detections with 1 unit increase in vegetation density -Litter depth: 0.1% increase in bird detections with 1 cm increase in litter depth -Vegetation height: 0.3% increase in bird detections with every 1 cm increase in vegetation height -Site size: 0.02% decrease in bird detections with every 1 ha increase in site size

In conclusion, the factors that impact bird detections include survey start time, the percent cover of grass, and the vegetation density. To maximize detections surveys should be conducted closer to sunrise, in areas with large amounts of grass, and in areas with dense vegetation.