-
Notifications
You must be signed in to change notification settings - Fork 0
/
ENVS2018 kioloa results markdown 2023.Rmd
1057 lines (796 loc) · 45.9 KB
/
ENVS2018 kioloa results markdown 2023.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
---
title: "ENVS2018 Environmental Science Field School: Kioloa analyses"
author: "Shoshana Rapley, Saul Cunningham"
date: "20 September 2023"
output:
html_document:
toc: true
number_sections: true
toc_depth: 3
toc_float:
collapsed: true
theme: cerulean
highlight: pygments
code_folding: hide
editor_options:
chunk_output_type: console
knit: (function(inputFile, encoding) { rmarkdown::render(inputFile, encoding = encoding, output_file = file.path(dirname(inputFile), 'ENVS2018 Kioloa Report Results.html')) })
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(warning=FALSE, message=FALSE)
```
# Background
The purpose of the Kioloa week report is to report on the patterns of biodiversity we observed. In particular, you should examine the composition and structure of different plant communities at Kioloa, whether animal (bird) communities also differ, and consider how these patterns relate to the underlying soils and geology.
We are excited to share the whole-class survey results with you all, because there are some really interesting patterns here. Keep in mind when writing the reports that we want you to use this data summary as source material. We are not expecting that you do further, more complex analyses. You can focus on the summary data that we have provided. You may also use the figures/graphs from this markdown in your report.
Also you are not required to report on every piece of data. Given that we have more than 100 species in the list and 14 habitat measures, this would dilute the message of the report. Keep your focus on the data that show patterns of interest. Try to pick a narrative from these data that you find interesting and that you can explain within the constraints of the wordcount.
# Markdown
This document is called a 'markdown'. It's a way to share code, annotations and results all in one place, how good!
If you are interested in the code you can unfold the chunks by clicking the 'Show' icon. There is also an option on the top right of the document called 'Code' that allows you to show or hide all code chunks. Notes in the code itself that start with a hash explain what the next line of code achieves.
```{r}
# Install and load required packages
pacman::p_load(dplyr, gdata, ggplot2, ggrepel, gt, gtExtras, janitor, lme4, lubridate,
rstudioapi, readxl, sjmisc, tidyr, tidyverse, vegan)
```
# Vegetation communities
We collected vegetation data using the biometric tool methodology. This included compositional elements (e.g. overstory trees) and functional elements (e.g. woody debris).
## Floral composition
Let's start by looking at a matrix of the floral composition.
The species composition table has each site as a row, and each column is a plant species. The number 1 means the species was present in the composition survey. In some cases the plant species have been organised into groups that are similar, as follows.
* Block 1 are trees species that occupy the overstory
* Block 2 are mid story species that are recognised as favouring rainforest vegetation (according to https://plantnet.rbgsyd.nsw.gov.au/)
* Block 3 are species in the genus *Acacia*, which occur in the mid story
* Block 4 are ferns
* Block 5 are grasses
* Block 6 are plants in the family *Cyperaceae*, which we call “true sedges”. *Lomandra longifolia* is sometimes grouped with the sedges but is actually a different plant family
The other species (in grey) are not blocked, but plants in the same genus are put next to each other.
Note that as you progress down the rows of the table you have species that are not blocked with similar species, and which are found in relatively few sites. Therefore they are less useful in interpreting the similarities and differences in composition of the plant communities.
I have not put the paddock sites (P1, P2, P3, P4) into the species composition table because we did not survey them this year. However, you could see that they had little or no tree cover (ie only scattered *Eucalyptus tereticornis*) no mid story shrubs, and a ground layer dominated by grasses including introduced species that have been sown for livestock grazing.
```{r}
# read in the data
comp_data <- read_excel("data/ENVS2018 kioloa flora composition 2023.xlsx", sheet = 2)
# create a pretty table using the package 'gt'
comp_table <- gt(comp_data) %>%
tab_options(data_row.padding = px(1)) %>%
# change font size
tab_style(style = cell_text(size = px(12)), locations = cells_column_labels()) %>%
tab_style(style = cell_text(size = px(10)), locations = cells_body()) %>%
# add header
tab_header(title = "Floral composition",
subtitle = "ENVS2018 Student data Kioloa 2023") %>%
# replace NAs with blanks
tab_style(cell_text(style = "italic"),locations = cells_body(columns = 2)) %>%
sub_missing(missing_text = " ") %>%
# add row colours for plant type (using a 7 colour viridis scale)
tab_style(style = list(cell_fill(color = "#440154CC"), cell_text(color = "white")),
locations = cells_body(columns = 1:2, rows = 1:7)) %>%
tab_style(style = list(cell_fill(color = "#443A83CC"), cell_text(color = "white")),
locations = cells_body(columns = 1:2, rows = 8:13)) %>%
tab_style(style = list(cell_fill(color = "#31688ECC"), cell_text(color = "white")),
locations = cells_body(columns = 1:2, rows = 14:20)) %>%
tab_style(style = list(cell_fill(color = "#21908CCC"), cell_text(color = "white")),
locations = cells_body(columns = 1:2, rows = 21:24)) %>%
tab_style(style = list(cell_fill(color = "#35B779CC")),
locations = cells_body(columns = 1:2, rows = 25:33)) %>%
tab_style(style = list(cell_fill(color = "#8FD744CC")),
locations = cells_body(columns = 1:2, rows = 34:39)) %>%
tab_style(style = list(cell_fill(color = "#EAEAEA")),
locations = cells_body(columns = 1:2, rows = 40:80)) %>%
# change tally colour
tab_style(style = list(cell_fill(color = "lightgrey")),
locations = cells_body(columns = 23))
comp_table
```
What you can see here is the distribution of dominant species across the sites. In the left-most column there is a note on the structural layer the species belongs to (i.e. ground, mid and overstory); in some cases there is an additional note about a detail the plant type as it pertains to the groupings (e.g. rainforest loving, sedge).
The labels OG, RG, RF, P and C were assigned without a detailed analysis of the composition of the plant communities and should not be used as an assumed grouping of sites into communities. Instead I have suggested a grouping of sites in terms of similarity in plant species composition, below.
* OG10, OG5, OG7, OG9 and RF1 have *Corymbia maculata* and *Eucalyptus pilularis* in the overstory, and *Acacia* species in the midstorey. There are ferns in the ground layer, but only one species of *Cyperaceae*. Some ‘rainforest loving’ species are also found here. For this report we can call these sites **“Upper slope eucalypt forest”**
* RF2, RF3, and RF4 have *Backhousia myrtifolia* in the midstorey, plus a few other rainforest loving species. All sites have ferns. Shrubs in the *Notelaea* are absent. RF2 is a bit more similar to the Upper slope eucalypt forest, noting it has some grass and *Gahnia melanocarpa*, which RF3 and RF4 do not. For this report we can call these sites **“Moist gully rainforest”**
* RG1, RG2, RG3, RG4, RG8, RG10 have *Corymbia maculata* in the overstory, but no *Eucalyptus botryoides* or *Casuarina glauca*. All have *Acacia* mid story and *Cyperaceae* in the ground layer. Most sites don’t have ferns, and none have *Livistona* palms or rainforest-loving species. For this report we can call them these sites **“Lower slope eucalypt forest”**
* C1, C2, RG2, and OG1 have *Corymbia maculata* and *Eucalyptus botryoides* in the canopy. They have *Acacia* species in the mid layer. Although they occur toward the coast, they do not have *Casuarina glauca*. For this report we can call it **“Coastal eucalypt forest”**
* C3, C4, and C5 have a canopy of *Casuarina glauca*. There are no *Acacia* in the mid story or ferns in the ground layer. Two of these sites have the palm, *Livistona australis*. Species moderately common in other sites but missing from here include *Hibbertia* species, *Dodonaea* species and *Synoum glandulosum*. For this report we can call these sites **“Coastal Casuarina forest”**
Let's highlight the new community groupings and add headers to make these communities easier to see. The next table is the same as the last table except with colours and headings.
```{r}
comp_table2 <- comp_table %>%
# add column colours for community type (using a 7 colour viridis scale with transparency)
tab_style(style = list(cell_fill(color = "#44015433")),
locations = cells_body(columns = 3:7)) %>%
tab_style(style = list(cell_fill(color = "#41448733")),
locations = cells_body(columns = 8)) %>%
tab_style(style = list(cell_fill(color = "#2A788E33")),
locations = cells_body(columns = 9:10)) %>%
tab_style(style = list(cell_fill(color = "#22A88433")),
locations = cells_body(columns = 11:15)) %>%
tab_style(style = list(cell_fill(color = "#7AD15133")),
locations = cells_body(columns = 16:19)) %>%
tab_style(style = list(cell_fill(color = "#FDE72533")),
locations = cells_body(columns = 20:22)) %>%
# add new community grouping names
tab_spanner(label = "Upper slope eucalypt forest", columns = 3:7) %>%
tab_spanner(label = "Moist gully rainforest", columns = 8:10) %>%
tab_spanner(label = "Lower slope eucalypt forest", columns = 11:15) %>%
tab_spanner(label = "Coastal eucalypt forest", columns = 16:19) %>%
tab_spanner(label = "Coastal Casuarina forest", columns = 20:22)
comp_table2
```
Excellent!
## Biometric values
Now that we have our groupings, let's take a look at the biometric data you collected. We will highlight cells for a few key variables based on their values to make it easier to see what's going on.
Note: paddock data is derived from 2022 (since you didn't collect it this year).
```{r}
# read in data
biometric_data <- read_excel("data/ENVS2018 kioloa biometric 2023.xlsx", sheet = 2)
# make table in gt
biometric_table <- gt(biometric_data) %>%
tab_options(data_row.padding = px(1)) %>%
# change font size
tab_style(style = cell_text(size = px(12)), locations = cells_column_labels()) %>%
tab_style(style = cell_text(size = px(10)), locations = cells_body()) %>%
# add header
tab_header(title = "Biometric data",
subtitle = "ENVS2018 Student data Kioloa 2023") %>%
# add source footnote
tab_source_note(source_note = md("Note: paddock data derived from 2022")) %>%
# format percentages to display %
fmt_percent(columns = 3:13, decimals = 1) %>%
# add data grouping labels
tab_spanner(label = "Percent cover", columns = 3:13) %>%
tab_spanner(label = "Feature counts", columns = 14:16) %>%
# add row colours for community type (using a 7 colour viridis scale with transparency)
tab_style(style = list(cell_fill(color = "#44015433")),
locations = cells_body(rows = 1:5, columns = 1:2)) %>%
tab_style(style = list(cell_fill(color = "#41448733")),
locations = cells_body(rows = 6, columns = 1:2)) %>%
tab_style(style = list(cell_fill(color = "#2A788E33")),
locations = cells_body(rows = 7:8, columns = 1:2)) %>%
tab_style(style = list(cell_fill(color = "#22A88433")),
locations = cells_body(rows = 9:13, columns = 1:2)) %>%
tab_style(style = list(cell_fill(color = "#7AD15133")),
locations = cells_body(rows = 14:17, columns = 1:2)) %>%
tab_style(style = list(cell_fill(color = "#FDE72533")),
locations = cells_body(rows = 18:20, columns = 1:2)) %>%
tab_style(style = list(cell_fill(color = "#EAEAEA")),
locations = cells_body(rows = 21:24, columns = 1:2)) %>%
# colour columns according to values
# Grass > 40%
tab_style(cell_fill(color = "#F9E3D6"),
locations = cells_body(columns = Grass, rows = Grass > .4)) %>%
tab_style(cell_fill(color = "#F2F2F2"),
locations = cells_body(columns = Grass, rows = Grass < .4)) %>%
# Leaf litter > 30%
tab_style(cell_fill(color = "#F9E3D6"),
locations = cells_body(columns = Litter, rows = Litter > .3)) %>%
tab_style(cell_fill(color = "#F2F2F2"),
locations = cells_body(columns = Litter, rows = Litter < .3)) %>%
# Midstory > 15%
tab_style(cell_fill(color = "#F9E3D6"),
locations = cells_body(columns = Midstory, rows = Midstory > .15)) %>%
tab_style(cell_fill(color = "#F2F2F2"),
locations = cells_body(columns = Midstory, rows = Midstory < .15)) %>%
# Midstory + over > 60%
tab_style(cell_fill(color = "#F9E3D6"),
locations = cells_body(columns = Cover, rows = Cover > .6)) %>%
tab_style(cell_fill(color = "#F2F2F2"),
locations = cells_body(columns = Cover, rows = Cover < .6)) %>%
# CWD length > 100m
tab_style(cell_fill(color = "#F9E3D6"),
locations = cells_body(columns = CWD, rows = CWD > 100)) %>%
tab_style(cell_fill(color = "#F2F2F2"),
locations = cells_body(columns = CWD, rows = CWD < 100)) %>%
# add footnotes for the above
tab_footnote(footnote = "Highlighted grass cover > 40%",
locations = cells_column_labels(columns = Grass)) %>%
tab_footnote(footnote = "Highlighted leaf litter > 30%",
locations = cells_column_labels(columns = Litter)) %>%
tab_footnote(footnote = "Highlighted midstory cover > 15%",
locations = cells_column_labels(columns = Midstory)) %>%
tab_footnote(footnote = "Highlighted combined mid and overstory cover > 60%; note percentage can be over 100%",
locations = cells_column_labels(columns = Cover)) %>%
tab_footnote(footnote = "Highlighted coarse woody debris where the total length > 100m",
locations = cells_column_labels(columns = CWD))
biometric_table
```
## Visualisation
That's given us a good overview of how the biometric features vary by community type. Let's now summarise and visualise the data as boxplots. A boxplot shows the mean (middle line of the box), quartiles (the box and whiskers) and outliers (dots) and are useful for comparing quantitative data among groups.
```{r}
# remove white spaces in names so can work easily in ggplot
biometric_data2 <- clean_names(biometric_data) %>%
# convert percentages to 0-100
mutate(across(3:13, ~ .x * 100)) %>%
# order the factors (by apperance) to match the table colour palette
mutate(community = as_factor(community))
# break into ground cover data and convert to long format for graphing
cover <- select(biometric_data2, -c(11:16)) %>%
pivot_longer(cols = 3:10, names_to = "feature", values_to = "percent") %>%
mutate(feature = as_factor(feature))
# create box plot for cover
ggplot(cover) +
geom_boxplot(aes(community, percent, fill = community)) +
scale_fill_viridis_d() +
facet_wrap(~feature) +
theme_classic() +
theme(axis.text.x=element_blank())
```
Good. We can see there are consistently low values for cryptogam, rock and bare ground, so let's remove those to make the others easier to read.
```{r}
cover2 <- cover %>%
filter(feature %in% c("cover", "forb", "grass", "litter", "low_shrub"))
# create box plot for cover
ggplot(cover2) +
geom_boxplot(aes(community, percent, fill = community)) +
scale_fill_viridis_d() +
facet_wrap(~feature) +
theme_classic() +
theme(axis.text.x=element_blank())
```
Interesting!
Now let's look at the mid and overstory cover, plus the combined mid and overstory (called in the plot 'cover'). The combined mid and overstory metric gives us an indication of how much light will get through to the floor (which in turn affects which plants can grow there).
```{r}
# break into tree cover data and convert to long format for graphing
tree <- select(biometric_data2, -c(3:10, 14:16)) %>%
pivot_longer(cols = 3:5, names_to = "feature", values_to = "percent") %>%
mutate(feature = as_factor(feature))
# create box plot for cover
ggplot(tree) +
geom_boxplot(aes(community, percent, fill = community)) +
scale_fill_viridis_d() +
facet_wrap(~feature) +
theme_classic() +
theme(axis.text.x=element_blank())
```
Now let's look at the count biometric data: trees, trees with hollows and length of CWD.
```{r}
# break into cover data and convert to long format for graphing
count <- select(biometric_data2, -c(3:13)) %>%
pivot_longer(cols = 3:5, names_to = "feature", values_to = "count") %>%
mutate(feature = as_factor(feature))
# create box plot for count
ggplot(count) +
geom_boxplot(aes(community, count, fill = community)) +
scale_fill_viridis_d() +
facet_wrap(~feature) +
theme_classic() +
theme(axis.text.x=element_blank())
```
Great! You can use these graphs in your report.
## Summary Statistics
And we'll calculate the following summary statistics for all features: mean and standard deviation. You can use these numbers in the report.
```{r}
summary_stats <- biometric_data %>% clean_names() %>%
pivot_longer(cols = 3:16, names_to = "feature", values_to = "value") %>%
group_by(community, feature) %>%
summarise(mean = mean(value),
sd = sd(value)) %>%
pivot_wider(names_from = 2, values_from = 3:4) %>%
sjmisc::rotate_df(cn=TRUE) %>% slice(-1) %>%
rownames_to_column() %>%
rename("Stat" = 1) %>%
separate_wider_delim(Stat, "_", names = c("Stat", "Variable"),
too_many = "merge") %>%
arrange(Variable) %>%
mutate_at(c(3:8), as.numeric)
summary_stats_table <- gt(summary_stats) %>%
tab_options(data_row.padding = px(1)) %>%
# change font size
tab_style(style = cell_text(size = px(12)), locations = cells_column_labels()) %>%
tab_style(style = cell_text(size = px(10)), locations = cells_body()) %>%
# add header
tab_header(title = "Biometric data summary statistics",
subtitle = "ENVS2018 Student data Kioloa 2023") %>%
# add source footnote
tab_source_note(source_note = md("Note: paddock data derived from 2022")) %>%
# drop trailing zeros
fmt_number(drop_trailing_zeros = TRUE) %>%
# format percentages to display %
fmt_percent(rows = c(1,3,5,9,11,15,17,19,21,23,25), decimals = 1) %>%
# add column colours
tab_style(style = cell_fill(color = "#44015433"),
locations = cells_body(columns = 8)) %>%
tab_style(style = cell_fill(color = "#41448733"),
locations = cells_body(columns = 6)) %>%
tab_style(style = cell_fill(color = "#22A88433"),
locations = cells_body(columns = 4)) %>%
tab_style(style = cell_fill(color = "#7AD15133"),
locations = cells_body(columns = 5)) %>%
tab_style(style = cell_fill(color = "#FDE72533"),
locations = cells_body(columns = 3)) %>%
tab_style(style = cell_fill(color = "#EAEAEA"),
locations = cells_body(columns = 7)) %>%
# arrange columsn per other tables
cols_move(columns = 8, after = 2) %>%
cols_move(columns = 6, after = 3) %>%
cols_move(columns = 5, after = 4) %>%
cols_move(columns = 3, after = 7) %>%
cols_move(columns = 4, after = 6) %>%
cols_move_to_end(Paddock)
summary_stats_table
```
# Bird surveys
We conducted expert bird surveys of all the plots. The observer (Sho, Julian or Belinda) recorded all birds seen and heard in a five-minute period in the following distance categories: <25m, 25-50m, 50-100m, >100m and overhead. We also asked you to estimate how many species you could identify during the survey.
## Site visits
Let's start by visualising and summarising the data to get our heads around it. Each site was surveyed twice, so let's look at each visit for each site.
```{r}
# read in the data and clean names
birds_expert <- read_excel("data/ENVS2018 kioloa bird data 2023.xlsx", sheet = 1) %>%
clean_names() %>%
# replace NAs with zeros
replace(is.na(.), 0) %>%
# sum counts by species for inside (0-50m) and outside (50+m) the site, and total
rowwise() %>%
mutate(count_in = sum(x25 + x25_50),
count_out = sum(x50_100 + x100 + overhead),
count_total = sum(count_in + count_out)) %>%
select(!c(8:12)) %>%
mutate(present_in = ifelse(count_in>0, 1, 0),
present_out = ifelse(count_out>0, 1, 0),
present_total = ifelse(count_total>0, 1, 0))
# summarise by site for richness
birds_richness <- birds_expert %>% group_by(site, visit) %>%
mutate(visit = as.factor(visit)) %>%
summarise(richness_in = sum(present_in>0),
richness_out = sum(present_out>0),
richness_total = sum(present_total>0))
# plot site visits
ggplot(birds_richness)+
geom_point(aes(site, richness_total, colour = visit)) +
theme_minimal() +
theme(axis.text.x = element_text(angle=45))
# calculate differences
birds_alpha <- birds_richness %>% group_by(site) %>%
mutate(diff = abs(richness_total - lag(richness_total))) %>%
na.omit()
alpha_diff <- data.frame(mean = mean(birds_alpha$diff),
min = min(birds_alpha$diff),
max = max(birds_alpha$diff),
stdev = sd(birds_alpha$diff))
print(alpha_diff)
```
The difference between visits was on average 4 bird species.
How about the difference in diversity detected between each visit? I.e. the beta diversity.
```{r}
birds_beta <- data.frame()
sites <- unique(birds_expert$site)
for (i in 1:24) {
visit1 <- filter(birds_expert, site == sites[i] & visit == 1)
visit2 <- filter(birds_expert, site == sites[i] & visit == 2)
diff_a <- setdiff(visit1$common_name, visit2$common_name)
diff_b <- setdiff(visit2$common_name, visit1$common_name)
out <- data.frame(site = sites[i], beta = length(diff_a) + length(diff_b))
birds_beta <- rbind(birds_beta, out)
}
beta_diff <- data.frame(mean = mean(birds_beta$beta),
min = min(birds_beta$beta),
max = max(birds_beta$beta),
stdev = sd(birds_beta$beta))
differences <- rbind(alpha_diff, beta_diff) %>%
add_column(type = c("alpha", "beta"), .before = "mean")
print(differences)
```
Although the total difference between the visits was small, the beta diversity was greater! There was always differences in the diversity between visits (the minimum is 7) and the mean beta diversity was 14.
This is part of why we do repeat visits to site- to capture the diversity. Visits may differ by time of day (e.g. 6am or 7am) or weather.
## Species richness
From here on we will use the total richness detected (i.e. total number of unique species across both visits).
Let's summarise the species richness by sites. We will just do this for birds 'in' the site (i.e. within 50m, since then we can compare these data with the vegetation). Let's colour these by floristic community.
```{r}
# filter to birds within 50m
birds_insite <- birds_expert %>%
filter(present_in>0) %>%
group_by(site) %>%
summarise(richness = length(unique(common_name)))
# format for us in the package 'vegan'
birds_vegan <- birds_expert %>%
select(c(4,6,8)) %>%
pivot_wider(names_from = common_name, values_from = count_in, values_fn = sum, values_fill = 0) %>%
column_to_rownames(var = "site")
# create table of sites and communities
communities <- data.frame(site = c("OG10", "OG9", "OG7", "OG5", "RF1",
"RF2", "RF3", "RF4",
"RG10", "RG1", "RG3", "RG4", "RG8",
"C1", "C2", "OG1", "RG2",
"C3", "C4", "C5",
"P10", "P1", "P2", "P3"),
community = c(rep("Upper slope eucalypt forest", 5,),
rep("Moist gully rainforest", 3),
rep("Lower slope eucalypt forest", 5),
rep("Coastal eucalypt forest", 4),
rep("Coastal Casuarina forest", 3),
rep("Paddock", 4))) %>%
mutate(community = as_factor(community))
# assign communities to bird data
birds_insite2 <- left_join(communities, birds_insite) %>%
mutate(community = as_factor(community)) %>%
arrange(community)
# set the plots in the right order for plotting
birds_insite2$site <- factor(birds_insite2$site, levels = birds_insite2$site)
# plot by community
ggplot(birds_insite2) +
geom_boxplot(aes(community, richness, fill = community))+
scale_fill_viridis_d()+
theme_minimal() +
theme(axis.text.x = element_blank())
# summary statistics
birds_summary <- birds_insite2 %>% group_by(community) %>%
summarise(mean_richness = mean(richness),
stdev = sd(richness))
print(birds_summary)
# test if there is a difference in richness by community
richness <- specnumber(birds_vegan)
sp_aov <- aov(richness ~ community, data = communities)
anova(sp_aov)
```
Interesting! The moist gully rainforest had the highest species richness, with an average of 15.3 species. The paddock (unsuprisingly) had the lowest. There is no significant difference in bird species richness between the community types.
## Beta diversity
But as we know, richness doesn't tell us everything. What about beta diversity between the sites and communities?
```{r}
# select all birds in sites
species <- birds_expert %>%
filter(present_in==1) %>%
# add community
left_join(communities)
# for loop of birds only found in that site
sites_beta <- data.frame()
sites <- unique(birds_expert$site)
for (i in 1:24) {
site_test <- species %>% filter(site == sites[i])
site_other <- species %>% filter(!site == sites[i])
diff_site <- setdiff(site_test$common_name, site_other$common_name)
if(length(diff_site) == 0){
next
}
out <- data.frame(site = sites[i], unique = diff_site)
sites_beta <- rbind(sites_beta, out)
}
print(sites_beta)
```
Cool. Half of the sites had at least one unique bird species found no where else. The site with the highest count of unique birds was P1 with Brown Quail, Cattle Egret and Willie Wagtail.
Let's now look at the unique diversity across the floristic communities.
```{r}
# for loop of birds only found in that community
comm_beta <- data.frame()
comm <- unique(communities$community)
for (i in 1:6) {
comm_test <- species %>% filter(community == comm[i])
comm_other <- species %>% filter(!community == comm[i])
diff_comm <- setdiff(comm_test$common_name, comm_other$common_name)
if(length(diff_site) == 0){
next
}
out <- data.frame(community = comm[i], unique = diff_comm)
comm_beta <- rbind(comm_beta, out)
}
# summarise unique birds
comm_beta_sum <- comm_beta %>%
mutate(community = as_factor(community)) %>%
group_by(community) %>%
summarise(unique_species = length(unique))
# plot unique birds by community
ggplot(comm_beta_sum)+
geom_point(aes(unique_species, community, colour = community), size = 5) +
scale_colour_viridis_d()+
xlim(0,11)+
theme_minimal()
# print the unique birds
print(comm_beta)
```
Very cool! There were a total of 26 bird species only found in one community type. The lower slope eucalypt forest had the greatest number of unique species (n=8) followed by the paddock (n=7).
We only detected Mistletoebird in the lower slope eucalypt forest. As the name suggests, Mistletoebirds specalise on mistletoe fruit. Mistletoes are hemiparasitic plants that grow on the branches of other plants. There are approximately 100 species of mistletoe in Australia and these often have only one or two host trees. Mistletoes are an important element of eucalypt forests and woodlands because they provide fruit (for frugivorous birds) and, more importantly, they provide pockets of cover that are good for nesting in. Of the Australian birds that construct nests in trees, roughly two-thirds of them will preferentially nest in mistletoe. Mistletoe can have a bad reputation in popular culture because of the misconception that they kill trees. In some cases, mistletoe loads become high and this places additional water stress on the host tree; but the underlying cause of this issue is habitat fragmentation. It is possible that we detected Mistletoebird in RG4 because of the presence of Mistletoe.
Despite the apparent 'lack' of 'habitat' in the paddocks, there are a number of birds that are found in the paddock and nowhere else. Different birds have different requirements, some of which include open grassland areas and some are resilient to degraded landscapes. For example, the Australasian Pipit is a grassland insectivore that is mostly usually on the ground in open pasture or native grassland. It nests on the ground in tussocks.
We only found Brown Cuckoo-Dove in the rainforest, which is unsurprising since they are a rainforest species (as well as wet sclerophyll) especially found along creeks and rivers. They are largely frugivores (eating fruit and berries, but also seeds), and can eat (and disperse) hard to digest fruits. They are an important fruit disperser for rainforest plants.
These are just a couple of examples of how unique species relate to the vegetation and landscape. You could look into more of the unique species' ecology for your report.
## Diversity indices
We can also calculate indices for diversity aside from richness. For instance, Simpsons's diversity index takes into account abundance as well as richness. Why do we care about abundance? Some species are not evenly distributed in the landscape, even if they are found in more than one place. This metric helps us take that into account.
```{r}
# calculate simpson's diversity index for each site
simpson <- as.data.frame(diversity(birds_vegan, index = "simpson")) %>%
rownames_to_column() %>%
rename(site = 1,
simpsons = 2) %>%
left_join(communities)
simpson2 <- diversity(birds_vegan, index = "simpson")
# plot simpsons by community
ggplot(simpson) +
geom_boxplot(aes(community, simpsons, fill = community))+
scale_fill_viridis_d()+
theme_minimal() +
theme(axis.text.x = element_blank())+
ylim(c(0.6, 1))
# test difference
simpson_aov <- aov(simpson2 ~ community, data = communities)
anova(simpson_aov)
```
Note the y-axis is limited to a minimum of 0.6 to make this easier to read.
The moist gully rainforest had the highest average Simpson's diversity index, which means it is more biodiverse in terms of both species richness and abundance. There was no significant in Simpson's diversity index among the communities.
## Bird communities
**(This section is interesting but you don't have to include it in your report)**
So far we have looked at the bird richness by the floristic communities. But what about the bird communities? That is, which species tend to be together due to shared habitat features they prefer (or avoid).
We'll use a constrained ordination to determine bird communities. An ordination examines communities across space and a contrained ordination examines communities across specific environmental variables (in our case, biometric data). The type we are doing is called 'Canonical Correspondence Analysis' (CCA).
For more information on how this works you can use the same tutorial I did to produce this: https://rpubs.com/an-bui/vegan-cheat-sheet
```{r}
# exclude paddocks because way out on their own
birds_vegan2 <- birds_vegan %>%
rownames_to_column() %>%
filter(!rowname %in% c("P1","P2","P3")) %>%
column_to_rownames()
biometric_data2 <- filter(biometric_data, !Site %in% c("P1","P2","P3"))
# CCA
birdCCA <- cca(birds_vegan2 ~ Forb + Grass + Litter + Midstory + Overstory +
Cover + Trees + Hollow + CWD, data = biometric_data2)
# extracting the bird community data
CCA_bird <- as.data.frame(birdCCA$CCA$v) %>%
rownames_to_column()%>%
rename(bird = rowname)
# plot
ggplot() +
geom_point(data = CCA_bird, aes(x = CCA1, y = CCA2), color = "darkblue")+
geom_text_repel(data = CCA_bird, aes(x = CCA1, y = CCA2, label = bird),
nudge_y = -0.05) +
theme_minimal()
```
A note: I've excluded the paddocks from this ordination. The paddock birds and biometrics were so distinct that it made it hard to see what was going on in the vegetated sites.
Awesome! Let's read this figure. You can essentially ignore the axes here (arbitrary numbers in ordination space). What we are interested in is the grouping of birds. What this figure shows us is how birds cluster in the landscape according to the vegetation features (at least the ones we measured). And it looks pretty good! Not all the birds are labelled because the ones in the middle are so overlapping (read: likely to co-occur) that they didn't get shown.
There are plenty of groupings I would expect to see here, for example Crimson Rosella, Galah, Laughing Kookaburra, Red Wattlebird and Superb Fairywren is a classic assemblage (and usually ones you see in places with human interference because all of those birds are resilient to modified and urban areas).
Likewise, Gang-Gang Cockatoo, Superb Lyrebird and Eastern Shriketit are a grouping that make sense because they are more specific in their habitat needs and were only found higher up the hill into the upper forest (the Rainbow Lorikeet is an odditity here, they only landed in the site once, so this was more due to chance).
As a final example, the cluster of birds around Yellow Thornbill is a likely community of the Casuarina forest (largely because there are more edges in this community type and the birds of this group like edges).
## Student estimates of bird richness
And out of interest (you don't have to include it in your report), how well did you do at estimating the bird richness at each site? Let's compare average expert and student counts. These are total (i.e. all within 50m and >50m).
```{r}
# read in data
birds_student <- read_excel("data/ENVS2018 kioloa bird data 2023.xlsx", sheet = 4) %>%
clean_names() %>%
mutate(estimate = as.numeric(estimate)) %>%
select(3:4) %>%
rename(student = 2) %>%
pivot_longer(2, names_to = "observer", values_to = "birds")
# combine expert and student
birds_compare <- birds_richness %>%
select(c(1,5)) %>%
rename(expert = 2) %>%
pivot_longer(2, names_to = "observer", values_to = "birds") %>%
rbind(birds_student)
# plot expert and student
ggplot(birds_compare)+
geom_boxplot(aes(site, birds, fill = observer), alpha = .8,
position=position_dodge(width=.3)) +
theme_minimal()+
theme(axis.text.x = element_text(angle=45))
```
In general, you underestimated the total number of species- which is to be expected! It takes a lot of practice and patience to learn bird calls. I hope you enjoyed coming out on the bird surveys with us and I appreciated your enthusiasm despite the early start!
## Acoustic data
**(This section is interesting but you don't have to include it in your report)**
You also surveyed the sites using acoustic meters. We collected data for two hours around dusk and dawn for one night. One of the benefits of acoustic meters is they can help sample diversity that otherwise would be missed in the time constraints of a five-minute count. The other benefit was to teach you how to use this kit and read sonograms! Let's see what we detected. First we'll look at the effect of time of day.
```{r}
# read in
acoustic <- read_excel("data/ENVS2018 kioloa acoustic data 2023.xlsx", sheet = 1) %>%
clean_names()
acoustic_rich <- acoustic %>% group_by(site,time) %>%
summarise(richness = length(unique(common_name))) %>%
left_join(communities) %>%
mutate(type = "acoustic")
# plot by site and dusk/dawn
ggplot(acoustic_rich)+
geom_boxplot(aes(time, richness, fill = community))+
facet_wrap(~community)+
scale_fill_viridis_d()+
theme_minimal()
# test difference dusk and dawn
tod <- aov(richness ~ time, data = acoustic_rich)
anova(tod)
```
There was no difference in the richness recorded between dusk and dawn. Note: only one paddock site (P10) had an acoustic meter.
How did the richness detected on the acoustic meters compare with the bird surveys?
```{r}
# combine acoustic and expert bird counts
acoustic_compare <- select(birds_richness, c(1,3)) %>%
rename(richness = 2) %>%
mutate(type = "expert") %>%
left_join(communities) %>%
full_join(select(acoustic_rich, !2))
# plot comparison
ggplot(acoustic_compare) +
geom_boxplot(aes(type, richness, fill = community))+
scale_fill_viridis_d()+
facet_wrap(~community)+
theme_minimal()
# test difference
aco <- aov(richness ~ type, data = acoustic_compare)
anova(aco)
```
Wow! We detected significantly more bird species on the acoustic meters than in the 5-minute expert counts (only considering the species within the 50m radius).
A caveat here is that we simply used 'loudness' of the acoustic data as a proxy for distance, so the two aren't directly comparable. For example, the Wood Duck heard on the acoustic meter might have been further away than 50m because their 'waaaaa' call carries. We could have tested this by playing a known volume tone at 50m from the meters to figure out how 'loud' something needs to be to be in the site.
You can imagine the addition of time to a bird survey as a curve of diminishing returns- you add more time, you get more species, but at a decaying rate of increase over time. So we could achieve higher richness counts by doing say, a 15-minute or even 1-hour bird survey (imagine that!), but that would not be a good use of people-time. This is where acoustic meters can come in handy.
Another important use of acoustic meters is to detect rare things that don't call often. If you were tasked with finding a Night Parrot, you would have to spend 1000s of hours in the remote spinifex desert to potentially hear a short call; which is why agencies like Parks and NGOs such as AWC instead deploy acoustic meters and then use AI to scan through in search of the target species.
How about the beta diversity between the acoustic and expert counts?
```{r}
# for loop of birds only found in that community
aco_beta <- data.frame()
species2 <- species %>% filter(present_in==1)
for (i in 1:24) {
uni_exp <- species2 %>% filter(site == sites[i])
uni_aco <- acoustic %>% filter(site == sites[i])
diff1 <- setdiff(uni_exp$common_name, uni_aco$common_name) %>%
as.data.frame() %>%
rename(unique = 1) %>%
mutate(source = "expert",
site = sites[i])
diff2 <- setdiff(uni_aco$common_name, uni_exp$common_name)%>%
as.data.frame() %>%
rename(unique = 1) %>%
mutate(source = "acoustic",
site = sites[i])
aco_beta <- rbind(aco_beta, diff1, diff2)
}
aco_beta2 <- aco_beta %>%
group_by(site, source) %>%
summarise(beta = length(unique(unique))) %>%
left_join(communities)
# plot difference
ggplot(aco_beta2)+
geom_boxplot(aes(source, beta, fill=community))+
scale_fill_viridis_d()+
facet_wrap(~community)+
theme_minimal()
# test difference
aco_b <- aov(beta ~ source, data = aco_beta2)
anova(aco_b)
```
While we found unique species between both methods, there were significantly more unique species found on the acoustic meters that we didn't detect during the expert counts (if we only consider expert counts within the site- this is important because if we include birds outside of the sites we actually detect more birds with the expert surveys. The people can hear further than the units).
Aside from the study sits, were there any unique species found only on the expert counts or acoustic meters across the whole campus?
```{r}
b_exp <- setdiff(species2$common_name, acoustic$common_name) %>%
as.data.frame() %>%
rename(unique = 1) %>%
mutate(source = "expert")
b_aco <- setdiff(acoustic$common_name, species2$common_name)%>%
as.data.frame() %>%
rename(unique = 1) %>%
mutate(source = "acoustic")
aco_beta3 <- rbind(b_exp, b_aco)
print(aco_beta3)
```
Cool! Some of these species were detected during the bird surveys but not within the site itself (i.e. heard at a distance instead of within 50m). For example, we heard King Parrots during the expert counts but not in the sites.
What about out of the sites? Let's sum up the total richness of birds recorded across the whole campus from both methods
```{r}
# all bird species
species_expert <- data.frame(species = unique(birds_expert$common_name),
type = "expert")
species_acoustic <- data.frame(species = unique(acoustic$common_name),
type = "acoustic")
species_all <- rbind(species_expert, species_acoustic) %>%
mutate(type = ifelse(duplicated2(species, bothWays=TRUE)==TRUE,"both",type)) %>%
#remove the frog
filter(!species == "Common Eastern Froglet") %>%
distinct()
# make gt table
species_all %>%
group_by(type) %>%
gt() %>%
tab_options(data_row.padding = px(1)) %>%
# change font size
tab_style(style = cell_text(size = px(12)), locations = cells_column_labels()) %>%
tab_style(style = cell_text(size = px(10)), locations = cells_body())
species_all2 <- species_all %>%
group_by(type) %>%
summarise(count = length(species))
# plot
ggplot(species_all2,aes(1,count,fill=type))+
geom_bar(stat="identity")+
scale_fill_viridis_d()+
theme_minimal()+
theme(axis.title.x=element_blank(),
axis.text.x=element_blank(),
axis.ticks.x=element_blank())
```
We detected a total of 76 bird species. Of these, 47 were detected on both the acoustic meters and in the bird surveys, where 23 were only detected on the expert counts and 6 in the acoustics.
Species that were uniquely recorded on the acoustic meters (and not in the total expert counts including out of sites) were: Bassian Thrush, Bell Miner, Crescent Honeyeater, Rufous Fantail, Shining Bronze-Cuckoo and White-bellied Sea Eagle (which have a nest in site RG1 were they were recorded). These 6 species were added to the total site richness thanks to the acoustic meter!
# Other wildlife results
You don't have to report on this section, but we wanted to show you some interesting findings.
## Cameras
Let's have a look at how our camera results compared with previous years.
```{r}
# groups
mammals <- c("Eastern Grey Kangaroo", "Swamp Wallaby", "Red-necked Wallaby",
"Short-beaked Echidna", "Common Wombat",
"Brush-tailed Possum", "Krefft's Glider",
"Black Rat", "Bush Rat", "Rat", "Swamp Rat", "House Mouse",
"Long-nosed Bandicoot", "White-footed Dunnart", "Brown Antechinus",
"Cat", "Red Fox", "Pig" , "Sheep", "Cow")
# read in data
camera <- read_excel("data/ENVS2018 kioloa data 2017-2023.xlsx", sheet = 2) %>%
clean_names() %>%
distinct() %>%
#remove skinks
filter(!common_name == "Delicate Sun-skink") %>%
mutate(taxa = ifelse(common_name %in% mammals, "mammal", "bird")) %>%
filter(taxa == "mammal") %>%
mutate(common_name = factor(common_name, levels = mammals))
# plot mammals
ggplot(camera)+
geom_point(aes(year, common_name, colour = common_name), size = 4)+
scale_colour_viridis_d()+
theme_minimal()+
theme(legend.position= "none")+
theme(axis.title.y=element_blank())+
scale_y_discrete(limits=rev)+
scale_x_continuous(n.breaks = 7)+
coord_fixed(ratio = 0.6)
```
Here we can see which species were seen in which years. The occurrence of these species might relate to environmental conditions (e.g. drought). Note: the cameras were not placed in rainforest and coastal locations in 2017-2020, only upper and lower slope eucalypt forest and paddock. Also note there was a bushfire in the 2019/2020 Summer. No data was collected in 2021 (covid-19).
## Herpetofauna
Let's do the same thing with the reptiles and frogs (from tin/tile and opportunistic records).
```{r}
# read in data
herpsp <- c("Eastern Small-eyed Snake", "Red-bellied Black Snake", "Diamond Python",
"Jacky Dragon", "Blue-tongued Lizard", "Delicate Sun-skink",
"Weasel Skink", "Skink sp", "Whistling Tree Frog", "Jervis Bay Tree Frog",
"Eastern Dwarf Tree Frog", "Southern Green Stream Frog",
"Screaming Tree Frog", "Striped Marsh Frog", "Tyler's Toadlet",
"Common Eastern Froglet", "Haswell's Froglet")
herps <- read_excel("data/ENVS2018 kioloa data 2017-2023.xlsx", sheet = 1) %>%
clean_names() %>%
select(2:3) %>%
# remove mice
filter(!common_name == "House Mouse") %>%
distinct() %>%
mutate(common_name = factor(common_name, levels = herpsp))
# plot
ggplot(herps)+
geom_point(aes(year, common_name, colour = common_name), size = 4)+
scale_colour_viridis_d()+
theme_minimal()+
theme(legend.position= "none")+
theme(axis.title.y=element_blank())+
scale_y_discrete(limits=rev)+
scale_x_continuous(n.breaks = 7)+
coord_fixed(ratio = 0.6)
```
This is really an indication of the survey effort for herpetofauna! In 2018-2020 we only had tin/tile (AKA substrate) checks (which were done by students instead of expert observers) whereas in 2022 we started recording incidental sightings and doing the pond frog surveys.
I wonder how these results compare with the frog diversity in the pre-readings?
## Birds over time
Lastly, let's look at how the bird diversity has changed over the years (remember you do not need to include this in your report). We'll only be able to compare over the sites labelled 'OG', 'RG' and 'P' since these are consistent over all years ('RF' and 'C' added in 2022).
```{r}
# read in data
sites_consistent <- c("P3", "RG2", "RG1", "OG7", "RG10", "OG1", "OG5", "OG10", "OG9", "RG8", "P10", "RG3", "P2", "P1")