Helpful Tips

Useful keyboard shortcuts these work for windows, they may be different for Macs

Cntrl + Z functions like the Undo button in word or excel. Undoes whatever action you just did
Cntrl + Alt + I inserts a new code chunk
Cntrl + Enter runs the code that your courser is on (good for if you want to run small sections of code within a larger code chunk)
Cntrl + F allows you to find things within the document (can be useful if you want to replace the name of an object everywhere in the document at the same time)
Cntrl + Shift + O opens a document outline that you can use to quickly navigate between titled sections of your document
Cntrl + C to copy, Cntrl + V to paste
Cntrl + S to save

Use rm(list = ls()) to Clear workspace (if desired/needed). If run correctly will clear everything from your environment tab. Typically it’s good to do this at the start of each of your sessions. Defiantly should do it if you open a new Rmd file.

Use thisgetwd() to check working directory. Will most likely be whatever folder your Rmd file is in. If you need to change it for some reason use setwd()

Downloading/loading packages

Use install.packages to download and install a new R package. You only need to do this once, then you can just use library() to load it and be able to call functions within it. and load the package libraries.

TIP: It’s generally considered best practice to keep the commands to call all the packages you need to use at the top of the document

tidyr, dplyr, and readr are probably the most commonly used packages for data wrangling/cleaning.ggplot2 is very common for creating publication quality figures.

Note: The order which you install packages is important. Functions with the same names will be masked by packages loaded in after the previous.
Ex: The select() function from MASS will be masked by select() from dplyar

library("leaflet")
library("tidyr")
library("dplyr")
library("readr")
library("ggplot2")
library("ggpubr")
library("vegan")


#library("extrafont")
#font_import()
#fonts()

Code for merging datasets

MyMerge       <- function(x, y){
  df            <- merge(x, y, all = TRUE)
  rownames(df)  <- df$Row.names
  df$Row.names  <- NULL
  return(df)
}

Data Wrangling

Load in data and merge datasets:

Load in each years .csv files containing the data

#TH_2018 <- read.csv(file.choose(), header = TRUE)

relcov_08_20 <- read.csv("../Data/LTER_06MAT_relcov_2008_2020.csv", header = TRUE)

relcov_21 <- read.csv("../Data/LTER_06MAT_relcov_2021.csv", header = TRUE) %>% select(-site)

Look at the data to see what it contains
Make sure the column names in each data set match so they can be merged together

head(relcov_21) #opens first 6 rows
tail(relcov_21) #opens last 6 rows
summary(relcov_21)
str(relcov_21) #tells us what data types (numbers, factors, etc) are in the data frame

Then merge the years together…all the names of columns need to be exactly the same name (capitalization and spaces included)

relcov_08_21           <- Reduce(MyMerge, list(relcov_21, relcov_08_20))

write.csv(relcov_08_21, file = "../Derived CSVs/LTER_06MAT_relcov_allyrs.csv")

Make sure the # of Obs in the 1st data set + the 2nd match the # of obs in the merged data set

Remove any data that you will not need for the analysis

In this case we can remove all the data from block 4

relcov_08_21 <- relcov_08_21 %>% filter(block != "4")

Fix naming convention errors

Rename “rel.cov” column to “rel_cov”

relcov_08_21 <- relcov_08_21 %>%
  rename(rel_cov = rel.cov)

Check unique vales in each column to make sure that there are not naming errors

unique(relcov_08_21$species)

If there are mistakes then rename them using the code below and recheck unique values again to make sure the recode worked. make sure you didn’t loose any observations

#fix naming convention errors 
relcov_08_21$species <- relcov_08_21$species %>% 
  recode("bare " = "bare", "Litter" = "litter", "Moss" = "moss", "Lichen " = "lichen","Lichen" = "lichen", "Fr boil" = "frost boil", "St. D. Sal pul" = "St. D. Sal", "Ev litter" = "Eri vag litter", "Grass ex." = "grass")

#simplify things identified to species level  that shouldn't be 
  #St.D.
relcov_08_21$species <- relcov_08_21$species %>% 
recode("St. D. Led Pal" = "St. D", "Dead Evrg" = "St. D", "Dead Bet" = "St. D", "St. D. Bet" = "St. D", "St. D. Sal" = "St. D", "St. D. Bet." = "St. D", "St. D. Evrg" = "St. D")

  #litter
relcov_08_21$species <- recode(relcov_08_21$species, "Eri vag litter" = "litter")

  #combine species to genus level for forbs and Calamagrostis
relcov_08_21$species <- relcov_08_21$species %>% recode("Ped lap" = "Ped sp.", "Pet fri"="Pet sp.", "Ped lap"="Ped sp.", "Ste lon" = "Ste sp.", "Ste edw"= "Ste sp.", "Cal can"="Cal sp.", "Cal lap"="Cal sp.")

Sum functional cover across species within quadrats. This likely wont change the relative cover for most species unless they were listed more than once in a quadrat but it’s good to do to insure accuracy after fixing naming conventions.

It will defiantly change values for things like St.D, which was originally identified to species level and is now just St. D

relcov_08_21 <- (relcov_08_21) %>% group_by(year, block, plot, quad, species) %>% summarise_at(vars(rel_cov), list(rel_cov = sum ), na.rm = TRUE)

Get overall average cover by species for the CT plots (used for table describing what species were assigned to each functional group)

species.avg.ct <- (relcov_08_21) %>%
  filter(plot== "CT")%>%
  group_by(species, plot)%>% #You don't need to list plot here, I just did it as a logic check 
  summarise(spec.avg = mean(rel_cov, na.rm = TRUE), n = n(), sd = sd(rel_cov, na.rm = TRUE))

write.csv(species.avg.ct, file = "../Derived CSVs/LTER_06MAT_species_avg_allyrs.csv")

Creating functional groups

Create vectors for assigning functional groups (The name of the functional groups that you will use to fill in the new column you will create called “func_grp”)

bare <- c("Bare ground")
dec_shrub <- c("Deciduous shrubs")
ev_shrub <- c("Evergreen shrubs")
gram <- c("Gramminoid")
erivag <- c("Eriophorum")
lichen <- c("Lichen")
litter <- c("Litter")
moss <- c("Moss")
Forb <- c("Forb")
St.D <- c("Standing dead")

Subset the dataset by species within given functional groups

bg <- subset(relcov_08_21, species== "bare" |species== "frost boil")
 
ds <- subset(relcov_08_21, species== "Bet nan"| species== "Vac uli"| species== "Sal pul"| species== "Arc alp"| species== "Sal phl"| species== "Sal arc"|species== "Sal ret")
es <- subset(relcov_08_21, species== "Emp nig"|species== "Led pal"|species=="Vac vit"|species== "Cas tet"|species== "And pol"|species== "Dry int")
gr <- subset(relcov_08_21, species== "Cal sp." | species== "grass" | species== "Car big" | species== "Arc lat" | species== "Fes sp."| species== "graminoid")
ev <- subset(relcov_08_21, species== "Eri vag")
fb <- subset(relcov_08_21, species== "Ped lap"|species== "Pol bis"|species== "Rub cha"|species== "Luz sp."|species== "Pet sp."|species== "Hie alp"|species== "Ped sp."|species== "Arnica sp."|species== "Ste edw"|species== "Pet fri"|species== "Ste lon"|species== "Sau ang"|species== "Hie sp."|species== "Min sp."|species== "Tof sp."|species== "Ste sp."|species== "dicot"|species== "Lag gla"|species== "Sax pun"|species== "Pyr sec"|species== "Equ arv"|species== "Pol viv"|species== "Tof coc"|species== "Pyr sp.")
lich <- subset(relcov_08_21, species== "lichen")
 
lit <- subset(relcov_08_21, species== "litter")
  
ms <- subset(relcov_08_21, species== "moss")
std <- subset(relcov_08_21, species== "St. D")

Then reassigned these groups with their given functional group names in a new column called “func_grp”.

bg["func_grp"] <-bare
ds["func_grp"] <-dec_shrub
es["func_grp"] <-ev_shrub
gr["func_grp"] <-gram
ev["func_grp"] <-erivag
fb["func_grp"] <-Forb
lich["func_grp"] <-lichen
lit["func_grp"] <-litter
ms["func_grp"] <-moss
std["func_grp"] <-St.D

Then merge these groups back together into one dataset. *Make sure the number of observations in relcov_func_08_21 matches relcov_08_21, but relcov_func_08_21 should have one more variable (the new func_grp column you created),

relcov_func_08_21           <- Reduce(MyMerge, list(bg, ds, es, gr, ev, fb, lich, lit, ms, std))
#export this file
write.csv(relcov_func_08_21, file = "../Derived CSVs/LTER_06MAT_relcov_funcgrp_allyrs.csv")

Statistics & graphing

Average across quadrats within a plot

This step sums relative cover across species of a given functional group within each individual quadrat. If you do not do this then you will be averaging across all the species and quadrats within a functional group at the same time, instead of averaging the total relative cover of a functional group across the 8 quadrats. Basically you will deflating the average cover, so make sure you do this step!!! :)

sum_func <- (relcov_func_08_21) %>% group_by(year, block, plot, quad, func_grp) %>% summarise_at(vars(rel_cov), list(relcov_func = sum ), na.rm = TRUE)

**Get overall average cover by functional group for the CT plots (used for table describing what species were assigned to each functional group)**

func_grp_avg_ct <- (sum_func) %>%
  filter(plot== "CT")%>%
  group_by(func_grp, plot)%>% #You don't need to list plot here, I just did it as a logic check 
  summarise(func_avg = mean(relcov_func, na.rm = TRUE), n = n(), sd = sd(relcov_func, na.rm = TRUE))

write.csv(func_grp_avg_ct, file = "../Derived CSVs/LTER_06MAT_funcgrp_avg_allyrs.csv")

Subset by year based on # of quads measured per plot in a given year

In 2008 we measured 4 quadrants/plot
In 2015 & 2020 we measured with 5 quadrants/plot
In all other years we measured with 8 quadrants

Q4 <- subset(sum_func, year== "2008")
Q5 <- subset(sum_func, year== "2015" | year== "2020")
Q8 <- subset(sum_func, year== "2010" | year== "2011" | year== "2012" | year== "2013" | year== "2014" | year== "2016" | year== "2017" | year== "2018" | year== "2019" |year== "2010" |year== "2021")

Sum across quadrats in a plot

Q4_sum_avg_quad <- (Q4) %>% 
  group_by(year, block, plot, func_grp) %>%
  summarise_at(vars(relcov_func), list(sum_func_quad = sum), na.rm = TRUE)

Q5_sum_avg_quad <- (Q5) %>% 
  group_by(year, block, plot, func_grp)%>% 
summarise_at(vars(relcov_func), list(sum_func_quad = sum), na.rm = TRUE)

Q8_sum_avg_quad <- (Q8) %>% 
  group_by(year, block, plot, func_grp) %>% 
  summarise_at(vars(relcov_func), list(sum_func_quad = sum), na.rm = TRUE)

Create functions/vectors for calculating average rel_cov bases on # of quadrats then fill a new column with the average

avg_Q4_quad <- c(Q4_sum_avg_quad$sum_func_quad/4)
avg_Q5_quad <- c(Q5_sum_avg_quad$sum_func_quad/5)
avg_Q8_quad <- c(Q8_sum_avg_quad$sum_func_quad/8)

#creates and fills a new column with the average (by diving that sum by the number of reps specified)
Q4_sum_avg_quad["avg_quad"] <-avg_Q4_quad
Q5_sum_avg_quad["avg_quad"] <-avg_Q5_quad
Q8_sum_avg_quad["avg_quad"] <-avg_Q8_quad

Merge all the years back together

Avg_func_quad           <- Reduce(MyMerge, list(Q4_sum_avg_quad, Q5_sum_avg_quad, Q8_sum_avg_quad))

#export this file
#write.csv(Avg_func_quad, file = "../Derived CSVs/LTER_06MAT_avg_func_quad_allyrs.csv")

Average across blocks

Avg_func_block <- (Avg_func_quad) %>% 
  group_by(year, plot, func_grp) %>% 
summarise(avg_block = mean(avg_quad, na.rm = TRUE), sd = sd(avg_quad, na.rm = TRUE), n = n(), se = sd/sqrt(n)) %>% ungroup()

## `summarise()` has grouped output by 'year', 'plot'. You can override using the `.groups` argument.

#export this file
write.csv(Avg_func_block, file = "../Derived CSVs/LTER_06MAT_avg_func_block_allyrs.csv")

Creating stacked bar graph

#specify factor levels for fertilization and clipping treatments (This will tell R in what order to print functional groups or fert treatments on the graph/key)
Avg_func_block$plot <- factor(Avg_func_block$plot, levels = c("CT", "F2", "F5", "F10"))
Avg_func_block$func_grp <- factor(Avg_func_block$func_grp, levels = c("Deciduous shrubs", 
                                                                          "Evergreen shrubs", 
                                                                          "Gramminoid",
                                                                          "Eriophorum",
                                                                          "Forb",
                                                                          "Lichen",
                                                                          "Moss", 
                                                                          "Litter",
                                                                          "Standing dead",
                                                                          "Bare ground"))

#Creates stacked bar graph#
ggplot(data = Avg_func_block, aes(fill=func_grp, x = year, y = avg_block)) +
  geom_bar(position="fill", stat="identity", color = "black")+ #Percent stacked
  #geom_bar(position="stack", stat="identity")+ #Stacked
  scale_fill_manual(" ", values = c(
    "Deciduous shrubs" = "#004c6d",
                                                           #"Deciduous shrubs" = "red",
                                                           "Evergreen shrubs" = "#6996b3",
                                                           "Gramminoid" = "#036d3f",
                                                           #"Grass" = "red",
                                                           "Eriophorum"="#6fa17e",
                                                          #"Sedge"="red",
                                                           "Forb"="#c4d6c8",
                                                          #"Forb"="red",
                                                           "Lichen"="#f6e8c3",
                                                           "Moss"="#dfc27d",
                                                           "Litter"="#bf812d",
                                                           "Standing dead"="#8c510a",
                                                           "Bare ground" = "#543005"))+
  theme_bw()+
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())+
  theme(axis.text.x = element_text(color = "black", size = 10))+
  theme(axis.text.y = element_text(color = "black", size = 10))+
  scale_x_continuous(expand = c(0,0), breaks= c(2008, 2015, 2021)) +
  scale_y_continuous(expand = c(0,0), breaks= c(0, 0.5, 1))+
  labs(y = "Relative plant abundance ", x = " ")+ 
  theme(axis.title.y = element_text(size = 15)) +
  #theme(legend.position="bottom")+ #puts the key at the bottom of the graph and horizontally
  facet_grid(plot ~ .) + #vertically stacked 
  theme(strip.text.y = element_text(size = 10)) + 
  theme(panel.spacing.y = unit(5, "mm"))

ggsave("../Figures/LTER_06MAT_stacked_bar_cover.jpeg", width = 19.05, height = 21, units = 'cm', dpi = 300)

MANOVA

live plant functional groups vs fertilization

Check assumptions

MANOVA can be used in certain conditions:

The dependent variables should be normally distribute within groups. The R function mshapiro.test( ) [in the mvnormtest package] can be used to perform the Shapiro-Wilk test for multivariate normality. This is useful in the case of MANOVA, which assumes multivariate normality.
Homogeneity of variances across the range of predictors.
Linearity between all pairs of dependent variables, all pairs of covariates, and all dependent variable-covariate pairs in each cell

Calulating the MAOVA

Test for significant differences in rel cover of each species in 2021

followed instructions from this site

Prep the data

Filter for only 2021 or 2020 data respectively
Filter out non-living species (Standing dead, Litter, Bare ground)
Drop the column that still contains the sum of cover that you used to get the average
Convert from long to wide format
Change name of Deciduous and Evergreen shrub column to drop “shrubs” so there are no spaces in the name

data_MANOVA_2008 <- Avg_func_quad %>%
  filter(year=="2008" & func_grp!="Standing dead" & func_grp!="Litter" & func_grp!="Bare ground")%>%
  dplyr::select(-sum_func_quad)%>%
  pivot_wider(names_from = func_grp, values_from = avg_quad) %>%
  rename("Deciduous" = "Deciduous shrubs", "Evergreen" = "Evergreen shrubs")

data_MANOVA_2021 <- Avg_func_quad %>%
  filter(year=="2021" & func_grp!="Standing dead" & func_grp!="Litter" & func_grp!="Bare ground")%>%
  dplyr::select(-sum_func_quad) %>%
  pivot_wider(names_from = func_grp, values_from = avg_quad) %>%
  rename("Deciduous" = "Deciduous shrubs", "Evergreen" = "Evergreen shrubs")

Run the MANOVAs

#MANOVA_08<- manova(cbind(Deciduous, Evergreen, Forb, Grass, Lichen, Moss, Sedge) ~ plot, data = data_MANOVA_2008)
#summary.aov(MANOVA_08)

MANOVA_21<- manova(cbind(Deciduous, Evergreen, Forb, Gramminoid, Lichen, Moss, Eriophorum) ~ plot, data = data_MANOVA_2021)
summary.aov(MANOVA_21)

####Post hoc tests

Followed instructions from this site

Scatter plots

average relative cover of one species vs another (all years, not averaged across blocks)

Prep the data

First we need to transform the data from long to wide format so that each functional groups rel_cov is in a separate column

Drop the “sum_func_quad”
Transform from long to wide
Change name of Deciduous and Evergreen shrub column to drop “shrubs” so there are no spaces in the name
Change year to be a character not an integer
Replace all the NA values with 0’s (NAs were created where no rel cov value was present in a given plot for a given year. [ex: bare ground in the F10 treatment] So we will change these values to be 0s)

wide_avg_func_quad <- Avg_func_quad %>%
  dplyr::select(-sum_func_quad)%>%
  pivot_wider(names_from = func_grp, values_from = avg_quad)%>%
  rename("Deciduous" = "Deciduous shrubs", "Evergreen" = "Evergreen shrubs")%>%
  mutate(year = as.character(year)) %>%
  mutate_all(~replace_na(.,0))
  
  write.csv(wide_avg_func_quad, file = "../Derived CSVs/LTER_06MAT_WIDE_avg_quad.csv")

Create the scatter plots

Sedge vs Dec shrub

#specify factor levels for fertilization
wide_avg_func_quad$plot <- factor(wide_avg_func_quad$plot, levels = c("CT","F2","F5","F10"))

highlight_2008 <- wide_avg_func_quad %>% 
  filter(year == "2008")

highlight_2021 <- wide_avg_func_quad %>% 
  filter(year == "2021")

#Deciduous VS Sedge
ggplot(data = wide_avg_func_quad, aes(x = Deciduous, y = Eriophorum, color = plot)) +
  scale_color_manual(values = c("CT" = "#d8b365", 
                                "F2" = "#41b6c4",
                                "F5" = "#2c7fb8",
                                "F10" = "#253494"))+
  geom_point(shape = 1, size = 1.5)+
  geom_point(data=highlight_2008, 
             aes(x=Deciduous,y=Eriophorum), 
             shape = 17, size = 2)+
  geom_point(data=highlight_2021, 
             aes(x=Deciduous,y=Eriophorum), 
             shape=16, size = 2)+
  geom_smooth(data=dplyr::filter(wide_avg_func_quad, plot == "F10"), method = "lm", aes(fill = plot))+
  geom_smooth(data=dplyr::filter(wide_avg_func_quad, plot == "F5"), method = "lm", aes(fill = plot))+
  scale_fill_manual(values = c("CT" = "#E2C68E", 
                                "F2" = "#83D1D9",
                                "F5" = "#57A3D7",
                                "F10" = "#8A97E2"))+
  theme_bw() + 
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(), axis.line = element_line(colour = "black"))+
  theme(axis.text.x = element_text(color = "black", size = 10))+
  theme(axis.text.y = element_text(color = "black", size = 10))+
  scale_y_continuous(expand = c(0,0), limits = c(0, .3), breaks = c(.1, .2, .3))+
  scale_x_continuous(expand = c(0,0), limits = c(0, .8))+
  labs(y = "Eriophorum abundance", x = "Deciduous shrub abundance")+ 
  theme(axis.title.y = element_text( size = 15))+
    theme(axis.title.x = element_text(size = 15))+
   stat_regline_equation(label.x= 0.52, label.y= 0.25, color = "black", size = 3)+ #adds regression line
  #stat_cor(aes(label=..rr.label..), label.x=0.4, label.y=.25, color = "black", size = 3)+ #adds R2 
  stat_cor(method = "pearson", label.x = .64, label.y = .25, color = "black", size = 3)+ #adds p-value 
  theme(legend.position="top", legend.title=element_blank(), ,
        legend.box.margin=margin(-10,-10,-10,-10))+
  facet_grid(plot ~ .)

## Warning: Removed 9 rows containing missing values (geom_smooth).

ggsave("../Figures/LTER_06MAT_correlation_Deciduous_Sedge.jpeg", width = 19.05, height = 21, units = 'cm', dpi = 300)

## Warning: Removed 9 rows containing missing values (geom_smooth).

Other graphs:

#Deciduous VS Sedge (not separated by fert treatment )
ggplot(data = wide_avg_func_quad, aes(x = Deciduous, y = Eriophorum)) +
   geom_point()+
  geom_smooth(method = "lm", se = TRUE)+
   stat_regline_equation(label.x= 0.55, label.y= 0.2)+
  stat_cor(aes(label=..rr.label..), label.x=0.6, label.y=.1)+
    stat_cor(method = "pearson", label.x = .5, label.y = .25) #adds p-value

#Deciduous VS Forbs
ggplot(data = wide_avg_func_quad, aes(x = Deciduous, y = Forb, color = plot)) +
  scale_color_manual(values = c("CT" = "#d8b365", 
                                "F2" = "#41b6c4",
                                "F5" = "#2c7fb8",
                                "F10" = "#253494"))+
  geom_point()+
  geom_smooth(method = "lm", se = TRUE, aes(fill = plot))+
  scale_fill_manual(values = c("CT" = "#E2C68E", 
                                "F2" = "#83D1D9",
                                "F5" = "#57A3D7",
                                "F10" = "#8A97E2"))+
   stat_regline_equation(label.x= 0.6, label.y= 0.25)+
  stat_cor(aes(label=..rr.label..), label.x=0.6, label.y=.1)+
  facet_grid(plot ~ .)

#Eriophorum VS Forbs
ggplot(data = wide_avg_func_quad, aes(x = Forb, y = Eriophorum, color = plot)) +
  scale_color_manual(values = c("CT" = "#d8b365", 
                                "F2" = "#41b6c4",
                                "F5" = "#2c7fb8",
                                "F10" = "#253494"))+
  geom_point()+
  geom_smooth(method = "lm", se = TRUE, aes(fill = plot))+
  scale_fill_manual(values = c("CT" = "#E2C68E", 
                                "F2" = "#83D1D9",
                                "F5" = "#57A3D7",
                                "F10" = "#8A97E2"))+
  theme_bw() + 
  theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(), axis.line = element_line(colour = "black"))+
  theme(axis.text.x = element_text(color = "black", size = 8, angle = 45, hjust = 1))+
  theme(axis.text.y = element_text(color = "black", size = 8))+
  theme( axis.line = element_line(colour = "black", 
                      size = 1, linetype = "solid"))+
  scale_y_continuous(breaks= c(0, 0.15, .3))+
  labs(y = "E. vaginatum", x = "Forb")+ 
  theme(axis.title.y = element_text(face = "bold"))+
    theme(axis.title.x = element_text(face = "bold"))+
   stat_regline_equation(label.x= 0.2, label.y= 0.15)+ #adds regression line
  stat_cor(aes(label=..rr.label..), label.x=0.2, label.y=.1)+ #adds R2 
stat_cor(method = "pearson", label.x = .2, label.y = .25)+ #adds p-value 
  facet_grid(plot ~ .)

Species composition measures

Prep the data

sum_species <- (relcov_08_21) %>% group_by(year, block, plot, quad, species) %>% 
  summarise_at(vars(rel_cov), list(relcov_species = sum ), na.rm = TRUE) %>% 
  ungroup()%>% 
  filter(species != "St. D", species != "grass", species != "graminoid", species != "bare", species != "frost boil", species != "litter", species != "dicot") 

sum_species <- subset(sum_species, year== "2008" | year== "2021")

Calculate Shannon Diverity

H_LTER <- sum_species %>%
  group_by(year,block, plot, quad) %>%
  mutate(cov_total = sum(relcov_species)) %>% #calculate H
  mutate(P = relcov_species/cov_total)%>%
  mutate(P_LnP = P * log(P)) %>%
  mutate(H = (sum(P_LnP, na.rm = TRUE))*-1) %>% 
  select(-cov_total, -P, -P_LnP)%>% #Remove intermediate columns
  ungroup()%>%
  mutate(year = as.character(year))

#H will be the same for every species within a quadrant. To get mean H across plots and then blocks its easiest to filter the data down to have just one value per quadrat. To do this, pick a species that shows up in every quadrat. In our case, Bet nan works. 

H_quad_avg <- H_LTER %>%
  filter(species =="Bet nan")%>%
group_by(year, block, plot) %>%
  summarise(H_quad_avg = mean(H, na.rm = TRUE), sd = sd(H, na.rm = TRUE), n = n(), se = sd/sqrt(n))%>%
  ungroup()

## `summarise()` has grouped output by 'year', 'block'. You can override using the `.groups` argument.

##LOGIC CHECK --> n = number of quads. IF there are less than the number you expected to have in a given treatment then that means "Bet nan" wasn't found in one of the quadrants, and you should pick a different species that shows up in all treatments (2008 had 4 quads, 2021 had 8)

H_block_avg <- H_quad_avg %>%
  group_by(year, plot)%>%
  summarise(H_block_avg = mean(H_quad_avg, na.rm = TRUE), sd = sd(H_quad_avg, na.rm = TRUE), n = n(), se = sd/sqrt(n))%>%
  ungroup()

## `summarise()` has grouped output by 'year'. You can override using the `.groups` argument.

write.csv(H_block_avg, file = "../Derived CSVs/LTER_Diversity_block_avg.csv")

Box plot graph

H_quad_avg$plot <- factor(H_quad_avg$plot, levels = c("CT", "F2", "F5", "F10"))

ggplot(H_quad_avg, aes(x= year, y=H_quad_avg, fill = plot))+
  geom_boxplot(position=position_dodge(.85))+
  scale_fill_manual(values=c("#E1BA15", "#41b6c4", "#2c7fb8", "#253494"))+
     theme_light()+
     theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())+
     theme(axis.title.y = element_text(face = "bold"))+
     theme(axis.title.x = element_text(face = "bold"))+
     theme(plot.title = element_text(hjust = .5))+
     theme(axis.text.x = element_text(color = "black", size = 8, angle = 45, hjust = 1))+
     theme(axis.text.y = element_text(color = "black", size = 8))+
  labs(y = "Shannon Diversity Index (per m^2)", x = "")

    #theme(legend.position="bottom")
  
#ggsave("../Figures/LTER_shannon_diversity.jpeg")

Calculate Richness

richness_avg_quad <- H_LTER %>%
  filter(relcov_species != "0")%>%
group_by(year, block, plot, quad) %>%
  add_count(year) %>%
  ungroup()%>%
  filter(species =="Bet nan")%>%
    group_by(year, block, plot)%>%
  summarise(rich_avg_quad = mean(n, na.rm = TRUE), sd = sd(n, na.rm = TRUE), n = n(), se = sd/sqrt(n))%>%
  ungroup()

## `summarise()` has grouped output by 'year', 'block'. You can override using the `.groups` argument.

richness_avg_block <- richness_avg_quad %>%
  group_by(year, plot)%>%
  summarise(richness_avg_block = mean(rich_avg_quad, na.rm = TRUE), sd = sd(rich_avg_quad, na.rm = TRUE), n = n(), se = sd/sqrt(n))%>%
  ungroup()

## `summarise()` has grouped output by 'year'. You can override using the `.groups` argument.

write.csv(richness_avg_block, file = "../Derived CSVs/LTER_Richness_block_avg.csv")

Look at what species are in the plots

CT_08_spec <- H_LTER %>%
  filter(plot=="CT", year=="2008")

  CT_08_spec<- unique(CT_08_spec$species)
  
CT_21_spec <- H_LTER %>%
  filter(plot=="CT", year=="2021")

  CT_21_spec<- unique(CT_21_spec$species)

F10_08_spec <- H_LTER %>%
  filter(plot=="F10", year=="2008")

  F10_08_spec<- unique(F10_08_spec$species)

F10_21_spec <- H_LTER %>%
  filter(plot=="F10", year=="2021")

  F10_21_spec<- unique(F10_21_spec$species)
  
write.csv(CT_08_spec, file = "../Derived CSVs/UNIQUE_CT_08.csv")
write.csv(CT_21_spec, file = "../Derived CSVs/UNIQUE_CT_21.csv")
write.csv(F10_08_spec, file = "../Derived CSVs/UNIQUE_F10_08.csv")
write.csv(F10_21_spec, file = "../Derived CSVs/UNIQUE_F10_21.csv")

Map feild site locations

locations <- read.csv("../Data/Toolik_location.csv")


leaflet(locations) %>% 
  addTiles() %>%
  addScaleBar() %>%
  addMarkers(lng = ~Long, lat = ~Lat, popup = ~ Location)

Thesis_cover_2018_2021

Nicole Williamson

2022-12-14

Helpful Tips

Downloading/loading packages

Data Wrangling

Load in data and merge datasets:

Remove any data that you will not need for the analysis

Fix naming convention errors

Creating functional groups

Statistics & graphing

Average across quadrats within a plot

Average across blocks

Creating stacked bar graph

MANOVA

Check assumptions

Calulating the MAOVA

Prep the data

Run the MANOVAs

Scatter plots

Prep the data

Create the scatter plots

Species composition measures

Prep the data

Calculate Shannon Diverity

Box plot graph

Calculate Richness

Map feild site locations