Setting up

Helpful Tips

Useful keyboard shortcuts these work for windows, they may be different for Macs

Cntrl + Z functions like the Undo button in word or excel. Undoes whatever action you just did
Cntrl + Alt + I inserts a new code chunk
Cntrl + Enter runs the code that your courser is on (good for if you want to run small sections of code within a larger code chunk)
Cntrl + Shift + M inserts the pipe functions %>%
Cntrl + F allows you to find things within the document (can be useful if you want to replace the name of an object everywhere in the document at the same time)
Cntrl + Shift + O opens a document outline that you can use to quickly navigate between titled sections of your document
Cntrl + C to copy, Cntrl + V to paste
Cntrl + S to save

Use rm(list = ls()) to Clear workspace (if desired/needed). If run correctly will clear everything from your environment tab. Typically it’s good to do this at the start of each of your sessions. Defiantly should do it if you open a new Rmd file.

Use thisgetwd() to check working directory. Will most likely be whatever folder your Rmd file is in. If you need to change it for some reason use setwd()

Downloading/loading packages

Use install.packages to download and install a new R package. You only need to do this once, then you can just use library() to load it and be able to call functions within it. and load the package libraries.

TIP: It’s generally considered best practice to keep the commands to call all the packages you need to use at the top of the document

tidyr, dplyr, and readr are probably the most commonly used packages for data wrangling/cleaning.ggplot2 is very common for creating publication quality figures.

Note: The order which you install packages is important. Functions with the same names will be masked by packages loaded in after the previous.
Ex: The select() function from MASS will be masked by select() from dplyar

library("tidyr")
library("dplyr")
library("readr")
library("ggplot2")
library("ggpubr")

Data Wrangling

Import data

Load in each years .csv files containing the data. 2021 will be added after since it already has the new codes

#TH_2018 <- read.csv(file.choose(), header = TRUE)

TH_2018 <- read.csv("C:/Users/willi/OneDrive/Documents/Towson University/Data/Simulated herbivory/QAQCed tiller heights/2018_tiller_heights_QAQCed.csv", header = TRUE)

TH_2019 <- read.csv("C:/Users/willi/OneDrive/Documents/Towson University/Data/Simulated herbivory/QAQCed tiller heights/2019_tiller_heights_QAQCed.csv", header = TRUE)

TH_2020 <- read.csv("C:/Users/willi/OneDrive/Documents/Towson University/Data/Simulated herbivory/QAQCed tiller heights/2020_tiller_heights_QAQCed.csv", header = TRUE)

TH_2021 <- read.csv("C:/Users/willi/OneDrive/Documents/Towson University/Data/Simulated herbivory/QAQCed tiller heights/2021_tiller_heights_QAQCed.csv", header = TRUE)

Merge older data together

Then merge the years together
2018-2020 ONLY – 2021 must be added later as it only has new clipping codes

!IMPORTANT! all the names of columns need to be exactly the same name (capitalization and spaces included)**

MyMerge       <- function(x, y){
  df            <- merge(x, y, all = TRUE)
  rownames(df)  <- df$Row.names
  df$Row.names  <- NULL
  return(df)
}
TH_19_18_20           <- Reduce(MyMerge, list(TH_2018, TH_2019, TH_2020))

!IMPORTANT! Make sure the number of observations in the merge match the sum of the observations of the individual data set

Look at the data to see what it contains

head(TH_19_18_20) #opens first 6 rows
tail(TH_19_18_20) #opens last 6 rows
summary(TH_19_18_20)
str(TH_19_18_20) #tells us what data types (numbers, factors, etc) are in the data frame

Check unique vales in each column to make sure that there are not naming errors

#unique(TH_19_18_20$date)
unique(TH_19_18_20$block)
unique(TH_19_18_20$plot)
unique(TH_19_18_20$old_id)
#unique(TH_19_18_20$rep)
#unique(TH_19_18_20$t_num)
#unique(TH_19_18_20$g_ht)

Remove or rename – important step! EX: 18 observations in the merge were labeled “F10” instead of “F10”. This meant when the data is subset, the “F10” observations would be lost if not corrected for. Also removed any NA values in the t_num column

#Remove columns we won't need/use
TH_19_18_20 <- TH_19_18_20 %>% select(-c(y_ht, max_ht, y_pct))

#Example remove NA values
TH_19_18_20 <- TH_19_18_20 %>% filter(t_num != "NA")

#Example rename
TH_19_18_20$plot <- recode(TH_19_18_20$plot, "F10 " = "F10")

Add new clip code

Once naming conventions have been checked, the next step is to create a new column in the spread sheet that will be filled out with the new clipping codes…(PU1/PU2 or PR1/PR2 instead of CL# 1-4)

To do this, the 1st step is to create vectors for the information you plan to add to the columns. In this case we will then list the names of all of the new clipping codes…

~vectors for filling out “new_id” column~

CT1 <- c("CT1")
CT2 <- c("CT2")
PU1 <- c("PU1")
PU2 <- c("PU2")
PR1 <- c("PR1")
PR2 <- c("PR2")

The next step is to dived the data set up into groups based on what corresponds to each clip_id.

~subset the dataset~

  ## BLOCK 1 ##

#B1CT
B1.CT.CT1 <- subset(TH_19_18_20, block=="1" & plot=="CT" & old_id=="CT1")
B1.CT.CT2 <- subset(TH_19_18_20, block=="1" & plot=="CT" & old_id=="CT2")
B1.CT.CL1 <- subset(TH_19_18_20, block=="1" & plot=="CT" & old_id=="CL1")
B1.CT.CL2 <- subset(TH_19_18_20, block=="1" & plot=="CT" & old_id=="CL2")
B1.CT.CL3 <- subset(TH_19_18_20, block=="1" & plot=="CT" & old_id=="CL3")
B1.CT.CL4 <- subset(TH_19_18_20, block=="1" & plot=="CT" & old_id=="CL4")

#B1F2
B1.F2.CT1 <- subset(TH_19_18_20, block=="1" & plot=="F2" & old_id=="CT1")
B1.F2.CT2 <- subset(TH_19_18_20, block=="1" & plot=="F2" & old_id=="CT2")
B1.F2.CL1 <- subset(TH_19_18_20, block=="1" & plot=="F2" & old_id=="CL1")
B1.F2.CL2 <- subset(TH_19_18_20, block=="1" & plot=="F2" & old_id=="CL2")
B1.F2.CL3 <- subset(TH_19_18_20, block=="1" & plot=="F2" & old_id=="CL3")
B1.F2.CL4 <- subset(TH_19_18_20, block=="1" & plot=="F2" & old_id=="CL4")

#B1F5
B1.F5.CT1 <- subset(TH_19_18_20, block=="1" & plot=="F5" & old_id=="CT1")
B1.F5.CT2 <- subset(TH_19_18_20, block=="1" & plot=="F5" & old_id=="CT2")
B1.F5.CL1 <- subset(TH_19_18_20, block=="1" & plot=="F5" & old_id=="CL1")
B1.F5.CL2 <- subset(TH_19_18_20, block=="1" & plot=="F5" & old_id=="CL2")
B1.F5.CL3 <- subset(TH_19_18_20, block=="1" & plot=="F5" & old_id=="CL3")
B1.F5.CL4 <- subset(TH_19_18_20, block=="1" & plot=="F5" & old_id=="CL4")

#B1F10
B1.F10.CT1 <- subset(TH_19_18_20, block=="1" & plot=="F10" & old_id=="CT1")
B1.F10.CT2 <- subset(TH_19_18_20, block=="1" & plot=="F10" & old_id=="CT2")
B1.F10.CL1 <- subset(TH_19_18_20, block=="1" & plot=="F10" & old_id=="CL1")
B1.F10.CL2 <- subset(TH_19_18_20, block=="1" & plot=="F10" & old_id=="CL2")
B1.F10.CL3 <- subset(TH_19_18_20, block=="1" & plot=="F10" & old_id=="CL3")
B1.F10.CL4 <- subset(TH_19_18_20, block=="1" & plot=="F10" & old_id=="CL4")

  ## BLOCK 2 ##

#B2CT
B2.CT.CT1 <- subset(TH_19_18_20, block=="2" & plot=="CT" & old_id=="CT1")
B2.CT.CT2 <- subset(TH_19_18_20, block=="2" & plot=="CT" & old_id=="CT2")
B2.CT.CL1 <- subset(TH_19_18_20, block=="2" & plot=="CT" & old_id=="CL1")
B2.CT.CL2 <- subset(TH_19_18_20, block=="2" & plot=="CT" & old_id=="CL2")
B2.CT.CL3 <- subset(TH_19_18_20, block=="2" & plot=="CT" & old_id=="CL3")
B2.CT.CL4 <- subset(TH_19_18_20, block=="2" & plot=="CT" & old_id=="CL4")

#B2F2
B2.F2.CT1 <- subset(TH_19_18_20, block=="2" & plot=="F2" & old_id=="CT1")
B2.F2.CT2 <- subset(TH_19_18_20, block=="2" & plot=="F2" & old_id=="CT2")
B2.F2.CL1 <- subset(TH_19_18_20, block=="2" & plot=="F2" & old_id=="CL1")
B2.F2.CL2 <- subset(TH_19_18_20, block=="2" & plot=="F2" & old_id=="CL2")
B2.F2.CL3 <- subset(TH_19_18_20, block=="2" & plot=="F2" & old_id=="CL3")
B2.F2.CL4 <- subset(TH_19_18_20, block=="2" & plot=="F2" & old_id=="CL4")

#B2F5
B2.F5.CT1 <- subset(TH_19_18_20, block=="2" & plot=="F5" & old_id=="CT1")
B2.F5.CT2 <- subset(TH_19_18_20, block=="2" & plot=="F5" & old_id=="CT2")
B2.F5.CL1 <- subset(TH_19_18_20, block=="2" & plot=="F5" & old_id=="CL1")
B2.F5.CL2 <- subset(TH_19_18_20, block=="2" & plot=="F5" & old_id=="CL2")
B2.F5.CL3 <- subset(TH_19_18_20, block=="2" & plot=="F5" & old_id=="CL3")
B2.F5.CL4 <- subset(TH_19_18_20, block=="2" & plot=="F5" & old_id=="CL4")

#B2F10
B2.F10.CT1 <- subset(TH_19_18_20, block=="2" & plot=="F10" & old_id=="CT1")
B2.F10.CT2 <- subset(TH_19_18_20, block=="2" & plot=="F10" & old_id=="CT2")
B2.F10.CL1 <- subset(TH_19_18_20, block=="2" & plot=="F10" & old_id=="CL1")
B2.F10.CL2 <- subset(TH_19_18_20, block=="2" & plot=="F10" & old_id=="CL2")
B2.F10.CL3 <- subset(TH_19_18_20, block=="2" & plot=="F10" & old_id=="CL3")
B2.F10.CL4 <- subset(TH_19_18_20, block=="2" & plot=="F10" & old_id=="CL4")

  ## BLOCK 3 ##

#B3CT
B3.CT.CT1 <- subset(TH_19_18_20, block=="3" & plot=="CT" & old_id=="CT1")
B3.CT.CT2 <- subset(TH_19_18_20, block=="3" & plot=="CT" & old_id=="CT2")
B3.CT.CL1 <- subset(TH_19_18_20, block=="3" & plot=="CT" & old_id=="CL1")
B3.CT.CL2 <- subset(TH_19_18_20, block=="3" & plot=="CT" & old_id=="CL2")
B3.CT.CL3 <- subset(TH_19_18_20, block=="3" & plot=="CT" & old_id=="CL3")
B3.CT.CL4 <- subset(TH_19_18_20, block=="3" & plot=="CT" & old_id=="CL4")

#B3F2
B3.F2.CT1 <- subset(TH_19_18_20, block=="3" & plot=="F2" & old_id=="CT1")
B3.F2.CT2 <- subset(TH_19_18_20, block=="3" & plot=="F2" & old_id=="CT2")
B3.F2.CL1 <- subset(TH_19_18_20, block=="3" & plot=="F2" & old_id=="CL1")
B3.F2.CL2 <- subset(TH_19_18_20, block=="3" & plot=="F2" & old_id=="CL2")
B3.F2.CL3 <- subset(TH_19_18_20, block=="3" & plot=="F2" & old_id=="CL3")
B3.F2.CL4 <- subset(TH_19_18_20, block=="3" & plot=="F2" & old_id=="CL4")

#B3F5
B3.F5.CT1 <- subset(TH_19_18_20, block=="3" & plot=="F5" & old_id=="CT1")
B3.F5.CT2 <- subset(TH_19_18_20, block=="3" & plot=="F5" & old_id=="CT2")
B3.F5.CL1 <- subset(TH_19_18_20, block=="3" & plot=="F5" & old_id=="CL1")
B3.F5.CL2 <- subset(TH_19_18_20, block=="3" & plot=="F5" & old_id=="CL2")
B3.F5.CL3 <- subset(TH_19_18_20, block=="3" & plot=="F5" & old_id=="CL3")
B3.F5.CL4 <- subset(TH_19_18_20, block=="3" & plot=="F5" & old_id=="CL4")

#B3F10
B3.F10.CT1 <- subset(TH_19_18_20, block=="3" & plot=="F10" & old_id=="CT1")
B3.F10.CT2 <- subset(TH_19_18_20, block=="3" & plot=="F10" & old_id=="CT2")
B3.F10.CL1 <- subset(TH_19_18_20, block=="3" & plot=="F10" & old_id=="CL1")
B3.F10.CL2 <- subset(TH_19_18_20, block=="3" & plot=="F10" & old_id=="CL2")
B3.F10.CL3 <- subset(TH_19_18_20, block=="3" & plot=="F10" & old_id=="CL3")
B3.F10.CL4 <- subset(TH_19_18_20, block=="3" & plot=="F10" & old_id=="CL4")

You can then create a new column in the data set and tell R how to fill it for each of the subset groups #fill with the new clipping ID codes – SEE METADATA SHEET FOR TABLE OF OLD CODES TO NEW

## BLOCK 1 ##

  #CT
B1.CT.CT1["clip_id"] <- CT1
B1.CT.CT2["clip_id"] <- CT2
B1.CT.CL1["clip_id"] <- PR1
B1.CT.CL2["clip_id"] <- PR2
B1.CT.CL3["clip_id"] <- PU1
B1.CT.CL4["clip_id"] <- PU2
  #F2
B1.F2.CT1["clip_id"] <- CT1
B1.F2.CT2["clip_id"] <- CT2
B1.F2.CL1["clip_id"] <- PU1
B1.F2.CL2["clip_id"] <- PU2
B1.F2.CL3["clip_id"] <- PR1
B1.F2.CL4["clip_id"] <- PR2
  #F5
B1.F5.CT1["clip_id"] <- CT1
B1.F5.CT2["clip_id"] <- CT2
B1.F5.CL1["clip_id"] <- PU1
B1.F5.CL2["clip_id"] <- PU2
B1.F5.CL3["clip_id"] <- PR1
B1.F5.CL4["clip_id"] <- PR2
  #F10
B1.F10.CT1["clip_id"] <- CT1
B1.F10.CT2["clip_id"] <- CT2
B1.F10.CL1["clip_id"] <- PU1
B1.F10.CL2["clip_id"] <- PU2
B1.F10.CL3["clip_id"] <- PR1
B1.F10.CL4["clip_id"] <- PR2

    ## BLOCK 2 ##

  #CT
B2.CT.CT1["clip_id"] <- CT1
B2.CT.CT2["clip_id"] <- CT2
B2.CT.CL1["clip_id"] <- PR1
B2.CT.CL2["clip_id"] <- PR2
B2.CT.CL3["clip_id"] <- PU1
B2.CT.CL4["clip_id"] <- PU2
  #F2
B2.F2.CT1["clip_id"] <- CT1
B2.F2.CT2["clip_id"] <- CT2
B2.F2.CL1["clip_id"] <- PU1
B2.F2.CL2["clip_id"] <- PU2
B2.F2.CL3["clip_id"] <- PR1
B2.F2.CL4["clip_id"] <- PR2
  #F5
B2.F5.CT1["clip_id"] <- CT1
B2.F5.CT2["clip_id"] <- CT2
B2.F5.CL1["clip_id"] <- PU1
B2.F5.CL2["clip_id"] <- PU2
B2.F5.CL3["clip_id"] <- PR1
B2.F5.CL4["clip_id"] <- PR2
  #F10
B2.F10.CT1["clip_id"] <- CT1
B2.F10.CT2["clip_id"] <- CT2
B2.F10.CL1["clip_id"] <- PU1
B2.F10.CL2["clip_id"] <- PR1
B2.F10.CL3["clip_id"] <- PR2
B2.F10.CL4["clip_id"] <- PU2   
  
    ## BLOCK 3 ##

  #CT
B3.CT.CT1["clip_id"] <- CT1
B3.CT.CT2["clip_id"] <- CT2
B3.CT.CL1["clip_id"] <- PR1
B3.CT.CL2["clip_id"] <- PU1
B3.CT.CL3["clip_id"] <- PU2
B3.CT.CL4["clip_id"] <- PR2
  #F2
B3.F2.CT1["clip_id"] <- CT1
B3.F2.CT2["clip_id"] <- CT2
#B3.F2.CL1["clip_id"] <- na
#B3.F2.CL2["clip_id"] <- na
#B3.F2.CL3["clip_id"] <- na
#B3.F2.CL4["clip_id"] <- na
  #F5
B3.F5.CT1["clip_id"] <- CT1
B3.F5.CT2["clip_id"] <- CT2
B3.F5.CL1["clip_id"] <- PU1
B3.F5.CL2["clip_id"] <- PU2
B3.F5.CL3["clip_id"] <- PR1
B3.F5.CL4["clip_id"] <- PR2
  #F10
B3.F10.CT1["clip_id"] <- CT1
B3.F10.CT2["clip_id"] <- CT2
B3.F10.CL1["clip_id"] <- PR1
B3.F10.CL2["clip_id"] <- PR2
B3.F10.CL3["clip_id"] <- PU1
B3.F10.CL4["clip_id"] <- PU2

Then merge all these subsets back together –

TH_new_19_18_20           <- Reduce(MyMerge, list(
  B1.CT.CT1, B1.CT.CT2, B1.CT.CL1, B1.CT.CL2, B1.CT.CL3, B1.CT.CL4, B1.F2.CT1, B1.F2.CT2, B1.F2.CL1, B1.F2.CL2, B1.F2.CL3,
  B1.F2.CL4, B1.F5.CT1, B1.F5.CT2, B1.F5.CL1, B1.F5.CL2, B1.F5.CL3, B1.F5.CL4, B1.F10.CT1, B1.F10.CT2, B1.F10.CL1,
  B1.F10.CL2, B1.F10.CL3, B1.F10.CL4,
  B2.CT.CT1, B2.CT.CT2, B2.CT.CL1, B2.CT.CL2, B2.CT.CL3, B2.CT.CL4, B2.F2.CT1, B2.F2.CT2, B2.F2.CL1, B2.F2.CL2, B2.F2.CL3,
  B2.F2.CL4, B2.F5.CT1, B2.F5.CT2, B2.F5.CL1, B2.F5.CL2, B2.F5.CL3, B2.F5.CL4, B2.F10.CT1, B2.F10.CT2, B2.F10.CL1,
  B2.F10.CL2, B2.F10.CL3, B2.F10.CL4,
  B3.CT.CT1, B3.CT.CT2, B3.CT.CL1, B3.CT.CL2, B3.CT.CL3, B3.CT.CL4, B3.F2.CT1, B3.F2.CT2, B3.F2.CL1, B3.F2.CL2, B3.F2.CL3,
  B3.F2.CL4, B3.F5.CT1, B3.F5.CT2, B3.F5.CL1, B3.F5.CL2, B3.F5.CL3, B3.F5.CL4, B3.F10.CT1, B3.F10.CT2, B3.F10.CL1,
  B3.F10.CL2, B3.F10.CL3, B3.F10.CL4
  ))

!IMPORTANT! – make sure the number of observations in the new merge match the number of observations in the original

Merge old and new data

Then merge the 2021 data set with the rest of the years…

TH_all           <- Reduce(MyMerge, list(TH_new_19_18_20, TH_2021))

!IMPORTANT! – make sure the number of observations in the new merge match the sum of observations in TH_new_19_18_20 and TH_2021

Recheck for any naming errors

unique(TH_all$date)

##  [1] "6/19/2021" "6/21/2019" "6/23/2019" "6/24/2019" "7/13/2018" "7/17/2019"
##  [7] "7/20/2018" "7/21/2021" "7/30/2018" "7/31/2020" "7/6/2018"  "7/6/2019" 
## [13] "7/7/2018"  "7/7/2021"  "7/9/2018"  "8/2/2019"  "8/3/2021"

unique(TH_all$block)

## [1] 1 2 3

unique(TH_all$plot)

## [1] "CT"  "F10" "F2"  "F5"

unique(TH_all$old_id)

## [1] NA    "CL1" "CL2" "CL3" "CL4" "CT1" "CT2"

unique(TH_all$clip_id)

## [1] "PU2" "PR2" "PU1" "PR1" "CT1" "CT2" "CT3"

unique(TH_all$rep)

##  [1]  1  2  3  4  5  6  7  8  9 10

Add new date column

Now we create two new columns for two different kinds of date replacement

~vectors for filling out “d_sub1” and “d_sub2” columns~

  ## d_sub2 ##
d2.7.6.18 <- c(1)
d2.7.7.18 <- c(1)
d2.7.9.18 <- c(1)
d2.7.13.18 <- c(2)
d2.7.20.18 <- c(3)
d2.7.30.18 <- c(4)


d2.6.21.19 <- c(5)
d2.6.23.19 <- c(5)
d2.6.24.19 <- c(5)
d2.7.6.19 <- c(6)
d2.7.17.19 <- c(7)
d2.8.2.19 <- c(8)

d2.7.31.20 <- c(9)

d2.6.19.21 <- c(10)
d2.7.7.21 <- c(11)
d2.7.21.21 <- c(12)
d2.8.3.21 <- c(13)

Then subset by date…

  ##2018##
jul.06.18 <- subset(TH_all, date=="7/6/2018")
jul.07.18 <- subset(TH_all, date=="7/7/2018")
jul.09.18 <- subset(TH_all, date=="7/9/2018")
jul.13.18 <- subset(TH_all, date=="7/13/2018")
jul.20.18 <- subset(TH_all, date=="7/20/2018")
jul.30.18 <- subset(TH_all, date=="7/30/2018")

  ##2019##
jun.21.19 <- subset(TH_all, date=="6/21/2019")
jun.23.19 <- subset(TH_all, date=="6/23/2019")
jun.24.19 <- subset(TH_all, date=="6/24/2019")
jul.06.19 <- subset(TH_all, date=="7/6/2019")
jul.17.19 <- subset(TH_all, date=="7/17/2019")
aug.02.19 <- subset(TH_all, date=="8/2/2019")

  ##2020##
jul.31.20 <- subset(TH_all, date=="7/31/2020")

  ##2021##
jun.19.21 <- subset(TH_all, date=="6/19/2021")
jul.07.21 <- subset(TH_all, date=="7/7/2021") 
jul.21.21 <- subset(TH_all, date=="7/21/2021")
aug.03.21 <- subset(TH_all, date=="8/3/2021")

Fill the columns

#sub2
  ##2018##
jul.06.18["d_sub2"] <- d2.7.6.18
jul.07.18["d_sub2"] <- d2.7.7.18
jul.09.18["d_sub2"] <- d2.7.9.18
jul.13.18["d_sub2"] <- d2.7.13.18
jul.20.18["d_sub2"] <- d2.7.20.18
jul.30.18["d_sub2"] <- d2.7.30.18

  ##2019##
jun.21.19["d_sub2"] <- d2.6.21.19 
jun.23.19["d_sub2"] <- d2.6.23.19
jun.24.19["d_sub2"] <- d2.6.24.19
jul.06.19["d_sub2"] <- d2.7.6.19
jul.17.19["d_sub2"] <- d2.7.17.19
aug.02.19["d_sub2"] <- d2.8.2.19

  ##2020##
jul.31.20["d_sub2"] <- d2.7.31.20

  ##2021##
jun.19.21["d_sub2"] <- d2.6.19.21
jul.07.21["d_sub2"] <- d2.7.7.21
jul.21.21["d_sub2"] <- d2.7.21.21 
aug.03.21["d_sub2"] <- d2.8.3.21

TH_all           <- Reduce(MyMerge, list(jul.06.18, jul.07.18,jul.09.18, jul.13.18, jul.20.18, jul.30.18, jun.21.19, jun.23.19, jun.24.19, jul.06.19, jul.17.19, aug.02.19, jul.31.20, jun.19.21, jul.07.21, jul.21.21, aug.03.21))

Add clipping treatments

Now we need to add a new column for the overarching clipping treatments vectors

CT <- c("CT")
PU <- c("PU")
PR <- c("PR")

Then subset by clip_id…

g_CT1 <- subset(TH_all, clip_id=="CT1")
g_CT2 <- subset(TH_all, clip_id=="CT2")
g_PU1 <- subset(TH_all, clip_id=="PU1")
g_PU2 <- subset(TH_all, clip_id=="PU2")
g_PR1 <- subset(TH_all, clip_id=="PR1")
g_PR2 <- subset(TH_all, clip_id=="PR2")

Fill the columns

g_CT1["clip"] <- CT
g_CT2["clip"] <- CT
g_PU1["clip"] <- PU
g_PU2["clip"] <- PU
g_PR1["clip"] <- PR
g_PR2["clip"] <- PR

TH_all           <- Reduce(MyMerge, list(g_CT1, g_CT2, g_PU1, g_PU2, g_PR1, g_PR2))

#Remove columns we won't need/use
TH_all <- TH_all %>% select(-c(old_id, max_ht, y_ht, y_pct))

STATISTICS I

Basic stats

Tiller Height

Calculate basic stats (avg, SD, Max, Min….Ex) for tiller heights

#average cross replicates on each tussock 
ht.avg.rep <- (TH_all) %>% group_by(d_sub2, block, plot, clip, clip_id) %>% summarise_at(vars(g_ht), list(avg_t_ht1 = mean, std_ht1 = sd, max_ht1 = max, min_ht1 = min), na.rm = TRUE)

#average across type of clip treatment 
ht.avg.clip <- (ht.avg.rep) %>% group_by(d_sub2, block, plot, clip) %>% summarise_at(vars(avg_t_ht1), list(avg_t_ht2 = mean, std_ht2 = sd, max_ht2 = max, min_ht2 = min), na.rm = TRUE)

#average across the blocks
ht.avg.block <- (ht.avg.clip) %>% group_by(d_sub2, plot, clip) %>% summarise_at(vars(avg_t_ht2), list(avg_t_ht3 = mean, std_ht3 = sd, max_ht3 = max, min_ht3 = min), na.rm = TRUE)

!!Important!! ~ make sure you have the “na.rm = TRUE” at the end of the code above. This tells R to ignore na values when calculating stats. If you don’t have this in the code then it will return “NA” for any groups that are missing data

If you would like to work with the spreadsheets at this point, you can export/write new CSVs with the code below….

#export new file 
write.csv(TH_all, file = "tiller_ht_all.csv")
write.csv(ht.avg.block, file = "tiller_avg.csv")

Recommend calculating a batch or two by hand to make sure the numbers it spits out are accurate

Tiller Number

#average cross replicates on each tussock 
tnum.avg.rep <- (TH_all) %>% group_by(d_sub2, block, plot, clip, clip_id) %>% summarise_at(vars(t_num), list(avg_t_num1 = mean, std_num1 = sd, max_num1 = max, min_num1 = min), na.rm = TRUE)

#average across type of clip treatment 
tnum.avg.clip <- (tnum.avg.rep) %>% group_by(d_sub2, block, plot, clip) %>% summarise_at(vars(avg_t_num1), list(avg_t_num2 = mean, std_num2 = sd, max_num2 = max, min_num2 = min), na.rm = TRUE)

#average across the blocks
tnum.avg.block <- (tnum.avg.clip) %>% group_by(d_sub2, plot, clip) %>% summarise_at(vars(avg_t_num2), list(avg_t_num3 = mean, std_num3 = sd, max_num3 = max, min_num3 = min), na.rm = TRUE)

GRAPHING

Line graph - tiller heights over time

#specify factor levels for fertilization and clipping treatments 
ht.avg.block$plot <- factor(ht.avg.block$plot, levels = c("CT", "F2", "F5", "F10"))
ht.avg.block$clip <- factor(ht.avg.block$clip, levels = c("CT", "PU", "PR"))


ggplot(data = ht.avg.block, aes(x = d_sub2, y = avg_t_ht3, ymin = avg_t_ht3-std_ht3, ymax = avg_t_ht3+std_ht3, color = plot)) +
  scale_color_manual(values = c("CT" = "#d8b365", 
                                "F2" = "#41b6c4",
                                "F5" = "#2c7fb8",
                                "F10" = "#253494"))+
  
  theme_light()+
  theme(aspect.ratio = 9/18.5, 
        panel.grid.major = element_blank(), 
        panel.grid.minor = element_blank(), 
        axis.text.x = element_text(color = "black", size = 8, angle = 45, hjust = 1),
        axis.text.y = element_text(color = "black", size = 8),
        axis.line = element_line(colour = "black", linetype = "solid"), 
        legend.position="none",
        plot.title = element_text(hjust = .5))+
  geom_hline(yintercept = 15, color="#D0CECE")+
  geom_vline(xintercept = 4.5, color="#D0CECE")+
  geom_vline(xintercept = 8.5, color="#D0CECE")+
  geom_vline(xintercept = 9.5, color="#D0CECE")+
  
  scale_x_continuous(breaks= c(2, 4, 6, 8, 10, 12), labels = c("2" = "7/13", "4" = "7/30", "6" = "7/06", 
                                                               "8" = "8/02", "10" = "6/19", "12" = "7/21"))+
  geom_errorbar(width = .25)+
  geom_line(size = .75) +
  geom_point(size = 1.25) + 
  labs(y = "Mean tiller height (cm)", x = "")+ 
  #theme(axis.title.y = element_text(face = "bold"))+
   # theme(axis.title.x = element_text(face = "bold"))+
  guides(color=guide_legend(" "))+
  #theme(legend.position="bottom")+
  facet_grid( clip ~ .) #vertically stacked

  #facet_grid( ~ clip)  #horizontally stacked


ggsave("../Figures/Linegraph_tiller_heights2.jpeg")

## Saving 7 x 5 in image

Line graph - tiller number over time

#specify factor levels for fertilization and clipping treatments 
tnum.avg.block$plot <- factor(tnum.avg.block$plot, levels = c("CT", "F2", "F5", "F10"))
tnum.avg.block$clip <- factor(tnum.avg.block$clip, levels = c("CT", "PU", "PR"))


ggplot(data = tnum.avg.block, aes(x = d_sub2, y = avg_t_num3, ymin = avg_t_num3-std_num3, ymax = avg_t_num3+std_num3, color = plot)) +
  scale_color_manual(values = c("CT" = "#d8b365", 
                                "F2" = "#41b6c4",
                                "F5" = "#2c7fb8",
                                "F10" = "#253494"))+
  
  theme_light()+
  theme(aspect.ratio = 9/18.5, 
        panel.grid.major = element_blank(), 
        panel.grid.minor = element_blank(), 
        axis.text.x = element_text(color = "black", size = 8, angle = 45, hjust = 1),
        axis.text.y = element_text(color = "black", size = 8),
        axis.line = element_line(colour = "black", linetype = "solid"), 
        legend.position="none",
        plot.title = element_text(hjust = .5))+
  geom_vline(xintercept = 4.5, color="#D0CECE")+
  geom_vline(xintercept = 8.5, color="#D0CECE")+
  geom_vline(xintercept = 9.5, color="#D0CECE")+
  
  scale_x_continuous(breaks= c(2, 4, 6, 8, 10, 12), labels = c("2" = "7/13", "4" = "7/30", "6" = "7/06", 
                                                               "8" = "8/02", "10" = "6/19", "12" = "7/21"))+
  geom_errorbar(width = .25)+
  geom_line(size = .75) +
  geom_point(size = 1.25) + 
  labs(y = "Mean number of tillers", x = "")+ 
  #theme(axis.title.y = element_text(face = "bold"))+
   # theme(axis.title.x = element_text(face = "bold"))+
  guides(color=guide_legend(" "))+
  #theme(legend.position="bottom")+
  facet_grid( clip ~ .) #vertically stacked

  #facet_grid( ~ clip)  #horizontally stacked


ggsave("../Figures/Linegraph_tiller_number.jpeg")

## Saving 7 x 5 in image

STATISTICS II

2-way ANOVAs

For tiller heights across fert and clip treatments for each year separably

Step 1 - subset data by year and “final height” date

#before final average was taken 
PG2018 <- subset(ht.avg.rep, d_sub2 == 4)
PG2019 <- subset(ht.avg.rep, d_sub2 == 8)
PG2020 <- subset(ht.avg.rep, d_sub2 == 9)
PG2021 <- subset(ht.avg.rep, d_sub2 == 13)

#re-combine the years with just the final heights (used for final bar graph)
PG_all           <- Reduce(MyMerge, list(PG2018, PG2019, PG2020, PG2021))

#Rename date column data
PG_all$d_sub2 <- recode(PG_all$d_sub2, "4" = "2018", "8" = "2019", "9" = "2020", "13" = "2021")

Step 2 - visualize your data with box plots

## 2018 ##
  PG2018$plot <- factor(PG2018$plot, levels = c("CT", "F2", "F5", "F10"))
  PG2018$clip <- factor(PG2018$clip, levels = c("CT", "PU", "PR"))

  BoxPlot2018 <- ggplot(PG2018, aes(x= plot, y=avg_t_ht1, fill = plot))+
      geom_boxplot(position=position_dodge(1))+
      scale_fill_manual(values=c("#d8b365", "#41b6c4", "#2c7fb8", "#253494"))+
      labs(title = "Tiller height - 2018", face = "bold",
         #subtitle = "(subtitle)",
         y = "Tiller Hight (cm)", x = "Fertilization level", fill = "Fertilization level")+ 
     theme_light()+
     scale_y_continuous(limits = c(2.5,23), breaks=seq(0,25,5))+
     theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())+
     theme(axis.title.y = element_text(face = "bold"))+
     theme(axis.title.x = element_text(face = "bold"))+
     theme(plot.title = element_text(hjust = .5))+
     theme(axis.text.x = element_text(color = "black", size = 8, angle = 45, hjust = 1))+
     theme(axis.text.y = element_text(color = "black", size = 8))+
     theme(aspect.ratio = 9/18.5)+
     facet_grid(clip ~ .)
   
 
## 2019 ##
  PG2019$plot <- factor(PG2019$plot, levels = c("CT", "F2", "F5", "F10"))
  PG2019$clip <- factor(PG2019$clip, levels = c("CT", "PU", "PR"))

  BoxPlot2019 <- ggplot(PG2019, aes(x= plot, y=avg_t_ht1, fill = plot))+
      geom_boxplot(position=position_dodge(1))+
      scale_fill_manual(values=c("#d8b365", "#41b6c4", "#2c7fb8", "#253494"))+
      labs(title = "Tiller height - 2019", face = "bold",
         #subtitle = "(subtitle)",
         y = "Tiller Hight (cm)", x = "Fertilization level", fill = "Fertilization level")+ 
     theme_light()+
     scale_y_continuous(limits = c(2.5,23), breaks=seq(0,25,5))+
     theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())+
     theme(axis.title.y = element_text(face = "bold"))+
     theme(axis.title.x = element_text(face = "bold"))+
     theme(plot.title = element_text(hjust = .5))+
     theme(axis.text.x = element_text(color = "black", size = 8, angle = 45, hjust = 1))+
     theme(axis.text.y = element_text(color = "black", size = 8))+
     theme(aspect.ratio = 9/18.5)+
     facet_grid(clip ~ .)
           

## 2020 ##
  PG2020$plot <- factor(PG2020$plot, levels = c("CT", "F2", "F5", "F10"))
  PG2020$clip <- factor(PG2020$clip, levels = c("CT", "PU", "PR"))

  BoxPlot2020 <- ggplot(PG2020, aes(x= plot, y=avg_t_ht1, fill = plot))+
      geom_boxplot(position=position_dodge(1))+
      scale_fill_manual(values=c("#d8b365", "#41b6c4", "#2c7fb8", "#253494"))+
      labs(title = "Tiller height - 2020", face = "bold",
         #subtitle = "(subtitle)",
         y = "Tiller Hight (cm)", x = "Fertilization level", fill = "Fertilization level")+ 
     theme_light()+
     scale_y_continuous(limits = c(2.5,23), breaks=seq(0,25,5))+
     theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())+
     theme(axis.title.y = element_text(face = "bold"))+
     theme(axis.title.x = element_text(face = "bold"))+
     theme(plot.title = element_text(hjust = .5))+
     theme(axis.text.x = element_text(color = "black", size = 8, angle = 45, hjust = 1))+
     theme(axis.text.y = element_text(color = "black", size = 8))+
     theme(aspect.ratio = 9/18.5)+
     facet_grid(clip ~ .)
          
  
## 2021 ##
  PG2021$plot <- factor(PG2021$plot, levels = c("CT", "F2", "F5", "F10"))
  PG2021$clip <- factor(PG2021$clip, levels = c("CT", "PU", "PR"))

  BoxPlot2021 <- ggplot(PG2021, aes(x= plot, y=avg_t_ht1, fill = plot))+
      geom_boxplot(position=position_dodge(1))+
      scale_fill_manual(values=c("#d8b365", "#41b6c4", "#2c7fb8", "#253494"))+
      labs(title = "Tiller height - 2021", face = "bold",
         #subtitle = "(subtitle)",
         y = "Tiller Hight (cm)", x = "Fertilization level", fill = "Fertilization level")+ 
     theme_light()+
     scale_y_continuous(limits = c(2.5,23), breaks=seq(0,25,5))+
     theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())+
     theme(axis.title.y = element_text(face = "bold"))+
     theme(axis.title.x = element_text(face = "bold"))+
     theme(plot.title = element_text(hjust = .5))+
     theme(axis.text.x = element_text(color = "black", size = 8, angle = 45, hjust = 1))+
     theme(axis.text.y = element_text(color = "black", size = 8))+
     theme(aspect.ratio = 9/18.5)+
     facet_grid(clip ~ .)

Step 3 - visualize data with interaction plots

#2018
int.18 <- interaction.plot(PG2018$plot, PG2018$clip, PG2018$avg_t_ht1, 
                 ylab = "avg.ht", 
                 xlab = "plot", 
                 trace.label = "clip", 
                 col = c("black", "orange", "red"), 
                 lty = 1)

#2019            
int.19 <- interaction.plot(PG2019$plot, PG2019$clip, PG2019$avg_t_ht1, 
                 ylab = "avg.ht", 
                 xlab = "plot", 
                 trace.label = "clip", 
                 col = c("black", "orange", "red"), 
                 lty = 1)

#2020           
int.20 <- interaction.plot(PG2020$plot, PG2020$clip, PG2020$avg_t_ht1, 
                 ylab = "avg.ht", 
                 xlab = "plot", 
                 trace.label = "clip", 
                 col = c("black", "orange", "red"), 
                 lty = 1)

#2021            
int.21 <- interaction.plot(PG2021$plot, PG2021$clip, PG2021$avg_t_ht1, 
                 ylab = "avg.ht", 
                 xlab = "plot", 
                 trace.label = "clip", 
                 col = c("black", "orange", "red"), 
                 lty = 1)

Step 4 - Run ANOVA’s for each year

#ANOVA_18<- aov(avg_t_ht1 ~ block + clip + plot + clip:plot, data = PG2018)
#summary(ANOVA_18)

#ANOVA_19<- aov(avg_t_ht1 ~ block + clip + plot + clip:plot, data = PG2019)
#summary(ANOVA_19)

#ANOVA_20<- aov(avg_t_ht1 ~ block + clip + plot + clip:plot, data = PG2020)
#summary(ANOVA_20)

ANOVA_21<- aov(avg_t_ht1 ~ block + clip + plot + clip:plot, data = PG2021)
summary(ANOVA_21)

##             Df Sum Sq Mean Sq F value  Pr(>F)    
## block        1   16.0   16.03   4.863 0.03198 *  
## clip         2  597.0  298.52  90.559 < 2e-16 ***
## plot         3   29.5    9.82   2.980 0.03989 *  
## clip:plot    6   65.3   10.88   3.299 0.00806 ** 
## Residuals   51  168.1    3.30                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Step 5 - Tukeys test

TukeyHSD(ANOVA_21, which = "clip:plot")

## Warning in replications(paste("~", xx), data = mf): non-factors ignored: block

##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = avg_t_ht1 ~ block + clip + plot + clip:plot, data = PG2021)
## 
## $`clip:plot`
##                      diff         lwr         upr     p adj
## PU:CT-CT:CT   -2.80833333  -6.3973320  0.78066534 0.2659314
## PR:CT-CT:CT   -7.07500000 -10.6639987 -3.48600133 0.0000009
## CT:F2-CT:CT    0.92809223  -3.0845303  4.94071474 0.9996470
## PU:F2-CT:CT   -2.53440777  -6.5470303  1.47821474 0.5826633
## PR:F2-CT:CT   -7.54690777 -11.5595303 -3.53428526 0.0000026
## CT:F5-CT:CT   -1.52500000  -5.1139987  2.06399867 0.9458293
## PU:F5-CT:CT   -0.05500000  -3.8191736  3.70917357 1.0000000
## PR:F5-CT:CT   -7.03333333 -10.6223320 -3.44433466 0.0000010
## CT:F10-CT:CT  -0.07500000  -3.6639987  3.51399867 1.0000000
## PU:F10-CT:CT   1.68333333  -1.9056653  5.27233201 0.8989497
## PR:F10-CT:CT  -6.51876311 -10.2829367 -2.75458954 0.0000162
## PR:CT-PU:CT   -4.26666667  -7.8556653 -0.67766799 0.0081166
## CT:F2-PU:CT    3.73642557  -0.2761969  7.74904807 0.0898900
## PU:F2-PU:CT    0.27392557  -3.7386969  4.28654807 1.0000000
## PR:F2-PU:CT   -4.73857443  -8.7511969 -0.72595193 0.0088033
## CT:F5-PU:CT    1.28333333  -2.3056653  4.87233201 0.9843752
## PU:F5-PU:CT    2.75333333  -1.0108402  6.51750690 0.3606075
## PR:F5-PU:CT   -4.22500000  -7.8139987 -0.63600133 0.0091442
## CT:F10-PU:CT   2.73333333  -0.8556653  6.32233201 0.3026415
## PU:F10-PU:CT   4.49166667   0.9026680  8.08066534 0.0042052
## PR:F10-PU:CT  -3.71042977  -7.4746033  0.05374379 0.0566858
## CT:F2-PR:CT    8.00309223   3.9904697 12.01571474 0.0000006
## PU:F2-PR:CT    4.54059223   0.5279697  8.55321474 0.0144849
## PR:F2-PR:CT   -0.47190777  -4.4845303  3.54071474 0.9999996
## CT:F5-PR:CT    5.55000000   1.9610013  9.13899867 0.0001500
## PU:F5-PR:CT    7.02000000   3.2558264 10.78417357 0.0000032
## PR:F5-PR:CT    0.04166667  -3.5473320  3.63066534 1.0000000
## CT:F10-PR:CT   7.00000000   3.4110013 10.58899867 0.0000011
## PU:F10-PR:CT   8.75833333   5.1693347 12.34733201 0.0000000
## PR:F10-PR:CT   0.55623689  -3.2079367  4.32041046 0.9999960
## PU:F2-CT:F2   -3.46250000  -7.8581077  0.93310772 0.2572069
## PR:F2-CT:F2   -8.47500000 -12.8706077 -4.07939228 0.0000015
## CT:F5-CT:F2   -2.45309223  -6.4657147  1.55953027 0.6296323
## PU:F5-CT:F2   -0.98309223  -5.1531319  3.18694739 0.9995772
## PR:F5-CT:F2   -7.96142557 -11.9740481 -3.94880306 0.0000007
## CT:F10-CT:F2  -1.00309223  -5.0157147  3.00953027 0.9992688
## PU:F10-CT:F2   0.75524110  -3.2573814  4.76786360 0.9999526
## PR:F10-CT:F2  -7.44685534 -11.6168950 -3.27681571 0.0000084
## PR:F2-PU:F2   -5.01250000  -9.4081077 -0.61689228 0.0132744
## CT:F5-PU:F2    1.00940777  -3.0032147  5.02203027 0.9992251
## PU:F5-PU:F2    2.47940777  -1.6906319  6.64944739 0.6678736
## PR:F5-PU:F2   -4.49892557  -8.5115481 -0.48630306 0.0160512
## CT:F10-PU:F2   2.45940777  -1.5532147  6.47203027 0.6260061
## PU:F10-PU:F2   4.21774110   0.2051186  8.23036360 0.0314232
## PR:F10-PU:F2  -3.98435534  -8.1543950  0.18568429 0.0734725
## CT:F5-PR:F2    6.02190777   2.0092853 10.03453027 0.0002564
## PU:F5-PR:F2    7.49190777   3.3218681 11.66194739 0.0000073
## PR:F5-PR:F2    0.51357443  -3.4990481  4.52619694 0.9999991
## CT:F10-PR:F2   7.47190777   3.4592853 11.48453027 0.0000033
## PU:F10-PR:F2   9.23024110   5.2176186 13.24286360 0.0000000
## PR:F10-PR:F2   1.02814466  -3.1418950  5.19818429 0.9993567
## PU:F5-CT:F5    1.47000000  -2.2941736  5.23417357 0.9698284
## PR:F5-CT:F5   -5.50833333  -9.0973320 -1.91933466 0.0001720
## CT:F10-CT:F5   1.45000000  -2.1389987  5.03899867 0.9616055
## PU:F10-CT:F5   3.20833333  -0.3806653  6.79733201 0.1209288
## PR:F10-CT:F5  -4.99376311  -8.7579367 -1.22958954 0.0018596
## PR:F5-PU:F5   -6.97833333 -10.7425069 -3.21415977 0.0000036
## CT:F10-PU:F5  -0.02000000  -3.7841736  3.74417357 1.0000000
## PU:F10-PU:F5   1.73833333  -2.0258402  5.50250690 0.9079427
## PR:F10-PU:F5  -6.46376311 -10.3953142 -2.53221204 0.0000468
## CT:F10-PR:F5   6.95833333   3.3693347 10.54733201 0.0000013
## PU:F10-PR:F5   8.71666667   5.1276680 12.30566534 0.0000000
## PR:F10-PR:F5   0.51457023  -3.2496033  4.27874379 0.9999982
## PU:F10-CT:F10  1.75833333  -1.8306653  5.34733201 0.8699413
## PR:F10-CT:F10 -6.44376311 -10.2079367 -2.67958954 0.0000206
## PR:F10-PU:F10 -8.20209644 -11.9662700 -4.43792288 0.0000001

Step 6 - check assumptions

# Assumption 1 - Residuals should be normally distributed

  #get residuals 
    #test.res <- test.anova$residuals
  #graph residuals 
    #hist(test.res, main="Histogram of residuals",xlab="Residuals")


# Assumption 2 - Homogeneity of Variance(Levene’s Test)

GRAPHING

Bar graphs - Tiller heights all years

PG_all_avgs <- (PG_all) %>% group_by(d_sub2, plot, clip) %>% summarise_at(vars(avg_t_ht1), list(avg_t_ht = mean, std_ht = sd, max_ht = max, min_ht = min), na.rm = TRUE)

  PG_all_avgs$plot <- factor(PG_all_avgs$plot, levels = c("CT", "F2", "F5", "F10"))
  PG_all_avgs$clip <- factor(PG_all_avgs$clip, levels = c("CT", "PU", "PR"))

ggplot(data=PG_all_avgs, aes(x=d_sub2, y=avg_t_ht, group = plot))+
      geom_bar(aes(fill = plot), stat = "identity", width = 0.8, position= position_dodge(0.9))+
        theme_light()+
      scale_fill_manual(values=c("#d8b365", "#41b6c4", "#2c7fb8", "#253494")) +
    scale_x_discrete()+
    theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())+
     #theme(axis.title.y = element_text(face = "bold"))+
     #theme(axis.title.x = element_text(face = "bold"))+
     theme(plot.title = element_text(hjust = .5))+
     theme( axis.line = element_line(colour = "black", 
                      size = 1, linetype = "solid"))+
     theme(axis.text.x = element_text(color = "black", size = 11, angle = 45, hjust = 1))+
     theme(axis.text.y = element_text(color = "black", size = 8))+
    labs(y = "Mean final tiller height (cm)", x = "")+ 
    geom_errorbar(aes(ymin = avg_t_ht-std_ht, ymax= avg_t_ht+std_ht), position=position_dodge(.85), width = 0.2, color = "#909090")+
     theme(aspect.ratio = 9/18.5)+
     guides(color=guide_legend(" "))+
    #theme(legend.position="bottom")+
         facet_grid(clip ~ .)

ggsave("../Figures/finalheights_all_BarGraph.jpeg")

## Saving 7 x 5 in image

Box plot - Tiller Heights

PG_all$plot <- factor(PG_all$plot, levels = c("CT", "F2", "F5", "F10"))
  PG_all$clip <- factor(PG_all$clip, levels = c("CT", "PU", "PR"))

ggplot(PG_all, aes(x= d_sub2, y=avg_t_ht1, fill = plot))+
    geom_hline(yintercept = 15, color="#D0CECE")+
      geom_boxplot(position=position_dodge(.85))+
      scale_fill_manual(values=c("#E1BA15", "#41b6c4", "#2c7fb8", "#3041BE"))+
  stat_summary(fun="mean", color="black", size = .2, shape = 4, position=position_dodge(.85))+
     theme_light()+
     scale_y_continuous(limits = c(2.5,23), breaks=seq(0,25,5))+
     theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())+
     theme(axis.title.y = element_text(face = "bold"))+
     theme(axis.title.x = element_text(face = "bold"))+
     theme(plot.title = element_text(hjust = .5))+
     theme(axis.text.x = element_text(color = "black", size = 8, angle = 45, hjust = 1))+
     theme(axis.text.y = element_text(color = "black", size = 8))+
     theme(aspect.ratio = 9/18.5)+
  labs(y = "Final tiller height (cm)", x = "")+
    #theme(legend.position="bottom")+
     facet_grid(clip ~ .)

## Warning: Removed 5 rows containing non-finite values (stat_boxplot).

## Warning: Removed 5 rows containing non-finite values (stat_summary).

## Warning: Removed 16 rows containing missing values (geom_segment).

## Warning: Removed 16 rows containing missing values (geom_segment).

## Warning: Removed 16 rows containing missing values (geom_segment).

ggsave("../Figures/Finalheights_all_BOXPLOT.jpeg", width = 7, height = 7)

## Warning: Removed 5 rows containing non-finite values (stat_boxplot).

## Warning: Removed 5 rows containing non-finite values (stat_summary).

## Warning: Removed 16 rows containing missing values (geom_segment).

## Warning: Removed 16 rows containing missing values (geom_segment).

## Warning: Removed 16 rows containing missing values (geom_segment).

Box plot - T_Number

Step 1 - subset data by year and “final height” date

#before final average was taken 
Num2018 <- subset(tnum.avg.rep, d_sub2 == 4)
Num2019 <- subset(tnum.avg.rep, d_sub2 == 8)
Num2020 <- subset(tnum.avg.rep, d_sub2 == 9)
Num2021 <- subset(tnum.avg.rep, d_sub2 == 13)

#re-combine the years with just the final heights (used for final bar graph)
Num_all           <- Reduce(MyMerge, list(Num2018, Num2019, Num2020, Num2021))

#Rename date column data
Num_all$d_sub2 <- recode(Num_all$d_sub2, "4" = "2018", "8" = "2019", "9" = "2020", "13" = "2021")

Num_all$plot <- factor(Num_all$plot, levels = c("CT", "F2", "F5", "F10"))
Num_all$clip <- factor(Num_all$clip, levels = c("CT", "PU", "PR"))

ggplot(Num_all, aes(x= d_sub2, y=avg_t_num1, fill = plot))+
    geom_hline(yintercept = 3, color="#D0CECE")+
      geom_boxplot(position=position_dodge(.85))+
      scale_fill_manual(values=c("#E1BA15", "#41b6c4", "#2c7fb8", "#3041BE"))+
  stat_summary(fun="mean", color="black", size = .2, shape = 4, position=position_dodge(.85))+
     theme_light()+
     theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())+
     theme(axis.title.y = element_text(face = "bold"))+
     theme(axis.title.x = element_text(face = "bold"))+
     theme(plot.title = element_text(hjust = .5))+
     theme(axis.text.x = element_text(color = "black", size = 11, angle = 45, hjust = 1))+
     theme(axis.text.y = element_text(color = "black", size = 11))+
     theme(aspect.ratio = 9/18.5)+
  labs(y = "Final tiller number", x = "")+
    #theme(legend.position="bottom")+
     facet_grid(clip ~ .)

## Warning: Removed 16 rows containing missing values (geom_segment).

## Warning: Removed 16 rows containing missing values (geom_segment).

## Warning: Removed 16 rows containing missing values (geom_segment).

ggsave("../Figures/FinalTillerNumbers_all_BOXPLOT.jpeg", width = 7, height = 7)

## Warning: Removed 16 rows containing missing values (geom_segment).

## Warning: Removed 16 rows containing missing values (geom_segment).

## Warning: Removed 16 rows containing missing values (geom_segment).

Supplemental

Vole Damage assesment

Load in data

Vole_2022 <- read.csv("../Data/Herbivor_damage_2022.csv", header = TRUE)

head(Vole_2022) #opens first 6 rows

##     ï..Date block plot clip clip_id diam_cm damg
## 1 6/21/2022     1   CT   CT     CT1    23.0    0
## 2 6/21/2022     1   CT   CT     CT2    31.0    3
## 3 6/21/2022     1   CT   CT     CT3    29.0    0
## 4 6/21/2022     1   CT   PU     PU1    28.5    0
## 5 6/21/2022     1   CT   PU     PU2    22.5    0
## 6 6/21/2022     1   CT   PR     PR1    19.0    0

#rename date column
Vole_2022 <- Vole_2022 %>%
  rename(date = ï..Date)

unique(Vole_2022$diam_cm)

##  [1] 23.0 31.0 29.0 28.5 22.5 19.0 11.0 37.0 25.0 17.5 16.0 15.0 16.5 13.5 23.5
## [16] 18.0 21.5 17.0   NA 20.0 27.0 13.0 12.0 19.5 25.5 21.0 14.5 14.0 27.5 30.0
## [31] 22.0 24.5 24.0 15.5

Basic Stats

Calculate basic stats (avg, SD, Max, Min….Ex) for tussock damage

#average across type of clip treatment 
vole.dmg.clip <- (Vole_2022) %>% group_by(block, plot, clip) %>% summarise_at(vars(damg), list(avg_dmg = mean, std_dmg = sd, max_dmg = max, min_dmg = min), na.rm = TRUE) 

vole.diam.clip <- (Vole_2022) %>% group_by(block, plot, clip) %>% summarise_at(vars(diam_cm), list(avg_diam = mean, std_diam = sd, max_diam = max, min_diam = min), na.rm = TRUE)

#average across the blocks
vole.dmg.block <- (vole.dmg.clip) %>% group_by(plot, clip) %>% summarise_at(vars(avg_dmg), list(avg_dmg2 = mean, std_dmg2 = sd, max_dmg2 = max, min_dmg2 = min), na.rm = TRUE)

vole.diam.block <- (vole.diam.clip) %>% group_by(plot, clip) %>% summarise_at(vars(avg_diam), list(avg_diam2 = mean, std_diam2 = sd, max_diam2 = max, min_diam2 = min), na.rm = TRUE)

Merge the data back togeather

#Plot
vole.clip           <- Reduce(MyMerge, list(vole.dmg.clip, vole.diam.clip))

#Block
vole.block           <- Reduce(MyMerge, list(vole.dmg.block, vole.diam.block))

Box plots

tussock diameter

vole.clip$plot <- factor(vole.clip$plot, levels = c("CT", "F2", "F5", "F10"))
  vole.clip$clip <- factor(vole.clip$clip, levels = c("CT", "PU", "PR"))


ggplot(vole.clip, aes(x= clip, y=avg_diam, fill = plot))+
    geom_hline(yintercept = 22.5, color="#D0CECE")+
      geom_boxplot(position=position_dodge(.85))+
      scale_fill_manual(values=c("#E1BA15", "#41b6c4", "#2c7fb8", "#3041BE"))+
  stat_summary(fun="mean", color="black", size = .2, shape = 4, position=position_dodge(.85))+
     theme_light()+
     theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())+
     theme(axis.title.y = element_text(face = "bold"))+
     theme(axis.title.x = element_text(face = "bold"))+
     theme(plot.title = element_text(hjust = .5))+
     theme(axis.text.x = element_text(color = "black", size = 8, angle = 45, hjust = 1))+
     theme(axis.text.y = element_text(color = "black", size = 8))+
     theme(aspect.ratio = 9/18.5)+
  labs(y = "Tusscok diameter (cm)", x = "")

## Warning: Removed 12 rows containing missing values (geom_segment).

ggsave("../Figures/Tusscok_diameter_BOXPLOT.jpeg", width = 7, height = 7)

## Warning: Removed 12 rows containing missing values (geom_segment).

Vole damage

vole.clip$plot <- factor(vole.clip$plot, levels = c("CT", "F2", "F5", "F10"))
  vole.clip$clip <- factor(vole.clip$clip, levels = c("CT", "PU", "PR"))


ggplot(vole.clip, aes(x= clip, y=avg_dmg, fill = plot))+
    geom_hline(yintercept = 2, color="#D0CECE")+
      geom_boxplot(position=position_dodge(.85))+
      scale_fill_manual(values=c("#E1BA15", "#41b6c4", "#2c7fb8", "#3041BE"))+
  stat_summary(fun="mean", color="black", size = .2, shape = 4, position=position_dodge(.85))+
     theme_light()+
     theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())+
     theme(axis.title.y = element_text(face = "bold"))+
     theme(axis.title.x = element_text(face = "bold"))+
     theme(plot.title = element_text(hjust = .5))+
     theme(axis.text.x = element_text(color = "black", size = 8, angle = 45, hjust = 1))+
     theme(axis.text.y = element_text(color = "black", size = 8))+
     theme(aspect.ratio = 9/18.5)+
  labs(y = "Tusscok damage", x = "")

## Warning: Removed 12 rows containing missing values (geom_segment).

ggsave("../Figures/Tusscok_damage_BOXPLOT.jpeg", width = 7, height = 7)

## Warning: Removed 12 rows containing missing values (geom_segment).

Other

Scatter plot PR tiller hights calculate slopes of regrowth

#subset the PR data from the other treatments 
PR_sub <- subset(ht.avg.clip, clip=="PR")

#Create a new block ID column that can be used in graph 
#-----------------------------------------------------------------------------------
#vectors for new block ID column
B1 <- c("B1")
B2 <- c("B2")
B3 <- c("B3")

#Then subset by clip_id...
PR_B1_sub <- subset(PR_sub, block== 1 )
PR_B2_sub <- subset(PR_sub, block== 2 )
PR_B3_sub <- subset(PR_sub, block== 3 )

#Fill the columns
PR_B1_sub["block_ID"] <- B1
PR_B2_sub["block_ID"] <- B2
PR_B3_sub["block_ID"] <- B3

#Merge them back togeather 
MyMerge       <- function(x, y){
  df            <- merge(x, y, all = TRUE)
  rownames(df)  <- df$Row.names
  df$Row.names  <- NULL
  return(df)
}
PR_sub1           <- Reduce(MyMerge, list(PR_B1_sub, PR_B2_sub, PR_B3_sub))

#subset by year
#-----------------------------------------------------------------------------------------
#2018
PR_18_sub <- subset(PR_sub1, d_sub2 < 5)

#2019
PR_19_sub <- subset(PR_sub1, d_sub2 > 4)
PR_19_sub <- subset(PR_19_sub, d_sub2 < 9)

#2021
PR_21_sub <- subset(PR_sub1, d_sub2 > 9 )

2018

#specify factor levels for fertilization and clipping treatments 
PR_18_sub$plot <- factor(PR_18_sub$plot, levels = c("CT", "F2", "F5", "F10"))
PR_18_sub$clip <- factor(PR_18_sub$clip, levels = c("CT", "PU", "PR"))


ggplot(data = PR_18_sub, aes(x = d_sub2, y = avg_t_ht2, color = plot, shape = block_ID)) +
  scale_color_manual(values = c("CT" = "#d8b365", 
                                "F2" = "#41b6c4",
                                "F5" = "#2c7fb8",
                                "F10" = "#253494"))+
  geom_point()

  #geom_smooth(method = "lm", se = TRUE)

R_cleaning_code_tiller_heights

Nicole Williamson

2022-09-15

Setting up

Helpful Tips

Downloading/loading packages

Data Wrangling

Import data

Merge older data together

Add new clip code

Merge old and new data

Add new date column

Add clipping treatments

STATISTICS I

Basic stats

Tiller Height

Tiller Number

GRAPHING

Line graph - tiller heights over time

Line graph - tiller number over time

STATISTICS II

2-way ANOVAs

Step 1 - subset data by year and “final height” date

Step 2 - visualize your data with box plots

Step 3 - visualize data with interaction plots

Step 4 - Run ANOVA’s for each year

Step 5 - Tukeys test

Step 6 - check assumptions

GRAPHING

Bar graphs - Tiller heights all years

Box plot - Tiller Heights

Box plot - T_Number

Step 1 - subset data by year and “final height” date

Supplemental

Vole Damage assesment

Load in data

Basic Stats

Merge the data back togeather

Box plots

Other

Scatter plot PR tiller hights calculate slopes of regrowth

2018