For the following analyses we will require the use of a number of different R packages. Most of which can be sourced from CRAN, but some may be downloaded from GitHub. We can use the following code to load in the packages and install any packages not previously installed in the R console.
if (!require("pacman")) install.packages("pacman")
pacman::p_load(ggplot2, stringr, gridExtra, ggpubr, Rmisc, FSA, rcompanion, RColorBrewer, dplyr, vegan, nparcomp, RVAideMemoire, MANOVA.RM, pairwiseAdonis, PMCMR, PMCMRplus, patchwork, plyr, gdata)
pacman::p_load_gh("pmartinezarbizu/pairwiseAdonis/pairwiseAdonis")
Loading the data set into R. All analyses are conducted from a master spreadsheet dating back to November 2017. We are using the function read.xls from the package gdata to read the excel file straight into R. Note: this master spreadsheet only has 1 sheet.
This data set contains a count of “TMTC” or “too many to count” so we are adding a line of na.strings = "TMTC" which will take that string and treat it as an NA. This is so we can manipulate that column as integers. na.strings can be expanded to include other qualifiers if necessary. We included the code class(rd$TotalHealthy) to ensure that the column containing the “TMTC” input is being read as an integer and that the na.strings = "TMTC" worked. We are calling it rd for “roving diver” which is the type of surveys these are.
rd <- read.xls("../data/RovingDiverSurveys.xlsx", head = T, na.strings ="TMTC")
class(rd$Healthy)
## [1] "integer"
head(rd)
## Date DiverName SiteCode TimeStart TimeEnd Depth SurveyArea
## 1 2017-11-09 Joshua Voss SEFL01 14:40:00.00 15:12:00.00 15 700
## 2 2017-11-09 Joshua Voss SEFL01 14:40:00.00 15:12:00.00 15 700
## 3 2017-11-09 Joshua Voss SEFL01 14:40:00.00 15:12:00.00 15 700
## 4 2017-11-09 Joshua Voss SEFL01 14:40:00.00 15:12:00.00 15 700
## 5 2017-11-09 Joshua Voss SEFL02 13:29:00.00 14:05:00.00 20 600
## 6 2017-11-09 Joshua Voss SEFL02 13:29:00.00 14:05:00.00 20 600
## SpeciesCode UK DS BB RB YB SCTLD WP WS P PB BL Healthy Notes
## 1 MCAV 0 0 0 0 0 2 0 0 0 0 0 2 Dead 2
## 2 PCLI 0 0 0 0 0 9 0 0 0 0 0 5 Dead 26
## 3 ISIN 0 0 0 0 0 0 0 0 0 0 0 1
## 4 PAST 0 0 0 0 0 0 0 0 0 0 14 13
## 5 PCLI 0 0 0 0 0 2 0 0 0 0 0 4 Dead 2
## 6 SBOU 0 0 0 0 0 0 0 0 0 0 2 0 Dead 1
One of the purposes of this pipeline is to manipulate the data set a number of ways in order to achieve specific outputs for reports. We are going to mainly do this using the package dplyr. We are doing these manipulations and calculations in R to prevent manipulation of the master database and to ensure continuity and consistency throughout this longterm monitoring project.
Our first step is to format the dates they can be read by R. Reading the data in from the excel spreadsheet, R recognizes the Date column as a Date format (if you read your data in as a csv file this won’t happen). The data is read as YYY/MM/DD, but often it is helpful to parse things out by Month or by Year so we are adding a column using a M-Y format. This same code can be changed to include a YYYY format as well.
rd$MonthYear = format(as.Date(as.Date(rd$Date), format = "%y/%m/%d"), "%m-%y")
rd$Date = as.Date(rd$Date)
class(rd$Date)
## [1] "Date"
head(rd)
## Date DiverName SiteCode TimeStart TimeEnd Depth SurveyArea
## 1 2017-11-09 Joshua Voss SEFL01 14:40:00.00 15:12:00.00 15 700
## 2 2017-11-09 Joshua Voss SEFL01 14:40:00.00 15:12:00.00 15 700
## 3 2017-11-09 Joshua Voss SEFL01 14:40:00.00 15:12:00.00 15 700
## 4 2017-11-09 Joshua Voss SEFL01 14:40:00.00 15:12:00.00 15 700
## 5 2017-11-09 Joshua Voss SEFL02 13:29:00.00 14:05:00.00 20 600
## 6 2017-11-09 Joshua Voss SEFL02 13:29:00.00 14:05:00.00 20 600
## SpeciesCode UK DS BB RB YB SCTLD WP WS P PB BL Healthy Notes MonthYear
## 1 MCAV 0 0 0 0 0 2 0 0 0 0 0 2 Dead 2 11-17
## 2 PCLI 0 0 0 0 0 9 0 0 0 0 0 5 Dead 26 11-17
## 3 ISIN 0 0 0 0 0 0 0 0 0 0 0 1 11-17
## 4 PAST 0 0 0 0 0 0 0 0 0 0 14 13 11-17
## 5 PCLI 0 0 0 0 0 2 0 0 0 0 0 4 Dead 2 11-17
## 6 SBOU 0 0 0 0 0 0 0 0 0 0 2 0 Dead 1 11-17
We see that we have MonthYear at the end of the dataset and Date is now in a YYYY-MM-DD format.
We also want to add a few more columns like County, Location, Total Observations, and Prevalence. We will start with adding County and Location.
Our dataset includes surveys across three counties, at four different locations, but they are entered into the data set at the individual site level (SiteCode). It is often helpful, especially when reporting, to report at the County level or even by location. SiteCode is the individual sites where surveys occur (ex: SEFL05, T328, BC1, etc), Location is the general area where operations occur (ex: Jupiter, Lauderdale-by-the-Sea), and County is the county where operations occur (ex: Martin County, Palm Beach County, and Broward County).
rd$County = if_else(rd$SiteCode %in% c("SEFL01", "SEFL02", "SEFL 03", "SLR South", "SLR Central", "SLR Ledge", "SLR North"), "Martin",
if_else(rd$SiteCode %in% c("T328", "BC1", "FTL4"), "Broward", "Palm Beach"))
rd$Location= if_else(rd$SiteCode %in% c("SEFL01", "SEFL02", "SEFL03", "SLR South", "SLR Central", "SLR Ledge", "SLR North"), "SLR",
if_else(rd$SiteCode %in% c("T328", "BC1", "FTL4"), "PMP",
if_else(rd$SiteCode %in% c("SEFL04", "SEFL05", "SEFL06"), "JUP",
if_else(rd$SiteCode %in% c("SEFL08", "SEFL09", "SEFL10", "SEFL11", "SEFL12"), "WPB", ""))))
head(rd)
## Date DiverName SiteCode TimeStart TimeEnd Depth SurveyArea
## 1 2017-11-09 Joshua Voss SEFL01 14:40:00.00 15:12:00.00 15 700
## 2 2017-11-09 Joshua Voss SEFL01 14:40:00.00 15:12:00.00 15 700
## 3 2017-11-09 Joshua Voss SEFL01 14:40:00.00 15:12:00.00 15 700
## 4 2017-11-09 Joshua Voss SEFL01 14:40:00.00 15:12:00.00 15 700
## 5 2017-11-09 Joshua Voss SEFL02 13:29:00.00 14:05:00.00 20 600
## 6 2017-11-09 Joshua Voss SEFL02 13:29:00.00 14:05:00.00 20 600
## SpeciesCode UK DS BB RB YB SCTLD WP WS P PB BL Healthy Notes MonthYear
## 1 MCAV 0 0 0 0 0 2 0 0 0 0 0 2 Dead 2 11-17
## 2 PCLI 0 0 0 0 0 9 0 0 0 0 0 5 Dead 26 11-17
## 3 ISIN 0 0 0 0 0 0 0 0 0 0 0 1 11-17
## 4 PAST 0 0 0 0 0 0 0 0 0 0 14 13 11-17
## 5 PCLI 0 0 0 0 0 2 0 0 0 0 0 4 Dead 2 11-17
## 6 SBOU 0 0 0 0 0 0 0 0 0 0 2 0 Dead 1 11-17
## County Location
## 1 Martin SLR
## 2 Martin SLR
## 3 Martin SLR
## 4 Martin SLR
## 5 Martin SLR
## 6 Martin SLR
We can see that the County and Location columns have been added to the end of the data set!
Now we are going to tabulate the all of the disease observations and healthy observations into a TotalObservations column. This data set has 11 different keys for “disease” from unknown (UK) to black band disease (BB), to stony coral tissue loss disease (SCTLD), as well as paling (P) and bleaching (BL). We are combinging all of those, along with the Healthy column to have a TotalObservations column.
rd$TotalObservations <- rd$UK + rd$DS + rd$BB + rd$RB + rd$YB + rd$SCTLD + rd$WP + rd$WS + rd$P + rd$PB + rd$BL + rd$Healthy
head(rd)
## Date DiverName SiteCode TimeStart TimeEnd Depth SurveyArea
## 1 2017-11-09 Joshua Voss SEFL01 14:40:00.00 15:12:00.00 15 700
## 2 2017-11-09 Joshua Voss SEFL01 14:40:00.00 15:12:00.00 15 700
## 3 2017-11-09 Joshua Voss SEFL01 14:40:00.00 15:12:00.00 15 700
## 4 2017-11-09 Joshua Voss SEFL01 14:40:00.00 15:12:00.00 15 700
## 5 2017-11-09 Joshua Voss SEFL02 13:29:00.00 14:05:00.00 20 600
## 6 2017-11-09 Joshua Voss SEFL02 13:29:00.00 14:05:00.00 20 600
## SpeciesCode UK DS BB RB YB SCTLD WP WS P PB BL Healthy Notes MonthYear
## 1 MCAV 0 0 0 0 0 2 0 0 0 0 0 2 Dead 2 11-17
## 2 PCLI 0 0 0 0 0 9 0 0 0 0 0 5 Dead 26 11-17
## 3 ISIN 0 0 0 0 0 0 0 0 0 0 0 1 11-17
## 4 PAST 0 0 0 0 0 0 0 0 0 0 14 13 11-17
## 5 PCLI 0 0 0 0 0 2 0 0 0 0 0 4 Dead 2 11-17
## 6 SBOU 0 0 0 0 0 0 0 0 0 0 2 0 Dead 1 11-17
## County Location TotalObservations
## 1 Martin SLR 4
## 2 Martin SLR 14
## 3 Martin SLR 1
## 4 Martin SLR 27
## 5 Martin SLR 6
## 6 Martin SLR 2
For the purposes of reporting, you might want to block certain dates out, and only report on certain dates. Here, as an example, we are subsetting everything before July 3, 2019. This code can be modified to capture any time period you may need. We are calling it rd.early because it encapsulates the early days of our surveys. I am also subsetting for only our longterm monitoring sites.
rd.early <- filter(rd, Date < "2019-07-03") %>%
subset(SiteCode %in% c("SEFL01", "SEFL02", "SEFL04", "SEFL05", "SEFL06", "SEFL08", "SEFL11", "SEFL12", "SLR North", "SLR South", "SLR Central", "SLR Ledge", "BC1", "T328", "FTL4"))
head(rd.early)
## Date DiverName SiteCode TimeStart TimeEnd Depth SurveyArea
## 1 2017-11-09 Joshua Voss SEFL01 14:40:00.00 15:12:00.00 15 700
## 2 2017-11-09 Joshua Voss SEFL01 14:40:00.00 15:12:00.00 15 700
## 3 2017-11-09 Joshua Voss SEFL01 14:40:00.00 15:12:00.00 15 700
## 4 2017-11-09 Joshua Voss SEFL01 14:40:00.00 15:12:00.00 15 700
## 5 2017-11-09 Joshua Voss SEFL02 13:29:00.00 14:05:00.00 20 600
## 6 2017-11-09 Joshua Voss SEFL02 13:29:00.00 14:05:00.00 20 600
## SpeciesCode UK DS BB RB YB SCTLD WP WS P PB BL Healthy Notes MonthYear
## 1 MCAV 0 0 0 0 0 2 0 0 0 0 0 2 Dead 2 11-17
## 2 PCLI 0 0 0 0 0 9 0 0 0 0 0 5 Dead 26 11-17
## 3 ISIN 0 0 0 0 0 0 0 0 0 0 0 1 11-17
## 4 PAST 0 0 0 0 0 0 0 0 0 0 14 13 11-17
## 5 PCLI 0 0 0 0 0 2 0 0 0 0 0 4 Dead 2 11-17
## 6 SBOU 0 0 0 0 0 0 0 0 0 0 2 0 Dead 1 11-17
## County Location TotalObservations
## 1 Martin SLR 4
## 2 Martin SLR 14
## 3 Martin SLR 1
## 4 Martin SLR 27
## 5 Martin SLR 6
## 6 Martin SLR 2
tail(rd.early)
## Date DiverName SiteCode TimeStart TimeEnd Depth SurveyArea
## 573 2019-06-21 Ryan Eckert BC1 09:55:00.00 10:15:00.00 NA 1500
## 574 2019-06-21 Ryan Eckert BC1 09:55:00.00 10:15:00.00 NA 1500
## 575 2019-06-21 Ryan Eckert BC1 09:55:00.00 10:15:00.00 NA 1500
## 576 2019-06-21 Ryan Eckert BC1 09:55:00.00 10:15:00.00 NA 1500
## 577 2019-06-21 Ryan Eckert BC1 09:55:00.00 10:15:00.00 NA 1500
## 578 2019-06-21 Ryan Eckert BC1 09:55:00.00 10:15:00.00 NA 1500
## SpeciesCode UK DS BB RB YB SCTLD WP WS P PB BL Healthy Notes MonthYear
## 573 PAST 0 0 0 0 0 0 0 0 0 0 0 23 06-19
## 574 SSID 0 0 0 0 0 0 0 0 0 0 0 15 06-19
## 575 OFAV 0 0 0 0 0 0 0 0 0 0 0 3 06-19
## 576 SINT 0 0 0 0 0 0 0 0 0 0 0 3 06-19
## 577 ACER 0 0 0 0 0 0 0 0 0 0 0 2 06-19
## 578 OANN 0 0 0 0 0 0 0 0 0 0 0 1 06-19
## County Location TotalObservations
## 573 Broward PMP 23
## 574 Broward PMP 15
## 575 Broward PMP 3
## 576 Broward PMP 3
## 577 Broward PMP 2
## 578 Broward PMP 1
Checking to see that we have our earliest date head(rd.early) and our cut-off date tail(rd.early) to confirm the subset worked.
Currently the data set has very high resolution, to species level, which may not be necessary for certain reports. Now, using dplyr we will tabulate the data into a more concise form. In this example we are assuming that we need to report on data from the time frame we subsetted earlier (November 2017 - June 2019), and the report only needs to know prevalence data by County. Sometimes packages mask other packages so I am specifying that I want to use the function select from the package dplyr by using the following: dplyr::select.
countyData <- rd.early %>% dplyr::select(c(SiteCode, County, MonthYear, SpeciesCode, SCTLD, TotalObservations)) %>% group_by(SiteCode,MonthYear,County) %>% summarise_if(is.integer, sum)
countyData
## # A tibble: 60 x 5
## # Groups: SiteCode, MonthYear [60]
## SiteCode MonthYear County SCTLD TotalObservations
## <fct> <chr> <chr> <int> <int>
## 1 BC1 03-19 Broward 6 154
## 2 BC1 06-19 Broward 16 307
## 3 BC1 11-18 Broward 26 236
## 4 BC1 12-18 Broward 23 315
## 5 FTL4 03-19 Broward 6 95
## 6 FTL4 05-19 Broward 7 128
## 7 FTL4 06-19 Broward 5 118
## 8 FTL4 11-18 Broward 9 65
## 9 SEFL01 11-17 Martin 11 46
## 10 SEFL02 11-17 Martin 3 10
## # … with 50 more rows
To summarize: I am selecting (dplyr::select()) from the data the columns of SiteCode, County, MonthYear, SpeciesCode, TLD, and TotalObservations. grouping them (group_by()) by SiteCode, MonthYear, and County; and adding (summarise_if(is.integer, sum)) all integers found in that subset (in this case TLD and TotalObservations).
# Calculating Prevalence Data * We want to calculate SCTLD disease prevalence at our monitoring sites. To do this, we will calculate prevalence by diving our SCTLD observations (TLD) from our total number of observations (TotalObservations). The syntax used here is very similar to the way we calculated TotalObservations**.
countyData$Prevalence = countyData$SCTLD/countyData$TotalObservations
head(countyData)
## # A tibble: 6 x 6
## # Groups: SiteCode, MonthYear [6]
## SiteCode MonthYear County SCTLD TotalObservations Prevalence
## <fct> <chr> <chr> <int> <int> <dbl>
## 1 BC1 03-19 Broward 6 154 0.0390
## 2 BC1 06-19 Broward 16 307 0.0521
## 3 BC1 11-18 Broward 26 236 0.110
## 4 BC1 12-18 Broward 23 315 0.0730
## 5 FTL4 03-19 Broward 6 95 0.0632
## 6 FTL4 05-19 Broward 7 128 0.0547
You can see that we added a Prevalence column to the data set!
Now our report needs some analyses. Here we want to see if our prevalence data varies between counties and through time. We will be using a PERMANOVA in R to assess this. We will use the adonis() function in vegan.
set.seed(999)
adonis(formula = Prevalence ~ County*MonthYear, data = countyData, method = "euclidian")
##
## Call:
## adonis(formula = Prevalence ~ County * MonthYear, data = countyData, method = "euclidian")
##
## Permutation: free
## Number of permutations: 999
##
## Terms added sequentially (first to last)
##
## Df SumsOfSqs MeanSqs F.Model R2 Pr(>F)
## County 2 0.15666 0.078328 12.2394 0.28648 0.001 ***
## MonthYear 12 0.10287 0.008572 1.3395 0.18811 0.249
## County:MonthYear 4 0.02493 0.006233 0.9739 0.04559 0.443
## Residuals 41 0.26239 0.006400 0.47982
## Total 59 0.54684 1.00000
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
County is significantly different, but Time is not.
Now lets conduct a pairwise comparison to determine which county (or counties) is driving the variation. We will be using pairwise.adonis() from pairwiseAdonis.
set.seed(999)
pairwise.adonis(countyData[c(6)],factors=countyData$County,
sim.method='euclidian',p.adjust.m='bonferroni', perm = 999)
## pairs Df SumsOfSqs F.Model R2 p.value
## 1 Broward vs Martin 1 0.001822233 0.1137796 0.004922589 0.708
## 2 Broward vs Palm Beach 1 0.085256428 53.6098535 0.538198297 0.001
## 3 Martin vs Palm Beach 1 0.111946987 14.8666035 0.248328828 0.002
## p.adjusted sig
## 1 1.000
## 2 0.003 *
## 3 0.006 *
We see that Palm Beach County is significantly different from both Martin County and Broward County. Now its time to visualize this data!
.
We will now make a number of plots to visualize this prevalence data. The first is plotting disease prevalence at each site over time, grouped by each county. But first we need to subset by County. Just a note, SEFL01 and SEFL02 are actually SLR Central and SLR South, respectively. For plotting purposes I am renaming them. I am also arranging the factor levels chronologically.
martin <- subset(countyData, County == "Martin")
class(martin$MonthYear)
## [1] "character"
martin$MonthYear <- as.factor(martin$MonthYear)
levels(martin$MonthYear)
## [1] "03-19" "04-19" "06-19" "11-17"
martin$MonthYear = factor(martin$MonthYear, levels(martin$MonthYear)[c(4, 1, 2, 3)])
levels(martin$MonthYear)
## [1] "11-17" "03-19" "04-19" "06-19"
levels(martin$SiteCode)[levels(martin$SiteCode) == "SEFL01"] <- "SLR Central"
levels(martin$SiteCode)[levels(martin$SiteCode) == "SEFL02"] <- "SLR South"
levels(martin$SiteCode)
## [1] "BC1" "FTL4" "SLR Central" "SLR South" "SEFL03"
## [6] "SEFL04" "SEFL05" "SEFL06" "SEFL08" "SEFL09"
## [11] "SEFL10" "SEFL11" "SEFL12" "SEFL13" "SEFL14"
## [16] "SEFL15" "SEFL16" "SEFL17" "SEFL18" "SEFL19"
## [21] "SEFL19*" "SEFL20" "SEFL21" "SEFL22" "SLR Ledge"
## [26] "SLR North" "T328"
palm <- subset(countyData, County == "Palm Beach")
class(palm$MonthYear)
## [1] "character"
palm$MonthYear <- as.factor(palm$MonthYear)
levels(palm$MonthYear)
## [1] "02-19" "03-19" "04-18" "05-18" "05-19" "06-18" "06-19" "08-18" "12-17"
## [10] "12-18"
palm$MonthYear = factor(palm$MonthYear, levels(palm$MonthYear)[c(9, 3, 4, 6, 8, 10, 1, 2, 5, 7)])
levels(palm$MonthYear)
## [1] "12-17" "04-18" "05-18" "06-18" "08-18" "12-18" "02-19" "03-19" "05-19"
## [10] "06-19"
broward <- subset(countyData, County == "Broward")
class(broward$MonthYear)
## [1] "character"
broward$MonthYear <- as.factor(broward$MonthYear)
levels(broward$MonthYear)
## [1] "03-19" "05-19" "06-19" "11-18" "12-18"
broward$MonthYear = factor(broward$MonthYear, levels(broward$MonthYear)[c(4,5,1,2,3)])
levels(broward$MonthYear)
## [1] "11-18" "12-18" "03-19" "05-19" "06-19"
Plot for Martin County
martin.prev.plot1 <-
ggplot(martin, aes(x=MonthYear, y=Prevalence, color = SiteCode, group = SiteCode, label=SiteCode))+
geom_line(size = 1)+
geom_point(size = 2)+
labs(y = "Prevalence")+
annotate("label", x = 2.45, y = 0.45, label = paste ("Martin County"), size = 5)+
scale_x_discrete(limits = c("11-17","12-17","01-18", "04-18", "05-18", "06-18", "08-18", "11-18", "12-18", "02-19", "03-19", "04-19", "05-19", "06-19"), labels = c("Nov 17","Dec 17","Jan 18", "Apr 18", "May 18", "Jun 18", "Aug 18", "Nov 18", "Dec 18", "Feb 19", "Mar 19", "Apr 19", "May 19", "Jun 19"))+
labs(color = "Site")+
ylim(0,0.5)
# takes the legend and saves it as a separate object (grob)
get_legend<-function(martin.prev.plot1){
tmp <- ggplot_gtable(ggplot_build(martin.prev.plot1))
leg <- which(sapply(tmp$grobs, function(x) x$name) == "guide-box")
martin.legend <- tmp$grobs[[leg]]
return(martin.legend)
}
martin.legend=get_legend(martin.prev.plot1)
martin.prev.plot <- martin.prev.plot1 +
theme(panel.grid.major = element_line(size = 0.5, linetype = 'solid', colour = "white"),
panel.background = element_rect(fill = '#F5F5F5'),
plot.title = element_text(hjust = 0.5),
axis.line = element_line(colour = "black"),
axis.ticks = element_line(color="black"),
text = element_text(size=15, color="black"),
axis.title.x=element_blank(),
axis.text.x=element_blank(),
axis.text.y=element_text(size=12, color="black")
)
martin.prev.plot
Plot for Palm Beach County
palm.prev.plot1 <-
ggplot(palm, aes(x=MonthYear, y=Prevalence, color = SiteCode, group = SiteCode, label = SiteCode))+
geom_line()+
geom_point()+
annotate("label", x = 2.45, y = 0.45, label = paste ("Palm Beach County"), size = 5)+
scale_x_discrete(limits = c("11-17","12-17","01-18", "04-18", "05-18", "06-18", "08-18", "11-18", "12-18", "02-19", "03-19", "04-19", "05-19", "06-19"), labels = c("Nov 17","Dec 17","Jan 18", "Apr 18", "May 18", "Jun 18", "Aug 18", "Nov 18", "Dec 18", "Feb 19", "Mar 19", "Apr 19", "May 19", "Jun 19"))+
labs(color = "Site")+
ylim(0,0.5)
# takes the legend and saves it as a separate object (grob)
legend<-function(palm.prev.plot1){
tmp <- ggplot_gtable(ggplot_build(palm.prev.plot1))
leg <- which(sapply(tmp$grobs, function(x) x$name) == "guide-box")
martin.legend <- tmp$grobs[[leg]]
return(palm.legend)
}
palm.legend=get_legend(palm.prev.plot1)
palm.prev.plot <- palm.prev.plot1+
theme(panel.grid.major = element_line(size = 0.5, linetype = 'solid', colour = "white"),
panel.background = element_rect(fill = '#F5F5F5'),
plot.title = element_text(hjust = 0.5),
axis.line = element_line(colour = "black"),
axis.ticks = element_line(color="black"),
text = element_text(size=15, color="black"),
axis.title.x=element_blank(),
axis.text.x=element_blank(),
axis.text.y=element_text(size=12, color="black")
)
palm.prev.plot
Plot for Broward County
broward.prev.plot1 <-
ggplot(broward, aes(x=MonthYear, y=Prevalence, color = SiteCode, group = SiteCode, label = SiteCode))+
geom_line()+
geom_point()+
scale_x_discrete(limits = c("11-17","12-17","01-18", "04-18", "05-18", "06-18", "08-18", "11-18", "12-18", "02-19", "03-19", "04-19", "05-19", "06-19"), labels = c("Nov 17","Dec 17","Jan 18", "Apr 18", "May 18", "Jun 18", "Aug 18", "Nov 18", "Dec 18", "Feb 19", "Mar 19", "Apr 19", "May 19", "Jun 19"))+
ylim(0,0.5)+
annotate("label", x = 2.45, y = 0.45, label = paste ("Broward County"), size = 5) +
labs(y = "Prevalence", x = "Month", color = "Site")
legend<-function(broward.prev.plot1){
tmp <- ggplot_gtable(ggplot_build(broward.prev.plot1))
leg <- which(sapply(tmp$grobs, function(x) x$name) == "guide-box")
martin.legend <- tmp$grobs[[leg]]
return(broward.legend)
}
broward.legend=get_legend(broward.prev.plot1)
broward.prev.plot <- broward.prev.plot1+
theme(panel.grid.major = element_line(size = 0.5, linetype = 'solid', colour = "white"),
panel.background = element_rect(fill = '#F5F5F5'),
plot.title = element_text(hjust = 0.5),
axis.line = element_line(colour = "black"),
axis.ticks = element_line(color="black"),
text = element_text(size=15, color="black"),
axis.text.x=element_text(angle = 45, hjust = 1,size=12, color="black"),
axis.text.y=element_text(size=12, color="black")
)
broward.prev.plot
You might have noticed that only the Broward County plot had labels and titles along the x-axis. That is because we are going to stack the plots using patchwork and saving it in our figures folder.
stacked.plot <- (martin.prev.plot / palm.prev.plot / broward.prev.plot)
ggsave("../figures/site.prevalence.png", plot= stacked.plot, width=12, height=8, units="in", dpi=600)
stacked.plot