Commit from RStudio, MBP, EOD
bglarkin committed Dec 17, 2022
1 parent 843431e commit 8f43048
Showing 11 changed files with 118 additions and 62 deletions.
3 changes: 2 additions & 1 deletion .gitignore
@@ -3,4 +3,5 @@
.RData
.Ruserdata
pairwise_trt_table.csv
pairwise_rust_table.csv
pairwise_rust_table.csv
README.html
7 changes: 7 additions & 0 deletions README.md
@@ -3,10 +3,12 @@ Analysis of western white pine (*Pinus monticola*) terpenes in response to blist

## Source data
The [Database]() directory contains raw csv files which are extracted, transformed, and loaded by [data_etl.md](data_etl.R). The ETL script is called separately in each analysis script for brevity.

- Raw script: [data_etl.R](data_etl.md)

## Global ordination and permutation tests
Ordinations of class centroids for each resistance class of seedlings. Locations based on all terpene compounds. Permutation tests conducted with `adonis2()` and 1999 permutations.

- Report format: [terpenes_globalTests.md](terpenes_globalTests.md)
- Raw script: [terpenes_globalTests.R](terpenes_globalTests.R)

@@ -16,10 +18,15 @@ within resistance classes and assessments, and between symbiont controls and tre
permuted within experimental greenhouse blocks.
2. Permutation tests as in #1, but between induced and control seedlings after the rust inoculation step
(**rust_inoc** vs. **rust_ctrl**)

- Report format: [terpenes_pairwiseTests.md](terpenes_pairwiseTests.md)
- Raw script: [terpenes_pairwiseTests.R](terpenes_pairwiseTests.R)
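
The block-restricted permutation test described above can be sketched roughly as follows. This is a minimal illustration on simulated data, not the repository's script: the objects `meta`, `terp`, and `perm`, the block layout, and the Bray-Curtis distance are all assumptions made for the example. Only `adonis2()` and the 1999 permutations come from the README text.

```r
library(vegan)    # adonis2()
library(permute)  # how(), setBlocks()

set.seed(1)

# Hypothetical design: 4 greenhouse blocks x 2 treatments x 3 seedlings
meta <- expand.grid(
  rep = 1:3,
  treatment = c("Control", "EMF"),
  block = factor(1:4)
)

# Hypothetical terpene matrix: one row per seedling, five made-up compounds
terp <- matrix(
  rlnorm(nrow(meta) * 5),
  ncol = 5,
  dimnames = list(NULL, paste0("compound_", 1:5))
)

# Restrict shuffling so seedlings are only permuted within their own block
perm <- how(nperm = 1999)
setBlocks(perm) <- meta$block

res <- adonis2(terp ~ treatment, data = meta, method = "bray", permutations = perm)
print(res)
```

Passing a `how()` object to `permutations` keeps the block structure out of the model formula while still respecting it in the null distribution.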

## Indicator terpenes
An indicator species analysis ([Borcard et al. 2018](https://doi.org/10.1007/978-3-319-71404-2)) is conducted
to identify which terpene compounds associate strongly with particular assessments,
resistance classes, or treatments. A post-hoc bootstrap test is conducted to produce visualizations and
corroborate the indicator species analysis.

- Report format: [terpenes_indicators.md](terpenes_indicators.md)
- Raw script: [terpenes_indicators.R](terpenes_indicators.R)
4 changes: 2 additions & 2 deletions terpenes_globalTests.R
@@ -1,7 +1,7 @@
#' ---
#' title: "Global tests of terpene composition"
#' author: "Beau Larkin"
#' date: "2022-12-08"
#' author: "Beau Larkin\n"
#' date: "Last updated: `r format(Sys.time(), '%d %B, %Y')`"
#' output:
#' github_document:
#' toc: true
9 changes: 3 additions & 6 deletions terpenes_globalTests.md
@@ -1,7 +1,8 @@
Global tests of terpene composition
================
Beau Larkin
2022-12-08

Last updated: 16 December, 2022

- <a href="#description" id="toc-description">Description</a>
- <a href="#package-and-library-installation"
@@ -57,11 +58,6 @@ for (i in 1:length(packages_needed)) {
}
```

``` r
# Load ggplot styles and themes from text file
source("gg_style.txt")
```

# Data

See
@@ -117,6 +113,7 @@ sapply(data, function(x)
## # … with 16 more variables: nc5 <dbl>, pbr5 <dbl>, br5 <dbl>, ss5 <dbl>,
## # dm4 <dbl>, sv4 <dbl>, ss4 <dbl>, dm3 <dbl>, sv3 <dbl>, vig3 <dbl>,
## # bi3 <dbl>, nc3 <dbl>, pbr3 <dbl>, br3 <dbl>, ss3 <dbl>, ht1 <dbl>
## # ℹ Use `colnames()` to see all variable names

# Functions

63 changes: 41 additions & 22 deletions terpenes_indicators.R
@@ -1,7 +1,7 @@
#' ---
#' title: "Identifying indicator terpenes for assessments, treatments, and resistance classes"
#' author: "Beau Larkin"
#' date: "2022-12-09"
#' author: "Beau Larkin\n"
#' date: "Last updated: `r format(Sys.time(), '%d %B, %Y')`"
#' output:
#' github_document:
#' toc: true
@@ -26,9 +26,10 @@
#' sometimes be difficult to interpret; for example, when indicators are found for groupings of control and treatment
#' seedlings.
#'
#' Finally, the function `strassoc()` is used to produce bootstrapped confidence intervals on indicator's strength of
#' Finally, the function `strassoc()` is used to produce bootstrapped confidence intervals on indicators' strength of
#' association to groups. The additional post-hoc test hopefully reduces the need for or concern over p-value corrections to
#' `multipatt()`.
#' `multipatt()`. The bootstrapped statistics and confidence intervals also allow the production of figures using
#' `ggplot()` for easy interpretation.
#'
#' Indicator species analysis here is run on subsets of the seedling response data:
#'
@@ -65,9 +66,10 @@ sapply(data, function(x)
head(x, 2))
#'
#' # Functions
#' Two wrapper functions are used to produce summaries of the function `multipatt()`. Two functions
#' Two wrapper functions are used to produce summaries of the functions `multipatt()` and `strassoc()`. Two functions
#' are needed because with the pre-rust assessment, only treatments are considered within each
#' resistance class. Post-rust, assessments and treatments must be considered within each resistance class.
#'
#' ## Pre-rust function
#+ pre-rust function
indVal_prerust_ci <- data.frame()
@@ -140,7 +142,7 @@ indic_post <- function(rc, a, p=999, nb=999) {
indVal <- multipatt(
X,
Y$treatment,
control = how(nperm = 999)
control = how(nperm = p)
)

ind_compounds <- indVal$sign %>%
@@ -189,6 +191,10 @@ indic_pre("MGR")
#' Abietic acid identified as an indicator in all treatment seedlings, as opposed to controls,
#' with decent specificity and very high fidelity.
#'
#' #### Summary
#' **Abietic acid is a consistent indicator of symbiont-treated seedlings across the board
#' in the pre_rust assessment.**
#'
#' ### Plot of indicators and confidence intervals
#' The plot below shows indicator statistics and confidence intervals on single-group comparisons.
#' The statistic shown may not match a significant pooled-group statistic if one was found using
@@ -197,7 +203,8 @@ indic_pre("MGR")
#+ ggstyle,echo=FALSE
source("gg_style.txt")
#+ indVal_prerust_plot,echo=FALSE,fig.dim=c(9,6)
ggplot(indVal_prerust_ci, aes(x = treatment, y = stat)) +
indVal_prerust_ci %>%
ggplot(aes(x = treatment, y = stat)) +
facet_grid(rows = vars(compound), cols = vars(resistance_class)) +
geom_hline(yintercept = 0, linetype = "dotted") +
geom_pointrange(aes(ymin = lowerCI, ymax = upperCI)) +
@@ -232,26 +239,33 @@ indic_post("MGR", "rust_ctrl")
#' No terpenes are indicators for one treatment group.
#' Levopiramic and neoabietic are indicators for treatments, as opposed to controls.
#'
#' **Constitutive terpenes after rust inoculation segregate along whether seedlings were treated with symbionts. There is minimal effect
#' #### Summary
#' - **Constitutive terpenes after rust inoculation segregate along whether seedlings were treated with symbionts. There is minimal effect
#' of resistance class. Levopiramic, neoabietic, and dehydroabietic are consistent terpenes in this group.**
#' - **Abietic and palustric acids were associated with EMF (both EMF and EMF+FFE treatments) across resistance classes
#' among the rust controls.**
#'
#' ### Plot of indicators and confidence intervals
#' The plot below shows indicator statistics and confidence intervals on single-group comparisons.
#' The statistic shown may not match a significant pooled-group statistic if one was found using
#' `multipatt()`. Confidence intervals are based on bootstrap replication in `strassoc()` (n=1000).
#' Confidence intervals which overlap zero mean that the statistic is non-significant.
#' Confidence intervals which overlap zero mean that the statistic is non-significant. Zero-overlapping
#' CIs are shown as red on the plot.
#+ indVal_rustctrl_plot,echo=FALSE,fig.dim=c(9,16)
indVal_postrust_ci %>%
filter(assessment == "rust_ctrl") %>%
ggplot(aes(x = treatment, y = stat)) +
mutate(color0 = case_when(lowerCI == 0 ~ "nosig", TRUE ~ "sig")) %>%
ggplot(aes(x = treatment, y = stat, color = color0)) +
facet_grid(rows = vars(compound), cols = vars(resistance_class)) +
geom_hline(yintercept = 0, linetype = "dotted") +
geom_pointrange(aes(ymin = lowerCI, ymax = upperCI)) +
labs(x = "",
y = "Single-group indicator statistic",
title = "Indicator terpenes in post-rust inoculation seedlings, rust controls") +
scale_color_manual(name = "", values = c("red", "black")) +
theme_bw() +
theme_bgl
theme_bgl +
theme(legend.position = "none")
#'
#' ### Rust-inoculated seedlings
#' #### Indicators in QDR seedlings
@@ -272,29 +286,34 @@ indic_post("MGR", "rust_inoc")
#' No terpenes are indicators for one treatment group.
#' Only ocimene is associated with symbiont treatments, as opposed to controls.
#'
#' **Terpenes in the post-rust, induced seedlings also often segregate along symbiont treatments vs. controls.
#' #### Summary
#' - **Terpenes in the post-rust, induced seedlings also often segregate along symbiont treatments vs. controls.
#' Consistent indicators in this group include ocimene and abietic acid.**
#' - **Results among resistance classes were spottier here. Where they were identified, indicators' patterns
#' were similar among resistance classes, but in several cases, indicators were only identified for one or two
#' resistance classes.**
#' - **Note that rust inoculation is a big hammer on terpenes. An indicator analysis performed on treatment=control
#' and rust_ctrl vs. rust_inoc assessments revealed that 21 of 26 terpenes were indicators, most of rust_inoc
#' (not shown). This didn't seem an interesting result given the lethality of this disease.**
#'
#' #' ### Plot of indicators and confidence intervals
#' ### Plot of indicators and confidence intervals
#' The plot below shows indicator statistics and confidence intervals on single-group comparisons.
#' The statistic shown may not match a significant pooled-group statistic if one was found using
#' `multipatt()`. Confidence intervals are based on bootstrap replication in `strassoc()` (n=1000).
#' Confidence intervals which overlap zero mean that the statistic is non-significant.
#' Confidence intervals which overlap zero mean that the statistic is non-significant. Zero-overlapping
#' CIs are shown as red on the plot.
#+ indVal_rustinoc_plot,echo=FALSE,fig.dim=c(9,12)
indVal_postrust_ci %>%
filter(assessment == "rust_inoc") %>%
ggplot(aes(x = treatment, y = stat)) +
mutate(color0 = case_when(lowerCI == 0 ~ "nosig", TRUE ~ "sig")) %>%
ggplot(aes(x = treatment, y = stat, color = color0)) +
facet_grid(rows = vars(compound), cols = vars(resistance_class)) +
geom_hline(yintercept = 0, linetype = "dotted") +
geom_pointrange(aes(ymin = lowerCI, ymax = upperCI)) +
labs(x = "",
y = "Single-group indicator statistic",
title = "Indicator terpenes in post-rust inoculation seedlings, rust treated") +
scale_color_manual(name = "", values = c("red", "black")) +
theme_bw() +
theme_bgl



# IN the previous text, identify the terpenes that hit for only EMF groups....there are a couple.

# Consider a way to look at differences based on rust inoculation...in all resistance classes but only control treatments
theme_bgl +
theme(legend.position = "none")
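
One detail of the colour mapping above may be worth a sketch. With an unnamed `values` vector, `scale_color_manual()` matches colours to the levels present in the data in order, so a subset containing only `"sig"` rows would likely be drawn in red. The version below uses a named vector instead. It is an illustration under assumptions: the helper `plot_ci()`, the toy data frame `ci`, and the `<= 0` comparison are hypothetical and not taken from the repository.

```r
library(ggplot2)

# Hypothetical helper: flag intervals that reach zero and colour them red
plot_ci <- function(df) {
  df$color0 <- ifelse(df$lowerCI <= 0, "nosig", "sig")
  ggplot(df, aes(x = treatment, y = stat, color = color0)) +
    geom_hline(yintercept = 0, linetype = "dotted") +
    geom_pointrange(aes(ymin = lowerCI, ymax = upperCI)) +
    # Named values keep the mapping stable even if one level is absent
    scale_color_manual(name = "", values = c(nosig = "red", sig = "black")) +
    theme_bw() +
    theme(legend.position = "none")
}

# Made-up indicator statistics with bootstrap interval bounds
ci <- data.frame(
  treatment = c("Control", "EMF", "FFE"),
  stat      = c(0.20, 0.60, 0.80),
  lowerCI   = c(0.00, 0.30, 0.50),
  upperCI   = c(0.50, 0.90, 1.00)
)

p_all <- plot_ci(ci)                     # both levels present
p_sig <- plot_ci(ci[ci$lowerCI > 0, ])   # only "sig" rows remain
```

With named values, the second plot keeps its points black rather than inheriting the first colour in the vector.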
79 changes: 57 additions & 22 deletions terpenes_indicators.md
@@ -2,14 +2,17 @@ Identifying indicator terpenes for assessments, treatments, and
resistance classes
================
Beau Larkin
2022-12-09

Last updated: 16 December, 2022

- <a href="#description" id="toc-description">Description</a>
- <a href="#package-and-library-installation"
id="toc-package-and-library-installation">Package and library
installation</a>
- <a href="#data" id="toc-data">Data</a>
- <a href="#functions" id="toc-functions">Functions</a>
- <a href="#pre-rust-function" id="toc-pre-rust-function">Pre-rust
function</a>
- <a href="#post-rust-function" id="toc-post-rust-function">Post-rust
function</a>
- <a href="#results" id="toc-results">Results</a>
@@ -27,6 +30,9 @@ Beau Larkin
indicators and confidence intervals</a>
- <a href="#rust-inoculated-seedlings"
id="toc-rust-inoculated-seedlings">Rust-inoculated seedlings</a>
- <a href="#plot-of-indicators-and-confidence-intervals-2"
id="toc-plot-of-indicators-and-confidence-intervals-2">Plot of
indicators and confidence intervals</a>

# Description

@@ -54,9 +60,11 @@ for example, when indicators are found for groupings of control and
treatment seedlings.

Finally, the function `strassoc()` is used to produce bootstrapped
confidence intervals on indicator’s strength of association to groups.
confidence intervals on indicators’ strength of association to groups.
The additional post-hoc test hopefully reduces the need for or concern
over p-value corrections to `multipatt()`.
over p-value corrections to `multipatt()`. The bootstrapped statistics
and confidence intervals also allow the production of figures using
`ggplot()` for easy interpretation.
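
The two-step workflow described here, a permutation test with `multipatt()` followed by bootstrapped intervals from `strassoc()`, might look roughly like the sketch below. The data are simulated and the names `groups`, `X`, `iv`, and `sa` are hypothetical; the `IndVal.g` statistic and the small permutation and bootstrap counts are assumptions chosen to keep the example quick, not settings confirmed by the report.

```r
library(indicspecies)  # multipatt(), strassoc()
library(permute)       # how()

set.seed(42)

# Hypothetical treatments, 10 seedlings each
groups <- factor(rep(c("Control", "EMF", "FFE"), each = 10))

# Made-up compound abundances; compound_a is scarcer in controls
X <- data.frame(
  compound_a = c(rexp(10, rate = 5), rexp(20, rate = 1)),
  compound_b = rexp(30, rate = 1)
)

# Step 1: indicator analysis with a permutation test
iv <- multipatt(X, groups, func = "IndVal.g", control = how(nperm = 199))
summary(iv)

# Step 2: bootstrapped confidence intervals on single-group associations
sa <- strassoc(X, groups, func = "IndVal.g", nboot.ci = 199)
sa$stat     # point estimates, compounds by groups
sa$lowerCI  # lower confidence bounds
sa$upperCI  # upper confidence bounds
```

The three matrices returned in step 2 can then be reshaped into a long data frame for plotting point ranges.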

Indicator species analysis here is run on subsets of the seedling
response data:
@@ -152,14 +160,17 @@ sapply(data, function(x)
## # … with 16 more variables: nc5 <dbl>, pbr5 <dbl>, br5 <dbl>, ss5 <dbl>,
## # dm4 <dbl>, sv4 <dbl>, ss4 <dbl>, dm3 <dbl>, sv3 <dbl>, vig3 <dbl>,
## # bi3 <dbl>, nc3 <dbl>, pbr3 <dbl>, br3 <dbl>, ss3 <dbl>, ht1 <dbl>
## # ℹ Use `colnames()` to see all variable names

# Functions

Two wrapper functions are used to produce summaries of the function
`multipatt()`. Two functions are needed because with the pre-rust
assessment, only treatments are considered within each resistance class.
Post-rust, assessments and treatments must be considered within each
resistance class. \## Pre-rust function
resistance class.

## Pre-rust function

``` r
indVal_prerust_ci <- data.frame()
@@ -234,7 +245,7 @@ indic_post <- function(rc, a, p=999, nb=999) {
indVal <- multipatt(
X,
Y$treatment,
control = how(nperm = 999)
control = how(nperm = p)
)

ind_compounds <- indVal$sign %>%
@@ -357,6 +368,11 @@ indic_pre("MGR")
Abietic acid identified as an indicator in all treatment seedlings, as
opposed to controls, with decent specificity and very high fidelity.

#### Summary

**Abietic acid is a consistent indicator of symbiont-treated seedlings
across the board in the pre_rust assessment.**

### Plot of indicators and confidence intervals

The plot below shows indicator statistics and confidence intervals on
@@ -491,7 +507,7 @@ indic_post("MGR", "rust_ctrl")
## Group Control+FFE+EMF #sps. 2
## A B stat p.value
## ocimene 0.9735 1.0000 0.987 0.001 ***
## a_terpineol 0.8177 1.0000 0.904 0.002 **
## a_terpineol 0.8177 1.0000 0.904 0.001 ***
##
## Group EMF+FFE+EMF #sps. 2
## A B stat p.value
@@ -509,10 +525,15 @@ indic_post("MGR", "rust_ctrl")
No terpenes are indicators for one treatment group. Levopiramic and
neoabietic are indicators for treatments, as opposed to controls.

**Constitutive terpenes after rust inoculation segregate along whether
seedlings were treated with symbionts. There is minimal effect of
resistance class. Levopiramic, neoabietic, and dehydroabietic are
consistent terpenes in this group.**
#### Summary

- **Constitutive terpenes after rust inoculation segregate along whether
seedlings were treated with symbionts. There is minimal effect of
resistance class. Levopiramic, neoabietic, and dehydroabietic are
consistent terpenes in this group.**
- **Abietic and palustric acids were associated with EMF (both EMF and
EMF+FFE treatments) across resistance classes among the rust
controls.**

### Plot of indicators and confidence intervals

@@ -521,7 +542,8 @@ single-group comparisons. The statistic shown may not match a
significant pooled-group statistic if one was found using `multipatt()`.
Confidence intervals are based on bootstrap replication in `strassoc()`
(n=1000). Confidence intervals which overlap zero mean that the
statistic is non-significant.
statistic is non-significant. Zero-overlapping CIs are shown as red on
the plot.

![](terpenes_indicators_files/figure-gfm/indVal_rustctrl_plot-1.png)<!-- -->

@@ -629,16 +651,29 @@ indic_post("MGR", "rust_inoc")
No terpenes are indicators for one treatment group. Only ocimene is
associated with symbiont treatments, as opposed to controls.

**Terpenes in the post-rust, induced seedlings also often segregate
along symbiont treatments vs. controls. Consistent indicators in this
group include ocimene and abietic acid.**

\#\### Plot of indicators and confidence intervals The plot below
shows indicator statistics and confidence intervals on single-group
comparisons. The statistic shown may not match a significant
pooled-group statistic if one was found using `multipatt()`. Confidence
intervals are based on bootstrap replication in `strassoc()` (n=1000).
Confidence intervals which overlap zero mean that the statistic is
non-significant.
#### Summary

- **Terpenes in the post-rust, induced seedlings also often segregate
along symbiont treatments vs. controls. Consistent indicators in this
group include ocimene and abietic acid.**
- **Results among resistance classes were spottier here. Where they were
identified, indicators’ patterns were similar among resistance
classes, but in several cases, indicators were only identified for one
or two resistance classes.**
- **Note that rust inoculation is a big hammer on terpenes. An indicator
analysis performed on treatment=control and rust_ctrl vs. rust_inoc
assessments revealed that 21 of 26 terpenes were indicators, most of
rust_inoc (not shown). This didn’t seem an interesting result given
the lethality of this disease.**

### Plot of indicators and confidence intervals

The plot below shows indicator statistics and confidence intervals on
single-group comparisons. The statistic shown may not match a
significant pooled-group statistic if one was found using `multipatt()`.
Confidence intervals are based on bootstrap replication in `strassoc()`
(n=1000). Confidence intervals which overlap zero mean that the
statistic is non-significant. Zero-overlapping CIs are shown as red on
the plot.

![](terpenes_indicators_files/figure-gfm/indVal_rustinoc_plot-1.png)<!-- -->
Binary file modified terpenes_indicators_files/figure-gfm/indVal_prerust_plot-1.png
Binary file modified terpenes_indicators_files/figure-gfm/indVal_rustctrl_plot-1.png
Binary file modified terpenes_indicators_files/figure-gfm/indVal_rustinoc_plot-1.png