This lesson is still being designed and assembled (Pre-Alpha version)

R for plant science: Glossary

Key Points

Before we start
  • R is a programming language. RStudio is a user-friendly interface for coding in R.

  • R projects help you organize your work and keep file paths simple.

  • You can get help inside RStudio, either from the Help pane or by typing ? followed by a function name in the console.

  • There are many resources available online for R help.

Introduction to R
  • Arithmetic operators (+, -, *, /, ^) perform calculations on numbers.

  • The assignment operator (<-) assigns values to objects.

  • Objects have types like numeric, character, or logical.

  • R has built-in functions that perform common operations.

  • The subset operator ([ ]) lets you select data by position or value.

  • NAs represent missing values.
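
The points above can be tried out in a short console session. This is only a minimal sketch: the object names and the height values are made up for illustration and do not come from the lesson data.

```r
# Arithmetic operators work on numbers
3 + 5 * 2                      # multiplication is evaluated before addition

# The assignment operator stores a value in an object
height_cm <- 12.5              # hypothetical plant height
height_mm <- height_cm * 10

# Objects have types
class(height_cm)               # "numeric"
class("Arabidopsis")           # "character"
class(height_cm > 10)          # "logical"

# Built-in functions operate on vectors
heights <- c(12.5, 9.8, NA, 15.1)   # NA marks a missing measurement
length(heights)
mean(heights, na.rm = TRUE)    # na.rm = TRUE drops NAs before averaging

# Subset by position...
heights[2]
heights[c(1, 4)]

# ...or by value, excluding missing entries explicitly
tall <- heights[!is.na(heights) & heights > 10]

# NAs propagate through calculations unless they are removed
is.na(heights)
mean(heights)                  # NA, because one value is missing
```

Note that a plain heights[heights > 10] would also return the NA, which is why the example combines the comparison with !is.na().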

Starting with data
  • R stores data tables in a structure called a data frame.

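  • Each column in a data frame must contain values of a single data type; different columns can have different types.

  • Subsetting a data frame by position works much as it does for vectors, but takes both a row index and a column index.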

  • Data frames can also be subset by names.

  • Factors store characters as integers with text labels.

  • Setting stringsAsFactors = FALSE prevents R from reading strings as factors.
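
As a rough illustration of these points, the sketch below builds a small data frame by hand rather than reading the lesson's file. The column names and values are hypothetical.

```r
# A small, made-up data frame; each column holds a single data type
plants <- data.frame(
  plant_id  = c("p1", "p2", "p3", "p4"),
  genotype  = c("wt", "mutant", "wt", "mutant"),
  height_cm = c(12.5, 9.8, 13.1, 8.4),
  stringsAsFactors = FALSE     # keep text columns as character
)
str(plants)                    # shows the type of every column

# Subset by position: [row, column]
plants[1, 3]                   # row 1, column 3
plants[1:2, ]                  # first two rows, all columns
plants[, 2]                    # all rows of column 2, returned as a vector

# Subset by name
plants$height_cm
plants[, c("plant_id", "height_cm")]

# Subset rows by value
wt_plants <- plants[plants$genotype == "wt", ]

# Convert a character column to a factor and inspect how it is stored
plants$genotype <- factor(plants$genotype)
levels(plants$genotype)        # text labels, sorted alphabetically
as.integer(plants$genotype)    # the underlying integer codes
```

The same stringsAsFactors argument can be passed to read.csv when importing a file.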

Manipulating, analyzing and exporting data with tidyverse
  • The tidyverse is built for data manipulation.

  • The read_csv function returns a tibble rather than a base R data frame.

  • The select function picks particular columns based on names.

  • The filter function picks rows based on values.

  • The mutate function creates new columns based on the value of other columns.

  • The group_by and summarize functions can be used to create summary tables.

  • The write_csv function exports tibbles into a .csv file.
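
The functions listed above can be chained into one short pipeline. The sketch below assumes the dplyr and readr packages are installed; the tibble, its column names, and the temporary output file are invented for illustration.

```r
library(dplyr)
library(readr)

# Hypothetical measurements, created in place of a real input file
plants <- tibble(
  plant_id  = c("p1", "p2", "p3", "p4"),
  genotype  = c("wt", "mutant", "wt", "mutant"),
  height_cm = c(12.5, 9.8, 13.1, 8.4)
)

# select() picks columns, filter() picks rows, mutate() adds a column
tall <- plants %>%
  select(genotype, height_cm) %>%
  filter(height_cm > 9) %>%
  mutate(height_mm = height_cm * 10)

# group_by() plus summarize() builds a summary table, one row per group
summary_tbl <- plants %>%
  group_by(genotype) %>%
  summarize(
    mean_height_cm = mean(height_cm),
    n_plants       = n()
  )

# write_csv() exports the tibble; read_csv() reads it back as a tibble
out_file <- tempfile(fileext = ".csv")
write_csv(summary_tbl, out_file)
read_back <- read_csv(out_file)
```

Writing to tempfile() keeps the example from touching the project folder; in a real analysis the path would point into the R project.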

Data visualization with ggplot2
  • Creating a ggplot requires three things: data, aesthetics, and geoms.

  • Ggplots are highly customizable.

  • Faceting splits a plot into smaller panels, one per group, which keeps each plot area uncluttered.

  • Custom and premade themes can be applied to any plot.
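
A minimal sketch of how these pieces fit together is shown below. It assumes the ggplot2 package is installed; the data frame and its column names are hypothetical.

```r
library(ggplot2)

# Made-up measurements for two genotypes under two treatments
plants <- data.frame(
  genotype      = rep(c("wt", "mutant"), each = 4),
  treatment     = rep(c("control", "drought"), times = 4),
  leaf_area_cm2 = c(6.1, 4.9, 7.2, 5.3, 5.0, 3.8, 5.6, 4.1),
  height_cm     = c(12.5, 10.2, 13.1, 10.9, 9.8, 7.9, 10.4, 8.4)
)

p <- ggplot(
  data = plants,                                    # 1. data
  mapping = aes(x = leaf_area_cm2, y = height_cm,   # 2. aesthetics
                color = genotype)
) +
  geom_point() +                                    # 3. geom
  labs(x = "Leaf area (cm2)", y = "Height (cm)") +  # customize axis labels
  facet_wrap(~ treatment) +                         # one panel per treatment
  theme_bw() +                                      # a premade theme
  theme(legend.position = "bottom")                 # a custom tweak on top

print(p)
```

Because each layer is added with +, any of the last four lines can be removed or swapped without changing the rest of the plot.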

Investigating large datasets in R
  • Linear regression is a quick and easy way to evaluate the direct relationship between two variables.

  • Information-theoretic (IT) model averaging is one approach to evaluating more complex relationships between variables.

  • There are often many valid ways to address a question in R, and understanding how to interpret R packages can help determine which approach is most appropriate.
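
To make the first point concrete, the sketch below fits a linear regression with base R's lm function on made-up data. The variable names and values are hypothetical. The last step compares two candidate models by AIC, which is only a first step toward information-theoretic model selection and is not a full model-averaging workflow.

```r
# Hypothetical measurements for eight plants
plants <- data.frame(
  leaf_area_cm2 = c(4.1, 5.3, 6.0, 7.2, 8.5, 9.1, 10.4, 11.0),
  water_ml      = c(20, 25, 22, 30, 28, 35, 33, 40),
  height_cm     = c(8.2, 9.9, 10.1, 12.4, 13.0, 14.6, 15.1, 16.8)
)

# Simple linear regression: height as a function of leaf area
fit_area <- lm(height_cm ~ leaf_area_cm2, data = plants)
summary(fit_area)              # coefficients, R-squared, p-values
coef(fit_area)                 # intercept and slope only

# A second candidate model with an additional predictor
fit_both <- lm(height_cm ~ leaf_area_cm2 + water_ml, data = plants)

# Compare the candidates by AIC (lower values indicate better support)
aic_table <- AIC(fit_area, fit_both)
aic_table
```

With real data, checking residual plots (for example with plot(fit_area)) is a sensible step before interpreting the coefficients.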

Glossary

Here are key terms