Lecture: Introduction
What is Machine Learning?
“Machine learning (ML) is the study of computer algorithms that can improve automatically through experience and by the use of data…” - Wikipedia
Supervised Learning vs Unsupervised Learning vs Semi-Supervised Learning vs Reinforcement Learning
ML workflow: split into training and test data, fit model on training data, predict on new data, evaluate model with test data
Accuracy vs precision
Artificial Intelligence (AI) > Machine Learning (ML) > Deep Learning (DL)
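The workflow listed above (split, fit, predict, evaluate) can be sketched in a few lines. This is a minimal illustration, not the course's lab code: the data are synthetic, the "model" is a hypothetical one-parameter threshold classifier, and "accuracy vs precision" is read here as the two classification metrics (an assumption; the phrase can also refer to measurement bias versus spread).

```python
import random

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle the rows and split them into (train, test)."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def fit_threshold(train):
    """'Fit' the model: the midpoint between the two class means."""
    pos = [x for x, y in train if y == 1]
    neg = [x for x, y in train if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

def predict(threshold, x):
    """Predict class 1 above the threshold (assumes class 1 has larger x)."""
    return 1 if x > threshold else 0

def evaluate(threshold, test):
    """Return (accuracy, precision) on held-out data."""
    tp = fp = tn = fn = 0
    for x, y in test:
        y_hat = predict(threshold, x)
        if y_hat == 1 and y == 1:
            tp += 1
        elif y_hat == 1 and y == 0:
            fp += 1
        elif y_hat == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    accuracy = (tp + tn) / len(test)                      # share of all predictions that are right
    precision = tp / (tp + fp) if (tp + fp) else float("nan")  # share of predicted positives that are right
    return accuracy, precision

# Synthetic data (assumption): one predictor, two overlapping classes.
rng = random.Random(42)
data = [(rng.gauss(0.0, 1.0), 0) for _ in range(100)] + \
       [(rng.gauss(3.0, 1.0), 1) for _ in range(100)]

train, test = train_test_split(data)          # 1. split
threshold = fit_threshold(train)              # 2. fit on training data only
accuracy, precision = evaluate(threshold, test)  # 3-4. predict and evaluate on test data
print(f"threshold={threshold:.2f} accuracy={accuracy:.2f} precision={precision:.2f}")
```

The key point is that the threshold is estimated from the training rows only, so the test rows give an honest estimate of performance on new data.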
History of machine learning (source: Google Cloud)
1950: Alan Turing proposes the Turing test, originally called the “imitation game” (see also the 2014 film The Imitation Game, in which Benedict Cumberbatch plays Turing breaking German codes during WWII). It tests a machine’s ability to exhibit intelligent behavior equivalent to, or indistinguishable from, that of a human. He asked “Can machines think?” in his seminal 1950 paper “Computing Machinery and Intelligence.”
2015: Google’s AlphaGo became the first program to beat a professional player at Go, often considered the most difficult classical board game. With this win, computers had beaten human opponents in every classical board game.
Ecology and Environment applications:
individual > population > landscape > ecosystem > global
Deep learning with convolutional neural networks (CNNs; covered in the next module): applied to sounds and images
Species Distribution Modeling
Elith & Leathwick (2009): geographic -> environmental (fit) -> geographic (predict)
modeling techniques: GLM, GAM, trees (rpart, randomForest)
Ecological Niche
Grinnell: environmental conditions
Elton: environmental conditions + biotic interactions
Hutchinson: “n-dimensional hypervolume”
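One way to make the “n-dimensional hypervolume” concrete is a rectangular envelope: for each environmental variable, take the range observed at presence sites and call a new site suitable only if it falls inside every range. This is a simplifying sketch; a Hutchinsonian hypervolume need not be box-shaped, and the variable names and values below are hypothetical.

```python
def fit_envelope(presence_env):
    """Return {variable: (min, max)} over the environments of presence sites."""
    variables = presence_env[0].keys()
    return {
        v: (min(site[v] for site in presence_env),
            max(site[v] for site in presence_env))
        for v in variables
    }

def inside_envelope(envelope, site):
    """True if the site lies within the observed range on every axis."""
    return all(lo <= site[v] <= hi for v, (lo, hi) in envelope.items())

# Hypothetical environmental conditions at three sites where the species was observed.
presences = [
    {"temp_c": 12, "precip_mm": 600},
    {"temp_c": 15, "precip_mm": 900},
    {"temp_c": 18, "precip_mm": 750},
]

envelope = fit_envelope(presences)
print(envelope)
print(inside_envelope(envelope, {"temp_c": 14, "precip_mm": 700}))  # within both ranges
print(inside_envelope(envelope, {"temp_c": 25, "precip_mm": 700}))  # too warm
```

Each added variable adds one axis to the box, which is the sense in which the niche is n-dimensional.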
Lab: Species: explore
Query the Global Biodiversity Information Facility (GBIF.org) for observations
Fetch environmental data to be used as predictors
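As a rough idea of what the GBIF query step involves, the sketch below builds a request against GBIF's public occurrence search endpoint and extracts coordinates from the JSON response. It is an assumption that the lab works this way; the lab may well use a dedicated client package instead, and the species name is only an example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

GBIF_SEARCH = "https://api.gbif.org/v1/occurrence/search"

def build_query(scientific_name, limit=5):
    """Build a search URL for georeferenced occurrences of one species."""
    params = {
        "scientificName": scientific_name,
        "hasCoordinate": "true",   # keep only records with latitude/longitude
        "limit": limit,            # page size
    }
    return GBIF_SEARCH + "?" + urlencode(params)

def fetch_points(url):
    """Download one page of results as (longitude, latitude) pairs.

    Requires network access, so it is defined here but not called.
    """
    with urlopen(url) as response:
        payload = json.load(response)
    return [(r["decimalLongitude"], r["decimalLatitude"])
            for r in payload["results"]]

url = build_query("Puma concolor")  # example species, chosen arbitrarily
print(url)
```

The resulting points are the geographic input; the environmental predictors are then looked up at those coordinates.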
Reading: Zhong et al. (2021) Machine Learning: New Ideas and Tools in Environmental Science and Engineering
inputs: tabular, image, graph, text
applications: making predictions; extracting feature importance; detecting anomalies; and discovering new materials/chemicals
workflow: preparation, model development, interpretation and deployment
Lecture: Logistic Regression
Lab: Species: regress
Reading: Elith & Leathwick (2009) Species Distribution Models: Ecological Explanation and Prediction Across Space and Time
Key steps: gathering relevant data; assessing its adequacy (the accuracy and comprehensiveness of the species data; the relevance and completeness of the predictors); deciding how to deal with correlated predictor variables; selecting an appropriate modeling algorithm; fitting the model to the training data; evaluating the model including the realism of fitted response functions, the model’s fit to data, characteristics of residuals, and predictive performance on test data; mapping predictions to geographic space; selecting a threshold if continuous predictions need reduction to a binary map; and iterating the process to improve the model in light of knowledge gained throughout the process.
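Several of these steps (fit on training data, predict for grid cells, threshold to a binary map) can be sketched with a hand-rolled logistic regression. This is a toy illustration under stated assumptions: one standardized predictor, made-up presence/absence records, plain batch gradient descent, and an arbitrary 0.5 threshold. A real analysis would use a statistics library and choose the threshold deliberately.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(w, x):
    """Probability of presence; w[0] is the intercept."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Minimize the mean log-loss by batch gradient descent."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi   # derivative of log-loss w.r.t. the linear term
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w

# Hypothetical training data: standardized temperature at survey sites,
# with 1 = species present and 0 = absent.
X = [[-2.0], [-1.5], [-1.0], [-0.5], [0.5], [1.0], [1.5], [2.0]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
w = fit_logistic(X, y)

# Environmental values for three grid cells (geographic space to be mapped).
grid = [[-1.2], [0.1], [1.8]]
probabilities = [predict_proba(w, cell) for cell in grid]
binary_map = [1 if p >= 0.5 else 0 for p in probabilities]  # threshold to presence/absence
print(probabilities, binary_map)
```

Because the toy data are perfectly separable, the fitted slope keeps growing with more epochs; real presence/absence data overlap, which keeps the coefficients finite.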
Lecture: Decision Trees
Lab: Species: trees
Reading: Evans et al. (2011) Modeling Species Distribution and Change Using Random Forest
issues: complex non-linear interactions, spatial autocorrelation, high-dimensionality, non-stationarity, historic signal, anisotropy, and scale
Classification and Regression Trees versus Random Forest Algorithm (ensemble)
Model Selection (parsimony), Imbalanced Data, Model Validation
Current and Future Prediction
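The contrast between a single classification tree and a random forest ensemble can be sketched as follows. To stay short, each "tree" here is a one-split stump chosen by Gini impurity, and the forest bags stumps over bootstrap samples with a random subset of predictors; real CART and random forest implementations grow deeper trees. The data and function names are hypothetical.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity: 0 when all labels agree."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def majority(labels):
    return Counter(labels).most_common(1)[0][0]

def fit_stump(rows, labels, features):
    """Pick the (feature, threshold) split with the lowest weighted Gini impurity."""
    best = None
    for f in features:
        for t in sorted({r[f] for r in rows}):
            left = [l for r, l in zip(rows, labels) if r[f] <= t]
            right = [l for r, l in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, t, majority(left), majority(right))
    if best is None:  # no valid split: always predict the majority class
        m = majority(labels)
        return (features[0], float("inf"), m, m)
    return best[1:]

def predict_stump(stump, row):
    f, t, left_label, right_label = stump
    return left_label if row[f] <= t else right_label

def fit_forest(rows, labels, n_trees=25, seed=0):
    """Bagging: each stump sees a bootstrap sample and a random feature subset."""
    rng = random.Random(seed)
    n_features = len(rows[0])
    k = max(1, int(n_features ** 0.5))
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        feats = rng.sample(range(n_features), k)
        forest.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx], feats))
    return forest

def predict_forest(forest, row):
    """Majority vote across the ensemble."""
    return majority([predict_stump(s, row) for s in forest])

# Hypothetical sites: (temperature, precipitation); 1 = present, 0 = absent.
rows = [(5, 200), (6, 300), (7, 250), (8, 350),
        (15, 800), (16, 900), (17, 850), (18, 950)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

stump = fit_stump(rows, labels, [0, 1])
forest = fit_forest(rows, labels)
print(stump, predict_forest(forest, (4, 150)), predict_forest(forest, (20, 1000)))
```

The single stump commits to one split on one predictor, while the forest averages many slightly different splits, which is what makes ensembles less sensitive to any one sample.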
Lecture: Model Evaluation
Lab: Species: evaluate
Reading: Andrade et al. (2020) ENMTML: An R package for a straightforward construction of complex ecological niche models
Lecture: Biodiversity; Clustering
Lab: Communities: cluster
Reading: Ch. 8-9 (p.123-152) of Kindt & Coe (2005)
Ch. 8 Analysis of differences in species composition; Ch. 9 Analysis of ecological distance by clustering
Ecological Distance metrics to build matrices of between site dissimilarity based on species composition: Euclidean, Manhattan, Bray-Curtis
Hierarchical clustering in ecological context
Dendrograms
Complements ordination
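The three distance metrics and the clustering step can be sketched directly from their definitions. The site-by-species abundances below are made up, and single linkage is used only because it is the shortest to write; other linkage rules (for example average linkage) are common in ecology and may give different dendrograms.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def bray_curtis(a, b):
    """Sum of absolute abundance differences divided by total abundance."""
    return sum(abs(x - y) for x, y in zip(a, b)) / sum(x + y for x, y in zip(a, b))

def single_linkage(sites, distance):
    """Merge the two closest clusters until one remains.

    Returns the merge history as (members, distance) pairs, which is the
    information a dendrogram displays from the bottom up.
    """
    clusters = [[name] for name in sites]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # single linkage: distance between the closest pair of members
                d = min(distance(sites[p], sites[q])
                        for p in clusters[i] for q in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((sorted(clusters[i] + clusters[j]), d))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return merges

# Hypothetical abundances of three species at four sites.
sites = {
    "A": [10, 0, 5],
    "B": [8, 2, 5],
    "C": [0, 9, 1],
    "D": [1, 10, 0],
}

merges = single_linkage(sites, bray_curtis)
for members, d in merges:
    print(members, round(d, 3))
```

Here sites A and B share a dominant species, as do C and D, so the two pairs merge first and join each other only at a large Bray-Curtis distance.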
Lecture: Ordination
Lecture: Conservation Planning
Lab: Reserves