Geospatial Data Science: Statistics and Machine Learning

Rating: 3.9 (210 reviews)
Author: Michael Miller

Vector data analysis in Python with GeoPandas, statsmodels, and Scikit-learn

Updated: May 2024
Language: English [CC]
Category: Development, Data Science and AI ML, Machine Learning
Lectures: 61
Duration: 12 hours

Course Description

In this course I demonstrate open-source Python packages for the analysis of vector-based geospatial data. I use Jupyter Notebooks as an interactive Python environment. GeoPandas is used for reading and storing geospatial data, exploratory data analysis, preparing data for use in statistical models (feature engineering, dealing with outliers and missing data, etc.), and simple plotting. Statsmodels is used for statistical inference, as it provides more detail on the explanatory power of individual explanatory variables and a framework for model selection. Scikit-learn is used for machine learning applications, as it includes many advanced machine learning algorithms as well as tools for cross-validation, regularization, assessing model performance, and more.

This is a project-based course. I use real data related to biodiversity in Mexico and walk through the entire process from both a statistical inference and a machine learning perspective. I use linear regression as the basis for developing a conceptual understanding of the methodology, and then also discuss Poisson regression, logistic regression, decision trees, random forests, k-NN classification, and unsupervised classification methods such as k-means clustering.

Throughout the course, the focus is on geospatial data and special considerations for spatial data such as spatial joins, map plotting, and dealing with spatial autocorrelation.
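
Spatial autocorrelation is commonly summarized with global Moran's I. The pure-Python sketch below is one minimal way to compute it. The four-cell neighbor matrix and the values are made up for illustration, and the course itself may use a different statistic or library.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and a square spatial weights matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    # Cross-products of deviations, counted only for neighboring pairs.
    num = sum(weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / w_sum) * (num / den)

# Hypothetical example: four cells in a row, each adjacent to its neighbors.
weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = [1.0, 2.0, 3.0, 4.0]    # similar values sit next to each other
alternating = [1.0, 4.0, 1.0, 4.0]  # neighbors are as different as possible

i_clustered = morans_i(clustered, weights)
i_alternating = morans_i(alternating, weights)
print(i_clustered, i_alternating)
```

Positive values indicate that neighboring observations tend to be similar, and negative values indicate that they tend to differ. Strong autocorrelation in model residuals is a sign that the independence assumption behind ordinary regression may not hold.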

Important concepts, including model selection, maximum likelihood estimation, and the differences between statistical inference and machine learning, are explained conceptually in a manner intended for geospatial professionals rather than statisticians.
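
To make the link between maximum likelihood and model selection concrete, here is a minimal sketch that compares an intercept-only model with a straight-line model using AIC under a Gaussian likelihood. The data and the helper names (fit_line, gaussian_aic) are hypothetical, and AIC is only one of several criteria that could be used.

```python
import math

def fit_line(x, y):
    """Least-squares intercept and slope (also the Gaussian maximum likelihood fit)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope

def gaussian_aic(residuals, n_params):
    """AIC = 2k - 2 log L, with the variance set to its maximum likelihood estimate."""
    n = len(residuals)
    sigma2 = sum(r * r for r in residuals) / n
    loglik = -0.5 * n * (math.log(2 * math.pi * sigma2) + 1)
    return 2 * n_params - 2 * loglik

# Hypothetical data with a roughly linear trend.
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]

# Model A: intercept only (parameters: mean, variance).
mean_y = sum(y) / len(y)
aic_null = gaussian_aic([b - mean_y for b in y], n_params=2)

# Model B: straight line (parameters: intercept, slope, variance).
intercept, slope = fit_line(x, y)
aic_line = gaussian_aic([b - (intercept + slope * a) for a, b in zip(x, y)], n_params=3)

print(aic_null, aic_line)  # the lower AIC is preferred
```

The extra slope parameter costs a penalty of 2 in the AIC, so the line is preferred only when it improves the log-likelihood by more than that penalty.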

Goals

What will you learn in this course:

Basic concepts of statistical modeling
Pandas tools for data preparation
Feature engineering methods
Linear Regression
Logistic Regression
Other supervised classification methods (CART, k-NN, SVM, etc.)
Unsupervised classification methods
Non-parametric regression

Prerequisites

What are the prerequisites for this course?

You should be familiar with Python, GeoPandas, and Jupyter Notebooks and have a working environment. This knowledge can be gained through my courses "Survey of Python for GIS applications" and "Geospatial Data Science with Python: GeoPandas".
You should have some familiarity with basic statistics, especially linear regression.

Curriculum

Check out the detailed breakdown of what’s inside the course

Introduction
3 Lectures

Introduction 06:37
What is machine learning? 05:20
About this course 08:50

Basic concepts in statistical modeling
11 Lectures

Data Preparation
6 Lectures

Data Analysis - Regression
9 Lectures

More complex regression models
4 Lectures

Categorical response variables with Logistic Regression
8 Lectures

Categorical response variables with decision trees and random forests
6 Lectures

K-Nearest Neighbors classification
2 Lectures

Support Vector Machines Classification
3 Lectures

Unsupervised classification with k-Means
2 Lectures

Machine learning project
2 Lectures

Where to go next?
1 Lecture

Additional material
3 Lectures
