📊 Lecture 5: Advanced Statistical Analysis

Interactive notebooks for statistical testing, data visualization, FDR correction, and comprehensive biological data analysis.

🎯 Getting Started

Master statistical testing - Learn to apply appropriate tests to biological data.

Create publication-ready plots - Use Seaborn for professional visualizations.

End-to-end analysis - Complete a full biological data analysis workflow.

🔍 Handling Missing Values

Learn strategies for dealing with missing data in biological datasets

  • Identifying missing values in DataFrames
  • Strategies for handling NaN values
  • Filtering and cleaning datasets
  • Imputation techniques
  • Best practices for biological data
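
The topics above can be sketched in a few lines of pandas. This is a minimal illustration, not the notebook's own code: the table, gene names, and column names are made up, and median imputation is only one of several possible strategies.

```python
import numpy as np
import pandas as pd

# Hypothetical expression table; gene names and values are invented.
df = pd.DataFrame({
    "gene": ["geneA", "geneB", "geneC", "geneD"],
    "control": [5.1, np.nan, 7.3, 6.0],
    "treated": [6.2, 4.8, np.nan, 6.5],
})
numeric_cols = ["control", "treated"]

# Identify: count missing values in each column.
missing_per_column = df.isna().sum()

# Filter: keep only rows where both measurements are present.
complete = df.dropna(subset=numeric_cols)

# Impute: fill each column's gaps with that column's median.
imputed = df.copy()
imputed[numeric_cols] = imputed[numeric_cols].fillna(imputed[numeric_cols].median())

print(missing_per_column)
print(complete)
print(imputed)
```

Dropping rows is the more conservative choice when few values are missing; imputation keeps every gene but introduces values that were never measured, so it is worth reporting whenever it is used.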

📊 SciPy Statistical Analysis

Master statistical testing with SciPy for biological data

  • T-tests and statistical significance
  • Correlation analysis
  • Chi-square tests
  • P-values and interpretation
  • Choosing the right statistical test
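
As a rough sketch of the tests listed above, the snippet below runs a t-test, a correlation, and a chi-square test with `scipy.stats`. The measurements and the contingency table are invented for illustration, and the choice of Welch's t-test (`equal_var=False`) is an assumption rather than something the notebook prescribes.

```python
from scipy import stats

# Hypothetical measurements for two conditions (values are invented).
control = [5.1, 4.9, 5.6, 5.0, 5.3]
treated = [6.2, 6.0, 6.8, 6.1, 6.5]

# Welch's t-test: compares the two means without assuming equal variances.
t_stat, t_p = stats.ttest_ind(control, treated, equal_var=False)

# Pearson correlation between the paired values.
r, r_p = stats.pearsonr(control, treated)

# Chi-square test of independence on a hypothetical 2x2 table of counts
# (rows: genotype, columns: phenotype present / absent).
table = [[30, 10], [15, 25]]
chi2, chi_p, dof, expected = stats.chi2_contingency(table)

print(f"t = {t_stat:.2f}, p = {t_p:.4f}")
print(f"r = {r:.3f}, p = {r_p:.4f}")
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {chi_p:.4f}")
```

Which test is appropriate depends on the data: the t-test compares means of a continuous measurement, the correlation asks whether two continuous variables vary together, and the chi-square test works on counts in categories.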

🔧 Creating New DataFrames

Build and manipulate DataFrames for custom analyses

  • Creating DataFrames from scratch
  • Combining multiple data sources
  • Adding calculated columns
  • Reshaping and transforming data
  • Building analysis pipelines
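
A short sketch of these steps, assuming pandas and NumPy: it builds two tables from dictionaries, merges them, adds a calculated column, and reshapes the result. All gene names, pathway labels, and column names are hypothetical.

```python
import numpy as np
import pandas as pd

# Create DataFrames from scratch (all values are invented).
expression = pd.DataFrame({
    "gene": ["geneA", "geneB", "geneC"],
    "control": [10.0, 20.0, 8.0],
    "treated": [40.0, 10.0, 8.0],
})
annotation = pd.DataFrame({
    "gene": ["geneA", "geneB", "geneC"],
    "pathway": ["cell cycle", "apoptosis", "cell cycle"],
})

# Combine the two sources on their shared "gene" column.
merged = expression.merge(annotation, on="gene", how="left")

# Add a calculated column: log2 fold change of treated over control.
merged["log2_fc"] = np.log2(merged["treated"] / merged["control"])

# Reshape from wide to long format: one row per gene and condition.
long_form = merged.melt(
    id_vars=["gene", "pathway"],
    value_vars=["control", "treated"],
    var_name="condition",
    value_name="expression",
)

print(merged)
print(long_form)
```

The long format is the layout that plotting libraries such as Seaborn generally expect, so a reshape like this is often the last step before visualization.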

🎯 False Discovery Rate (FDR)

Control for multiple testing in genomic analyses

  • Understanding the multiple testing problem
  • Bonferroni correction
  • Benjamini-Hochberg FDR
  • Interpreting adjusted p-values
  • Application to gene expression data
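
To make the two corrections concrete, here is a minimal pure-Python sketch of Bonferroni and Benjamini-Hochberg adjusted p-values. The function names and the p-values are illustrative assumptions; in practice a library routine would normally be used instead of hand-written code.

```python
def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(p * m, 1.0) for p in p_values]


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never increase as the raw p-values get smaller.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values from five tests (invented for illustration).
p_values = [0.001, 0.008, 0.02, 0.03, 0.20]
alpha = 0.05

bonf = bonferroni(p_values)
bh = benjamini_hochberg(p_values)

print("Bonferroni significant:", sum(p < alpha for p in bonf))
print("BH significant:", sum(p < alpha for p in bh))
```

With these example values Bonferroni keeps two tests and Benjamini-Hochberg keeps four, which illustrates why FDR control is usually preferred when thousands of genes are tested at once: it tolerates a controlled fraction of false positives in exchange for more discoveries.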

🎨 Seaborn Visualization

Create publication-ready plots with Seaborn

  • Advanced plotting with Seaborn
  • Statistical visualizations
  • Customizing plot aesthetics
  • Multi-panel figures
  • Publication-quality graphics
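
A possible starting point for a two-panel figure is sketched below, assuming Seaborn 0.11 or later with matplotlib and pandas. The data, column names, theme settings, and figure size are all illustrative choices, not the notebook's own.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; not needed inside a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical long-format data: one row per measurement (values invented).
df = pd.DataFrame({
    "condition": ["control"] * 5 + ["treated"] * 5,
    "expression": [5.1, 4.9, 5.6, 5.0, 5.3, 6.2, 6.0, 6.8, 6.1, 6.5],
})

# Set a clean theme and a font scale suited to printed figures.
sns.set_theme(style="whitegrid", context="paper")

# Multi-panel figure: summary on the left, individual points on the right.
fig, axes = plt.subplots(1, 2, figsize=(7, 3))
sns.boxplot(data=df, x="condition", y="expression", ax=axes[0])
sns.stripplot(data=df, x="condition", y="expression",
              color="black", size=4, ax=axes[1])
axes[0].set_title("Distribution")
axes[1].set_title("Individual samples")
fig.tight_layout()

# Uncomment to write a high-resolution file for a manuscript:
# fig.savefig("expression_panels.png", dpi=300)
```

Showing the individual points next to the summary plot is a common choice for small sample sizes, where a box plot alone can hide how few measurements there are.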

🔢 Working with Sets

Use Python sets for efficient biological data operations

  • Set operations and theory
  • Finding unique elements
  • Set intersections and unions
  • Comparing gene lists
  • Venn diagram analysis
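
These operations map directly onto Python's built-in `set` type. The sketch below compares two hypothetical hit lists; the gene symbols are real, but their assignment to the two lists is invented for illustration.

```python
# Hypothetical hit lists from two screens; duplicates are deliberate.
hits_a = ["CDK1", "PLK1", "AURKA", "CCNB1", "PLK1"]
hits_b = ["PLK1", "CCNB1", "TP53"]

# Converting a list to a set removes duplicates.
screen_a = set(hits_a)
screen_b = set(hits_b)

shared = screen_a & screen_b   # intersection: genes found in both screens
either = screen_a | screen_b   # union: genes found in at least one screen
only_a = screen_a - screen_b   # difference: genes found only in screen A
only_b = screen_b - screen_a   # difference: genes found only in screen B

# The three region sizes of a two-set Venn diagram.
venn_counts = (len(only_a), len(shared), len(only_b))

print("Shared:", sorted(shared))
print("Venn regions (A only, both, B only):", venn_counts)
```

Set membership tests and intersections run in roughly constant time per element, which is why sets scale better than nested loops over lists when comparing long gene lists.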

🧬 End-to-End Analysis Project

Complete biological data analysis from start to finish

  • Full analysis workflow
  • Data cleaning and preparation
  • Statistical testing pipeline
  • Advanced visualizations
  • Interpreting biological results

🗺️ Learning Path

Data Preparation: Handle missing values and clean datasets

Statistical Testing: Apply SciPy tests to biological data

Visualization: Create publication-ready plots with Seaborn

Integration: Complete end-to-end analysis project

🎓 Skills You'll Master

By completing these notebooks, you'll be able to:

  • Handle and clean messy biological datasets
  • Perform appropriate statistical tests
  • Control for multiple testing with FDR
  • Create publication-quality visualizations
  • Build complete analysis pipelines
  • Interpret and communicate statistical results

📚 Part of the Python for Biologists course by Helfrid Hochegger

University of Sussex | Year 3 Biology, Biochemistry & Neuroscience