Master data exploration and visualization with pandas, matplotlib, and real gene expression data. Transform raw data into biological insights through powerful visual analysis.
Vectorization for Speed - Learn why vectorized pandas operations can run orders of magnitude faster than Python loops.
Master GroupBy - Compare gene expression across cancer types effortlessly.
Scientific Visualization - Create publication-ready figures with matplotlib.
Harness NumPy's power for lightning-fast data operations
Master exploratory data analysis techniques
Split-Apply-Combine for comparing cancer types
Create publication-quality scientific figures
Visualize how your data is distributed
Compare multiple genes in multi-panel figures
Discover relationships between genes
Compare distributions across cancer types
Part 1: Data Manipulation
Vectorization and GroupBy operations for efficient analysis
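As a rough illustration of what Part 1 covers, the sketch below contrasts a Python loop with a vectorized transform and then summarizes expression per cancer type with GroupBy. The table is synthetic, and the column names (`cancer_type`, `GENE_X`) and group labels are placeholders rather than the course dataset.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for an expression table (assumption: not the course data).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "cancer_type": rng.choice(["type_A", "type_B", "type_C"], size=300),
    "GENE_X": rng.lognormal(mean=2.0, sigma=0.5, size=300),
})

# Loop version: transform one value at a time in Python.
log_loop = [np.log2(v + 1) for v in df["GENE_X"]]

# Vectorized version: a single call over the whole column.
df["GENE_X_log2"] = np.log2(df["GENE_X"] + 1)

# Split-apply-combine: summary statistics per cancer type.
summary = df.groupby("cancer_type")["GENE_X_log2"].agg(["mean", "median", "count"])
print(summary)
```

Both versions produce the same values. The vectorized one avoids per-element Python overhead, which is where the speed difference comes from on large tables.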
Part 2: Data Inspection
Quality control and exploratory data analysis techniques
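A first inspection pass might look something like the following sketch. It uses a small synthetic table with two deliberately injected missing values; the gene names are hypothetical placeholders.

```python
import numpy as np
import pandas as pd

# Synthetic expression table: 50 samples x 3 genes (assumption: illustrative only).
rng = np.random.default_rng(1)
expr = pd.DataFrame(
    rng.lognormal(mean=2.0, sigma=0.6, size=(50, 3)),
    columns=["GENE_A", "GENE_B", "GENE_C"],
)
expr.loc[[3, 17], "GENE_B"] = np.nan  # inject two missing values for the demo

print(expr.shape)    # number of samples and genes
print(expr.dtypes)   # confirm the columns are numeric

missing = expr.isna().sum()  # missing values per gene
print(missing)

stats = expr.describe()      # count, mean, std, quartiles per gene
print(stats)
```

Checking shape, dtypes, and missing values before plotting tends to catch import problems early, for example numeric columns that were read as text.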
Part 3: Visualization
Create histograms, scatter plots, and box plots with matplotlib
By completing these notebooks, you'll be able to:
Show the distribution of a single variable
Use for: Understanding data spread, normality, and outliers
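A histogram is the usual matplotlib tool for this. The sketch below plots synthetic log-transformed values for a hypothetical `GENE_A` and marks the median; the bin count and labels are illustrative choices, not course settings.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic log2 expression values for one hypothetical gene.
rng = np.random.default_rng(2)
values = np.log2(rng.lognormal(mean=2.0, sigma=0.5, size=500) + 1)

fig, ax = plt.subplots(figsize=(5, 3.5))
counts, bin_edges, _ = ax.hist(values, bins=30, edgecolor="black")

# A dashed line at the median helps judge skew at a glance.
ax.axvline(np.median(values), linestyle="--", color="red", label="median")

ax.set_xlabel("log2(expression + 1)")
ax.set_ylabel("Number of samples")
ax.set_title("GENE_A expression (synthetic)")
ax.legend()
fig.tight_layout()
```

Call `plt.show()` to display the figure, or `fig.savefig(...)` to write it to a file.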
Reveal relationships between two genes
Use for: Finding correlations, co-regulation, and dependencies
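As a sketch of the idea, the scatter plot below compares two hypothetical genes whose synthetic values are constructed to be correlated, and reports the Pearson correlation in the title.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic, deliberately correlated expression values for two hypothetical genes.
rng = np.random.default_rng(3)
gene_a = rng.normal(5.0, 1.0, 200)
gene_b = 0.8 * gene_a + rng.normal(0.0, 0.5, 200)

# Pearson correlation coefficient between the two genes.
r = np.corrcoef(gene_a, gene_b)[0, 1]

fig, ax = plt.subplots(figsize=(4.5, 4))
ax.scatter(gene_a, gene_b, s=12, alpha=0.6)
ax.set_xlabel("GENE_A (log2 expression)")
ax.set_ylabel("GENE_B (log2 expression)")
ax.set_title(f"Pearson r = {r:.2f}")
fig.tight_layout()
```

A high correlation between two genes is consistent with co-regulation but does not establish it on its own.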
Compare distributions across groups
Use for: Comparing cancer types, identifying group differences
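A box plot with one box per group is a common way to do this. The sketch below uses synthetic values for a hypothetical `GENE_A` in three placeholder cancer types, with group means chosen arbitrarily for illustration.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Synthetic data: 40 samples per placeholder cancer type.
rng = np.random.default_rng(4)
df = pd.DataFrame({
    "cancer_type": np.repeat(["type_A", "type_B", "type_C"], 40),
    "GENE_A": np.concatenate([rng.normal(m, 1.0, 40) for m in (4.0, 5.0, 6.5)]),
})

# Split the expression values by cancer type (groupby sorts the keys).
groups = {name: g["GENE_A"].to_numpy() for name, g in df.groupby("cancer_type")}

fig, ax = plt.subplots(figsize=(5, 3.5))
ax.boxplot(list(groups.values()))  # boxes are drawn at positions 1..N
ax.set_xticks(range(1, len(groups) + 1))
ax.set_xticklabels(list(groups.keys()))
ax.set_ylabel("GENE_A (log2 expression)")
ax.set_title("GENE_A by cancer type (synthetic)")
fig.tight_layout()
```

Each box shows the median and interquartile range for one group, so shifts between cancer types are visible without inspecting individual points.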
📚 Part of the Python for Biologists course by Helfrid Hochegger
University of Sussex | Year 3 Biology, Biochemistry & Neuroscience