NEXTXEN - LMS Education

Syllabus Overview

NGS Data Analysis: 4-Week Program

Week 1 — RNA-Seq Data Analysis

Class 1: Introduction to NGS & NGS Data Types

Objective: Understand NGS technologies, applications, and data formats

✓ What is NGS? Sequencing platforms (Illumina, ONT, PacBio)

✓ Applications: Genomics, Transcriptomics, Metagenomics, Epigenomics

✓ NGS workflow overview

✓ Infrastructure requirements for NGS data analysis

✓ NCBI SRA/ENA databases, metadata

Hands-on:

✓ Linux basics (navigation, permissions, pipes, grep, awk)

✓ Navigate NCBI SRA, ENA

✓ Using SRA Toolkit: prefetch, fasterq-dump

✓ Download sample datasets manually

Class 2: Introduction to RNA-seq & Experimental Design

Objective: Understand RNA-seq fundamentals

✓ Overview of RNA-seq: concepts, applications, and data generation

✓ Bulk vs. single-cell, library types, replicates, platforms

Hands-on: Download a publicly available RNA-seq dataset.

Class 3: RNA-seq QC & Preprocessing

Objective: Perform QC & trimming specifically for transcriptomics

✓ Introduction to tools and pipelines

✓ FASTQ structure and quality metrics

✓ Quality assessment using:

- FastQC, fastp & MultiQC

✓ Key metrics to review: adapter content, overrepresented sequences, duplication rate, GC content, etc.

✓ Adapter/quality trimming

✓ Trimming tools: Trimmomatic, fastp, Trim Galore

Hands-on:

✓ FASTQ QC, read trimming with Trimmomatic/fastp, post-trimming QC evaluation
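To make the FASTQ structure and quality metrics in this class concrete, here is a minimal sketch in plain Python. It assumes well-formed 4-line records and Phred+33 quality encoding; the two reads are made up for illustration, and the function names are hypothetical helpers, not part of FastQC or fastp.

```python
from statistics import mean


def parse_fastq(lines):
    """Yield (read_id, sequence, quality_string) from 4-line FASTQ records.

    Assumes every record is exactly four lines: header, sequence, '+', quality.
    """
    it = iter(lines)
    for header, seq, _plus, qual in zip(it, it, it, it):
        yield header.strip()[1:], seq.strip(), qual.strip()


def phred_scores(qual, offset=33):
    """Convert a quality string to integer Phred scores (Phred+33 assumed)."""
    return [ord(char) - offset for char in qual]


def gc_fraction(seq):
    """Fraction of G and C bases in a read (0.0 for an empty read)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


# Two invented reads, for illustration only.
EXAMPLE = """@read1
GATTACAGGC
+
IIIIIIIIII
@read2
ATATATATAT
+
##########
""".splitlines()

for read_id, seq, qual in parse_fastq(EXAMPLE):
    print(read_id, mean(phred_scores(qual)), gc_fraction(seq))
```

In this encoding the character "I" corresponds to Phred 40 and "#" to Phred 2, which is why the second read would be a candidate for quality trimming.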

Class 4: Alignment & Quantification

Objective: Map reads & quantify gene expression

✓ Brief overview of alignment tools: HISAT2 and STAR (splice-aware alignment), Salmon/Kallisto (pseudo-alignment)

Hands-on:

✓ Alignment & Quantification: featureCounts / HTSeq
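Quantification tools report counts per sample, and downstream analysis expects a single genes-by-samples matrix. The sketch below shows one way to merge per-sample counts in plain Python. The gene and sample names and all counts are invented, and the input dictionaries stand in for whatever was parsed from the quantification output; this is not the featureCounts or HTSeq file format itself.

```python
def build_count_matrix(per_sample):
    """Merge {sample: {gene: count}} into (genes, samples, matrix).

    Rows follow sorted gene order, columns follow sample insertion order.
    A gene missing from a sample is filled with 0.
    """
    genes = sorted({gene for counts in per_sample.values() for gene in counts})
    samples = list(per_sample)
    matrix = [[per_sample[s].get(g, 0) for s in samples] for g in genes]
    return genes, samples, matrix


# Hypothetical per-sample counts, for illustration only.
COUNTS = {
    "control_1": {"geneA": 120, "geneB": 0, "geneC": 45},
    "treated_1": {"geneA": 300, "geneC": 40, "geneD": 7},
}

genes, samples, matrix = build_count_matrix(COUNTS)
print("gene\t" + "\t".join(samples))
for gene, row in zip(genes, matrix):
    print(gene + "\t" + "\t".join(str(value) for value in row))
```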

Class 5: Differential Expression & Functional Analysis

Objective: Identify differentially expressed genes (DEGs) and interpret them

✓ Short summary of what differential expression is and why it matters in clinical diagnostics and disease interpretation

Hands-on:

✓ DESeq2, edgeR, volcano plots, GO/KEGG pathway analysis
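A volcano plot places each gene at (log2 fold change, -log10 adjusted p-value) and highlights the genes that pass both thresholds. The sketch below illustrates that classification step in plain Python. The cutoffs (|log2FC| >= 1, adjusted p < 0.05) are common choices assumed here, not values fixed by the course, and the gene results are invented rather than real DESeq2 or edgeR output.

```python
import math


def classify(log2fc, padj, lfc_cutoff=1.0, padj_cutoff=0.05):
    """Label a gene 'up', 'down', or 'ns' (not significant)."""
    if padj < padj_cutoff and log2fc >= lfc_cutoff:
        return "up"
    if padj < padj_cutoff and log2fc <= -lfc_cutoff:
        return "down"
    return "ns"


def volcano_point(log2fc, padj):
    """Return the (x, y) coordinates of a gene on a volcano plot."""
    return log2fc, -math.log10(padj)


# Hypothetical results: gene -> (log2 fold change, adjusted p-value).
RESULTS = {
    "geneA": (2.3, 0.0004),
    "geneB": (-1.8, 0.01),
    "geneC": (3.0, 0.20),   # large change but not significant
    "geneD": (0.2, 0.001),  # significant but small change
}

for gene, (lfc, padj) in RESULTS.items():
    x, y = volcano_point(lfc, padj)
    print(gene, classify(lfc, padj), round(x, 2), round(y, 2))
```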

Assignment for week 1: Download a publicly available RNA-seq dataset, perform complete preprocessing (QC + trimming), align or quantify the reads (HISAT2/STAR or Salmon/Kallisto), and generate a final count matrix along with a brief summary of differential expression and pathway results.

Week 2 — Whole Exome Sequencing (WES) Analysis: Germline & Somatic Variants

Class 1: Introduction to WES, Target Capture & Applications

Objective: Understand WES workflows and applications

✓ Overview of WES, target enrichment strategies, and its clinical and research applications.

✓ Applications of WES in disease research, diagnostics, and personalized medicine.

✓ Comparison of WES with whole-genome sequencing (WGS).

Class 2: QC, Trimming & Alignment (BWA-MEM)

Objective: Preprocess and align exome data

✓ Overview of paired-end sequencing data and differences in germline vs. somatic workflows.

✓ FastQC, Trimmomatic, BWA-MEM

✓ Post-alignment processing: sorting, marking duplicates, and base recalibration (BQSR) using GATK.

✓ Introduction to SAM/BAM file formats and their handling.
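When handling SAM/BAM files, the FLAG column is a bitwise field whose bits record properties such as pairing, strand, and duplicate status. The sketch below decodes a FLAG value using the bit meanings from the SAM specification; the short label strings are my own naming, chosen for readability.

```python
# Bit values follow the SAM specification; the labels are informal names.
SAM_FLAGS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse_strand",
    0x20: "mate_reverse_strand",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the labels of all bits set in a SAM FLAG value."""
    return [name for bit, name in SAM_FLAGS.items() if flag & bit]


def is_duplicate(flag):
    """True if the read carries the duplicate bit (0x400)."""
    return bool(flag & 0x400)


for flag in (99, 147, 1107):
    print(flag, decode_flag(flag))
```

Flags 99 and 147 are the typical values for the two mates of a properly paired read; 1107 is the same as 83 with the duplicate bit added, which is the bit set during duplicate marking.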

Class 3: Germline & Somatic Variant Calling (GATK)

Objective: Call both germline and somatic variants

✓ Germline: HaplotypeCaller, joint genotyping

✓ Somatic: Mutect2, panel of normals, filtering and refining somatic variant calls

Class 4: Variant Annotation, Prioritization & Reporting

Objective: Interpret variants and build reports

✓ Annotate variants using ANNOVAR, VEP, and SnpEff to determine their functional impact.

✓ Interpret pathogenicity scores such as SIFT, PolyPhen, CADD, and MutationTaster.

✓ Use key databases (gnomAD, ClinVar, COSMIC) to understand population frequency and clinical relevance.

✓ Prioritize important variants based on pathogenicity, frequency, and functional significance, and review them in IGV.
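The prioritization step above combines population frequency with pathogenicity evidence. The following sketch shows one possible filter in plain Python. The records, field names, gene labels, and thresholds (allele frequency <= 0.01, CADD >= 20) are all assumptions made for illustration; they are not the output format of ANNOVAR, VEP, or SnpEff, and real cutoffs depend on the disease model.

```python
# Hypothetical annotated variants; every value below is invented.
VARIANTS = [
    {"gene": "GENE_A", "consequence": "missense", "gnomad_af": 0.0001,
     "cadd": 27.5, "clinvar": "Pathogenic"},
    {"gene": "GENE_B", "consequence": "synonymous", "gnomad_af": 0.25,
     "cadd": 3.1, "clinvar": "Benign"},
    {"gene": "GENE_C", "consequence": "stop_gained", "gnomad_af": 0.0,
     "cadd": 38.0, "clinvar": "Uncertain significance"},
    {"gene": "GENE_D", "consequence": "missense", "gnomad_af": 0.03,
     "cadd": 24.0, "clinvar": "Likely benign"},
]


def prioritize(variants, max_af=0.01, min_cadd=20.0):
    """Keep rare variants with pathogenicity support, highest CADD first.

    A variant is kept if its population allele frequency is at most max_af
    and it either reaches min_cadd or has a (likely) pathogenic ClinVar label.
    """
    kept = [
        v for v in variants
        if v["gnomad_af"] <= max_af
        and (v["cadd"] >= min_cadd or "pathogenic" in v["clinvar"].lower())
    ]
    return sorted(kept, key=lambda v: v["cadd"], reverse=True)


for variant in prioritize(VARIANTS):
    print(variant["gene"], variant["consequence"], variant["cadd"])
```

Variants that survive a filter like this are the ones worth inspecting visually in IGV before they go into a report.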

Assignment for week 2: Download a publicly available WES dataset and complete the full workflow: run QC + trimming, align reads with BWA-MEM, call either germline or somatic variants using GATK, and finally annotate and prioritize key variants using ANNOVAR/VEP/SnpEff, summarizing the biologically relevant ones in a short report.

Week 3 — Metagenomic Data Analysis

Class 1: Introduction to Metagenomics

Objective: Understand metagenomics concepts & experimental workflows

✓ Shotgun metagenomics vs. 16S/ITS amplicon sequencing

✓ Microbiome study design

✓ Challenges: contamination, low biomass, host reads

✓ Reference databases: SILVA, Greengenes, GTDB

Hands-on: Explore metagenomic datasets in SRA

Class 2: QC, Host Removal & Taxonomic Profiling

Objective: Clean metagenomic reads and classify organisms

✓ QC using FastQC/fastp

✓ Host read removal using Bowtie2

✓ Taxonomic classification:

- Kraken2, MetaPhlAn, Kaiju

Hands-on:

✓ Perform host-filtering

✓ Run Kraken2/MetaPhlAn classification

Class 3: Functional Profiling & Diversity Analysis

Objective: Explore functional capabilities and diversity metrics

✓ HUMAnN3 for functional/metabolic profiling

✓ Diversity metrics: alpha, beta

✓ Visualization: Krona, heatmaps, PCA

✓ QIIME2 workflows

Hands-on:

✓ Run QIIME2 pipeline (import → denoise → taxonomy assignment)
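Alpha diversity summarizes diversity within one sample and beta diversity compares samples. As a rough illustration, the sketch below computes the Shannon index (a common alpha metric) and Bray-Curtis dissimilarity (a common beta metric) in plain Python. The choice of these two metrics and the taxon counts are assumptions for the example; QIIME2 offers these and many others.

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p * ln p) over taxa with nonzero counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in proportions)


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two count vectors of equal length.

    0 means identical composition, 1 means no shared taxa.
    """
    if len(a) != len(b):
        raise ValueError("count vectors must have the same length")
    denominator = sum(a) + sum(b)
    if denominator == 0:
        return 0.0
    return sum(abs(x - y) for x, y in zip(a, b)) / denominator


# Invented counts for four taxa in two samples.
sample_1 = [50, 30, 20, 0]
sample_2 = [10, 10, 40, 40]

print("Shannon sample_1:", round(shannon_index(sample_1), 3))
print("Shannon sample_2:", round(shannon_index(sample_2), 3))
print("Bray-Curtis:", round(bray_curtis(sample_1, sample_2), 3))
```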

Class 4: Metagenomic Assembly & Interpretation

Objective: Assemble metagenomes and interpret biological meaning

✓ Metagenomic assembly using MEGAHIT / metaSPAdes

✓ Binning: MaxBin2, MetaBAT

✓ MAG quality check: CheckM

✓ Reporting results

Hands-on:

✓ Assemble a sample using MEGAHIT

Assignment for week 3: Download a real metagenomic dataset, perform QC and host-read removal, generate both taxonomic and functional profiles using tools like Kraken2/MetaPhlAn and HUMAnN3 or QIIME2, and submit a brief summary including diversity plots or key taxa identified.

Week 4 — Genome Assembly & Annotation

Class 1: Introduction to Genome Assembly

Objective: Understand assembly principles and strategies

✓ De novo vs. reference-guided assembly

✓ Illumina vs. long-read (ONT/PacBio) assemblies

✓ Metrics: N50, L50, coverage, completeness

✓ Popular assemblers (SPAdes, Unicycler, Flye)

Hands-on: Explore assembly tools & datasets
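N50 and L50 are easiest to understand by computing them once by hand. N50 is the length of the contig at which the running total, taken from longest to shortest, first reaches half of the assembly size; L50 is the number of contigs needed to get there. The sketch below uses invented contig lengths and is a plain-Python illustration, not how QUAST is implemented.

```python
def n50_l50(contig_lengths):
    """Return (N50, L50) for a list of contig lengths.

    Contigs are sorted longest first; N50 is the length of the contig at
    which the cumulative sum reaches at least half the total assembly
    length, and L50 is how many contigs were needed.
    """
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for count, length in enumerate(lengths, start=1):
        running += length
        if running >= half_total:
            return length, count
    return 0, 0  # empty assembly


# Invented contig lengths (bp): total 300, half 150.
contigs = [100, 80, 60, 40, 20]
n50, l50 = n50_l50(contigs)
print("N50 =", n50, "L50 =", l50)
```

For these lengths the two longest contigs (100 + 80) are the first to cover half of the 300 bp total, so N50 is 80 and L50 is 2.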

Class 2: De Novo Genome Assembly (Hands-On Focus)

Objective: Assemble a draft genome

✓ SPAdes / MEGAHIT workflows

✓ Contigs, scaffolds, k-mer strategies

✓ QC of assembled genome using QUAST

Hands-on:

✓ Assemble bacterial/viral genome

✓ Assess assembly metrics
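Assemblers such as SPAdes and MEGAHIT start by breaking reads into overlapping k-mers, so the choice of k affects how repeats and low-coverage regions are resolved. The sketch below only illustrates the k-mer counting idea on a toy sequence; it is not an assembler, and the sequence is invented.

```python
from collections import Counter


def count_kmers(sequence, k):
    """Count all overlapping k-mers in a sequence.

    A sequence of length n contains n - k + 1 k-mers (none if n < k).
    """
    sequence = sequence.upper()
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


toy_read = "ATGATGA"
for k in (3, 5):
    counts = count_kmers(toy_read, k)
    print("k =", k, dict(counts))
```

With k = 3 the k-mers ATG and TGA each occur twice in the toy read, while with k = 5 every k-mer is unique, which hints at why larger k values help separate repeats at the cost of needing higher coverage.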

Class 3: Genome Annotation (Structural & Functional)

Objective: Annotate genomes using automated pipelines

✓ Prokka (bacteria/virus annotation)

✓ Alternatives: RAST, Bakta

✓ Predicting CDS, tRNAs, rRNAs

✓ Functional annotation: COG, KEGG, Pfam

Hands-on:

✓ Annotate genome using Prokka

Class 4: Comparative Genomics & Downstream Applications

Objective: Analyze assembled genomes for biological insights

✓ Multiple sequence alignment (MAFFT, Clustal Omega)

✓ Phylogenetic analysis (IQ-TREE)

✓ Identifying SNPs/indels across strains

✓ Pan-genome analysis (Roary)

✓ Applications:

- Vaccine design (epitope prediction)

- Drug target identification

Hands-on:

✓ Run a phylogenetic analysis; identify conserved regions
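Once strains are aligned (for example with MAFFT or Clustal Omega), SNPs, indels, and conserved regions can be read off column by column. The sketch below assumes the sequences are already aligned to equal length with "-" as the gap character; the three strain sequences are invented, and the function names are hypothetical helpers.

```python
def classify_columns(aligned):
    """Label each alignment column as 'conserved', 'snp', or 'indel'."""
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    labels = []
    for column in zip(*aligned):
        bases = set(column)
        if "-" in bases:
            labels.append("indel")
        elif len(bases) > 1:
            labels.append("snp")
        else:
            labels.append("conserved")
    return labels


def conserved_runs(labels):
    """Return (start, end) pairs, 0-based inclusive, of conserved stretches."""
    runs, start = [], None
    for i, label in enumerate(labels):
        if label == "conserved":
            if start is None:
                start = i
        elif start is not None:
            runs.append((start, i - 1))
            start = None
    if start is not None:
        runs.append((start, len(labels) - 1))
    return runs


# Three invented, pre-aligned strain sequences.
ALIGNMENT = [
    "ATGCCGTA",
    "ATGACGTA",
    "ATGCCG-A",
]

labels = classify_columns(ALIGNMENT)
print(labels)
print(conserved_runs(labels))
```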

Assignment for week 4: Select a real microbial genome dataset, perform a full de novo assembly using SPAdes/MEGAHIT, annotate the assembled genome with Prokka, and generate a brief comparative genomics summary including key metrics (N50, completeness), major genes, and a phylogenetic placement.

Course Details

NGS Data Analysis for Biologists
