1. The Framework of iWGCNA

<aside> 💡

iWGCNA comprises six primary analytical modules:

  1. Data Upload: Accepts gene expression matrices in .txt, .csv, or .tsv format.
  2. Preprocessing: Conducts quality control with outlier removal and filters for highly variable genes to improve downstream signal detection.
  3. Network Construction: Computes gene–gene correlations, converts them into a topological overlap matrix (TOM), and identifies gene groups using hierarchical clustering.
  4. Gene Group–Trait Association: Quantifies relationships between gene groups and phenotypic or clinical traits.
  5. Hub Gene Identification: Visualizes gene group–level interactions and highlights putative hub genes within each group.
  6. Functional Enrichment Analysis: Performs GO and KEGG enrichment to characterize the biological functions associated with each gene group. </aside>


1.1 Data Upload

<aside> 💡

Supported formats:

iWGCNA accepts gene expression matrices in .txt, .tsv, or .csv format. The first column should contain gene identifiers, and each subsequent column should represent one sample. The matrix values may be either

  1. raw read counts or
  2. normalized expression values such as TPM, CPM, FPKM, or RPKM.

| Gene | Sample1 | Sample2 | Sample3 | … | SampleN |
| --- | --- | --- | --- | --- | --- |
| gene1 | 123 | 110 | 95 | 128 | 142 |
| gene2 | 56 | 62 | 49 | 52 | 61 |
| gene3 | 890 | 945 | 1012 | 978 | 1101 |
| … | 201 | 220 | 195 | 208 | 232 |
| geneN | 34 | 30 | 29 | 33 | 40 |

</aside>
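The layout described above can be parsed with a few lines of code. The following is a minimal sketch, not the loader iWGCNA itself uses: the function name `read_expression_matrix` and the extension-to-delimiter mapping are assumptions made for illustration (in particular, `.txt` is assumed to be tab-delimited).

```python
import csv
import io

# Assumed delimiter for each supported extension (.txt is assumed tab-delimited).
DELIMITERS = {".csv": ",", ".tsv": "\t", ".txt": "\t"}


def read_expression_matrix(handle, extension):
    """Parse a gene-by-sample matrix into (sample_names, {gene: [values]})."""
    reader = csv.reader(handle, delimiter=DELIMITERS[extension])
    header = next(reader)
    samples = header[1:]  # the first column holds gene identifiers
    matrix = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        gene, values = row[0], [float(v) for v in row[1:]]
        if len(values) != len(samples):
            raise ValueError(
                f"{gene}: expected {len(samples)} values, got {len(values)}"
            )
        matrix[gene] = values
    return samples, matrix


# Toy input mirroring the layout above (three genes, three samples).
example = io.StringIO(
    "Gene\tSample1\tSample2\tSample3\n"
    "gene1\t123\t110\t95\n"
    "gene2\t56\t62\t49\n"
    "gene3\t890\t945\t1012\n"
)
samples, matrix = read_expression_matrix(example, ".tsv")
print(samples)
print(matrix["gene2"])
```

Checking that every row has one value per sample catches truncated or misaligned files before they reach the preprocessing step.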

<aside> 💡

Supported Organisms:

| Category | Organism | Notes |
| --- | --- | --- |
| Human | Homo sapiens | Human transcriptome-related studies |
| Animal | Mus musculus | Mammalian model organism |
| Animal | Rattus norvegicus | Mammalian model organism |
| Animal | Danio rerio | Vertebrate developmental model |
| Animal | Drosophila melanogaster | Invertebrate genetic model |
| Animal | Caenorhabditis elegans | Invertebrate developmental model |
| Plant | Arabidopsis thaliana | Model plant species |
| Fungi | Saccharomyces cerevisiae | Model unicellular eukaryote |
| Protist | Trichomonas vaginalis | Human parasitic protist |
| Protist | Plasmodium falciparum | Malaria parasite |

</aside>

1.2 Preprocessing

<aside> 💡

Expression matrix filtering

1. Raw Read Count Data Ⓐ

Apply the variance-stabilizing transformation (VST) from DESeq2 to reduce mean–variance dependence.

2. Already Normalized Data (FPKM/RPKM/TPM/CPM) Ⓑ

Use the original values directly, or apply a log transformation, log10(x + 1).

3. Low-Expression Noise Removal

Remove genes with consistently low expression (e.g., counts <10 in >90% of samples).

Thresholds should reflect sequencing depth, sample size, and study objectives.

4. Variable Gene Selection

Select the most variable genes using either the median absolute deviation (MAD) or the variance (VAR) across samples.

Retaining the Top N variable genes improves network robustness and computational efficiency.

5. Outlier Detection and Removal

Remove sample outliers (e.g., via hierarchical clustering) to avoid distortion of network results.

</aside>
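The filtering steps above can be sketched in plain Python. This is an illustrative sketch only: the function names and default thresholds are assumptions taken from the examples in the text (counts below 10 in more than 90% of samples), the DESeq2 VST step is not reproduced here, and sample outlier detection is omitted.

```python
import math
import statistics


def log_transform(matrix):
    """Apply log10(x + 1) to already-normalized values (TPM, FPKM, ...)."""
    return {g: [math.log10(v + 1) for v in vals] for g, vals in matrix.items()}


def drop_low_expression(matrix, min_count=10, max_low_fraction=0.9):
    """Drop genes below min_count in more than max_low_fraction of samples."""
    kept = {}
    for gene, vals in matrix.items():
        low_fraction = sum(v < min_count for v in vals) / len(vals)
        if low_fraction <= max_low_fraction:
            kept[gene] = vals
    return kept


def mad(values):
    """Median absolute deviation (unscaled)."""
    med = statistics.median(values)
    return statistics.median(abs(v - med) for v in values)


def top_variable_genes(matrix, n):
    """Keep the n genes with the largest MAD across samples."""
    ranked = sorted(matrix, key=lambda g: mad(matrix[g]), reverse=True)
    return {g: matrix[g] for g in ranked[:n]}


# Hypothetical toy counts: four genes measured in four samples.
counts = {
    "geneA": [120, 95, 300, 210],  # expressed and variable
    "geneB": [0, 2, 1, 0],         # low in every sample
    "geneC": [50, 52, 51, 49],     # expressed but nearly constant
    "geneD": [15, 400, 30, 900],   # expressed and highly variable
}
filtered = drop_low_expression(counts)
selected = top_variable_genes(filtered, n=2)
print(sorted(selected))  # ['geneA', 'geneD']
```

MAD is used here because it is less sensitive to a single extreme sample than the variance; swapping `mad` for `statistics.variance` gives the VAR option.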


1.3 Network Construction

1.3.1 Soft-Threshold Power Evaluation (β)

<aside> 💡

Calculate Soft-Threshold Power

Identify an appropriate β value that maximizes scale-free topology fit (R²) while maintaining adequate mean connectivity.

Select β from the tested range so that the network topology is robust and gene group (module) detection is stable.

</aside>
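To make the trade-off between scale-free fit and mean connectivity concrete, the sketch below evaluates a range of candidate powers on a toy correlation matrix. It is a simplified illustration under several assumptions: an unsigned adjacency |cor|^β, equal-width binning of connectivity with empty bins skipped, an R² that ignores the sign of the fitted slope, and a hypothetical `r2_cutoff` of 0.85 that is not taken from iWGCNA. All function names are illustrative.

```python
import math


def connectivity(cor, beta):
    """Row sums of the unsigned adjacency |cor|^beta, excluding the diagonal."""
    n = len(cor)
    return [
        sum(abs(cor[i][j]) ** beta for j in range(n) if j != i) for i in range(n)
    ]


def scale_free_r2(k, n_bins=10):
    """R^2 of a straight-line fit of log10 p(k) against log10 k."""
    lo, hi = min(k), max(k)
    if hi == lo:
        return 0.0  # every node has the same connectivity: nothing to fit
    width = (hi - lo) / n_bins
    bins = [[] for _ in range(n_bins)]
    for value in k:
        idx = min(int((value - lo) / width), n_bins - 1)
        bins[idx].append(value)
    xs, ys = [], []
    for members in bins:
        if members:  # empty bins are skipped (a simplification)
            mean_k = sum(members) / len(members)
            if mean_k > 0:
                xs.append(math.log10(mean_k))
                ys.append(math.log10(len(members) / len(k)))
    if len(xs) < 3:
        return 0.0
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy ** 2 / (sxx * syy)


def evaluate_powers(cor, powers):
    """Return (beta, R^2, mean connectivity) for each candidate power."""
    results = []
    for beta in powers:
        k = connectivity(cor, beta)
        results.append((beta, scale_free_r2(k), sum(k) / len(k)))
    return results


def pick_power(results, r2_cutoff=0.85):
    """Lowest beta whose R^2 reaches the (assumed) cutoff, or None."""
    for beta, r2, _mean_k in results:
        if r2 >= r2_cutoff:
            return beta
    return None


# Synthetic correlation matrix, for demonstration only.
n = 30
cor = [[1.0 / (1 + abs(i - j)) for j in range(n)] for i in range(n)]
results = evaluate_powers(cor, powers=range(1, 11))
for beta, r2, mean_k in results:
    print(f"beta={beta:2d}  R2={r2:.3f}  mean connectivity={mean_k:.3f}")
```

Mean connectivity falls as β rises, which is why the lowest power that reaches an acceptable fit is usually preferred over the power with the highest R².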


1.3.2 Gene Group (Module) Detection

<aside> 💡

Workflow

1. Adjacency Matrix Construction

Pairwise gene correlations are calculated and converted into an adjacency matrix using the selected network type and soft-threshold power (β).

2. Topological Overlap Matrix (TOM) Calculation

A TOM is generated to capture shared connectivity patterns among genes.

TOM structure (unsigned, signed, or hybrid) is automatically determined by the selected network type.

3. Hierarchical Clustering

Genes are clustered based on TOM dissimilarity to define initial gene groups.

4. Minimum Gene Size Filtering

Gene groups smaller than the user-defined minimum size are merged with the nearest clusters.

5. Gene Group Merging Threshold (Cut-Tree Height)

Remaining groups are merged according to the selected cut-tree height threshold to produce the final gene group assignments.

</aside>
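Steps 1 and 2 of the workflow above can be sketched as follows. The adjacency definitions and the unsigned topological overlap formula, (l_ij + a_ij) / (min(k_i, k_j) + 1 - a_ij), follow the standard WGCNA formulation. The function names are illustrative, and how iWGCNA maps each network type to a TOM type is not reproduced here. Hierarchical clustering and group merging (steps 3 to 5) are omitted.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def adjacency(expr, beta, network_type="unsigned"):
    """Soft-thresholded adjacency; expr is a list of per-gene vectors."""
    n = len(expr)
    adj = [[0.0] * n for _ in range(n)]  # diagonal kept at zero for the TOM step
    for i in range(n):
        for j in range(i + 1, n):
            r = pearson(expr[i], expr[j])
            if network_type == "unsigned":
                a = abs(r) ** beta
            elif network_type == "signed":
                a = ((1 + r) / 2) ** beta
            elif network_type == "signed hybrid":
                a = r ** beta if r > 0 else 0.0
            else:
                raise ValueError(f"unknown network type: {network_type}")
            adj[i][j] = adj[j][i] = a
    return adj


def tom_similarity(adj):
    """Unsigned topological overlap between every pair of genes."""
    n = len(adj)
    k = [sum(row) for row in adj]  # connectivity of each gene
    tom = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            shared = sum(
                adj[i][u] * adj[u][j] for u in range(n) if u not in (i, j)
            )
            value = (shared + adj[i][j]) / (min(k[i], k[j]) + 1 - adj[i][j])
            tom[i][j] = tom[j][i] = value
    return tom


# Hypothetical toy data: genes 0 and 1 co-vary, as do genes 2 and 3.
expr = [
    [1, 2, 3, 4, 5],
    [2, 4, 5, 8, 11],
    [5, 1, 4, 2, 3],
    [9, 2, 8, 5, 5],
]
tom = tom_similarity(adjacency(expr, beta=6))
dissimilarity = [[1 - v for v in row] for row in tom]  # input to clustering
print(round(tom[0][1], 3), round(tom[0][2], 3))
```

Genes that share neighbours as well as a direct connection receive a high overlap, so the dissimilarity `1 - TOM` separates the two co-varying pairs much more sharply than the raw correlations do.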


1.4 Gene Group–Trait Association

1.4.1 Trait/Phenotype Format

| sample | treat | control |
| --- | --- | --- |
| Sample1 | 1 | 0 |
| Sample2 | 1 | 0 |
| Sample3 | 1 | 0 |
| Sample4 | 0 | 1 |
| Sample5 | 0 | 1 |
| Sample6 | 0 | 1 |
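
As an illustration of how a trait table in this format can be related to a gene group, the sketch below parses the table and correlates each trait column with a per-sample summary value for one group. Several things here are assumptions: the whitespace-delimited input, the function names, the use of Pearson correlation, and the `eigengene` values, which are hypothetical numbers rather than output from iWGCNA.

```python
import math

TRAIT_TEXT = """sample treat control
Sample1 1 0
Sample2 1 0
Sample3 1 0
Sample4 0 1
Sample5 0 1
Sample6 0 1
"""


def parse_traits(text):
    """Return (sample_names, {trait_name: [values]}) from whitespace-delimited text."""
    lines = [line.split() for line in text.strip().splitlines()]
    header, rows = lines[0], lines[1:]
    samples = [row[0] for row in rows]
    traits = {
        name: [float(row[i]) for row in rows]
        for i, name in enumerate(header)
        if i > 0  # column 0 holds the sample names
    }
    return samples, traits


def pearson(x, y):
    """Pearson correlation of two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


samples, traits = parse_traits(TRAIT_TEXT)

# Hypothetical summary profile of one gene group, one value per sample.
eigengene = [0.42, 0.38, 0.45, -0.40, -0.44, -0.41]

for name, values in traits.items():
    print(name, round(pearson(eigengene, values), 3))
```

Because `control` is the exact complement of `treat` in this table, the two correlations have equal magnitude and opposite sign.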

1.4.2 Trait/Phenotype Table Preview