plastro.phenotype_simulation.generate_ad

plastro.phenotype_simulation.generate_ad(sample_structure: Tuple, n_dim: int, show_plots: bool = False) → AnnData

Generate a complete AnnData object with simulated single-cell data.

Creates a simulated single-cell dataset with realistic gene expression patterns, a UMAP embedding, clustering annotations, and the metadata needed for studying cellular plasticity and differentiation.

Parameters:
  • sample_structure (Tuple) – Binary tree structure from create_random_binary_tree.

  • n_dim (int) – Number of dimensions (genes) in the expression space.

  • show_plots (bool, optional) – Whether to display plots during computation, by default False.

Returns:

Complete annotated dataset containing:

  • X: Gene expression matrix (n_cells × n_genes)

  • obs: Cell metadata with ground truth, branch labels, colors

  • obsm: Dimensionality reductions (UMAP, diffusion components)

  • uns: Cluster colors and other metadata

Return type:

AnnData
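
A quick way to check the layout described above is to list what each slot holds. The sketch below is illustrative and not part of plastro: `summarize_adata` is a hypothetical helper, and it is run against a small stand-in object so it can be tried without the package installed. The key names in the stand-in (`X_umap`, `branch_colors`) follow scanpy conventions and are assumptions about what `generate_ad` stores; only `branch` appears in the example on this page.

```python
from types import SimpleNamespace


def summarize_adata(adata):
    """Hypothetical helper: report the size and slot keys of an AnnData-like object."""
    return {
        "n_cells": adata.n_obs,
        "n_genes": adata.n_vars,
        "obs": sorted(adata.obs.keys()),    # per-cell metadata columns
        "obsm": sorted(adata.obsm.keys()),  # embeddings such as UMAP
        "uns": sorted(adata.uns.keys()),    # unstructured metadata such as colors
    }


# Stand-in exposing the same attribute names as AnnData.
# All values and most key names here are made up for illustration.
demo = SimpleNamespace(
    n_obs=3,
    n_vars=2,
    obs={"branch": ["a", "a", "b"]},
    obsm={"X_umap": [[0.0, 0.1], [0.2, 0.3], [0.4, 0.5]]},
    uns={"branch_colors": ["#1f77b4", "#ff7f0e"]},
)

summary = summarize_adata(demo)
print(summary)
```

The helper only touches `n_obs`, `n_vars`, and the `keys()` of `obs`, `obsm`, and `uns`, so it should work the same way when passed the object returned by `generate_ad`.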

Examples

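>>> # Assumes create_random_binary_tree is importable from the same module as generate_ad
>>> from plastro.phenotype_simulation import create_random_binary_tree, generate_ad
>>> structure = create_random_binary_tree(n_leaves=6, sample_res=100)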
>>> adata = generate_ad(structure, n_dim=20)
>>> print(f"Generated {adata.n_obs} cells with {adata.n_vars} genes")
>>>
>>> # Visualize the simulated data
>>> import scanpy as sc
>>> sc.pl.umap(adata, color='branch')
>>>
>>> # Generate with plots enabled
>>> adata_with_plots = generate_ad(structure, n_dim=20, show_plots=True)

Notes

The generated dataset includes:

  • Realistic branching trajectories in gene expression space

  • Ground truth probability densities for each cell

  • UMAP coordinates for visualization

  • Leiden clustering annotations

  • Color maps for consistent plotting

  • Diffusion components for plasticity analysis

This provides a complete testing framework for plasticity algorithms with known ground truth cellular relationships.
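
One plausible way to use the ground truth is to check how well a method's per-cell scores preserve its ordering. The sketch below computes a Spearman rank correlation in plain Python; it is an illustration, not plastro's evaluation procedure. This page does not name the `obs` column that holds the ground truth, so both input lists are made-up values standing in for that column and for a method's output.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the two rank vectors.

    Undefined (division by zero) if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Made-up per-cell values: ground truth vs. a method's scores.
ground_truth = [0.1, 0.4, 0.35, 0.8, 0.6]
predicted = [0.2, 0.5, 0.3, 0.9, 0.7]

rho = spearman(ground_truth, predicted)
print(f"Spearman correlation: {rho:.3f}")
```

Here the two lists order the cells identically, so the correlation is 1. With SciPy available, `scipy.stats.spearmanr` computes the same statistic.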