plastro.simulate_realistic_dataset
- plastro.simulate_realistic_dataset(n_cell_types: int = 6, cells_per_type: int = 100, n_genes: int = 20, noise_level: float = 0.2, seed: int | None = None, show_plots: bool = False) → AnnData
Generate a realistic single-cell dataset for plasticity testing.
Convenience function that combines tree generation and sampling to create a complete synthetic dataset with realistic parameters for testing plasticity simulation algorithms.
- Parameters:
n_cell_types (int, optional) – Number of terminal cell types to generate, by default 6.
cells_per_type (int, optional) – Approximate number of cells per cell type, by default 100.
n_genes (int, optional) – Number of genes (dimensions) in expression space, by default 20.
noise_level (float, optional) – Amount of noise in trajectories (0-1), by default 0.2.
seed (int, optional) – Random seed for reproducibility, by default None.
show_plots (bool, optional) – Whether to display plots during dataset generation, by default False.
- Returns:
Complete annotated dataset ready for plasticity analysis.
- Return type:
AnnData
Examples
>>> from plastro import simulate_realistic_dataset
>>>
>>> # Generate a standard test dataset
>>> adata = simulate_realistic_dataset(
...     n_cell_types=8,
...     cells_per_type=150,
...     n_genes=25,
...     seed=42,
...     show_plots=True  # Display terminal branch plots
... )
>>>
>>> # Visualize the dataset
>>> import scanpy as sc
>>> sc.pl.umap(adata, color=['branch', 'leiden'])
Notes
This function provides sensible defaults for most plasticity simulation experiments and ensures reproducible results when a seed is provided.
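The reproducibility guarantee can be illustrated with a small stand-alone sketch. This is not plastro's implementation: `toy_dataset` is a hypothetical helper that borrows only the parameter names from the signature above and uses NumPy's seeded generator to show what "reproducible when a seed is provided" means in practice. It also assumes exactly `cells_per_type` cells per type, whereas the real function describes that count as approximate.

```python
import numpy as np


def toy_dataset(n_cell_types=6, cells_per_type=100, n_genes=20,
                noise_level=0.2, seed=None):
    """Hypothetical stand-in: one Gaussian cluster per cell type."""
    # A seeded generator yields the same random stream on every call.
    rng = np.random.default_rng(seed)
    # One random center per cell type in gene-expression space.
    centers = rng.normal(size=(n_cell_types, n_genes))
    # Integer label for every cell: 0,0,...,1,1,...
    labels = np.repeat(np.arange(n_cell_types), cells_per_type)
    # Each cell is its type's center plus scaled Gaussian noise.
    noise = noise_level * rng.normal(size=(labels.size, n_genes))
    return centers[labels] + noise, labels


X1, labels1 = toy_dataset(seed=42)
X2, _ = toy_dataset(seed=42)
X3, _ = toy_dataset(seed=7)

print(X1.shape)                # (600, 20) with the defaults
print(np.array_equal(X1, X2))  # True: same seed, same data
print(np.array_equal(X1, X3))  # False: different seed, different data
```

With `seed=None`, each call draws fresh entropy, so repeated runs are expected to differ; passing an explicit integer is what makes two runs comparable.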