plastro.phenotype_simulation.sample_branch

plastro.phenotype_simulation.sample_branch(base: ndarray, velocity: ndarray, sample_structure: Tuple, curvature: float = 0.2, var_decay: float = 1.5, dens_decay: float = 0.9, n_dim: int = 15, branch_name: str = 'b') → Tuple[List[ndarray], List, List[int], List[ndarray]]

Sample cells along a differentiation branch with realistic noise structure.

Generates synthetic single-cell data along a branching trajectory that mimics cellular differentiation. Uses a physics-inspired model where cells follow curved paths through gene expression space with decreasing variance over time.

Parameters:
  • base (np.ndarray) – Starting position in gene expression space (n_dim,).

  • velocity (np.ndarray) – Initial direction vector for trajectory (n_dim,).

  • sample_structure (Tuple) – Tree structure from create_random_binary_tree defining sampling.

  • curvature (float, optional) – Amount of random curvature in the trajectory, in the range 0 to 1, by default 0.2. Higher values produce more strongly curved paths.

  • var_decay (float, optional) – Rate of variance decay along trajectory, by default 1.5. Higher values create more focused terminal populations.

  • dens_decay (float, optional) – Rate of density decay (cell loss), by default 0.9. Models cell death during differentiation.

  • n_dim (int, optional) – Number of dimensions (genes) in expression space, by default 15.

  • branch_name (str, optional) – Name identifier for this branch, by default 'b'.

Returns:

  • samples: List of cell expression matrices for each sub-branch

  • distributions: List of multivariate normal distributions used

  • n_draws: List of cell counts for each sub-branch

  • names: List of branch name arrays for each sub-branch

Return type:

Tuple[List[np.ndarray], List, List[int], List[np.ndarray]]
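The four returned lists are parallel, with one entry per sub-branch, so they can be flattened into a single expression matrix with per-cell labels. The sketch below uses stand-in arrays rather than real plastro output, and it assumes that each entry of samples has shape (n_cells, n_dim) and that each entry of names holds one label per cell; neither shape is stated explicitly above.

```python
import numpy as np

# Stand-in values shaped like the documented return value.
# These are NOT real plastro output; shapes are an assumption.
samples = [np.zeros((3, 4)), np.ones((2, 4))]
n_draws = [3, 2]
names = [np.array(["b0"] * 3), np.array(["b1"] * 2)]

# Stack all sub-branches into one (total_cells, n_dim) matrix
expression = np.vstack(samples)

# Flatten the per-branch name arrays into one label per cell
labels = np.concatenate(names)

# The cell counts should agree with both flattened arrays
assert expression.shape[0] == sum(n_draws) == labels.shape[0]
```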

Examples

>>> import numpy as np
>>> from plastro.phenotype_simulation import sample_branch
>>> base = np.zeros(10)
>>> velocity = np.ones(10)
>>> structure = (100, [])  # Simple leaf with 100 cells
>>> samples, dists, counts, names = sample_branch(base, velocity, structure, n_dim=10)

Notes

The sampling model creates realistic gene expression patterns by:

  • Adding curved random-walk behavior via the curvature parameter

  • Implementing variance decay to model cellular commitment

  • Using QR decomposition to create a proper covariance structure

  • Applying density decay to model cell loss during development
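To illustrate how QR decomposition, variance decay, and density decay can fit together, here is a minimal sketch. It is not plastro's implementation: the helper name sketch_branch_covariance, the base_var argument, the power-law form of the variance decay, the doubling of variance along the velocity axis, and the geometric form of the density decay are all assumptions chosen for illustration.

```python
import numpy as np

def sketch_branch_covariance(velocity, step, var_decay=1.5, base_var=1.0, rng=None):
    """Hypothetical helper: covariance with one axis aligned to `velocity`."""
    rng = np.random.default_rng() if rng is None else rng
    n_dim = velocity.shape[0]

    # Unit velocity in the first column, random vectors in the rest
    seed = rng.normal(size=(n_dim, n_dim))
    seed[:, 0] = velocity / np.linalg.norm(velocity)

    # QR yields an orthonormal basis whose first column is parallel
    # to the velocity direction (up to sign)
    q, _ = np.linalg.qr(seed)

    # Assumed decay form: variance shrinks as a power of the step index
    variances = np.full(n_dim, base_var / (1.0 + step) ** var_decay)
    variances[0] *= 2.0  # arbitrary choice: more spread along the path

    # Q diag(v) Q^T is symmetric with eigenvalues `variances`
    return q @ np.diag(variances) @ q.T

rng = np.random.default_rng(0)
cov = sketch_branch_covariance(np.ones(5), step=2, rng=rng)

# Assumed density decay: keep a fraction dens_decay of cells per step
n_cells = round(100 * 0.9 ** 2)
cells = rng.multivariate_normal(np.zeros(5), cov, size=n_cells)
```

Building the covariance as Q diag(v) Qᵀ guarantees a symmetric positive-definite matrix, so it is always a valid argument to a multivariate normal sampler.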