plastro.phylo.neighbor_joining
- plastro.phylo.neighbor_joining(distance_matrix: DataFrame, outgroup: str | None = None) → TreeNode
Construct a phylogenetic tree using the neighbor-joining algorithm.
Uses scikit-bio’s neighbor-joining implementation, rather than a custom one, to build an unrooted tree from a distance matrix. If an outgroup is given, it is used to root the tree.
- Parameters:
distance_matrix (pd.DataFrame) – Symmetric matrix of pairwise distances between leaves. Index and columns should contain leaf names.
outgroup (str, optional) – Name of outgroup leaf for rooting the tree, by default None.
- Returns:
Phylogenetic tree constructed using neighbor-joining.
- Return type:
ete3.TreeNode
Examples
>>> import numpy as np
>>> import pandas as pd
>>> from plastro import neighbor_joining
>>>
>>> # Create sample distance matrix
>>> cells = ['A', 'B', 'C', 'D']
>>> dists = np.random.rand(4, 4)
>>> dists = (dists + dists.T) / 2  # Make symmetric
>>> np.fill_diagonal(dists, 0)
>>> dist_df = pd.DataFrame(dists, index=cells, columns=cells)
>>>
>>> # Build tree
>>> tree = neighbor_joining(dist_df, outgroup='A')
>>> print(tree.get_ascii())
Notes
This function uses scikit-bio’s neighbor-joining implementation, which:
- Is well-tested and robust
- Runs in O(n³) time
- Handles edge cases properly
- Produces accurate phylogenetic trees
The result is converted from scikit-bio format to ETE3 format for compatibility with other PLASTRO functions.
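For readers unfamiliar with the algorithm named in the notes above, the following is a minimal NumPy sketch of textbook neighbor-joining. It is not scikit-bio's or PLASTRO's code: the function name `nj_newick` and the choice to return a Newick string are assumptions made for illustration. Each pass of the loop does O(n²) work on the Q-matrix and removes one node, which is where the O(n³) total comes from.

```python
import numpy as np


def nj_newick(dist, names):
    """Toy neighbor-joining (hypothetical helper): return an unrooted Newick string."""
    d = np.asarray(dist, dtype=float).copy()
    labels = list(names)
    if len(labels) < 3 or d.shape != (len(labels), len(labels)):
        raise ValueError("need a square matrix with at least 3 leaves")

    while len(labels) > 3:
        n = len(labels)
        row_sums = d.sum(axis=1)

        # Q[i, j] = (n - 2) * d[i, j] - r[i] - r[j]; the smallest entry is joined.
        q = (n - 2) * d - row_sums[:, None] - row_sums[None, :]
        np.fill_diagonal(q, np.inf)
        i, j = np.unravel_index(np.argmin(q), q.shape)

        # Branch lengths from the joined pair to the new internal node.
        li = 0.5 * d[i, j] + (row_sums[i] - row_sums[j]) / (2 * (n - 2))
        lj = d[i, j] - li

        # Distances from the new internal node to every remaining node.
        new_row = 0.5 * (d[i] + d[j] - d[i, j])
        keep = [k for k in range(n) if k not in (i, j)]

        # Shrink the matrix: drop rows/columns i and j, append the new node last.
        new_d = np.zeros((n - 1, n - 1))
        new_d[:-1, :-1] = d[np.ix_(keep, keep)]
        new_d[-1, :-1] = new_row[keep]
        new_d[:-1, -1] = new_row[keep]

        joined = f"({labels[i]}:{li:.4f},{labels[j]}:{lj:.4f})"
        labels = [labels[k] for k in keep] + [joined]
        d = new_d

    # Three nodes remain: attach them to a single central node.
    a, b, c = labels
    la = 0.5 * (d[0, 1] + d[0, 2] - d[1, 2])
    lb = 0.5 * (d[0, 1] + d[1, 2] - d[0, 2])
    lc = 0.5 * (d[0, 2] + d[1, 2] - d[0, 1])
    return f"({a}:{la:.4f},{b}:{lb:.4f},{c}:{lc:.4f});"


# Small additive example with five leaves.
example = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
print(nj_newick(example, ["a", "b", "c", "d", "e"]))
```

The sketch breaks ties by taking the first minimum and does nothing about negative branch lengths, which neighbor-joining can produce on non-additive data.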
- Raises:
ImportError – If scikit-bio is not installed.
ValueError – If the distance matrix is invalid or the outgroup is not found.
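The checks behind the ValueError are not spelled out here, so the sketch below only illustrates the kind of validation a caller might run before building a tree. The helper `check_nj_inputs` is hypothetical and not part of PLASTRO. Its rules (square, matching labels, finite, symmetric, zero diagonal, outgroup present) are assumptions drawn from the parameter descriptions.

```python
import numpy as np
import pandas as pd


def check_nj_inputs(distance_matrix: pd.DataFrame, outgroup=None) -> None:
    """Hypothetical pre-check: raise ValueError for inputs that look unusable."""
    rows, cols = distance_matrix.shape
    if rows != cols:
        raise ValueError("distance matrix must be square")

    # Index and columns should name the same leaves in the same order.
    if list(distance_matrix.index) != list(distance_matrix.columns):
        raise ValueError("index and columns must contain the same leaf names")

    values = distance_matrix.to_numpy(dtype=float)
    if not np.isfinite(values).all():
        raise ValueError("distances must be finite")
    if not np.allclose(values, values.T):
        raise ValueError("distance matrix must be symmetric")
    if not np.allclose(np.diag(values), 0.0):
        raise ValueError("diagonal of the distance matrix must be zero")

    if outgroup is not None and outgroup not in distance_matrix.index:
        raise ValueError(f"outgroup {outgroup!r} not found among leaf names")


# Usage: a valid matrix passes silently; a missing outgroup raises ValueError.
leaves = ["A", "B", "C"]
dist_df = pd.DataFrame(
    [[0.0, 1.0, 2.0], [1.0, 0.0, 1.5], [2.0, 1.5, 0.0]],
    index=leaves,
    columns=leaves,
)
check_nj_inputs(dist_df, outgroup="A")
```

Running a check like this before the call gives more specific error messages than a single ValueError, at the cost of duplicating validation the library may already do.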