Nested Cladistic Analysis:
The Nesting Algorithm
1. Begin with an unrooted phylogenetic tree containing all observed haplotypes. This tree may also include some unobserved internal haplotypes that link together the observed haplotypes. (For example, if we observe a pair of haplotypes that differ by exactly two mutations, it is usually safe to infer that they are linked by an intermediate haplotype sharing one mutation with each.) We will consider each haplotype to be a “0-step clade.”
2. Identify the tip clades. These are the 0-step clades that are connected to only one other 0-step clade.
3. Enclose each tip clade and its neighbor with a circle or other closed curve. If two or more tip clades have the same neighbor, enclose all of them and their common neighbor with a single curve.
4. Temporarily remove all enclosed portions of the tree. If there are still any clusters of two or more adjacent 0-step clades that were not enclosed at earlier steps, repeat steps 2 through 4 until no such clusters remain.
5. Return all portions of the tree that were removed in step 4.
6. If there are any “stranded” 0-step clades that have not been enclosed, expand an adjacent enclosure to include each one of them. When deciding which adjacent enclosure to expand in this way:
• If the stranded clade is empty, expand the enclosure with the smallest sample size;
• Otherwise, expand the enclosure that best equalizes the sample size among the adjacent enclosures.
Randomly break any remaining ties.
7. Every 0-step clade should now lie within exactly one enclosure. Label these enclosures 1-1, 1-2, 1-3, etc. Consider each of these enclosures to be a 1-step clade. These 1-step clades will be the unit for the next round of nesting.
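8. Repeat steps 2 through 7 with the 1-step clades as the units, raising the nesting level by one in each round (1-step clades nest into 2-step clades, 2-step clades into 3-step clades, and so on), until a single enclosure includes the entire tree.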
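The steps above can be sketched in code. What follows is a minimal illustration, not a reference implementation. It assumes the tree is given as a dictionary mapping each clade to the set of its neighbors, plus a dictionary of sample sizes in which an empty (unobserved) haplotype has size 0. The function names nest_one_level and nest_all are hypothetical. The rule "best equalizes the sample size" is read here as "leaves the smallest gap between the largest and smallest adjacent enclosure," which is one possible interpretation.

```python
import random

def nest_one_level(adj, size, rng=random):
    """One round of nesting (steps 2-7). Returns a list of sets of clades."""
    remaining = {v: set(nb) for v, nb in adj.items()}  # working copy pruned in step 4
    enclosures = []
    while True:
        # Step 2: tips are connected to exactly one clade in what remains.
        tips = [v for v, nb in remaining.items() if len(nb) == 1]
        if not tips:
            break  # only isolated (stranded) clades are left
        # Step 3: tips sharing a neighbor go into one enclosure with it.
        groups = {}
        for t in tips:
            (n,) = remaining[t]
            groups.setdefault(n, set()).add(t)
        for n, ts in groups.items():
            if n not in remaining:  # two-clade cluster already enclosed
                continue
            enc = {n} | ts
            enclosures.append(enc)
            # Step 4: temporarily remove the enclosed portion.
            for v in enc:
                for u in remaining.pop(v):
                    if u in remaining:
                        remaining[u].discard(v)
    # Steps 5-6: with the full tree restored (adj), place each stranded clade.
    for s in list(remaining):
        cand = [i for i, enc in enumerate(enclosures) if adj[s] & enc]
        if not cand:  # degenerate case: a tree with a single clade
            enclosures.append({s})
            continue
        totals = {i: sum(size[v] for v in enclosures[i]) for i in cand}
        if size[s] == 0:
            score = totals  # empty clade: prefer the smallest enclosure
        else:
            # Assumed reading of "best equalizes": minimize max - min afterwards.
            score = {}
            for i in cand:
                after = [totals[j] + (size[s] if j == i else 0) for j in cand]
                score[i] = max(after) - min(after)
        best = min(score.values())
        pick = rng.choice([i for i in cand if score[i] == best])  # random tie-break
        enclosures[pick].add(s)
    return enclosures

def nest_all(adj, size, rng=random):
    """Step 8: repeat until one enclosure holds the whole tree.
    Returns one dict per level, mapping labels such as '1-2' to haplotype sets."""
    members = {v: frozenset([v]) for v in adj}
    levels, k = [], 0
    while len(adj) > 1:
        k += 1
        encs = nest_one_level(adj, size, rng)
        label = {v: f"{k}-{i}" for i, enc in enumerate(encs, 1) for v in enc}
        new_adj, new_size, new_members = {}, {}, {}
        for i, enc in enumerate(encs, 1):
            name = f"{k}-{i}"
            new_members[name] = frozenset().union(*(members[v] for v in enc))
            new_size[name] = sum(size[v] for v in enc)
            # Two enclosures are adjacent if any tree edge crosses between them.
            new_adj[name] = {label[u] for v in enc for u in adj[v]} - {name}
        levels.append(new_members)
        adj, size, members = new_adj, new_size, new_members
    return levels

# Hypothetical five-haplotype chain A-B-C-D-E; C is an unobserved intermediate.
chain = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C", "E"}, "E": {"D"}}
counts = {"A": 5, "B": 3, "C": 0, "D": 1, "E": 2}
for level in nest_all(chain, counts):
    print({name: sorted(haps) for name, haps in level.items()})
```

In this made-up chain, the first pass encloses A with B and E with D, which leaves C stranded. Because C is empty, it joins the D-E enclosure, whose sample size of 3 is smaller than the 8 of the A-B enclosure. The second round then joins the two 1-step clades into a single 2-step clade covering the whole tree.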