Journal Title
Title of Journal: Algorithmica
|
Abbravation: Algorithmica
|
|
|
|
|
Authors: Katharina T Huber Leo van Iersel Vincent Moulton Celine Scornavacca Taoyang Wu
Publish Date: 2015/09/14
Volume: 77, Issue: 1, Pages: 173-200
Abstract
Binets and trinets are phylogenetic networks with two and three leaves respectively Here we consider the problem of deciding if there exists a binary level1 phylogenetic network displaying a given set mathbb T of binary binets or trinets over a taxon set X and constructing such a network whenever it exists We show that this is NPhard for trinets but polynomialtime solvable for binets Moreover we show that the problem is still polynomialtime solvable for inputs consisting of binets and trinets as long as the cycles in the trinets have size three Finally we present an O3X polyX time algorithm for general sets of binets and trinets The latter two algorithms generalise to instances containing level1 networks with arbitrarily many leaves and thus provide some of the first supernetwork algorithms for computing networks from a set of rooted phylogenetic networksA key problem in biology is to reconstruct the evolutionary history of a set of taxa using data such as DNA sequences or morphological features These histories are commonly represented by phylogenetic trees and can be used for example to inform genomics studies analyse virus epidemics and understand the origins of humans 23 Even so in case evolutionary processes such as recombination and hybridization are involved it can be more appropriate to represent histories using phylogenetic networks instead of trees 2Generally speaking a phylogenetic network is any type of graph with a subset of its vertices labelled by the set of taxa that in some way represents evolutionary relationships between the taxa 13 Here however we focus on a special type of phylogenetic network called a level1 network We present the formal definition for this type of network in the next section but essentially it is a binary directed acyclic graph with a single root whose leaves correspond to the taxa and in which any two cycles are disjoint for example see Fig 2 below This type of network was first considered in 20 and is closely related to socalled galledtrees 3 7 Level1 networks have been used to for example analyse virus evolution 10 and are of practical importance since their simple structure allows for efficient construction 7 10 15 and comparison 17One of the main approaches that have been used to construct level1 networks is from triplet sets that is sets of rooted binary trees with three leaves see eg 10 18 19 22 Even so it has been observed that the set of triplets displayed by a level1 network does not necessarily provide all of the information required to uniquely define or encode the network 5 Motivated by this observation in 11 an algorithm was developed for constructing level1 networks from a network analogue of triplets rooted binary networks with three leaves or trinets This algorithm relies on the fact that the trinets displayed by a level1 network do indeed encode the network 11 Even so the algorithm was developed for dense trinet sets only ie sets in which there is a trinet associated to each combination of three taxaIn this paper we consider the problem of constructing level1 networks from arbitrary sets of level1 trinets and binets where a binet is an even simpler building block than a trinet consisting of a rooted binary network with just two leaves We consider binets as well as trinets since they can provide important information to help piece together sets of trinets Our approach can be regarded as a generalisation of the wellknown supertree algorithm called Build 1 23 for checking whether or not a set of triplets is displayed by a phylogenetic tree and constructing such a tree if this is the case In particular the algorithm we present in Sect 4 is one of the first supernetwork algorithms for constructing a phylogenetic network from a set of networks Note that some algorithms have already been developed for computing unrooted supernetworks—see for example 8 12We expect that our algorithm could be useful in practice as there are programs which can be used to compute trinets from biological data 14 25 and also binets as subnets of the computed trinets Some of these programs use optimisation criteria such as likelihood which can be very computationally expensive for large datasets but which are much more practical for small datasets Note that a similar strategy was used in the quartet puzzling 24 approach for computing phylogenetic trees from fourleaved trees or quartets based on likelihood before likelihood became more practical for larger data sets It should be noted however that most of the current programs for computing phylogenetic networks are based on the trees embedded within a network and so they might not be able to distinguish between different types of trinets 21 Hopefully the development of new models will make it possible to deal with potential difficulties in this respect Also it could be of interest to build networks from networks on slightly larger subsets such as sizefour and sizefive subets and try to merge these instead of or as well as trinets as such subsets may be more informative than sizethree onesWe now summarize the contents of the paper After introducing some basic notation in the next section in Sect 3 we begin by presenting a polynomialtime algorithm for deciding whether or not there exists some level1 network that displays a given set of level1 binets and for constructing such a network if it actually exists see Theorem 1 Then in Sect 4 we present an exponentialtime algorithm for an arbitrary set consisting of binets and trinets see Theorem 2 This algorithm uses a topdown approach that is somewhat similar in nature to the Build algorithm 1 23 but it is considerably more intricate The algorithm can be generalised to instances containing level1 networks with arbitrarily many leaves since trinets encode level1 networks 11In Sect 5 we show that for the special instance where each cycle in the input trinets has size three our exponentialtime algorithm is actually guaranteed to work in polynomial time This is still the case when the input consists of binary level1 networks with arbitrarily many leaves as long as all their cycles have length three However in Sect 6 we prove that in general it is NPhard to decide whether or not there exists a binary level1 network that displays an arbitrary set of trinets see Theorem 4 We also show that this problem remains NPhard if we insist that the network contains only one cycle Our proof is similar to the proof that it is NPhard to decide the same question for an arbitrary set of triplets given in 18 but the reduction is more complicated In Sect 7 we conclude with a discussion of some directions for future work
Keywords:
.
|
Other Papers In This Journal:
|