Character-based approaches to phylogenetic analysis

These approaches compare all sequences in the alignment simultaneously, considering one character (a site in the alignment) at a time to calculate the tree score.
The tree score is a quantitative measure of how well a particular phylogenetic tree fits the observed data.
Maximum parsimony (MP) – the tree score is the minimum number of changes required to explain the observed data.
Maximum likelihood (ML) – the tree score is the log-likelihood value; the tree with the highest likelihood best explains the observed data.
Bayesian inference – the tree score is based on the posterior probability.

Character-based approaches are computationally intensive in practice because they search through a very large number of candidate trees. Therefore, heuristic search algorithms are used most of the time.
A heuristic search algorithm starts by constructing an initial tree topology with a fast algorithm such as NJ or UPGMA, and then performs local rearrangements, such as swapping two neighboring branches or moving subtrees, to improve the tree score.
Heuristic search algorithms are not guaranteed to find the best tree topology, but they are useful for analyzing very large datasets.
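
To give a feeling for why an exhaustive search quickly becomes impractical, the short sketch below counts the possible strictly bifurcating trees for a given number of taxa, using the standard double-factorial formulas (2n - 5)!! for unrooted trees and (2n - 3)!! for rooted trees. The function names are illustrative only.

```python
def num_unrooted_trees(n_taxa: int) -> int:
    """Number of distinct unrooted, strictly bifurcating trees: (2n - 5)!!"""
    if n_taxa < 3:
        return 1
    count = 1
    # A tree with k - 1 taxa has 2k - 5 branches, and the k-th taxon
    # can be attached to any one of them.
    for k in range(4, n_taxa + 1):
        count *= 2 * k - 5
    return count


def num_rooted_trees(n_taxa: int) -> int:
    """Number of rooted bifurcating trees: (2n - 3)!!

    Rooting a tree is equivalent to attaching one extra 'root' taxon,
    so this equals the unrooted count for n + 1 taxa.
    """
    return num_unrooted_trees(n_taxa + 1)


for n in (4, 5, 10, 20):
    print(n, num_unrooted_trees(n), num_rooted_trees(n))
```

With only 10 taxa there are already more than two million unrooted trees, which is why heuristic searches are the norm for real datasets.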

Maximum Parsimony Method (MP)

The maximum parsimony method does not use any explicit model of sequence evolution. The algorithm searches the possible trees and finds the one that requires the fewest substitutions (the minimum possible number of changes in the sequences).

It was originally developed for morphological characters and later adapted for sequence data.
The parsimony criterion favors hypotheses that maximize congruence and minimize homoplasy.

This method works well for groups of similar sequences with a small amount of variation.
Relatively quick for small datasets – all possible trees may be evaluated.
The maximum parsimony tree is the tree that minimizes the tree score (the tree that requires the fewest changes overall).
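
As an illustration of how a parsimony tree score can be computed, here is a minimal sketch of the Fitch algorithm, one common way of counting the minimum number of changes on a fixed tree. The nested-tuple tree encoding, the function names and the toy alignment are assumptions made for this example, not part of any particular program.

```python
def fitch_site(tree, states):
    """Return (state_set, changes) for one alignment column on a rooted binary tree.

    tree   : a leaf name (str) or a pair (left_subtree, right_subtree)
    states : dict mapping leaf name -> observed character at this site
    """
    if isinstance(tree, str):
        # Leaf: the state set is just the observed character.
        return {states[tree]}, 0
    left_set, left_cost = fitch_site(tree[0], states)
    right_set, right_cost = fitch_site(tree[1], states)
    common = left_set & right_set
    if common:
        # The children can share a state: no extra change is needed here.
        return common, left_cost + right_cost
    # The children disagree: one change is required on this part of the tree.
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_score(tree, alignment):
    """Sum the Fitch change counts over every column of the alignment."""
    names = list(alignment)
    length = len(alignment[names[0]])
    total = 0
    for i in range(length):
        column = {name: alignment[name][i] for name in names}
        total += fitch_site(tree, column)[1]
    return total


# Hypothetical toy alignment (four taxa, five sites).
alignment = {
    "A": "AAGCT",
    "B": "AAGCA",
    "C": "AGGTA",
    "D": "AGGTT",
}
tree_1 = (("A", "B"), ("C", "D"))
tree_2 = (("A", "C"), ("B", "D"))
tree_3 = (("A", "D"), ("B", "C"))

for label, tree in [("((A,B),(C,D))", tree_1),
                    ("((A,C),(B,D))", tree_2),
                    ("((A,D),(B,C))", tree_3)]:
    print(label, parsimony_score(tree, alignment))
```

For this toy data the three topologies need 4, 6 and 5 changes respectively, so ((A,B),(C,D)) is the maximum parsimony tree.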

Not all sites in the sequences are informative for the maximum parsimony method; there are informative sites and non-informative sites.

Non-informative sites –

Constant sites – the same nucleotide occurs in all the species.
Singleton sites – only one species has a distinct nucleotide, whereas all the others share the same one. A singleton requires the same number of changes on every tree, so it cannot help to choose among topologies; such a change may also reflect a random mutation in a single lineage rather than a shared evolutionary event.

Informative sites – the site has at least two different nucleotides, and each of them occurs in at least two sequences.
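
The definitions above can be turned into a small classifier for alignment columns. This is only a sketch: the function name and the toy sequences are made up for the example, and gaps or ambiguous bases are not handled.

```python
from collections import Counter


def classify_site(column: str) -> str:
    """Classify one alignment column for parsimony purposes."""
    counts = Counter(column)
    if len(counts) == 1:
        return "constant"
    # Informative: at least two states that each occur in at least two sequences.
    shared_states = sum(1 for n in counts.values() if n >= 2)
    if shared_states >= 2:
        return "informative"
    # Variable but uninformative; singleton sites fall into this group.
    return "variable, uninformative"


# Hypothetical toy alignment: four sequences, six sites.
sequences = ["AAGCTA", "AAGCAA", "AGGTAA", "AGGTTC"]

for position, column in enumerate(zip(*sequences), start=1):
    print(position, "".join(column), classify_site("".join(column)))
```

In this toy alignment, sites 2, 3 and 5 of the six would contribute to distinguishing trees under parsimony only where they are labelled informative; constant and singleton columns cost the same on every topology.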

Advantages of Maximum Parsimony

It is based on shared derived characters, so it is a cladistic rather than a phenetic method.
Easy to describe and understand (simplicity).
Evaluates different trees.

Disadvantages of Maximum Parsimony

Long branch attraction: long branches tend to be grouped together, which can produce incorrect trees.
Does not provide information on branch lengths.
Does not use all the sequence information; only informative sites are used.
Does not correct for multiple mutations at the same site.
Does not use an explicit evolutionary model.

Long branch attraction

This occurs when two long branches (sequences with a high evolutionary rate) are separated by a short internal branch (a lineage with few changes). In this situation the parsimony method tends to join the two long branches together, which results in an incorrect tree topology. It happens because the MP method does not use an explicit evolutionary model and does not correct for multiple substitutions, so parallel changes along the long branches are mistaken for shared ancestry.

Maximum Likelihood Method

The maximum likelihood method uses an explicit model of sequence evolution. It is now widely used because of increasing computational power and the development of realistic models of sequence evolution.

The maximum likelihood principle was first proposed by the English statistician R. A. Fisher in 1922.

The model with the highest probability of generating the observed data is selected as the best model.

There are three main elements of maximum likelihood in phylogenetic analysis:

Data – the alignment of the sequences.
The tree – represents how the sequences split from one another, together with the branch lengths.
The model – the mechanism that describes the probability of observing the data.

L = Pr(D|H)

L – the likelihood of the tree topology and the model of molecular evolution; it represents the probability that the observed data were generated under that particular hypothesis, and so measures how well the tree and model fit the observed data.
Pr – probability.
D – the data: the DNA or amino acid sequence alignment used to infer the evolutionary relationships among the sequences.
H – the hypothesis: the tree together with the model of molecular evolution.

Each site in the alignment has its own likelihood.

Two optimization steps are involved in ML tree construction:

Optimization of the branch lengths to calculate the tree score of each candidate tree.
A search of the tree space for the maximum likelihood tree.

Maximum parsimony works well for groups of similar sequences with a small amount of variation, whereas maximum likelihood can also be used to find the relationships among diverse taxa because it evaluates trees using explicit evolutionary models.

The total likelihood is the product of the site likelihoods; equivalently, the total log-likelihood is the sum of the logs of the site likelihoods.
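
The sketch below illustrates both ideas on the simplest possible case: two aligned sequences joined by a single branch. It assumes the Jukes-Cantor (JC69) substitution model, which is not discussed in this article and is chosen here only because its formulas are short; the sequences and function names are hypothetical. The code first checks that the product of site likelihoods matches the sum of their logs, and then optimizes the branch length by a simple grid search.

```python
import math


def jc69_site_likelihood(x: str, y: str, t: float) -> float:
    """Likelihood of one aligned site for two sequences under JC69.

    t is the branch length: the expected number of substitutions per site
    separating the two sequences.
    """
    decay = math.exp(-4.0 * t / 3.0)
    p_same = 0.25 + 0.75 * decay   # probability that a base is unchanged
    p_diff = 0.25 - 0.25 * decay   # probability of one specific different base
    # 0.25 is the equilibrium frequency of the base in the first sequence.
    return 0.25 * (p_same if x == y else p_diff)


def log_likelihood(seq1: str, seq2: str, t: float) -> float:
    """Total log-likelihood: the sum of the log site likelihoods."""
    return sum(math.log(jc69_site_likelihood(a, b, t)) for a, b in zip(seq1, seq2))


# Hypothetical pair of aligned sequences (20 sites, 4 differences).
seq1 = "ACGTACGTACGTACGTACGT"
seq2 = "ACGTACGAACGTTCGTACCA"

# Product of site likelihoods versus sum of their logs, at an arbitrary t.
site_likelihoods = [jc69_site_likelihood(a, b, 0.2) for a, b in zip(seq1, seq2)]
total_likelihood = math.prod(site_likelihoods)
total_log_likelihood = log_likelihood(seq1, seq2, 0.2)
print(math.log(total_likelihood), total_log_likelihood)

# Branch-length optimization by grid search over t = 0.001 ... 2.000.
grid = [i / 1000 for i in range(1, 2001)]
best_t = max(grid, key=lambda t: log_likelihood(seq1, seq2, t))

# For two sequences, JC69 has a closed-form optimum to compare against.
p = sum(a != b for a, b in zip(seq1, seq2)) / len(seq1)
analytic_t = -0.75 * math.log(1.0 - 4.0 * p / 3.0)
print(best_t, analytic_t)
```

Real programs work with log-likelihoods because the raw product becomes vanishingly small for long alignments, and they use numerical optimizers rather than a grid. On a full tree the site likelihood also has to sum over the unknown ancestral states, which this two-sequence example avoids.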

The maximum likelihood tree is the tree topology that gives the highest likelihood under the given model.

Advantages of Maximum Likelihood

Generally more accurate than the maximum parsimony method.
A statistical, evolutionary model-based method that helps in understanding the process of sequence evolution.
One of the most consistent methods available.
Can be used for character and rate analysis.
Can be applied to nucleotide, amino acid and other types of data.
Uses an explicit evolutionary model, so it is less easily fooled by long branch attraction.

Disadvantages of Maximum Likelihood

Not simple
Computationally intensive
Can be fooled by homoplasy

Maximum Parsimony Method (MP) vs Maximum Likelihood Method (ML)

Parsimony	Likelihood
Seeks the minimum number of changes (substitutions)	Seeks to estimate the actual number of substitutions
Uses only informative sites; non-informative sites do not affect the choice of tree	Assumes that sites evolve independently but by a common mechanism

Table 1: MP vs ML

Bayesian Method

Bayesian phylogenetic methods were introduced in the 1990s; they have become popular because of:

The development of powerful models for data analysis.
The availability of user-friendly computer programs to apply the models.

This method uses probability distributions to describe the uncertainty of all unknowns, including the model parameters.

ML tries to find the single best values of the branch lengths and model parameters; Bayesian inference allows the parameters to have uncertainty.

The data are usually a molecular sequence alignment or a matrix of morphological characters.

Branch lengths and substitution parameters (such as the transition : transversion rate ratio) are the parameters of the model.

The ML likelihood expresses the probability of the data given the model; the Bayesian posterior expresses the probability of the model given the observed data (the probability that the model is correct).

Important functions of the Bayesian method

f(m) – the prior probability density function of the model; it is based on knowledge available before the current data are observed.

f(D|m) – the likelihood function; it represents the probability of observing the data (D) given a specific model.

f(m|D) – the posterior probability density function; it represents the prior information updated with the newly observed data.

The prior probability density function f(m) is combined with the likelihood function f(D|m) to obtain the posterior probability density function f(m|D).
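
The combination follows Bayes' theorem, f(m|D) = f(m) f(D|m) / f(D), where f(D) is a normalizing constant. A minimal sketch, assuming a simplified case in which the model m is just one of three candidate topologies and all the numbers are invented for illustration:

```python
def posterior(priors, likelihoods):
    """Combine priors f(m) with likelihoods f(D|m) into posteriors f(m|D)."""
    unnormalised = {m: priors[m] * likelihoods[m] for m in priors}
    evidence = sum(unnormalised.values())   # f(D), the normalizing constant
    return {m: value / evidence for m, value in unnormalised.items()}


# Hypothetical numbers: three candidate trees with equal prior weight.
priors = {"tree_1": 1 / 3, "tree_2": 1 / 3, "tree_3": 1 / 3}
likelihoods = {"tree_1": 6e-12, "tree_2": 3e-12, "tree_3": 1e-12}

post = posterior(priors, likelihoods)
for tree, prob in post.items():
    print(tree, round(prob, 3))
```

In a real analysis m also includes continuous parameters such as branch lengths, so the sum in the denominator becomes an integral over a very large space.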

Bayesian phylogenetic analyses are difficult to carry out analytically: in most cases the posterior probability distribution cannot be expressed by a simple formula and cannot be calculated directly, so the method relies on computationally intensive techniques such as the MCMC algorithm.

MCMC algorithm (Markov chain Monte Carlo)

This algorithm generates a sequence of parameter values in which each new value is proposed from the previous one; it can handle more complex, higher-dimensional models with a larger number of parameters than ML (the maximum likelihood method).

The MCMC algorithm has no natural end point; most of the time it explores trees that fit the data well, so the program operator must decide when to stop it.
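
To make the idea concrete, here is a minimal Metropolis sampler (one simple form of MCMC) for a single parameter: the distance t between two sequences. Everything specific in it is an assumption made for the sketch: the Jukes-Cantor (JC69) likelihood, the exponential prior, the data (200 sites, 40 differences), the step size and the number of steps. Real phylogenetic MCMC samples tree topologies and many parameters at once.

```python
import math
import random


def log_posterior(t, n_sites, n_diff, prior_rate=10.0):
    """Unnormalized log posterior of the JC69 distance t between two sequences."""
    if t <= 0:
        return -math.inf                              # the prior has no mass at t <= 0
    one_minus_decay = -math.expm1(-4.0 * t / 3.0)     # 1 - exp(-4t/3), stable for small t
    p_same = 1.0 - 0.75 * one_minus_decay
    p_diff = 0.25 * one_minus_decay
    log_lik = ((n_sites - n_diff) * math.log(0.25 * p_same)
               + n_diff * math.log(0.25 * p_diff))
    log_prior = math.log(prior_rate) - prior_rate * t  # exponential prior on t
    return log_lik + log_prior


def metropolis(n_sites, n_diff, n_steps=20000, step=0.1, seed=1):
    """Run the chain for n_steps; the user decides how long that is."""
    rng = random.Random(seed)
    t = 0.1                                            # arbitrary starting value
    current = log_posterior(t, n_sites, n_diff)
    samples = []
    for _ in range(n_steps):
        proposal = t + rng.uniform(-step, step)        # symmetric proposal around t
        candidate = log_posterior(proposal, n_sites, n_diff)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            t, current = proposal, candidate
        samples.append(t)                              # a rejected move repeats the old value
    return samples


samples = metropolis(n_sites=200, n_diff=40)
kept = samples[2000:]                                  # discard an initial burn-in
posterior_mean = sum(kept) / len(kept)
print(round(posterior_mean, 3))
```

Note that the normalizing constant f(D) is never computed: only ratios of posterior values are needed, which is what makes MCMC practical. The chain length and burn-in are choices of the operator, in line with the point above that the algorithm has no built-in end point.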

Applications of the MCMC algorithm

This method is widely used in evolutionary biology.
Developing new models for data analysis.
Estimating divergence times by integrating molecular data with fossil record data.
Estimating species trees from sequence data of multiple genetic loci.
Phylogenetic analysis of virus spread in humans.
Analysis of species diversification rates.
Genomic sequence data analysis.
