
Rob Sheldon on Darwin’s Twitter nerdfight

Re “Dreadful row breaks out re cladistics” (Darwin’s Tree of Life), physicist Rob Sheldon offers:

It is about the relative merits of using parsimony (Occam’s Razor) versus Bayesian (likelihood) analysis in cladistics. The first is more causal, more deterministic, more rational while the second allows priors and opinions and “other stuff” to influence the output.

The Darwinian priors can be used in the 2nd, but are excluded from the 1st. So you would think Darwinians would like Bayesian, but they don’t; they seem to prefer parsimony. On the other hand, many have pointed out that parsimony merely looks deterministic, but smuggles in a lot of assumptions.

So perhaps the debate is between implicit versus explicit Darwinists–because as long as Darwinists can keep the assumptions implicit, no one can argue with them. Perhaps a biologist can tell me where the real dispute lies?

And if you understand this debate, what is the meaning of the last sentence of the Wired article?:

“Darwin planted the tree over 150 years ago, and each day it thickens. Now scientists just have to figure out how to grow the thing without constantly trying to refresh its roots with the blood of patriots to its cause.”

🙂 Maybe it means that sensible people should go into a line of work other than cladistics?

See also: Maybe biological classification is more of an art exhibition than a science pursuit?


9 Replies to “Rob Sheldon on Darwin’s Twitter nerdfight”

  1. 1
    Bob O'H says:

    Oh dear. I guess it’s no surprise that Sheldon doesn’t understand the issues: he’s not a biologist. It’s worth noting at the start that Joe Felsenstein has been one of the main proponents of the likelihood approach, and has been a big target of the more aggressive part of the Willi Hennig Society.
    1. Bayesian and likelihood are not the same: the arguments were principally about parsimony vs maximum likelihood, so priors are not relevant (it’s also often pointed out that when morphology is used in these analyses, the characters used and their codings can be very subjective).
    2. The suggestion that “Darwinians would like Bayesian, but they don’t; they seem to prefer parsimony” is a misrepresentation: people on both sides of the debate are evolutionary biologists. And, once more, it’s about the use of likelihood, not Bayesian approaches.

    I interpret the last sentence to be about how we shouldn’t all be aggressive towards each other: something that I hope will improve with retirements. In statistics, there used to be a similar debate between Bayesians and Frequentists, but the Bayesians won (acceptance, at least), and now it’s not a big issue. In the same way, I think most people producing phylogenies have moved past these arguments.

  2. 2
    wd400 says:

    Bayesian (likelihood)

    Um… No.

  3. 3

    Well, if Bayesian isn’t likelihood, then perhaps someone can enlighten me on the difference. It would seem to me that Bob O’H’s comment “..the characters used and their codings can be very subjective” is precisely the definition of a prior. If this subjectivity isn’t assumed prior to the calculation, then why isn’t it a prior? And if it is a prior, then why isn’t likelihood just a variation on Bayesian?

    I’ve read the founding documents on Bayesian methods by ET Jaynes at bayes.wustl.edu, and it would seem to me that “likelihood” is just a restricted form of Bayesian analysis. If it does something more, please educate me.

  4. 4
    Mapou says:

    Cladistics = pseudoscience. Or as Karl Popper would put it, it’s a metaphysical research program.

  5. 5
    Bob O'H says:

    Robert Sheldon – the likelihood is the probability of the data given the parameters: Pr(X|theta), where X is the data and theta is the parameters. Maximum likelihood maximises this.

    Bayesian analysis looks at Pr(theta|X): the probability of the parameters given the data. This is calculated from

    Pr(theta|X) = Pr(X|theta)Pr(theta)/Pr(X)

    where Pr(X) is just a normalising constant, Pr(X|theta) is (of course) the likelihood, and Pr(theta) is the prior distribution of the parameters (before the data are seen). The differences are too much to go into in a blog comment, but you could start with Wikipedia, and search from there. You should really read up on statistical inference (D.R. Cox is usually good for this), in particular how the frequentist and likelihood schools look at things, before commenting.
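
The distinction drawn above can be sketched numerically. The example below is only an illustration under stated assumptions, not anything taken from the thread: the data (7 successes in 10 trials), the binomial model, the grid over theta, and the Beta(2,2)-shaped prior are all hypothetical choices. It computes the likelihood Pr(X|theta), finds the value of theta that maximises it, and then applies Pr(theta|X) = Pr(X|theta)Pr(theta)/Pr(X) on the same grid.

```python
from math import comb

def likelihood(theta, k, n):
    """Pr(X | theta): binomial probability of k successes in n trials."""
    return comb(n, k) * theta**k * (1 - theta) ** (n - k)

# Hypothetical data: 7 successes out of 10 trials.
k, n = 7, 10

# Candidate parameter values theta = 0.00, 0.01, ..., 1.00.
grid = [i / 100 for i in range(101)]

# Maximum likelihood: pick the theta that maximises Pr(X | theta).
like = [likelihood(t, k, n) for t in grid]
theta_ml = grid[like.index(max(like))]

# Bayesian analysis: an assumed prior Pr(theta), shaped like Beta(2, 2)
# and normalised so that it sums to 1 over the grid.
raw_prior = [t * (1 - t) for t in grid]
prior = [p / sum(raw_prior) for p in raw_prior]

# Pr(X) is the normalising constant: sum over theta of Pr(X|theta) Pr(theta).
evidence = sum(l * p for l, p in zip(like, prior))

# Posterior Pr(theta | X) = Pr(X | theta) Pr(theta) / Pr(X).
posterior = [l * p / evidence for l, p in zip(like, prior)]
theta_map = grid[posterior.index(max(posterior))]

print("maximum likelihood estimate:", theta_ml)
print("posterior mode:", theta_map)
```

With these assumed inputs the two answers differ slightly, because the prior pulls the posterior mode toward 0.5. With a flat prior the posterior mode would coincide with the maximum likelihood estimate, although the two approaches would still be answering different questions.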

    “..the characters used and their codings can be very subjective” is precisely the definition of a prior.

    No, this is the data – taxonomists spend a lot of time looking at specimens and deciding what traits are important, how to code them into something that can be analysed, etc. It’s not easy (I certainly couldn’t do it), but it inevitably has some level of subjectivity. Incidentally, that was a comment aimed at the parsimony crowd, who most certainly do not use priors.

  6. 6

    Thanks Bob O’H,
    I think you and I are agreeing, though clearly I was misusing the jargon in the field. My understanding of Bayesian analysis is that it represents all manipulations of the probability based on the equation that you wrote down. E.g., if anyone anywhere replaces Pr(T|D) with some version of Pr(D|T)Pr(T)/Pr(D), then they are using Bayes’ Theorem and hence doing Bayesian analysis.

    What separates frequentist and likelihood schools is the assumption of normal or Gaussian distributions for all the unknowns in the frequentist school. Or another way of saying that, is the frequentist school denies any coherence, any cross-products, any memory in the system.

    The parsimony crowd, if I interpret your comment correctly, apply this same criterion–that anything not known is automatically a random, Gaussian distributed variable, which is a prior that everyone assumed in the pre-Bayesian halcyon days before Jaynes (and was therefore invisible.) The difficulty with admitting it was a prior, is that it destroys the determinism, the force of causality that is widely thought to make cladistics or Darwinism inevitable.

    Therefore the cladistics wars are, in microcosm, the same wars as ID vs Darwin, or Post-Modernism vs Modernism, or Feminism vs Paternalism. Bayes doesn’t solve the battle, but it helps in describing the battlefield.

  7. 7
    wd400 says:

    What separates frequentist and likelihood schools is the assumption of normal or Gaussian distributions for all the unknowns in the frequentist school. Or another way of saying that, is the frequentist school denies any coherence, any cross-products, any memory in the system.

    What? No. This is so wrong it’s hard even to correct.

    If you want to understand modes of statistical inference, then the Wikipedia pages on the Bayesian, Frequentist and Maximum Likelihood schools are a pretty good start. After that you may wish to read Cox.

    At their most extreme the parsimony crowd claim there are no random variables at all: The Data is The Data, The Tree is The Tree and their method will always find The Tree. If you’d like to understand the history of phylogenetic methods, Joe Felsenstein’s book has a summary.

  8. 8
    wd400 says:

    BTW, a much better lens to look at this debate would be to say parsimony is a non-parametric statistical approach, while likelihood and later methods are parametric.

    It also happens that parsimony is not a great non-parametric method, so doesn’t have many of the advantages put forward for that class of methods. And (at least some of) the parsimony crowd will claim their method is not statistical at all.
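
To make the “non-parametric” description concrete, here is a minimal sketch of parsimony scoring using Fitch’s algorithm for a single character on a rooted binary tree. It is an illustration only: the four taxa, their character states, and the two candidate trees are hypothetical, and real parsimony software searches over many trees and characters. The score is simply a count of state changes, with no model parameters to estimate.

```python
def fitch_score(tree, states):
    """Return (possible_states, changes) for a rooted binary tree.

    `tree` is either a leaf name (str) or a (left, right) tuple;
    `states` maps each leaf name to its observed character state.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_cost = fitch_score(tree[0], states)
    right_set, right_cost = fitch_score(tree[1], states)
    common = left_set & right_set
    if common:
        # The children can share a state: no extra change is needed here.
        return common, left_cost + right_cost
    # The children disagree: take the union and count one change.
    return left_set | right_set, left_cost + right_cost + 1

# Hypothetical character coded for four taxa.
states = {"A": "0", "B": "0", "C": "1", "D": "1"}

# Two candidate trees for the same taxa.
tree_1 = (("A", "B"), ("C", "D"))
tree_2 = (("A", "C"), ("B", "D"))

score_1 = fitch_score(tree_1, states)[1]
score_2 = fitch_score(tree_2, states)[1]

print("changes required by tree_1:", score_1)
print("changes required by tree_2:", score_2)
```

Under parsimony the tree requiring fewer changes is preferred. Note that the result depends entirely on how the character was coded, which is where the subjectivity mentioned earlier in the thread enters.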

  9. 9
    Bob O'H says:

    Robert – as wd400 has pointed out, you haven’t understood the issues at all. I’m not sure it would help to try to correct you: it would take a lot of time. From where you are in your understanding now, I’m afraid the confusions are too great.

    You could do a lot worse than read Joe Felsenstein’s book on the subject.
