Uncommon Descent Serving The Intelligent Design Community

Logic and First Principles, 2: How could Induction ever work? (Identity and universality in action . . . )


In a day when first principles of reason are at a steep discount, it is unsurprising to see that inductive reasoning is doubted or dismissed in some quarters.

And yet, there is still a huge cultural investment in science, which is generally understood to pivot on inductive reasoning.

Where, as the Stanford Encyclopedia of Philosophy notes, in the modern sense Induction “includes all inferential processes that ‘expand knowledge in the face of uncertainty’ (Holland et al. 1986: 1), including abductive inference.” That is, inductive reasoning is argument by more or less credible but not certain support, especially empirical support.

How could it ever work?

A: Surprise — not! It works by being an application of the principle of (stable) distinct identity. (Which is where all of logic seems to begin!)

Let’s refresh our thinking, partitioning World W into A and ~A, W = {A|~A}, so that (physically or conceptually) A is A in light of (i/l/o) its core defining characteristics; no x in W is A AND also ~A in the same sense and circumstances; and likewise any x in W will be A or else ~A, not neither nor both. That is, once a dichotomy of distinct identity occurs, it has consequences:

Laws of logic in action as glorified common-sense first principles of right reason
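(By way of a small, purely illustrative sketch, and not part of the figure above: the three classic laws can be checked mechanically over any finite world once a dichotomy A | ~A is drawn. The Python toy below uses even vs. odd integers as hypothetical stand-ins for A and ~A.)

# Toy sketch only: the three classic laws over a finite world W = {A | ~A}.
W = set(range(10))                  # a small world of distinct items
A = {x for x in W if x % 2 == 0}    # "A" = the even members
not_A = W - A                       # "~A" = everything else in W

for x in W:
    assert (x in A) == (x in A)             # identity: A is A
    assert not (x in A and x in not_A)      # non-contradiction: not both
    assert (x in A) or (x in not_A)         # excluded middle: one or the other
print("LOI, LNC and LEM hold for every x in W")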

Where also, we see how scientific models and theories tie to the body of observations that are explained or predicted, with reliable explanations joining the body of credible but not utterly certain knowledge:

Abductive, inductive reasoning and the inherent provisionality of scientific theorising

As I argued last time:

>>analogical reasoning [–> which is closely connected to inductive reasoning] “is fundamental to human thought” and analogical arguments reason from certain material and acknowledged similarities (say, g1, g2 . . . gn) between objects of interest, say P and Q to further similarities gp, gp+1 . . . gp+k. Also, observe that analogical argument is here a form of inductive reasoning in the modern sense; by which evidence supports and at critical mass warrants a conclusion as knowledge, but does not entail it with logical necessity.

How can this ever work reliably?

By being an application of the principle of identity.

Where, a given thing, P, is itself in light of core defining characteristics. Where that distinctiveness also embraces commonalities. That is, we see that if P and Q come from a common genus or archetype G, they will share certain common characteristics that belong to entities of type G. Indeed, in computing we here speak of inheritance. Men, mice and whales are all mammals and nurture their young with milk, also being warm-blooded etc. Some mammals lay eggs and some are marsupials, but all are vertebrates, as are fish. Fish and guava trees are based on cells and cells use a common genetic code that has about two dozen dialects. All of these are contingent embodied beings, and are part of a common physical cosmos.

This at once points to how an analogy can be strong (or weak).

For, if G has in it common characteristics {g1, g2 . . . gn | gp, gp+1 . . . gp+k}, then if P and Q instantiate G (despite the unique differences they must have to be distinct objects), we can reasonably infer that they will both have the onward characteristics gp, gp+1 . . . gp+k. Of course, this is not a deductive demonstration; at first level it is an invitation to explore and test until we are reasonably, responsibly confident that the inference is reliable. That is the sense in which Darwin reasoned from artificial selection by breeding to natural selection. It works; the onward debate is over the limits of selection.>>

Consider the world, in situation S0, where we observe a pattern P. Say, a bright, red painted pendulum swinging in a short arc and having a steady period, even as the swings gradually fade away. (And yes, according to the story, this is where Galileo began.) Would anything be materially different in situation S1, where an otherwise identical bob were bright blue instead? (As in, strip the bob and repaint it.)

“Obviously,” no.

Why “obviously”?

We are intuitively recognising that the colour of paint is not core to the aspect of behaviour we are interested in. A bit more surprisingly, within reason, the mass of the bob makes little difference in the slight-swing case we have in view. Length of suspension does make a difference, as would the prevailing gravity field — a pendulum on Mars would have a different period.
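(For the small-angle case the standard formula, T = 2*pi*sqrt(L/g), makes the point explicit: length and local gravity enter, the bob's colour does not, and for slight swings neither does its mass. The short sketch below is merely illustrative, taking g of roughly 9.81 m/s^2 on Earth and 3.71 m/s^2 on Mars.)

# Minimal sketch of the small-angle pendulum period, T = 2*pi*sqrt(L/g).
import math

def period(length_m, g):
    return 2 * math.pi * math.sqrt(length_m / g)

L = 1.0                        # a 1 m suspension
print(period(L, 9.81))         # Earth: about 2.0 s
print(period(L, 3.71))         # Mars: about 3.3 s (weaker gravity, longer period)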

Where this points, is that the world has a distinct identity and so we understand that certain things (here comes that archetype G again) will be in common between circumstances Si and Sj. So, we can legitimately reason from P to Q once that obtains. And of course, reliability of behaviour patterns or expectations so far is a part of our observational base.

Avi Sion has an interesting principle of [provisional] uniformity:

>>We might . . . ask – can there be a world without any ‘uniformities’? A world of universal difference, with no two things the same in any respect whatever is unthinkable. Why? Because to so characterize the world would itself be an appeal to uniformity. A uniformly non-uniform world is a contradiction in terms.

Therefore, we must admit some uniformity to exist in the world.

The world need not be uniform throughout, for the principle of uniformity to apply. It suffices that some uniformity occurs. Given this degree of uniformity, however small, we logically can and must talk about generalization and particularization. There happens to be some ‘uniformities’; therefore, we have to take them into consideration in our construction of knowledge. The principle of uniformity is thus not a wacky notion, as Hume seems to imply . . . .

The uniformity principle is not a generalization of generalization; it is not a statement guilty of circularity, as some critics contend. So what is it? Simply this: when we come upon some uniformity in our experience or thought, we may readily assume that uniformity to continue onward until and unless we find some evidence or reason that sets a limit to it. Why? Because in such case the assumption of uniformity already has a basis, whereas the contrary assumption of difference has not or not yet been found to have any. The generalization has some justification; whereas the particularization has none at all, it is an arbitrary assertion.

It cannot be argued that we may equally assume the contrary assumption (i.e. the proposed particularization) on the basis that in past events of induction other contrary assumptions have turned out to be true (i.e. for which experiences or reasons have indeed been adduced) – for the simple reason that such a generalization from diverse past inductions is formally excluded by the fact that we know of many cases [of inferred generalisations; try: “we can make mistakes in inductive generalisation . . . “] that have not been found worthy of particularization to date . . . .

If we follow such sober inductive logic, devoid of irrational acts, we can be confident to have the best available conclusions in the present context of knowledge. We generalize when the facts allow it, and particularize when the facts necessitate it. We do not particularize out of context, or generalize against the evidence or when this would give rise to contradictions . . . [Logical and Spiritual Reflections, BK I Hume’s Problems with Induction, Ch 2 The principle of induction.]>>

So, by strict logic, SOME uniformity must exist in the world; the issue is to confidently identify reliable cases, however provisionally. So, even if it is only that “we can make mistakes in generalisations,” we must rely on inductively identified regularities of the world. Where, this is surprisingly strong, as it is in fact an inductive generalisation. It is also a self-referential claim which brings to bear a whole panoply of logic: if it is assumed false, it would in fact have exemplified itself as true. It is an undeniably true claim AND it is arrived at by induction, so it shows that induction can lead us to discover conclusions that are undeniably true!

Therefore, at minimum, there must be at least one inductive generalisation which is universally true.

But in fact, the world of Science is a world of so-far successful models, the best of which are reliable enough to put to work in Engineering, at the risk of tort liability in court.

Illustrating:

How is such the case? Because observing the reliability of a principle is itself an observation, which lends confidence in the context of a world that shows a stable identity and a coherent, orderly pattern of behaviour. Or, we may quantify. Suppose an individual observation O1 has reliability p = 99.9%. Now multiply observations, each as reliable: the odds that all n of these are somehow collectively in a consistent error fall as (1 – p)^n. Convergent, multiplied, credibly independent observations are mutually, cumulatively reinforcing, much as the comparatively short, relatively weak fibres in a rope can be twisted and counter-twisted together to form a long, strong, trustworthy rope.

And yes, this is an analogy.

(If you doubt it, show us why it is not cogent.)
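(Returning to the quantification above, here is a quick numeric sketch, assuming the observations really are independent as stated: with single-observation reliability p, the chance that all n agree in a common error collapses geometrically.)

# The chance that n independent, equally reliable observations are all in error
# together falls as (1 - p)^n.
p = 0.999                          # a 99.9% reliable single observation
for n in (1, 2, 5, 10):
    print(n, (1 - p) ** n)         # 1e-3, 1e-6, 1e-15, 1e-30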

So, we have reason to believe there are uniformities in the world that we may observe in action and credibly albeit provisionally infer to. This is the heart of the sciences.

What about the case of things that are not directly observable, such as the micro-world, historical/forensic events [whodunit?], the remote past of origins?

That is where we are well-advised to rely on the uniformity principle and so also the principle of identity. We would be well-advised to control arbitrary speculation and ideological imposition by insisting that if an event or phenomenon V is to be explained on some cause or process E, the causal mechanism at work C should be something we observe as reliably able to produce the like effect. And yes, this is one of Newton’s Rules.

For relevant example, complex, functionally specific alphanumerical text (language used as messages or as statements of algorithms) has but one known cause, intelligently directed configuration. Where, it can be seen that blind chance and/or mechanical necessity cannot plausibly generate such strings beyond 500 – 1,000 bits of complexity. There just are not enough atoms and time in the observed cosmos to make such a blind needle in haystack search a plausible explanation. The ratio of possible search to possible configurations trends to zero.

So, yes, on its face, DNA in life forms is a sign of intelligently directed configuration as most plausible cause. To overturn this, simply provide a few reliable cases of text of the relevant complexity coming about by blind chance and/or mechanical necessity. Unsurprisingly, random text generation exercises [infinite monkeys theorem] fall far short, giving so far 19 – 24 ASCII characters, far short of the 72 – 143 for the threshold. DNA in the genome is far, far beyond that threshold, by any reasonable measure of functional information content.
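(A back-of-envelope check of the numbers used here, offered as a sketch only: at 7 bits per ASCII character, 500 and 1,000 bits correspond to roughly 72 and 143 characters, and the associated configuration spaces are 2^500 and 2^1000.)

# Rough arithmetic behind the 500- and 1,000-bit thresholds cited above.
print(float(2 ** 500))      # about 3.27e150 distinct 500-bit configurations
print(float(2 ** 1000))     # about 1.07e301 distinct 1,000-bit configurations
print(500 / 7, 1000 / 7)    # about 71.4 and 142.9 ASCII characters (7 bits each)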

Similarly, let us consider the fine tuning challenge.

The laws, parameters and initial circumstances of the cosmos turn out to form a complex mathematical structure, with many factors that seem to be quite specific. Where, mathematics is an exploration of logic model worlds, their structures and quantities. So, we can use the power of computers to “run” alternative cosmologies, with similar laws but varying parameters. Surprise: we seem to be at a deeply isolated operating point for a viable cosmos capable of supporting C-chemistry, cell-based, aqueous-medium, terrestrial-planet-based life. Equally surprising, our home planet seems to be quite privileged too. And, if we instead posit that there are as yet undiscovered super-laws that force the parameters into a life-supporting structure, that then raises the issue of where such super-laws came from; this is level-two fine tuning, aka front loading.
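(The following is a purely hypothetical toy, not a cosmological simulation: it simply shows the shape of such a computational scan, sweeping two dimensionless "constants" over a grid and flagging the cells that a made-up viability test accepts.)

# Toy parameter scan, for illustration only; viable() is an invented stand-in
# for "chemistry and long-lived stars are possible", not real physics.
def viable(a, b):
    return abs(a - 1.0) < 0.05 and abs(b - 1.0) < 0.05

grid = [(a / 10, b / 10) for a in range(1, 31) for b in range(1, 31)]
hits = [point for point in grid if viable(*point)]
print(len(hits), "of", len(grid), "grid points pass")   # a thin sliver of the grid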

From Barnes:

Barnes: “What if we tweaked just two of the fundamental constants? This figure shows what the universe would look like if the strength of the strong nuclear force (which holds atoms together) and the value of the fine-structure constant (which represents the strength of the electromagnetic force between elementary particles) were higher or lower than they are in this universe. The small, white sliver represents where life can use all the complexity of chemistry and the energy of stars. Within that region, the small “x” marks the spot where those constants are set in our own universe.” (HT: New Atlantis)

That is, the fine tuning observation is robust.

There is a lot of information caught up in the relevant configurations, and so we are looking again at functionally specific complex organisation and associated information.

(Yes, I commonly abbreviate: FSCO/I. Pick any reasonable index of configuration-sensitive function and of information tied to such specific functionality, that is a secondary debate, where it is not plausible that say the amount of information in DNA and proteins or in the cluster of cosmological factors is extremely low. FSCO/I is also a robust phenomenon, and we have an Internet full of cases in point multiplied by a world of technology amounting to trillions of cases that show that it has just one commonly observed cause, intelligently directed configuration. AKA, design.)

So, induction is reasonable; it is foundational to a world of science and technology.

It also points to certain features of our world of life and the wider world of the physical cosmos being best explained on design, not blind chance and mechanical necessity.

Those are inductively arrived at inferences, but induction is not to be discarded at whim, and there is a relevant body of evidence.

Going forward, can we start from this? END

PS: Per aspect (one after the other) Explanatory Filter, adapting Dembski et al:

Comments
Equally weighting which priors? In the BF the prior for the models doesn't appear, and for the parameters of each model it's difficult to see (in general) how any equal weighting could be done: the models could have very different parameter space (as they do in design vs evolution, for example). A zero probability event would be one that was considered impossible. As this is subjective, it means that it could be right, but that whoever assigned the probability thought it flat out wrong.Bob O'H
November 26, 2018 at 09:15 AM PDT
Ah, actually ID is not positing equal priors. The probabilistic resources is the prior for the chance hypothesis. Then, CSI is the probabilistic deficiency that is not accounted for.EricMH
November 26, 2018 at 09:13 AM PDT
@Bob O'H what are your thoughts on equally weighting the priors? And, what does the occurrence of an event with zero prior probability mean? Does it mean the hypothesis is flat out wrong and another is needed?EricMH
November 26, 2018 at 07:54 AM PDT
As you've very much strayed onto my territory, a couple of comments: 1. From a frequentist (or fiducial) standpoint, LAMBDA is a likelihood ratio; 2. For a Bayesian, LAMBDA is a Bayes factor. Both are used as measures of evidence. The difference is how other unknowns are treated: the frequentist maximises over them, the Bayesian marginalises. It's well known (at least in statistical circles) that the Bayes Factor is sensitive to the priors of the parameters. I think one can interpret a Bayesian approach as induction under uncertainty. It allows us to update our knowledge as new observations come in, but because it assumes uncertainty in the observations and the process, it's more flexible (essentially, as long as the model says that it is possible for black swans to exist, we don't panic when we see one). But if we make an observation that has zero prior probability, then we do panic. It's also worth noting that Bayesian mathematics only works when the assumed models are true. Which is unfortunate, as we all know that all models are wrong. Evidently, they are not too wrong to still be useful.Bob O'H
November 26, 2018 at 06:43 AM PDT
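(A minimal numeric sketch of the maximise-vs-marginalise distinction in the comment above, using a hypothetical coin example: H0 fixes p = 0.5, H1 leaves the bias p free with a uniform prior, and the data are 15 heads in 20 flips.)

# Toy illustration of "the frequentist maximises, the Bayesian marginalises".
# H0: fair coin (p = 0.5).  H1: unknown bias p (uniform prior on [0, 1]).
from math import comb

n, x = 20, 15                                # hypothetical data: 15 heads in 20 flips

def likelihood(p):
    return comb(n, x) * p**x * (1 - p)**(n - x)

LR = likelihood(x / n) / likelihood(0.5)     # maximise over p (MLE p = 0.75): about 13.7
BF = (1 / (n + 1)) / likelihood(0.5)         # marginalise p; the integral is 1/(n+1): about 3.2
print(LR, BF)                                # the Bayes factor is tempered by the prior spread over p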
Are you saying that if M makes a prediction error then we can inductively infer the general observation that M is false? I also request shorter words and sentences. I have but a small brain that finds it difficult to grasp abstract concepts. Equally weighting the priors means that P(x|A)/P(x|B) = P(A|x)/P(B|x). So, if A makes x more likely, then A is the better theory. This seems to be the hidden premise behind the ID inference to design.EricMH
November 26, 2018 at 04:05 AM PDT
EMH, let's go back through 15:
To assume for argument that there are no such universal properties, ~U, leads immediately to the insight, that [~U] would itself inadvertently exemplify a universal property. That is the attempt to deny U instantly confirms it as undeniably and indeed self-evidently true. Universal, stable properties are inevitable characteristics of a world. But are such amenable to our minds? Are they intelligible or an inscrutable enigma? This can be answered through a case in point. Following Josiah Royce and Elton Trueblood [who reminded us of Royce’s half-forgotten work] we can see that “error exists” is an undeniable truth. But also, we vividly recall class work, say, elementary school sums that came back full of big red X’s. (When I was in the classroom, I insisted on using green ink, having been traumatised in my own day.) So, we can see a candidate: we may make errors in inductive generalisations, call this M. Try the denial ~M. Instantly on well known cases we know that ~M is actually false and we can see from the logic of inference to so far “best” explanation that such fallibility is locked into the logic.
[--> That is, it is first, part of our factual background that we do as a matter of experience make errors in inductive generalisation. One countervailing fact wrecks a universal claim. Also, we know from the logic of explanations, that the implication is not an equivalence so the support provided by observations so far is fallible, i.e. there is a possibility of error. Which is enough again.]
. . . We do not have here a case of true premises and strictly deductive consequences guaranteeing the truth of conclusions. Thus, we have a case of a certainly known undeniably true universal property accessed through inductive generalisation, M.
[--> The mere logic establishes the possibility of error, and we know the fact of error from experience also.]
That may have been established on a trivial case [and is backed by a survey of the history of science], but it is a powerful result: there are intelligible, universal properties of the world. And, we also know that we may establish reliable, plausible but provisional generalisations or explanations through empirical investigations and linked reasoning. Equally, on relevant history of science and other disciplines.
On the Likelihood analysis, yes if we are indifferent across T1 and T2 as to which is more likely, the subjective probability ratio goes to 1, as of two options that are under indifference, 0.5/0.5 = 1. But that means we have no basis for choosing one over the other, they are so far empirically equivalent. Also, they would have to be very carefully chosen to be exhaustive of possibilities or P(T1) + P(T2) would sum to less than 1, maybe to a large degree and one that is unknown. KFkairosfocus
November 25, 2018 at 01:59 PM PDT
Thanks KF, that is a very detailed exposition on the topic, and I will continue thinking it over. While you are right there is always some general pattern, even in a random world, I do not yet follow your example of how error allows us to infer the general pattern. It sounds a bit like Popper's falsificationism, which is logically flawed per the fair coin example. My current thought is to combine maximum entropy with Bayesian reasoning to eliminate the P(T1)/P(T2) term, since maximum entropy sets the term to 1. But I'm sure there is a problem with that approach.EricMH
November 25, 2018 at 10:13 AM PDT
EMH, the design inference on complex [functionally] specified information is an inductive inference in the abductive form. That is brought out through the design inference explanatory filter (especially in the per aspect form that I will append to the OP). Similarly, Bayesian inference probability revision (and more importantly its extension into likelihood reasoning [an in-page in my always linked briefing note]) is again abductive, where in the latter case relative likelihood of alternative hypotheses is on the table. This is of course essentially a statistical study tied into bayesian subjective probabilities and thence wider probability theory. Which then often pointlessly bogs down in debates over defining and estimating probabilities, in relevant contexts. But inductive reasoning is much broader than Bayesian reasoning as extended into likelihoods etc. Indeed, it comes from a different world of thought. One, where we have to credibly account for how people have reasoned empirically and fallibly but reliably enough to build civilisations including engineering disciplines, science and technology, management and academic disciplines such as history for many thousands of years. Something that is pretty messy but vital. Especially as we know that inductive reasoning cannot reduce to deductive reasoning. For simple example, say, we have a hyp H that explains observed or in-hand "facts" F1, F2 . . . Fm and predicts Pn+1, Pn+2 . . . Pn, i.e, n being now. But, to infer: H => {F + P} {F + P} is so, Therefore, H . . . is fallacious, affirming the consequent. In effect implication is not to be confused with equivalence aka double implication. Simplistically, If Tom is a cat then Tom is a mammal does not sustain that if Tom is a mammal Tom must be a cat. Ask any boy named Tom. So, how is induction sustained as a responsible argument? Especially, given that for most of history and even now for most real world affairs, Bayesian reasoning simply has not been on the table. In short, how can one erect credibly reliable albeit inherently fallible [thus, in principle provisional] support for conclusions? Sufficiently reliable to stand up in a Tort case, to risk considerable sums of money on [building buildings or bridges or ships or company business models], or to trust with one's life [medical treatment, vehicles, ships, aircraft, weapons . . . think of how a Katana is made following a highly complex, trade secret based traditional recipe handed down across generations], etc? That is a far deeper challenge. As I pointed out in the OP:
In a day when first principles of reason are at a steep discount, it is unsurprising to see that inductive reasoning is doubted or dismissed in some quarters. And yet, there is still a huge cultural investment in science, which is generally understood to pivot on inductive reasoning. Where, as the Stanford Enc of Phil notes, in the modern sense, Induction ” includes all inferential processes that “expand knowledge in the face of uncertainty” (Holland et al. 1986: 1), including abductive inference.” That is, inductive reasoning is argument by more or less credible but not certain support, especially empirical support. How could it ever work? A: Surprise — NOT: by being an application of the principle of (stable) distinct identity. (Which is where all of logic seems to begin!) . . . . Consider the world, in situation S0, where we observe a pattern P. Say, a bright, red painted pendulum swinging in a short arc and having a steady period, even as the swings gradually fade away. (And yes, according to the story, this is where Galileo began.) Would anything be materially different in situation S1, where an otherwise identical bob were bright blue instead? (As in, strip the bob and repaint it.) “Obviously,” no. Why “obviously”? We are intuitively recognising that the colour of paint is not core to the aspect of behaviour we are interested in. A bit more surprising, within reason, the mass of the bob makes little difference to the slight swing case we have in view. Length of suspension does make a difference as would the prevailing gravity field — a pendulum on Mars would have a different period. Where this points, is that the world has a distinct identity and so we understand that certain things (here comes that archetype G again [--> which holds a cluster of in-common, i.e. universal characteristics that here would manifest as reliable regularities]) will be in common between circumstances Si and Sj. So, we can legitimately reason from P [--> case or cases Si together with inferred plausible, so far best explanation Ebi] to Q [--> a future situation Sj to be managed or interacted with i/l/o Ebi] once that obtains. And of course, reliability of behaviour patterns or expectations so far is a part of our observational base.
I have of course augmented slightly. This is really descriptive so far, we do this on the assumption that there are reliable stable [= universalisable] and intelligible characteristics of the world. How could this ever work or at least plausibly be acceptable? That's where Avi Sion's observation becomes crucial. To assume for argument that there are no such universal properties, ~U, leads immediately to the insight, that [~U] would itself inadvertently exemplify a universal property. That is the attempt to deny U instantly confirms it as undeniably and indeed self-evidently true. Universal, stable properties are inevitable characteristics of a world. But are such amenable to our minds? Are they intelligible or an inscrutable enigma? This can be answered through a case in point. Following Josiah Royce and Elton Trueblood [who reminded us of Royce's half-forgotten work] we can see that "error exists" is an undeniable truth. But also, we vividly recall class work, say, elementary school sums that came back full of big red X's. (When I was in the classroom, I insisted on using green ink, having been traumatised in my own day.) So, we can see a candidate: we may make errors in inductive generalisations, call this M. Try the denial ~M. Instantly on well known cases we know that ~M is actually false and we can see from the logic of inference to so far "best" explanation that such fallibility is locked into the logic. We do not have here a case of true premises and strictly deductive consequences guaranteeing the truth of conclusions. Thus, we have a case of a certainly known undeniably true universal property accessed through inductive generalisation, M. That may have been established on a trivial case [and is backed by a survey of the history of science], but it is a powerful result: there are intelligible, universal properties of the world. And, we also know that we may establish reliable, plausible but provisional generalisations or explanations through empirical investigations and linked reasoning. Equally, on relevant history of science and other disciplines. We also know from modelling theory that a strictly false model may be highly reliable in a defined domain of testing. This gives us confidence to trust that the stability of the actual properties of the world will sustain models we can use to confidently design, build and act on reliable but provisional, weak form knowledge. (Strong form knowledge is not merely reliable and credible or plausible but actually true . . . a very hard to meet, quite restrictive requirement. Not even complex axiomatic mathematical systems, post Godel, meet that standard. My own confidence in Math in material part rests on a large body of fact, demonstrated reliability and equally demonstrated coherence across domains such as in the Euler expression 0 = 1 + e^i*pi, etc.) As for multiverse hyp, this is injection of an unobserved and likely unobservable entity, and it inadvertently crosses over into speculative metaphysics rather than science. Philosophy done in a lab coat is still philosophy and it must answer to comparative difficulties across competing world views. Or else, we have grand question-begging. Where, evolutionary materialistic scientism is manifestly self-referentially incoherent. J B S Haldane long since put his finger on what we can elaborate as a core problem:
"It seems to me immensely unlikely that mind is a mere by-product of matter. For if my mental processes are determined wholly by the motions of atoms in my brain I have no reason to suppose that my beliefs are true. They may be sound chemically, but that does not make them sound logically. And hence I have no reason for supposing my brain to be composed of atoms. In order to escape from this necessity of sawing away the branch on which I am sitting, so to speak, I am compelled to believe that mind is not wholly conditioned by matter.” ["When I am dead," in Possible Worlds: And Other Essays [1927], Chatto and Windus: London, 1932, reprint, p.209. (NB: DI Fellow, Nancy Pearcey brings this right up to date (HT: ENV) in a current book, Finding Truth.)]
Materialism is non-viable, though those caught up in the system don't tend to see that clearly. Exposing the incoherence and asking, first justify yourself being significantly free thus able to observe and reason accurately and responsibly so your arguments have legs to stand on is an important move in breaking the spell of materialism dressed up in the lab coat and presuming it has cornered the market on rationality. The evasiveness or cornered rat lashing out will soon enough reveal the problem. I should note that Dembski has set his work in the context of abductive reasoning. The big, civilisation level question is induction. KF PS: Let me clip my discussion on Bayesian inference (the linked has colour-coded elements that help to clarify). Of course, we should not get bogged down on so specific an issue given the much broader question on the table:
We often wish to find evidence to support a theory, where it is usually easier to show that the theory [if it were for the moment assumed true] would make the observed evidence “likely" to be so [on whatever scale of weighting subjective/epistemological "probabilities" we may wish etc . . .]. So in effect we have to move: from p[E|T] to p[T|E], i.e from "probability of evidence given theory" to "probability of theory given evidence," which last is what we can see. (Notice also how easily the former expression p[E|T] "invites" the common objection that design thinkers are "improperly" assuming an agent at work ahead of looking at the evidence, to infer to design. Not so, but why takes a little explanation.) Let us therefore take a quick look at the algebra of Bayesian probability revision and its inference to a measure of relative support of competing hypotheses provided by evidence: a] First, look at p[A|B] as the ratio, (fraction of the time we would expect/observe A AND B to jointly occur)/(fraction of the the time B occurs in the POPULATION). --> That is, for ease of understanding in this discussion, I am simply using the easiest interpretation of probabilities to follow, the frequentist view. b] Thus, per definition given at a] above: p[A|B] = p[A AND B]/p[B], or, p[A AND B] = p[A|B] * p[B] c] By “symmetry," we see that also: p[B AND A] = p[B|A] * p[A], where the two joint probabilities (in green) are plainly the same, so: p[A|B] * p[B] = p[B|A] * p[A], which rearranges to . . . d] Bayes’ Theorem, classic form: p[A|B] = (p[B|A] * p[A]) / p[B] e] Substituting, E = A, T = B, E being evidence and T theory: p[E|T] = (p[T|E] * p[E])/ p[T], p[T|E] -- probability of theory (i.e. hypothesis or model) given evidence seen -- being here by initial simple "definition," turned into L[E|T]: L[E|T] is (by definition) the likelihood of theory T being "responsible" for what we observe, given observed evidence E [NB: note the "reversal" of how the "|" is being read]; at least, up to some constant. (Cf. here, here, here, here and here for a helpfully clear and relatively simple intro. A key point is that likelihoods allow us to estimate the most likely value of variable parameters that create a spectrum of alternative probability distributions that could account for the evidence: i.e. to estimate the maximum likelihood values of the parameters; in effect by using the calculus to find the turning point of the resulting curve. But, that in turn implies that we have an "agreed" model and underlying context for such variable probabilities.) Thus, we come to a deeper challenge: where do we get agreed models/values of p[E] and p[T] from? This is a hard problem with no objective consensus answers, in too many cases. (In short, if there is no handy commonly accepted underlying model, we may be looking at a political dust-up in the relevant institutions.) f] This leads to the relevance of the point that we may define a certain ratio, LAMBDA = L[E|h2]/L[E|h1], This ratio is a measure of the degree to which the evidence supports one or the other of competing hyps h2 and h1. (That is, it is a measure of relative rather than absolute support. Onward, as just noted, under certain circumstances we may look for hyps that make the data observed "most likely" through estimating the maximum of the likelihood function -- or more likely its logarithm -- across relevant variable parameters in the relevant sets of hypotheses. But we don't need all that for this case.) 
g] Now, by substitution A --> E, B --> T1 or T2 as relevant: p[E|T1] = p[T1|E]* p[E]/p[T1], and p[E|T2] = p[T2|E]* p[E]/p[T2]; so also, the ratio: p[E|T2]/ p[E|T1] = {p[T2|E] * p[E]/p[T2]}/ {p[T1|E] * p[E]/p[T1]} = {p[T2|E] /p[T2]}/ {p[T1|E] /p[T1]} h] Thus, rearranging: p[T2|E]/p[T1|E] = {p[E|T2]/ p[E|T1]} * {P(T1)/P(T2)} i] So, substituting: L[E|T2]/ L[E|T1] = LAMBDA = {p[E|T2]/ p[E|T1]} * {P(T2)/P(T1)} Thus, the lambda measure of the degree to which the evidence supports one or the other of competing hyps T2 and T1, is a ratio of the conditional probabilities of the evidence given the theories (which of course invites the "assuming the theory" objection, as already noted), times the ratio of the probabilities of the theories being so. [In short if we have relevant information we can move from probabilities of evidence given theories to in effect relative probabilities of theories given evidence, and in light of an agreed underlying model.] All of this is fine as a matter of algebra (and onward, calculus) applied to probability, but it confronts us with the issue that we have to find the outright credible real world probabilities of T1, and T2 (or onward, of the underlying model that generates a range of possible parameter values). In some cases we can get that, in others, we cannot; but at least, we have eliminated p[E]. Then, too, what is credible to one may not at all be so to another. This brings us back to the problem of selective hyperskepticism, and possible endless spinning out of -- too often specious or irrelevant but distracting -- objections [i.e closed minded objectionism]. Now, by contrast the “elimination" approach rests on the well known, easily observed principle of the valid form of the layman's "law of averages." Namely, that in a "sufficiently" and "realistically" large [i.e. not so large that it is unable or very unlikely to be instantiated] sample, wide fluctuations from "typical" values characteristic of predominant clusters, are very rarely observed. [For instance, if one tosses a "fair" coin 500 times, it is most unlikely that one would by chance go far from a 50-50 split that would be in no apparent order. So if the observed pattern turns out to be ASCII code for a message or to be nearly all-heads or alternating heads and tails, or the like, then it is most likely NOT to have been by chance. (See, also, Joe Czapski's "Law of Chance" tables, here.)] Elimination therefore looks at a credible chance hyp and the reasonable distribution across possible outcomes it would give [or more broadly the "space" of possible configurations and the relative frequencies of relevant "clusters" of individual outcomes in it]; something we are often comfortable in doing. Then, we look at the actual observed evidence in hand, and in certain cases -- e.g. Caputo -- we see it is simply too extreme relative to such a chance hyp, per probabilistic resource exhaustion. So the material consequence follows: when we can “simply" specify a cluster of outcomes of interest in a configuration space, and such a space is sufficiently large that a reasonable random search will be maximally unlikely within available probabilistic/ search resources, to reach the cluster, we have good reason to believe that if the actual outcome is in that cluster, it was by agency. [Thus the telling force of Sir Fred Hoyle’s celebrated illustration of the utter improbability of a tornado passing through a junkyard and assembling a 747 by chance. 
By far and away, most of the accessible configurations of the relevant parts will most emphatically be unflyable. So, if we are in a flyable configuration, that is most likely by intent and intelligently directed action, not chance. ] We therefore see why the Fisherian, eliminationist approach makes good sense even though it does not so neatly line up with the algebra and calculus of probability as would a likelihood or full Bayesian type approach. Thence, we see why the Dembski-style explanatory filter can be so effective, too.
kairosfocus
November 25, 2018 at 01:22 AM PDT
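(A small numeric sketch of the posterior-odds form of that algebra, with purely hypothetical numbers: the posterior odds p[T2|E]/p[T1|E] equal the likelihood ratio p[E|T2]/p[E|T1] times the prior odds P(T2)/P(T1), which is exactly where the debate over priors enters.)

# Posterior odds = likelihood ratio * prior odds, with made-up numbers.
p_E_given_T1 = 0.001                 # evidence E is very unlikely under T1
p_E_given_T2 = 0.30                  # evidence E is fairly likely under T2
prior_T1, prior_T2 = 0.99, 0.01      # T2 starts out heavily disfavoured a priori

likelihood_ratio = p_E_given_T2 / p_E_given_T1     # 300
prior_odds = prior_T2 / prior_T1                   # about 0.0101
posterior_odds = likelihood_ratio * prior_odds     # about 3.03: the evidence overcomes the prior
print(likelihood_ratio, prior_odds, posterior_odds)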
What you are describing sounds pretty similar to CSI. Your scheme sounds equivalent to saying we pick the viable explanation that maximizes CSI. So, from my previous formulation, what gets us from #1 to #2 is the premise that a viable explanation that maximizes a posteriori probability is the best. It looks like the viability requirement is the same as Dembski's detachability requirement, plus some other requirements such as parsimony and consistency. I think this makes good sense, and is mathematically well founded in probability. I will ponder how this addresses the Bayesian problem: The missing terms in my formulation are P(ID) and P(chance). Given that P(x|ID) > P(x|chance), the conclusion that P(ID|x) > P(chance|x) only follows if P(x|ID) * P(ID) > P(x|chance) * P(chance). A materialist must insist that a priori P(chance) > P(ID), so that P(x|ID) > P(x|chance) illustrates that there are some missing materialistic probability resources they have not identified yet, hence the multiverse hypothesis. I will have to think about how your formulation gives a principled response to the materialist.EricMH
November 24, 2018 at 03:52 PM PDT
EMH First, induction is much bigger than ID, and it is at the core of a lot of reasoning in science and the day to day world. Given some common ideologies out there, it is induction we first need to address. Next, Induction is not deduction, not even in a probabilistic sense. And most of it is not about Bayesian or Likelihood inference. Induction is not statistics. Induction is about reasonable, responsible inference on empirical evidence. Argument by support rather than deduction. Which latter then runs into, how do you set up the premises. As in, if you have P => Q, and you don't like Q, reverse: ~Q => ~P. That then exposes the real debate: premises. I usually use abduction as relevant frame of such arguments. Observations F1, F2, . . Fn seem puzzling but some explanation E entails them. It predicts R1, R2 . . . and we see it being consistently correct. Then we make what is in deductive terms a logically fallacious move: accept E as reliable and credibly known (at least, provisionally). Two things are going on, first it is an empirical observation, call it S, that E is reliable. It is a candidate universal (in a scientific context). Reliable is not the same as definitively true. But at second level, we have a conviction the world has universal properties and that some are identifiable on investigation, observation, analysis. So, when we see something that is consistently reliable, we accept it provisionally, open to correction. This reflects a weak, potentially defeatable form sense of knowledge -- well warranted, reliable, credibly true. Reasonable, responsible faith. Then, what about the issue that any number of possible explanations could entail what we see? First, if we have in hand a cluster of candidates E1, E2, . . . Em, then we see which is best so far. If say Ei and Ej are "equally good" then we accept them as empirically equivalent. We may look at simplicity, coherence, not being simplistic or ad hoc etc. One of the key tests is coherence with wider bodies of knowledge. Though, that can be overdone. For example, a common controlling a priori can bias across several fields of study. Much of this is descriptive, summarising effective praxis. Induction simply is not generally going to deliver utter certainty. So, we learn to live with a chastened view of our body of knowledge, attainable warrant and the balance between confident trust and openness to adjustment or replacement. As has been on the table since Newton, Locke and company. KFkairosfocus
November 24, 2018 at 02:04 PM PDT
On further thought, I still think there is a premise missing here. These ID information measures are essentially some kind of hypothesis test. We are observing 1) P(X|ID) > P(X|chance), and then inferring 2) P(ID|X) > P(chance|X). What principle, besides common sense, gets us from #1 to #2? This seems to be necessary for induction to work, but I do not see how AS uniformity argument would apply here.EricMH
November 24, 2018 at 11:36 AM PDT
EMH, statistical thermodynamics is based on the order that emerges from large numbers of randomly interacting particles. For instance, the equilibrium is a cluster of microscopic states consistent with macro-state that has overwhelming statistical weight. The coins or paramagnetic substance example is a fairly common first example and shows how the overwhelming number of possibilities is near 50-50 h/t in no particular order, and that even fairly small fluctuations are hard to observe though not strictly impossible. The overall pattern of possibilities forms a sharply peaked binomial distributions, e.g. the 500 or 1000 coin cases. This ties into the design inference as functionally specific complex configs are maximally implausible under blind chance and/or mechanical necessity, as we have a space of 3.27 * 10^150 or 1.07*10^301 possibilities, respectively. Even were every atom in the sol system [500 bit case] or the observed cosmos [1,000 bit case] an observer with a tray of coins, flipping at random every 10^-12 to -14 s, the fraction of space that could be sampled is negligible. That's why FSCO/I is not credibly observable on blind chance and/or mechanical necessity. When we look at DNA, which for a genome is well beyond such a range, seeing alphanumeric code so language and algorithms, this therefore screams design. But, too often our senses are ideologically dulled, we are hard of hearing. KFkairosfocus
November 24, 2018 at 03:41 AM PDT
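(A quick check of the "sharply peaked" claim for the 500-coin case mentioned in the comment above, summing exact binomial terms; a sketch only.)

# How concentrated is the 500-coin distribution around 250 heads?
from math import comb

n = 500
total = 2 ** n

def prob_within(k):
    # probability of between 250 - k and 250 + k heads for fair coin flips
    return sum(comb(n, h) for h in range(250 - k, 250 + k + 1)) / total

print(prob_within(25))    # roughly 0.98: within +/- 5% of the mean almost always
print(prob_within(50))    # above 0.9999: larger fluctuations are vanishingly rare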
@KF, hmm, that is a very interesting point. So even in the case of completely uniform randomness there is a generalized pattern. I stand corrected!EricMH
November 23, 2018 at 01:54 PM PDT
EMH, I think the context is scientific induction regarding a real world, not any one phenomenon in it. However, even if coin flips -- or better, magnetisation patterns of paramagnetic substances -- were utterly 50-50 flat random, they will collectively fit a binomial distribution, which is a level of universally applicable order. BTW, the Quincunx gives an interesting case that trends to the normal curve as an array of rods give a 50-50 R/L split, giving a classic bell. Bias the H/T states so p, (1 - p) are asymmetrical and you get related distributions. This illustrates how it is really hard to avoid having some universally applicable ordering. KFkairosfocus
November 23, 2018 at 01:48 PM PDT
@KF, I agree your analysis works from an intuitive standpoint. I've never met a consistent Humean. But, from the strictly logical standpoint, the case is not so clear to me. However, it is tough to decouple the logical argument from the intuitive argument in these sorts of discussions. That is what I'm trying to do with the coin flip example. How would the AS principle apply to a run of heads in a long sequence of fair coin flips, without a priori knowledge whether the coin is fair or not?EricMH
November 23, 2018 at 12:34 PM PDT
EMH, pardon but the world is not a sequence, here we are speaking of observed reality extending across space and time, evidently starting with a bang some 13.85 BYA. Second, it is not plausible that a Turing machine plus driven constructor could build such a cosmos as we observe; it's not just conceptual models here, it is actual experienced reality. The attempt to deny the legitimacy of generalisation from a finite set of observations, on the so-called pessimistic induction turns on that generalisations have failed in some cases. But, not all, and I put up one not vulnerable to future observations. Its direct answer is, first, that the denial of universalisability runs into logical trouble as outlined. Next, the distinct identity of a world and its content is in part observable and identifiable at least to provisional degree, with significant reliability. So, given that there is evidence of lawlike patterns and that some may indeed be universal, we should not allow ourselves to lose confidence in reliable patterns on the mere abstract possibility that they may be erroneous. In short, science and engineering can be confident. KFkairosfocus
November 23, 2018 at 12:24 PM PDT
@KF, ok, I think I get it. Sounds like the standard response to relativism, "is it true there is no truth?" showing relativism is self contradicting. However, it is unclear how this transfers to empirical modeling and prediction. If the world is a random sequence, then any appearance of a 'law of nature' is pure happenstance, and cannot be generalized. We thus cannot infer from perceived regularity that the world is not a random sequence, which is Hume's argument. I don't see how AS gets us out of the dilemma. For example, if I apply AS principle to a long sequence of random coinflips, then a run of heads is bound to show up. Wouldn't AS require me to assume the run is a law, and predict the next coin flip will be heads? The one way I can make sense of this principle is that in a truly random sequence all models are equally useless, so we don't lose anything by inferring order where there is none. On the other hand, if the sequence is non random, then we do lose out by not inferring order where there is some. So, it is kind of a Pascal's wager approach to induction. We never actually know to any degree whether induction is valid, and the only way we lose is when it is valid and we assume it is not. But, this does not sound like what AS is saying, since he seems to think there is an absolute law, not a gambler's wager.EricMH
November 23, 2018 at 10:34 AM PDT
EMH, Avi Sion speaks to the logical import of suggesting a world with no universal uniformities. But on looking again at the suggestion; oops. The lack of universalities [say ~U] is inadvertently self referential and would have to hold across the world. It would be a case of U. Self-contradiction, so U is undeniable, the world necessarily has universal properties. (BTW, this also has nothing to do with generalising on random sequences, which do not exhaust the world; a better candidate is whether apparent laws of nature arrived at by observation of several cases are in fact universal.) So, the onward question is, to identify (at least provisionally) cases of such. In the above, I suggest one: that we may make mistakes with [inductive] generalisations, M. If we try a denial ~M, it is again self-referential and counters itself. This is also a case of arriving at a universal, undeniable property inductively as we know of the possibility of failure through actual cases. Most famously, the breakdown of classical Newtonian Physics from 1879 to 1930 or thereabouts. KF PS: Let me clip AS: "can there be a world without any ‘uniformities’? A world of universal difference, with no two things the same in any respect whatever is unthinkable. Why? Because to so characterize the world would itself be an appeal to uniformity. [--> Conclusion:] A uniformly non-uniform world is a contradiction in terms."kairosfocus
November 23, 2018 at 09:52 AM PDT
This premise does not seem especially strong: > A uniformly non-uniform world is a contradiction in terms. Is the word 'uniform' the same in both cases? It seems like if we take this principle to a logical conclusion, then a completely random sequence should contain uniformity that we can generalize from, but that is false.EricMH
November 23, 2018 at 09:16 AM PDT
Jawa, it seems logic and its first principles are at steep discount nowadays. I have felt strongly impressed that we need to look at key facets of argument which are antecedent to specifics of the case. It turns out that analogy is acknowledged as foundational to reasoning (and so to the warranting of knowledge), and that it is rooted in the principle of identity. Now, we see that induction -- which, despite dismissive talk-points to the contrary is fundamental to science -- is also similarly rooted. KFkairosfocus
November 23, 2018 at 08:56 AM PDT
Another timely, refreshing review of fundamental concepts that are indispensable for serious discussions. Thanks.jawa
November 23, 2018 at 05:00 AM PDT
Logic and First Principles: How could Induction ever work? (Identity and universality in action . . . including on the design inference)kairosfocus
November 23, 2018 at 02:08 AM PDT