Uncommon Descent Serving The Intelligent Design Community

Functional information defined

Categories
Intelligent Design

What is function? What is functional information? Can it be measured?

Let’s try to clarify those points a little.

Function is often a controversial concept. It is one of those things that everybody apparently understands, but nobody dares to define. So it happens that, as soon as you try to use the concept in some line of reasoning, your kind interlocutor immediately stops you at the beginning with the following smart request: “Yes, but what is function? How can you define it?”

So, I will try to define it.

A premise: as we are debating empirical science, not philosophy, we need to adhere to what can be observed. So, in defining function, we must stick to objects and events: in a word, facts.

That’s what I will do.

But, as usual, I will include in my list of observables conscious beings, and in particular humans, together with all the observable processes which take place in their consciousness, including the subjective experiences of understanding and purpose. Those things cannot be defined other than as specific experiences which happen in a conscious being, and which we all understand because we observe them in ourselves.

That said, I will begin by introducing two slightly different, but connected, concepts:

a) A function (for an object)

b) A functionality (in a material object)

I define a function for an object as follows:

a) If a conscious observer connects some observed object to some possible desired result which can be obtained using the object in a context, then we say that the conscious observer conceives of a function for that object.

b) If an object can objectively be used by a conscious observer to obtain some specific desired result in a certain context, according to the conceived function, then we say that the object has objective functionality, referred to the specific conceived function.

The purpose of this distinction should be clear, but I will state it explicitly just the same: a function is a conception of a conscious being. It does not exist in the material world outside of us, but it does exist in our subjective experience. Objective functionalities, instead, are properties of material objects. But we need a conscious observer to connect an objective functionality to a consciously defined function.

Let’s take an example.

I am a conscious observer. At the beach, I see various stones. In my consciousness, I represent the desire to use a stone as a chopping tool to obtain a specific result (to chop some kind of food). And I choose one particular stone which seems to be good for that.

So we have:

a) The function: chopping food as desired. This is a conscious representation in the observer, connecting a specific stone to the desired result. The function is not in the stone, but in the observer’s consciousness.

b) The functionality in the chosen stone: that stone can be used to obtain the desired result.

So, what makes that stone “good” to obtain the result? Its properties.

First of all, being a stone. Then, being within some range of size, shape and hardness. Not every stone will do. If it is too big, or too small, or has the wrong shape, etc., it cannot be used for my purpose.

But many of them will be good.

So, let’s imagine that we have 10^6 stones on that beach, and that we try to use each of them to chop some definite food, and we classify each stone for a binary result: good – not good, defining objectively how much and how well the food must be chopped to give a “good” result. And we count the good stones.

I call the total number of stones: the Search space.

I call the total number of good stones: the Target space.

I call –log2 of the ratio Target space/Search space: Functionally Specified Information (FSI) for that function in the system of all the stones I can find on that beach. It is expressed in bits, because we take –log2 of the number.

So, for example, if 10^4 stones on the beach are good, the FSI for that function in that system is –log2 of 10^-2, that is 6.64386 bits.

What does that mean? It means that one stone out of 100 is good, in the sense we have defined, and if we randomly choose one stone on that beach, we have a probability of 0.01 (2^-6.64386) of finding a good stone.
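
As a quick check of the arithmetic above, here is a minimal Python sketch. The function name `fsi_bits` and its parameter names are assumptions made for illustration, not part of the definition; the only things taken from the text are the formula –log2(Target space/Search space) and the numbers of the beach example.

```python
import math

def fsi_bits(target_space: int, search_space: int) -> float:
    """Return -log2(target_space / search_space), in bits.

    Hypothetical helper: the name and signature are illustrative
    assumptions; only the formula comes from the post.
    """
    if search_space <= 0 or not 0 < target_space <= search_space:
        raise ValueError("need 0 < target_space <= search_space")
    # log2(search) - log2(target) equals -log2(target / search) and avoids
    # forming a very small ratio when the spaces are huge.
    return math.log2(search_space) - math.log2(target_space)

# Beach example from the post: 10^6 stones, 10^4 of them "good".
bits = fsi_bits(10**4, 10**6)
print(round(bits, 5))        # 6.64386
print(round(2 ** -bits, 6))  # 0.01
```

Working with the difference of two logarithms is only a design convenience: it gives the same value as the ratio form, but stays usable when the search space is astronomically large.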

I hope that is clear.

So, the general definitions:

c) Specification. Given a well defined set of objects (the search space), we call “specification”, in relation to that set, any explicit objective rule that can divide the set into two non-overlapping subsets: the “specified” subset (target space) and the “non-specified” subset. IOWs, a specification is any well defined rule which generates a binary partition in a well defined set of objects.

d) Functional Specification. It is a special form of specification (in the sense defined above), where the rule that specifies is of the following type: “The specified subset in this well defined set of objects includes all the objects in the set which can implement the following, well defined function…”. IOWs, a functional specification is any well defined rule which generates a binary partition in a well defined set of objects using a function defined as in a) and verifying whether the functionality, defined as in b), is present in each object of the set.

It should be clear that functional specification is a particular subset of specification. Other properties, different from function, can in principle be used to specify. But for our purposes we will stick to functional specification, as defined here.

e) The ratio Target space/Search space expresses the probability of getting an object belonging to the target space in one random search attempt, in a system where each object has the same probability of being found by a random search (that is, a system with a uniform probability of finding those objects).

f) The Functionally Specified Information (FSI) in bits is simply –log2 of that number. Please note that I imply no specific meaning of the word “information” here. We could call it any other way. What I mean is exactly what I have defined, and nothing more.

One last step. FSI is a continuous numerical value, different for each function and system. But it is possible to categorize the concept in order to have a binary variable (yes/no) for each function in a system.

So, we define a threshold (for some specific system of objects). Let’s say 30 bits. We compute different values of FSI for many different functions which can be conceived for the objects in that system. We say that those functions which have a value of FSI above the threshold we have chosen (for example, more than 30 bits) are complex. I will not discuss here how the threshold is chosen, because that is part of the application of these concepts to the design inference, which will be the object of another post.

g) Functionally Specified Complex Information is therefore a binary property defined for a function in a system by a threshold. A function, in a specific system, can be “complex” (having FSI above the threshold). In that case, we say that the function implicates FSCI in that system, and if an object observed in that system implements that function we say that the object exhibits FSCI.

h) Finally, if the function for which we use our objects is linked to a digital sequence which can be read in the object, we simply speak of digital FSCI: dFSCI.

So, FSI is a subset of SI (specified information), and dFSI is a subset of FSI. Each of these can be expressed in categorical form (complex/non-complex).
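
The threshold step in definition g) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `is_complex`, the system of 2^40 objects and the three target-space sizes are all invented for the example; only the 30-bit threshold and the "above the threshold" rule come from the text.

```python
import math

THRESHOLD_BITS = 30.0  # the example threshold mentioned in the post

def is_complex(target_space: int, search_space: int,
               threshold_bits: float = THRESHOLD_BITS) -> bool:
    """True when FSI = -log2(target/search) is strictly above the threshold.

    Hypothetical helper, named and parameterized for illustration only.
    """
    fsi = math.log2(search_space) - math.log2(target_space)
    return fsi > threshold_bits

# Invented system of 2**40 objects with three invented functions.
search_space = 2**40
targets = {"function A": 2**35, "function B": 2**5, "function C": 2**10}

for name, target in targets.items():
    label = "complex" if is_complex(target, search_space) else "not complex"
    print(name, label)
```

With these invented numbers, function A has 5 bits of FSI and function B has 35 bits, so only B is classified as complex; function C sits exactly at 30 bits and is not, because the rule asks for more than the threshold.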

Some final notes:

1) In this post, I have said nothing about design. I will discuss in a future post how these concepts can be used for a design inference, and why dFSCI is the most useful concept to infer design for biological information.

2) As you can see, I have strictly avoided discussing what information is or is not. I have used the word for a specific definition, with no general implications at all.

3) Different functionalities for different functions can be defined for the same object or set of objects. Each function will have different values of FSI. For example, a tablet computer can certainly be used as a paperweight. It can also be used to make complex computations. So, the same object has different functionalities. Obviously, the FSI will be very different for the two functions: very low for the paperweight function (any object in that range of dimensions and weight will do), and very high for the computational function (it’s not so easy to find a material object that can work as a computer).

4) Although I have used a conscious observer to define function, there is no subjectivity in the procedures. The conscious observer can define any possible function he likes. He is absolutely free. But he has to define objectively the function, and how to measure the functionality, so that everyone can objectively verify the measurement. So, there is no subjectivity in the measurements, but each measurement is referred to a specific function, objectively defined by a subject.

Comments
Piotr at #60:
As for 500 Heads, of course it’s just as unlikely as any other particular sequence.
But the point is exactly that: any other particular sequence which is too unlikely will practically never be seen as the result of chance. What we see is "non particular" sequences. It's exactly like in the second law. Please, read again my statement about gas molecules. Each ordered state (all the molecules in half the space of the container) is in no way more unlikely than each individual state where the molecules are dispersed quite equally in all the space of the container. But the states of the second type are vastly more numerous than the ordered states. That's why we never see spontaneously ordered states of a gas. You are right that 500 heads could be the result of an unfair coin. That is true. Ordered states can be the result of necessity. But, if we are sure that the coin is fair, and that the system is truly random, then we will never see 500 heads. Can you see the difference?
gpuccio
May 6, 2014, 02:05 PM PDT
Eric at #59: Yes, in this thread I am using "function" in a restricted sense, and not in the general sense of "purposeful output". So, I have distinguished between three different potential specifications:
a) Order, regularity, compressibility
b) Meaning (descriptive information according to Abel)
c) Function (prescriptive information according to Abel).
As I have tried to argue, all three can be valid specifications, but it is useful to understand their differences:
a) Order can be the result of algorithms, and not of design. So, specification by order needs special attention to exclude any known algorithmic cause. It is, in a sense, "weaker" for a design inference. As Piotr has correctly stated, a 500 heads sequence can well be the result of an unfair coin.
b) Meaning (language) and function (software, machines) are very good indicators of design. The main difference is that meaning is more "passive": a sonnet written on a piece of paper needs to be read by someone who understands it to make some difference in the outer world, while enzymes have been working for billions of years even when nobody (except maybe the designer) knew that they existed. Moreover, biological molecules are machines: they are made to do things, rather than to convey meaning. If and when we find sonnets in DNA, obviously, my statements could be falsified. :)
However, it is true that all forms of design (order, meaning, function) are purposeful actions, so in a more general sense they could be called "functions". But I believe that a more specific terminology can only help.
gpuccio
May 6, 2014, 01:57 PM PDT
EricAnderson @65 So are we or are we not dealing with design in this case?
ACGTAGTGGCGTTCTTCGACTGTTCCCAAA TTGTAACTTATTGTTTTGTGAAAATCAAAG TTATTTCTCGATCCTTTTTATGTACGTACC ATATTCTTTTAATTCTTTGGTTATTTTTCC GAAGTAGGAGTGAATAAACTTTCGTTTACG TCTTATTATTAATGATATAGCTATGCACTT TGT
Piotr
May 6, 2014, 01:29 PM PDT
Piotr @63:
. . . the lower their redundancy, and the larger the amount of information that can be packed into them (in Shannon’s terms).
Yep. And that is why so-called "Shannon information" is essentially useless for determining whether we are dealing with design or not.
Eric Anderson
May 6, 2014, 01:11 PM PDT
On the other hand, if 500 coins appeared out of thin air (in a carefully controlled and observed environment), then an alternative scientific explanation (like a quantum anomaly on a macro-scale) would be wildly speculative and ultimately unsatisfactory.
Of course it wouldn't hurt to try and replicate the experiment (preferably with a professional illusionist enlisted as a consultant). ;-)
Piotr
May 6, 2014, 11:40 AM PDT
Any old random sequence fails the design filter because it is not specified.
Actually, the fewer formal constraints on the structure of "old random sequences" (e.g. if they don't have to be periodic, palindromic, etc.), the lower their redundancy, and the larger the amount of information that can be packed into them (in Shannon's terms).
Piotr
May 6, 2014, 11:34 AM PDT
#54 Piotr, The outcome would be miraculous, but the process that generated the outcome was, by all manner of detection, completely "natural". Scientifically, there is no alternative to describe the mechanics of the event except in a traditionally materialist sense. On the other hand, if 500 coins appeared out of thin air (in a carefully controlled and observed environment), then an alternative scientific explanation (like a quantum anomaly on a macro-scale) would be wildly speculative and ultimately unsatisfactory. The "problem" some in the ID community seem to have with the first scenario is that it does not refute evolutionary theory's mechanical explanations, but complements them.
rhampton7
May 6, 2014, 11:32 AM PDT
Piotr @60: Quite true that highly repetitive sequences do not allow us to infer design, in and of themselves. They are terrible examples of how to infer design, except in very limited cases.
Why? Because it’s precisely repetitive, periodic or symmetrical structures that are easily produced by dumb mechanical processes.
Exactly. Well said. However, it is most definitely not true that, because every sequence is just as likely to happen by chance as every other sequence from a purely statistical standpoint, chance is the best answer or we can't infer design. Particularly when we see sequences that are functional, meaningful, that have independent specification apart from the odds of generating the sequence itself. The repetitive sequence fails the design filter because it is not complex. Any old random sequence fails the design filter because it is not specified. Both aspects are required.
Eric Anderson
May 6, 2014, 10:58 AM PDT
As long as you realize it wasn’t by chance, Piotr. That is the point.
If I see a highly regular repetitive sequence like ...ACACACACACACACACAC... or ...GGGGGGGGGGGG.... in a genome, the very last explanation that comes to mind is "design" or "miracle". Why? Because it's precisely repetitive, periodic or symmetrical structures that are easily produced by dumb mechanical processes. As for 500 Heads, of course it's just as unlikely as any other particular sequence. Whatever result you get from 500 flips is just as unique and "specific" as HHHHH...HHHHH. It's just us humans, with our pattern recognition skills and the perceptual bias they produce, who see regular sequences as special, and so we lump together sequences like HHHTHTTHHTTTHTTTHH or TTHTTHHHTHHHHHTHTH as "ordinary" but regard HHHHHHHHHHHHHHHHHH or HTHHTHHTHHTHHTHHTH as "extraordinary".
Piotr
May 6, 2014, 10:49 AM PDT
gpuccio @41: Thanks for your additional clarifications. A follow-up question: A sonnet written on a piece of paper presumably has a function. We might say its function is to express an idea or sentiment or thought. Its function might be to convey information. My dictionary defines "function" as "the purpose for which something was designed." In that sense, essentially everything that is designed has a "function," regardless of whether the function is more mechanical or more mental in nature. For example, this sentence has a function. You seem to be using "function" to describe mechanical or biochemical work. I'm wondering if you intend to limit it that way, and if so, is there another way to describe this mechanical work rather than using the broader word "function"?
Eric Anderson
May 6, 2014, 09:06 AM PDT
I must have forgotten to add the emphasis in the first quote. It was meant for the following paragraph: For example, we could generate a partition by dividing sequences in compressible (ordered) and non compressible. I believe that Dembski sometimes uses this concept. But being compressible is not a function, but another kind of property.
gpuccio
May 6, 2014, 09:04 AM PDT
Piotr at #54: Please, see my answer to Eric at #41:
Specification. Given a well defined set of objects (the search space), we call “specification”, in relation to that set, any explicit objective rule that can divide the set in two non overlapping subsets: the “specified” subset (target space) and the “non specified” subset. IOWs, a specification is any well defined rule which generates a binary partition in a well defined set of objects. Now, there are many possible ways to generate a binary partition in a set of objects. Defining a function for those objects (something for which they can be used) is only one way. For example, we could generate a partition by dividing sequences in compressible (ordered) and non compressible. I believe that Dembski sometimes uses this concept. But being compressible is not a function, but another kind of property. With Piotr, I have made an example based on meaning, and I have introduced a distinction between descriptive information (meaning) and prescriptive information (function). I take that distinction from Abel, and I find it very useful. There is a subtle difference between meaning and function, even if they are strictly linked, and even if both can be used to specify. A meaning is in the object (let’s say a sonnet on a sheet of paper), but if nobody understands it, nothing happens. A function is in the object (let’s say a machine), and if the machine is not working, nothing happens. But if the machine is working, a definite result happens in the outer world, even if nobody is there to understand and recognize it. A sonnet on a sheet of paper is inert. It can only be understood by a conscious being. A working machine works. It needs not a conscious being to work, even if a conscious being was necessary to build it. We detect enzymatic activities even when we don’t know anything about the protein, its sequence and how it works. We can just see the results.
Emphasis added. And to InVivoVeritas at #48:
The concept of function is specially apt to be a tool to detect design, because it is obviously connected to one of the two fundamental experiences of conscious beings: purpose. Meaning is the other fundamental conscious experience, and it is equally good to detect design, but it is more appropriate for objects with descriptive information (language). Function is the natural specification for software and biological molecules. Instead, specification based on compressibility, while valid, is less useful in our contexts as a design detection tool. Compressibility can be connected to conscious experiences, but the link is less obvious. Moreover, order and compressibility have another hindrance: in appropriate contexts, they can be generated by necessity algorithms. This is an aspect I have not yet discussed in this thread, but it is obviously part of the design detection procedure (excluding necessity).
Emphasis added. You may be aware that excluding a necessity (algorithmic) explanation has been an integral part of ID from the beginning: it's already there in Dembski's explanatory filter. Complexity due to order is often (but not always) generated by algorithms. That is not true of complexity due to function (prescriptive information) or to meaning (descriptive information). Those types of complexity are scarcely compressible, and cannot be generated by simple algorithms. I have not dealt with this part in detail, in this thread, but it is an important part. It includes explaining why protein sequences can never be generated by NS acting on RV.
gpuccio
May 6, 2014, 09:02 AM PDT
CuriousCat: I think I answered many of the darwinist objections you mention in my post #23. Regarding folding, in the SCOP 2.03 classification there are 1194 independent folds, 1961 superfamilies, 4496 families. While folds are the fundamental grouping, superfamilies are still a completely sequence-isolated grouping, based mostly on structure and function. That's why I usually refer to them. Folds would be good too, and probably also families, although between families you could sometimes find some possible vague evolutionary connection. There is no absolute connection between the length of a sequence and a p value. A p value must be referred to some definite result, for which we can build H0 and some alternative hypothesis. So, if we have a 200 bit sequence, there is no p value about it in itself. If we get that sequence randomly, it's fine. It is just a sequence that we can get randomly. The probability of getting a generic 200 bit sequence by 200 random events each generating 1 bit is 1. But, if we have a 200 bit sequence and we ask for the probability of generating that sequence after we have defined it, it is 1:2^200. This is an example of pre-specification. The sequence has nothing peculiar, but it becomes peculiar because we know it in advance. And if we ask the probability of getting a sequence of 200 1s or 200 0s, the probability is 2:2^200, always. Here, it's not important whether we define the peculiarity before or after. The peculiarity is there anyway. Please, see my post #29 to Piotr:
There are many special numbers, but you will never get the first, say, 100 digits of any of them by chance. There are many ways in which all the molecules of a gas in a container could stay in half the space of the container, leaving absolute void in the other half of the container. But that will never happen, as the second law of thermodynamics tells us. Why? Because there are so many more, hugely many more, ways in which the molecules are diffused almost equally in all the space of the container. And there are so many more, hugely many more, real numbers which are not special at all. As the search space increases exponentially, there is no chance at all that the target space linked to order, function or meaning can increase adequately. The probabilities of ordered or functional states diminish inexorably, and very early they become an empirical impossibility.
gpuccio
May 6, 2014, 08:51 AM PDT
As long as you realize it wasn't by chance, Piotr. That is the point.
Joe
May 6, 2014, 07:57 AM PDT
Actually, I remember a discussion here, where Darwinists were not persuaded by 500 coins all Heads as some special event, and they said this is just a sequence among 2^500.
I wouldn't say so. I would treat such a result as proof that the coin isn't fair (and I'd hypothesise that it most likely has Heads on either side). I'd also think of ruling out the possibility of an illusion trick. An apparently unlikely result may have a mundane explanation. Heads in 500 consecutive fair flips would qualify as a putative miracle -- but then I've never seen any such thing happen.
Piotr
May 6, 2014, 07:31 AM PDT
gpuccio:
If there are only a few sequences that reach the exit, and the rat easily finds the exit, then we can reject the Hypothesis that it is moving randomly (if that is our null hypothesis).
EXACTLY! If there are ONLY A FEW sequences that reach the exit, then we reject the null hypothesis that the rat is moving randomly. However, the way a Darwinist thinks is different from that of a person who sees the universe in a teleological perspective. He or she would respond to this hypothesis test by asking: how do we know that these are the only paths? Actually, I remember a discussion here, where Darwinists were not persuaded by 500 coins all Heads as some special event, and they said this is just a sequence among 2^500. As a molecular biology analogy, they may have a point (on the other hand, for a real life engineering situation this thinking is sheer stupidity). It is humans that give a special meaning to the all-heads case, since this is not something randomly occurring in our daily lives. If we try to "isolate" a certain event from its surroundings and test its randomness, it will be a biased experiment and analysis. That's also the issue with the P-value getting smaller (which I mentioned earlier): the longer the sequence, the lower the probability of obtaining a specific sequence (0.5^500), but it may be us who give a meaning to a specific sequence. Now, you may say: what if this specific sequence is functional (objective function)? Say that we toss a die 500 times and the resulting sequence somehow creates a key and this key opens a door. Can't we now do the hypothesis test? Still no, unless we know that tossing other sequences will not open a door. We have but one advantage over Darwinism: biology tells us that most of the other sequences will not open a door, and will not have a function. However, we have another problem. A Darwinist may argue that those doors are not aligned side by side, but nested one inside another. So other doors (call them A1, A2, ..) are opened when you open the first one (call it A). And if you happen to open another door (call it B) in the first instance, different doors (B1, B2, ..) may be waiting for you.
In summary, objective functions have not been clearly defined previously (how could they be, for a random process?), but the functions themselves are evolving (or formed) as sequences are evolving. So we're back to square one, because a single sequence having a function, or a series of sequences having functions, cannot be tested, because the functions themselves do not preexist (I'm still in a Darwinist's shoes; in my own shoes, I believe they preexist). I think it was Stephen Gould who said something like: it is only due to chance that humans, instead of dolphins or other creatures, rule the earth (this might be an awful quotation, but I cannot find the exact words right now), which shows that the path and existing status of sequences and functions could have been different. If this is the case (or the dominating view in science), then I do not think we are justified in the hypothesis test we are arguing about above. On the other hand, you already suggested a solution here. You said ~2000 functional protein families exist. The reason why I think functionality here could be replaced by "folds" (and that may be the reason why Axe based his study on folds) is that the coin tossing sequence should first produce a key (fold), so that it may open a door (have a function). Now, we see that a fold is a preexisting entity (in a Platonic sense), which depends only on the fundamental laws of nature. Though function itself is like a fluid, whose existence depends on the existence of other functions and may change (again from a Darwinist perspective), a fold is not. Please go back to what Piotr asked: how do you determine whether the protein has any kind of “functionality”? How do you define your target space? This means (in my opinion): in the current laboratory environment, the protein may not have a function. It may not currently have any function in the whole of nature. However, in 1000 or so years, it may have a function. Or in an alternative path that could have been taken by evolution, it would have a function.
So a random non-functional sequence would be functional in an alternative evolutionary history. I think I have written too much, and I may have bored people (if anyone bothered to read up to this point), so I'll stop here and not say any more in this discussion. We'll probably continue in another topic, gpuccio. I must say I really like the atmosphere of the discussion board here :) Erratum: In the previous post I wrote 2-d code. It should be of course 1-d code. One last unscientific point: I find the existence of a protein fold a miracle; the connection between a protein and its function another miracle.
CuriousCat
May 6, 2014, 07:04 AM PDT
aqeels: I am honored by your appreciation.
The best bit about it is that for any given protein we don’t even need to work out all of the “functions”; on the contrary, we just need to find one unambiguous function.
That's it! By leaving the observer completely free to define any function he likes, we are no longer interested in how many different functions can be found. One single function which is also complex will be enough for the design inference.
gpuccio
May 6, 2014, 06:46 AM PDT
Great post, gpuccio. For what it is worth, the concept of dFSCI has always been very clear to me and IMHO it is the most productive version of the specified complexity argument. The best bit about it is that for any given protein we don't even need to work out all of the "functions"; on the contrary, we just need to find one unambiguous function. Eric:
It seems the only distinction here is that the specification wasn’t really very specified? Meaning, if we loosely and vaguely define our specification then it might not be specific enough to clearly articulate the function. But if we then go on to better articulate the details, the function emerges, almost by definition.
Well said. That is precisely the distinction that dFSCI is trying to make, and that is why it will be the most productive approach in demonstrating the implausibility of what the non-design proponents are trying to say...
aqeels
May 6, 2014, 06:27 AM PDT
CuriousCat: Thank you for your very good contributions, both indirect and direct. :) A few comments.
Say that we have a maze, in which we put a rat. Rat may choose left (L) or right (R) directions. We leave the rat, it goes LRLLRRLLLLLRRRRRLLLRRRRR (make it as long as you want) and finds the exit. How can we test the hypothesis that the movements (choosing L or R) of the rat is random or not? One way is to assume say that Ho: LRLLRRLLLLLRRRRRLLLRRRRR pattern is random vs. H1: LRLLRRLLLLLRRRRRLLLRRRRR pattern is NOT random. When you find the P-value for this specific pattern, it’s going to come out …. (whatever). The thing is as the sequence gets larger larger, P-value will get smaller, so we reject the randomly moving rat hypothesis.
I am not sure I understand what you mean here. Why would the p value get smaller? How are you getting a p value here? What is the H0 hypothesis? If there are only a few sequences that reach the exit, and the rat easily finds the exit, then we can reject the Hypothesis that it is moving randomly (if that is our null hypothesis). We still have to try to explain how the rat found the exit route: IOWs, rejecting the null hypothesis does not automatically support an alternative hypothesis. That is another methodological problem which is often misunderstood. IOPWs, rejecting that null hypothesis just means rejecting that the rat is moving randomly, but does not explain automatically how it finds the route. The only reason why a longer pattern would be "less random" is that, if the rat finds the route, the longer the way, the lower is the probability that he finds the route by chance. Therefore, if the way is long enough, we can safely reject the null hypothesis of a random movement. If there are only 3 binary nodes, and only one sequence finds the exit, we have 2^3 = 8 possible sequences, and the probability of the rat finding the exit by chance is 1:8, 0.125. Rejecting the null hypothesis does not make sense. But is there are 100 nodes, and still the rat finds the exit in one attempt, the probability of that is about 1e-30. If still the rat finds the exit, I would definitely reject the null hypothesis. The rat is not moving randomly. The scenario is similar to the problem of protein function. Overfitting has nothing to do with this situation. Observing the result after it has happened has no relevance. Finding the exit is a well defined special result, and nothing changes if we define it before or after it happens. What other special result could the poor rat get? Flying? General considerations on life have no relevance too. 
Finding a 300 AA long protein which is an enzyme, and which accelerates a reaction that in nature is extremely difficult and slow, or just would not happen, is like finding the exit with hundreds of nodes. It will never happen by chance. The null hypothesis of chance can very safely be rejected.
gpuccio
May 6, 2014, 04:25 AM PDT
It seems that I made an indirect contribution (while sleeping :)) to the rest of this interesting discussion, through a previous post about p-values linked by Mung. Very briefly, I should say that I agree with gpuccio on the use of P-values in the scenario related to proteins. However, I emphasize the importance of uniqueness once again, and I slightly disagree with gpuccio, who says that this has nothing to do with design inference. I do not want to be tedious, but I believe this is the heart of the controversy between the Theistic and Darwinian views, so I'll try to explain it briefly, using the example of Feynman.
Say that we have a maze, in which we put a rat. The rat may choose left (L) or right (R) directions. We leave the rat, it goes LRLLRRLLLLLRRRRRLLLRRRRR (make it as long as you want) and finds the exit. How can we test the hypothesis that the movements (choosing L or R) of the rat are random or not? One way is to assume, say, that Ho: LRLLRRLLLLLRRRRRLLLRRRRR pattern is random vs. H1: LRLLRRLLLLLRRRRRLLLRRRRR pattern is NOT random. When you find the P-value for this specific pattern, it's going to come out .... (whatever). The thing is, as the sequence gets larger and larger, the P-value will get smaller, so we reject the randomly moving rat hypothesis.
So, are we justified in performing the test in the manner presented above? My answer is a reserved no. The rat would of course choose some L-R pattern, and the specific chosen path would be a highly unlikely choice as the sequence gets larger. However, there are two reservations:
1. If we can show (or assume) that this is the ONLY path that leads to the exit, then this is a special path. I think gpuccio would call this path a functional path (and I agree). In this case, we MAY be justified in this hypothesis test. When it comes to the proteins-life case, this path may correspond to folded proteins, I think. That's why I think folded proteins should be the starting point for such a theory.
Life is not only a transformation of 2-d coding to function; in that case Darwinists would be more justified in their views. For instance, as Piotr pointed out, why can't we assume that any 2-d code would not work (be functional for life)? We cannot because folding (at least under some conditions) is a MUST. Hence, life is 2-d code -> 3-d structures -> life, it is not something like "anything goes". So the paths which lead to life (as we know it to this day) consist of a finite number of functional entities. This is why I insist on the uniqueness of paths leading to life. Otherwise, the hypothesis test presented above would not work, in my opinion.
2. The other is more controversial and maybe irrelevant to the current topic, but may be relevant to the general ID-Darwinism controversy. Let's say that the shortest path to the exit is LRLLLRR. However, the mouse makes a couple of wrong turns, does LRLLLRLRR (an additional LR added) but finds the exit in the end. Can we still use the above hypothesis test? Not directly, but I guess we may in the following way. We may consider all the paths which do not lead to the exit, together with those which lead to the exit with some wrong turns in between, and then see where the current observation lies. For a low P-value, we would say that this is not random.
So, why do I call this controversial? Unfortunately, life is not a single exit as presented here. Say there are pieces of cheese along the way to the exit, so the immediate aim of the mouse may not be to get out of the maze but first to eat these pieces of cheese, then get out. So it may unnecessarily (from the point of view of a person who thinks exiting the maze is the first aim) visit many additional paths. This is a mistake many Darwinists make, I think. They assume that they know the mind of God (in whom they do not believe), and say that if this were a guided process, this and that would not have happened, which is actually a bad hypothesis test, in my opinion.
CuriousCat
May 6, 2014, 02:51 AM PDT
InVivoVeritas: Thank you for the kind words. And for the good questions, which allow some better clarification of aspects which I have not detailed enough. So, your 4 questions:
1) A generic specification is any kind of specification (any "rule" which generates a binary partition in the set of objects). Let's call this set of specifications S. As functional specifications are a subset of S, FS is included in S. But, if you want some specification in S which is not in FS, then you have to use some rule which is not related to a function to generate the partition. Your example for 1 is not an example of an S specification which is not an FS specification. It is an example of an incomplete FS specification. The difference between S (non FS) and FS is in the type of rule, not in the completeness of the definition of the rule. An example of an S specification which is not an FS specification, based for instance on order/compressibility, could be the following: any stone on the beach which is almost perfectly spherical (with defined limits of tolerance). That rule generates a binary partition, but is not related to a function.
2) A functional specification would be the "chopping food" rule, with enough details (what food needs to be chopped, in what context, and how well) to make the assessment of the function objective.
3) It should be clear now why the two specifications are different. Although we could define a function for a spherical stone, the specification by a geometrical form in itself makes no reference to a function, or to a specific use of that form. I believe that FS is more useful than generic specification for design inference in biology for two reasons:
a) It is the natural kind of specification for biological molecules, especially proteins: proteins are biological machines, their information is prescriptive, not descriptive.
b) The concept of function is especially apt as a tool to detect design, because it is obviously connected to one of the two fundamental experiences of conscious beings: purpose. Meaning is the other fundamental conscious experience, and it is equally good for detecting design, but it is more appropriate for objects with descriptive information (language). Function is the natural specification for software and biological molecules. Specification based on compressibility, by contrast, while valid, is less useful in our contexts as a design detection tool. Compressibility can be connected to conscious experiences, but the link is less obvious. Moreover, order and compressibility have another drawback: in appropriate contexts, they can be generated by necessity algorithms. This is an aspect I have not yet discussed in this thread, but it is obviously part of the design detection procedure (excluding necessity).
4) I think I have already clarified that the difference between S (non FS) and FS is of substance, and not of degree.
Regarding the final "questionnaire", what can I say? Being naturally humble, I hope my essay has significant value for all three of them. :)
gpuccio
May 6, 2014, 01:36 AM PDT
On the distinction between Specification (S) and Functional Specification (FS) as per Eric Anderson at # 38 and gpuccio at #41. Gpuccio, your posts always excel in their clarity, interesting perspectives and topics. On the topic at hand, I wonder if you can help me to better understand S and FS by giving us a concrete (complete) example of a Specification and a Functional Specification for the “chopping tool” scenario you used at the beginning of your post.
For the "stones on the beach" case can you tell us:
1. What would be a valid, concrete “specification”?
2. What would be a valid, concrete “functional specification”?
3. What would be common between the two, and what would be the specific difference between them?
4. Can it be argued (as Eric did) that the difference between 1 and 2 is more a matter of degree than one of substance?
If I can try my hand here for 1 and 2 above:
1. The natural language statement: “a stone appropriate to be used as a chopping tool”
2. A stone is appropriate to be used as a chopping tool if it complies with this "function" (conditions | parameters):
a. Weight greater than 1 pound
b. Weight smaller than 10 pounds
c. Made of hard material (not soft or breakable) (hardness between x and y)
d. Can be held in my hand comfortably (size: between 4 and 8 inches)
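As a rough illustration only, the parameter list above can be written as an explicit rule that splits a set of stones into a "specified" and a "non specified" subset. This is a sketch under stated assumptions: the `Stone` fields, the function names, and the hardness limits (the comment leaves x and y open, so the values 5.0 and 8.0 below are placeholders) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Stone:
    weight_lb: float   # weight in pounds
    hardness: float    # hardness on some agreed scale (assumed)
    size_in: float     # largest dimension in inches


def is_chopping_tool(stone: Stone, min_hardness: float, max_hardness: float) -> bool:
    """Conditions a-d from the list above; hardness bounds stand in for x and y."""
    return (
        1 < stone.weight_lb < 10                              # a, b
        and min_hardness <= stone.hardness <= max_hardness    # c
        and 4 <= stone.size_in <= 8                           # d
    )


def partition(stones: Iterable[Stone], rule: Callable[[Stone], bool]):
    """Apply a yes/no rule and return (specified, non_specified) lists."""
    stones = list(stones)
    specified = [s for s in stones if rule(s)]
    non_specified = [s for s in stones if not rule(s)]
    return specified, non_specified


beach = [
    Stone(weight_lb=3.0, hardness=6.5, size_in=6.0),  # meets every condition
    Stone(weight_lb=0.2, hardness=6.5, size_in=2.0),  # too light and too small
    Stone(weight_lb=5.0, hardness=2.0, size_in=7.0),  # too soft
]
# Placeholder hardness limits; the original leaves x and y unspecified.
rule = lambda s: is_chopping_tool(s, min_hardness=5.0, max_hardness=8.0)
target, rest = partition(beach, rule)
print(len(target), len(rest))
```

The point of the sketch is only that, once every condition has a number attached, any stone falls unambiguously on one side of the partition.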
You may have a better idea about particular and concrete examples of Specification and Functional Specification for your "stones on the beach" scenario. I am wondering if you have time and inspiration to give concrete examples of an S and the corresponding FS for another, non-trivial example/case. (Is Behe’s mouse trap too complex a case?) On a different line, I would like your thoughts on the following question:
Where do you see the most significant value of your essay about Functional Information?
a. Conceptual value (clearly defines and demarcates fundamental concepts for the domain)
b. Theoretical value (sets the basis of a coherent theory with correspondence to reality)
c. Pragmatic value (provides specific approaches and formulas to compute FSI, FSCI, dFSCI, etc., detect design, etc.)
InVivoVeritas
May 6, 2014, 12:21 AM PDT
Mung: CuriousCat is perfectly right:
"P-value is not the probability of null hypothesis being correct given the current data. It is the probability of obtaining the current data given that null hypothesis is correct."
The correct definition of p in hypothesis testing is the first thing I try to explain to young medical doctors as soon as I can. It is perfectly true that the number of medical doctors who correctly understand that definition is... (no, I will not say it! :) )
"Experimenters in psychology and (in many cases) biology misuse (or misinterpret) this subtlety in the meaning of P-value, and take a P-value smaller than 0.05 as an indication of the improbability of the null hypothesis being true and reject it in favor of the alternative hypothesis. Since it is usually the alternative hypothesis that draws attention in the scientific community and makes the research publishable, researchers collect many data points and “filter” them to get a magical P-value smaller than 0.05!"
True. The use of statistics in medicine is often embarrassing, sometimes shameful. But it is possible to use it well. And many times it is used well, even in medicine. However, all that has nothing to do with our design inferences, which are a good example of a good use of statistics. And they are certainly not based on "a magical P-value smaller than 0.05" :)
gpuccio
May 5, 2014, 10:20 PM PDT
gpuccio:
When you get p values of the order of 1e-10 or 1e-20, overfitting is certainly not your major concern.
CuriousCat on P-value
Mung
May 5, 2014, 06:22 PM PDT
Eric:
Z, has more specification by definition, because it is more specific.
Specification By Example :)
Mung
May 5, 2014, 06:17 PM PDT
gpuccio:
CuriousCat: Thank you for your intervention, and fur your very good and thoughtful objections.
There, I fixed it fur ya!
Mung
May 5, 2014, 06:02 PM PDT
CuriousCat: Thank you for the link. Being involved in statistical analysis all the time, I am well aware of the problem of overfitting. It is certainly a serious problem, and one often not considered enough, especially in medical literature. However, it is a problem that can be solved, and I don't think it is really relevant here.
Here, we are not modeling some result by many variables, so that random noise in those variables can be considered as a true effect if we use a dubious threshold for significance. Here, we are just rejecting the null hypothesis that random noise can generate a very powerful, objective effect that we really observe, and we reject that hypothesis because its improbability is amazing, tens and hundreds of orders of magnitude beyond any conventional threshold. When you get p values of the order of 1e-10 or 1e-20, overfitting is certainly not your major concern.
And the effect of functionality in an enzyme is certainly not a false effect, generated by random noise. The effect is there, as big as the sun. Thousands of functional proteins are not a false effect. If the appearance of design were a false effect of random noise, it would not appear in 2000 different and isolated systems. Functional information is certainly not the result of overfitting in our analysis. It must be explained, either by neo-Darwinism (which can't do it) or by design. Those are the only two games in town. One of them must be true. Guess which?
gpuccio
May 5, 2014, 02:03 PM PDT
Eric: Probably, I have been too quick about the problem of specification, just to avoid being too long in the OP. I will try to clarify better. I have written:
Specification. Given a well defined set of objects (the search space), we call “specification”, in relation to that set, any explicit objective rule that can divide the set into two non overlapping subsets: the “specified” subset (target space) and the “non specified” subset. IOWs, a specification is any well defined rule which generates a binary partition in a well defined set of objects.
Now, there are many possible ways to generate a binary partition in a set of objects. Defining a function for those objects (something for which they can be used) is only one way. For example, we could generate a partition by dividing sequences into compressible (ordered) and non compressible. I believe that Dembski sometimes uses this concept. But being compressible is not a function, but another kind of property.
With Piotr, I have given an example based on meaning, and I have introduced a distinction between descriptive information (meaning) and prescriptive information (function). I take that distinction from Abel, and I find it very useful. There is a subtle difference between meaning and function, even if they are strictly linked, and even if both can be used to specify. A meaning is in the object (let's say a sonnet on a sheet of paper), but if nobody understands it, nothing happens. A function is in the object (let's say a machine), and if the machine is not working, nothing happens. But if the machine is working, a definite result happens in the outer world, even if nobody is there to understand and recognize it. A sonnet on a sheet of paper is inert. It can only be understood by a conscious being. A working machine works. It does not need a conscious being to work, even if a conscious being was necessary to build it. We detect enzymatic activities even when we don't know anything about the protein, its sequence and how it works. We can just see the results.
So, when I say that functional specification is a subset of specification, I don't mean that specification is vague, and functional specification is more detailed. Not at all. I mean that functional specification is specification by a function, while meaning based specification is a specification based on meaning, and could be good for language. And specification by compressibility can be good to separate ordered sequences from non ordered sequences. It's not a question of detail, but of what we use to generate a binary partition of the set. But in all cases, the specification must be clear, detailed and objective. Otherwise, no reasoning can be done. I hope that clarifies my views better.
gpuccio
May 5, 2014, 01:38 PM PDT
gpuccio, I have been thinking about this example, but I could not remember where I read it. Going through a couple of statistics papers, I've found it!! Here's the link: http://library.mpib-berlin.mpg.de/ft/gg/GG_Mindless_2004.pdf Starting from the last paragraph on page 602 and through 603, the author tells a story attributed to Feynman. Though I now agree that the way you define functionality is unlike the case mentioned here (the before vs. after the experiment case), that may be a kind of response you will encounter, so I thought maybe you would like to take a look at it. Have a nice day.
CuriousCat
May 5, 2014, 01:32 PM PDT
CuriousCat: OK, I agree with your comments, but remember: even if completely different forms of life were possible on other planets, that would not help explain how a new protein structure arises here, in a cell which is already based on our biochemistry. As I said, the existence of a complex cellular environment severely limits the number of useful new solutions in that environment. IOWs, you not only need to write a new software procedure, but you also have to make one which may be useful in Windows 8, and compatible with the existing code!
gpuccio
May 5, 2014, 11:59 AM PDT