There is a wider lesson to be learned from this case study. To maintain its integrity, the field of evolutionary computing needs to institute the following control: all computer simulations of evolutionary search must make explicit (1) a numerical measure of the inherent difficulty of the problem to be solved (i.e., the endogenous information) and (2) a numerical measure of the information contributed by the search structure toward solving the problem (i.e., the active information). This control acts as a safeguard against inflated claims for the information-generating power of evolutionary algorithms.

What controls might have kept Marks and Dembski from making inflated claims of their own remain unspecified. "2ndclass" went first class, though, in giving Robert Marks the first chance to deal with the obviously erroneous claims made in the essay. Here is his email to Marks, released by permission:

Date: Wed, 26 Sep 2007 10:08:19 -0600
To: "Marks, Robert J."
Subject: Serious error in the paper, "Unacknowledged Information Costs"

Dr. Marks,

A few months ago I asked you some questions about your work in the Evolutionary Informatics Lab, and you graciously took the time to respond, which I appreciate. The point of this email is to give you a heads up on a major problem in your response to Tom Schneider's ev paper. The upshot is that your empirically determined value for p_S is off by many orders of magnitude, and the cause of the discrepancy is a bug in Histogram.m. I'm telling you this in advance of the problem being reported by ID opponents, in case you would like to report the problem yourself first. I have sent the following summary to others for verification, and I expect the problem to be reported shortly.

Furthermore, there are easy ways to test your reported value for p_S that demonstrate it to be wrong, so I expect that others will eventually take notice. For instance, Tom Schneider's site has a Java GUI version of ev that allows you to change parameters such as population size. Changing the population size to 4390 should result in 10 perfect initial organisms if p_S = 1/439, but it doesn't, even if you tweak the code so the target is the same as Schneider's.

Obviously, this problem doesn't invalidate the work of the EIL. The objective in reporting this problem will be to show that nobody has carefully read the paper in question, not even the ID proponents at Uncommon Descent who are making grandiose claims about the EIL.

Regards,
*************

*** START OF SUMMARY ***

This is a quick response to a paper by William Dembski and Robert Marks, "Unacknowledged Information Costs in Evolutionary Computing: A Case Study on the Evolution of Nucleotide Binding Sites," which is a response to Tom Schneider's paper, "Evolution of Biological Information." I explain why their empirically determined value for p_S cannot possibly be correct, I point out the bug that led to their erroneous value, and I show how to come up with a better estimate.

D&M's reported value of .00228 for p_S should raise questions in the minds of readers. Why would a targeted evolutionary algorithm require 100 times as many queries as random sampling? (See the last paragraph in section 1 of D&M's paper.) Why would random sampling be able to find the target in 439 queries while parallel random walks, starting at the target, are unable to find the target again in 64000 queries? (See the random walk without selection starting at generation 1000 in Figure 2a of Schneider's paper.)

But the surest evidence of a problem is Figure 3 in D&M's paper, which shows that the perceptron heavily favors vectors with very few binding sites or very few non-binding sites. D&M legitimately use this fact to explain why p_S is higher than 2^-131, but the same fact also puts a limit on how high p_S can be. For example, if every vector with 15 binding sites is more probable than any vector with 16 binding sites, then p_S for a vector with 16 binding sites can't possibly be 1/439. There are {131 \choose 15} = 2e19 unique vectors with 15 binding sites, and it's impossible for each of those vectors to have a probability of more than 1/439. p_S has to be lower than 2e-19 if every vector with 15 binding sites has a probability higher than p_S.

To find the actual value of p_S, we first look at the bug that resulted in the erroneous value. If we look at the vector t after running Histogram.m, we see that it consists of all 1's, while the obvious intent of lines 25-27 is that it contain 1's only at the targeted binding sites and zeros elsewhere. The problem is obviously in line 24. (In Octave, there is also a problem in constructing the vertical t vector with a horizontal targ vector. I don't know if this is also the case for MATLAB.) Once these problems are fixed and we ensure that t is constructed correctly, the resulting histogram shows a p_S of zero even if we run a billion generations, so it would seem that p_S is too small to be determined empirically.

But it can be roughly determined indirectly. The histogram of errors in Figure 2 of D&M's paper is useful in that it accurately represents the case in which every site is a targeted binding site. By symmetry, this is equivalent to the histogram for the case in which there are NO targeted binding sites. In the latter case, every positive is a false positive, so the number of errors is the number of binding sites recognized by the perceptron. From the histogram, it appears that about .6% of the queries yield vectors with 16 binding sites, which can be verified by looking at histt(17) after running the script. There are {131 \choose 16} = 1.4e20 unique vectors with 16 binding sites, so the average p_S for these vectors is .006/1.4e20 = 4e-23. IF p_S for Schneider's target vector is about average for all vectors with 16 binding sites, then it should be close to 4e-23. That's a big IF, but it's a premise that can be tested in a scaled-down problem.

I scaled the problem down to 32 potential binding sites, ran a billion random queries, and looked at the counts for only the outcomes that had exactly 4 binding sites. Upon looking at these vectors, sorted in order of frequency, a clear pattern emerged: the vectors in which the binding sites overlap significantly were found at both extremes (which differed by less than an order of magnitude), while the vectors with little or no overlap were found in the middle, very close to the average frequency. Assuming that this pattern holds when the problem isn't scaled down, p_S for Schneider's non-overlapping target vector should be pretty close to the average p_S of all vectors with 16 binding sites, which is 4e-23, or 2^-74.

Even if this estimate is off by a large margin, the true p_S is obviously much smaller than the value reported by D&M, which invalidates some of D&M's conclusions and leaves others intact. The conclusion that Schneider's evolutionary algorithm is less efficient than blind search is false. The fact remains that the perceptron introduces "active information", which I find insightful, although the active information is much less than D&M claim (if my estimate for p_S is in the right ballpark, 57 bits as opposed to 122 bits).

*** END OF SUMMARY ***

The PDF of the essay was taken down from the "evolutionaryinformatics.org" site shortly thereafter. However, beyond that reaction to the news that the essay's conclusions have a basis somewhat less stable than Birnam Wood, Marks has not taken advantage of the opportunity provided by "2ndclass" to publicly retract the findings.

Tom Schneider's critique of the criticism, though, appeared to have less impact on Marks and Dembski. Schneider responded to various criticisms on August 3rd and 4th. Back on August 12, 2007, Schneider had this response on his "ev" blog:

2007 Aug 12. In "Unacknowledged Information Costs in Evolutionary Computing: A Case Study on the Evolution of Nucleotide Binding Sites" Dembski claims that "Using repeated random sampling of the perceptron, inversion is therefore expected in 1/pS = 439 queries." That is, according to Dembski, random generation of sequences should give a solution in which there exists a creature with zero mistakes more quickly than the Ev program running with selection on. If this were true, then random generation of genomes would be more efficient than natural selection, which takes 675 generations for Rs to exceed Rf in the standard run. While anyone familiar with natural selection will immediately sense something wrong with Dembski's numbers, it is reasonable to test his 'hypothesis'. For the discussion of 2007 Aug 03 I used the running Ev program and counted each generation. This was a little unfair because there was only one mutation per generation. This might make a tighter Rs distribution around zero bits because each generation is similar to the last one. Thus the estimate may be too high. A more fair test is to generate a completely new random genome each time. What is the distribution? For 100 independent generations, the mean was -0.02114 bits and the standard deviation was 0.26227 bits. This is essentially the same standard deviation as before! So the probability of getting 4 bits is, again, 4/0.26227 = 1.11x10^-16. Considering 439 queries as 3 orders of magnitude, Dembski's estimate is off by about 13 orders of magnitude.

As Schneider notes, Dembski has been criticizing Schneider's work and person publicly based on the bogus data from the busted MATLAB script.

While it is good that Marks and Dembski have begun the process of removing false claims about "ev" from their websites, that process is still incomplete and leaves other commentary standing. For example, another paper still available on their website relies upon the bogus "ev" critique in making claims:

Using the information rich structure of ev, a random query will thus generate a successful result with the relatively large probability of pS = 0.00228. Interestingly, ev uses an evolutionary program to find the target solution with a smaller probability than is achievable with random search. Ev therefore takes poor advantage of the underlying search structure, inducing negative active information with respect to it. With the adoption of the method sketched in this paper for assessing the information costs incurred by search algorithms, such improper claims for the power of evolutionary search would find their way into the literature less often.

The process will not be complete until an effective retraction of the claims is made such that the people who heard the false information from Marks and Dembski are likely to also have heard of its demise, and any resulting publication notes the problem in the first attempt and credits those who put in the effort that Marks and Dembski did not.
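The arithmetic in 2ndclass's summary is easy to replay. Below is a minimal sketch assuming only the figures quoted above (131 potential sites, the roughly 0.6% histogram mass at 16 binding sites, and D&M's p_S = 1/439); it is a sanity check, not the EIL's code.

```python
from math import comb, log2

n = 131                      # potential binding sites, per the summary

# Unique vectors with exactly 15 and 16 binding sites:
v15 = comb(n, 15)            # ~2e19, as the summary states
v16 = comb(n, 16)            # ~1.4e20

# If every 15-site vector is more probable than any 16-site vector,
# p_S for a 16-site vector cannot exceed 1/v15 -- nowhere near 1/439.
upper_bound = 1 / v15        # ~5e-20

# Average p_S over 16-site vectors, from the ~0.6% histogram mass:
p_s = 0.006 / v16            # ~4e-23, i.e. roughly 2^-74

# Active information = endogenous information + log2(p_S):
active_dm = n - log2(439)    # D&M's claimed figure, ~122 bits
active_fixed = n + log2(p_s) # corrected estimate, ~57 bits
```

Both the 122-bit figure D&M report and the 57-bit corrected figure in the summary drop out of the same active-information formula; only the value of p_S differs.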
51 Comments
David Stanton · 9 October 2007
I am sure that we will get a full and complete apology, Dembski style, and that all those involved will receive proper acknowledgement when the article is finally published in the DI newsletter.
But seriously, if all of the flawed creationist argument were retracted when their errors were pointed out to them, there would be none left. Oh well, at least maybe now Dembski will quit whining that mathematicians don't pay any attention to him.
Of course, this also raises the possibility that the "errors" were deliberate and that they were hoping that no one would notice. This is a common creationist tactic. After all, as Leonard McCoy once said: "I find that evil usually wins, unless good is very very careful". There is no way to prove intention here, but the reaction to criticism may be at least an indicator of the original intent.
millipj · 9 October 2007
Wesley R. Elsberry · 9 October 2007
Wesley R. Elsberry · 9 October 2007
David vun Kannon · 9 October 2007
Which document were you quoting to show that references to the faulty conclusions were still in circulation? I know they are in the slides of a keynote presentation, which is where they first caught my eye.
I look forward to seeing a revised and improved paper.
Wesley R. Elsberry · 9 October 2007
This one, which is still prominently listed on "evolutionaryinformatics.org" as a "Publication", though with the parenthetic notice of "(in review)".
Wesley R. Elsberry · 9 October 2007
David, I think that you may be in for a long wait. The entire premise of the "ev" critique evaporates if one uses a value for "p_s" that is within a centi-dembski (a "dembski" here being an error of 65 orders of magnitude) of the actual value: evolutionary computation is more effective for the "ev" scenario than blind search, and contributes a substantial fraction of the information seen in the result. An honest revision would not support Dembski's agenda, and therefore I am not expecting Dembski to ever publish a revised version. You may color me surprised if it happens.
Wesley R. Elsberry · 9 October 2007
Of course, looking at another of the papers that is listed on the EIL site, I noticed that it includes a clear error that I informed Dembski of long ago.
In fact, today is the seventh anniversary of the unregarded notification of that error. This is the standard for unacknowledged errors that Dembski has set. Time will tell as to whether Robert Marks will be an apt pupil...
secondclass · 9 October 2007
Happy Anniversary, Wesley.
As Dembski misunderstood Dawkins' WEASEL problem, he and Marks also misunderstood Schneider's ev. They assumed that only the last 131 bases were candidates for recognition. If you run the java version of ev (downloadable from Schneider's site) and you turn off selection, it's immediately apparent that this assumption is false. All bases in the sequence are subject to recognition, so M&D's error histogram should go from 0 to 256, not 0 to 131.
IOW, they have more problems to fix than just the bugs in the scripts.
Mike Elzinga · 9 October 2007
These errors in D&M's work remind me of an incident many years ago when an irate student wanted to prove to me that his answer on a physics exam was correct and not the complete nonsense anyone looking at it would recognize.
He punched numbers into his calculator (this was an early TI calculator that held only four pending operations) and proceeded to show me that his calculator gave the "correct" answer and that I was wrong for grading him off for his answer.
That the answer itself was obvious nonsense (an impossible number given the problem being solved) was not sufficient to alert this student that something was not right.
Using a calculator (or computer program) without knowing what needs to be considered and checked is inexcusable, especially if major claims are being based on computational results. This is so basic that it should never be overlooked. Results should be checked using data that can be calculated by independent means.
One of the characteristics I have seen in Dembski's work over the years is his complete lack of comprehension of the underlying physics of what he is attempting to do. He apparently has no sense whatsoever of what makes sense and what is nonsense. He simply takes the answer that satisfies him and seems to think it must be right because it is what his religion tells him is correct.
David vun Kannon · 9 October 2007
Thanks for the link, Wesley.
Wrt the chance of seeing an improved paper, color me hopelessly optimistic. Bob Marks has done solid work in the past, I can only wish for him that he avoids the Behe swoon in output.
I commented on UD (before being banned (thanks, DaveScot!)) that Dembski's research approach should be "what evolution can't do", viz. a careful analysis of what resources an evolutionary algorithm needs to solve certain problems. Does speciation need the concept of space and isolation? What does it take to evolve sex? Then show that biological reality falls into that class. If you can't evolve sex, then you can argue "Male and female He created them."
In re "ev", they should give up attacking the evolutionary algorithm. Attack the idea of a perceptron adequately modeling reality, if anything. Marks is a NN expert, they should have brought those insights to bear, not his Matlab skills.
Dave W. · 9 October 2007
Wait a minute. If a Dembski is an error of 65 orders of magnitude, wouldn't a centiDembski be an error of 63 orders of magnitude, since it'd take a hundred of them to make a Dembski? Perhaps, since a Dembski is an exponential value, a centiDembski is an error of only 0.65 orders of magnitude (a factor of 4.47 or so). Still pretty big, but that'd make a milliDembski an error of just 16%, and a microDembski would be an error of just 0.015%, getting us into the realm of respectability (depending on one's field, of course).
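Dave W.'s logarithmic reading of the unit can be sketched directly. The 65-orders baseline and the sub-unit values come from the comment above; the function name is purely illustrative.

```python
from math import log10

DEMBSKI = 65.0   # one Dembski: an error of 65 orders of magnitude

def to_factor(dembskis):
    """Multiplicative error factor for an error given in Dembskis,
    reading the unit logarithmically (it scales orders of magnitude)."""
    return 10 ** (dembskis * DEMBSKI)

centi = to_factor(0.01)   # 10**0.65, about 4.47
milli = to_factor(0.001)  # about 1.16, i.e. roughly a 16% error
micro = to_factor(1e-6)   # about 1.00015, roughly 0.015%
```

Under the multiplicative reading in the comment's first sentence, `to_factor` would instead return `10 ** (65 - 2)` for a centiDembski; the logarithmic reading is what makes the sub-units usably small.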
Brian · 9 October 2007
While I agree that Dembski's work is mathematically and scientifically vacuous, I sometimes wonder whether the best tack is to simply ignore the man. Every time he spews some junk mathematics, it gets a lot more attention than the idea merits.
In short, do these constant debunkings of Dembski's work help or hurt the cause of good science?
Wesley R. Elsberry · 9 October 2007
[quote]
Perhaps, since a Dembski is an exponential value, a centiDembski is an error of only 0.65 orders of magnitude (a factor of 4.47 or so).
[/quote]
That was my intent, but as you note there is some ambiguity to the bare term.
Wesley R. Elsberry · 9 October 2007
Flint · 9 October 2007
RBH · 9 October 2007
Coin · 9 October 2007
Stuart Weinstein · 9 October 2007
Dembski's apology:

"I apparently made a MATLAB error with heathen Tom Schneider's Ev program"

Golly, the first thing you learn in programming is to properly initialize your variables. With many systems and compilers, variables are not initialized to zero automatically and may actually hold values depending on whatever flotsam and jetsam happen to exist in memory.

I have made this error myself, I must say, but fortunately I was circumspect enough not to publish a paper claiming the Navier-Stokes equations were wrong.

Stuart
Wesley R. Elsberry · 9 October 2007
They did attempt to initialize the variable in question; they just botched the job and didn't check it.
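The failure mode described in 2ndclass's summary, an initialization that silently yields an all-ones vector instead of a sparse indicator of targeted binding sites, is easy to reproduce in any array language. The sketch below illustrates the bug class in plain Python; the positions are hypothetical and this is not the actual Histogram.m code.

```python
n_sites = 131            # size of the indicator vector, per the summary
targets = [3, 17, 42]    # hypothetical targeted binding-site positions

# Correct construction: zero-initialize, then mark the targets.
t_good = [0] * n_sites
for i in targets:
    t_good[i] = 1

# Botched construction: one-initialize, then "mark" the targets.
# Every entry is already 1, so the marking step changes nothing
# and the vector silently ends up all 1's.
t_bad = [1] * n_sites
for i in targets:
    t_bad[i] = 1

# A one-line sanity check would have caught the problem immediately:
assert sum(t_good) == len(targets)   # passes
assert sum(t_bad) == n_sites         # t_bad is all 1's: the bug
```

The point of the final two lines is the one Wesley makes: the botched vector is trivially detectable if its contents are ever checked.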
GuyeFaux · 9 October 2007
Wesley R. Elsberry · 9 October 2007
Sending a copy of "No Free Lunch" to Richard Wein may have been about the most productive $50 I've spent.
Dave W. · 9 October 2007
Torbjörn Larsson, OM · 9 October 2007
I would like to thank all involved for their tenacity in pursuing this. I had completely missed Tom Schneider's response log, so I'm glad to retroactively see that he recognized D&M's problems.
My view of Marks' output has changed dramatically in the last 24 h.

Yesterday, it became apparent that Marks coauthored a paper with Dembski (according to Dembski) that openly uses misdirection on Häggström's critique of their "active information" methods.

Häggström's critique, in short, is that DNA sequences trivially show clustering, or else the handful of mutations we all carry would likely kill us, so a uniform distribution is an illegitimate model for evolution. (And in general it is also an illegitimate prior in cases of large sets.) D&M claim that this observation is an assumption on Häggström's part.
[On the lighter side, Dembski concedes the efficiency of selection in evolution. "… the random search is 2.9387x10^41 per cent worse than partitioned search. Partitioned search contributes an enormous amount of information."]
Now this.
Bill Gascoyne · 9 October 2007
"Let me explain: when I was a physicist, people would come to me from time to time with problems in mathematics they couldn't solve. They wanted me to check their numbers for them. But after a while I learned not to waste my time checking the numbers -- because the numbers were almost always right. However, if I checked the assumptions, they were almost always wrong."
Eliyahu M. Goldratt, "The Goal"
Torbjörn Larsson, OM · 9 October 2007
Oops, a misquote. It should be "… the random search is 2.9387x10^41 per cent worse than partitioned search. Partitioned search contributes an enormous amount of information."
On another note, the reason I personally didn't react to D&M's claim of ev making a worse search than random was that they (in Fig. 2, IIRC) used a wider parameter space than ev is designed for in their comparison. Schneider mentions that a constricted space is used when simulating i.i.d. mutations.
So I assumed that it was a non sequitur claim based on a consequence of their non-realistic method, not their mistakes. Never assume without testing the assumption. :-P
[Yay. I think PT has fixed the preview script parsing UTF-8 characters erroneously, in all text boxes. Checking: åäö.
Thanks, Reed!]
GuyeFaux · 9 October 2007
Torbjörn Larsson, OM · 9 October 2007
Almost. The preview parsing leaves UTF-8 characters, but now the final parsing in Submit doesn't get, or changes, the text in the name box.
/Torbjörn
Wesley R. Elsberry · 9 October 2007
Torbjörn Larsson, OM · 9 October 2007
See, we have the same problem as on some ScienceBlogs: the name box seems to put out ISO-8859-1 code for mysterious reasons. The solution for me is to use that encoding in the browser; then everything looks OK. But UTF-8, which has twice the amount of characters, is more web-friendly. :-)
Wesley R. Elsberry · 9 October 2007
Torbjörn Larsson, OM · 9 October 2007
Let me test that once more, I'm not sure it was a ISO 'ö' input to my name.
To make this on topic, can we expect that 10^65 constitutes an Upper Dembski Bound (UDB), or are his errors tending to infinity?
Coin · 9 October 2007
Oddly, Torbjorn's umlaut renders just fine in the "recent comments" box on the front page, but not here in the comments section. Huh.
Reed A. Cartwright · 9 October 2007
Please discuss the encoding issues on the new encoding post.
Torbjörn Larsson, OM · 9 October 2007
Reed,
Thanks for your work!
I should have looked up the old post (or better, looked for a new encoding post) for my tests, but you know how it is with boys and their new toys...
Pete Dunkelberg · 9 October 2007
W. Kevin Vicklund · 9 October 2007
re: Dave W's equation
Something struck me when I read the description of the equation. Isn't 10^150 Dembski's Universal Probability Bound? Seems like if we set B to 150, that gives additional justification for choosing that constant. Setting B to exactly 150 gives the original error a value of .994 Dembskis, but since Dembski undoubtedly rounded his reported value, I think we're justified to use only one significant figure in the calculation of his original error.
Reed A. Cartwright · 10 October 2007
Bumpage---Last encoding issue fixed.
Richard Wein · 10 October 2007
RBH · 10 October 2007
Brian · 10 October 2007
"I think that when dealing with fringe ideas like IDC that it is important that technical critiques are available for the readers, who may be uncommitted or only partly predisposed to agree with the anti-science side."
It's a good argument, and it is more or less what I expected to hear. Given that many "technical critiques" are already available, I wonder whether making new ones, as opposed to referring people to the record, just feeds the beast...
Dave W. · 10 October 2007
Wesley R. Elsberry · 10 October 2007
I think we might be forgiven for a 40 microdembski slop given that "150" is somewhat easier to recall and apply than 149.6680310446129694611694445545.
Brian · 10 October 2007
Has anyone considered making a second Wikipedia article for Dembski as a unit of measurement and error?
W. Kevin Vicklund · 10 October 2007
We need an SI abbreviation. I propose Dmb.
Dave W. · 10 October 2007
Whoops. Wait a second. I just noticed that the original, correct figure used to derive B shouldn't have been 10^65 but instead 5.555117x10^65. So B should really be 151.3827505295889703551314588227, and all previous measurements in Dembskis are high by 75 μDmb (for the 149.668... value of B). Using 150 for B will generate 61 μDmb errors.
W. Kevin Vicklund · 10 October 2007
Of course, one has to consider that Dembski rounded his answer to the nearest order of magnitude, which can introduce an error of between 0 and 1.07 cDmb. If his actual calculated value was about 3.98x10^-288, then B=150 is almost exact.
Dave W. · 10 October 2007
Considered and accepted. And just think, originally I was doing the math with log10, and B was 65. I switched to natural log on a whim, and by sheer coincidence got close to the exponent for the UPB.
...Or maybe it was all designed that way...
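The derivation of B in the comments above can be replayed in a couple of lines. The figure 5.555117x10^65 and the natural-log choice are taken from Dave W.'s comments; everything else here is just checking the arithmetic.

```python
from math import log

# Natural log of the rounded 65-orders-of-magnitude error factor:
B_round = log(10 ** 65)      # 149.6680..., conveniently close to 150

# Natural log of the full original error factor:
B_exact = log(5.555117e65)   # 151.3827...

# With B fixed, an error factor x measures log(x) / B Dembskis,
# so the original error is exactly one Dembski by construction:
one = log(5.555117e65) / B_exact
```

Using 150 for B, as suggested upthread, trades a tiny constant error for the mnemonic tie to the 10^150 Universal Probability Bound.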
Reed A. Cartwright · 10 October 2007
I suggest the following as the symbol for dembski units: Δ.
It represents the 'd' in dembski. It is already often used to denote distance and perhaps error. And it looks like those hats that teachers put on students who screw up one too many times.
Popper's Ghost · 10 October 2007
secondclass · 21 October 2007