As the discussion over the Liu-Ochman flagellum evolution paper continues, it is clear that I need to do a little more arguing to defend my position. Although some were convinced that skepticism was justified based on the previous PT posts (basically: 1. this goes against much prior published knowledge and 2. just look at the obviously different structures), others have defended the paper or at least suggested that the alleged problems are not as overwhelmingly obvious as they seem to me. Two primary lines of argument have been raised. First, some have pointed out, correctly, that the reputation of the authors and journal in question far outweighs the reputation of a blogger like me, so why should readers trust me over PNAS? I will concede the case when it comes to reputation; all I can say is that over the years I have developed some familiarity with the literature pertinent to flagellum evolution, and as I read through the PNAS paper it became apparent that it was going against much of what was already known. This is not necessarily bad if a direct attempt is made to rebut conventional wisdom, but if assertions are made without much evidence of awareness that they go against previous work, that is problematic.
Second and more importantly, the Liu-Ochman paper reports reasonably significant e-values (e < 0.0001) for their claimed homologies (all of the lines in Liu-Ochman's Figure 3 represent matches with e-values of 0.0001 or less, in one or more of the 41 bacterial genomes they searched). I have been hinting that there are more technical problems with the paper, and that I and some others are working on a more detailed critique. For the moment -- especially to forestall suggestions that we are ignoring Liu and Ochman's BLAST results, that we don't know how BLAST statistics work, etc. -- I will post some preliminary results of an attempt to replicate Liu and Ochman's findings.
A little background on BLAST, e-values, and homology
BLAST stands for Basic Local Alignment Search Tool, a standard program in bioinformatics that is used to find statistically significant matches between two sequences (amino acid or DNA). It is implemented in numerous web applications that can search massive online databases, and in stand-alone executables that can search local or online databases.
Homology is similarity due to common ancestry. In proteins and DNA, this is typically sequence similarity. As a very rough guide, for protein amino acid sequences, sequence similarity of 30% or more is typically strong evidence of homology, sequence similarity of 20-30% is the "twilight zone" where the assignment of homology typically becomes uncertain, and sequence similarity below 20% can often be due to chance resemblance. (Various details make this picture more complicated, e.g. shorter proteins need higher similarity to confidently assign homology.)
Structure is more conserved than sequence. It has been repeatedly observed that proteins down to 30% or 20% similarity will commonly exhibit very similar tertiary structure and folds. There are ways for mutations to change structures so this is not a universal rule, but it is a very good generalization. Homology will often be assigned based on detailed structural similarity and weak sequence similarity. It is thus suspicious if a claim of significant sequence similarity is contradicted by the observation of no structural similarity.
Along with alignments, BLAST produces an e-value statistic, which is a better statistical measure of the significance of an alignment than percent similarity. The e-value represents the number of times that a given sequence match of a certain length and strength would be expected by chance, given a database of a certain size. ("e" is for "expected".) The larger the database, the more likely it is that a weak match would occur by chance. An e-value of 1 indicates that one match of similar length and strength or better would be expected by chance, and therefore the match is clearly not significant. There is no hard and fast line for significance, and the e-value is not an infallible statistic anyway, but the rules of thumb seem to be that e-values less than 0.01 are interesting, and e-values less than perhaps 10^-8 or so are almost always a good indicator of homology, assuming no human error. Very close matches -- 50% or more sequence in common -- can have e-values of 10^-30 or less. Identical proteins, e.g. a protein BLASTed against itself, will typically be reported with an e-value of 0.
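The dependence of the e-value on database size can be made concrete with the standard Karlin-Altschul relation in its bit-score form, E = m × n × 2^(-S'), where m is the query length, n is the database length, and S' is the bit score. The Python sketch below is only an idealization: the function name and the example numbers are made up for illustration, and real BLAST programs use slightly shortened "effective" lengths, so their reported e-values will not match this formula exactly.

```python
def expected_chance_hits(bit_score: float, query_len: int, db_len: int) -> float:
    """Idealized e-value from the bit-score form E = m * n * 2**(-S')."""
    return query_len * db_len * 2.0 ** (-bit_score)


# Hypothetical numbers, not taken from the paper or from any real search.
QUERY_LEN = 500
BIT_SCORE = 35.0

# The same alignment (same bit score) scored against ever larger databases.
for db_len in (500, 7_000, 300_000):
    e = expected_chance_hits(BIT_SCORE, QUERY_LEN, db_len)
    print(f"database length {db_len:>7}: e-value ~ {e:.2g}")
```

Under this idealization the e-value grows in direct proportion to the database length, and each additional bit of score halves it, which is why a match that looks interesting in a one-on-one comparison can be unremarkable in a large search.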
An attempt to replicate the homology hits in Liu and Ochman (2007)
Recall Liu & Ochman's Figure 3:

The lines represent alignments that are significant according to an e-value cutoff of e = 0.0001 or less. The numbers represent the number of genomes (out of 41) where the homology connection was reported. The blue lines represent the matches found specifically in the E. coli K12 genome. According to Figure 3, FliC is homologous to FliD (cap protein), FlgD (rod), FlgE (rod), FlgK (adapter between hook and FlgL), and FlgL (adapter between FlgK and FliC). Homology between FliC and FlgL seems to be well-accepted and retrievable with PSI-BLAST (a search more sensitive than regular protein BLAST), but the others are novel, or at least it is novel to claim that a simple BLAST search can detect them with decent significance.
I and others have been attempting to replicate the results in Figure 3. According to the paper's methods, Figure 3 is based on pairwise comparison using the executable bl2seq (BLAST 2 sequences). The bl2seq executable can be downloaded from the NCBI here. (I got blast-2.2.10-ia32-win32.exe to work on my 2004 windows32 PC; you will have to download other versions for other machines and operating systems.) The bl2seq documentation is online here. According to Table 1 of the paper's Supplementary Material, the E. coli genome was E. coli K12, NC_000913.2, which is online here. I downloaded the FASTA-format sequences for the 24 "core" flagellar proteins that the authors identified; I have uploaded them here (right-click to download) as a zipfile if you would like them.
The table below shows the search results for BLASTing FliC against the 24 flagellar proteins. The table columns, from left to right, contain:
- 1. Protein name
- 2. Liu-Ochman matches to FliC (from Figure 3)
- 3. e-values for bl2seq search, default filters off
(example search: bl2seq -p blastp -F F -i FliC.fasta -j FlgD.fasta -o FliCvFlgD.out)
- 4. e-values for bl2seq search, default filters on
(example search: bl2seq -p blastp -i FliC.fasta -j FlgD.fasta -o FliCvFlgD_filters.out)
- 5. e-values for bl2seq, default filters on, database size = 7163
(example search: bl2seq -p blastp -i FliC.fasta -j FlgD.fasta -o FliCvFlgD_filters_db7163.out -d 7163)
- 6. e-values for bl2seq, default filters on, database size = 293683
(example search: bl2seq -p blastp -i FliC.fasta -j FlgD.fasta -o FliCvFlgD_filters_db293683.out -d 293683)
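For anyone who wants to repeat the searches in columns 3 to 6 without typing each command by hand, here is a minimal Python sketch that builds the four command lines for one query/subject pair. It uses only the flags that appear in the example searches above (-p, -F, -i, -j, -o, -d); the function name is hypothetical, the file-naming scheme simply mirrors the examples, and the sketch only prints the commands rather than running bl2seq.

```python
from typing import List, Optional


def bl2seq_command(query: str, subject: str,
                   filters_on: bool = True,
                   db_size: Optional[int] = None) -> List[str]:
    """Build one bl2seq command line in the style of the example searches."""
    out_name = f"{query}v{subject}"
    if filters_on:
        out_name += "_filters"
    if db_size is not None:
        out_name += f"_db{db_size}"

    cmd = ["bl2seq", "-p", "blastp"]
    if not filters_on:
        cmd += ["-F", "F"]          # switch the default filters off
    cmd += ["-i", f"{query}.fasta", "-j", f"{subject}.fasta",
            "-o", f"{out_name}.out"]
    if db_size is not None:
        cmd += ["-d", str(db_size)]  # database size used for the e-value
    return cmd


# The four settings corresponding to table columns 3, 4, 5 and 6.
settings = [(False, None), (True, None), (True, 7163), (True, 293683)]
for filters_on, db_size in settings:
    print(" ".join(bl2seq_command("FliC", "FlgD", filters_on, db_size)))
```

Looping the same call over the other 22 protein names would give the full set of searches for the FliC row of comparisons.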
Although the methods section of Liu & Ochman (2007) says that the bl2seq BLAST searches were run with defaults (basically column #4 in the table below), it is apparent that the BLAST searches were actually run in the non-default setting of filters off (column #3). Through the grapevine I have heard that the authors are telling correspondents about this error in email, and plan to issue a correction, which is good.
An additional issue is database size. Searching 23 proteins instead of one means that the database size is not the size of one protein, but the size of all 23 proteins strung together, or 7163 amino acids in length. Furthermore, the authors actually ran these pairwise searches between the 24 core proteins in each of 41 genomes, so the full size of the database searched is actually approximately 7163 x 41 = 293683. Columns 5 and 6 show the resultant e-values when bl2seq is run with the -d (database size) parameter set at these values.
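The database-size arithmetic is easy to check. The Python sketch below recomputes the pooled size and also shows a naive proportional rescaling of an e-value when the search space grows. The function names are hypothetical, and the proportional rule is an idealization: BLAST applies effective-length corrections, so the values in columns 5 and 6 of the table should not be expected to follow it exactly.

```python
RESIDUES_PER_GENOME = 7163  # the 23 core proteins strung together (from the post)
GENOMES_SEARCHED = 41       # genomes compared by Liu and Ochman


def pooled_database_size(residues_per_genome: int, n_genomes: int) -> int:
    """Total residues searched if every genome contributes the same amount."""
    return residues_per_genome * n_genomes


def naive_rescale(e_value: float, old_size: int, new_size: int) -> float:
    """Idealized rule: the e-value scales in proportion to database size."""
    return e_value * new_size / old_size


total = pooled_database_size(RESIDUES_PER_GENOME, GENOMES_SEARCHED)
print(total)  # 293683
```

The point of the second function is only that going from one genome's worth of proteins to 41 genomes' worth costs roughly a factor of 41 in significance, before any finer corrections.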
Table: e-values resulting from bl2seq search of E. coli K12 FliC against 23 other core flagellar proteins, using different search options.
ns = no significant hit according to Figure 3 of Liu and Ochman (2007).
na = no significant alignment returned by bl2seq.
| Protein | Liu-Ochman hits for E. coli K12 (Figure 3) | default filters off | default filters on | default filters on, database size = 7163 | default filters on, database size = 293683 |
|---|---|---|---|---|---|
| FlgB | ns | 0.2500 | 0.2500 | na | na |
| FlgC | ns | 0.3200 | 2.1000 | na | na |
| FlgD | <0.0001 | 0.0003 | 0.0110 | 2.3000 | na |
| FlgE | <0.0001 | 4e-06 | 0.0110 | 0.2100 | na |
| FlgF | ns | 0.0120 | 0.0120 | 0.3500 | na |
| FlgG | ns | 0.1700 | 0.6600 | na | na |
| FlgK | <0.0001 | 2e-10 | 3e-05 | 0.0100 | na |
| FlgL | <0.0001 | 4e-09 | 0.0250 | na | na |
| FlhA | ns | na | na | na | na |
| FlhB | ns | na | na | na | na |
| FliD | <0.0001 | 9e-09 | 7e-06 | 0.0080 | 8.0000 |
| FliE | ns | na | na | na | na |
| FliF | ns | 0.0350 | 0.0350 | 0.4600 | na |
| FliG | ns | 0.8600 | 0.8600 | na | na |
| FliH | ns | 1.7000 | 1.7000 | na | na |
| FliI | ns | 1.2000 | 1.6000 | na | na |
| FliM | ns | na | na | na | na |
| FliN | ns | 0.2500 | 0.2500 | na | na |
| FliP | ns | 5.2000 | 5.2000 | na | na |
| FliQ | ns | na | na | na | na |
| FliR | ns | na | na | na | na |
| MotA | ns | na | na | na | na |
| MotB | ns | 0.6100 | 0.6100 | na | na |
As you can see, with default filters turned on, 5 significant hits become only 2. With filters on, plus database sizes larger than a single protein, no hits are significant.
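The counts can be checked directly against the table. The Python sketch below transcribes the four bl2seq columns (with "na" stored as None) and lists which proteins fall strictly below the 0.0001 cutoff under each setting; the variable and function names are hypothetical. Note that with a strict cutoff the FliC-FlgD value of 0.0003 in the filters-off column does not count, which is the discrepancy discussed in the note at the end of this post.

```python
CUTOFF = 1e-4

# Columns: filters off, filters on, filters on + db 7163, filters on + db 293683.
# None stands for "na" (no significant alignment returned by bl2seq).
EVALUES = {
    "FlgB": (0.25, 0.25, None, None),
    "FlgC": (0.32, 2.1, None, None),
    "FlgD": (0.0003, 0.011, 2.3, None),
    "FlgE": (4e-06, 0.011, 0.21, None),
    "FlgF": (0.012, 0.012, 0.35, None),
    "FlgG": (0.17, 0.66, None, None),
    "FlgK": (2e-10, 3e-05, 0.01, None),
    "FlgL": (4e-09, 0.025, None, None),
    "FliD": (9e-09, 7e-06, 0.008, 8.0),
    "FliF": (0.035, 0.035, 0.46, None),
    "FliG": (0.86, 0.86, None, None),
    "FliH": (1.7, 1.7, None, None),
    "FliI": (1.2, 1.6, None, None),
    "FliN": (0.25, 0.25, None, None),
    "FliP": (5.2, 5.2, None, None),
    "MotB": (0.61, 0.61, None, None),
}
# Proteins with "na" in every column.
for name in ("FlhA", "FlhB", "FliE", "FliM", "FliQ", "FliR", "MotA"):
    EVALUES[name] = (None, None, None, None)

COLUMNS = ("filters off", "filters on", "filters on, db 7163",
           "filters on, db 293683")


def hits_below_cutoff(column: int) -> list:
    """Names of proteins whose e-value in this column is below CUTOFF."""
    return [name for name, row in EVALUES.items()
            if row[column] is not None and row[column] < CUTOFF]


for i, label in enumerate(COLUMNS):
    print(f"{label}: {hits_below_cutoff(i)}")
```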
Removing filters from a BLAST search is an extremely serious decision with major impacts on an analysis, because the filters prevent spurious matches that are due to similarities that are not phylogenetically informative, such as low-complexity regions and biases in amino acid composition. Similarly, the database size has a massive impact on e-value.
We have not yet run the same searches systematically through the other flagellar proteins and the other 40 genomes, but it is apparent that the results would be similarly dire, and that most or all of the new significant hits reported in Liu and Ochman's Figure 3 would evaporate. Thus the only support for the all-flagellum-genes-from-one hypothesis, which was unlikely from the beginning based on background information, also evaporates.
Acknowledgements
Doug Theobald and Ian Musgrave ran some of these searches before me, and Doug educated me on the database size issue and made various other helpful comments. Any errors are of course mine.
Note: The FliC-FlgD match in Column 3 has an e-value of 0.0003, which is actually higher than the 0.0001 cutoff. So either there is a slight difference in our databases or techniques, or 0.0003 was mistakenly reported as a hit below 0.0001 in Figure 3.
55 Comments
Ralf Koebnik · 24 April 2007
Hi Nick,
I wonder why there are several cases of "0.0000" in columns 3 and 4 (e.g. FlgK and FliD) ?
Another comment: for pairwise comparisons, you could also use the PRSS algorithm:
http://www.ch.embnet.org/software/PRSS_form.html
To my experience (I was teaching applied bioinformatics for 6 years: http://www2.biologie.uni-halle.de/genet/plant/staff/koebnik/teaching/bioinfo/index.html), this algorithm is superior to BLAST.
Best regards,
Ralf
Ralf Koebnik · 24 April 2007
Here is the (German) bioinformatics link again (in the link above, there is one bracket at the end which has to be removed):
http://www2.biologie.uni-halle.de/genet/plant/staff/koebnik/teaching/bioinfo/index.html
Ralf Koebnik · 24 April 2007
Here is the (German) bioinformatics link again. Above, there is a bracket at the end which one has to remove.
http://www2.biologie.uni-halle.de/genet/plant/staff/koebnik/teaching/bioinfo/index.html
Nick (Matzke) · 24 April 2007
Those 0.0000 numbers just mean the e-value is lower than 0.00005 and was cut off in the spreadsheet. I will post the exponent values in a sec.
Daniel Morgan · 24 April 2007
Good job with this. It is always crucial to show the sequence analysis when criticizing a paper in a journal like PNAS. While I was sympathetic to your original analysis regarding structure image eyeballing, and was not so harsh as some of your critics in the update thread, you have now laid down the gauntlet and the onus has shifted.
RPM · 24 April 2007
Dan Gaston · 24 April 2007
Glad you posted the results Nick. As I said in my previous posts I wasn't trying to take up any particular position just yet, I just disliked people saying they thought there were serious methodological problems before they posted their own results showing that. Have you tried contacting the authors about the database size and such? Quickest way to find out is simply to ask.
Unsympathetic reader · 24 April 2007
Howard Ochman is extremely approachable.
David vun Kannon · 24 April 2007
Nick, thank you and the others for contributing additional data and helpful explanations.
I suppose my sanity check for understanding the numbers in the table is the FlgL row, since you say that this is a well accepted homology. Default filters "on", this value is 0.0250. All the Liu & Ochman hits fall below that number in that column. In addition, this homology disappears at the larger database sizes, just like the others. Oh well.
As I understand your current argument, Liu & Ochman are making an expansive claim based on narrow support - BLAST analysis alone. "All pairwise hits with e values better than FliC/FlgL with default filters" sounds like an intermediate stage of analysis, not the basis of a publishable result.
I still find your observations on structure somewhat weak. Certainly structural similarity will help the inference of homology, and it is easy to understand that there can be many changes in sequence that do not change structure radically, but the lack of that support does not argue for the negative conclusion. A series of changes that leads to a very different tertiary structure can still be a homology.
Nick (Matzke) · 24 April 2007
Dan Gaston · 24 April 2007
Nick:
What did the alignments look like in general with the filters turned off? I know in my own work sometimes it is necessary to turn off the filters in order to retrieve exact matches to the query from a database if those low complexity regions are present. It ends up being a complicated business using BLAST for these sorts of analysis because in general there aren't any hard and fast rules as I am sure you are aware.
As for the word homology, personally I am of the opinion that as scientists/science-enthusiasts we should be using these terms precisely. An inference of Homology necessarily ties in with an inference of common descent because it is exactly what homology means. It isn't a tautology because the two terms are directly tied to one another. Personally I suggest rephrasing the argument in order to better confront Creationists instead of altering your definition of a precise scientific definition. Remember we get frustrated when the Creationists twist the meanings of things to suit their own ends, and even if in this case it is done for a more benign reason I would shy away from it.
Douglas Theobald · 24 April 2007
Nick (Matzke) · 24 April 2007
Ugh, I was trying to avoid the usual hair-splitting over the definition of "homology." My position is not an arbitrary anti-creationist position. Homology does not have to be defined by common ancestry: as everyone knows, "homology" was recognized and observed (and defined, by Owen) as a particular kind of similarity long before Darwin published common ancestry as the convincing explanation.
I think with sequence similarity analysis, common ancestry has been the known explanation from the start, so people make "sequence similarity" or "sequence identity" the observation and "homology (common ancestry)" the conclusion. There is nothing wrong with this but it doesn't quite match with the original usage of the term.
Reed A. Cartwright · 24 April 2007
About Homology:
Mindell and Meyer (2001) defines homology as "relationship between traits of organisms that are shared as a result of common ancestry".
See IndexCC for the full citation and more discussion.
Nick (Matzke) · 24 April 2007
Nick (Matzke) · 24 April 2007
According to UCMP, Owen famously defined homology in 1843 as "the same organ in different animals under every variety of form and function."
Douglas Theobald · 24 April 2007
Nick, I'm going to have to side with the others against you on the homology def. Pre-Darwin, yes, homology was defined Owen-style. In evolutionary biology, no-- homology has been redefined as similarity in structure that is due to inheritance from common ancestors. Homology is no longer an observation; it is an inference based on similar structures. Homology no longer simply means "the same organ" (where Owen had no etiology for "sameness"). Rather, similar structures can be homologous or not.
If you want to step outside evolutionary biology (perhaps to talk to a creationist in order to establish the validity of evolutionary bio) and use anachronistic definitions for modern terms, fine -- just make it clear you are doing so.
The circularity argument is bogus anyway. Homology as evidence for common ancestry is no more circular than using a line fit to a set of data as evidence for a linear relationship. Sure, they're both circular in a way, but if the postulated relationship isn't there, then the fit doesn't work well. A good fit is evidence for the hypothesis.
Dan Gaston · 24 April 2007
Nick: Thanks for the info on the alignments, I'm the type who always wants to see alignments in these sorts of studies myself because e-values and scores only tell so much. So thanks, you gave me what I wanted to know.
Douglas: Exactly (on both points). Grishin's work is exemplary in terms of structural evolution, which is what I am primarily interested in. The Evolution of protein folds and the mechanisms at play. Fascinating stuff. Lately with all of the work I have been reading I tend to envision sequence/structural space as follows:
Sequence space as an infinite space (or set depending on how you like to look at these things) with allowed/realistic structures representing anchor points in that space. Sequences can "explore" a great deal of space around an allowed/stable structure with very little variation in the overall structure. Of course there are all sorts of interesting implications when we start talking about marginally stable proteins and intrinsically disordered proteins.
As for homology, spot on.
Nick (Matzke) · 24 April 2007
Doug, if I'm remembering correctly, you suggested that specific homology wording to me in email comments on the draft of this post... ;-)
Nick (Matzke) · 24 April 2007
Douglas Theobald · 24 April 2007
Well, I don't know what RBH is referring to, as you say "Homology is similarity due to common ancestry" in the post, which is what I suggested, and which is correct.
Pinko Punko · 24 April 2007
I'd love for you guys to get to the bottom of this, but if I ever had a problem with someone's work, I'd most likely talk to them first and try to figure things out. This would not preclude me from talking about things in public, but it is the collegial thing to do. I think a step back would confirm that this has not really been handled in the optimal way.
Lurker · 24 April 2007
I second the above: aren't people talking to the authors?
Wesley R. Elsberry · 24 April 2007
I have to disagree on the "wait until you've discussed this at length with the authors" angle. While a discussion with the authors is a good thing, that is a separate issue from taking up a matter of public discourse in a timely manner. If we are not prepared to note publicly our disagreements with published work that is simultaneously being lauded publicly elsewhere, we risk being the echo chamber that anti-science advocates have claimed we are.
I know that Nick and others are also working toward a contribution to the peer-reviewed literature in response. But they would have been shirking a clear responsibility if they had simply sat back, allowed others to praise the paper in public, and said nothing. PT contributors are carving out new roles for the weblog in scientific discourse; Reed's being added to a paper recently demonstrated that early description of a hypothesis via weblog can be noticed, and Nick's criticism in the current case certainly has generated further scrutiny of the work. Maybe this will not become accepted practice in the long run, but I think that scientific discourse has altered appreciably over time and with respect to available means of communication. It is too soon to say whether this is the way the future will run or not, but count me on the side of those who look forward to more discussion of this sort.
Pumpkinhead · 24 April 2007
Nick (Matzke) · 24 April 2007
Pinko Punko · 24 April 2007
A little bit Wesley and mostly Nick,
I understand your points of view, but even a statement in the original or subsequent posts such as "here are our concerns, we have contacted the authors of the study in the hopes that they may enlighten us about these issues. We understand, of course, if they do not want to join such a public discussion of the work. To continue..."
This would have done it for me. Even though it seems great to strike while the iron is hot, or maybe it is getting discussed in the press, but when does the press get things really right, and what would a day or two matter in the long run. I don't think it would mean anything in regards to Panda's Thumb readers, but it would be much more seemly. This is not a plea for false civility or what not, it is merely how I would personally handle an issue like this, given no other backstory/conflicts with the group publishing a problematic paper. I would go through the motion first, under the odd chance that I could be wrong, or the odd chance that they could be massively wrong and admit to such upon reflection. I would then diplomatically say whatever it is I was going to say in the first place. In terms of educating the PT audience, who may not necessarily be able to follow all the scientific ins and outs, false alarms and incomplete arguments would be avoided by this approach.
Kevin · 24 April 2007
well crap I don't understand more than 36.3% of that and that's because I included stuff like
Removing filters from a BLAST search is an extremely serious decision with major impacts on an analysis, because the filters prevent spurious matches that are due to similarities that are not phylogenetically informative
which are somewhat written in english.
anyway, let me say that this paper must be taught as fact if only to teach the controversy surrounding it.
Kevin · 24 April 2007
I think a step back would confirm that this has not really been handled in the optimal way.
WHAT? was someone beat up in the parking lot?
sparc · 25 April 2007
Lurker · 25 April 2007
It bothers me that so much of this is being done because of some whacky perception that we are being scrutinized by Creationists for being ... not scientific enough? Come on now. It feels faux and forced, like maybe we have some dirty laundry to hide after all. Science is being done by the normal reflex action of scientists, and for some reason all of PT wants the Creationists to notice this one instance of it. Why?
Nick has found, what, 3 blogs that have talked about this article since its publication, including a blurb in those Science news reports that we know all scientists get wet even thinking about being mentioned in ... And since then, we _know_ to declare that this is a highly visible (for the public!) scientific report? Seriously. This is all funny stuff.
BTW, bulletin boards have long predated any electronic media. Trees with stapled notes, even. I am simply not familiar with the practice of posting long diatribes against a scientific result, so that the public, and the targeted scientist, may read about it on their way to work. There is nothing "new" about the blogosphere in this regard. The intent is still the same, though: it is to discredit with a political motivation. We now have Nick discussing motives of the researchers -- they're trying to be "authoritative", as if no scientist ever wants that -- without contacting the authors even. If there is ever a valid criticism of scientists, it is they are not naturally a collegiate bunch.
Lurker · 25 April 2007
"Should one try to contact authors of creationist papers published in Rivista before debunking them?"
Do you normally go out of your way to talk about Rivista articles by Creationists as they are published?
Pinko Punko · 25 April 2007
sparc,
None of this is an either/or situation. Given that Nick has published in the field and now someone else is publishing in his field, he has an opportunity for gaining a new colleague, even in disagreement. There are no hard and fast rules for when you just open a can of whoop ass on someone or when you talk to them. If someone were to strongly criticize my work, I'd like to hear from them. It is just a matter of style. Disagreements can sometimes even turn into collaboration. What is odd is that I feel like I am arguing for a somewhat obvious course of action, albeit in hindsight. If people want to treat my comments in the light of a concern troll, that's great, but this isn't high school and there aren't teams here. I would love to know who is right and I am really proud of Nick for raising his concerns in an open forum like this. I only wish he also would have opened up a dialog because to me it seems the right thing to do in THIS situation not necessarily other HYPOTHETICAL or STRAW situations. Not everything in this world is people lobbing poop at one another- not everyone is DaveScot, or thick as a brick. The internet really taints some forms of interaction.
chunkdz · 25 April 2007
delphi_ote · 25 April 2007
"It bothers me that so much of this is being done because of some whacky perception that we are being scrutinized by Creationists for being ... not scientific enough?"
It is being done in the interest of providing the public with accurate information, which is kind of the point of this whole anti-creationist movement, no?
Mike · 25 April 2007
I have a question. I use BLAST on a fairly regular basis to look for a number of things, but I had always read that it is a blunt tool built for speed. It's optimized for fast comparison to large databases, and is capable of missing relatively weak similarities. Don't anatomists and paleontologists use other tools?
Dan Gaston · 25 April 2007
Mike: Anatomists and Paleontologists aren't generally working with molecular data, which is what BLAST is intended for. You can in theory translate physical features into a coded alphabet and do alignments and look for similarity but it isn't quite the same thing. BLAST is a blunt tool, but it is also powerful and its optimization parameters don't hamper it much in doing pairwise comparisons. The final stage of the process for good candidate matches is a full Smith-Waterman alignment correction anyway so what comes out in the end is pretty precise.
Gross physical feature comparison is even more blunt and open to interpretation than things like BLAST anyway.
And on the note that Pinko brought up, as I've said in several posts on this discussion I have no problem with being openly critical, I do think the tone of some of the criticism has been unnecessarily confrontational though, and not collegial at all. I don't think that is very professional but perhaps that is just my opinion. *shrug*
Nick (Matzke) · 25 April 2007
I don't think my critics are fully understanding the scale of the problems with the Liu & Ochman paper. Imagine if someone published a paper in PNAS claiming that the mammalian middle ear bones evolved by duplication of a finger bone, based on looking at the wrong illustrations in an anatomy book. It's kind of like that.
Dan Gaston · 25 April 2007
Nick I fully understand why you are being critical, as I said being critical is a good thing. I'm merely commenting on approach towards that criticism, and I think that is all several others are pointing out as well. It looks like you guys are now on the right track with your rebuttal of the findings. I'm just suggesting that in retrospect a little more tact could have perhaps been used in the delivery of said criticism *shrug* just my opinion as I said. Then again as a Grad student perhaps it is just my innate desire to not step on anyone's toes at this stage of my hopefully career.
Popper's Ghost · 27 April 2007
Popper's Ghost · 27 April 2007
Popper's Ghost · 27 April 2007
Thanatos · 29 April 2007
Nick ,I know this thread and part i of it don't fall into the usual category of more or less "popularised" biology
posted here to refute ID but is there any chance of a translation of it into plain english or more general science-talk?
This thread is better than part i but again is sounds -reads- like chinese to me (being a greek de facto I cannot say like greek :-) )
I know I'm coming in late but after part i I waited for a popularisation
(as you and all the others biologists were engaged in arguments and counterarguments)
which unfortunately didn't exactly came with the present though it's more detailed.
haven't you noticed that besides the IDiots' I-comment-on-anything-and-everything-since-I've-got-god-on-my-side
the rest of the I'm-not-a-biologist crowd has been more or less mute?
I know there is the time-issue and obviously you're busy,but if not you ,is there not here some other biologist
to do the work?
some of us are of other disciplines you know...
a priori ,regardless of if it will happen, thanks
Nick (Matzke) · 29 April 2007
Thanatos · 29 April 2007
hi Nick,I'm no newcommer to the ID debate,the flagelli alleged IC or PT.I'm just not a biologist.
your excellent posts(and of course not just yours but PT crew's in general) are usually more ''accessible'' and understandable by-to the
general scientifically educated (or not) crowd and not just to biologists.that's what I meant.
as for specific questions,well,I wouldn't know where to start from.
Again,
is there no chance of posting some kind if the usual step by step explanation-popularisation?
thanks
Nick (Matzke) · 29 April 2007
Thanatos · 29 April 2007
Thanatos · 29 April 2007
again thanks a priori for your interest,
I'm very much obliged.
Chairein!
Thanatos · 29 April 2007
correction:
I already understood the
above(below) conclusions of yours
Nick (Matzke) · 30 April 2007
Nick (Matzke) · 30 April 2007
trying to refresh page, ignore this...
Thanatos · 30 April 2007
Nick,
although I needed a theoretical, but in less technical(biological) words, explanation,
(ie nature of transformations and correlations that's why i asked for a "gedanken" blast
picturesque example of 2 tiny sequences)
some of my questions or miscomprehensions were answered or corrected (ie na means e>10)
and anyway I can't ask for more,i would be wasting your precious time.
perhaps some other time when you or someone else will have time to kill.
I was hoping to avoid having to go through all the literature,links etc myself
and asking for an in a nutshell but scientific-technical enough walk-through of
identifying homologies, in order to appreciate-judge your argument.
I'm using a 56k dialup modem(time costs and the speed would be awful), so downloading blast
and blasting just for fun is out of the question.
yes there are still places in the world where 56k dialup and time-charge still exist and perhaps rule.
furthermore here (Greece) only in the last 1-2 years broadband services' prices
(only aDSL) have fallen to a more reasonable price.
one should notice
a.ADSL (broadband in general) has been here (scarcely) available (if I remember correctly)
for less than 5 years.
b.reasonable prices means 20 euros per month for a 768K line in a country where
the minimal per month wage is 600 euros ,where the cost of living has caught up with western europe
(after introducing the euro) and where the unemployement rate is 10%.
c.due to poor network (DSLam etc) investment programming,
if one is lucky one'll get the aDSL line in less than two months after signing up for it,
and speed (especially) in rush hours is less than half of nominal.
d.only about 5% of the total population (11M) connects via a DSL line.most of the population is
internet-wise-digitally illiterate.
one should notice
a.yes I'm unemployed
b.yes I'm not rich
c.yes life is a bitch
Nick again,
much obliged :-)
ci vediamo
Nick (Matzke) · 30 April 2007
I think you are looking for an explanation of the BLAST algorithm.
Running the web BLAST search takes no more bandwidth than a standard webpage. Click BLAST, wait 30 seconds, see the results on a webpage. It would use less bandwidth than the characters you sent for your last comment.
Or if you just want everything in one click, click BLink on the E. coli FliH sequence and look at the results.
Thanatos · 30 April 2007
Nick
I'll check them out
I tried to follow long before asking stupid questions, the initial links you had posted
but they were downloads of blast(11MB), a help file on blast that wasn't so helpful,etc
so I was discouraged on continuing.
I didn't wiki or anything else cause I was sure that I would have to go from link to link for ages to get
the full picture.usually PT summarises critically all these for us non experts.
I hope my questions will be solved and I'll stop being such a burden.
anyway thanks a lot... :-)
scienceminded · 21 January 2009
I have come across this article from PLoS ONE and I wondered if anyone, much more well read than I, could either explain or direct me to a site that could explain their argument and "probability calculations", and whether they are *ahem* on par, with Dembski's 'work'. Any help is appreciated, thanks.