The misuse of the term 'homology' in the bioinformatics community

In a recent letter to the editor of the journal Bioinformatics, Marabotti and Facchiano raised concerns about the misuse of the term ‘homology’ in peer-reviewed bioinformatics papers. The issue is not new to the scientific community: its severity was first recognized in 1987, when a letter to the editor of Cell addressed the use and misuse of the term. When two or more proteins are homologous, it means precisely that they share a common evolutionary origin, and nothing more. Over the years, however, ‘homology’ has frequently and wrongly been used as a synonym for ‘similarity’ or percent identity, as in “protein sequences A and B are 70% homologous”. Homology is a binary character: either sequences share ancestry or they don’t. Marabotti and Facchiano suggest that

[I]t is not possible to associate the term [‘homology’] to an adjective as low or high, or indicate a degree of homology with a number, as an example a percentage value. The common origin exists or not. Moreover, it is not possible to apply this term to a single object, being referred to a quality which includes the existence of at least two homologous proteins.
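
To make the distinction concrete, here is a minimal Python sketch of my own (it is not taken from the letter). Percent identity is a number computed from an alignment, while homology is a yes/no inference about ancestry that has to be recorded separately. The function names and the toy aligned sequences are hypothetical, and the sketch assumes the two sequences are already aligned, with `-` marking gaps.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns in which both sequences have a residue.
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


def describe(name_a: str, name_b: str, identity: float, homologous: bool) -> str:
    """Report the measured quantity and the binary inference as separate facts."""
    relation = "homologous" if homologous else "not known to be homologous"
    return f"{name_a} and {name_b} share {identity:.1f}% sequence identity and are {relation}."


# Toy, made-up aligned sequences (hypothetical data, for illustration only).
seq_a = "MKTAYIAKQR"
seq_b = "MKTAHIAK-R"
identity = percent_identity(seq_a, seq_b)
# Whether the pair is homologous is a separate judgment; here it is simply assumed.
print(describe("A", "B", identity, homologous=True))
```

The identity value can be anything between 0 and 100, but the `homologous` argument is a plain boolean. There is no sensible way to pass "70%" to it, which is exactly the point of the quote above.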

They give an interesting example of misuse of the term ‘homology’ from a recent Bioinformatics paper; I don’t know how some of these papers made their way into the journal. For instance,

A protein is low-homology if we cannot obtain sufficient amount of homologous information for it from existing protein sequence databases

Last year, Marabotti and Facchiano published a comprehensive report on the misuse of the term ‘homology’, based on a text analysis of PubMed articles published in 2007 and 1986 that have the keyword ‘homology’ in their abstract or title. They found that the term was used incorrectly in 47% of the abstracts published in 2007 and 51% of those published in 1986, which means the error rate is nearly unchanged. Based on this analysis, they suggested that we have not learned from our past mistakes, and that even 20 years after the first debate the incorrect usage of ‘homology’ remains very difficult to remove from the literature.
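
As a rough illustration of what flagging such usage in text could look like, here is a small Python sketch. It is not the method Marabotti and Facchiano used (the post does not describe how they classified abstracts); it is only a hypothetical regular-expression check for phrasing that attaches a percentage or a degree to ‘homology’. The pattern and the function name are my own assumptions, and a real analysis would need far more care than this.

```python
import re
from typing import List

# Hypothetical patterns for phrasing that treats homology as a quantity.
QUANTIFIED_HOMOLOGY = re.compile(
    r"""
      \b\d+(?:\.\d+)?\s*%\s*(?:sequence\s+)?homolog(?:y|ous)\b                 # e.g. 70% homologous
    | \b(?:high|low|strong|weak|significant)(?:ly)?[\s-]+homolog(?:y|ous)\b    # e.g. low-homology
    """,
    re.IGNORECASE | re.VERBOSE,
)


def find_quantified_homology(text: str) -> List[str]:
    """Return every phrase in the text that puts a number or degree on homology."""
    return [match.group(0) for match in QUANTIFIED_HOMOLOGY.finditer(text)]


# The two examples of misuse quoted in this post, plus one correct sentence.
samples = [
    "protein sequence A and B are 70% homologous",
    "A protein is low-homology if we cannot obtain sufficient amount of homologous information",
    "These proteins are homologous",
]
for sample in samples:
    print(find_quantified_homology(sample))
```

A check like this only catches the most obvious wording. It says nothing about whether a correctly phrased claim of homology is actually supported by evidence of common ancestry.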

References:

Marabotti, A., & Facchiano, A. (2009). When it comes to homology, bad habits die hard. Trends in Biochemical Sciences, 34(3), 98-99. DOI: 10.1016/j.tibs.2008.12.001
Marabotti, A., & Facchiano, A. (2010). The misuse of terms in scientific literature. Bioinformatics. DOI: 10.1093/bioinformatics/btq438


30 Responses to “The misuse of the term 'homology' in the bioinformatics community”
  10. 08.16.2010

    This has indeed always been a problem. ‘Homology’ has become equivalent to ‘sequence similarity’ for many people I’ve talked to, and only a minority actually knows better. And even then… I know the difference, but even I am guilty of accidentally saying homology when I mean sequence similarity. There are many sentences in which you can use either ‘homology’ or ‘sequence similarity’: the meaning changes, but in most cases the sentence remains correct either way. For instance:
    “These proteins are homologous” vs. “These proteins share sequence similarity”.
    (Better examples may exist, but I don’t have time to come up with them.)

    It is therefore not difficult to see why people interchange these two terms. I think that as bioinformatics continues to percolate to the workbench of biologists and more weight is given to bioinformatics teaching in universities, this problem will diminish.

    A misuse I often find more troubling is the use of orthology to mean “having similar function”. It makes it very hard for me to talk about orthologous proteins without implying that protein X in species X and protein Y in species Y have the same function.

    Anyway, great post! I haven’t read the articles yet, but I will. Thanks!

    PS. Your Captcha system is providing me with quite the challenge. I had to refresh 10 times before I had one that I could actually read!

    • abhishektiwari
      08.16.2010

      Hi John,
      Thanks for your thoughtful comment, and apologies for the inconvenience caused by the Captcha challenge; it’s really hard to pick between two extremes, spam or no comments. I think you have got to the root of this problem: we need better awareness of the key concepts. Often people make this mistake unknowingly.

  14. 08.16.2010

    Using the same word to mean two different things can be okay. “Nucleus” can be either the center of the cell or the center of an atom.

    I don’t have a problem with people using “homology” to mean “sequence identity” as long as they warn me.

  15. 08.16.2010

    The same word can mean two things in science without too much problem. “Nucleus” can be either the center of an atom or the center of a cell.

    I don’t mind people using “homology” to mean “sequence identity” as long as they warn me.

    • abhishektiwari
      08.16.2010

      Hi Zen, I guess no one minds, but publishing with a warning would be no different from the warning “smoking is injurious to health”: people smoke even though the warning is there. You can’t stop them from interpreting your content the wrong way.

  16. 08.16.2010

    RT@ abhishektiwari The misuse of term ‘homology’ in #bioinformatics http://bit.ly/bEI5eU Thanks! Too common a mistake: homology ≠ similarity

  17. 08.16.2010

    Very nice and very necessary to revisit. Too often is this mistake made – and so often overlooked by editors and reviewers of journals. Thank you for this post!

  21. 08.16.2010

    My computational biology professor addressed this issue on the first day of class:

    “Homology,” he said, “is like pregnancy. You either are, or you aren’t. No one is ever 70% pregnant, and no two proteins are ever 70% homologous.*”

    I think that analogy got the true definition of homology to stick, at least for my year of grad students.

    *Except in cases of putative domain shuffling.

    • abhishektiwari
      08.16.2010

      Very true, James: either you are pregnant or you are not, or maybe you don’t know. We have to stick with this definition, otherwise bioinformatics will turn into the wild west.

  23. 08.17.2010

    Let’s hope that “homology” does not suffer the same fate as “genetic code”.

    • abhishektiwari
      08.17.2010

      Interesting point. Many people don’t mind replacing “genome” with “genetic code”.

      • 08.17.2010

        We know that natural language is always evolving with new words being created and modified (both semantically and syntactically), and that is ok. However, I am a bit more conservative about scientific definitions as their misuse can create confusion in the future (maybe even in the present).

        Let’s imagine a researcher 10 years from now doing a literature review of genomics in the 2000s. He may need to guess whether “genetic code” is being used as a synonym for genome sequence or in its traditional sense of the codon translation rules. The same goes for homology and similarity, words with different meanings. Another example is the homolog vs. ortholog confusion.

        In most cases it is easy to distinguish between the meanings by taking the context into account, but I am not sure that this is always possible.

        • abhishektiwari
          08.17.2010

          I could not agree more. We have to stick with the standard interpretations, otherwise there will be too much garbage in, garbage out.

  24. 08.18.2010

    The misuse of term ‘homology’ in #bioinformatics_ Some people occasionally use these terms interchangeably, and this sums it up nicely. http://bit.ly/bEI5eU (RT@ abhishektiwari ) @raunakms