NodeXL for visualizing biological network data in Excel and some other serious Microsoft ventures in Bioinformatics

Lately Microsoft has released a few interesting add-ins for its Office products. Yesterday I wrote about a Word add-in that enables the annotation of scientific documents using bio-ontologies and controlled vocabularies. Today I learned about their latest add-in for Excel, NodeXL, which lets users create and analyze network visualizations without any hassle. Like the ontology add-in, NodeXL is absolutely free and its code base is open source. NodeXL is primarily designed for extracting, analyzing and visualizing social media networks; in fact, it can connect directly to social networking websites such as Twitter to import network data for analysis in Excel. For instance, you can import the network of people who have recently tweeted a term that interests you.

Coming back to the main motive behind this post, I feel there is nothing to stop us from using NodeXL with biological data. It should be quite easy to extend NodeXL to import data from biological databases such as BIND. I know there are plenty of tools already available for biological network visualization; interested readers can find a detailed review of the major ones by Pavlopoulos et al. But considering that most biologists are more familiar with Excel than with any dedicated network visualization tool, using NodeXL will make more sense to them. Even in its current form NodeXL can be used to visualize biological data: users can import graph data in major file formats such as GraphML, UCINet, Pajek, and matrix.
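To make the GraphML route a bit more concrete, here is a minimal sketch in Python of turning a list of protein-protein interactions into a GraphML document that could then be handed to a GraphML importer. The function name `interactions_to_graphml` and the example protein pairs are my own choices for illustration and are not taken from BIND or from NodeXL; I have also not checked whether NodeXL expects any attributes beyond plain nodes and edges.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the GraphML format.
GRAPHML_NS = "http://graphml.graphdrawing.org/xmlns"


def interactions_to_graphml(interactions):
    """Return a GraphML string for an iterable of (protein_a, protein_b) pairs.

    Hypothetical helper for illustration; it writes only bare nodes and
    undirected edges, with no extra attributes.
    """
    # Serialize with GraphML as the default namespace (no "ns0:" prefixes).
    ET.register_namespace("", GRAPHML_NS)
    root = ET.Element(f"{{{GRAPHML_NS}}}graphml")
    graph = ET.SubElement(
        root, f"{{{GRAPHML_NS}}}graph", id="G", edgedefault="undirected"
    )

    interactions = list(interactions)

    # Collect unique protein names, keeping the order in which they appear.
    names = dict.fromkeys(name for pair in interactions for name in pair)
    for name in names:
        ET.SubElement(graph, f"{{{GRAPHML_NS}}}node", id=name)

    # One edge element per interaction.
    for a, b in interactions:
        ET.SubElement(graph, f"{{{GRAPHML_NS}}}edge", source=a, target=b)

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Example pairs chosen only for illustration.
    pairs = [("TP53", "MDM2"), ("BRCA1", "BARD1"), ("TP53", "BRCA1")]
    with open("interactions.graphml", "w", encoding="utf-8") as handle:
        handle.write(interactions_to_graphml(pairs))
```

Running the script writes `interactions.graphml` to the current directory. A real importer for a database like BIND would need to parse that database's own export format first, which this sketch does not attempt.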

Another thing that caught my attention is that Microsoft is working on two more bioinformatics extensions. The first is the Microsoft Biology Foundation (MBF), a language-neutral bioinformatics toolkit built as an extension to the Microsoft .NET framework. The second is the Biology Extension for Excel, which is built on top of MBF and offers a simple and flexible way to work with genomic sequences, metadata, and interval data in Excel. Both products are currently in beta and are developed by a dedicated Microsoft development team. The current release of MBF and its SDK includes the following:
Microsoft Biology Foundation:
  • Core MBF system library (bio.dll).
  • Core MBF web service library (WebServicesHandler.dll).
  • Advanced algorithms for Multiple Sequence Alignment and whole-genome de novo assembly, included both as core library features and as examples of how to add modular functionality to the core via the Addin auto-registration mechanism (Addin\PaDeNa.dll, Addin\Pamsam.dll).
  • Introductory and advanced technical documentation for those interested in learning more about the technology and its capabilities (Doc).
SDK:
  • Complete technical reference documentation explaining all available capabilities of the MBF framework (BioDotNet.chm).
  • Trident activity samples for generating genomics workflows using the Trident workflow workbench (bio.workflow.dll).
  • IronPython project showing how the MBF library's functionality can be used from the Python scripting language (BioDemo.py).
  • Sequence Simulator, a small sample application that splits large sequences into multiple shorter “reads”, intended to replicate what might come off a next-generation sequencing machine.
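To give a feel for what a read simulator like the one in the last item does, here is a minimal Python sketch that samples fixed-length reads from random positions of a longer sequence. This is not the MBF Sequence Simulator and makes no claim about how it works internally; the function name `simulate_reads` and its parameters are hypothetical, and the sketch models no sequencing errors or quality scores.

```python
import random


def simulate_reads(sequence, read_length, num_reads, seed=None):
    """Sample `num_reads` substrings of length `read_length` from `sequence`.

    Hypothetical helper for illustration: reads start at uniformly random
    positions and are exact copies of the source (no simulated errors).
    """
    if read_length <= 0 or read_length > len(sequence):
        raise ValueError("read_length must be between 1 and len(sequence)")

    # A private generator keeps results reproducible for a given seed.
    rng = random.Random(seed)
    last_start = len(sequence) - read_length

    reads = []
    for _ in range(num_reads):
        start = rng.randint(0, last_start)  # both endpoints are inclusive
        reads.append(sequence[start:start + read_length])
    return reads


if __name__ == "__main__":
    genome = "ACGTTGCAAGCTTAGGCTAACGTTAGCCGATAGCTA"
    for read in simulate_reads(genome, read_length=10, num_reads=5, seed=42):
        print(read)
```

Passing a fixed `seed` makes the sampled reads repeatable, which is handy when the output feeds a test of an assembler or aligner.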

Although MBF is a .NET-centric project, I think it is going to offer serious competition to OBF projects such as BioPerl and BioPython in terms of quality, optimization and stability. A final release of MBF is expected around June this year; after that, the project should be more open to external contributions, which currently seem quite restricted.

Responses to “NodeXL for visualizing biological network data in Excel and some other serious Microsoft ventures in Bioinformatics”
  4. 03.05.2010

    That would be terrible. I don’t want people to use Excel to do bioinformatics… I really hope they won’t continue with these plans for bioinformatics, considering all the wrong deeds and problems that Microsoft caused in other fields.

  5. 03.05.2010

    Thanks for your comment. I guess for bioinformaticians this is definitely terrible news, but for biologists I think this is quite a good feature. Writing your own code is always good, but not for every small requirement, and biologists often have to depend on a bioinformatics programmer. In chemistry R&D, Excel is widely used for different things; in fact there are Excel add-ins for chemistry such as ISIS Excel and JChem for Excel, and I always wondered why no one had given a thought to developing Excel extensions for bioinformatics. On another note, I think it is very good to have quality competition here; there is a lot of space for everyone and these projects are open source, so overall it’s a good thing.

  6. anilbioma
    03.05.2010

    I could not agree more, but their restrictive open licensing terms might be a big hurdle for external contributors and early adopters. Let’s see how it goes.
