Sunday 12 June 2011

A better estimate of the number of NHases.

I have spent some time devising a spreadsheet into which you can load a list of sequences in FASTA format. It scans each one for the "CTLCSC" fragment, which indicates a cobalt-centred enzyme, and the "CSLCSC" fragment, which indicates an iron-centred one, and totals the hits for each. This is rather more definite than relying on the label "nitrile hydratase alpha" to give you accurate numbers!
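For anyone who prefers a script to a spreadsheet, the same count could be sketched in Python along these lines. This is only a minimal sketch and not the spreadsheet's actual logic: the names `parse_fasta` and `count_centres` are made up for illustration, the example sequences are invented, and it assumes plain-text FASTA with header lines starting with `>`.

```python
from collections import Counter

# Motifs from the post: CTLCSC marks cobalt-centred, CSLCSC iron-centred.
MOTIFS = {"cobalt": "CTLCSC", "iron": "CSLCSC"}


def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            # Sequence data may be wrapped over several lines.
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def count_centres(text):
    """Count sequences containing each motif (at most one hit per sequence)."""
    counts = Counter()
    for _, seq in parse_fasta(text):
        for metal, motif in MOTIFS.items():
            if motif in seq:
                counts[metal] += 1
    return counts


# Invented toy records, purely to exercise the functions.
example = """>seq1 invented cobalt-type entry
MAAACTLCSCAAA
>seq2 invented iron-type entry
MAAACSLCSC
AAA
>seq3 no motif
MAAAA
"""
print(count_centres(example))
```

Counting one hit per sequence, rather than every occurrence of the fragment, is a design choice here so that the totals read as numbers of NHases.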
Searching for the term "nitrile hydratase alpha" and restricting the results to RefSeq gives 121 cobalt-centred NHases and 18 iron-centred NHases. If, on the other hand, you use all the sequences tagged as "bacteria", you get 278 cobalt-centred NHases and 131 iron-centred NHases. There are a few things to mention: these numbers don't include sequences which have an X in the fragment to indicate that the cysteine is oxidized (these tend to be the PDB-linked sequences); they do include some sequences which don't start with M, which would make a molecular biologist wince; and the list hasn't been checked for duplicates.
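The caveats above could be tallied with a short check like the one below. It is a hedged sketch, not something the spreadsheet does: `x_variants` and `audit` are hypothetical helper names, a "duplicate" is assumed to mean an identical sequence string, and since it is not stated which position carries the X, the pattern allows any single residue of the fragment to be X. The records are invented.

```python
import re


def x_variants(motif):
    """Regex matching the motif with exactly one residue replaced by X."""
    alts = [motif[:i] + "X" + motif[i + 1:] for i in range(len(motif))]
    return re.compile("|".join(alts))


def audit(records):
    """Tally the caveats for an iterable of (header, sequence) pairs."""
    seen = set()
    report = {"duplicates": 0, "no_initial_M": 0, "x_in_motif": 0}
    patterns = [x_variants("CTLCSC"), x_variants("CSLCSC")]
    for _, seq in records:
        # Assumption: identical sequence strings count as duplicates.
        if seq in seen:
            report["duplicates"] += 1
            continue
        seen.add(seq)
        if not seq.startswith("M"):
            report["no_initial_M"] += 1
        if any(p.search(seq) for p in patterns):
            report["x_in_motif"] += 1
    return report


# Invented toy records, purely to exercise the function.
records = [
    ("a", "MAAACTLCSCAAA"),
    ("b", "MAAACTLCSCAAA"),  # exact duplicate of a
    ("c", "AAACSLCSCAAA"),   # does not start with M
    ("d", "MAAACTLXSCAAA"),  # X inside the fragment
]
print(audit(records))
```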
