Tuesday 17 May 2011

Searching with CTLCSC or CSLCSC

If you search the NCBI archives for nitrile hydratase enzymes using just that name, you get a large number of hits. Previous blog posts have recorded how many hits a simple text search for the phrase "nitrile hydratase" returns, and how quickly that number is changing. It is a slightly different story if you take the RefSeq subset and search it for the six-amino-acid sequence that forms the metal-binding motif of NHases. I am sure there are more elegant ways to do this, but my approach was to download the search results as sequences in FASTA format, load the file into MS Word and use the "Find" function to look for each occurrence. [The "reading highlight" tool gives you the count immediately in the pop-up box.]
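For anyone who would rather script the count than use Word, here is a minimal Python sketch of the same idea. It is not the method used for the numbers in this post, and the function names are my own for illustration. It joins the wrapped sequence lines of each FASTA record before searching, so a motif that happens to straddle a line break is still found.

```python
from collections import Counter

# The two metal-binding motifs discussed in this post.
MOTIFS = {"CTLCSC": "cobalt centred", "CSLCSC": "iron centred"}


def read_fasta(lines):
    """Yield (header, sequence) pairs, joining wrapped sequence lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; emit the previous one if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def count_motifs(lines, motifs=MOTIFS):
    """Return (records containing each motif, total occurrences of each motif).

    Occurrences are counted without overlap, as str.count does.
    """
    records = Counter()
    occurrences = Counter()
    for _header, seq in read_fasta(lines):
        for motif in motifs:
            n = seq.count(motif)
            occurrences[motif] += n
            if n:
                records[motif] += 1
    return records, occurrences
```

To use it, pass an open file to count_motifs, for example `with open("nhase_refseq.fasta") as fh: records, occurrences = count_motifs(fh)`, where the file name is a placeholder for whatever you saved the download as. The totals may differ a little from a Word count, since Word searches the text line by line as displayed.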
Anyway, the results were:
CTLCSC (cobalt centred) - 123 hits
CSLCSC (iron centred) - 19 hits
So many more cobalt versions (87%) than iron versions (13%) are currently recorded, but given the incompleteness of the genome record, I reckon this may reflect what has been sequenced so far rather than the relative levels of occurrence in the wild.
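As a quick check on the percentages, the arithmetic can be sketched in a few lines of Python using the two hit counts given above (the variable names are just for illustration):

```python
# Hit counts reported in this post.
counts = {"CTLCSC": 123, "CSLCSC": 19}

total = sum(counts.values())  # 142 motif hits in all

# Share of each motif, rounded to the nearest whole percent.
shares = {motif: round(100 * n / total) for motif, n in counts.items()}

print(shares)  # {'CTLCSC': 87, 'CSLCSC': 13}
```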