Tuesday, January 13, 2015

Searching for Open Reading Frames (ORFs) in DNA Sequences - ORF Finder

Open reading frames (ORFs) are stretches of DNA that can potentially be translated into protein. An ORF lies between a start codon and a stop codon, and ORFs that actually code for proteins are usually long.

The Python script below searches for ORFs in all six reading frames and returns the longest one. It doesn't treat the start codon as a delimiter; it only splits the sequence at stop codons. So an ORF can start with any codon, but it always ends with a stop codon (TAG, TGA or TAA). The six frames come from the sequence itself and its reverse complement: each of the two strands is read three times, starting at the first, second and third nucleotide.
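To make the six-frame idea concrete, here is a minimal sketch of the approach described above. It is an illustration, not the original script: the function names (`reverse_complement`, `six_frames`, `orfs_in_frame`, `longest_orf`) are made up for this example, and it assumes that the closing stop codon is kept as part of the ORF and that a trailing piece with no stop codon is discarded.

```python
# Complement table for the four unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAG", "TGA", "TAA"}


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def six_frames(seq):
    """Yield (strand, offset, frame) for the three frames of each strand."""
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            yield strand, offset, s[offset:]


def orfs_in_frame(frame):
    """Split one frame at stop codons.

    Assumption: each ORF includes the stop codon that closes it, and
    leftover codons after the last stop codon are dropped.
    """
    current = []
    # Step through complete codons only; an incomplete tail is ignored.
    for i in range(0, len(frame) - 2, 3):
        codon = frame[i:i + 3]
        current.append(codon)
        if codon in STOP_CODONS:
            yield "".join(current)
            current = []


def longest_orf(seq):
    """Return the longest ORF over all six frames, or '' if there is none."""
    seq = seq.upper()
    candidates = [
        orf
        for _, _, frame in six_frames(seq)
        for orf in orfs_in_frame(frame)
    ]
    return max(candidates, key=len, default="")


if __name__ == "__main__":
    print(longest_orf("ATGAAATAGCCCTGA"))
```

With this sketch, a real sequence would be read from a FASTA file first and passed to `longest_orf` as a plain string. If two ORFs have the same length, the one found first (forward strand, smallest offset) is returned.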

There are also local and online tools that perform the same task. My suggestions are:

NCBI ORF Finder (Online)
STAR: Orf by MIT (Online / Local)
