Notes on Using BLAST

Legacy BLAST (the blastall program, not BLAST+) provides an option for tabular output that is easily parsed. Use the -m 8 option for tabular output, or the -m 9 option to also include comment header lines (they start with #).

blastall -i input.fa -d /path/to/db.fa -p blastn -m 8
This assumes that the directory containing db.fa also holds the formatted database files for it. They can be created with:
formatdb -i db.fa -p F
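The two commands above can also be driven from a script. What follows is a minimal sketch in Python, assuming formatdb and blastall are on the PATH. The function names and the out.blast file name are hypothetical and only for illustration; -o is blastall's option for writing the report to a file.

```python
import subprocess


def build_commands(query_fasta, db_fasta, out_path):
    """Return the formatdb and blastall argument lists used in these notes."""
    # -p F tells formatdb that the input is nucleotide, not protein.
    formatdb_cmd = ["formatdb", "-i", db_fasta, "-p", "F"]
    # -m 8 requests tabular output; -o writes it to out_path.
    blastall_cmd = [
        "blastall", "-i", query_fasta, "-d", db_fasta,
        "-p", "blastn", "-m", "8", "-o", out_path,
    ]
    return formatdb_cmd, blastall_cmd


def run_blast(query_fasta, db_fasta, out_path):
    """Format the database, then run the search (hypothetical helper)."""
    for cmd in build_commands(query_fasta, db_fasta, out_path):
        # check=True raises CalledProcessError if a program exits non-zero.
        subprocess.run(cmd, check=True)
```

Calling run_blast("input.fa", "/path/to/db.fa", "out.blast") would then leave the tabular hits in out.blast, provided both programs are installed.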
The fields for tabular BLAST output are:
1 Query The query sequence id
2 Subject The matching subject sequence id
3 % id
4 alignment length
5 mismatches
6 gap openings
7 q.start
8 q.end
9 s.start
10 s.end
11 e-value
12 bit score
Parse the tabular output in Python or Perl:


for line in open("myfile.blast"):
    # Drop the trailing newline, then split the 12 tab-separated columns.
    (queryId, subjectId, percIdentity, alnLength, mismatchCount, gapOpenCount, queryStart, queryEnd, subjectStart, subjectEnd, eVal, bitScore) = line.rstrip("\n").split("\t")
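The Python loop above keeps every column as a string and would fail on the # comment lines that -m 9 adds. A possible extension is sketched below: it converts each column to a number where that makes sense and skips comment lines. The BlastHit type and both function names are hypothetical, and the sketch assumes e-values are written in a form that float() accepts (such as 1e-100 or 0.0).

```python
from collections import namedtuple

# Field names follow the variable names used in the loop above.
BlastHit = namedtuple("BlastHit", [
    "queryId", "subjectId", "percIdentity", "alnLength",
    "mismatchCount", "gapOpenCount", "queryStart", "queryEnd",
    "subjectStart", "subjectEnd", "eVal", "bitScore",
])


def parse_blast_line(line):
    """Turn one tabular (-m 8) line into a BlastHit with typed fields."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 12:
        raise ValueError("expected 12 columns, got %d" % len(fields))
    return BlastHit(
        fields[0], fields[1],
        float(fields[2]),                  # percent identity
        *[int(f) for f in fields[3:10]],   # lengths, counts, coordinates
        float(fields[10]),                 # e-value
        float(fields[11]),                 # bit score
    )


def read_blast_table(path):
    """Yield a BlastHit per hit line, skipping blank and # comment lines."""
    with open(path) as handle:
        for line in handle:
            if line.startswith("#") or not line.strip():
                continue
            yield parse_blast_line(line)
```

With this in place, a loop such as `for hit in read_blast_table("myfile.blast"):` gives access to named, typed values like hit.eVal and hit.alnLength.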


while (<>) {
    chomp;  # drop the trailing newline before splitting on tabs
    my ($queryId, $subjectId, $percIdentity, $alnLength, $mismatchCount, $gapOpenCount, $queryStart, $queryEnd, $subjectStart, $subjectEnd, $eVal, $bitScore) = split(/\t/);
}