This directory contains D.pseudoobscura/D.melanogaster alignments made
using the Aug. 2003 D.pseudoobscura assembly (dp1, BCM HGSC Freeze 1) vs.
the Jan. 2003 D.melanogaster assembly (dm1, BDGP Release 3.1).

Chained blastz alignments are in chain.gz. The chain format is described here:
    http://www.soe.ucsc.edu/~kent/src/unzipped/hg/mouseStuff/chainFormat.doc

A 'net' file that describes rearrangements between the fruitflies and the
'best' melanogaster match to any part of pseudoobscura is in net.gz. The net
format is described here:
    http://www.soe.ucsc.edu/~kent/src/unzipped/hg/mouseStuff/netFormat.doc

The alignments in axtNet/ are in 'axt' format. Each alignment contains three
lines and is separated from the next alignment by a blank line:

    Line 1 - summarizes the alignment.
    Line 2 - contains the D.pseudoobscura sequence with inserts.
    Line 3 - contains the D.melanogaster sequence with inserts.

The summary line contains 9 blank-separated fields with the following
meanings:

    1 - Alignment number. The first alignment in a file is numbered 0,
        the next 1, and so forth.
    2 - D.pseudoobscura chromosome.
    3 - Start in D.pseudoobscura chromosome. The first base is numbered 1.
    4 - End in D.pseudoobscura chromosome. The end base is included.
    5 - D.melanogaster chromosome.
    6 - Start in D.melanogaster.
    7 - End in D.melanogaster.
    8 - D.melanogaster strand. If this is '-' then the D.melanogaster start
        and end fields are relative to the reverse-complemented
        D.melanogaster chromosome.
    9 - Blastz score.

The scoring matrix blastz uses is:

            A      C      G      T
    A      91   -114    -31   -123
    C    -114    100   -125    -31
    G     -31   -125    100   -114
    T    -123    -31   -114     91

with a gap open penalty of 400 and a gap extension penalty of 30. The minimum
score for an alignment to be kept was 3000 for the first pass and 2200 for
the second pass, which restricts the search space to the regions between two
alignments found in the first pass.

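As an illustration of the axt layout and scoring parameters described above,
here is a minimal Python sketch that reads axt records and recomputes a score
from the two sequence lines. The names Axt, read_axt, to_forward and rescore
are hypothetical and not part of any UCSC tool. The sketch assumes that a gap
of length k costs 400 + 30*k and that symbols outside A/C/G/T score 0. Both
are assumptions, so a recomputed score may differ from field 9.

```python
from typing import Iterable, Iterator, NamedTuple, Tuple

BASES = "ACGT"
MATRIX = [
    [91, -114, -31, -123],
    [-114, 100, -125, -31],
    [-31, -125, 100, -114],
    [-123, -31, -114, 91],
]
# Substitution scores keyed by (pseudoobscura base, melanogaster base).
SCORES = {(a, b): MATRIX[i][j]
          for i, a in enumerate(BASES) for j, b in enumerate(BASES)}
GAP_OPEN, GAP_EXTEND = 400, 30


class Axt(NamedTuple):
    number: int
    dp_chrom: str
    dp_start: int   # 1-based, inclusive
    dp_end: int
    dm_chrom: str
    dm_start: int
    dm_end: int
    dm_strand: str
    score: int
    dp_seq: str
    dm_seq: str


def read_axt(lines: Iterable[str]) -> Iterator[Axt]:
    """Yield one Axt per three-line record; blank and '#' lines are skipped."""
    it = (ln.strip() for ln in lines)
    for line in it:
        if not line or line.startswith("#"):
            continue
        f = line.split()
        if len(f) != 9:
            raise ValueError(f"expected 9 summary fields: {line!r}")
        dp_seq, dm_seq = next(it, ""), next(it, "")
        yield Axt(int(f[0]), f[1], int(f[2]), int(f[3]), f[4],
                  int(f[5]), int(f[6]), f[7], int(f[8]), dp_seq, dm_seq)


def to_forward(start: int, end: int, chrom_size: int) -> Tuple[int, int]:
    """Map 1-based inclusive '-' strand coordinates to the forward strand."""
    return chrom_size - end + 1, chrom_size - start + 1


def rescore(dp_seq: str, dm_seq: str) -> int:
    """Sum matrix scores; each gap column costs GAP_EXTEND, plus GAP_OPEN
    when it starts a gap run (assumed convention). Adjacent gaps in opposite
    sequences are treated as one run here, which is a simplification."""
    total, in_gap = 0, False
    for a, b in zip(dp_seq.upper(), dm_seq.upper()):
        if a == "-" or b == "-":
            total -= GAP_EXTEND + (0 if in_gap else GAP_OPEN)
            in_gap = True
        else:
            total += SCORES.get((a, b), 0)
            in_gap = False
    return total


# Made-up record for illustration only; not taken from the real files.
sample = ["0 chrA 11 18 chrB 21 27 - 0", "ACGTACGT", "ACG-ACGA", ""]
records = list(read_axt(sample))
```

For a '-' strand record, to_forward needs the D.melanogaster chromosome
size, which is not stored in the axt file and has to come from elsewhere.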
The alignments were made with blastz, which is available from Webb Miller's
group at PSU. Each chromosome was divided into chunks of 10,010,000 bases,
with 10,000 bases of overlap between adjacent chunks. The blastz output in
.lav format, which does not include the sequence, was converted to .axt with
lavToAxt.
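
The chunking arithmetic can be sketched as follows. This is a guess at the
scheme, not the actual pipeline code: chunk_ranges is a hypothetical helper,
and the use of 0-based half-open ranges is an assumption. With a chunk size
of 10,010,000 and an overlap of 10,000, consecutive chunks start 10,000,000
bases apart.

```python
from typing import Iterator, Tuple


def chunk_ranges(chrom_size: int,
                 chunk: int = 10_010_000,
                 overlap: int = 10_000) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) half-open ranges covering a chromosome, where
    each chunk shares `overlap` bases with the chunk that follows it."""
    if chrom_size <= 0:
        return
    step = chunk - overlap
    start = 0
    while True:
        end = min(start + chunk, chrom_size)
        yield start, end
        if end >= chrom_size:
            break
        start += step


# A hypothetical 25,000,000-base chromosome splits into three chunks.
ranges = list(chunk_ranges(25_000_000))
```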