This file is from:
    http://hgdownload.cse.ucsc.edu/goldenPath/geoFor1/multiz7way/README.txt

This directory contains compressed multiple alignments of the
following assemblies to the medium ground finch genome
(geoFor1, Apr. 2012):

Assemblies used in these alignments:

==== reference assembly:
Medium ground finch  Geospiza fortis          Apr. 2012  BGI/geoFor1          reference

==== Birds subset:
Zebra finch          Taeniopygia guttata      Jul. 2008  WUGSC 3.2.4/taeGut1  Syntenic net
Budgerigar           Melopsittacus undulatus  Sep. 2011  WUSTL v6.3/melUnd1   Syntenic net
Chicken              Gallus gallus            Nov. 2011  ICGSC 4.0/galGal4    Syntenic net
Turkey               Meleagris gallopavo      Dec. 2009  TGC 2.01/melGal1     Syntenic net

==== Vertebrate subset:
Human                Homo sapiens             Feb. 2009  GRCh37/hg19          Reciprocal best
Mouse                Mus musculus             Dec. 2011  GRCm38/mm10          Reciprocal best

---------------------------------------------------------------

These alignments were prepared using the methods described in the
track description file:
    http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=geoFor1&g=cons7way
based on the phylogenetic tree: geoFor1.7way.nh.

Files in this directory:
  - geoFor1.7way.nh - phylogenetic tree used during the multiz
        multiple alignment
  - geoFor1.commonNames.7way.nh - same as geoFor1.7way.nh with the
        UCSC database name replaced by the common name for the species
  - upstream1000.xenoRefGene.maf.gz - alignments in regions upstream,
        see below
  - upstream2000.xenoRefGene.maf.gz - alignments in regions upstream,
        see below
  - upstream5000.xenoRefGene.maf.gz - alignments in regions upstream,
        see below

See also:
    http://genomewiki.ucsc.edu/index.php/GeoFor1_conservation_alignment

The "alignments" directory contains compressed FASTA alignments
for the Xeno RefSeq CDS regions of the medium ground finch genome
(geoFor1, Apr. 2012) aligned to the assemblies.

The maf/geoFor1.7way.maf.gz file contains the alignments to the
medium ground finch assembly, with additional annotations to indicate
gap context and genomic breaks for the sequence in the underlying
genome assemblies. Beware: the compressed size of this file is
1.5 Gb; uncompressed, it is approximately 7.5 Gb.

The upstream*.maf.gz files contain alignments in regions upstream of
annotated transcription starts for Xeno RefSeq genes with annotated
5' UTRs. These files differ from the standard MAF format: they display
alignments that extend from start to end of the upstream region in
medium ground finch, whether or not alignments actually exist. Where
no alignments exist, or the alignments of one or more species are
missing, a dot (".") is used as a placeholder. Multiple regions of an
assembly's sequence may align to a single region in the medium ground
finch; therefore, only the species name is displayed in the alignment
data and no position information is recorded. The alignment score is
always zero in these files. These files are updated weekly.

For a description of multiple alignment format (MAF), see
    http://genome.ucsc.edu/goldenPath/help/maf.html

PhastCons conservation scores for these alignments are available at:
    http://hgdownload.cse.ucsc.edu/goldenPath/geoFor1/phastCons7way

PhyloP conservation scores for these alignments are available at:
    http://hgdownload.cse.ucsc.edu/goldenPath/geoFor1/phyloP7way

---------------------------------------------------------------

To download a large file or multiple files from this directory, we
recommend that you use rsync or ftp rather than downloading the files
via our website. There is approximately 31 Gb of compressed data in
this directory.

Via rsync:

    rsync -avz --progress \
        rsync://hgdownload.cse.ucsc.edu/goldenPath/geoFor1/multiz7way/ ./

Via FTP:

    ftp hgdownload.cse.ucsc.edu
    user name: anonymous
    password:
    go to the directory goldenPath/geoFor1/multiz7way

To download multiple files from the UNIX command line, use the
"mget" command.

    mget ...
        - or -
    mget -a (to download all the files in the directory)

Use the "prompt" command to toggle the interactive mode if you do not
want to be prompted for each file that you download.

---------------------------------------------------------------

All the files in this directory are freely usable for any purpose.
For data use restrictions regarding the individual genome assemblies,
see http://genome.ucsc.edu/goldenPath/credits.html.

---------------------------------------------------------------
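
Example: reading MAF blocks

The following is a minimal sketch, not part of the UCSC tooling, of how
the alignment blocks in the MAF files described above might be read in
Python. It assumes the standard seven-field "s" line layout from the
MAF format description; the upstream*.maf.gz files deviate from that
layout, so treat the field handling as an assumption to check against
the real data. The function names (iter_maf_blocks, species_of,
is_placeholder), the scaffold and chromosome names, and all sequences
in SAMPLE are invented for illustration.

```python
import gzip
from typing import Dict, Iterable, Iterator


def iter_maf_blocks(lines: Iterable[str]) -> Iterator[Dict]:
    """Yield one dict per alignment block: {'score': ..., 'rows': [...]}."""
    block = None
    for raw in lines:
        line = raw.strip()
        if line.startswith("#"):          # header and comment lines
            continue
        if not line:                      # a blank line ends the current block
            if block is not None:
                yield block
                block = None
            continue
        fields = line.split()
        if fields[0] == "a":              # start of a new alignment block
            if block is not None:
                yield block
            score = None
            for item in fields[1:]:
                key, _, value = item.partition("=")
                if key == "score":
                    score = float(value)
            block = {"score": score, "rows": []}
        elif fields[0] == "s" and block is not None and len(fields) == 7:
            _, src, start, size, strand, src_size, text = fields
            block["rows"].append({
                "src": src,
                "start": int(start),
                "size": int(size),
                "strand": strand,
                "src_size": int(src_size),
                "text": text,
            })
        # Other line types (for example "i", "e", "q") are ignored here.
    if block is not None:
        yield block


def species_of(src: str) -> str:
    """Return the part of a source name before the first dot."""
    return src.split(".", 1)[0]


def is_placeholder(text: str) -> bool:
    """True when a row consists only of the '.' placeholder character."""
    return bool(text) and set(text) == {"."}


# Invented sample data; names and sequences are hypothetical.
SAMPLE = """##maf version=1
a score=0.0
s geoFor1.scaffold1 100 8 + 5000 ACGTACGT
s taeGut1.chr1 200 8 + 9000 ACGTAC-T
s galGal4.chr2 0 8 + 7000 ........

a score=0.0
s geoFor1.scaffold2 10 4 + 3000 GGCC
"""

if __name__ == "__main__":
    for block in iter_maf_blocks(SAMPLE.splitlines()):
        aligned = [species_of(r["src"]) for r in block["rows"]
                   if not is_placeholder(r["text"])]
        print(block["score"], aligned)
```

To read a downloaded file instead of the sample, the same generator can
be fed a text-mode gzip handle, for example
gzip.open("maf/geoFor1.7way.maf.gz", "rt"). Iterating block by block
avoids loading the uncompressed file into memory.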