Description

This track shows multiple alignments of 11 primate assemblies on this target/reference assembly (siamang - 2024-01-05 - National Human Genome Research Institute, National Institutes of Health).

This is a composite track containing two 11-way multiple alignments of the same set of assemblies: one calculated with multiz and one calculated with the cactus alignment procedure.

The multiz multiple alignment is generated using multiz and other tools in the UCSC/Penn State Bioinformatics comparative genomics alignment pipeline.

Gap Annotation

The Display chains between alignments configuration option enables display of gaps between alignment blocks in the pairwise alignments in a manner similar to the Chain track display. Missing sequence in any assembly is highlighted in the track display by regions of yellow when zoomed out and by Ns when displayed at base level. The following conventions are used:

Genomic Breaks

Discontinuities in the genomic context (chromosome, scaffold or region) of the aligned DNA in the aligning species are shown as follows:

Base Level

When zoomed in to the base-level display, the track shows the base composition of each alignment. The numbers and symbols on the Gaps line indicate the lengths of gaps in the siamang (2024-01-05, National Human Genome Research Institute, National Institutes of Health) sequence at those alignment positions relative to the longest non-siamang sequence. If there is sufficient space in the display, the size of the gap is shown. If the space is insufficient and the gap size is a multiple of 3, a "*" is displayed; other gap sizes are indicated by "+".
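The Gaps line rule described above can be sketched in a few lines of Python. This is only an illustration of the stated rule, not the browser's actual code: the function name `gap_label` and the `chars_available` parameter are hypothetical, and "sufficient space" is assumed here to mean that the digits of the gap size fit in the available character cells.

```python
def gap_label(gap_size: int, chars_available: int) -> str:
    """Return the text shown on the Gaps line for a single gap.

    Assumption: "sufficient space" means the decimal digits of the
    gap size fit into `chars_available` character cells.
    """
    if gap_size <= 0:
        return ""  # no gap, nothing to draw
    digits = str(gap_size)
    if len(digits) <= chars_available:
        return digits  # enough room: show the gap size itself
    # Not enough room: "*" for multiples of 3, "+" otherwise.
    return "*" if gap_size % 3 == 0 else "+"


print(gap_label(12, 2))  # room for the number
print(gap_label(12, 1))  # no room, size is a multiple of 3
print(gap_label(10, 1))  # no room, size is not a multiple of 3
```

A gap whose size is a multiple of 3 is marked separately because, within coding sequence, such a gap does not shift the reading frame.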

count  alignment percent  assembly and browser link                maf file type  common name/assembly date                       assembly submitter
01     n/a                GCA_028878055.2_NHGRI_mSymSyn1-v2.0_pri  reference      siamang (v2 Jambi primary hap 2024)/2024-01-05
02     81.412             GCA_028885625.2                          syntenic net   Bornean orangutan/2024-01-08                    NHGRI/NIH
03     81.382             GCA_029281585.2                          syntenic net   western lowland gorilla/2024-01-08              NHGRI/NIH
04     81.379             GCA_028885655.2                          syntenic net   Sumatran orangutan/2024-01-05                   NHGRI/NIH
05     81.295             hg38                                     syntenic net   Human/hg38/Dec. 2013 (GRCh38/hg38)              GRCh38 Genome Reference Consortium Human Reference 38 (GCA_000001405.15)
06     81.290             GCA_028858775.2                          syntenic net   chimpanzee/2024-01-08                           NHGRI/NIH
07     81.285             GCA_029289425.2                          syntenic net   pygmy chimpanzee/2024-01-08                     NHGRI/NIH
08     81.278             hs1                                      syntenic net   Human/hs1/Jan. 2022 (T2T CHM13v2.0/hs1)         Telomere to telomere (T2T) assembly of haploid CHM13 + chrY (GCA_009914755.4)
09     62.736             GCF_011100555.1                          maf net        white-tufted-ear marmoset/2021-04-28            VGP
10     29.935             GCF_020740605.2                          maf net        Ring-tailed lemur/2021-11-04                    VGP
11     13.927             GCF_027406575.1                          maf net        slow loris/2022-12-28                           VGP

Alignments identity

Percent identity, showing how much of the target is matched by the query:

chains  syntenic  reciprocal best  common name                assembly
81.412  80.736    77.837           Bornean orangutan          GCA_028885625.2
81.382  80.559    77.455           western lowland gorilla    GCA_029281585.2
81.379  80.705    77.828           Sumatran orangutan         GCA_028885655.2
81.295  80.447    77.315           Human                      hg38
81.290  80.505    77.400           chimpanzee                 GCA_028858775.2
81.285  80.442    77.384           pygmy chimpanzee           GCA_029289425.2
81.278  80.475    77.353           Human                      hs1
62.736  61.843    60.015           white-tufted-ear marmoset  GCF_011100555.1
29.935  29.277    28.347           Ring-tailed lemur          GCF_020740605.2
13.927  12.972    13.320           slow loris                 GCF_027406575.1
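As a rough illustration of what "how much of the target is matched by the query" could mean, the sketch below computes the percentage of target bases covered by at least one aligned block. The exact calculation behind the table is not stated on this page, so this is an assumption: the function name `percent_covered`, the half-open interval convention and the toy numbers are all hypothetical.

```python
def percent_covered(blocks, target_size):
    """Percent of target bases covered by at least one aligned block.

    blocks: iterable of (start, end) half-open intervals on the target.
    target_size: total number of bases in the target sequence.
    Overlapping blocks are merged so that no base is counted twice.
    """
    covered = 0
    cur_start = cur_end = None
    for start, end in sorted(blocks):
        if cur_end is None or start > cur_end:
            # Disjoint from the current run: bank the run, start a new one.
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent: extend the current run.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start
    return 100.0 * covered / target_size


# Toy example: blocks cover 0-100 and 200-250 of a 1,000-base target.
print(percent_covered([(0, 50), (40, 100), (200, 250)], 1000))
```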

Display Conventions and Configuration

In full and pack display modes, conservation scores are displayed as a wiggle track (histogram) in which the height reflects the size of the score. The conservation wiggles can be configured in a variety of ways to highlight different aspects of the displayed information. Click the Graph configuration help link for an explanation of the configuration options.

Methods

Pairwise alignments of each species to the siamang (GCA_028878055.2, 2024-01-05) genome are displayed below the conservation histogram as a grayscale density plot (in pack mode) or as a wiggle (in full mode) that indicates alignment quality. In dense display mode, conservation is shown in grayscale using darker values to indicate higher levels of overall conservation as scored by phastCons.
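The "darker means more conserved" convention can be pictured with a minimal sketch. phastCons scores are probabilities between 0 and 1; the linear mapping to an 8-bit gray level below is an assumption made for illustration, and the function name `score_to_gray` is hypothetical. The browser's real shading scheme may differ.

```python
def score_to_gray(score: float) -> int:
    """Map a conservation score in [0, 1] to an 8-bit gray level.

    255 is white (no conservation) and 0 is black (highest conservation).
    Scores outside [0, 1] are clamped before mapping.
    """
    clamped = min(max(score, 0.0), 1.0)
    return round(255 * (1.0 - clamped))


for s in (0.0, 0.25, 1.0):
    print(s, score_to_gray(s))
```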

Checkboxes on the track configuration page allow selection of the species to include in the pairwise display. Note that excluding species from the pairwise display does not alter the conservation score display.

To view detailed information about the alignments at a specific position, zoom the display in to 30,000 or fewer bases, then click on the alignment.

For the cactus alignment, a target-specific maf file was extracted from the cactus hal file. This maf file is used to construct the track.
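For readers unfamiliar with the maf format, the sketch below reads alignment blocks from maf text. It is a minimal reader written for illustration, not the tooling used to build this track: it handles only the "a" (block start) and "s" (sequence) lines, ignores all other line types, and the sequence names and coordinates in the example are made up.

```python
from typing import NamedTuple


class MafSeq(NamedTuple):
    src: str       # sequence name, conventionally "assembly.chrom"
    start: int     # zero-based start on the given strand
    size: int      # number of non-gap bases in this row
    strand: str    # "+" or "-"
    src_size: int  # full length of the source sequence
    text: str      # aligned bases, with "-" for gaps


def parse_maf(lines):
    """Return a list of blocks, each a list of MafSeq rows."""
    blocks = []
    current = None
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and header/comment lines
        fields = line.split()
        if fields[0] == "a":
            current = []  # an "a" line opens a new alignment block
            blocks.append(current)
        elif fields[0] == "s" and current is not None:
            _, src, start, size, strand, src_size, text = fields
            current.append(
                MafSeq(src, int(start), int(size), strand, int(src_size), text)
            )
    return blocks


# Hypothetical two-row block; names and coordinates are invented.
EXAMPLE = """##maf version=1
a score=0.0
s siamang.chr1 100 8 + 1000 ACGT-ACGT
s hg38.chr1    200 7 + 2000 ACGT-ACG-
"""

for block in parse_maf(EXAMPLE.splitlines()):
    for row in block:
        print(row.src, row.start, row.size, row.text)
```

In each "s" row the size field counts bases only, so it equals the length of the aligned text with the "-" gap characters removed.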

Credits

This track was created using the following programs:

References

Harris RS. Improved pairwise alignment of genomic DNA. Ph.D. Thesis. Pennsylvania State University, USA. 2007.

PhyloP:

Cooper GM, Stone EA, Asimenos G, NISC Comparative Sequencing Program., Green ED, Batzoglou S, Sidow A. Distribution and intensity of constraint in mammalian genomic sequence. Genome Res. 2005 Jul;15(7):901-13. PMID: 15965027; PMC: PMC1172034; DOI: 10.1101/gr.3577405

Pollard KS, Hubisz MJ, Rosenbloom KR, Siepel A. Detection of nonneutral substitution rates on mammalian phylogenies. Genome Res. 2010 Jan;20(1):110-21. PMID: 19858363; PMC: PMC2798823

Siepel A, Haussler D. Phylogenetic Hidden Markov Models. In: Nielsen R, editor. Statistical Methods in Molecular Evolution. New York: Springer; 2005. pp. 325-351. DOI: 10.1007/0-387-27733-1_12

Siepel A, Pollard KS, Haussler D. New methods for detecting lineage-specific selection. In: Proceedings of the 10th International Conference on Research in Computational Molecular Biology (RECOMB 2006). pp. 190-205. DOI: 10.1007/11732990_17

Armstrong J, Hickey G, Diekhans M, Fiddes IT, Novak AM, Deran A, Fang Q, Xie D, Feng S, Stiller J et al. Progressive Cactus is a multiple-genome aligner for the thousand-genome era. Nature. 2020 Nov;587(7833):246-251. DOI: 10.1038/s41586-020-2871-y; PMID: 33177663; PMC: PMC7673649