================================================================ ======== addCols ==================================== ================================================================ addCols - Sum columns in a text file. usage: addCols <fileName> adds all columns (up to 16 columns) in the given file and outputs the sum of each column. <fileName> can be the name: stdin to accept input from stdin. ================================================================ ======== ameme ==================================== ================================================================ ameme - find common patterns in DNA usage: ameme good=goodIn.fa [bad=badIn.fa] [numMotifs=2] [background=m1] [maxOcc=2] [motifOutput=fileName] [html=output.html] [gif=output.gif] [rcToo=on] [controlRun=on] [startScanLimit=20] [outputLogo] [constrainer=1] where goodIn.fa is a multi-sequence fa file containing instances of the motif you want to find, badIn.fa is a file containing similar sequences but lacking the motif, numMotifs is the number of motifs to scan for, background is m0, m1, or m2 for various levels of Markov models, maxOcc is the maximum number of occurrences of the motif you expect to find in a single sequence, and motifOutput is the name of a file to store just the motifs in. rcToo=on searches both strands. If you include controlRun=on in the command line, a random set of sequences will be generated that matches your foreground data set in size and your background data set in nucleotide probabilities. The program will then look for motifs in this random set. If the scores you get in a real run are about the same as those you get in a control run, then the motifs Improbizer has found are probably not significant. ================================================================ ======== autoDtd ==================================== ================================================================ autoDtd - Give this an XML document to look at and it will come up with a DTD to describe it. usage: autoDtd in.xml out.dtd out.stats options: -tree=out.tree - Output tag tree. -atree=out.atree - Output attributed tag tree. ================================================================ ======== autoSql ==================================== ================================================================ autoSql - create SQL and C code for permanently storing a structure in a database and loading it back into memory based on a specification file usage: autoSql specFile outRoot {optional: -dbLink -withNull -json} This will create outRoot.sql, outRoot.c and outRoot.h based on the contents of specFile. options: -dbLink - optionally generates code to execute queries and updates of the table. -addBin - Add an initial bin field and index it as (chrom,bin) -withNull - optionally generates code and .sql to enable applications to accept and load data into objects with potential 'missing data' (NULL in SQL) situations. -defaultZeros - will put zero and/or empty string as default value -django - generate method to output object as django model Python code -json - generate method to output the object in JSON (JavaScript) format. ================================================================ ======== autoXml ==================================== ================================================================ autoXml - Generate structure code and a parser for an XML file from a DTD-like spec usage: autoXml file.dtdx root This will generate root.c and root.h options: -textField=xxx what to name text between start/end tags.
Default 'text' -comment=xxx Comment to appear at top of generated code files -picky Generate parser that rejects stuff it doesn't understand -main Put in a main routine that's a test harness -prefix=xxx Prefix to add to structure names. By default same as root -positive Don't write out optional attributes with negative values ================================================================ ======== ave ==================================== ================================================================ ave - Compute average and basic stats usage: ave file options: -col=N Which column to use. Default 1 -tableOut - output by columns (default output in rows) -noQuartiles - only calculate min,max,mean,standard deviation - for large data sets that will not fit in memory. ================================================================ ======== aveCols ==================================== ================================================================ aveCols - average together columns usage: aveCols file adds all columns (up to 16 columns) in the given file, outputs the average (sum/#ofRows) of each column. can be the name: stdin to accept input from stdin. ================================================================ ======== axtChain ==================================== ================================================================ axtChain - Chain together axt alignments. usage: axtChain -linearGap=loose in.axt tNibDir qNibDir out.chain Where tNibDir/qNibDir are either directories full of nib files, or the name of a .2bit file options: -psl Use psl instead of axt format for input -faQ qNibDir is a fasta file with multiple sequences for query -faT tNibDir is a fasta file with multiple sequences for target -minScore=N Minimum score for chain, default 1000 -details=fileName Output some additional chain details -scoreScheme=fileName Read the scoring matrix from a blastz-format file -linearGap= Specify type of linearGap to use. *Must* specify this argument to one of these choices. loose is chicken/human linear gap costs. medium is mouse/human linear gap costs. Or specify a piecewise linearGap tab delimited file. sample linearGap file (loose) tablesize 11 smallSize 111 position 1 2 3 11 111 2111 12111 32111 72111 152111 252111 qGap 325 360 400 450 600 1100 3600 7600 15600 31600 56600 tGap 325 360 400 450 600 1100 3600 7600 15600 31600 56600 bothGap 625 660 700 750 900 1400 4000 8000 16000 32000 57000 ================================================================ ======== axtSort ==================================== ================================================================ axtSort - Sort axt files usage: axtSort in.axt out.axt options: -query - Sort by query position, not target -byScore - Sort by score ================================================================ ======== axtSwap ==================================== ================================================================ axtSwap - Swap source and query in an axt file usage: axtSwap source.axt target.sizes query.sizes dest.axt options: -xxx=XXX ================================================================ ======== axtToMaf ==================================== ================================================================ axtToMaf - Convert from axt to maf format usage: axtToMaf in.axt tSizes qSizes out.maf Where tSizes and qSizes is a file that contains the sizes of the target and query sequences. Very often this with be a chrom.sizes file Options: -qPrefix=XX. - add XX. 
to start of query sequence name in maf -tPrefex=YY. - add YY. to start of target sequence name in maf -tSplit Create a separate maf file for each target sequence. In this case output is a dir rather than a file In this case in.maf must be sorted by target. -score - recalculate score -scoreZero - recalculate score if zero ================================================================ ======== axtToPsl ==================================== ================================================================ axtToPsl - Convert axt to psl format usage: axtToPsl in.axt tSizes qSizes out.psl Where tSizes and qSizes are tab-delimited files with columns. options: -xxx=XXX ================================================================ ======== bedClip ==================================== ================================================================ bedClip - Remove lines from bed file that refer to off-chromosome places. usage: bedClip input.bed chrom.sizes output.bed options: -verbose=2 - set to get list of lines clipped and why ================================================================ ======== bedCommonRegions ==================================== ================================================================ bedCommonRegions - Create a bed file (just bed3) that contains the regions common to all inputs. Regions are common only if exactly the same chromosome, starts, and end. Overlap is not enough. Each region must be in each input at most once. Output is stdout. usage: bedCommonRegions file1 file2 file3 ... fileN ================================================================ ======== bedCoverage ==================================== ================================================================ bedCoverage - Analyse coverage by bed files - chromosome by chromosome and genome-wide. usage: bedCoverage database bedFile Note bed file must be sorted by chromosome -restrict=restrict.bed Restrict to parts in restrict.bed ================================================================ ======== bedExtendRanges ==================================== ================================================================ bedExtendRanges - extend length of entries in bed 6+ data to be at least the given length, taking strand directionality into account. usage: bedExtendRanges database length files(s) options: -host mysql host -user mysql user -password mysql password -tab Separate by tabs rather than space -verbose=N - verbose level for extra information to STDERR example: bedExtendRanges hg18 250 stdin bedExtendRanges -user=genome -host=genome-mysql.cse.ucsc.edu hg18 250 stdin will transform: chr1 500 525 . 100 + chr1 1000 1025 . 100 - to: chr1 500 750 . 100 + chr1 775 1025 . 100 - ================================================================ ======== bedGeneParts ==================================== ================================================================ bedGeneParts - Given a bed, spit out promoter, first exon, or all introns. usage: bedGeneParts part in.bed out.bed Where part is either 'exons' or 'firstExon' or 'introns' or 'promoter' or 'firstCodingSplice' or 'secondCodingSplice' options: -proStart=NN - start of promoter relative to txStart, default -100 -proEnd=NN - end of promoter relative to txStart, default 50 ================================================================ ======== bedGraphToBigWig ==================================== ================================================================ bedGraphToBigWig v 4 - Convert a bedGraph file to bigWig format. 
usage: bedGraphToBigWig in.bedGraph chrom.sizes out.bw where in.bedGraph is a four column file in the format: <chrom> <start> <end> <value> and chrom.sizes is two column: <chromosome name> <size> and out.bw is the output indexed big wig file. Use the script: fetchChromSizes to obtain the actual chrom.sizes information from UCSC; please do not make up chrom sizes from your own information. The input bedGraph file must be sorted; use the unix sort command: sort -k1,1 -k2,2n unsorted.bedGraph > sorted.bedGraph options: -blockSize=N - Number of items to bundle in r-tree. Default 256 -itemsPerSlot=N - Number of data points bundled at lowest level. Default 1024 -unc - If set, do not use compression. ================================================================ ======== bedIntersect ==================================== ================================================================ bedIntersect - Intersect two bed files usage: bedIntersect a.bed b.bed output.bed options: -aHitAny output all of a if any of it is hit by b -minCoverage=0.N min coverage of b to output match (or if -aHitAny, of a). Not applied to 0-length items. Default 0.000010 -bScore output score from b.bed (must be at least 5 field bed) -tab chop input at tabs not spaces -allowStartEqualEnd Don't discard 0-length items of a or b (e.g. point insertions) ================================================================ ======== bedItemOverlapCount ==================================== ================================================================ bedItemOverlapCount - count number of times a base is overlapped by the items in a bed file. Output is bedGraph 4 to stdout. usage: sort bedFile.bed | bedItemOverlapCount [options] stdin To create a bigWig file from this data to use in a custom track: sort -k1,1 bedFile.bed | bedItemOverlapCount [options] stdin \ > bedFile.bedGraph bedGraphToBigWig bedFile.bedGraph chrom.sizes bedFile.bw where the chrom.sizes is obtained with the script: fetchChromSizes See also: http://genome-test.cse.ucsc.edu/~kent/src/unzipped/utils/userApps/fetchChromSizes options: -zero add blocks with zero count, normally these are omitted -bed12 expect bed12 and count based on blocks Without this option, only the first three fields are used. -max if counts per base overflow, set to max (4294967295) instead of exiting -outBounds output min/max to stderr -chromSize=sizefile Read chrom sizes from file instead of database sizefile contains two white space separated fields per line: chrom name and size -host=hostname mysql host used to get chrom sizes -user=username mysql user -password=password mysql password Notes: * You may want to separate your + and - strand items before sending into this program as it only looks at the chrom, start and end columns of the bed file. * Program requires a database connection to lookup chrom sizes for a sanity check of the incoming data. Even when the -chromSize argument is used the database must be present, but it will not be used. * The bed file *must* be sorted by chrom * Maximum count per base is 4294967295. Recompile with new unitSize to increase this ================================================================ ======== bedPileUps ==================================== ================================================================ bedPileUps - Find (exact) overlaps if any in bed input usage: bedPileUps in.bed Where in.bed is in one of the ascii bed formats.
The in.bed file must be sorted by chromosome,start, to sort a bed file, use the unix sort command: sort -k1,1 -k2,2n unsorted.bed > sorted.bed Options: -name - include BED name field 4 when evaluating uniqueness -tab - use tabs to parse fields -verbose=2 - show the location and size of each pileUp ================================================================ ======== bedRemoveOverlap ==================================== ================================================================ bedRemoveOverlap - Remove overlapping records from a (sorted) bed file. Gets rid of `the smaller of overlapping records. usage: bedRemoveOverlap in.bed out.bed options: -xxx=XXX ================================================================ ======== bedRestrictToPositions ==================================== ================================================================ bedRestrictToPositions - Filter bed file, restricting to only ones that match chrom/start/ends specified in restrict.bed file. usage: bedRestrictToPositions in.bed restrict.bed out.bed options: -xxx=XXX ================================================================ ======== bedSort ==================================== ================================================================ bedSort - Sort a .bed file by chrom,chromStart usage: bedSort in.bed out.bed in.bed and out.bed may be the same. ================================================================ ======== bedToBigBed ==================================== ================================================================ bedToBigBed v. 2.5 - Convert bed file to bigBed. (BigBed version: 4) usage: bedToBigBed in.bed chrom.sizes out.bb Where in.bed is in one of the ascii bed formats, but not including track lines and chrom.sizes is two column: and out.bb is the output indexed big bed file. Use the script: fetchChromSizes to obtain the actual chrom.sizes information from UCSC, please do not make up a chrom sizes from your own information. The in.bed file must be sorted by chromosome,start, to sort a bed file, use the unix sort command: sort -k1,1 -k2,2n unsorted.bed > sorted.bed options: -type=bedN[+[P]] : N is between 3 and 15, optional (+) if extra "bedPlus" fields, optional P specifies the number of extra fields. Not required, but preferred. Examples: -type=bed6 or -type=bed6+ or -type=bed6+3 (see http://genome.ucsc.edu/FAQ/FAQformat.html#format1) -as=fields.as - If you have non-standard "bedPlus" fields, it's great to put a definition of each field in a row in AutoSql format here. -blockSize=N - Number of items to bundle in r-tree. Default 256 -itemsPerSlot=N - Number of data points bundled at lowest level. Default 512 -unc - If set, do not use compression. -tab - If set, expect fields to be tab separated, normally expects white space separator. -extraIndex=fieldList - If set, make an index on each field in a comma separated list extraIndex=name and extraIndex=name,id are commonly used. ================================================================ ======== bedToExons ==================================== ================================================================ bedToExons - Split a bed up into individual beds. One for each internal exon. usage: bedToExons originalBeds.bed splitBeds.bed options: -cdsOnly - Only output the coding portions of exons. 
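Illustrative example of bedToExons (the file names here are invented for the sketch, not taken from the tool's help):
  bedToExons refGenes.bed refGeneExons.bed
  bedToExons -cdsOnly refGenes.bed refGeneCodingExons.bed
The second form writes only the coding portion of each exon, per the -cdsOnly option above.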
================================================================ ======== bedToGenePred ==================================== ================================================================ bedToGenePred - convert bed format files to genePred format usage: bedToGenePred bedFile genePredFile Convert a bed file to a genePred file. If BED has at least 12 columns, then a genePred with blocks is created. Otherwise single-exon genePreds are created. ================================================================ ======== bedToPsl ==================================== ================================================================ bedToPsl - convert bed format files to psl format usage: bedToPsl chromSizes bedFile pslFile Convert a BED file to a PSL file. The result is an alignment. It is intended to allow processing by tools that operate on PSL. If the BED has at least 12 columns, then a PSL with blocks is created. Otherwise single-exon PSLs are created. Options: -keepQuery - instead of creating a fake query, create PSL with identical query and target specs. Useful if bed features are to be lifted with pslMap and one wants to keep the source location in the lift result. ================================================================ ======== bedWeedOverlapping ==================================== ================================================================ bedWeedOverlapping - Filter out beds that overlap a 'weed.bed' file. usage: bedWeedOverlapping weeds.bed input.bed output.bed options: -maxOverlap=0.N - maximum overlapping ratio, default 0 (any overlap) -invert - keep the overlapping and get rid of everything else ================================================================ ======== bigBedInfo ==================================== ================================================================ bigBedInfo - Show information about a bigBed file. usage: bigBedInfo file.bb options: -udcDir=/dir/to/cache - place to put cache for remote bigBed/bigWigs -chroms - list all chromosomes and their sizes -zooms - list all zoom levels and their sizes -as - get autoSql spec -extraIndex - list all the extra indexes ================================================================ ======== bigBedNamedItems ==================================== ================================================================ bigBedNamedItems - Extract item of given name from bigBed usage: bigBedNamedItems file.bb name output.bed options: -nameFile - if set, treat name parameter as file full of space delimited names -field=fieldName - use index on field name, default is "name" ================================================================ ======== bigBedSummary ==================================== ================================================================ bigBedSummary - Extract summary information from a bigBed file. usage: bigBedSummary file.bb chrom start end dataPoints Get summary data from bigBed for indicated region, broken into dataPoints equal parts. (Use dataPoints=1 for simple summary.) options: -type=X where X is one of: coverage - % of region that is covered (default) mean - average depth of covered regions min - minimum depth of covered regions max - maximum depth of covered regions -fields - print out information on fields in file. If the -fields option is used, the chrom, start, end, dataPoints parameters may be omitted -udcDir=/dir/to/cache - place to put cache for remote bigBed/bigWigs
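A usage sketch for bigBedSummary (the file name and coordinates are made up for illustration):
  bigBedSummary -type=coverage annotations.bb chr1 1000000 2000000 10
This would report, for ten equal windows across chr1:1000000-2000000, the fraction of bases covered by items in annotations.bb.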
================================================================ ======== bigBedToBed ==================================== ================================================================ bigBedToBed - Convert from bigBed to ascii bed format. usage: bigBedToBed input.bb output.bed options: -chrom=chr1 - if set, restrict output to given chromosome -start=N - if set, restrict output to only that over start -end=N - if set, restrict output to only that under end -maxItems=N - if set, restrict output to first N items -udcDir=/dir/to/cache - place to put cache for remote bigBed/bigWigs ================================================================ ======== bigWigAverageOverBed ==================================== ================================================================ bigWigAverageOverBed - Compute average score of big wig over each bed, which may have introns. usage: bigWigAverageOverBed in.bw in.bed out.tab The output columns are: name - name field from bed, which should be unique size - size of bed (sum of exon sizes) covered - # bases within exons covered by bigWig sum - sum of values over all bases covered mean0 - average over bases with non-covered bases counting as zeroes mean - average over just covered bases Options: -bedOut=out.bed - Make output bed that is echo of input bed but with mean column appended -sampleAroundCenter=N - Take sample at region N bases wide centered around bed item, rather than the usual sample in the bed item. ================================================================ ======== bigWigCorrelate ==================================== ================================================================ bigWigCorrelate - Correlate bigWig files, optionally only on target regions. usage: bigWigCorrelate a.bigWig b.bigWig options: -restrict=restrict.bigBed - restrict correlation to parts covered by this file -threshold=N.N - clip values to this threshold ================================================================ ======== bigWigInfo ==================================== ================================================================ bigWigInfo - Print out information about bigWig file. usage: bigWigInfo file.bw options: -udcDir=/dir/to/cache - place to put cache for remote bigBed/bigWigs -chroms - list all chromosomes and their sizes -zooms - list all zoom levels and their sizes -minMax - list the min and max on a single line ================================================================ ======== bigWigMerge ==================================== ================================================================ bigWigMerge - Merge together multiple bigWigs into a single output bedGraph. You'll have to run bedGraphToBigWig to make the output bigWig. The signal values are just added together to merge them. usage: bigWigMerge in1.bw in2.bw .. inN.bw out.bedGraph options: -threshold=0.N - don't output values at or below this threshold. Default is 0.0 -adjust=0.N - add adjustment to each value -clip=NNN.N - values higher than this are clipped to this value ================================================================ ======== bigWigSummary ==================================== ================================================================ bigWigSummary - Extract summary information from a bigWig file.
usage: bigWigSummary file.bigWig chrom start end dataPoints Get summary data from bigWig for indicated region, broken into dataPoints equal parts. (Use dataPoints=1 for simple summary.) NOTE: start and end coordinates are in BED format (0-based) options: -type=X where X is one of: mean - average value in region (default) min - minimum value in region max - maximum value in region std - standard deviation in region coverage - % of region that is covered -udcDir=/dir/to/cache - place to put cache for remote bigBed/bigWigs ================================================================ ======== bigWigToBedGraph ==================================== ================================================================ bigWigToBedGraph - Convert from bigWig to bedGraph format. usage: bigWigToBedGraph in.bigWig out.bedGraph options: -chrom=chr1 - if set restrict output to given chromosome -start=N - if set, restrict output to only that over start -end=N - if set, restict output to only that under end -udcDir=/dir/to/cache - place to put cache for remote bigBed/bigWigs ================================================================ ======== bigWigToWig ==================================== ================================================================ bigWigToWig - Convert bigWig to wig. This will keep more of the same structure of the original wig than bigWigToBedGraph does, but still will break up large stepped sections into smaller ones. usage: bigWigToWig in.bigWig out.wig options: -chrom=chr1 - if set restrict output to given chromosome -start=N - if set, restrict output to only that over start -end=N - if set, restict output to only that under end -udcDir=/dir/to/cache - place to put cache for remote bigBed/bigWigs ================================================================ ======== blastToPsl ==================================== ================================================================ blastToPsl - Convert blast alignments to PSLs. usage: blastToPsl [options] blastOutput psl Options: -scores=file - Write score information to this file. Format is: strands qName qStart qEnd tName tStart tEnd bitscore eVal -verbose=n - n >= 3 prints each line of file after parsing. n >= 4 dumps the result of each query -eVal=n n is e-value threshold to filter results. Format can be either an integer, double or 1e-10. Default is no filter. -pslx - create PSLX output (includes sequences for blocks) Output only results of last round from PSI BLAST ================================================================ ======== blastXmlToPsl ==================================== ================================================================ blastXmlToPsl - convert blast XML output to PSLs usage: blastXmlToPsl [options] blastXml psl options: -scores=file - Write score information to this file. Format is: strands qName qStart qEnd tName tStart tEnd bitscore eVal qDef tDef -verbose=n - n >= 3 prints each line of file after parsing. n >= 4 dumps the result of each query -eVal=n n is e-value threshold to filter results. Format can be either an integer, double or 1e-10. Default is no filter. -pslx - create PSLX output (includes sequences for blocks) -convertToNucCoords - convert protein to nucleic alignments to nucleic to nucleic coordinates -qName=src - define element used to obtain the qName. The following values are support: o query-ID - use contents of the element if it exists, otherwise use o query-def0 - use the first white-space separated word of the element if it exists, otherwise the first word of . 
Default is query-def0. -tName=src - define element used to obtain the tName. The following values are supported: o Hit_id - use contents of the <Hit_id> element. o Hit_def0 - use the first white-space separated word of the <Hit_def> element. o Hit_accession - contents of the <Hit_accession> element. Default is Hit_def0. -forcePsiBlast - treat as output of PSI-BLAST. blast-2.2.16 and maybe others identify psiblast as blastp. Output only results of last round from PSI BLAST ================================================================ ======== blat ==================================== ================================================================ blat - Standalone BLAT v. 35x1 fast sequence search command line tool usage: blat database query [-ooc=11.ooc] output.psl where: database and query are each either a .fa, .nib or .2bit file, or a list of these files, one file name per line. -ooc=11.ooc tells the program to load over-occurring 11-mers from an external file. This will increase the speed by a factor of 40 in many cases, but is not required. output.psl is where to put the output. Subranges of nib and .2bit files may be specified using the syntax: /path/file.nib:seqid:start-end or /path/file.2bit:seqid:start-end or /path/file.nib:start-end With the second form, a sequence id of file:start-end will be used. options: -t=type Database type. Type is one of: dna - DNA sequence prot - protein sequence dnax - DNA sequence translated in six frames to protein The default is dna -q=type Query type. Type is one of: dna - DNA sequence rna - RNA sequence prot - protein sequence dnax - DNA sequence translated in six frames to protein rnax - DNA sequence translated in three frames to protein The default is dna -prot Synonymous with -t=prot -q=prot -ooc=N.ooc Use overused tile file N.ooc. N should correspond to the tileSize -tileSize=N sets the size of match that triggers an alignment. Usually between 8 and 12. Default is 11 for DNA and 5 for protein. -stepSize=N spacing between tiles. Default is tileSize. -oneOff=N If set to 1 this allows one mismatch in a tile and still triggers an alignment. Default is 0. -minMatch=N sets the number of tile matches. Usually set from 2 to 4. Default is 2 for nucleotide, 1 for protein. -minScore=N sets minimum score. This is the matches minus the mismatches minus some sort of gap penalty. Default is 30 -minIdentity=N Sets minimum sequence identity (in percent). Default is 90 for nucleotide searches, 25 for protein or translated protein searches. -maxGap=N sets the size of maximum gap between tiles in a clump. Usually set from 0 to 3. Default is 2. Only relevant for minMatch > 1. -noHead suppress .psl header (so it's just a tab-separated file) -makeOoc=N.ooc Make overused tile file. Target needs to be complete genome. -repMatch=N sets the number of repetitions of a tile allowed before it is marked as overused. Typically this is 256 for tileSize 12, 1024 for tile size 11, 4096 for tile size 10. Default is 1024. Typically only comes into play with makeOoc. Also affected by stepSize. When stepSize is halved repMatch is doubled to compensate. -mask=type Mask out repeats. Alignments won't be started in masked region but may extend through it in nucleotide searches. Masked areas are ignored entirely in protein or translated searches. Types are: lower - mask out lower cased sequence upper - mask out upper cased sequence out - mask according to database.out RepeatMasker .out file file.out - mask database according to RepeatMasker file.out -qMask=type Mask out repeats in query sequence.
Similar to -mask above but for query rather than target sequence. -repeats=type Type is same as mask types above. Repeat bases will not be masked in any way, but matches in repeat areas will be reported separately from matches in other areas in the psl output. -minRepDivergence=NN - minimum percent divergence of repeats to allow them to be unmasked. Default is 15. Only relevant for masking using RepeatMasker .out files. -dots=N Output dot every N sequences to show program's progress -trimT Trim leading poly-T -noTrimA Don't trim trailing poly-A -trimHardA Remove poly-A tail from qSize as well as alignments in psl output -fastMap Run for fast DNA/DNA remapping - not allowing introns, requiring high %ID. Query sizes must not exceed 5000. -out=type Controls output file format. Type is one of: psl - Default. Tab separated format, no sequence pslx - Tab separated format with sequence axt - blastz-associated axt format maf - multiz-associated maf format sim4 - similar to sim4 format wublast - similar to wublast format blast - similar to NCBI blast format blast8- NCBI blast tabular format blast9 - NCBI blast tabular format with comments -fine For high quality mRNAs look harder for small initial and terminal exons. Not recommended for ESTs -maxIntron=N Sets maximum intron size. Default is 750000 -extendThroughN - Allows extension of alignment through large blocks of N's ================================================================ ======== calc ==================================== ================================================================ calc - Little command line calculator usage: calc this + that * theOther / (a + b) ================================================================ ======== catDir ==================================== ================================================================ catDir - concatenate files in directory to stdout. For those times when too many files for cat to handle. usage: catDir dir(s) options: -r Recurse into subdirectories -suffix=.suf This will restrict things to files ending in .suf '-wild=*.???' This will match wildcards. -nonz Prints file name of non-zero length files ================================================================ ======== catUncomment ==================================== ================================================================ catUncomment - Concatenate input removing lines that start with '#' Output goes to stdout usage: catUncomment file(s) ================================================================ ======== chainAntiRepeat ==================================== ================================================================ chainAntiRepeat - Get rid of chains that are primarily the results of repeats and degenerate DNA usage: chainAntiRepeat tNibDir qNibDir inChain outChain options: -minScore=N - minimum score (after repeat stuff) to pass -noCheckScore=N - score that will pass without checks (speed tweak) ================================================================ ======== chainFilter ==================================== ================================================================ chainFilter - Filter chain files. Output goes to standard out. 
usage: chainFilter file(s) options: -q=chr1,chr2 - restrict query side sequence to those named -notQ=chr1,chr2 - restrict query side sequence to those not named -t=chr1,chr2 - restrict target side sequence to those named -notT=chr1,chr2 - restrict target side sequence to those not named -id=N - only get one with ID number matching N -minScore=N - restrict to those scoring at least N -maxScore=N - restrict to those scoring less than N -qStartMin=N - restrict to those with qStart at least N -qStartMax=N - restrict to those with qStart less than N -qEndMin=N - restrict to those with qEnd at least N -qEndMax=N - restrict to those with qEnd less than N -tStartMin=N - restrict to those with tStart at least N -tStartMax=N - restrict to those with tStart less than N -tEndMin=N - restrict to those with tEnd at least N -tEndMax=N - restrict to those with tEnd less than N -qOverlapStart=N - restrict to those where the query overlaps a region starting here -qOverlapEnd=N - restrict to those where the query overlaps a region ending here -tOverlapStart=N - restrict to those where the target overlaps a region starting here -tOverlapEnd=N - restrict to those where the target overlaps a region ending here -strand=? -restrict strand (to + or -) -long -output in long format -zeroGap -get rid of gaps of length zero -minGapless=N - pass those with minimum gapless block of at least N -qMinGap=N - pass those with minimum gap size of at least N -tMinGap=N - pass those with minimum gap size of at least N -qMaxGap=N - pass those with maximum gap size no larger than N -tMaxGap=N - pass those with maximum gap size no larger than N -qMinSize=N - minimum size of spanned query region -qMaxSize=N - maximum size of spanned query region -tMinSize=N - minimum size of spanned target region -tMaxSize=N - maximum size of spanned target region -noRandom - suppress chains involving '_random' chromosomes -noHap - suppress chains involving '_hap' chromosomes ================================================================ ======== chainMergeSort ==================================== ================================================================ chainMergeSort - Combine sorted files into larger sorted file usage: chainMergeSort file(s) Output goes to standard output options: -saveId - keep the existing chain ids. -inputList=somefile - somefile contains list of input chain files. -tempDir=somedir/ - somedir has space for temporary sorting data, default ./ ================================================================ ======== chainNet ==================================== ================================================================ chainNet - Make alignment nets out of chains usage: chainNet in.chain target.sizes query.sizes target.net query.net where: in.chain is the chain file sorted by score target.sizes contains the size of the target sequences query.sizes contains the size of the query sequences target.net is the output over the target genome query.net is the output over the query genome options: -minSpace=N - minimum gap size to fill, default 25 -minFill=N - default half of minSpace -minScore=N - minimum chain score to consider, default 2000.0 -verbose=N - Alter verbosity (default 1) -inclHap - include query sequences name in the form *_hap*. 
Normally these are excluded from nets as being haplotype pseudochromosomes ================================================================ ======== chainPreNet ==================================== ================================================================ chainPreNet - Remove chains that don't have a chance of being netted usage: chainPreNet in.chain target.sizes query.sizes out.chain options: -dots=N - output a dot every so often -pad=N - extra to pad around blocks to decrease trash (default 1) -inclHap - include query sequences name in the form *_hap*. Normally these are excluded from nets as being haplotype pseudochromosomes ================================================================ ======== chainSort ==================================== ================================================================ chainSort - Sort chains. By default sorts by score. Note this loads all chains into memory, so it is not suitable for large sets. Instead, run chainSort on multiple small files, followed by chainMergeSort. usage: chainSort inFile outFile Note that inFile and outFile can be the same options: -target sort on target start rather than score -query sort on query start rather than score -index=out.tab build simple two column index file where is score, target, or query depending on the sort. ================================================================ ======== chainSplit ==================================== ================================================================ chainSplit - Split chains up by target or query sequence usage: chainSplit outDir inChain(s) options: -q - Split on query (default is on target) -lump=N Lump together so have only N split files. ================================================================ ======== chainStitchId ==================================== ================================================================ chainStitchId - Join chain fragments with the same chain ID into a single chain per ID. Chain fragments must be from same original chain but must not overlap. Chain fragment scores are summed. usage: chainStitchId in.chain out.chain ================================================================ ======== chainSwap ==================================== ================================================================ chainSwap - Swap target and query in chain usage: chainSwap in.chain out.chain ================================================================ ======== chainToAxt ==================================== ================================================================ chainToAxt - Convert from chain to axt file usage: chainToAxt in.chain tNibDirOr2bit qNibDirOr2bit out.axt options: -maxGap=maximum gap sized allowed without breaking, default 100 -maxChain=maximum chain size allowed without breaking, default 1073741823 -minScore=minimum score of chain -minId=minimum percentage ID within blocks -bed Output bed instead of axt ================================================================ ======== chainToPsl ==================================== ================================================================ chainToPsl - Convert chain file to psl format usage: chainToPsl in.chain tSizes qSizes target.lst query.lst out.psl Where tSizes and qSizes are tab-delimited files with columns. 
The target and query lists can either be fasta files, nib files, 2bit files or a list of fasta, 2bit and/or nib files one per line options: -tMasked - If specified, the target is soft-masked and the repMatch counts are computed ================================================================ ======== checkAgpAndFa ==================================== ================================================================ checkAgpAndFa - takes a .agp file and .fa file and ensures that they are in synch usage: checkAgpAndFa in.agp in.fa options: -exclude=seq - Ignore seq (e.g. chrM for which we usually get sequence from GenBank but don't have AGP) in.fa can be a .2bit file. If it is .fa then sequences must appear in the same order in .agp and .fa. ================================================================ ======== checkCoverageGaps ==================================== ================================================================ checkCoverageGaps - Check for biggest gap in coverage for a list of tracks. For most tracks coverage of 10,000,000 or more will indicate that there was a mistake in generating the track. usage: checkCoverageGaps database track1 ... trackN Note: for bigWig and bigBeds, the biggest gap is rounded to the nearest 10,000 or so options: -allParts If set then include _hap and _random and other wierd chroms -female If set then don't check chrY -noComma - Don't put commas in biggest gap output ================================================================ ======== checkHgFindSpec ==================================== ================================================================ checkHgFindSpec - test and describe search specs in hgFindSpec tables. usage: checkHgFindSpec database [options | termToSearch] If given a termToSearch, displays the list of tables that will be searched and how long it took to figure that out; then performs the search and the time it took. options: -showSearches Show the order in which tables will be searched in general. [This will be done anyway if no termToSearch or options are specified.] -checkTermRegex For each search spec that includes a regular expression for terms, make sure that all values of the table field to be searched match the regex. (If not, some of them could be excluded from searches.) -checkIndexes Make sure that an index is defined on each field to be searched. ================================================================ ======== checkTableCoords ==================================== ================================================================ checkTableCoords - check invariants on genomic coords in table(s). usage: checkTableCoords database [tableName] Searches for illegal genomic coordinates in all tables in database unless narrowed down using options. Uses ~/.hg.conf to determine genome database connection info. For psl/alignment tables, checks target coords only. options: -table=tableName Check this table only. (Default: all tables) -daysOld=N Check tables that have been modified at most N days ago. -hoursOld=N Check tables that have been modified at most N hours ago. (days and hours are additive) -exclude=patList Exclude tables matching any pattern in comma-separated patList. patList can contain wildcards (*?) but should be escaped or single-quoted if it does. patList can contain "genbank" which will be expanded to all tables generated by the automated genbank build process. -ignoreBlocks To save time (but lose coverage), skip block coord checks. 
-verboseBlocks Print out more details about illegal block coords, since they can't be found by simple SQL queries. ================================================================ ======== chopFaLines ==================================== ================================================================ chopFaLines - Read in an FA file with long lines and rewrite it with shorter lines usage: chopFaLines in.fa out.fa ================================================================ ======== chromGraphFromBin ==================================== ================================================================ chromGraphFromBin - Convert chromGraph binary to ascii format. usage: chromGraphFromBin in.chromGraph out.tab options: -chrom=chrX - restrict output to single chromosome ================================================================ ======== chromGraphToBin ==================================== ================================================================ chromGraphToBin - Make binary version of chromGraph. usage: chromGraphToBin in.tab out.chromGraph options: -xxx=XXX ================================================================ ======== colTransform ==================================== ================================================================ colTransform - Add and/or multiply column by constant. usage: colTransform column input.tab addFactor mulFactor output.tab where: column is the column to transform, starting with 1 input.tab is the tab delimited input file addFactor is what to add. Use 0 here to not change anything mulFactor is what to multiply by. Use 1 here to not change anything output.tab is the tab delimited output file ================================================================ ======== countChars ==================================== ================================================================ countChars - Count the number of occurrences of a particular char usage: countChars char file(s) Char can either be a two digit hexadecimal value or a single letter literal character ================================================================ ======== crTreeIndexBed ==================================== ================================================================ crTreeIndexBed - Create an index for a bed file. usage: crTreeIndexBed in.bed out.cr options: -blockSize=N - number of children per node in index tree. Default 1024 -itemsPerSlot=N - number of items per index slot. Default is half block size -noCheckSort - Don't check sorting order of in.bed ================================================================ ======== crTreeSearchBed ==================================== ================================================================ crTreeSearchBed - Search a crTree indexed bed file and print all items that overlap query. usage: crTreeSearchBed file.bed index.cr chrom start end
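A possible round trip with the two crTree tools above (file names and coordinates are placeholders):
  crTreeIndexBed sorted.bed sorted.cr
  crTreeSearchBed sorted.bed sorted.cr chr2 500000 600000
The first command builds the index; the second prints every item in sorted.bed that overlaps chr2:500000-600000.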
================================================================ ======== dbSnoop ==================================== ================================================================ dbSnoop - Produce an overview of a database. usage: dbSnoop database output options: -unsplit - if set will merge together tables split by chromosome -noNumberCommas - if set will leave out commas in big numbers ================================================================ ======== dbTrash ==================================== ================================================================ dbTrash - drop tables from a database older than specified N hours usage: dbTrash -age=N [-drop] [-historyToo] [-db=<name>] [-verbose=N] options: -age=N - number of hours old to qualify for drop. N can be a float. -drop - actually drop the tables, default is merely to display tables. -db=<name> - Specify a database to work with, default is customTrash. -historyToo - also consider the table called 'history' for deletion. - default is to leave 'history' alone no matter how old. - this applies to the table 'metaInfo' also. -extFile - check extFile for lines that reference files - no longer in trash -extDel - delete lines in extFile that fail file check - otherwise just verbose(2) lines that would be deleted -topDir - directory name to prepend to file names in extFile - default is /usr/local/apache/trash - file names in extFile are typically: "../trash/ct/..." -tableStatus - use 'show table status' to get size data, very inefficient -delLostTable - delete tables that exist but are missing from metaInfo - this operation can be even slower than -tableStatus - if there are many tables to check. -verbose=N - 2 == show arguments, dates, and dropped tables, - 3 == show date information for all tables. ================================================================ ======== estOrient ==================================== ================================================================ usage: estOrient [options] db estTable outPsl Read ESTs from a database and determine orientation based on estOrientInfo table or direction in gbCdnaInfo table. Update PSLs so that the strand reflects the direction of transcription. By default, PSLs where the direction can't be determined are dropped. Options: -chrom=chr - process this chromosome, may be repeated -keepDisoriented - don't drop ESTs where orientation can't be determined. -disoriented=psl - output ESTs whose orientation can't be determined to this file. -inclVer - add NCBI version number to accession if not already present. -fileInput - estTable is a psl file -estOrientInfo=file - instead of getting the orientation information from the estOrientInfo table, load it from this file. This data is the output of the polyInfo command. If this option is specified, the direction will not be looked up in the gbCdnaInfo table and db can be `no'. -info=infoFile - write information about each EST to this tab separated file qName tName tStart tEnd origStrand newStrand orient where orient is < 0 if PSL was reversed, > 0 if it was left unchanged and 0 if the orientation couldn't be determined (and was left unchanged). ================================================================ ======== faCmp ==================================== ================================================================ faCmp - Compare two .fa files usage: faCmp [options] a.fa b.fa options: -softMask - use the soft masking information during the compare Differences will be noted if the masking is different. -sortName - sort input files by name before comparing -peptide - read as peptide sequences default: no masking information is used during compare. It is as if both sequences were not masked. Exit codes: - 0 if files are the same - 1 if files differ - 255 on an error
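Example faCmp invocation (file names are placeholders) that uses the exit codes listed above in a shell check:
  faCmp -softMask build1.fa build2.fa
  echo $?    # 0 = same, 1 = differ, 255 = error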
================================================================ ======== faCount ==================================== ================================================================ faCount - count base statistics and CpGs in FA files. usage: faCount file(s).fa -summary show only summary statistics -dinuc include statistics on dinucleotide frequencies -strands count bases on both strands ================================================================ ======== faFilter ==================================== ================================================================ faFilter - Filter fa records, selecting ones that match the specified conditions usage: faFilter [options] in.fa out.fa Options: -name=wildCard - Only pass records where name matches wildcard * matches any string or no character. ? matches any single character. Any other character must match exactly (these will need to be quoted for the shell) -namePatList=filename - A list of regular expressions, one per line, that will be applied to the fasta name the same as -name -v - invert match, select non-matching records. -minSize=N - Only pass sequences at least this big. -maxSize=N - Only pass sequences this size or smaller. -maxN=N Only pass sequences with fewer than this number of N's -uniq - Removes duplicate sequence ids, keeping the first. -i - make -uniq ignore case so sequence IDs ABC and abc count as dupes. All specified conditions must pass to pass a sequence. If no conditions are specified, all records will be passed. ================================================================ ======== faFilterN ==================================== ================================================================ faFilterN - Get rid of sequences with too many N's usage: faFilterN in.fa out.fa maxPercentN options: -out=in.fa.out -uniq=self.psl ================================================================ ======== faFrag ==================================== ================================================================ faFrag - Extract a piece of DNA from a .fa file.
usage: faFrag in.fa start end out.fa options: -mixed - preserve mixed-case in FASTA file ================================================================ ======== faNoise ==================================== ================================================================ faNoise - Add noise to .fa file usage: faNoise inName outName transitionPpt transversionPpt insertPpt deletePpt chimeraPpt options: -upper - output in upper case ================================================================ ======== faOneRecord ==================================== ================================================================ faOneRecord - Extract a single record from a .FA file usage: faOneRecord in.fa recordName ================================================================ ======== faPolyASizes ==================================== ================================================================ faPolyASizes - get poly A sizes usage: faPolyASizes in.fa out.tab output file has four columns: id seqSize tailPolyASize headPolyTSize options: ================================================================ ======== faRandomize ==================================== ================================================================ faRandomize - Program to create random fasta records usage: faRandomize [-seed=N] in.fa randomized.fa Use optional -seed argument to specify seed (integer) for random number generator (rand). Generated sequence has the same base frequency as seen in original fasta records. ================================================================ ======== faRc ==================================== ================================================================ faRc - Reverse complement a FA file usage: faRc in.fa out.fa In.fa and out.fa may be the same file. options: -keepName - keep name identical (don't prepend RC) -keepCase - works well for ACGTUN in either case. bizarre for other letters. without it bases are turned to lower, all else to n's -justReverse - prepends R unless asked to keep name -justComplement - prepends C unless asked to keep name (cannot appear together with -justReverse) ================================================================ ======== faSize ==================================== ================================================================ faSize - print total base count in fa files. usage: faSize file(s).fa Command flags -detailed outputs name and size of each record has the side effect of printing nothing else -tab output statistics in a tab separated format ================================================================ ======== faSomeRecords ==================================== ================================================================ faSomeRecords - Extract multiple fa records usage: faSomeRecords in.fa listFile out.fa options: -exclude - output sequences not in the list file. ================================================================ ======== faSplit ==================================== ================================================================ faSplit - Split an fa file into several files. usage: faSplit how input.fa count outRoot where how is either 'about' 'byname' 'base' 'gap' 'sequence' or 'size'. Files split by sequence will be broken at the nearest fa record boundary. Files split by base will be broken at any base. Files broken by size will be broken every count bases. Examples: faSplit sequence estAll.fa 100 est This will break up estAll.fa into 100 files (numbered est001.fa est002.fa, ... 
est100.fa Files will only be broken at fa record boundaries faSplit base chr1.fa 10 1_ This will break up chr1.fa into 10 files faSplit size input.fa 2000 outRoot This breaks up input.fa into 2000 base chunks faSplit about est.fa 20000 outRoot This will break up est.fa into files of about 20000 bytes each by record. faSplit byname scaffolds.fa outRoot/ This breaks up scaffolds.fa using sequence names as file names. Use the terminating / on the outRoot to get it to work correctly. faSplit gap chrN.fa 20000 outRoot This breaks up chrN.fa into files of at most 20000 bases each, at gap boundaries if possible. If the sequence ends in N's, the last piece, if larger than 20000, will be all one piece. Options: -verbose=2 - Write names of each file created (=3 more details) -maxN=N - Suppress pieces with more than maxN n's. Only used with size. default is size-1 (only suppresses pieces that are all N). -oneFile - Put output in one file. Only used with size -extra=N - Add N extra bytes at the end to form overlapping pieces. Only used with size. -out=outFile Get masking from outfile. Only used with size. -lift=file.lft Put info on how to reconstruct sequence from pieces in file.lft. Only used with size and gap. -minGapSize=X Consider a block of Ns to be a gap if block size >= X. Default value 1000. Only used with gap. -noGapDrops - include all N's when splitting by gap. -outDirDepth=N Create N levels of output directory under current dir. This helps prevent NFS problems with a large number of file in a directory. Using -outDirDepth=3 would produce ./1/2/3/outRoot123.fa. -prefixLength=N - used with byname option. create a separate output file for each group of sequences names with same prefix of length N. ================================================================ ======== faToFastq ==================================== ================================================================ faToFastq - Convert fa to fastq format, just faking quality values. usage: faToFastq in.fa out.fastq options: -qual=X quality letter to use. Default is '<' which is good I think.... ================================================================ ======== faToTab ==================================== ================================================================ faToTab - convert fa file to tab separated file usage: faToTab infileName outFileName options: -type=seqType sequence type, dna or protein, default is dna -keepAccSuffix - don't strip dot version off of sequence id, keep as is ================================================================ ======== faToTwoBit ==================================== ================================================================ faToTwoBit - Convert DNA from fasta to 2bit format usage: faToTwoBit in.fa [in2.fa in3.fa ...] out.2bit options: -noMask - Ignore lower-case masking in fa file. -stripVersion - Strip off version number after . for genbank accessions. -ignoreDups - only convert first sequence if there are duplicate sequence - names. Use 'twoBitDup' to find duplicate sequences. ================================================================ ======== faTrans ==================================== ================================================================ faTrans - Translate DNA .fa file to peptide usage: faTrans in.fa out.fa options: -stop stop at first stop codon (otherwise puts in Z for stop codons) -offset=N start at a particular offset. 
================================================================ ======== fastqToFa ==================================== ================================================================ fastqToFa - Convert from fastq to fasta format. usage: fastqToFa [options] in.fastq out.fa options: -nameVerify='string' - for multi-line fastq files, 'string' must match somewhere in the sequence names in order to correctly identify the next sequence block (e.g.: -nameVerify='Supercontig_') -qual=file.qual.fa - output quality scores to specified file (default: quality scores are ignored) -qualSizes=qual.sizes - write sizes file for the quality scores -noErrors - warn only on problems, do not error out (specify -verbose=3 to see warnings) -solexa - use Solexa/Illumina quality score algorithm (instead of Phred quality) -verbose=2 - set warning level to get some stats output during processing ================================================================ ======== featureBits ==================================== ================================================================ featureBits - Correlate tables via bitmap projections. usage: featureBits database table(s) This will return the number of bits in all the tables anded together Pipe warning: output goes to stderr. Options: -bed=output.bed Put intersection into bed format. Can use stdout. -fa=output.fa Put sequence in intersection into .fa file -faMerge For fa output merge overlapping features. -minSize=N Minimum size to output (default 1) -chrom=chrN Restrict to one chromosome -chromSize=sizefile Read chrom sizes from file instead of database. (chromInfo three column format) -or Or tables together instead of anding them -not Output negation of resulting bit set. -countGaps Count gaps in denominator -noRandom Don't include _random (or Un) chromosomes -noHap Don't include _hap chromosomes -dots=N Output dot every N chroms (scaffolds) processed -minFeatureSize=n Don't include bits of the track that are smaller than minFeatureSize, useful for differentiating between alignment gaps and introns. -bin=output.bin Put bin counts in output file -binSize=N Bin size for generating counts in bin file (default 500000) -binOverlap=N Bin overlap for generating counts in bin file (default 250000) -bedRegionIn=input.bed Read in a bed file for bin counts in specific regions and write to bedRegionsOut -bedRegionOut=output.bed Write a bed file of bin counts in specific regions from bedRegionIn -enrichment Calculates coverage and enrichment assuming the first table is a reference gene track and the second track something else. Enrichment is the amount of table1 that covers table2 vs. the amount of table1 that covers the genome. It's how much denser table1 is in table2 than it is genome-wide. '-where=some sql pattern' Restrict to features matching some sql pattern You can include a '!' before a table name to negate it. Some table names can be followed by modifiers such as: :exon:N Break into exons and add N to each end of each exon :cds Break into coding exons :intron:N Break into introns, remove N from each end :utr5, :utr3 Break into 5' or 3' UTRs :upstream:N Consider the region of N bases before region :end:N Consider the region of N bases after region :score:N Consider records with score >= N :upstreamAll:N Like upstream, but doesn't filter out genes that have txStart==cdsStart or txEnd==cdsEnd :endAll:N Like end, but doesn't filter out genes that have txStart==cdsStart or txEnd==cdsEnd The tables can be bed, psl, or chain files, or a directory full of such files as well as actual database tables. To count the bits used in dir/chrN_something*.bed you'd do: featureBits database dir/_something.bed
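As a hedged featureBits illustration (the database and table names are examples, not prescribed by the tool): counting the bases where the coding exons of a gene table overlap a repeat table, and saving the intersection as BED, might look like:
  featureBits -bed=overlap.bed hg19 knownGene:cds rmsk
Because of the pipe warning above, the summary line goes to stderr, so redirect stderr if you want to capture it in a file.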
================================================================ ======== fetchChromSizes ==================================== ================================================================ usage: fetchChromSizes <db> > <db>.chrom.sizes used to fetch chrom.sizes information from UCSC for the given <db> - name of UCSC database, e.g.: hg18, mm9, etc ... This script expects to find one of the following commands: wget, mysql, or ftp in order to fetch information from UCSC. Route the output to the file <db>.chrom.sizes as indicated above. Example: fetchChromSizes hg18 > hg18.chrom.sizes ================================================================ ======== findMotif ==================================== ================================================================ findMotif - find specified motif in sequence usage: findMotif [options] -motif= sequence where: sequence is a .fa, .nib or .2bit file or a file which is a list of sequence files. options: -motif= - search for this specified motif (case ignored, [acgt] only) -chr= - process only this one chrN from the sequence -strand=<+|-> - limit to only one strand. Default is both. -bedOutput - output bed format (this is the default) -wigOutput - output wiggle data format instead of bed file -verbose=N - set information level [1-4] NOTE: motif must be longer than 4 characters, less than 17 -verbose=4 - will display gaps as bed file data lines to stderr ================================================================ ======== gapToLift ==================================== ================================================================ gapToLift - create lift file from gap table(s) usage: gapToLift [options] db liftFile.lft uses gap table(s) from specified db. Writes to liftFile.lft generates lift file segments separated by non-bridged gaps. options: -chr=chrN - work only on given chrom -minGap=M - examine only gaps >= M -insane - do *not* perform coordinate sanity checks on gaps -bedFile=fileName.bed - output segments to fileName.bed -verbose=N - N > 1 shows more information about the procedure ================================================================ ======== genePredCheck ==================================== ================================================================ genePredCheck - validate genePred files or tables usage: genePredCheck [options] fileTbl .. If fileTbl is an existing file, then it is checked. Otherwise, if -db is provided, then a table by this name is checked. options: -db=db - If specified, then this database is used to get chromosome sizes, and perhaps the table to check.
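A short genePredCheck illustration (the database and file names are hypothetical):
  genePredCheck -db=hg19 genes.gp
validates the file genes.gp against the chromosome sizes of hg19, while genePredCheck -db=hg19 refGene would instead check a table named refGene in that database.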
================================================================ ======== genePredHisto ==================================== ================================================================ genePredHisto - get data for generating histograms from a genePred file. usage: genePredHisto [options] what genePredFile histoOut Options: -ids - output a second column with the gene name, useful for finding outliers. The what argument indicates the type of output. The output file is a list of numbers suitable for input to textHistogram or similar. The following values are currently implemented: exonLen - length of exons 5utrExonLen - length of 5'UTR regions of exons cdsExonLen - length of CDS regions of exons 3utrExonLen - length of 3'UTR regions of exons exonCnt - count of exons 5utrExonCnt - count of exons containing 5'UTR cdsExonCnt - count of exons containing CDS 3utrExonCnt - count of exons containing 3'UTR ================================================================ ======== genePredSingleCover ==================================== ================================================================ genePredSingleCover - create single-coverage genePred files genePredSingleCover [options] inGenePred outGenePred Create a genePred file that has single CDS coverage of the genome. UTR is allowed to overlap. The default is to keep the gene with the largest number of CDS bases. Options: -scores=file - read scores used in selecting genes from this file. It consists of tab-separated lines of name chrom txStart score where score is a real or integer number. Higher scoring genes will be chosen over lower scoring ones. Equally scoring genes are chosen by number of CDS bases. If this option is supplied, all genes must be in the file. ================================================================ ======== genePredToBed ==================================== ================================================================ genePredToBed - Convert from genePred to bed format. Does not yet handle genePredExt usage: genePredToBed in.genePred out.bed options: -xxx=XXX ================================================================ ======== genePredToFakePsl ==================================== ================================================================ genePredToFakePsl - Create a psl of fake-mRNA aligned to gene-preds from a file or table. usage: genePredToFakePsl db fileTbl pslOut cdsOut If fileTbl is an existing file, then it is used. Otherwise, the table by this name is used. pslOut specifies the fake-mRNA output psl filename. cdsOut specifies the output cds tab-separated file which contains genbank-style CDS records showing cdsStart..cdsEnd e.g. NM_123456 34..305 ================================================================ ======== genePredToGtf ==================================== ================================================================ genePredToGtf - Convert genePred table or file to gtf. usage: genePredToGtf database genePredTable output.gtf If database is 'file' then track is interpreted as a file rather than a table in database. options: -utr - Add 5UTR and 3UTR features -honorCdsStat - use cdsStartStat/cdsEndStat when defining start/end codon records -source=src set source name to use -addComments - Add comments before each set of transcript records. Allows for easier visual inspection. Note: use a refFlat table or extended genePred table or file to include the gene_name attribute in the output. This will not work with a refFlat table dump file. If you are using a genePred file that starts with a numeric bin column, drop it using the UNIX cut command: cut -f 2- in.gp | genePredToGtf file stdin out.gtf
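A hedged genePredToGtf illustration (database and table names are hypothetical): converting a table from a local database while adding UTR features might look like
  genePredToGtf -utr hg19 refGene refGene.gtf
while genePredToGtf file my.gp my.gtf converts a plain genePred file instead.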
================================================================ ======== genePredToMafFrames ==================================== ================================================================ genePredToMafFrames - create mafFrames tables from genePreds genePredToMafFrames [options] targetDb maf mafFrames geneDb1 genePred1 [geneDb2 genePred2...] Create frame annotations for one or more components of a MAF. It is significantly faster to process multiple gene sets in the same run, as 95% of the CPU time is spent reading the MAF. Arguments: o targetDb - db of target genome o maf - input MAF file o mafFrames - output file o geneDb1 - db in MAF that corresponds to genePred's organism. o genePred1 - genePred file. Overlapping annotations should be removed. This file may optionally include frame annotations Options: -bed=file - output a bed for each mafFrame region, useful for debugging. -verbose=level - enable verbose tracing, the following levels are implemented: 3 - print information about data used to compute each record. 4 - dump information about the gene mappings that were constructed 5 - dump information about the gene mappings after split processing 6 - dump information about the gene mappings after frame linking ================================================================ ======== gfClient ==================================== ================================================================ gfClient v. 35x1 - A client for the genomic finding program that produces a .psl file usage: gfClient host port seqDir in.fa out.psl where host is the name of the machine running the gfServer port is the same as you started the gfServer with seqDir is the path of the .nib or .2bit files relative to the current dir (note these are needed by the client as well as the server) in.fa is a fasta format file. May contain multiple records out.psl where to put the output options: -t=type Database type. Type is one of: dna - DNA sequence prot - protein sequence dnax - DNA sequence translated in six frames to protein The default is dna -q=type Query type. Type is one of: dna - DNA sequence rna - RNA sequence prot - protein sequence dnax - DNA sequence translated in six frames to protein rnax - DNA sequence translated in three frames to protein -prot Synonymous with -t=prot -q=prot -dots=N Output a dot every N query sequences -nohead Suppresses psl five line header -minScore=N Sets minimum score. This is twice the matches minus the mismatches minus some sort of gap penalty. Default is 30 -minIdentity=N Sets minimum sequence identity (in percent). Default is 90 for nucleotide searches, 25 for protein or translated protein searches. -out=type Controls output file format. Type is one of: psl - Default. Tab separated format without actual sequence pslx - Tab separated format with sequence axt - blastz-associated axt format maf - multiz-associated maf format sim4 - similar to sim4 format wublast - similar to wublast format blast - similar to NCBI blast format blast8 - NCBI blast tabular format blast9 - NCBI blast tabular format with comments -maxIntron=N Sets maximum intron size. Default is 750000
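An illustrative gfClient call (host name, port, and paths are hypothetical), assuming a gfServer is already running on myhost port 17777 and the corresponding .2bit file lives in /data/genome:
  gfClient -minIdentity=95 -nohead myhost 17777 /data/genome reads.fa reads.psl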
================================================================ ======== gfServer ==================================== ================================================================ gfServer v 35x1 - Make a server to quickly find where DNA occurs in genome. To set up a server: gfServer start host port file(s) Where the files are .nib or .2bit format files specified relative to the current directory. To remove a server: gfServer stop host port To query a server with DNA sequence: gfServer query host port probe.fa To query a server with protein sequence: gfServer protQuery host port probe.fa To query a server with translated dna sequence: gfServer transQuery host port probe.fa To query a server with PCR primers: gfServer pcr host port fPrimer rPrimer maxDistance To process one probe fa file against a .nib format genome (not starting server): gfServer direct probe.fa file(s).nib To test pcr without starting server: gfServer pcrDirect fPrimer rPrimer file(s).nib To figure out usage level: gfServer status host port To get input file list: gfServer files host port Options: -tileSize=N size of n-mers to index. Default is 11 for nucleotides, 4 for proteins (or translated nucleotides). -stepSize=N spacing between tiles. Default is tileSize. -minMatch=N Number of n-mer matches that trigger detailed alignment. Default is 2 for nucleotides, 3 for proteins. -maxGap=N Number of insertions or deletions allowed between n-mers. Default is 2 for nucleotides, 0 for proteins. -trans Translate database to protein in 6 frames. Note: it is best to run this on RepeatMasked data in this case. -log=logFile keep a log file that records server requests. -seqLog Include sequences in log file (not logged with -syslog) -ipLog Include user's IP in log file (not logged with -syslog) -syslog Log to syslog -logFacility=facility log to the specified syslog facility - default local0. -mask Use masking from nib file. -repMatch=N Number of occurrences of a tile (nmer) that trigger repeat masking of the tile. Default is 1024. -maxDnaHits=N Maximum number of hits for a dna query that are sent from the server. Default is 100. -maxTransHits=N Maximum number of hits for a translated query that are sent from the server. Default is 200. -maxNtSize=N Maximum size of untranslated DNA query sequence. Default is 40000 -maxAaSize=N Maximum size of protein or translated DNA queries. Default is 8000 -canStop If set then a quit message will actually take down the server
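A sketch of a typical gfServer session (host, port, and file names are hypothetical): start an untranslated DNA server with
  gfServer -canStop start myhost 17777 hg19.2bit
check that it is up and see its usage counters with gfServer status myhost 17777, and shut it down with gfServer stop myhost 17777 (per the -canStop option above, the stop message only takes the server down if -canStop was given at start time).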
================================================================ ======== gff3ToGenePred ==================================== ================================================================ gff3ToGenePred - convert a GFF3 file to a genePred file usage: gff3ToGenePred inGff3 outGp options: -maxParseErrors=50 - Maximum number of parsing errors before aborting. A negative value will allow an unlimited number of errors. Default is 50. -maxConvertErrors=50 - Maximum number of conversion errors before aborting. A negative value will allow an unlimited number of errors. Default is 50. -honorStartStopCodons - only set CDS start/stop status to complete if there are corresponding start/stop codon records This converts: - top-level gene records with mRNA records - top-level mRNA records - mRNA records that contain: - exon and CDS - CDS, five_prime_UTR, three_prime_UTR - only exon for non-coding - top-level gene records with transcript records - top-level transcript records - transcript records that contain: - exon The first step is to parse the GFF3 file; up to 50 errors are reported before aborting. If the GFF3 file is successfully parsed, it is converted to gene annotations. Up to 50 conversion errors are reported before aborting. Input file must conform to the GFF3 specification: http://www.sequenceontology.org/gff3.shtml ================================================================ ======== gff3ToPsl ==================================== ================================================================ gff3ToPsl - convert a GFF3 CIGAR file to a PSL file usage: gff3ToPsl mapFile inGff3 out.psl arguments: mapFile mapping of locus names to chroms and sizes. File formatted: locusName chromName chromSize inGff3 GFF3 formatted file with Gap attribute in match records out.psl PSL formatted output options: This converts: The first step is to parse the GFF3 file; up to 50 errors are reported before aborting. If the GFF3 file is successfully parsed, it is converted to PSL. Input file must conform to the GFF3 specification: http://www.sequenceontology.org/gff3.shtml ================================================================ ======== gmtime ==================================== ================================================================ gmtime - convert unix timestamp to date string usage: gmtime