maf_tile - synthesize an alignment for a given region
maf_tile [options] -i [SEQ:]BEGIN:END [-s SPECIES[:NAME] ...] maf [index]
maf_tile [options] --bed BED -o BASE [-s SPECIES[:NAME] ...] maf [index]
maf_tile takes a MAF file, with optional index, or directory of indexed MAF files, extracts alignment blocks overlapping the given genomic interval, and constructs a single alignment block covering the entire interval for the specified species. Optionally, any gaps in coverage of the MAF file's reference sequence can be filled in from a FASTA sequence file.
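As a rough illustration of the tiling idea, and not maf_tile's actual implementation, the following Python sketch builds a single output row. The function name `tile`, the `(start, sequence)` fragment tuples, and the half-open reading of the interval are assumptions made for this example. It also ignores alignment gap columns, which real MAF blocks contain.

```python
def tile(begin, end, fragments, fill_char, reference=None):
    """Tile ungapped fragments over the zero-based interval [begin, end).

    fragments: iterable of (start, sequence) pairs in reference coordinates.
    reference: optional full reference sequence used to fill uncovered positions.
    """
    if reference is not None:
        # Start from the reference so uncovered positions keep reference bases.
        out = list(reference[begin:end])
    else:
        # No reference given: uncovered positions get the fill character.
        out = [fill_char] * (end - begin)
    for start, seq in fragments:
        for offset, base in enumerate(seq):
            pos = start + offset
            if begin <= pos < end:
                out[pos - begin] = base
    return "".join(out)


print(tile(2, 8, [(0, "ACGT"), (6, "TT")], "*"))
print(tile(2, 8, [(0, "ACGT"), (6, "TT")], "*", reference="AAAAAAAAAA"))
```

The first call leaves positions 4 and 5 as fill characters; the second takes them from the reference string instead.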
If a single interval is specified, the output will be written to
stdout in FASTA format. If a directory of MAF files is supplied as the
maf parameter, the interval must include the sequence identifier, in
the form sequence:begin:end. If the --output-base option is given,
_<begin>:<end>.fa will be appended to the given base name to form the
output file name. If a BED file is specified with --bed,
--output-base is also required.
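The naming rule above can be sketched in Python as follows. The helper name `output_path` is hypothetical, and printing the coordinates as plain decimal integers is an assumption rather than documented behaviour.

```python
def output_path(base, begin, end):
    """Append _<begin>:<end>.fa to the output base name."""
    return f"{base}_{begin}:{end}.fa"


print(output_path("/tmp/mm8", 14400, 15000))
```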
Species can be renamed for output by specifying them as SPECIES:NAME; the first component will be used to select the species from the MAF file, and the second will be used in the FASTA description line for output.
The FASTA reference sequence file given, which may be gzipped, will be used to fill in any gaps between alignment blocks.
The given zero-based genomic interval will be used to select
alignment blocks from the MAF file. If the chromosome is not
specified, it will be taken from the first species specified.
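A minimal sketch of parsing the [SEQ:]BEGIN:END form follows. `parse_interval` is a hypothetical helper, the identifier `hg19.chrY` is purely illustrative, and rejecting intervals whose begin exceeds their end is an assumption.

```python
def parse_interval(spec):
    """Split '[SEQ:]BEGIN:END' into (seq_or_None, begin, end)."""
    # Split from the right so only the last two colons are treated as separators.
    parts = spec.rsplit(":", 2)
    if len(parts) == 3:
        seq, begin, end = parts
    elif len(parts) == 2:
        seq = None
        begin, end = parts
    else:
        raise ValueError(f"expected [SEQ:]BEGIN:END, got {spec!r}")
    begin, end = int(begin), int(end)
    if begin > end:
        raise ValueError(f"interval begin {begin} exceeds end {end}")
    return seq, begin, end


print(parse_interval("14400:15000"))
print(parse_interval("hg19.chrY:14400:15000"))
```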
The given species will be selected for output. If given as
species:name, it will appear in the FASTA output as name.
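The species:name convention can be illustrated with a short Python sketch. The function name `parse_species` is an assumption for this example. It returns the selector used against the MAF file and the label used in the FASTA output.

```python
def parse_species(spec):
    """Split 'SPECIES[:NAME]' into (species, output_name)."""
    species, sep, name = spec.partition(":")
    # Without an explicit name, the species identifier doubles as the label.
    return species, (name if sep and name else species)


print(parse_species("hg19:human"))
print(parse_species("petMar1"))
```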
Species to select, and optional mapping names, will be read from
the given file, one species per line. If the species name is
followed by whitespace and an additional name, this will be taken
as the output name. Lines beginning with
# will be ignored.
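The species file format described above could be read along these lines. This is a sketch only: `read_species_file` is a hypothetical name, and skipping blank lines is an assumption that the text does not state.

```python
import io


def read_species_file(handle):
    """Return a dict mapping each species to its output name."""
    mapping = {}
    for line in handle:
        if line.startswith("#"):
            continue  # comment line
        line = line.strip()
        if not line:
            continue  # assumption: blank lines are skipped
        fields = line.split(None, 1)
        species = fields[0]
        # An optional second field is the output name.
        name = fields[1] if len(fields) > 1 else species
        mapping[species] = name
    return mapping


example = io.StringIO("# species to tile\nhg19 human\npetMar1\nornAna1\tplatypus\n")
print(read_species_file(example))
```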
The given BED file will be used to provide a list of intervals to
process. If present,
--interval will be ignored and
--output-base must be given as well.
The given species name will be prepended to the chromosome name
indicated in the BED file, separated by a period. This is necessary
if the BED file simply indicates chr12, but the sequence
identifiers in the MAF file carry a species prefix.
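The prefixing rule amounts to joining the two names with a period, as in this small Python sketch. `qualify_chrom` is a hypothetical helper, and the mm8 and chr7 values are illustrative only.

```python
def qualify_chrom(chrom, species=None):
    """Prepend the species name to a BED chromosome name, if one is given."""
    return f"{species}.{chrom}" if species else chrom


print(qualify_chrom("chr7", "mm8"))
print(qualify_chrom("chr7"))
```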
The alignments specified in the BED file will be individually tiled and concatenated.
The given path will be used as the base name for output files, as described above.
Gaps where no aligning sequence data exists will be filled with the
given character instead of the default.
All sequence data will be folded to upper case.
Run quietly, with warnings suppressed.
Run verbosely, with additional informational messages.
Log debugging information.
Generate an alignment from chrY.maf over the interval 14400 to 15000
on the reference sequence of the MAF file. Fills in gaps from
chrY.refseq.fa.gz. Writes FASTA output to stdout.
    $ maf_tile --reference ~/maf/chrY.refseq.fa.gz \
        --interval 14400:15000 \
        -s hg19:human -s petMar1 -s ornAna1 \
        chrY.maf chrY.kct
    >human
    GGGTGACGAAAAGAGCCGA-----[...]
    >petMar1
    gagtgccggggagtgccggggagt[...]
    >ornAna1
    AGGGATCTGGGAATTCTGG-----[...]
Write out a FASTA file for each interval in the given BED file,
using the output base /tmp/mm8, and without filling in data from a
reference sequence.
    $ maf_tile --bed /tmp/mm8.bed --output-base /tmp/mm8 \
        -s mm8:mouse -s rn4:rat -s hg18:human \
        mm8_chr7_tiny.maf mm8_chr7_tiny.kct
The output is generated in FASTA format, with one sequence per species.
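For readers unfamiliar with the layout, the following Python sketch formats one record per species in the same shape as the example output above. `format_fasta` is a hypothetical helper. Writing each sequence on a single unwrapped line is an assumption, since the text does not say whether maf_tile wraps long lines.

```python
def format_fasta(rows):
    """Format (name, sequence) pairs as FASTA text, one record per species."""
    lines = []
    for name, seq in rows:
        lines.append(f">{name}")  # description line
        lines.append(seq)         # sequence data, unwrapped
    return "\n".join(lines) + "\n"


print(format_fasta([("human", "ACGT--"), ("petMar1", "acgtac")]), end="")
```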
The maf parameter must specify either a Multiple Alignment Format (MAF) file or a directory of such files, with indexes.
MAF files can optionally be BGZF-compressed, as produced by bgzip(1) from samtools.
The index must be a MAF index built with maf_index(1). This parameter is ignored if the maf parameter is a directory. It can be omitted if a single MAF file is given, but in this case the entire file will be parsed to build a temporary index. For large files which will be reused, this is not advisable.
If --bed bed is specified, its argument must be a BED file. Only
the second and third columns will be used, to specify the zero-based
start and end positions of intervals.
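Reading the start and end columns from BED lines might look like the sketch below. The name `read_bed_intervals` is hypothetical. Splitting on any whitespace and skipping blank, comment, `track`, and `browser` lines are assumptions about input handling, not documented maf_tile behaviour.

```python
def read_bed_intervals(lines):
    """Return (start, end) pairs from the second and third BED columns."""
    intervals = []
    for line in lines:
        stripped = line.strip()
        # Assumption: skip blank lines, comments, and BED header lines.
        if not stripped or stripped.startswith(("#", "track", "browser")):
            continue
        fields = stripped.split()
        intervals.append((int(fields[1]), int(fields[2])))
    return intervals


print(read_bed_intervals(["chr7\t100\t200\n", "chr7\t300\t450\tfeature\n"]))
```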
maf_tile is a Ruby program and relies on ordinary Ruby environment
variables.
maf_tile is copyright (C) 2012 Clayton Wheeler.
maf_index(1), ruby(1), bgzip(1)