

The Multiple Alignment Format (MAF), described by UCSC, stores a series of multiple alignments in a single file. It is suited to whole-genome to whole-genome alignments, and each aligned sequence can carry metadata such as source chromosome, start position, size, and strand.

Biopython 1.69 includes a MAF reader and writer accessible via Bio.AlignIO, and an indexer accessible via Bio.AlignIO.MafIO.

All examples below make use of the Multiz 30-way alignment to mouse chromosome 10 available from UCSC.

Getting the AlignIO code from GitHub

If you can’t wait for Biopython 1.69 to be released, get the latest Biopython from GitHub:

First, clone the repository with git from the command line, like so:

git clone git@github.com:biopython/biopython.git

This will give you the default master branch. Then install from source.

Reading in a MAF file

Parsing a MAF file works like parsing any other alignment format in AlignIO. The additional data, however, is stored as a dict in the .annotations attribute of each SeqRecord in the returned MultipleSeqAlignment objects.

Annotations available in SeqRecords

Key	Type	Value
start	integer	The start position in the source sequence of this alignment
size	integer	The ungapped length of this sequence
strand	enum("+", "-")	The strand this sequence originates from on the source sequence/chromosome
srcSize	integer	The total length of the source sequence/chromosome

Example

from Bio import AlignIO

for multiple_alignment in AlignIO.parse("chr10.maf", "maf"):
    print("printing a new multiple alignment")

    for seqrec in multiple_alignment:
        print(
            "starts at %s on the %s strand of a sequence %s in length, and runs for %s bp"
            % (
                seqrec.annotations["start"],
                seqrec.annotations["strand"],
                seqrec.annotations["srcSize"],
                seqrec.annotations["size"],
            )
        )

MafIndex

Bio.AlignIO.MafIO provides an interface for fast access to the multiple alignment of several sequences across an arbitrary interval: for example, chr10:25,079,604-25,243,324 in mm9. As MAF files are available for entire chromosomes, they can be indexed by chromosome position and accessed at random. This functionality is available in the class Bio.AlignIO.MafIO.MafIndex.

Creating or loading a MAF index

An index is built by recording the chromosome start and end position of one specific sequence name (generally a species) in each alignment block, so that name must appear in every block in the file. An index can be generated for only one species at a time. In whole-genome alignments generated by Multiz, the chromosome of one species is generally used as the reference to which the other species are aligned. This reference species appears in every block and should be passed as the target_seqname parameter. In UCSC Multiz files, sequence names take the form species.chromosome (for example, mm9.chr10).
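Before building an index, it can help to confirm which sequence names really do appear in every block. The following is a minimal sketch that scans MAF text directly with the standard library; the helper name names_in_every_block is hypothetical and is not part of Biopython.

```python
def names_in_every_block(maf_text):
    """Return the set of sequence names present in every alignment block."""
    blocks = []  # one set of sequence names per "a" block
    for line in maf_text.splitlines():
        fields = line.split()
        if not fields:
            continue  # blank lines only separate blocks
        if fields[0] == "a":
            blocks.append(set())  # a new alignment block starts here
        elif fields[0] == "s" and blocks:
            blocks[-1].add(fields[1])  # second column is the sequence name
    return set.intersection(*blocks) if blocks else set()


sample = """##maf version=1 scoring=tba.v8
a score=5062.0
s hg18.chr7    27699739 6 + 158545518 TAAAGA
s mm4.chr6     53303881 6 + 151104725 TAAAGA
s rn3.chr4     81444246 6 + 187371129 taagga

a score=6636.0
s hg18.chr7    27707221 13 + 158545518 gcagctgaaaaca
s mm4.chr6     53310102 13 + 151104725 ACAGCTGAAAATA
"""

# rn3.chr4 is missing from the second block, so it is not a usable target
print(sorted(names_in_every_block(sample)))
```

Any name in the returned set is a candidate for target_seqname; in a Multiz file this would normally include the reference chromosome.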

To index a MAF file, or load an existing index, create a new MafIO.MafIndex object. If the index database file sqlite_file does not exist, it will be created; otherwise it will be loaded.

# index mouse chr10 from UCSC and store it in a file for later use

from Bio.AlignIO import MafIO

# idx = MafIO.MafIndex(sqlite_file, maf_file, target_seqname)
idx = MafIO.MafIndex("chr10.mafindex", "chr10.maf", "mm9.chr10")

Retrieving alignments overlapping a given interval

The MafIO.MafIndex.search() generator function accepts a list of start positions and a matching list of end positions, and yields the MultipleSeqAlignment objects that overlap the given intervals. This is particularly useful for obtaining alignments over the multiple exons of a single transcript, eliminating the need to retrieve an entire locus.

# count the number of bases in danRer5 (Zebrafish) that align to the
# Pcmt1 locus in mouse

from Bio.AlignIO.MafIO import MafIndex

idx = MafIndex("chr10.mafindex", "chr10.maf", "mm9.chr10")
results = idx.search([7350034], [7383048])

total_bases = 0

for multiple_alignment in results:
    for seqrec in multiple_alignment:
        if seqrec.id.startswith("danRer5"):
            # don't count gaps as bases
            total_bases += len(str(seqrec.seq).replace("-", ""))

print("a total of %s bases align" % total_bases)
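To make "overlap" concrete, here is a small sketch of the interval test involved. It assumes zero-based, half-open coordinates (the UCSC convention) and works on plain (start, size) tuples; the function names are hypothetical, and this is not how MafIndex is implemented internally, since MafIndex looks blocks up through its index file.

```python
def overlaps(block_start, block_size, query_start, query_end):
    """True if [block_start, block_start + block_size) intersects [query_start, query_end)."""
    block_end = block_start + block_size
    return block_start < query_end and query_start < block_end


def blocks_overlapping(blocks, starts, ends):
    """Yield each (start, size) block that overlaps at least one query interval."""
    if len(starts) != len(ends):
        raise ValueError("every start needs a matching end")
    intervals = list(zip(starts, ends))
    for block_start, block_size in blocks:
        if any(overlaps(block_start, block_size, s, e) for s, e in intervals):
            yield (block_start, block_size)


# three toy blocks on the reference, queried with a single interval [140, 155)
blocks = [(100, 50), (150, 10), (300, 20)]
print(list(blocks_overlapping(blocks, [140], [155])))
```

A block that merely touches the end of a query interval does not count as overlapping under the half-open assumption.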

Retrieving a pre-spliced alignment over a given set of exons

The MafIO.MafIndex.get_spliced() function accepts a list of start positions and a list of end positions representing exons, and returns a single MultipleSeqAlignment object of the in silico spliced transcript from the reference and all aligned sequences. If part of the sequence range is not found in a particular species in the alignment, dashes ("-") fill the gaps; if the sequence is not present in the reference (target_seqname) sequence, "N"s are used instead. If the requested strand is opposite to that of the reference sequence, all sequences in the returned alignment are reverse complemented.

# convert the alignment for mouse Foxo3 (NM_019740) from MAF to FASTA

from Bio import AlignIO

idx = AlignIO.MafIO.MafIndex("chr10.mafindex", "chr10.maf", "mm9.chr10")

multiple_alignment = idx.get_spliced(
    [41905591, 41916271, 41994621, 41996331],
    [41906101, 41917707, 41995347, 41996548],
    strand=1,  # released Biopython validates strand as 1 (forward) or -1 (reverse)
)

AlignIO.write(multiple_alignment, "mm9_foxo3.fa", "fasta")

# find every gene on chr10 in the current UCSC refGene database,
# retrieve its spliced multiple alignment, and write it to
# a FASTA file in the current directory
#
# depends: MySQLdb

import MySQLdb
from Bio import AlignIO

# connect to UCSC's live MySQL database
mysql_conn = MySQLdb.connect(
    host="genome-mysql.cse.ucsc.edu", user="genome", passwd="", db="mm9"
)

db_conn = mysql_conn.cursor(MySQLdb.cursors.DictCursor)

# load MAF index
idx = AlignIO.MafIO.MafIndex("chr10.mafindex", "chr10.maf", "mm9.chr10")

# fetch all records on chr10
db_conn.execute("SELECT * FROM refGene WHERE chrom = 'chr10'")

for record in db_conn.fetchall():
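    multiple_alignment = idx.get_spliced(
        # exonStarts/exonEnds end with a trailing comma, hence the [:-1];
        # build real lists, since map() returns an iterator on Python 3
        [int(pos) for pos in record["exonStarts"].split(",")[:-1]],
        [int(pos) for pos in record["exonEnds"].split(",")[:-1]],
        # released Biopython validates strand as 1 or -1
        strand=1 if record["strand"] == "+" else -1,
    )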

    print("writing %s.fa" % record["name"])

    AlignIO.write(multiple_alignment, "%s.fa" % record["name"], "fasta")
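The refGene exonStarts and exonEnds columns hold comma-separated positions with a trailing comma, which is why the loop above slices off the last element. A small helper can make that conversion explicit. This is a sketch only: parse_ucsc_positions is a hypothetical name, and the bytes branch assumes that some MySQL drivers return these columns as bytes rather than str.

```python
def parse_ucsc_positions(field):
    """Convert a refGene-style "a,b,c," value (str or bytes) into a list of ints."""
    if isinstance(field, bytes):
        # assumption: the driver handed back raw bytes for a blob column
        field = field.decode("ascii")
    # skipping empty items drops the element produced by the trailing comma
    return [int(item) for item in field.split(",") if item]


starts = parse_ucsc_positions("41905591,41916271,41994621,41996331,")
ends = parse_ucsc_positions(b"41906101,41917707,41995347,41996548,")

# the two lists must pair up one-to-one before they are passed on
assert len(starts) == len(ends)
print(list(zip(starts, ends)))
```

The resulting lists can be passed directly as the start and end arguments of get_spliced().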

Format

track name=euArc visibility=pack
##maf version=1 scoring=tba.v8
# tba.v8 (((human chimp) baboon) (mouse rat))

a score=23262.0
s hg18.chr7    27578828 38 + 158545518 AAA-GGGAATGTTAACCAAATGA---ATTGTCTCTTACGGTG
s panTro1.chr6 28741140 38 + 161576975 AAA-GGGAATGTTAACCAAATGA---ATTGTCTCTTACGGTG
s baboon         116834 38 +   4622798 AAA-GGGAATGTTAACCAAATGA---GTTGTCTCTTATGGTG
s mm4.chr6     53215344 38 + 151104725 -AATGGGAATGTTAAGCAAACGA---ATTGTCTCTCAGTGTG
s rn3.chr4     81344243 40 + 187371129 -AA-GGGGATGCTAAGCCAATGAGTTGTTGTCTCTCAATGTG

a score=5062.0
s hg18.chr7    27699739 6 + 158545518 TAAAGA
s panTro1.chr6 28862317 6 + 161576975 TAAAGA
s baboon         241163 6 +   4622798 TAAAGA
s mm4.chr6     53303881 6 + 151104725 TAAAGA
s rn3.chr4     81444246 6 + 187371129 taagga

a score=6636.0
s hg18.chr7    27707221 13 + 158545518 gcagctgaaaaca
s panTro1.chr6 28869787 13 + 161576975 gcagctgaaaaca
s baboon         249182 13 +   4622798 gcagctgaaaaca
s mm4.chr6     53310102 13 + 151104725 ACAGCTGAAAATA
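Each "s" line in the sample above has seven whitespace-separated columns: the literal s, the source name, start, size, strand, source size, and the aligned text. The sketch below parses one such line with the standard library to show how the columns relate to the annotation keys listed earlier; SLine and parse_s_line are hypothetical names, not Biopython API.

```python
from collections import namedtuple

SLine = namedtuple("SLine", "src start size strand src_size text")


def parse_s_line(line):
    """Split one MAF 's' line into typed fields and sanity-check its size."""
    fields = line.split()
    if len(fields) != 7 or fields[0] != "s":
        raise ValueError("not a MAF 's' line: %r" % line)
    _, src, start, size, strand, src_size, text = fields
    rec = SLine(src, int(start), int(size), strand, int(src_size), text)
    # size counts aligned bases only, so gap characters must be excluded
    if len(rec.text.replace("-", "")) != rec.size:
        raise ValueError("size does not match the ungapped sequence length")
    return rec


rec = parse_s_line(
    "s rn3.chr4     81344243 40 + 187371129 "
    "-AA-GGGGATGCTAAGCCAATGAGTTGTTGTCTCTCAATGTG"
)
species, chromosome = rec.src.split(".", 1)  # UCSC-style species.chromosome
print(species, chromosome, rec.start, rec.size, rec.strand, rec.src_size)
```

Names without a dot, such as baboon in the sample, would need separate handling before being split this way.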
