Python DNA motif matching - finding peaks that overlap the TATAA sequence


try:
    import xmlrpclib  # Python 2
except ImportError:
    import xmlrpc.client as xmlrpclib  # Python 3
import time

url = "http://deepblue.mpi-inf.mpg.de/xmlrpc"
user_key = "anonymous_key"

server = xmlrpclib.Server(url, allow_none=True)

# Find all locations on chromosome 1 where the motif TATAAA (the pattern passed below) appears in the genome
(status, tataa_regions) = server.find_motif("TATAAA", "GRCh38", "chr1", None, None, False, user_key)

# Select the data from two experiments. The experiment names are already known,
# so no other filters are given.
# The selection is restricted to chromosome 1, positions 0 to 50,000,000.
experiment_names = [
    "BL-2_c01.ERX297416.H3K27ac.bwa.GRCh38.20150527.bed",
    "S008SGH1.ERX406923.H3K27ac.bwa.GRCh38.20150728.bed",
]
(status, query_id) = server.select_experiments(experiment_names, "chr1", 0, 50000000, user_key)

# Overlap regions with pattern
(status, overlapped) = server.intersection(query_id, tataa_regions, user_key)

# Retrieve the experiment data. The @NAME meta-column adds the experiment name
# and @BIOSOURCE adds the experiment's biosource.
(status, request_id) = server.get_regions(overlapped, "CHROMOSOME,START,END,SIGNAL_VALUE,PEAK,@NAME,@BIOSOURCE,@LENGTH,@SEQUENCE", user_key)

# Poll once per second until the server reports the request as done or failed
(status, info) = server.info(request_id, user_key)
request_status = info[0]["state"]
while request_status != "done" and request_status != "failed":
  time.sleep(1)
  (status, info) = server.info(request_id, user_key)
  request_status = info[0]["state"]

(status, regions) = server.get_request_data(request_id, user_key)

print(regions)
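To make the two server-side steps easier to picture, here is a small local sketch of the same idea on an in-memory string: find every occurrence of a motif, then keep the peaks that overlap at least one occurrence. It does not call DeepBlue and is not how the server implements `find_motif` or `intersection`. The function names, the half-open `(start, end)` convention, and the toy sequence and peaks are all assumptions made for illustration.

```python
import re


def find_motif_regions(sequence, motif):
    """Return half-open (start, end) intervals for every occurrence of motif.

    A zero-width lookahead is used so that overlapping occurrences are
    reported too. The sequence and motif are assumed to use the same case.
    """
    pattern = re.compile("(?=%s)" % re.escape(motif))
    return [(m.start(), m.start() + len(motif)) for m in pattern.finditer(sequence)]


def overlapping_peaks(peaks, motif_regions):
    """Keep the peaks (start, end) that overlap at least one motif region."""
    kept = []
    for p_start, p_end in peaks:
        # Two half-open intervals overlap when each starts before the other ends.
        if any(p_start < m_end and m_start < p_end for m_start, m_end in motif_regions):
            kept.append((p_start, p_end))
    return kept


# Toy data, invented for this example only.
toy_sequence = "GGTATAAACCTATAAATT"
toy_peaks = [(0, 3), (8, 10), (15, 20)]

motif_hits = find_motif_regions(toy_sequence, "TATAAA")
print(motif_hits)                                # [(2, 8), (10, 16)]
print(overlapping_peaks(toy_peaks, motif_hits))  # [(0, 3), (15, 20)]
```

The peak `(8, 10)` is dropped because it sits exactly between the two motif hits and touches neither of them under the half-open convention.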
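The polling loop in the script waits for as long as the request stays unfinished, and the result is printed as one raw value. The sketch below wraps the same `server.info` polling in a helper with a timeout and splits the returned text into one dictionary per region. The helper names and the timeout are hypothetical, not part of the DeepBlue API. The parser also assumes that the returned data is text with one region per line and tab-separated fields in the order of the format string passed to `get_regions`, which the script itself does not confirm.

```python
import time


def wait_for_request(server, request_id, user_key, poll_seconds=1.0, timeout_seconds=600.0):
    """Poll server.info() until the request is done or failed, or time out.

    Returns the final state string ("done" or "failed").
    """
    deadline = time.time() + timeout_seconds
    while True:
        status, info = server.info(request_id, user_key)
        state = info[0]["state"]
        if state in ("done", "failed"):
            return state
        if time.time() >= deadline:
            raise RuntimeError(
                "request %s is still '%s' after %.0f seconds"
                % (request_id, state, timeout_seconds)
            )
        time.sleep(poll_seconds)


def parse_regions(regions_text, column_format):
    """Split region text into dictionaries keyed by column name.

    Assumption: one region per line, fields separated by tabs, in the
    same order as the comma-separated column_format string.
    """
    columns = column_format.split(",")
    return [
        dict(zip(columns, line.split("\t")))
        for line in regions_text.splitlines()
        if line.strip()
    ]
```

With the variables from the script, this could be used as `state = wait_for_request(server, request_id, user_key)`, followed by `parse_regions(regions, "CHROMOSOME,START,END,...")` once `state` is `"done"`. All parsed values stay strings, so numeric columns such as `START` and `END` would still need converting with `int`.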
