r query_tblastn.R


# Data wrangling (dplyr, readr, tidyr, stringr), NCBI E-utilities (reutils), XML parsing (XML)
library(dplyr)
library(readr)
library(tidyr)
library(stringr)
library(reutils)
library(XML)


tblastn <- function(fasta, db = "refseq_genomic") 
{
  # The tabular BLAST output is written next to the query file
  query_name <- paste0(fasta, ".blast.txt")
  # Build the shell command; -remote runs the search on NCBI's servers
  comm <- paste("export BLASTDB=/usr/local/share/blast; tblastn -query", fasta,
                "-db", db, 
                "-task tblastn",
                "-max_target_seqs 20000",
                "-outfmt '6 sscinames scomnames staxids qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore'",
                "-remote",
                # Quote the Entrez query so the shell does not treat [ORGN] as a glob pattern
                "-entrez_query 'txid6237[ORGN]'",
                ">",
                query_name)
  print(comm)
  system(comm)
  
  r <- read_tsv( query_name, col_names = c("Species",
                                           "Name",
                                           "TaxID",
                                           "QueryID",
                                           "SubjectID",
                                           "Percent_Identity",
                                           "Alignment_Length",
                                           "Mismatches",
                                           "Gap_Openings",
                                           "Q.Start",
                                           "Q.End",
                                           "S.Start",
                                           "S.End",
                                           "E",
                                           "Bits") ) %>%
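    # Assumes sseqid has the form "gi|<gi>|ref|<accession>|"; keep only the gi and accession fields
    separate(SubjectID, into = c("name_drop", "gi", "ref_drop","accession"), sep = "\\|", extra = "drop", convert = TRUE) %>%
    dplyr::select(-name_drop, -ref_drop) %>%
    # Replace Name with the nuccore record title (assumes one title comes back per row, in row order)
    dplyr::mutate(Name = sapply(unlist(efetch(accession, db="nuccore", "docsum")['//Item[@Name="Title"]/text()']), xmlValue) ) %>%
    dplyr::rename(POS_Start = S.Start, POS_End = S.End) %>%
    # Capture the whole chromosome label (e.g. "II"), not just its first character
    dplyr::mutate(CHROM = str_match(Name, "chromosome ([A-Za-z0-9]+)")[,2]) %>%
    dplyr::select(Species, Name, TaxID, QueryID, CHROM, POS_Start, POS_End, accession,  everything()) 
  # Return the annotated hits table explicitly
  r
}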

results <- tblastn("~/Desktop/Sap.fasta")
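
The two parsing steps inside tblastn(), splitting the subject ID and pulling the chromosome label out of the record title, can be tried without running BLAST or contacting NCBI. The sketch below uses made-up rows (the IDs and titles are illustrative, not real records) and assumes the subject ID has the "gi|...|ref|...|" layout that the function expects. It also compares a single-character capture group with a one-or-more capture group for the chromosome label.

```r
library(tidyr)
library(stringr)

# Made-up rows standing in for BLAST hits; IDs and titles are illustrative only
hits <- data.frame(
  SubjectID = c("gi|111|ref|NC_000001.1|", "gi|222|ref|NC_000002.1|"),
  Name = c("Example organism chromosome II", "Example organism chromosome X, partial"),
  stringsAsFactors = FALSE
)

# Split on "|"; the trailing empty piece is discarded by extra = "drop"
parsed <- separate(hits, SubjectID,
                   into = c("name_drop", "gi", "ref_drop", "accession"),
                   sep = "\\|", extra = "drop", convert = TRUE)

# A single-character capture versus a one-or-more capture
chrom_single <- str_match(parsed$Name, "chromosome ([A-Za-z0-9])")[, 2]
chrom_full   <- str_match(parsed$Name, "chromosome ([A-Za-z0-9]+)")[, 2]

print(parsed[, c("gi", "accession")])
print(chrom_single)
print(chrom_full)
```

With these rows, the single-character pattern truncates "II" to "I", while the one-or-more pattern keeps the full label.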
