Class | Bio::NCBI::REST |
In: | lib/bio/io/ncbirest.rb |
Parent: | Object |
NCBI_INTERVAL | = | 1 | Make no more than one request per second. (NCBI's restriction is "Make no more than 3 requests every 1 second.", but the limit is set to one request per second, partly to keep the value an integer.) |
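The one-request-per-interval rule can be illustrated with a small throttle. This is only a sketch: `IntervalThrottle` and its `wait` method are hypothetical names invented for the example, not part of Bio::NCBI::REST, and the body of the library's own `ncbi_access_wait` is not shown on this page.

```ruby
# Hypothetical helper (not part of Bio::NCBI::REST): enforces a minimum
# interval between calls, in the spirit of NCBI_INTERVAL.
class IntervalThrottle
  def initialize(interval)
    @interval = interval
    @last_access = nil
  end

  # Sleeps just long enough that consecutive calls are at least
  # @interval seconds apart, then records the new access time.
  # Returns the number of seconds slept.
  def wait
    slept = 0.0
    if @last_access
      elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - @last_access
      if elapsed < @interval
        slept = @interval - elapsed
        sleep(slept)
      end
    end
    @last_access = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    slept
  end
end

throttle = IntervalThrottle.new(0.2) # 0.2 s only to keep the demo fast
first  = throttle.wait               # no previous access, so no sleep
second = throttle.wait               # sleeps for the remainder of the interval
```

A monotonic clock is used here so that system clock adjustments do not shorten or lengthen the wait.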
# File lib/bio/io/ncbirest.rb, line 252
def self.efetch(*args)
  self.new.efetch(*args)
end

# File lib/bio/io/ncbirest.rb, line 244
def self.esearch(*args)
  self.new.esearch(*args)
end

# File lib/bio/io/ncbirest.rb, line 248
def self.esearch_count(*args)
  self.new.esearch_count(*args)
end
Retrieves database entries for the given IDs using the E-Utils (efetch) service.
For information on the possible arguments, see
ncbi = Bio::NCBI::REST.new
ncbi.efetch("185041", {"db"=>"nucleotide", "rettype"=>"gb", "retmode" => "xml"})
ncbi.efetch("J00231", {"db"=>"nuccore", "rettype"=>"gb", "retmode"=>"xml"})
ncbi.efetch("AAA52805", {"db"=>"protein", "rettype"=>"gb"})

Bio::NCBI::REST.efetch("185041", {"db"=>"nucleotide", "rettype"=>"gb", "retmode" => "xml"})
Bio::NCBI::REST.efetch("J00231", {"db"=>"nuccore", "rettype"=>"gb"})
Bio::NCBI::REST.efetch("AAA52805", {"db"=>"protein", "rettype"=>"gb"})
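Arguments:
- ids: entry IDs, given as an Array or as a comma-separated String (required)
- hash: Hash of E-Utils options, merged over the defaults "tool" => "bioruby" and "retmode" => "text"
- step: maximum number of IDs sent per request (default 100)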
Returns: | String |
# File lib/bio/io/ncbirest.rb, line 212
def efetch(ids, hash = {}, step = 100)
  serv = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"
  opts = {
    "tool"    => "bioruby",
    "retmode" => "text",
  }
  opts.update(hash)

  case ids
  when Array
    list = ids
  else
    list = ids.to_s.split(/\s*,\s*/)
  end

  result = ""
  0.step(list.size, step) do |i|
    opts["id"] = list[i, step].join(',')
    unless opts["id"].empty?
      ncbi_access_wait
      response = Bio::Command.post_form(serv, opts)
      result += response.body
    end
  end
  return result.strip
  #return result.strip.split(/\n\n+/)
end
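The way efetch normalizes its `ids` argument and splits it into per-request batches can be tried without network access. The sketch below uses a hypothetical helper name, `efetch_batches`, and `each_slice` in place of the `0.step` loop; it aims to produce the same non-empty "id" strings that the method above would send.

```ruby
# Hypothetical helper (not part of Bio::NCBI::REST): returns the "id"
# parameter values that would be sent, one string per request.
def efetch_batches(ids, step = 100)
  # Accept either an Array or a comma-separated String, as efetch does.
  list = ids.is_a?(Array) ? ids : ids.to_s.split(/\s*,\s*/)
  # Group the IDs into batches of at most `step` and join each batch.
  list.each_slice(step).map { |batch| batch.join(",") }
end

from_string = efetch_batches("185041, J00231,AAA52805", 2)
from_array  = efetch_batches(["185041", "J00231", "AAA52805"], 2)
```

Both calls yield two batches: the first two IDs joined with a comma, then the remaining ID on its own.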
Lists the NCBI database names using the E-Utils (einfo) service.
pubmed protein nucleotide nuccore nucgss nucest structure genome books cancerchromosomes cdd gap domains gene genomeprj gensat geo gds homologene journals mesh ncbisearch nlmcatalog omia omim pmc popset probe proteinclusters pcassay pccompound pcsubstance snp taxonomy toolkit unigene unists
ncbi = Bio::NCBI::REST.new
ncbi.einfo

Bio::NCBI::REST.einfo
Returns: | Array of Strings (database names) |
# File lib/bio/io/ncbirest.rb, line 68
def einfo
  serv = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/einfo.fcgi"
  opts = {}
  response = Bio::Command.post_form(serv, opts)
  result = response.body
  list = result.scan(/<DbName>(.*?)<\/DbName>/m).flatten
  return list
end
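The extraction step in einfo is a plain regular-expression scan over the response body. The sketch below applies the same pattern to a hand-written sample string; the sample is an assumption made for illustration, not a captured NCBI response.

```ruby
# Hand-written sample (an assumption, not real NCBI output).
sample = <<~XML
  <eInfoResult>
    <DbList>
      <DbName>pubmed</DbName>
      <DbName>protein</DbName>
      <DbName>nuccore</DbName>
    </DbList>
  </eInfoResult>
XML

# scan returns one single-element array per match because the pattern has
# one capture group; flatten turns that into a flat list of names.
db_names = sample.scan(/<DbName>(.*?)<\/DbName>/m).flatten
```

Because the pattern is non-greedy, each `<DbName>` element is matched separately even when several appear on one line.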
Searches the NCBI database for the given keywords using the E-Utils (esearch) service and returns an array of entry IDs.
For information on the possible arguments, see
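ncbi = Bio::NCBI::REST.new
ncbi.esearch("tardigrada", {"db"=>"nucleotide", "rettype"=>"count"})
ncbi.esearch("tardigrada", {"db"=>"nucleotide", "rettype"=>"gb"})
ncbi.esearch("yeast kinase", {"db"=>"nuccore", "rettype"=>"gb", "retmax"=>5})

Bio::NCBI::REST.esearch("tardigrada", {"db"=>"nucleotide", "rettype"=>"count"})
Bio::NCBI::REST.esearch("tardigrada", {"db"=>"nucleotide", "rettype"=>"gb"})
Bio::NCBI::REST.esearch("yeast kinase", {"db"=>"nuccore", "rettype"=>"gb", "retmax"=>5})
Arguments:
- str: search keywords (required)
- hash: Hash of E-Utils options; "retstart" and "retmax" are honored, and "rettype" => "count" returns only the number of results
- limit: maximum number of IDs to return (defaults to "retmax" if given, otherwise 100; 0 means no limit)
- step: maximum number of IDs requested per call (default 10000)
Returns: | Array of entry IDs, or the number of results when "rettype" is "count" |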
# File lib/bio/io/ncbirest.rb, line 135
def esearch(str, hash = {}, limit = nil, step = 10000)
  serv = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
  opts = {
    "tool"   => "bioruby",
    "term"   => str,
  }
  opts.update(hash)

  case opts["rettype"]
  when "count"
    count = esearch_count(str, opts)
    return count
  else
    retstart = 0
    retstart = hash["retstart"].to_i if hash["retstart"]

    limit ||= hash["retmax"].to_i if hash["retmax"]
    limit ||= 100 # default limit is 100
    limit = esearch_count(str, opts) if limit == 0   # unlimit

    list = []
    0.step(limit, step) do |i|
      retmax = [step, limit - i].min
      opts.update("retmax" => retmax, "retstart" => i + retstart)
      ncbi_access_wait
      response = Bio::Command.post_form(serv, opts)
      result = response.body
      list += result.scan(/<Id>(.*?)<\/Id>/m).flatten
    end
    return list
  end
end
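The paging arithmetic in esearch can be examined in isolation. The sketch below reproduces only the `0.step(limit, step)` loop and records the "retstart" and "retmax" values each request would carry; `esearch_windows` is a hypothetical name and no request is made.

```ruby
# Hypothetical helper (not part of Bio::NCBI::REST): lists the paging
# parameters that the loop in esearch would send for a given limit and step.
def esearch_windows(limit, step = 10000, retstart = 0)
  windows = []
  0.step(limit, step) do |i|
    retmax = [step, limit - i].min # never ask for more than remains
    windows << { "retstart" => i + retstart, "retmax" => retmax }
  end
  windows
end

partial = esearch_windows(25, 10) # three windows of 10, 10 and 5 IDs
exact   = esearch_windows(20, 10) # limit is an exact multiple of step
```

Since `0.step` includes its upper bound, the `exact` case appears to end with a window whose "retmax" is 0, which would correspond to one extra request in the original loop.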
Arguments: | same as esearch method |
Returns: | number of results (Integer) |
# File lib/bio/io/ncbirest.rb, line 170
def esearch_count(str, hash = {})
  serv = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
  opts = {
    "tool"   => "bioruby",
    "term"   => str,
  }
  opts.update(hash)
  opts.update("rettype" => "count")
  #ncbi_access_wait
  response = Bio::Command.post_form(serv, opts)
  result = response.body
  count = result.scan(/<Count>(.*?)<\/Count>/m).flatten.first.to_i
  return count
end
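The count extraction can be checked against a literal string. The sample below is hand-written for illustration and is only assumed to resemble an esearch response; the second case shows what the same expression yields when no `<Count>` element is present.

```ruby
# Hand-written sample (an assumption, not real NCBI output).
sample = "<eSearchResult><Count>1234</Count></eSearchResult>"

# Same expression as in esearch_count: take the first <Count> match
# and convert it to an Integer.
count = sample.scan(/<Count>(.*?)<\/Count>/m).flatten.first.to_i

# With no match, first is nil and nil.to_i is 0, so a malformed or
# empty body is indistinguishable from a search with zero results.
missing = "<eSearchResult/>".scan(/<Count>(.*?)<\/Count>/m).flatten.first.to_i
```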