Class | Bio::NCBI::REST |
In: | lib/bio/io/ncbirest.rb |
Parent: | Object |
The Bio::NCBI::REST class provides a REST client for the NCBI E-Utilities.
Entrez utilities index:
NCBI_INTERVAL | = | 1.0 / 3.0 |
Run retrieval scripts on weekends or between 9 pm and 5 am Eastern Time
weekdays for any series of more than 100 requests. -> Not implemented
yet in BioRuby
Wait for 1/3 second between requests. NCBI's restriction is: "Make no more than 3 requests every 1 second." |
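The interval rule above can be sketched as a small throttle. This is only an illustration: `throttled` and `monotonic_now` are hypothetical helpers and are not part of BioRuby, and the real class may enforce the wait differently. The sketch sleeps just long enough that consecutive calls start at least `interval` seconds apart.

```ruby
# Hypothetical helpers for illustration only; not BioRuby API.

# Monotonic clock reading in seconds (unaffected by wall-clock changes).
def monotonic_now
  Process.clock_gettime(Process::CLOCK_MONOTONIC)
end

# Runs the given block, first sleeping if the previous call recorded in
# `state` started less than `interval` seconds ago.
def throttled(state, interval)
  if state[:last]
    wait = interval - (monotonic_now - state[:last])
    sleep(wait) if wait > 0
  end
  state[:last] = monotonic_now
  yield
end

interval = 1.0 / 3.0   # same value as NCBI_INTERVAL
state = {}
started = monotonic_now
# Stand-in work instead of real HTTP requests.
results = (1..3).map { |n| throttled(state, interval) { n * 10 } }
elapsed = monotonic_now - started
```

Three calls need two waits, so `elapsed` should come out at roughly two thirds of a second, which keeps the rate at or below three requests per second.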
  # File lib/bio/io/ncbirest.rb, line 352
  def self.efetch(*args)
    self.new.efetch(*args)
  end

  # File lib/bio/io/ncbirest.rb, line 344
  def self.esearch(*args)
    self.new.esearch(*args)
  end

  # File lib/bio/io/ncbirest.rb, line 348
  def self.esearch_count(*args)
    self.new.esearch_count(*args)
  end
Retrieves database entries for the given IDs using the E-Utils (efetch) service.
For information on the possible arguments, see
  ncbi = Bio::NCBI::REST.new
  ncbi.efetch("185041", {"db"=>"nucleotide", "rettype"=>"gb", "retmode"=>"xml"})
  ncbi.efetch("J00231", {"db"=>"nuccore", "rettype"=>"gb", "retmode"=>"xml"})
  ncbi.efetch("AAA52805", {"db"=>"protein", "rettype"=>"gb"})

  Bio::NCBI::REST.efetch("185041", {"db"=>"nucleotide", "rettype"=>"gb", "retmode"=>"xml"})
  Bio::NCBI::REST.efetch("J00231", {"db"=>"nuccore", "rettype"=>"gb"})
  Bio::NCBI::REST.efetch("AAA52805", {"db"=>"protein", "rettype"=>"gb"})
Arguments:
Returns: | String |
  # File lib/bio/io/ncbirest.rb, line 316
  def efetch(ids, hash = {}, step = 100)
    serv = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"
    opts = default_parameters.merge({ "retmode" => "text" })
    opts.update(hash)

    case ids
    when Array
      list = ids
    else
      list = ids.to_s.split(/\s*,\s*/)
    end

    result = ""
    0.step(list.size, step) do |i|
      opts["id"] = list[i, step].join(',')
      unless opts["id"].empty?
        response = ncbi_post_form(serv, opts)
        result += response.body
      end
    end
    return result.strip
    #return result.strip.split(/\n\n+/)
  end
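To make the ID handling in efetch easier to follow, here is a minimal sketch of the same normalization and batching with no network access. `id_batches` is a hypothetical name, not a BioRuby method, and it uses `each_slice` in place of the `0.step` loop in the source. For this purpose it should produce the same non-empty batches.

```ruby
# Hypothetical helper for illustration only; not BioRuby API.
# Accepts an Array of IDs or a comma-separated String, and returns the
# comma-joined "id" parameter value for each batch of at most `step` IDs.
def id_batches(ids, step = 100)
  list = ids.is_a?(Array) ? ids : ids.to_s.split(/\s*,\s*/)
  list.each_slice(step).map { |batch| batch.join(",") }
end

# A String with irregular spacing around the commas, batched two at a time.
small = id_batches("185041, J00231 ,AAA52805", 2)
# => ["185041,J00231", "AAA52805"]

# 250 IDs with the default step of 100 give batches of 100, 100 and 50.
large = id_batches((1..250).to_a)
batch_sizes = large.map { |value| value.split(",").size }
# => [100, 100, 50]
```

Each element of the returned array corresponds to one POST request in efetch, and the response bodies are concatenated in that order.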
Lists the NCBI database names using the E-Utils (einfo) service.
pubmed protein nucleotide nuccore nucgss nucest structure genome books cancerchromosomes cdd gap domains gene genomeprj gensat geo gds homologene journals mesh ncbisearch nlmcatalog omia omim pmc popset probe proteinclusters pcassay pccompound pcsubstance snp taxonomy toolkit unigene unists
  ncbi = Bio::NCBI::REST.new
  ncbi.einfo

  Bio::NCBI::REST.einfo
Returns: | array of strings (database names) |
  # File lib/bio/io/ncbirest.rb, line 180
  def einfo
    serv = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/einfo.fcgi"
    opts = default_parameters.merge({})
    response = ncbi_post_form(serv, opts)
    result = response.body
    list = result.scan(/<DbName>(.*?)<\/DbName>/m).flatten
    return list
  end
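The extraction step in einfo can be tried on its own. In the sketch below, `einfo_sample` is a short hand-written stand-in for a response body, not captured NCBI output, and the surrounding element names are an assumption. Only the regular expression is taken from the source.

```ruby
# Hand-written sample body (an assumption, not real NCBI output).
einfo_sample = <<~XML
  <eInfoResult>
    <DbList>
      <DbName>pubmed</DbName>
      <DbName>protein</DbName>
      <DbName>nuccore</DbName>
    </DbList>
  </eInfoResult>
XML

# String#scan with one capture group returns an array of one-element
# arrays; flatten turns it into a plain array of names.
db_names = einfo_sample.scan(/<DbName>(.*?)<\/DbName>/m).flatten
# => ["pubmed", "protein", "nuccore"]
```

The method relies on pattern matching rather than an XML parser, so it depends on the `<DbName>` elements appearing literally in the response text.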
Searches the NCBI database for the given keywords using the E-Utils (esearch) service and returns an array of entry IDs.
For information on the possible arguments, see
  ncbi = Bio::NCBI::REST.new
  ncbi.esearch("tardigrada", {"db"=>"nucleotide", "rettype"=>"count"})
  ncbi.esearch("tardigrada", {"db"=>"nucleotide", "rettype"=>"gb"})
  ncbi.esearch("yeast kinase", {"db"=>"nuccore", "rettype"=>"gb", "retmax"=>5})

  Bio::NCBI::REST.esearch("tardigrada", {"db"=>"nucleotide", "rettype"=>"count"})
  Bio::NCBI::REST.esearch("tardigrada", {"db"=>"nucleotide", "rettype"=>"gb"})
  Bio::NCBI::REST.esearch("yeast kinase", {"db"=>"nuccore", "rettype"=>"gb", "retmax"=>5})
Arguments:
Returns: | array of entry IDs or a number of results |
  # File lib/bio/io/ncbirest.rb, line 247
  def esearch(str, hash = {}, limit = nil, step = 10000)
    serv = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
    opts = default_parameters.merge({ "term" => str })
    opts.update(hash)

    case opts["rettype"]
    when "count"
      count = esearch_count(str, opts)
      return count
    else
      retstart = 0
      retstart = hash["retstart"].to_i if hash["retstart"]

      limit ||= hash["retmax"].to_i if hash["retmax"]
      limit ||= 100 # default limit is 100
      limit = esearch_count(str, opts) if limit == 0   # unlimit

      list = []
      0.step(limit, step) do |i|
        retmax = [step, limit - i].min
        opts.update("retmax" => retmax, "retstart" => i + retstart)
        response = ncbi_post_form(serv, opts)
        result = response.body
        list += result.scan(/<Id>(.*?)<\/Id>/m).flatten
      end
      return list
    end
  end
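The paging loop in esearch is easier to see with small numbers. The sketch below reproduces only the `retstart`/`retmax` arithmetic from the source and makes no requests. `esearch_windows` is a hypothetical helper name, not a BioRuby method.

```ruby
# Hypothetical helper for illustration only; not BioRuby API.
# Returns the "retstart"/"retmax" pair that each loop iteration would
# send, using the same 0.step(limit, step) arithmetic as the source.
def esearch_windows(limit, step = 10000, retstart = 0)
  windows = []
  0.step(limit, step) do |i|
    retmax = [step, limit - i].min
    windows << { "retstart" => i + retstart, "retmax" => retmax }
  end
  windows
end

# 25 results fetched 10 at a time: windows of 10, 10 and 5.
uneven = esearch_windows(25, 10)
# => [{"retstart"=>0,  "retmax"=>10},
#     {"retstart"=>10, "retmax"=>10},
#     {"retstart"=>20, "retmax"=>5}]

# When limit is an exact multiple of step, the inclusive upper bound of
# 0.step adds a final window with retmax 0.
even = esearch_windows(20, 10)
```

With the defaults (a limit of 100 and a step of 10000) the loop runs once, so the trailing zero-size window only appears when the limit is a multiple of the step.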
Arguments: | same as esearch method |
Returns: | number of results (Integer) |
  # File lib/bio/io/ncbirest.rb, line 278
  def esearch_count(str, hash = {})
    serv = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
    opts = default_parameters.merge({ "term" => str })
    opts.update(hash)
    opts.update("rettype" => "count")
    response = ncbi_post_form(serv, opts)
    result = response.body
    count = result.scan(/<Count>(.*?)<\/Count>/m).flatten.first.to_i
    return count
  end
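The count extraction in esearch_count can be checked in isolation. `esearch_sample` below is a hand-written stand-in for a response body; its layout and numbers are assumptions made for illustration, not real NCBI output. It contains two `<Count>` elements to show why the source takes `.first`.

```ruby
# Hand-written sample body (an assumption, not real NCBI output).
esearch_sample = <<~XML
  <eSearchResult>
    <Count>1234</Count>
    <TranslationStack>
      <TermSet>
        <Count>987</Count>
      </TermSet>
    </TranslationStack>
  </eSearchResult>
XML

# Same expression as in the source: collect every <Count> value, keep
# the first one, and convert it to an Integer.
counts = esearch_sample.scan(/<Count>(.*?)<\/Count>/m).flatten
count = counts.first.to_i
# counts => ["1234", "987"]
# count  => 1234
```

If a response contained no `<Count>` element at all, `first` would return nil and `nil.to_i` would give 0, so the method would report zero results rather than raise.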