class Bio::Blast::Fastacmd
DESCRIPTION
Retrieves FASTA-formatted sequences from a BLAST database using the NCBI
fastacmd command.
This class requires the ‘fastacmd’ command and a BLAST database
(formatted using the ‘-o’ option of ‘formatdb’).
USAGE
  require 'bio'

  fastacmd = Bio::Blast::Fastacmd.new("/db/myblastdb")

  entry = fastacmd.get_by_id("sp:128U_DROME")
  fastacmd.fetch("sp:128U_DROME")
  fastacmd.fetch(["sp:1433_SPIOL", "sp:1432_MAIZE"])

  fastacmd.fetch(["sp:1433_SPIOL", "sp:1432_MAIZE"]).each do |fasta|
    puts fasta
  end
Attributes
database
  Database file path.
fastacmd
  fastacmd command file path.
Public Class Methods
new(blast_database_file_path)
This method provides a handle to a BLASTable database, which you can then use to retrieve sequences.
Prerequisites:
- You have created a BLASTable database with the ‘-o T’ option.
- You have the NCBI fastacmd tool installed.
For example, suppose the original input file looks like:
  >my_seq_1
  ACCGACCTCCGGAACGGATAGCCCGACCTACG
  >my_seq_2
  TCCGACCTTTCCTACCGCACACCTACGCCATCAC
  ...
and you’ve created a BLASTable database from that with the command
  cd /my_dir/
  formatdb -i my_input_file -t Test -n Test -o T
then you can get a handle to this database with the command
  fastacmd = Bio::Blast::Fastacmd.new("/my_dir/Test")
Arguments:
- database: path and name of BLASTable database
  # File lib/bio/io/fastacmd.rb
  def initialize(blast_database_file_path)
    @database = blast_database_file_path
    @fastacmd = 'fastacmd'
  end
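The constructor only stores the database path and the command name; it does not appear to check that the fastacmd tool is actually installed. A minimal sketch of such a check, assuming the tool is looked up on PATH, is shown below. The helper `executable_on_path?` is hypothetical and not part of BioRuby.

```ruby
# Hypothetical helper (not part of BioRuby): reports whether an executable
# with the given name exists in any directory of a PATH-style string.
def executable_on_path?(name, path = ENV.fetch("PATH", ""))
  path.split(File::PATH_SEPARATOR).reject(&:empty?).any? do |dir|
    candidate = File.join(dir, name)
    File.file?(candidate) && File.executable?(candidate)
  end
end

# Warn early instead of failing later when a sequence is requested.
warn "fastacmd not found on PATH" unless executable_on_path?("fastacmd")
```

Running a check like this before creating the handle gives a clearer error than a failed command invocation later on.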
Public Instance Methods
each_entry
Iterates over all sequences in the database.
  fastacmd.each_entry do |fasta|
    p [ fasta.definition[0..30], fasta.seq.size ]
  end
Yields:
- a Bio::FastaFormat object for each iteration (the method itself returns self)
  # File lib/bio/io/fastacmd.rb
  def each_entry
    cmd = [ @fastacmd, '-d', @database, '-D', '1' ]
    Bio::Command.call_command(cmd) do |io|
      io.close_write
      Bio::FlatFile.open(Bio::FastaFormat, io) do |f|
        f.each_entry do |entry|
          yield entry
        end
      end
    end
    self
  end
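To illustrate what "one object per iteration" means for a FASTA stream, here is a plain-Ruby sketch that splits FASTA text into records. It is a simplified stand-in for the parsing step, not Bio::FlatFile or Bio::FastaFormat; the name `each_fasta_record` and the [definition, sequence] pair it yields are assumptions made for this example.

```ruby
require "stringio"

# Hypothetical stand-in for the FASTA parsing step: yields one
# [definition, sequence] pair per record read from an IO-like object.
def each_fasta_record(io)
  return enum_for(:each_fasta_record, io) unless block_given?

  definition = nil
  sequence = String.new
  io.each_line do |line|
    line = line.chomp
    if line.start_with?(">")
      # A new header closes the previous record, if there was one.
      yield(definition, sequence) unless definition.nil?
      definition = line[1..-1]
      sequence = String.new
    elsif definition
      sequence << line.strip
    end
  end
  # Emit the final record once the input is exhausted.
  yield(definition, sequence) unless definition.nil?
end

sample = StringIO.new(">my_seq_1\nACCG\nACCT\n>my_seq_2\nTCCG\n")
each_fasta_record(sample) { |definition, seq| p [definition, seq.size] }
```

The real method streams the output of the fastacmd command instead of an in-memory string, so a large database does not need to be held in memory at once.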
fetch(list)
Get the sequences for a list of IDs in the database.
For example:
p fastacmd.fetch(["sp:1433_SPIOL", "sp:1432_MAIZE"])
This method always returns an array of Bio::FastaFormat
objects, even when the result is a single entry.
Arguments:
- ids: list of IDs to retrieve from the database (a single ID string is also accepted)
Returns:
- array of Bio::FastaFormat objects
  # File lib/bio/io/fastacmd.rb
  def fetch(list)
    if list.respond_to?(:join)
      entry_id = list.join(",")
    else
      entry_id = list
    end

    cmd = [ @fastacmd, '-d', @database, '-s', entry_id ]
    Bio::Command.call_command(cmd) do |io|
      io.close_write
      Bio::FlatFile.new(Bio::FastaFormat, io).to_a
    end
  end
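The listing above accepts either an array of IDs or a single string and turns it into one comma-separated argument for the command. The following sketch isolates just that step so it can be tried without a BLAST database. `build_fetch_command` is a hypothetical helper written for illustration; it only builds the argument array and runs nothing.

```ruby
# Hypothetical helper mirroring how the fetch listing assembles its command.
# Anything that responds to #join (such as an Array) is collapsed into one
# comma-separated string; a plain String is passed through unchanged.
def build_fetch_command(database, ids, fastacmd = "fastacmd")
  entry_id = ids.respond_to?(:join) ? ids.join(",") : ids
  [fastacmd, "-d", database, "-s", entry_id]
end

p build_fetch_command("/db/myblastdb", ["sp:1433_SPIOL", "sp:1432_MAIZE"])
p build_fetch_command("/db/myblastdb", "sp:128U_DROME")
```

Passing the command as an array rather than a single string keeps each argument separate, which avoids shell quoting problems with unusual IDs or paths.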
get_by_id(entry_id)
Get the sequence of a specific entry in the BLASTable database. For example:
entry = fastacmd.get_by_id("sp:128U_DROME")
Arguments:
- id: id of an entry in the BLAST database
Returns:
- a Bio::FastaFormat object
  # File lib/bio/io/fastacmd.rb
  def get_by_id(entry_id)
    fetch(entry_id).shift
  end
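Because get_by_id takes the first element of the array returned by fetch, an ID that matches nothing would presumably produce nil, assuming fetch returns an empty array in that case (the documentation does not state this). The sketch below shows the pattern with stand-in methods backed by a Hash; `FAKE_DB`, `fetch_stub` and `get_by_id_stub` are hypothetical names and do not touch a real database.

```ruby
# Stand-in data: maps IDs to FASTA text instead of querying a BLAST database.
FAKE_DB = { "sp:128U_DROME" => ">sp:128U_DROME\nACGT" }.freeze

# Stand-in for fetch: always returns an array, possibly empty.
def fetch_stub(ids)
  Array(ids).map { |id| FAKE_DB[id] }.compact
end

# Same shape as the get_by_id listing: first element of the array, or nil.
def get_by_id_stub(id)
  fetch_stub(id).shift
end

entry = get_by_id_stub("sp:UNKNOWN")
puts(entry.nil? ? "no such entry" : entry)
```

Callers that cannot guarantee an ID exists may want a nil check like the one above before calling methods on the result.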