class Bio::NBRF

Sequence data class for NBRF/PIR flatfile format.

Constants

DELIMITER

Delimiter of each entry. Bio::FlatFile uses it.

DELIMITER_OVERRUN

(Integer) excess read size included in DELIMITER.

Attributes

accession[RW]

Returns ID described in the entry.

data[RW]

Raw sequence data of the entry. Whitespace and digits are not removed until seq is called.

definition[RW]

Returns the description line of the NBRF/PIR formatted data.

entry_id[RW]

Returns ID described in the entry.

entry_overrun[R]

Piece of the next entry, if the input contained one. Bio::FlatFile uses it.

seq_type[RW]

Returns sequence type described in the entry.

P1 (protein), F1 (protein fragment)
DL (DNA linear), DC (DNA circular)
RL (RNA linear), RC (RNA circular)
N3 (tRNA), N1 (other functional RNA)

Public Class Methods

new(str)

Creates a new NBRF object. It stores the comment and sequence information from one entry of the NBRF/PIR format string. If the argument contains more than one entry, only the first entry is used.

    # File lib/bio/db/nbrf.rb
    def initialize(str)
      str = str.sub(/\A[\r\n]+/, '') # remove first void lines
      line1, line2, rest = str.split(/^/, 3)

      rest = rest.to_s
      rest.sub!(/^>.*/m, '') # remove trailing entries for sure
      @entry_overrun = $&
      rest.sub!(/\*\s*\z/, '') # remove last '*' and "\n"
      @data = rest

      @definition = line2.to_s.chomp
      if /^>?([A-Za-z0-9]{2})\;(.*)/ =~ line1.to_s then
        @seq_type = $1
        @entry_id = $2
      end
    end
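
As a rough illustration of the parsing steps in the source above, the following standalone sketch applies the same regular expressions to a two-entry string. The helper parse_nbrf_entry and the sample entries are hypothetical and invented for this example; the sketch does not require BioRuby and is not the library's implementation.

```ruby
# Standalone sketch mirroring the parsing steps of Bio::NBRF#initialize.
# parse_nbrf_entry is a hypothetical helper, not part of BioRuby.
def parse_nbrf_entry(str)
  str = str.sub(/\A[\r\n]+/, '')          # drop leading blank lines
  line1, line2, rest = str.split(/^/, 3)  # header, description, remainder
  rest = rest.to_s.dup                    # dup: nil.to_s may be frozen
  overrun = rest.slice!(/^>.*/m)          # cut off everything from the next '>'
  rest.sub!(/\*\s*\z/, '')                # drop the terminating '*'
  fields = { definition: line2.to_s.chomp, data: rest, entry_overrun: overrun }
  if (m = /^>?([A-Za-z0-9]{2});(.*)/.match(line1.to_s))
    fields[:seq_type] = m[1]
    fields[:entry_id] = m[2]
  end
  fields
end

# Hypothetical input containing two entries; only the first one is parsed.
text = <<~NBRF
  >P1;TEST1
  Example protein - hypothetical entry
  MKVLAA
  GHTW*
  >DL;TEST2
  Example DNA - hypothetical entry
  ACGT*
NBRF

entry = parse_nbrf_entry(text)
puts entry[:seq_type]   # type code of the first entry
puts entry[:entry_id]   # identifier of the first entry
puts entry[:data]       # raw sequence lines, still containing newlines
```

The second entry is not lost: it ends up in the :entry_overrun field, which corresponds to the piece of the next entry that Bio::FlatFile uses.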

to_nbrf(hash)

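Creates NBRF/PIR formatted text from a hash with the keys :seq_type, :seq, :entry_id, :definition and :width. Any key can be omitted: a missing :seq_type is inferred from the class of :seq (falling back to 'XX'), and :width defaults to 70 characters per line. Pass :width => nil to disable line wrapping.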

    # File lib/bio/db/nbrf.rb
    def self.to_nbrf(hash)
      seq_type = hash[:seq_type]
      seq = hash[:seq]
      unless seq_type
        if seq.is_a?(Bio::Sequence::AA) then
          seq_type = 'P1'
        elsif seq.is_a?(Bio::Sequence::NA) then
          seq_type = /u/i =~ seq ? 'RL' : 'DL'
        else
          seq_type = 'XX'
        end
      end
      width = hash.has_key?(:width) ? hash[:width] : 70
      if width then
        seq = seq.to_s + "*"
        seq.gsub!(Regexp.new(".{1,#{width}}"), "\\0\n")
      else
        seq = seq.to_s + "*\n"
      end
      ">#{seq_type};#{hash[:entry_id]}\n#{hash[:definition]}\n#{seq}"
    end
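
To make the effect of the width option concrete, here is a minimal standalone sketch of the wrapping step shown in the source above, working on plain strings. The helper format_nbrf and its keyword arguments are hypothetical and exist only for this illustration; the sequence-type inference is left out.

```ruby
# Standalone sketch of the line-wrapping step in to_nbrf, using plain strings.
# format_nbrf is a hypothetical helper, not part of BioRuby.
def format_nbrf(seq_type:, entry_id:, definition:, seq:, width: 70)
  body = seq.to_s + "*"                 # '*' terminates the sequence
  body = if width
           # break the sequence (including '*') into lines of at most width chars
           body.gsub(/.{1,#{width}}/, "\\0\n")
         else
           body + "\n"                  # no wrapping: a single line
         end
  ">#{seq_type};#{entry_id}\n#{definition}\n#{body}"
end

wrapped = format_nbrf(seq_type: "P1", entry_id: "TEST1",
                      definition: "Example protein - hypothetical entry",
                      seq: "MKVLAAGHTW", width: 4)
unwrapped = format_nbrf(seq_type: "P1", entry_id: "TEST1",
                        definition: "Example protein - hypothetical entry",
                        seq: "MKVLAAGHTW", width: nil)
puts wrapped
puts unwrapped
```

Because the terminating '*' is appended before wrapping, it counts toward the line width and always sits at the end of the last sequence line.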

Public Instance Methods

aalen()

Returns the length of the protein (amino acid) sequence. Raises RuntimeError if the entry is a nucleic acid sequence, so call this method only when you know the sequence is a protein.

    # File lib/bio/db/nbrf.rb
    def aalen
      aaseq.length
    end

aaseq()

Returns the protein (amino acid) sequence. Raises RuntimeError if the entry is a nucleic acid sequence, so call this method only when you know the sequence is a protein. If the sequence type is unknown, the data is converted to Bio::Sequence::AA.

    # File lib/bio/db/nbrf.rb
    def aaseq
      if seq.is_a?(Bio::Sequence::NA) then
        raise 'not nucleic but protein sequence'
      elsif seq.is_a?(Bio::Sequence::AA) then
        seq
      else
        Bio::Sequence::AA.new(seq)
      end
    end

entry()

Returns the stored entry as NBRF/PIR formatted text (same as to_s).

    # File lib/bio/db/nbrf.rb
    def entry
      @entry = ">#{@seq_type or 'XX'};#{@entry_id}\n#{definition}\n#{@data}*\n"
    end

Also aliased as: to_s

length()

Returns sequence length.

    # File lib/bio/db/nbrf.rb
    def length
      seq.length
    end

nalen()

Returns the length of the nucleic acid sequence. Raises RuntimeError if the entry is a protein sequence, so call this method only when you know the sequence is a nucleic acid.

    # File lib/bio/db/nbrf.rb
    def nalen
      naseq.length
    end

naseq()

Returns the nucleic acid sequence. Raises RuntimeError if the entry is a protein sequence, so call this method only when you know the sequence is a nucleic acid. If the sequence type is unknown, the data is converted to Bio::Sequence::NA.

    # File lib/bio/db/nbrf.rb
    def naseq
      if seq.is_a?(Bio::Sequence::AA) then
        raise 'not nucleic but protein sequence'
      elsif seq.is_a?(Bio::Sequence::NA) then
        seq
      else
        Bio::Sequence::NA.new(seq)
      end
    end

seq()

Returns sequence data. Returns Bio::Sequence::NA, Bio::Sequence::AA or Bio::Sequence, according to the sequence type.

    # File lib/bio/db/nbrf.rb
    def seq
      unless defined?(@seq)
        @seq = seq_class.new(@data.tr(" \t\r\n0-9", '')) # lazy clean up
      end
      @seq
    end
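
The lazy clean-up in the source above can be tried in isolation. The sketch below is a standalone approximation that returns a plain String instead of a Bio::Sequence object; the class LazyEntry and the sample data are hypothetical and not part of BioRuby.

```ruby
# Standalone sketch of the lazy clean-up and memoization used by seq.
# LazyEntry is a hypothetical stand-in that returns a plain String.
class LazyEntry
  def initialize(data)
    @data = data   # raw text, possibly with spaces, digits and line breaks
  end

  def seq
    # String#tr with an empty replacement deletes the listed characters;
    # the result is computed once and reused on later calls.
    @seq = @data.tr(" \t\r\n0-9", '') unless defined?(@seq)
    @seq
  end
end

raw = "MKVLAA GHTW 10\nPQRS\tTUVW 20\r\n"
lazy = LazyEntry.new(raw)
puts lazy.seq
puts lazy.seq.length
```

Checking defined?(@seq) rather than the truthiness of @seq means the clean-up runs only on the first call, whatever value it produced.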

seq_class()

Returns Bio::Sequence::AA, Bio::Sequence::NA, or Bio::Sequence, depending on sequence type.

    # File lib/bio/db/nbrf.rb
    def seq_class
      case @seq_type
      when /[PF]1/
        # protein
        Sequence::AA
      when /[DR][LC]/, /N[13]/
        # nucleic
        Sequence::NA
      else
        Sequence
      end
    end
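
For a quick view of how the type codes map onto sequence kinds, the following standalone sketch reuses the patterns from the source above but returns symbols instead of Bio::Sequence classes. The helper seq_kind and the symbols :protein, :nucleic and :unknown are hypothetical names chosen for this illustration.

```ruby
# Standalone sketch of the type-code dispatch in seq_class.
# seq_kind is a hypothetical helper that returns a symbol instead of a class.
def seq_kind(seq_type)
  case seq_type
  when /[PF]1/             then :protein   # P1, F1
  when /[DR][LC]/, /N[13]/ then :nucleic   # DL, DC, RL, RC, N1, N3
  else                          :unknown   # anything else, including nil
  end
end

%w[P1 F1 DL DC RL RC N3 N1 XX].each do |code|
  puts "#{code}: #{seq_kind(code)}"
end
```

The patterns are not anchored, so any string that merely contains one of the codes is classified the same way; a nil type falls through to the generic case.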

to_s()
Alias for: entry