module Bio::Alignment::EnumerableExtension

Public Instance Methods

alignment_collect() { |str| ... }

Iterates over each sequence, collects the results of running the block, and returns them as a new alignment (a Bio::Alignment::SequenceArray object).

Redefine this method if you want to change the class of the return value.

# File lib/bio/alignment.rb, line 445
def alignment_collect
  a = SequenceArray.new
  a.set_all_property(get_all_property)
  each_seq do |str|
    a << yield(str)
  end
  a
end
alignment_concat(align)

Concatenates the given alignment. align must have each_seq or each method.

Returns self.

Note that it is a destructive method.

For Hash, use it carefully: the order of the sequences is not guaranteed, and key information is ignored completely.

# File lib/bio/alignment.rb, line 849
def alignment_concat(align)
  flag = nil
  a = []
  each_seq { |s| a << s }
  i = 0
  begin
    align.each_seq do |seq|
      flag = true
      a[i].concat(seq) if a[i] and seq
      i += 1
    end
    return self
  rescue NoMethodError, ArgumentError => evar
    raise evar if flag
  end
  align.each do |seq|
    a[i].concat(seq) if a[i] and seq
    i += 1
  end
  self
end
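The index-by-index concatenation described above can be sketched without loading BioRuby. This is a minimal stand-alone illustration on plain Arrays of Strings, not the library's implementation; the helper name `concat_rows!` is hypothetical.

```ruby
# Stand-alone sketch (does not load BioRuby): append each sequence of
# `other` to the sequence at the same index in `rows`, in place.
# `concat_rows!` is a hypothetical helper name, not a BioRuby method.
def concat_rows!(rows, other)
  other.each_with_index do |seq, i|
    # Skip positions where either side is missing, as the listing above does.
    rows[i].concat(seq) if rows[i] && seq
  end
  rows
end

rows  = ['ATG-'.dup, 'ATGC'.dup]
extra = ['CC', 'GG', 'TT'] # the third entry has no partner and is ignored
concat_rows!(rows, extra)
p rows # => ["ATG-CC", "ATGCGG"]
```

Because the rows are modified in place, callers that need the original alignment should duplicate it first.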
alignment_length()

Returns the alignment length, that is, the length of the longest sequence in the alignment.

# File lib/bio/alignment.rb, line 366
def alignment_length
  maxlen = 0
  each_seq do |s|
    x = s.length
    maxlen = x if x > maxlen
  end
  maxlen
end
Also aliased as: seq_length
alignment_lstrip!()

Removes excess gaps at the head of the sequences. Returns nil if nothing was removed; otherwise returns self.

Note that it is a destructive method.

# File lib/bio/alignment.rb, line 752
def alignment_lstrip!
  #(String-like)
  pos = 0
  each_site do |a|
    a.remove_gaps!
    if a.empty?
      pos += 1
    else
      break
    end
  end
  return nil if pos <= 0
  each_seq { |s| s[0, pos] = '' }
  self
end
Also aliased as: lstrip!
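The column-wise logic can be sketched without loading BioRuby. This is a minimal stand-alone illustration on plain Strings; it assumes `'-'` is the only gap character, and the helper name `lstrip_columns!` is hypothetical.

```ruby
# Stand-alone sketch: strip leading columns that contain only gaps.
# `lstrip_columns!` is a hypothetical name; '-' as the sole gap character
# is an assumption of this sketch.
def lstrip_columns!(rows, gap = '-')
  width = rows.map(&:length).max || 0
  pos = 0
  # Count leading columns in which every row has a gap (or no character).
  while pos < width && rows.all? { |s| s[pos].nil? || s[pos] == gap }
    pos += 1
  end
  return nil if pos.zero?          # nothing to remove
  rows.each { |s| s[0, pos] = '' } # drop the first `pos` characters in place
  rows
end

rows = ['--ATG'.dup, '-CATG'.dup, '---TG'.dup]
lstrip_columns!(rows)
p rows                   # => ["-ATG", "CATG", "--TG"]
p lstrip_columns!(rows)  # => nil (no all-gap column is left)
```

Only the first column is removed: the second column holds a `C` in one row, so the scan stops there even though the other rows still begin with gaps.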
alignment_normalize!()

Pads the tail of each sequence with gap characters if the sequence is shorter than the alignment length.

Note that it is a destructive method.

# File lib/bio/alignment.rb, line 712
def alignment_normalize!
  #(original)
  len = alignment_length
  each_seq do |s|
    s << (gap_char * (len - s.length)) if s.length < len
  end
  self
end
Also aliased as: normalize!
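The padding step can be sketched without loading BioRuby. This is a minimal stand-alone illustration on plain Strings; the helper name `normalize_rows!` and the default gap character `'-'` are assumptions of the sketch.

```ruby
# Stand-alone sketch: pad short rows with the gap character so that every
# row reaches the alignment length. `normalize_rows!` is a hypothetical name.
def normalize_rows!(rows, gap_char = '-')
  len = rows.map(&:length).max || 0 # alignment length = longest row
  rows.each do |s|
    s << (gap_char * (len - s.length)) if s.length < len
  end
  rows
end

rows = ['ATGCA'.dup, 'AT'.dup, 'ATG'.dup]
normalize_rows!(rows)
p rows # => ["ATGCA", "AT---", "ATG--"]
```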
alignment_rstrip!()

Removes excess gaps at the tail of the sequences. Returns nil if nothing was removed; otherwise returns self.

Note that it is a destructive method.

# File lib/bio/alignment.rb, line 727
def alignment_rstrip!
  #(String-like)
  len = alignment_length
  newlen = len
  each_site_step(len - 1, 0, -1) do |a|
    a.remove_gaps!
    if a.empty? then
      newlen -= 1
    else
      break
    end
  end
  return nil if newlen >= len
  each_seq do |s|
    s[newlen..-1] = '' if s.length > newlen
  end
  self
end
Also aliased as: rstrip!
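The right-to-left scan can be sketched without loading BioRuby. This is a minimal stand-alone illustration on plain Strings; it assumes `'-'` is the only gap character, and the helper name `rstrip_columns!` is hypothetical.

```ruby
# Stand-alone sketch: strip trailing columns that contain only gaps.
# `rstrip_columns!` is a hypothetical name, not a BioRuby method.
def rstrip_columns!(rows, gap = '-')
  len = rows.map(&:length).max || 0
  newlen = len
  # Walk columns from the right while every row has a gap or no character.
  while newlen > 0 &&
        rows.all? { |s| s[newlen - 1].nil? || s[newlen - 1] == gap }
    newlen -= 1
  end
  return nil if newlen >= len # nothing to remove
  # Truncate only the rows that actually extend past the new length.
  rows.each { |s| s[newlen..-1] = '' if s.length > newlen }
  rows
end

rows = ['ATG--'.dup, 'AT---'.dup, 'ATGC-'.dup, 'AT'.dup]
rstrip_columns!(rows)
p rows                   # => ["ATG-", "AT--", "ATGC", "AT"]
p rstrip_columns!(rows)  # => nil
```

The short row `"AT"` is left untouched because it never reached the removed column.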
alignment_site(position)

Gets the site at the given position. Returns a Bio::Alignment::Site object.

If the position is out of range, it returns a site consisting entirely of gaps.

# File lib/bio/alignment.rb, line 403
def alignment_site(position)
  site = _alignment_site(position)
  site.set_all_property(get_all_property)
  site
end
alignment_slice(*arg)

Returns the specified range of the alignment. The 'slice' method (which may be String#slice, the same as String#[]) is called on each sequence, and the results are returned as a new alignment (a Bio::Alignment::SequenceArray object).

Unlike #alignment_window, the resulting alignment might contain nil.

To change the class of the return value, redefine the #alignment_collect method.

# File lib/bio/alignment.rb, line 807
def alignment_slice(*arg)
  #(String-like)
  #(BioPerl) AlignI::slice like method
  alignment_collect do |s|
    s.slice(*arg)
  end
end
Also aliased as: slice
alignment_strip!()

Removes excess gaps at both ends of the sequences. Returns nil if nothing was removed; otherwise returns self.

Note that it is a destructive method.

# File lib/bio/alignment.rb, line 774
def alignment_strip!
  #(String-like)
  r = alignment_rstrip!
  l = alignment_lstrip!
  (r or l)
end
Also aliased as: strip!
alignment_subseq(*arg)

The 'subseq' method (Bio::Sequence::Common#subseq is expected) is called on each sequence, and the results are returned as a new alignment (a Bio::Alignment::SequenceArray object).

All sequences in the alignment are expected to be Bio::Sequence::NA or Bio::Sequence::AA objects.

Unlike #alignment_window, the resulting alignment might contain nil.

To change the class of the return value, redefine the #alignment_collect method.

# File lib/bio/alignment.rb, line 829
def alignment_subseq(*arg)
  #(original)
  alignment_collect do |s|
    s.subseq(*arg)
  end
end
Also aliased as: subseq
alignment_window(*arg)

Returns the specified range of the alignment. The '[]' method (which may be String#[]) is called on each sequence, and the results are returned as a new alignment (a Bio::Alignment::SequenceArray object).

Unlike #alignment_slice, the resulting alignment never contains nil: where the specified range is out of range for a sequence, an empty sequence (seqclass.new('')) is stored instead.

To change the class of the return value, redefine the #alignment_collect method.

# File lib/bio/alignment.rb, line 466
def alignment_window(*arg)
  alignment_collect do |s|
    s[*arg] or seqclass.new('')
  end
end
Also aliased as: window
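The difference between #alignment_slice and #alignment_window comes down to how an out-of-range request on a short sequence is handled. A minimal stand-alone sketch on plain Strings (it does not load BioRuby, and the variable names are illustrative only):

```ruby
rows = ['ATGCATGC', 'ATG']

# Like alignment_slice: String#slice returns nil when the start position
# lies beyond the end of the string.
sliced = rows.map { |s| s.slice(4, 3) }

# Like alignment_window: a nil result is replaced by an empty sequence.
windowed = rows.map { |s| s[4, 3] || String.new('') }

p sliced   # => ["ATG", nil]
p windowed # => ["ATG", ""]
```

Code that later calls String methods on every row is therefore safer with the window form.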
collect_each_site() { |site| ... }

Iterates over each site of the alignment, yielding a Bio::Alignment::Site object, and returns an array of the results of running the block.

# File lib/bio/alignment.rb, line 503
def collect_each_site
  ary = []
  each_site do |site|
    ary << yield(site)
  end
  ary
end
consensus_each_site(opt = {}) { |a| ... }

Helper method for calculating a consensus sequence. It iterates over each site of the alignment, removing gaps at each site if opt specifies it, and yields a Bio::Alignment::Site object. The results of running the block (String objects are expected) are joined into a single string, which is returned. When the block returns nil or false for a site, opt[:missing_char] (or the alignment's missing_char) is used instead.

opt[:gap_mode] ==> 0 -- gaps are regarded as normal characters
                   1 -- a site containing any gap is regarded as a gap
                  -1 -- gaps are eliminated from consensus calculation
    default: 0
# File lib/bio/alignment.rb, line 523
def consensus_each_site(opt = {})
  mchar = (opt[:missing_char] or self.missing_char)
  gap_mode = opt[:gap_mode]
  case gap_mode
  when 0, nil
    collect_each_site do |a|
      yield(a) or mchar
    end.join('')
  when 1
    collect_each_site do |a|
      a.has_gap? ? gap_char : (yield(a) or mchar)
    end.join('')
  when -1
    collect_each_site do |a|
      a.remove_gaps!
      a.empty? ? gap_char : (yield(a) or mchar)
    end.join('')
  else
    raise ':gap_mode must be 0, 1 or -1'
  end
end
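The three :gap_mode behaviours can be compared in a stand-alone sketch on plain Strings. It does not load BioRuby; `columns`, `consensus_by_site` and `unanimous` are hypothetical names, and the gap character `'-'` and missing character `'?'` are assumptions of the sketch.

```ruby
# Split rows into columns; positions past the end of a row count as gaps.
def columns(rows, gap = '-')
  width = rows.map(&:length).max || 0
  (0...width).map { |i| rows.map { |s| s[i] || gap } }
end

# The block receives one column (an Array of characters) and returns a
# String, or nil when it cannot decide.
def consensus_by_site(rows, gap_mode: 0, gap: '-', missing: '?')
  columns(rows, gap).map do |col|
    case gap_mode
    when 0  # gaps are ordinary characters
      yield(col) || missing
    when 1  # any gap in the column makes the whole site a gap
      col.include?(gap) ? gap : (yield(col) || missing)
    when -1 # gaps are dropped before the block runs
      residues = col.reject { |c| c == gap }
      residues.empty? ? gap : (yield(residues) || missing)
    else
      raise ArgumentError, ':gap_mode must be 0, 1 or -1'
    end
  end.join
end

# Example block: report a character only when the whole column agrees.
unanimous = ->(col) { col.uniq.size == 1 ? col.first : nil }

rows = ['AT-G', 'AT-G', 'A-CG']
puts consensus_by_site(rows, gap_mode: 0, &unanimous)  # A??G
puts consensus_by_site(rows, gap_mode: 1, &unanimous)  # A--G
puts consensus_by_site(rows, gap_mode: -1, &unanimous) # ATCG
```

With mode 0 the gap counts as a disagreeing character, with mode 1 it dominates the column, and with mode -1 it is ignored, so the remaining residues can still agree.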
consensus_iupac(opt = {})

Returns the IUPAC consensus string of an alignment of nucleic-acid sequences.

It resembles BioPerl's AlignI::consensus_iupac method.

Please refer to the #consensus_each_site method for opt.

# File lib/bio/alignment.rb, line 565
def consensus_iupac(opt = {})
  consensus_each_site(opt) do |a|
    a.consensus_iupac
  end
end
consensus_string(threshold = 1.0, opt = {})

Returns the consensus string of the alignment. threshold is expected to satisfy 0.0 <= threshold <= 1.0.

It resembles BioPerl's AlignI::consensus_string method.

Please refer to the #consensus_each_site method for opt.

# File lib/bio/alignment.rb, line 552
def consensus_string(threshold = 1.0, opt = {})
  consensus_each_site(opt) do |a|
    a.consensus_string(threshold)
  end
end
convert_match(match_char = '.')

This method is similar to BioPerl's AlignI::match.

In the second through last sequences, changes each site to match_char (default: '.') when it is equal to the corresponding site of the first sequence.

Note that it is a destructive method.

For Hash, use it carefully because the order of the sequences is not guaranteed.

# File lib/bio/alignment.rb, line 662
def convert_match(match_char = '.')
  #(BioPerl) AlignI::match like method
  len = alignment_length
  firstseq = nil
  each_seq do |s|
    unless firstseq then
      firstseq = s
    else
      (0...len).each do |i|
        if s[i] and firstseq[i] == s[i] and !is_gap?(firstseq[i..i])
          s[i..i] = match_char
        end
      end
    end
  end
  self
end
convert_unmatch(match_char = '.')

This method is similar to BioPerl's AlignI::unmatch.

In the second through last sequences, changes each match_char (default: '.') back to the character at the corresponding site of the first sequence.

Note that it is a destructive method.

For Hash, use it carefully because the order of the sequences is not guaranteed.

# File lib/bio/alignment.rb, line 690
def convert_unmatch(match_char = '.')
  #(BioPerl) AlignI::unmatch like method
  len = alignment_length
  firstseq = nil
  each_seq do |s|
    unless firstseq then
      firstseq = s
    else
      (0...len).each do |i|
        if s[i..i] == match_char then
          s[i..i] = (firstseq[i..i] or match_char)
        end
      end
    end
  end
  self
end
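#convert_match and #convert_unmatch are inverse operations, which a stand-alone round trip on plain Strings can illustrate. The sketch below does not load BioRuby; `to_match!` and `from_match!` are hypothetical names, and `'-'` as the only gap character is an assumption.

```ruby
# Replace residues identical to the first row with match_char (gaps excluded).
def to_match!(rows, match_char = '.', gap = '-')
  first, *rest = rows
  rest.each do |s|
    (0...s.length).each do |i|
      s[i] = match_char if first[i] == s[i] && s[i] != gap
    end
  end
  rows
end

# Restore the original residues by copying them back from the first row.
def from_match!(rows, match_char = '.')
  first, *rest = rows
  rest.each do |s|
    (0...s.length).each do |i|
      s[i] = first[i] if s[i] == match_char && first[i]
    end
  end
  rows
end

rows = ['ATG-CA'.dup, 'ATC-CA'.dup, 'TTG-CC'.dup]
to_match!(rows)
p rows # => ["ATG-CA", "..C-..", "T..-.C"]
from_match!(rows)
p rows # => ["ATG-CA", "ATC-CA", "TTG-CC"]
```

The round trip only works if match_char does not already occur as a real character in the sequences, since every occurrence is overwritten on the way back.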
each_seq() { |seq| ... }

Iterates over each sequence, yielding it to the block. It acts the same as Enumerable#each.

Redefine this method as appropriate for the class or object.

# File lib/bio/alignment.rb, line 340
def each_seq(&block) #:yields: seq
  each(&block)
end
each_site() { |site| ... }

Iterates over each site of the alignment. It yields a Bio::Alignment::Site object (a subclass of Array) and returns self.

# File lib/bio/alignment.rb, line 412
def each_site
  cp = get_all_property
  (0...alignment_length).each do |i|
    site = _alignment_site(i)
    site.set_all_property(cp)
    yield(site)
  end
  self
end
each_site_step(start, stop, step = 1) { |site| ... }

Iterates over the sites of the alignment from start to stop in increments of step. It yields a Bio::Alignment::Site object (a subclass of Array) and returns self. It is equivalent to start.step(stop, step) { |i| yield alignment_site(i) }.

# File lib/bio/alignment.rb, line 428
def each_site_step(start, stop, step = 1)
  cp = get_all_property
  start.step(stop, step) do |i|
    site = _alignment_site(i)
    site.set_all_property(cp)
    yield(site)
  end
  self
end
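The stepping behaviour, including a negative step for right-to-left traversal, can be sketched on plain Strings. This stand-alone illustration does not load BioRuby; `site_at` is a hypothetical lambda that stands in for a site, and `'-'` as the filler for missing positions is an assumption.

```ruby
rows = ['ATGC', 'A-GC', 'AT-C']

# A column ("site") is the characters at one position across all rows.
site_at = ->(i) { rows.map { |s| s[i] || '-' } }

# Visit every second site from left to right.
forward = []
0.step(3, 2) { |i| forward << site_at.call(i).join }

# Visit all sites from right to left with a negative step.
backward = []
3.step(0, -1) { |i| backward << site_at.call(i).join }

p forward  # => ["AAA", "GG-"]
p backward # => ["CCC", "GG-", "T-T", "AAA"]
```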
each_window(window_size, step_size = 1) { |alignment_window(i, window_size)| ... }

Iterates over each sliding window of the alignment. window_size is the size of the sliding window, and step_size is the distance the window moves at each step. It yields a Bio::Alignment::SequenceArray object containing each window, and returns a Bio::Alignment::SequenceArray object containing the remainder of the alignment at the terminal end. If window_size is smaller than 0, it returns nil.

# File lib/bio/alignment.rb, line 481
def each_window(window_size, step_size = 1)
  return nil if window_size < 0
  if step_size >= 0 then
    last_step = nil
    0.step(alignment_length - window_size, step_size) do |i|
      yield alignment_window(i, window_size)
      last_step = i
    end
    alignment_window((last_step + window_size)..-1)
  else
    i = alignment_length - window_size
    while i >= 0
      yield alignment_window(i, window_size)
      i += step_size
    end
    alignment_window(0...(i-step_size))
  end
end
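The forward (positive step) case can be sketched on plain Strings. This stand-alone illustration does not load BioRuby; `each_window_sketch` is a hypothetical name, and returning the whole alignment when no full window fits is a choice made for this sketch only.

```ruby
# Yield consecutive windows of `window_size` columns, advancing by
# `step_size`, and return whatever is left after the last full window.
def each_window_sketch(rows, window_size, step_size = 1)
  raise ArgumentError, 'step_size must be positive' unless step_size > 0
  len = rows.map(&:length).max || 0
  last = nil
  0.step(len - window_size, step_size) do |i|
    yield rows.map { |s| s[i, window_size] || '' }
    last = i
  end
  # Sketch-only choice: with no full window, the remainder starts at 0.
  start = last ? last + window_size : 0
  rows.map { |s| s[start..-1] || '' }
end

rows = ['ATGCATGC', 'ATGCATG-']
windows = []
remainder = each_window_sketch(rows, 3, 3) { |w| windows << w }

p windows   # => [["ATG", "ATG"], ["CAT", "CAT"]]
p remainder # => ["GC", "G-"]
```

With an 8-column alignment, a window of 3 and a step of 3, two full windows fit and the last two columns are returned as the remainder.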
lstrip!()
Alias for: alignment_lstrip!
match_line(opt = {})

Returns the match line string of an alignment of nucleic- or amino-acid sequences. The sequence type is determined automatically unless it is specified with opt.

It resembles BioPerl's AlignI::match_line method.

opt[:type] ==> :aa, or :na, :dna or :rna (otherwise determined by sequence class or content)
opt[:match_line_char]   ==> 100% equal    default: '*'
opt[:strong_match_char] ==> strong match  default: ':'
opt[:weak_match_char]   ==> weak match    default: '.'
opt[:mismatch_char]     ==> mismatch      default: ' '
  :strong_ and :weak_match_char are used only in amino mode (:aa)

Additional options are accepted. Please refer to the #consensus_each_site method for opt.

# File lib/bio/alignment.rb, line 624
def match_line(opt = {})
  case opt[:type]
  when :aa
    amino = true
  when :na, :dna, :rna
    amino = false
  else
    if seqclass == Bio::Sequence::AA then
      amino = true
    elsif seqclass == Bio::Sequence::NA then
      amino = false
    else
      amino = nil
      self.each_seq do |x|
        if /[EFILPQ]/i =~ x
          amino = true
          break
        end
      end
    end
  end
  if amino then
    match_line_amino(opt)
  else
    match_line_nuc(opt)
  end
end
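When neither opt[:type] nor the sequence class settles the type, the listing above falls back to a character test. A stand-alone sketch of that heuristic on plain Strings (it does not load BioRuby; `looks_like_amino?` is a hypothetical name):

```ruby
# E, F, I, L, P and Q are amino-acid letters that are not IUPAC nucleotide
# codes, so finding any of them marks the alignment as amino acid.
def looks_like_amino?(rows)
  rows.any? { |s| s =~ /[EFILPQ]/i }
end

p looks_like_amino?(['ATGCATGC', 'ATGNNNGC']) # => false
p looks_like_amino?(['MKVLAT', 'MKV-AT'])     # => true  (contains L)
p looks_like_amino?(['MKVAT'])                # => false (no marker letter)
```

The last call shows the limit of the heuristic: a short peptide made only of letters that are also nucleotide codes is treated as nucleic acid, so passing opt[:type] explicitly is the safer choice for such input.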
match_line_amino(opt = {})

Returns the match line string of an alignment of amino-acid sequences.

It resembles BioPerl's AlignI::match_line method.

opt[:match_line_char]   ==> 100% equal    default: '*'
opt[:strong_match_char] ==> strong match  default: ':'
opt[:weak_match_char]   ==> weak match    default: '.'
opt[:mismatch_char]     ==> mismatch      default: ' '

Additional options are accepted. Please refer to the #consensus_each_site method for opt.

# File lib/bio/alignment.rb, line 584
def match_line_amino(opt = {})
  collect_each_site do |a|
    a.match_line_amino(opt)
  end.join('')
end
match_line_nuc(opt = {})

Returns the match line string of an alignment of nucleic-acid sequences.

It resembles BioPerl's AlignI::match_line method.

opt[:match_line_char]   ==> 100% equal    default: '*'
opt[:mismatch_char]     ==> mismatch      default: ' '

Additional options are accepted. Please refer to the #consensus_each_site method for opt.

# File lib/bio/alignment.rb, line 601
def match_line_nuc(opt = {})
  collect_each_site do |a|
    a.match_line_nuc(opt)
  end.join('')
end
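A nucleotide match line can be approximated on plain Strings as follows. This stand-alone sketch does not load BioRuby and does not reproduce Bio::Alignment::Site#match_line_nuc exactly; `nuc_match_line` is a hypothetical name, and treating any column that contains a gap as a mismatch is an assumption of the sketch.

```ruby
# Emit `match` for columns where every row has the same non-gap character,
# and `mismatch` otherwise.
def nuc_match_line(rows, match: '*', mismatch: ' ', gap: '-')
  width = rows.map(&:length).max || 0
  (0...width).map do |i|
    col = rows.map { |s| s[i] || gap }
    identical = col.uniq.size == 1 && col.first != gap
    identical ? match : mismatch
  end.join
end

rows = ['ATGCA-', 'ATGGA-', 'ATGCAT']
line = nuc_match_line(rows)
puts rows
puts line # "*** * " (columns 4 and 6 differ or contain gaps)
```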
normalize!()
Alias for: alignment_normalize!
number_of_sequences()

Returns the number of sequences in this alignment.

# File lib/bio/alignment.rb, line 1315
def number_of_sequences
  i = 0
  self.each_seq { |s| i += 1 }
  i
end
remove_all_gaps!()

Removes all gaps from the sequences. Returns nil if nothing was removed; otherwise returns self.

Note that it is a destructive method.

# File lib/bio/alignment.rb, line 787
def remove_all_gaps!
  ret = nil
  each_seq do |s|
    x = s.gsub!(gap_regexp, '')
    ret ||= x
  end
  ret ? self : nil
end
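The nil-or-self return convention follows from String#gsub!, which returns nil when it changes nothing. A stand-alone sketch on plain Strings (it does not load BioRuby; `remove_gaps!` is a hypothetical name, and the pattern `/[-.]/` is an assumed stand-in for the alignment's gap_regexp):

```ruby
# Delete every gap character in place; report whether anything changed.
def remove_gaps!(rows, gap_regexp = /[-.]/)
  changed = false
  rows.each do |s|
    # gsub! returns nil when the string contained no match.
    changed = true if s.gsub!(gap_regexp, '')
  end
  changed ? rows : nil
end

rows = ['A-TG.'.dup, 'ATG'.dup]
remove_gaps!(rows)
p rows               # => ["ATG", "ATG"]
p remove_gaps!(rows) # => nil (no gaps left)
```

After this operation the rows are no longer aligned column by column, so it is best applied to a copy when the alignment is still needed.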
rstrip!()
Alias for: alignment_rstrip!
seq_length()
Alias for: alignment_length
seqclass()

Returns the class of the sequences. If the instance variable @seqclass (which can be set with the 'seqclass=' method) is set, simply returns its value. Otherwise, returns the class of the first non-nil sequence. If no sequences are found, returns String.

# File lib/bio/alignment.rb, line 349
def seqclass
  if (defined? @seqclass) and @seqclass then
    @seqclass
  else
    klass = nil
    each_seq do |s|
      if s then
        klass = s.class
        break if klass
      end
    end
    (klass or String)
  end
end
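The fallback order, without the @seqclass override, can be sketched on plain Arrays. This stand-alone illustration does not load BioRuby, and `first_class` is a hypothetical name.

```ruby
# Return the class of the first non-nil entry, or String when there is none.
def first_class(rows)
  first = rows.find { |s| s }
  first ? first.class : String
end

p first_class([nil, 'ATG']) # => String
p first_class([nil, :atg])  # => Symbol
p first_class([])           # => String (fallback for an empty alignment)
```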
sequence_names()

Returns an array of sequence names. The order of the names must be the same as the order of each_seq.

# File lib/bio/alignment.rb, line 1324
def sequence_names
  (0...(self.number_of_sequences)).to_a
end
slice(*arg)
Alias for: alignment_slice
strip!()
Alias for: alignment_strip!
subseq(*arg)
Alias for: alignment_subseq
window(*arg)
Alias for: alignment_window