Class Bio::FlatFileIndex
In: lib/bio/io/flatfile/bdb.rb  (CVS)
lib/bio/io/flatfile/index.rb  (CVS)
lib/bio/io/flatfile/indexer.rb  (CVS)
Parent: Object

Bio::FlatFileIndex is a class for reading and searching OBDA (Open Bio Database Access) flatfile indexes.

Classes and Modules

Module Bio::FlatFileIndex::BDB_1
Module Bio::FlatFileIndex::BDBdefault
Module Bio::FlatFileIndex::DEBUG
Module Bio::FlatFileIndex::Flat_1
Module Bio::FlatFileIndex::Indexer
Module Bio::FlatFileIndex::Template
Class Bio::FlatFileIndex::BDBwrapper
Class Bio::FlatFileIndex::DataBank
Class Bio::FlatFileIndex::FileID
Class Bio::FlatFileIndex::FileIDs
Class Bio::FlatFileIndex::NameSpaces
Class Bio::FlatFileIndex::Results

Constants

MAGIC_FLAT = 'flat/1'   magic string for flat/1 index
MAGIC_BDB = 'BerkeleyDB/1'   magic string for BerkeleyDB/1 index

Public Class methods

[Source]

# File lib/bio/io/flatfile/indexer.rb, line 716
    def self.formatstring2class(format_string)
      case format_string
      when /genbank/i
        dbclass = Bio::GenBank
      when /genpept/i
        dbclass = Bio::GenPept
      when /embl/i
        dbclass = Bio::EMBL
      when /sptr/i
        dbclass = Bio::SPTR
      when /fasta/i
        dbclass = Bio::FastaFormat
      else
        # Report the unrecognized format string passed by the caller.
        raise "Unsupported format : #{format_string}"
      end
    end
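
The format lookup above can be tried in isolation. The following is a minimal sketch, not the library code: `format_to_symbol` is a hypothetical name, and it returns symbols instead of Bio classes so that it runs without BioRuby installed.

```ruby
# Hypothetical stand-in for formatstring2class: returns a symbol instead of
# a Bio class so the sketch runs without BioRuby.
def format_to_symbol(format_string)
  case format_string
  when /genbank/i then :genbank
  when /genpept/i then :genpept
  when /embl/i    then :embl
  when /sptr/i    then :sptr
  when /fasta/i   then :fasta
  else
    raise "Unsupported format : #{format_string}"
  end
end

puts format_to_symbol('GenBank')   # matching is case-insensitive
puts format_to_symbol('fasta')
```

Because the patterns are unanchored, any string that contains a format name matches, and the first matching branch wins.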

[Source]

# File lib/bio/io/flatfile/indexer.rb, line 733
    def self.makeindex(is_bdb, dbname, format, options, *files)
      if format then
        dbclass = formatstring2class(format)
      else
        dbclass = Bio::FlatFile.autodetect_file(files[0])
        raise "Cannot determine format" unless dbclass
        DEBUG.print "file format is #{dbclass}\n"
      end

      options = {} unless options
      pns = options['primary_namespace']
      sns = options['secondary_namespaces']

      parser = Indexer::Parser.new(dbclass, pns, sns)

      #if /(EMBL|SPTR)/ =~ dbclass.to_s then
        #a = [ 'DR' ]
        #parser.add_secondary_namespaces(*a)
      #end
      if sns = options['additional_secondary_namespaces'] then
        parser.add_secondary_namespaces(*sns)
      end

      if is_bdb then
        Indexer::makeindexBDB(dbname, parser, options, *files)
      else
        Indexer::makeindexFlat(dbname, parser, options, *files)
      end
    end

Opens an existing databank. A databank is a directory that contains indexed files and configuration files. The type of the databank (flat or BerkeleyDB) is determined automatically.

Unlike FlatFileIndex.open, this method does not accept a block.

[Source]

# File lib/bio/io/flatfile/index.rb, line 113
    def initialize(name)
      @db = DataBank.open(name)
    end

Opens an existing databank. A databank is a directory that contains indexed files and configuration files. The type of the databank (flat or BerkeleyDB) is determined automatically.

If a block is given, the databank object is passed to the block, and the databank is closed automatically when the block terminates.

[Source]

# File lib/bio/io/flatfile/index.rb, line 88
    def self.open(name)
      if block_given? then
        begin
          i = self.new(name)
          r = yield i
        ensure
          if i then
            begin
              i.close
            rescue IOError
            end
          end
        end
      else
        r = self.new(name)
      end
      r
    end
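
The block form follows the usual Ruby open-yield-ensure pattern. The sketch below reproduces that pattern with a hypothetical stand-in class, `FakeIndex`, which only tracks open/closed state; it assumes that closing an already closed handle raises IOError, which is what the rescue clause in the source above suggests.

```ruby
# Hypothetical stand-in for a databank handle; it only tracks open/closed state.
class FakeIndex
  def initialize(name)
    @name = name
    @closed = false
  end

  # Assumption: closing twice raises IOError.
  def close
    raise IOError, 'closed databank' if @closed
    @closed = true
    nil
  end

  def closed?
    @closed
  end

  # With a block: yield the object, always close it, return the block's value.
  # Without a block: return the open object and leave closing to the caller.
  def self.open(name)
    return new(name) unless block_given?

    index = new(name)
    begin
      yield index
    ensure
      begin
        index.close
      rescue IOError
        # Already closed inside the block; nothing left to do.
      end
    end
  end
end

handle = nil
result = FakeIndex.open('demo') do |idx|
  handle = idx
  :block_value
end
puts handle.closed?
```

The block's return value becomes the return value of `open`, and the handle is closed even if the block raises.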

[Source]

# File lib/bio/io/flatfile/indexer.rb, line 763
    def self.update_index(dbname, format, options, *files)
      if format then
        # Resolve the format string to a database class before building the parser.
        parser = Indexer::Parser.new(formatstring2class(format))
      else
        parser = nil
      end
      Indexer::update_index(dbname, parser, options, *files)
    end

Public Instance methods

If true, a consistency check is performed every time a flatfile is accessed. If nil or false, no checks are performed.

By default, always_check_consistency is true.

[Source]

# File lib/bio/io/flatfile/index.rb, line 297
    def always_check_consistency(bool)
      @db.always_check
    end

If true is given, a consistency check is performed every time a flatfile is accessed. If nil or false is given, no checks are performed.

By default, always_check_consistency is true.

[Source]

# File lib/bio/io/flatfile/index.rb, line 288
    def always_check_consistency=(bool)
      @db.always_check=(bool)
    end

Checks consistency between the databank (index) and the original flat files.

Raises RuntimeError if the original flat files have changed since the databank was created.

Note that this check compares only file sizes, as described in the OBDA specification.

[Source]

# File lib/bio/io/flatfile/index.rb, line 278
    def check_consistency
      check_closed?
      @db.check_consistency
    end
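
A size-only check is cheap but has a blind spot, which the following sketch illustrates. It is not the library's implementation: `record_sizes` and `check_sizes` are hypothetical helpers, and a temporary file stands in for a flat file.

```ruby
require 'tempfile'

# Hypothetical helper: remember each file's size at indexing time.
def record_sizes(paths)
  paths.each_with_object({}) { |path, sizes| sizes[path] = File.size(path) }
end

# Hypothetical helper: raise RuntimeError if any recorded size has changed.
def check_sizes(recorded)
  recorded.each do |path, size|
    raise "size mismatch: #{path}" unless File.size(path) == size
  end
  true
end

file = Tempfile.new('flatfile')
file.write(">seq1\nACGT\n")
file.flush
sizes = record_sizes([file.path])
check_sizes(sizes)   # passes: nothing has changed

# An edit that keeps the size the same goes undetected.
File.write(file.path, ">seq1\nTTTT\n")
check_sizes(sizes)

# Appending a record changes the size, so the check now raises.
File.open(file.path, 'a') { |f| f.write(">seq2\nGG\n") }
begin
  check_sizes(sizes)
rescue RuntimeError => e
  puts "detected: #{e.message}"
end
```

If same-size edits are a concern, the index should be rebuilt whenever the flat files are modified rather than relying on this check alone.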

Closes the databank. Returns nil.

[Source]

# File lib/bio/io/flatfile/index.rb, line 132
    def close
      check_closed?
      @db.close
      @db = nil
    end

Returns true if already closed. Otherwise, returns false.

[Source]

# File lib/bio/io/flatfile/index.rb, line 139
    def closed?
      if @db then
        false
      else
        true
      end
    end

Returns default namespaces. Returns an array of strings or nil. nil means all namespaces.

[Source]

# File lib/bio/io/flatfile/index.rb, line 172
    def default_namespaces
      @names
    end

Sets the default namespaces. default_namespaces = nil means all namespaces in the databank.

default_namespaces = [ str1, str2, … ] sets the default namespaces to str1, str2, …

Default namespaces set with this method affect only the get_by_id, search, and include? methods.

The default value is nil, so all namespaces are searched by default.

[Source]

# File lib/bio/io/flatfile/index.rb, line 160
    def default_namespaces=(names)
      if names then
        @names = []
        names.each { |x| @names.push(x.dup) }
      else
        @names = nil
      end
    end
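
The setter above copies each name rather than keeping the caller's strings. The sketch below shows the effect with a hypothetical stand-in class, `NamespaceHolder`, that mirrors only these two accessors; it uses `map(&:dup)` in place of the explicit loop.

```ruby
# Hypothetical stand-in that mirrors only the default_namespaces accessors.
class NamespaceHolder
  def default_namespaces
    @names
  end

  def default_namespaces=(names)
    # Copy each string so later changes to the caller's strings do not leak in.
    @names = names ? names.map(&:dup) : nil
  end
end

holder = NamespaceHolder.new
wanted = [String.new('ACCESSION'), String.new('VERSION')]
holder.default_namespaces = wanted
wanted.first << '_changed'   # mutate the caller's string afterwards
puts holder.default_namespaces.inspect
```

Assigning nil afterwards restores the "all namespaces" behaviour.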

Common interface defined in registry.rb. Searches the databank and returns the matching entry (or entries) as a string. Multiple entries may be returned, concatenated into one string. Returns an empty string if nothing is found.

[Source]

# File lib/bio/io/flatfile/index.rb, line 122
    def get_by_id(key)
      search(key).to_s
    end

Searches the databank. If entries are found, returns an array of unique IDs (primary identifiers). If nothing is found, returns nil.

Because only IDs are returned, not the entries themselves, this method is useful when the search result is very large and the search method would be very slow.

[Source]

# File lib/bio/io/flatfile/index.rb, line 210
    def include?(key)
      check_closed?
      if @names then
        r = @db.search_namespaces_get_unique_id(key, *@names)
      else
        r = @db.search_all_get_unique_id(key)
      end
      if r.empty? then
        nil
      else
        r
      end
    end
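
Despite the question-mark name, the return value is an array or nil rather than strictly true or false, so it can be tested and used in one step. A minimal sketch of that contract, using a plain hash as a hypothetical stand-in for the databank and a hypothetical helper `ids_for`:

```ruby
# Mirrors the return contract of include?: an array of IDs, or nil if none.
# `index` is a plain hash standing in for a databank (key => unique IDs).
def ids_for(index, key)
  found = index.fetch(key, [])
  found.empty? ? nil : found
end

index = {
  'AB000100' => ['AB000100'],
  'P12345'   => %w[ENTRY_A ENTRY_B]
}

if (ids = ids_for(index, 'P12345'))
  puts "found #{ids.size} entries"
else
  puts 'not found'
end
```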

Same as include?, but searches only the specified namespaces.

[Source]

# File lib/bio/io/flatfile/index.rb, line 226
    def include_in_namespaces?(key, *names)
      check_closed?
      r = @db.search_namespaces_get_unique_id(key, *names)
      if r.empty? then
        nil
      else
        r
      end
    end

Same as include?, but searches only the primary namespace.

[Source]

# File lib/bio/io/flatfile/index.rb, line 238
    def include_in_primary?(key)
      check_closed?
      r = @db.search_primary_get_unique_id(key)
      if r.empty? then
        nil
      else
        r
      end
    end

Returns the names of the namespaces defined in the databank (example: [ 'LOCUS', 'ACCESSION', 'VERSION' ]).

[Source]

# File lib/bio/io/flatfile/index.rb, line 251
    def namespaces
      check_closed?
      r = secondary_namespaces
      r.unshift primary_namespace
      r
    end

Returns name of primary namespace as a string.

[Source]

# File lib/bio/io/flatfile/index.rb, line 259
    def primary_namespace
      check_closed?
      @db.primary.name
    end

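Searches the databank and returns a Bio::FlatFileIndex::Results object. If default namespaces are set, only those namespaces are searched; otherwise all namespaces are searched.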

[Source]

# File lib/bio/io/flatfile/index.rb, line 177
    def search(key)
      check_closed?
      if @names then
        @db.search_namespaces(key, *@names)
      else
        @db.search_all(key)
      end
    end
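
The dispatch between "default namespaces" and "all namespaces" can be sketched with an in-memory model. `TinyBank` below is hypothetical and returns a plain array of entry strings rather than a Bio::FlatFileIndex::Results object.

```ruby
# Hypothetical in-memory databank: namespace name => { key => entry text }.
class TinyBank
  attr_accessor :default_namespaces

  def initialize(tables)
    @tables = tables
    @default_namespaces = nil   # nil means "search every namespace"
  end

  # Use the default namespaces if set, otherwise every namespace.
  def search(key)
    names = default_namespaces || @tables.keys
    search_namespaces(key, *names)
  end

  # Collect matching entries from the named namespaces, without duplicates.
  def search_namespaces(key, *names)
    names.each_with_object([]) do |name, hits|
      entry = @tables.fetch(name, {})[key]
      hits << entry if entry && !hits.include?(entry)
    end
  end
end

bank = TinyBank.new({
  'ACCESSION' => { 'AB000100' => 'entry one' },
  'VERSION'   => { 'AB000100.1' => 'entry one', 'X1' => 'entry two' }
})

puts bank.search('X1').inspect    # all namespaces are searched
bank.default_namespaces = ['ACCESSION']
puts bank.search('X1').inspect    # only ACCESSION is searched now
```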

Searches only the specified namespaces. Returns a Bio::FlatFileIndex::Results object.

[Source]

# File lib/bio/io/flatfile/index.rb, line 189
    def search_namespaces(key, *names)
      check_closed?
      @db.search_namespaces(key, *names)
    end

Searches only the primary namespace. Returns a Bio::FlatFileIndex::Results object.

[Source]

# File lib/bio/io/flatfile/index.rb, line 197
    def search_primary(key)
      check_closed?
      @db.search_primary(key)
    end

Returns names of secondary namespaces as an array of strings.

[Source]

# File lib/bio/io/flatfile/index.rb, line 265
    def secondary_namespaces
      check_closed?
      @db.secondary.names
    end
