Class Ferret::Analysis::RegExpAnalyzer
In: ext/r_analysis.c
Parent: Ferret::Analysis::Analyzer

Summary

Using a RegExpAnalyzer is a simple way to create a custom analyzer. If implemented in Ruby, it would look like this:

  class RegExpAnalyzer
    def initialize(reg_exp, lower = true)
      @lower = lower
      @reg_exp = reg_exp
    end

    def token_stream(field, str)
      if @lower
        return LowerCaseFilter.new(RegExpTokenizer.new(str, @reg_exp))
      else
        return RegExpTokenizer.new(str, @reg_exp)
      end
    end
  end

Example

  csv_analyzer = RegExpAnalyzer.new(/[^,]+/, false)
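
To see which tokens this regular expression would select, the matching step can be approximated in plain Ruby without loading Ferret. The sketch below is only a stand-in: `regexp_tokens` is a hypothetical helper, not part of Ferret's API, and it assumes the tokenizer emits each successive non-overlapping match of the pattern as one token.

```ruby
# Hypothetical stand-in for the tokenizing step; NOT part of Ferret.
# Returns every non-overlapping match of reg_exp, downcased when lower is true.
def regexp_tokens(str, reg_exp, lower = true)
  tokens = str.scan(reg_exp)
  lower ? tokens.map(&:downcase) : tokens
end

line = "Ruby,Ferret Search,C"
p regexp_tokens(line, /[^,]+/, false) # case preserved, as with csv_analyzer
p regexp_tokens(line, /[^,]+/)        # default lower = true downcases tokens
```

Because the pattern matches runs of non-comma characters, a value containing spaces stays a single token, which is the point of using it for comma-separated data.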

Methods

new   token_stream  

Public Class methods

new(reg_exp, lower = true)

Create a new RegExpAnalyzer that builds tokenizers from the given regular expression and lowercases the tokens unless told otherwise.

reg_exp: the token matcher for the tokenizer to use
lower:   set to false if you don't want to downcase the tokens

Public Instance methods

token_stream(field_name, input)

Create a new TokenStream to tokenize input. The TokenStream created may also depend on the field_name, although this parameter is typically ignored.

field_name: name of the field to be tokenized
input:      data from the field to be tokenized
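
As a rough illustration of how a stream returned by token_stream is consumed, the sketch below pulls tokens one at a time until `next` returns nil. It runs without Ferret: `SketchToken`, `SketchTokenStream` and `SketchAnalyzer` are hypothetical stand-ins written for this example, not Ferret's classes, and the offset field names are assumptions.

```ruby
# Hypothetical stand-ins for illustration only; NOT Ferret's implementation.
SketchToken = Struct.new(:text, :start_offset, :end_offset)

class SketchTokenStream
  def initialize(input, reg_exp, lower)
    @input = input
    @reg_exp = reg_exp
    @lower = lower
    @pos = 0
  end

  # Returns the next token, or nil once the input is exhausted.
  def next
    match = @reg_exp.match(@input, @pos)
    return nil unless match

    @pos = match.end(0)
    @pos += 1 if match[0].empty? # guard against patterns that match nothing
    text = @lower ? match[0].downcase : match[0]
    SketchToken.new(text, match.begin(0), match.end(0))
  end
end

class SketchAnalyzer
  def initialize(reg_exp, lower = true)
    @reg_exp = reg_exp
    @lower = lower
  end

  # field_name is accepted but ignored in this sketch.
  def token_stream(_field_name, input)
    SketchTokenStream.new(input, @reg_exp, @lower)
  end
end

stream = SketchAnalyzer.new(/[^,]+/, false).token_stream(:row, "a,Bc,d")
while (token = stream.next)
  puts "#{token.text} [#{token.start_offset}, #{token.end_offset})"
end
```

An analyzer that does want per-field behaviour could branch on the field name inside token_stream; this sketch keeps a single pattern for every field.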
