| Class: | Ferret::Analysis::RegExpAnalyzer |
| In: | ext/r_analysis.c |
| Parent: | Ferret::Analysis::Analyzer |
Using a RegExpAnalyzer is a simple way to create a custom analyzer. If implemented in Ruby it would look like this:
class RegExpAnalyzer
  def initialize(reg_exp, lower = true)
    @lower = lower
    @reg_exp = reg_exp
  end

  def token_stream(field, str)
    if @lower
      return LowerCaseFilter.new(RegExpTokenizer.new(str, @reg_exp))
    else
      return RegExpTokenizer.new(str, @reg_exp)
    end
  end
end
csv_analyzer = RegExpAnalyzer.new(/[^,]+/, false)
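The same idea can be sketched in plain, self-contained Ruby. Note that `SimpleRegExpAnalyzer` below is a hypothetical stand-in: it uses `String#scan` in place of Ferret's `RegExpTokenizer` and `LowerCaseFilter`, and returns an array of tokens rather than a lazy TokenStream.

```ruby
# A plain-Ruby sketch of the analyzer above. String#scan stands in for
# Ferret's RegExpTokenizer, and Array#map(&:downcase) for LowerCaseFilter.
class SimpleRegExpAnalyzer
  def initialize(reg_exp, lower = true)
    @reg_exp = reg_exp
    @lower = lower
  end

  # Returns an array of tokens instead of a TokenStream.
  def tokens(str)
    toks = str.scan(@reg_exp)
    @lower ? toks.map(&:downcase) : toks
  end
end

csv = SimpleRegExpAnalyzer.new(/[^,]+/, false)
p csv.tokens("One,Two,Three")   # => ["One", "Two", "Three"]
```

With `lower` left at its default of `true`, the same input would come back as `["one", "two", "three"]`.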
Create a new RegExpAnalyzer which will create tokenizers based on the given regular expression, lowercasing the tokens if required.
| reg_exp: | the token matcher for the tokenizer to use |
| lower: | set to false if you don't want to downcase the tokens |
Create a new TokenStream to tokenize input. The TokenStream created may also depend on the field_name, although this parameter is typically ignored.
| field_name: | name of the field to be tokenized |
| input: | data from the field to be tokenized |
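The two-parameter shape of token_stream can be illustrated with a small stand-in. `StubAnalyzer` and its Enumerator-based "stream" below are assumptions for illustration, not Ferret's real classes; the point is that the field name is accepted but unused, as described above.

```ruby
# Hypothetical stand-in mirroring token_stream's interface.
class StubAnalyzer
  def initialize(reg_exp, lower = true)
    @reg_exp = reg_exp
    @lower = lower
  end

  def token_stream(field_name, input)  # field_name is ignored, as noted above
    toks = input.scan(@reg_exp)
    toks = toks.map(&:downcase) if @lower
    toks.each                          # an Enumerator stands in for a TokenStream
  end
end

a = StubAnalyzer.new(/\w+/)
stream = a.token_stream(:title, "Fast Search")
p stream.next   # => "fast"
p stream.next   # => "search"
```

Because field_name is ignored here, `a.token_stream(:body, "Fast Search")` would yield exactly the same tokens.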