Module Stemmable
In: lib/stemmer/porter.rb

$Id: stemmable.rb,v 1.2 2003/02/01 02:07:30 condit Exp $

See example usage at the end of this file.

Methods

stem   stem_porter  

Constants

STEP_2_LIST = { 'ational'=>'ate', 'tional'=>'tion', 'enci'=>'ence', 'anci'=>'ance', 'izer'=>'ize', 'bli'=>'ble', 'alli'=>'al', 'entli'=>'ent', 'eli'=>'e', 'ousli'=>'ous', 'ization'=>'ize', 'ation'=>'ate', 'ator'=>'ate', 'alism'=>'al', 'iveness'=>'ive', 'fulness'=>'ful', 'ousness'=>'ous', 'aliti'=>'al', 'iviti'=>'ive', 'biliti'=>'ble', 'logi'=>'log' }
STEP_3_LIST = { 'icate'=>'ic', 'ative'=>'', 'alize'=>'al', 'iciti'=>'ic', 'ical'=>'ic', 'ful'=>'', 'ness'=>'' }
SUFFIX_1_REGEXP = /( ational | tional | enci | anci | izer | bli | alli | entli | eli | ousli | ization | ation | ator | alism | iveness | fulness | ousness | aliti | iviti | biliti | logi)$/x
SUFFIX_2_REGEXP = /( al | ance | ence | er | ic | able | ible | ant | ement | ment | ent | ou | ism | ate | iti | ous | ive | ize)$/x
C = "[^aeiou]"
V = "[aeiouy]"
CC = "#{C}(?>[^aeiouy]*)"
VV = "#{V}(?>[aeiou]*)"
MGR0 = /^(#{CC})?#{VV}#{CC}/o
MEQ1 = /^(#{CC})?#{VV}#{CC}(#{VV})?$/o
MGR1 = /^(#{CC})?#{VV}#{CC}#{VV}#{CC}/o
VOWEL_IN_STEM = /^(#{CC})?#{V}/o
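
MGR0, MEQ1 and MGR1 encode the measure m from Porter's paper, the number of vowel-consonant sequences in a stem: MGR0 matches when m > 0, MEQ1 when m = 1, and MGR1 when m > 1. The following is a minimal sketch of how they classify a stem, with the constants copied from the listing above. The helper measure_class is hypothetical and not part of Stemmable, and lowercase input is assumed.

```ruby
# Constants copied from the listing above.
C  = "[^aeiou]"
V  = "[aeiouy]"
CC = "#{C}(?>[^aeiouy]*)"
VV = "#{V}(?>[aeiou]*)"

MGR0 = /^(#{CC})?#{VV}#{CC}/o
MEQ1 = /^(#{CC})?#{VV}#{CC}(#{VV})?$/o
MGR1 = /^(#{CC})?#{VV}#{CC}#{VV}#{CC}/o

# Hypothetical helper (not part of Stemmable): name the measure class
# of a lowercase stem using the three patterns.
def measure_class(stem)
  return "m > 1" if stem =~ MGR1
  return "m = 1" if stem =~ MEQ1
  "m = 0"
end

%w[tree by trouble ivy troubles private].each do |stem|
  puts "#{stem}: #{measure_class(stem)}"
end
```

With these patterns, "tree" and "by" fall in m = 0, "trouble" and "ivy" in m = 1, and "troubles" and "private" in m > 1, which agrees with the examples given in the paper.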

Public Instance methods

stem()

Alias for stem_porter
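
The sketch below shows the mix-in and alias pattern this entry describes, on the assumption (not shown on this page) that Stemmable is meant to be included into String. ToyStemmable is a hypothetical stand-in so the example runs without the library; its single rule is not the Porter algorithm.

```ruby
# Hypothetical stand-in for Stemmable, used only to show the mix-in and
# alias pattern. The single rule below is NOT the Porter algorithm.
module ToyStemmable
  def stem_porter
    # to_str lets the module work on any String-like receiver.
    to_str.sub(/s$/, "")
  end

  # stem is simply another name for stem_porter.
  alias stem stem_porter
end

# Mixing the module into String makes both methods available on strings.
class String
  include ToyStemmable
end

puts "cats".stem
puts "cats".stem_porter
```

Both calls print the same result because the alias points at the same method body.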

stem_porter()

Porter stemmer in Ruby.

This is the Porter stemming algorithm, ported to Ruby from the version coded up in Perl. It's easy to follow against the rules in the original paper:

  Porter, 1980, An algorithm for suffix stripping, Program, Vol. 14,
  no. 3, pp. 130-137.
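
As an illustration of reading the code against the paper, step 2 of the paper has rules of the form (m > 0) ATIONAL -> ATE. The sketch below assumes that the module pairs a suffix pattern with a replacement table and guards the substitution with MGR0. It uses a three-suffix subset of the constants listed on this page; STEP_2_SUBSET, SUFFIX_SUBSET_REGEXP and apply_step_2 are hypothetical names.

```ruby
# Reduced copies of the constants listed on this page (three suffixes only).
STEP_2_SUBSET = { 'ational' => 'ate', 'tional' => 'tion', 'izer' => 'ize' }
SUFFIX_SUBSET_REGEXP = /(ational|tional|izer)$/

# Consonant and vowel runs, and the "measure > 0" test, as defined above.
CC = "[^aeiou](?>[^aeiouy]*)"
VV = "[aeiouy](?>[aeiou]*)"
MGR0 = /^(#{CC})?#{VV}#{CC}/

# Hypothetical helper: apply one step 2 rule, (m > 0) SUFFIX -> REPLACEMENT.
def apply_step_2(word)
  match = SUFFIX_SUBSET_REGEXP.match(word)
  return word unless match

  stem   = match.pre_match   # text before the matched suffix
  suffix = match[1]
  # Replace the suffix only when the remaining stem has measure > 0.
  stem =~ MGR0 ? stem + STEP_2_SUBSET[suffix] : word
end

%w[relational conditional rational].each do |word|
  puts "#{word} -> #{apply_step_2(word)}"
end
```

Here "relational" becomes "relate" and "conditional" becomes "condition", while "rational" is left unchanged because the stem "r" has measure 0.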

See also www.tartarus.org/~martin/PorterStemmer

Send comments to raypereda@hotmail.com
