Module Linguistics::EN
In: lib/linguistics/en/wordnet.rb
lib/linguistics/en/infinitive.rb
lib/linguistics/en/linkparser.rb
lib/linguistics/en.rb

Linguistics::EN

This module contains English-language linguistic functions for the Linguistics module. It can be loaded either directly or by passing some variant of ‘en’ or ‘eng’ to the Linguistics::use method.

The functions contained by the module provide:

Plural Inflections

Plural forms of all nouns, most verbs, and some adjectives are provided. Where appropriate, "classical" variants (for example: "brother" -> "brethren", "dogma" -> "dogmata", etc.) are also provided.

These can be accessed via the #plural, #plural_noun, #plural_verb, and #plural_adjective methods.

Indefinite Articles

Pronunciation-based "a"/"an" selection is provided for all English words, and most initialisms.

See: #a, #an, and #no.

Numbers to Words

Conversion from Numeric values to words is supported using the American "thousands" system. E.g., 2561 => "two thousand, five hundred and sixty-one".

See the #numwords method.

Ordinals

It is also possible to inflect numerals (1,2,3) and number words ("one", "two", "three") to ordinals (1st, 2nd, 3rd) and ordinates ("first", "second", "third").
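
As a rough illustration of the numeral case, the suffix rule can be sketched in plain Ruby. The helper name ordinal_suffix_sketch is hypothetical and is not part of this module; it assumes a non-negative Integer and does not handle number words.

```ruby
# Hypothetical helper, not the module's implementation: append the
# English ordinal suffix to a non-negative Integer.
def ordinal_suffix_sketch(number)
  # 11, 12 and 13 (and 111, 112, ...) take "th" despite ending in 1, 2, 3.
  return "#{number}th" if (11..13).cover?(number % 100)

  suffix = case number % 10
           when 1 then 'st'
           when 2 then 'nd'
           when 3 then 'rd'
           else 'th'
           end
  "#{number}#{suffix}"
end

puts ordinal_suffix_sketch(1)    # prints "1st"
puts ordinal_suffix_sketch(12)   # prints "12th"
puts ordinal_suffix_sketch(23)   # prints "23rd"
```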

Conjunctions

This module also supports the creation of English conjunctions from Arrays of Strings or objects which respond to the #to_s message. E.g.,

  %w{cow pig chicken cow dog cow duck duck moose}.en.conjunction
    ==> "three cows, two ducks, a pig, a chicken, a dog, and a moose"
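
The example above groups duplicate items and lists the larger groups first. A minimal sketch of just that counting and ordering step, in plain Ruby, might look like the following. The method name count_and_order_sketch is hypothetical; the sketch neither inflects the words nor spells out the counts as the module does.

```ruby
# Hypothetical sketch of duplicate counting and quantity ordering;
# it does not pluralize words or convert counts to number words.
def count_and_order_sketch(items)
  counts = Hash.new(0)
  # Strip and downcase so "Cow " and "cow" are counted together.
  items.each { |item| counts[item.to_s.strip.downcase] += 1 }

  # Larger groups first; ties keep first-appearance order.
  counts.each_with_index
        .sort_by { |(_word, count), index| [-count, index] }
        .map(&:first)
end

p count_and_order_sketch(%w{cow pig chicken cow dog cow duck duck moose})
# prints [["cow", 3], ["duck", 2], ["pig", 1], ["chicken", 1], ["dog", 1], ["moose", 1]]
```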

Infinitives

Returns the infinitive form of English verbs:

 "dodging".en.infinitive
   ==> "dodge"

Authors

  • Michael Granger <ged@FaerieMUD.org>

Acknowledgements

The inflection functions of this module were adapted from Damian Conway's Lingua::EN::Inflect Perl module:

  Copyright (c) 1997-2000, Damian Conway. All Rights Reserved.
  This module is free software. It may be used, redistributed
    and/or modified under the same terms as Perl itself.

The conjunctions code was adapted from the Lingua::Conjunction Perl module written by Robert Rothenberg and Damian Conway, which has no copyright statement included.

Copyright (c) 2003-2008, Michael Granger All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

  * Redistributions of source code must retain the above copyright notice,
    this list of conditions and the following disclaimer.

  * Redistributions in binary form must reproduce the above copyright notice,
    this list of conditions and the following disclaimer in the documentation
    and/or other materials provided with the distribution.

  * Neither the name of the author/s, nor the names of the project's
    contributors may be used to endorse or promote products derived from this
    software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Methods

Classes and Modules

Class Linguistics::EN::Infinitive

Attributes

lprintf_formatters  [RW] 

Public Class methods

Add the specified method (which can be either a Method object or a Symbol for looking up a method)

Make a function that calls the method meth on the synset of an input word.

Returns true if LinkParser was loaded okay

Returns true if WordNet was loaded okay

The instance of LinkParser used for all Linguistics LinkParser functions.

If #has_link_parser? returns false, this can be called to fetch the exception which was raised when trying to load LinkParser.

Wrap one or more parts in a non-capturing alternation Regexp

If #haveWordnet? returns false, this can be called to fetch the exception which was raised when trying to load WordNet.

The instance of the WordNet::Lexicon used for all Linguistics WordNet functions.

Public Instance methods

Return the given phrase with the appropriate indefinite article ("a" or "an") prepended.

an( phrase, count=nil )

Alias for #a

Turns a camel-case string ("camelCaseToEnglish") to plain English ("camel case to english"). Each word is decapitalized.
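
The transformation can be pictured with a short plain-Ruby sketch. The method name camel_case_to_english_sketch is hypothetical and this is not necessarily how the module implements it; the sketch only splits at a lowercase-to-uppercase boundary, so runs of capitals such as acronyms are not separated.

```ruby
# Hypothetical sketch: insert a space at each lower-to-upper boundary,
# then downcase every word.
def camel_case_to_english_sketch(string)
  string.to_s
        .gsub(/([a-z])([A-Z])/) { "#{Regexp.last_match(1)} #{Regexp.last_match(2)}" }
        .downcase
end

puts camel_case_to_english_sketch('camelCaseToEnglish')
# prints "camel case to english"
```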

Return the specified obj (which must support the #collect method) as a conjunction. Each item is converted to a String if it is not already (using #to_s) unless a block is given, in which case it is called once for each object in the array, and the stringified return value from the block is used instead. Returning nil causes that particular element to be omitted from the resulting conjunction. The following options can be used to control the makeup of the returned conjunction String:

:separator
Specify one or more characters to separate items in the resulting list. Defaults to ", ".
:altsep
An alternate separator to use if any of the resulting conjunction's clauses contain the :separator character/s. Defaults to "; ".
:penultimate
Flag that indicates whether or not to join the last clause onto the rest of the conjunction using a penultimate :separator. E.g.,
  %w{duck cow dog}.en.conjunction
  # => "a duck, a cow, and a dog"
  %w{duck cow dog}.en.conjunction( :penultimate => false )
  # => "a duck, a cow and a dog"

Defaults to true.

:conjunctive
Sets the word used as the conjunctive (separating word) of the resulting string. Defaults to "and".
:combine
If set to true (the default), items which are identical (after surrounding spaces are stripped) will be combined in the resulting conjunction. E.g.,
  %w{goose cow goose dog}.en.conjunction
  # => "two geese, a cow, and a dog"
  %w{goose cow goose dog}.en.conjunction( :combine => false )
  # => "a goose, a cow, a goose, and a dog"
:casefold
If set to true (the default), then items are compared case-insensitively when combining them. This has no effect if :combine is false.
:generalize
If set to true, then quantities of combined items are turned into general descriptions instead of exact amounts.
  ary = %w{goose pig dog horse goose reindeer goose dog horse}
  ary.en.conjunction
  # => "three geese, two dogs, two horses, a pig, and a reindeer"
  ary.en.conjunction( :generalize => true )
  # => "several geese, several dogs, several horses, a pig, and a reindeer"

See the #quantify method for specifics on how quantities are generalized. Generalization defaults to false, and has no effect if :combine is false.

:quantsort
If set to true (the default), items which are combined in the resulting conjunction will be listed in order of amount, with greater quantities sorted first. If :quantsort is false, combined items will appear where the first instance of them occurred in the list. This sort is also the fallback for identical quantities (i.e., items of the same quantity will be listed in the order they appeared in the source list).
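
To make the interaction of :separator, :altsep, :penultimate and :conjunctive more concrete, here is a minimal plain-Ruby sketch of the joining step alone. The method join_phrases_sketch and its keyword arguments are hypothetical stand-ins for the options described above; the phrases are assumed to be already counted and inflected, and the module's real behavior may differ in edge cases.

```ruby
# Hypothetical sketch of the joining step only; not the module's code.
def join_phrases_sketch(phrases, separator: ', ', altsep: '; ',
                        penultimate: true, conjunctive: 'and')
  # Fall back to the alternate separator if any phrase already
  # contains the (non-blank) separator character/s.
  marker = separator.strip
  clash = !marker.empty? && phrases.any? { |phrase| phrase.include?(marker) }
  sep = clash ? altsep : separator

  # One or two phrases need no list separator at all.
  return phrases.join(" #{conjunctive} ") if phrases.length < 3

  head = phrases[0..-2].join(sep)
  link = penultimate ? sep : ' '
  "#{head}#{link}#{conjunctive} #{phrases.last}"
end

puts join_phrases_sketch(['a duck', 'a cow', 'a dog'])
# prints "a duck, a cow, and a dog"
puts join_phrases_sketch(['a duck', 'a cow', 'a dog'], penultimate: false)
# prints "a duck, a cow and a dog"
puts join_phrases_sketch(['a big, red barn', 'a cow', 'a dog'])
# prints "a big, red barn; a cow; and a dog"
```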

Turns an English language string into a CamelCase word.

Returns the given word with a prepended indefinite article, unless count is non-nil and not singular.

Return the infinitive form of the given word

Return the name of the language this module is for.

Format the given fmt string by replacing %-escaped sequences with the result of performing a specified operation on the corresponding argument, in the style of Kernel.sprintf.

%PL: Plural.
%A, %AN: Prepend indefinite article.
%NO: Zero-quantified phrase.
%NUMWORDS: Convert a number into the corresponding words.
%CONJUNCT: Conjunction.
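
The general mechanism, replacing each %-escape with the result of a named operation on the next argument, can be sketched as follows. Everything here is hypothetical: TOY_FORMATTERS and toy_lprintf are illustrative names, and the two toy formatters naively append "s" rather than using the module's inflection rules.

```ruby
# Toy stand-ins for the real formatters; pluralization here is naive.
TOY_FORMATTERS = {
  'PL' => ->(arg) { "#{arg}s" },
  'NO' => ->(arg) { "no #{arg}s" }
}.freeze

# Hypothetical sketch: each recognized %-escape consumes one argument,
# in order, and is replaced by the formatter's result.
def toy_lprintf(fmt, *args)
  pattern = /%(#{Regexp.union(TOY_FORMATTERS.keys).source})/
  fmt.gsub(pattern) do
    TOY_FORMATTERS.fetch(Regexp.last_match(1)).call(args.shift)
  end
end

puts toy_lprintf('I have %NO and two %PL', 'cow', 'pig')
# prints "I have no cows and two pigs"
```

Escapes that are not in the table are left untouched by this sketch.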

Translate zero-quantified phrase to "no +phrase.plural+"

Normalize a count to either 1 or 2 (singular or plural)

Return the specified number num as an array of number phrases.

Return the specified number as English words. One or more configuration values may be passed to control the returned String:

:group
Controls how many numbers at a time are grouped together. Valid values are 0 (normal grouping), 1 (single-digit grouping, e.g., "one, two, three, four"), 2 (double-digit grouping, e.g., "twelve, thirty-four"), or 3 (triple-digit grouping, e.g., "one twenty-three, four").
:comma
Set the character/s used to separate word groups. Defaults to ", ".
:and
Set the word and/or characters used where " and " (the default) is normally used. Setting :and to " ", for example, will cause 2556 to be returned as "two-thousand, five hundred fifty-six" instead of "two-thousand, five hundred and fifty-six".
:zero
Set the word used to represent the numeral 0 in the result. "zero" is the default.
:decimal
Set the translation of any decimal points in the number; the default is "point".
:asArray
If set to a true value, the number will be returned as an array of word groups instead of a String.
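
As a small illustration of how such options shape the output, the sketch below mimics only the single-digit grouping case (:group => 1) together with stand-ins for :comma, :zero and :asArray. The method spell_digits_sketch and its keyword names are hypothetical, it accepts only non-negative Integers, and it is not the module's algorithm.

```ruby
DIGIT_WORDS_SKETCH = %w[zero one two three four five six seven eight nine].freeze

# Hypothetical sketch of single-digit grouping for a non-negative Integer.
def spell_digits_sketch(number, comma: ', ', zero: 'zero', as_array: false)
  words = number.to_s.each_char.map do |char|
    # Use the caller's word for 0, the lookup table for everything else.
    char == '0' ? zero : DIGIT_WORDS_SKETCH.fetch(Integer(char))
  end
  as_array ? words : words.join(comma)
end

puts spell_digits_sketch(1234)
# prints "one, two, three, four"
puts spell_digits_sketch(105, comma: ' ', zero: 'oh')
# prints "one oh five"
```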

Transform the given number into an ordinal word. The number object can be either an Integer or a String.

Transform the given number into an ordinate word.

part_pres( word )

Return the plural of the given phrase if count indicates it should be plural.

plural_adj( phrase, count=nil )

Alias for #plural_adjective

Return the plural of the given adjectival phrase if count indicates it should be plural.

Return the plural of the given noun phrase if count indicates it should be plural.

Return the plural of the given verb phrase if count indicates it should be plural.

Do normal/classical switching and match capitalization in inflected by examining the original input.

Returns the proper noun form of a string by capitalizing most of the words.
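
One way to picture "capitalizing most of the words" is the plain-Ruby sketch below. It is only an approximation under stated assumptions: proper_noun_sketch and SMALL_WORDS_SKETCH are hypothetical names, and the list of words left in lowercase is a guess rather than the module's own list.

```ruby
# Hypothetical list of words left lowercase; the module's list may differ.
SMALL_WORDS_SKETCH = %w[a an and of the].freeze

def proper_noun_sketch(string)
  string.split(' ').each_with_index.map do |word, index|
    if index > 0 && SMALL_WORDS_SKETCH.include?(word)
      word
    else
      # Upcase the first letter of each letter run, so "u.s." becomes "U.S."
      word.gsub(/\b[a-z]/) { |letter| letter.upcase }
    end
  end.join(' ')
end

puts proper_noun_sketch('bosnia and herzegovina')
# prints "Bosnia and Herzegovina"
puts proper_noun_sketch('virgin islands, u.s.')
# prints "Virgin Islands, U.S."
```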

Examples:

  English.proper_noun("bosnia and herzegovina") ->
    "Bosnia and Herzegovina"
  English.proper_noun("macedonia, the former yugoslav republic of") ->
    "Macedonia, the Former Yugoslav Republic of"
  English.proper_noun("virgin islands, u.s.") ->
    "Virgin Islands, U.S."

Return a phrase describing the specified number of objects in the given phrase in general terms. The following options can be used to control the makeup of the returned quantity String:

:joinword
Sets the word (and any surrounding spaces) used as the word separating the quantity from the noun in the resulting string. Defaults to " of ".

Return a LinkParser::Sentence for the stringified obj.

Look up the synset associated with the given word or collocation in the WordNet lexicon and return a WordNet::Synset object.

Look up all the synsets associated with the given word or collocation in the WordNet lexicon and return an Array of WordNet::Synset objects. If pos is nil, return synsets for all parts of speech.

Transform the specified number of hundreds-, tens-, and units-place numerals into a word phrase. If the number of thousands (thousands) is greater than 0, it will be used to determine where the decimal point is in relation to the hundreds-place number.

Transform the specified number of tens- and units-place numerals into a word-phrase at the given number of thousands places.

Transform the specified number into one or more words like ‘thousand’, ‘million’, etc. Uses the thousands (American) system.
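
The thousands system can be illustrated by splitting a number into three-digit groups and pairing each group with its scale word. The sketch below is a hypothetical helper (thousands_groups_sketch, with its own SCALE_WORDS_SKETCH table) for non-negative Integers up to the trillions; it shows the grouping only and does not produce the final word phrase.

```ruby
# Scale words by power of one thousand (American system).
SCALE_WORDS_SKETCH = ['', 'thousand', 'million', 'billion', 'trillion'].freeze

# Hypothetical sketch: split a non-negative Integer into three-digit
# groups, most significant first, each paired with its scale word.
def thousands_groups_sketch(number)
  # Reverse the digits so groups of three are taken from the right.
  groups = number.to_s.reverse.scan(/\d{1,3}/).map { |g| g.reverse.to_i }
  groups.each_with_index
        .map { |value, power| [value, SCALE_WORDS_SKETCH.fetch(power)] }
        .reverse
end

p thousands_groups_sketch(2561)
# prints [[2, "thousand"], [561, ""]]
```

Numbers beyond the last entry in the table raise an IndexError in this sketch, which keeps the limitation visible rather than silently mislabeling a group.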

Transform the specified number of units-place numerals into a word-phrase at the given number of thousands places.
