Lemmatizer
Introduction
The Lettria API allows you to perform lemmatization, which is the process of finding the base form of a word, called the lemma.
This is done by removing inflections (such as tense, person, number, or gender) from a word or, for irregular forms, by mapping the word to its dictionary form.
For example, the lemma of "jumps" is "jump," the lemma of "running" is "run," and the lemma of "better" is "good." Lemmatization is useful in natural language processing because it lets different surface forms of the same word be compared and analyzed as a single item.
Format
Lemmatizer objects can be received as either an Array() or an Object().
Key | Type | Description | Concerned Tags |
---|---|---|---|
conjugate | list of Conjugate Objects | List of possible conjugations | V, VP, VINF |
confidence | float | Level of confidence in the results (higher is better) | * |
gender | Gender | Describes the gender and plurality | VP, JJ, N, D, PD |
lemma | String | Lemmatized version of the source | C, CC, CLO, CLS, D, JJ, N, NP, PUNCT, P, PD, PROREL, RB, RB_WH, SYM, UH |
infinit | list of String | List of possible verb infinitives | V, VP, VINF |
transitif | Boolean | Whether the verb is transitive or not | V, VP, VINF |
Number | float | Numeric value | CD |
mode | String | Mode of the verb | D, PD |
possessing | int | See Possessive determiners | D, PD |
pronom | int | See Pronouns | CLS |
designation | list of String | See Categories | CLO |
category | String | See Adverb Categories | RB |
source | String | Source of the lemmatization | RB, P |
sens | list of Preposition sens Objects | See Preposition sens | P |
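Because the lemmatizer data can arrive as either an array or a single object, client code may find it convenient to normalize it before reading any keys. The following is a minimal Python sketch, not part of the Lettria client: the helper names `as_lemmatizer_list` and `base_forms` are hypothetical, and it assumes the payload has already been decoded from JSON. The table lists `infinit` as a list of String while the V example shows a single string, so the sketch accepts both shapes.

```python
from typing import Any, Dict, List, Union

# Hypothetical alias: a decoded lemmatizer payload, either one object or an array.
LemmatizerPayload = Union[Dict[str, Any], List[Dict[str, Any]], None]


def as_lemmatizer_list(payload: LemmatizerPayload) -> List[Dict[str, Any]]:
    """Return the lemmatizer data as a list, whatever shape it arrived in."""
    if payload is None:
        return []
    if isinstance(payload, dict):
        return [payload]
    return list(payload)


def base_forms(entry: Dict[str, Any]) -> List[str]:
    """Collect base forms from 'lemma' (non-verb tags) or 'infinit' (verb tags)."""
    forms: List[str] = []
    lemma = entry.get("lemma")
    if isinstance(lemma, str):
        forms.append(lemma)
    infinit = entry.get("infinit")
    if isinstance(infinit, str):
        # Single infinitive given as a plain string.
        forms.append(infinit)
    elif isinstance(infinit, list):
        # Several possible infinitives given as a list.
        forms.extend(infinit)
    return forms
```

With these helpers, `base_forms` can be applied to every entry returned by `as_lemmatizer_list` without checking the payload shape at each call site.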
Examples
V
{ "infinit": "etre", "gender": { "female": false, "plural": false }, "conjugate": [{ "mode": "indicative", "temps": "past", "pronom": 1, "modality": null }], "transitif": true }
RB
{ "category": ["time and aspect"] }
CLS
{ "pronom": 1 }
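To illustrate how the V example above might be consumed, here is a short Python sketch that decodes it with the standard `json` module and summarizes each entry of `conjugate`. The function name `describe_verb` and the summary format are assumptions made for illustration only.

```python
import json

# The V example from above, as a raw JSON string.
raw = (
    '{ "infinit": "etre", "gender": { "female": false, "plural": false }, '
    '"conjugate": [{ "mode": "indicative", "temps": "past", "pronom": 1, '
    '"modality": null }], "transitif": true }'
)


def describe_verb(entry):
    """Build one summary line per possible conjugation (hypothetical helper)."""
    lines = []
    for conj in entry.get("conjugate", []):
        lines.append(
            f'{entry.get("infinit")}: {conj.get("mode")} {conj.get("temps")}, '
            f'pronom={conj.get("pronom")}'
        )
    return lines


verb = json.loads(raw)
summary = describe_verb(verb)
print(summary)  # ['etre: indicative past, pronom=1']
```

Using `.get()` throughout keeps the sketch tolerant of keys that are absent for a given tag, since each key in the table applies only to the tags listed for it.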