Token API for Python
- class eot.wowool.annotation.Token
Bases: eot.wowool.annotation.annotation.Annotation
Token is a class that contains all the information of a token.
- __init__(begin_offset: int, end_offset: int, literal: str)
Initialize a Token instance.
- Parameters
begin_offset (int) – Begin offset of the token
end_offset (int) – End offset of the token
literal (str) – Literal string from the input document
- Returns
An initialized token
- Return type
Token
- literal: str
- morphology: List[eot.wowool.annotation.token.MorphData]
- properties: MutableSet[str]
- rich()
- Returns
The rich string representation of the token
- Return type
str
- property stem: str
- Returns
The first stem of the token or an empty string if absent
- Type
str
- property pos: str
- Returns
The first part-of-speech of the token or an empty string if absent
- Type
str
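The "first entry or empty string" behaviour described for stem and pos can be sketched as follows. This is a minimal illustration and not the library's implementation: the Token and MorphData classes below are stand-ins that mirror only the documented attributes, and the "Vb" tag is a made-up example value.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MorphData:
    """Stand-in for the documented MorphData (pos and stem only)."""
    pos: str
    stem: str


@dataclass
class Token:
    """Stand-in mirroring the documented attributes; not the library class."""
    begin_offset: int
    end_offset: int
    literal: str
    morphology: List[MorphData] = field(default_factory=list)

    @property
    def stem(self) -> str:
        # First stem, or an empty string when no morphology is present.
        return self.morphology[0].stem if self.morphology else ""

    @property
    def pos(self) -> str:
        # First part-of-speech, or an empty string when no morphology is present.
        return self.morphology[0].pos if self.morphology else ""


# "Nn" comes from the documentation; "Vb" is a hypothetical second tag.
walks = Token(0, 5, "walks", [MorphData("Nn", "walk"), MorphData("Vb", "walk")])
unknown = Token(6, 9, "xyz")

print(walks.stem, walks.pos)
print(repr(unknown.stem), repr(unknown.pos))
```

Because both properties fall back to an empty string, callers can compare them directly without checking for None first.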
- has_property(prop: str) → bool
- Parameters
prop (str) – Property name. For example "NF"
- Returns
Whether a given property is set on the token
- Return type
bool
- has_pos(pos: str) → bool
- Parameters
pos (str) – Part-of-speech. For example "Nn"
- Returns
Whether a given part-of-speech is set on the token
- Return type
bool
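A typical use of has_property and has_pos is filtering a list of tokens. The sketch below uses stand-in classes rather than the library itself, and it assumes that has_property is a membership test on properties and that has_pos matches any morphology entry exactly; the real matching rules may differ. The "Vb" tag is a hypothetical value.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class MorphData:
    """Stand-in for the documented MorphData."""
    pos: str
    stem: str


@dataclass
class Token:
    """Stand-in mirroring the documented attributes; not the library class."""
    literal: str
    morphology: List[MorphData] = field(default_factory=list)
    properties: Set[str] = field(default_factory=set)

    def has_property(self, prop: str) -> bool:
        # Assumption: a plain membership test on the properties set.
        return prop in self.properties

    def has_pos(self, pos: str) -> bool:
        # Assumption: exact match against any morphology entry.
        return any(md.pos == pos for md in self.morphology)


tokens = [
    Token("Antwerp", [MorphData("Nn", "Antwerp")], {"NF"}),
    Token("said", [MorphData("Vb", "say")]),
]

nouns = [t.literal for t in tokens if t.has_pos("Nn")]
flagged = [t.literal for t in tokens if t.has_property("NF")]
print(nouns, flagged)
```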
- get_morphology(pos: str)
- Parameters
pos (str) – Part-of-speech. For example "Nn"
- Returns
Whether a given part-of-speech is set on the token
- Return type
bool
- static iter(object) → Generator[eot.wowool.annotation.token.Token, None, None]
Iterate over the tokens in a document, an analysis, a sentence, or a concept. For example:
    document = analyzer("Hello from Antwerp, said John Smith.")
    for token in Token.iter(document):
        print(token)
- static next(sentence, object) → eot.wowool.annotation.token.Token
Returns the next Token.
- static prev(sentence, object: Union[eot.wowool.annotation.token.Concept, eot.wowool.annotation.token.Token]) → eot.wowool.annotation.token.Token
Returns the previous Token.
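Neighbour lookup of this kind can be sketched with a plain list. The example below is an illustration under stated assumptions, not the library's code: the sentence is modelled as a list of stand-in tokens, index is assumed to be the token's position in that list, and the helpers next_token and prev_token are hypothetical names that return None at the sentence boundaries. The documentation does not say what the real methods return there.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Token:
    """Stand-in with only the documented index and literal attributes."""
    index: int
    literal: str


def next_token(sentence: List[Token], token: Token) -> Optional[Token]:
    # Assumption: index is the token's position within the sentence.
    position = token.index + 1
    return sentence[position] if position < len(sentence) else None


def prev_token(sentence: List[Token], token: Token) -> Optional[Token]:
    # Returns None for the first token instead of wrapping around.
    position = token.index - 1
    return sentence[position] if position >= 0 else None


sentence = [Token(i, word) for i, word in enumerate(["Hello", "from", "Antwerp"])]
middle = sentence[1]

print(prev_token(sentence, middle).literal, next_token(sentence, middle).literal)
print(next_token(sentence, sentence[-1]))
```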
- property begin_offset: int
- Returns
The begin offset of the annotation
- Type
int
- property end_offset: int
- Returns
The end offset of the annotation
- Type
int
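The offsets can be used to map a token back to the input text. The sketch below assumes character offsets with an exclusive end, which this page does not state explicitly, so verify it against real analyzer output before relying on it. The Token class is a stand-in with the documented constructor arguments.

```python
from dataclasses import dataclass


@dataclass
class Token:
    """Stand-in with the documented constructor arguments only."""
    begin_offset: int
    end_offset: int
    literal: str


text = "Hello from Antwerp, said John Smith."
tokens = [Token(0, 5, "Hello"), Token(11, 18, "Antwerp")]

for token in tokens:
    # Assumption: character offsets, end exclusive, so a slice recovers the literal.
    span = text[token.begin_offset:token.end_offset]
    print(span, span == token.literal)
```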
- property index: int
- class eot.wowool.annotation.token.MorphData
MorphData is a class that contains the morphological data. For example:
    for md in token.morphology:
        print(md.pos, md.stem)
- Parameters
pos (str) – Part-of-speech
stem (str) – Stem
- rich() → str
- Returns
The rich string representation of the morphological data
- Return type
str
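Since a token can carry several morphological readings, it is sometimes convenient to group the stems by part-of-speech. The following is a small sketch using a stand-in MorphData with the two documented fields; stems_by_pos is a hypothetical helper, and "Vb" is a made-up tag.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class MorphData:
    """Stand-in for the documented MorphData (pos and stem only)."""
    pos: str
    stem: str


def stems_by_pos(morphology: List[MorphData]) -> Dict[str, List[str]]:
    # Collect every stem under its part-of-speech, preserving input order.
    grouped: Dict[str, List[str]] = defaultdict(list)
    for md in morphology:
        grouped[md.pos].append(md.stem)
    return dict(grouped)


readings = [MorphData("Nn", "walk"), MorphData("Vb", "walk"), MorphData("Nn", "walks")]
print(stems_by_pos(readings))
```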
See also the corresponding JSON schema.