EntityMapper API for Python#

class eot.wowool.entity_mapper.EntityMapper

Generates JSON objects describing the entity relations that are found. For example, suppose we center on the Person entity and want to find all relations with the other concepts found in the given sentence. In that case, lhs = 'Person' and rhs = '.*'. Alternatively, if you are only interested in Person -> Company and Person -> Position, then lhs = 'Person' and rhs = ['Company', 'Position']. The results are returned as JSON objects.

Keyword Arguments:
  • lhs (str) – The lhs of your mapping or the concept you are most interested in.

  • rhs (list) – The list of concepts to map to, e.g. ['Company', 'Position'].

  • attributes (list) – The list of attributes you want to add to your mappings.

  • fields (list) – The list of fields you want to have in your results, e.g. --fields 'id,Person,Event'. Fields can also be formatted at runtime: you can specify an f-string for any field using the assignment operator '='. Within the f-string you have access to the current 'row' dict variable and to the concept objects, so every field can be reformatted as needed, e.g. --fields 'id=document.id,Person=Person.literal.upper,Event'.

  • scopes (list(int)) – The list of sentence indexes, relative to the sentence containing the lhs concept, in which to look for the rhs concepts.

  • slots (list) – The list of concepts that stay alive during the mapping, e.g. ['Date', 'UserName'].

from eot.wowool.core.native import Language, Domain
from eot.wowool.entity_mapper import EntityMapper

english = Language("english")
entities = Domain("english-entity")
# Note: the document object can be obtained from the portal client also
mapper = EntityMapper(lhs="Person", rhs=["Company"])

doc = mapper(entities(english("John Doe works for creative EyeOnText.")))
print(doc.entity_mapper)

results:

[{"Person": "John Doe", "Company": "EyeOnText"}]
__init__(lhs: str, rhs: list = [], fields: list = [], uri_to_table_name: dict = {}, scopes: list = [], slots: list = [])

TO BE DONE: THE FOLLOWING IS NOT CORRECT, IT ASSUMES THE SDK BUT IS PLACED IN COMMON

Driver (CLI)#

The entity mapper comes with a CLI that can be used to generate a CSV file with the given mappings. In this example we map the name of a person to either a Company or a City using the english-entity domain.

entity_mapper -p english,entity -i "John Dow works for EyeOnText in Antwerp." \
                --lhs "Person" --rhs "Company,City"  -o test.csv --attributes gender
[
{"id": "stream_id_2199722368313370556", "Person": "John Dow", "Company": "EyeOnText", "gender": "male"},
{"id": "stream_id_2199722368313370556", "Person": "John Dow", "City": "Antwerp", "gender": "male"}
]
id,Person,Company,gender
stream_id_2199722368313370556,John Dow,EyeOnText,male
stream_id_2199722368313370556,John Dow,,male
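
The generated CSV can be read back with Python's standard csv module. The sketch below inlines the CSV content shown above instead of opening test.csv, so that it runs on its own. It keeps only the rows that have a Company value, since a row produced by a different rhs concept leaves that column empty.

```python
import csv
import io

# CSV content copied from the example output above.
csv_text = """id,Person,Company,gender
stream_id_2199722368313370556,John Dow,EyeOnText,male
stream_id_2199722368313370556,John Dow,,male
"""

# DictReader uses the header line as keys; empty cells become empty strings.
rows = list(csv.DictReader(io.StringIO(csv_text)))

# Keep only the rows where the Company column is filled in.
with_company = [row for row in rows if row["Company"]]
print(with_company)
```

To read the real file, replace io.StringIO(csv_text) with open("test.csv", newline="").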

Reformatting the fields#

You can reformat the fields that appear in the CSV using the '--fields' option. Fields are comma-separated, e.g. "id,Person,Company,gender". Each field can also be formatted using the f-string format, in which you have access to the requested concepts and to the complete row (dict). The format is:

field_name={f-string},…

entity_mapper.sdk -p english,entity -i "John Dow works for EyeOnText in Antwerp." \
            --lhs "Person" --rhs "Company,City"  -o test.csv --attributes gender  \
            --fields "id=document.id,Person,gender=Person.gender.upper" \
            -o out.csv
id,Person,gender
STREAM_ID_4768304866666727575,John Dow,MALE
STREAM_ID_4768304866666727575,John Dow,MALE
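
The field_name={f-string} format can be pictured as a list of (name, expression) pairs. The sketch below only illustrates the shape of a --fields value. The helper parse_fields is hypothetical, and how the CLI actually parses and evaluates the expressions is not shown here.

```python
def parse_fields(spec):
    """Split a --fields value into (field_name, expression) pairs.

    A field without '=' gets None as its expression, meaning the value
    is used as-is. Hypothetical helper, for illustration only.
    """
    pairs = []
    for item in spec.split(","):
        # partition splits on the first '=' only
        name, separator, expression = item.strip().partition("=")
        pairs.append((name, expression if separator else None))
    return pairs


print(parse_fields("id=document.id,Person,gender=Person.gender.upper"))
```

This naive split assumes that no expression contains a comma. A spec with commas inside an expression would need a more careful parser.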

If the id is a filename you can even use Path to format the id. In this example we only want the filename without extension.

--fields "id=document.id,Person,gender=Person.gender.upper"
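
For reference, this is what the standard library's pathlib.Path offers for getting a filename without its extension. It is plain Python and says nothing about how the expression should be written inside --fields; the document id used here is a made-up value.

```python
from pathlib import Path

document_id = "/data/docs/report.txt"  # hypothetical id that is a file path

# .name is the final path component, .stem is that component without its last suffix
print(Path(document_id).name)  # report.txt
print(Path(document_id).stem)  # report
```

Note that .stem strips only the last suffix, so "archive.tar.gz" becomes "archive.tar".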