Knowledge base

The class KnowledgeBase is the main access point to all resources described in the knowledge base (e.g. HucitAuthor, HucitWork).

class hucitlib.KnowledgeBase(config_file: Optional[str] = None)

KnowledgeBase is a class that allows for accessing a HuCit knowledge base in an object-oriented fashion. The abstraction layer it provides means that you can use, search and modify its content without having to worry about the underlying modelling of data in RDF.

Parameters

config_file (Optional[str]) – Path to the configuration file containing the parameters needed to connect to the triple store whose data will be accessible via the KnowledgeBase object.

Note

By default (i.e. when no configuration file is specified), a new KnowledgeBase instance is created that reads data directly from the triple store hosted at Druid. Note that methods that modify entries in the KB will not work in this case, as that triple store is read-only.

Example of usage:

>>> from hucitlib import KnowledgeBase
>>> kb = KnowledgeBase()
>>> homer = kb.get_resource_by_urn('urn:cts:greekLit:tlg0012')
>>> print(homer.rdfs_label.one)

add_textelement_type(label: str, lang: str = 'en') → Optional[surf.resource.Resource]

Adds a new TextElementType to the Knowledge base if not yet present.

Parameters
  • label (str) – Label of the new text element type (e.g. "book").

  • lang (str) – Language of the label (defaults to 'en').

Returns

Description of returned object.

Return type

Optional[surf.resource.Resource]

# this will work only when connected to a triple store
# to which you have write access
>>> from hucitlib import KnowledgeBase
>>> kb = KnowledgeBase()
>>> element_type_obj = kb.add_textelement_type("book")

add_textelement_types(types: List[str]) → None

Adds each of the given text element types if it does not already exist.

Parameters

types (List[str]) – a list of strings (e.g. ["book", "poem", "line"])

Returns

Nothing; the types are added to the knowledge base as a side effect.

Return type

None

# this will work only when connected to a triple store
# to which you have write access
>>> from hucitlib import KnowledgeBase
>>> kb = KnowledgeBase()
>>> kb.add_textelement_types(["book", "line"])

property author_names: Dict[str, str]

Returns a dictionary mapping name identifiers (an author URN plus a name suffix) to author names, like this:

{
    "urn:cts:greekLit:tlg0012$$n1": "Homer",
    "urn:cts:greekLit:tlg0012$$n2": "Omero",
    ...
}
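Because each key carries a suffix after the URN, it can be handy to regroup the names per author. The following is a minimal sketch, assuming every key has the form "<author URN>$$<suffix>" as in the sample above; the helper name group_names_by_urn is hypothetical and not part of hucitlib.

```python
from collections import defaultdict
from typing import Dict, List


def group_names_by_urn(author_names: Dict[str, str]) -> Dict[str, List[str]]:
    """Group name variants under the author URN they belong to."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for key, name in author_names.items():
        # Assumption: everything before "$$" is the author's CTS URN.
        urn, _, _suffix = key.partition("$$")
        grouped[urn].append(name)
    return dict(grouped)


# Sample data copied from the dictionary shown above.
sample = {
    "urn:cts:greekLit:tlg0012$$n1": "Homer",
    "urn:cts:greekLit:tlg0012$$n2": "Omero",
}
names_by_urn = group_names_by_urn(sample)
```

With a real KnowledgeBase instance, the same helper could presumably be applied to kb.author_names.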

create_cts_urn(resource: surf.resource.Resource, urn_string: str) → Optional[surf.resource.Resource]

Creates a CTS URN object and assigns it to a given resource.

Parameters
  • resource (surf.resource.Resource) – KB entry to be identified by the CTS URN.

  • urn_string (str) – CTS URN identifier (e.g. urn:cts:greekLit:tlg0012)

Returns

The newly created object or None if it already existed.

Return type

Optional[surf.resource.Resource]

create_text_element(work: surf.resource.Resource, urn_string: str, element_type: surf.resource.Resource, source_uri: str = None)

Creates a new text element for a given work.

Parameters
  • urn_string (str) – Text element’s URN.

  • element_type (surf.resource.Resource) – Text element type.

Returns

The newly created text element.

Return type

type

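# this will work only when connected to a triple store
# to which you have write access
>>> from hucitlib import KnowledgeBase
>>> kb = KnowledgeBase()
>>> iliad = kb.get_resource_by_urn("urn:cts:greekLit:tlg0012.tlg001")
>>> etype_book = kb.get_textelement_type("book")
>>> ts = iliad.structure
>>> ts.create_element(
...     "urn:cts:greekLit:tlg0012.tlg001:1",
...     element_type=etype_book,
...     following_urn="urn:cts:greekLit:tlg0012.tlg001:2"
... )

get_author_label(urn)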

Get the label corresponding to the author identified by the CTS URN.

Labels are tried in the following order:

  • a lang=en label (if multiple labels exist in this language, pick the shortest)

  • a lang=la label (if multiple labels exist in this language, pick the shortest)

  • a lang=None label (if multiple labels exist in this language, pick the shortest)

Returns None if no name is found.
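The fallback described above can be sketched in plain Python. This is only an illustration of the selection rule, not the library's implementation: the function pick_label and the (text, language) tuple representation are assumptions made for the example.

```python
from typing import List, Optional, Tuple

# Hypothetical representation of a label: (text, language tag or None).
Label = Tuple[str, Optional[str]]


def pick_label(labels: List[Label]) -> Optional[str]:
    """Prefer English, then Latin, then untagged labels; shortest wins."""
    for lang in ("en", "la", None):
        candidates = [text for text, tag in labels if tag == lang]
        if candidates:
            # Several labels in the same language: keep the shortest one.
            return min(candidates, key=len)
    # No label in any of the accepted languages.
    return None


english_first = pick_label(
    [("Homerus", "la"), ("Homer of Chios", "en"), ("Homer", "en")]
)
latin_fallback = pick_label([("Homerus", "la"), ("Omero", "it")])
nothing_found = pick_label([("Omero", "it")])
```

Labels in languages other than the three listed are ignored in this sketch, which is why the last call yields None.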

get_authors() → List[hucitlib.surfext.HucitAuthor]

Lists all authors contained in the knowledge base.

Returns

A list of authors.

Return type

List[HucitAuthor]

get_opus_maximum_of(author_cts_urn)

Return the author’s opus maximum.

Given the CTS URN of an author, this method returns that author’s opus maximum, or None if it is not available.

Parameters

author_cts_urn – the author’s CTS URN.

Returns

an instance of surfext.HucitWork or None

get_resource_by_urn(urn)

Fetch the resource corresponding to the input CTS URN.

Currently supports only HucitAuthor and HucitWork.

Parameters

urn – the CTS URN of the resource to fetch

Returns

either an instance of HucitAuthor or of HucitWork

get_statistics() → Dict[str, int]

Gather basic stats about the Knowledge Base and its contents.

Note

This method currently has some performance issues.

Returns

a dictionary

get_textelement_type(label: str) → Optional[surf.resource.Resource]

Returns a TextElementType (instance of E55_Type) if present.

Note

label (lowercased) is used to create the URI (http://purl.org/hucit/kb/types/{label}).

Parameters

label (str) – Label of the text element type (e.g. "book").

Returns

The matching TextElementType, or None if it is not present.

Return type

Optional[surf.resource.Resource]
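The URI pattern mentioned in the note above can be illustrated with a short sketch. It assumes that lowercasing is the only normalisation applied to the label; the constant and function names are hypothetical and not part of hucitlib.

```python
# Base URI taken from the note above.
TYPES_BASE_URI = "http://purl.org/hucit/kb/types/"


def textelement_type_uri(label: str) -> str:
    """Build the URI of a TextElementType from its label."""
    # Assumption: the label is only lowercased, with no further escaping.
    return TYPES_BASE_URI + label.lower()


book_uri = textelement_type_uri("Book")
```

Under this assumption, "Book" and "book" resolve to the same type URI.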

get_textelement_types() → List[surf.resource.Resource]

Returns all TextElementTypes defined in the knowledge base.

Returns

A list of all TextElementTypes defined in the knowledge base.

Return type

List[surf.resource.Resource]

get_work_label(urn)

Get the label corresponding to the work identified by the input CTS URN.

Labels are tried in the following order:

  • a lang=en label

  • a lang=la label

  • a lang=None label

Returns None if no title is found.

get_works()

Return the author’s works.

Returns

a list of HucitWork instances.

search(search_string: str) → List[Tuple[str, surf.resource.Resource]]

Searches for a given string through the resources’ labels.

Parameters

search_string (str) – String to search for in the resources’ labels.

Returns

A list of (string, resource) tuples, one per match.

Return type

List[Tuple[str, Resource]]

to_json()

Serialises the content of the KnowledgeBase as JSON.

Returns

TODO