Knowledge base
The class KnowledgeBase is the main access point to all resources
described in the knowledge base (e.g. HucitAuthor,
HucitWork, etc.). Its methods can be divided into the following high-level groups:
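- methods that concern the knowledge base as a whole (e.g. get_statistics, to_json);
- methods to access top-level resources (e.g. get_authors, get_works, get_resource_by_urn);
- “factory methods”, i.e. methods that create new entries (e.g. add_textelement_type, create_cts_urn, create_text_element).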
- class hucitlib.KnowledgeBase(config_file: Optional[str] = None)
KnowledgeBase is a class that allows for accessing a HuCit knowledge base in an object-oriented fashion. The abstraction layer it provides means that you can use, search and modify its content without having to worry about the underlying modelling of data in RDF.
- Parameters
config_file (str) – Path to the configuration file containing the parameters to connect to the triple store whose data will be accessible via the KnowledgeBase object.
- Returns
Description of returned object.
- Return type
None
Note
By default (i.e. when no configuration file is specified) a new KnowledgeBase instance will be created that reads data directly from the triple store hosted at Druid. NB: all methods that modify entries in the KB won’t work, as that triple store is read-only.
Example of usage:
>>> from hucitlib import KnowledgeBase
>>> kb = KnowledgeBase()
>>> homer = kb.get_resource_by_urn('urn:cts:greekLit:tlg0012')
>>> print(homer.rdfs_label.one)
- add_textelement_type(label: str, lang: str = 'en') -> Optional[surf.resource.Resource]
Adds a new TextElementType to the knowledge base if not yet present.
- Parameters
label (str) – Label of the text element type (e.g. "book").
lang (str) – Language of the label (defaults to 'en').
- Returns
Description of returned object.
- Return type
Optional[surf.resource.Resource]
# This works only when connected to a triple store
# to which you have write access.
>>> from hucitlib import KnowledgeBase
>>> kb = KnowledgeBase()
>>> element_type_obj = kb.add_textelement_type("book")
- add_textelement_types(types: List[str]) -> None
Adds each of the given text element types, unless it already exists.
- Parameters
types (List[str]) – a list of labels (e.g. ["book", "poem", "line"])
- Returns
Description of returned object.
- Return type
None
# This works only when connected to a triple store
# to which you have write access.
>>> from hucitlib import KnowledgeBase
>>> kb = KnowledgeBase()
>>> kb.add_textelement_types(["book", "line"])
- property author_names: Dict[str, str]
Returns a dictionary like this:
{
    "urn:cts:greekLit:tlg0012$$n1": "Homer",
    "urn:cts:greekLit:tlg0012$$n2": "Omero",
    ...
}
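Judging from the example above, each key appears to combine the author's CTS URN with a name identifier, separated by `$$`. Assuming that key format holds, the sketch below groups name variants by author. It runs on a hand-written dictionary rather than a live knowledge base, and `group_names_by_author` is a hypothetical helper, not part of hucitlib.

```python
from collections import defaultdict
from typing import Dict, List


def group_names_by_author(author_names: Dict[str, str]) -> Dict[str, List[str]]:
    """Group name variants by the URN part of each key (the text before '$$')."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for key, name in author_names.items():
        # Assumption: keys look like "<author URN>$$<name id>".
        urn, _, _name_id = key.partition("$$")
        grouped[urn].append(name)
    return dict(grouped)


# Sample data shaped like the dictionary shown above (not fetched from a KB).
sample = {
    "urn:cts:greekLit:tlg0012$$n1": "Homer",
    "urn:cts:greekLit:tlg0012$$n2": "Omero",
}
names = group_names_by_author(sample)
print(names)
```

With a real instance, the input would be `kb.author_names` instead of `sample`.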
- create_cts_urn(resource: surf.resource.Resource, urn_string: str) -> Optional[surf.resource.Resource]
Creates a CTS URN object and assigns it to a given resource.
- Parameters
resource (surf.resource.Resource) – KB entry to be identified by the CTS URN.
urn_string (str) – CTS URN identifier (e.g. urn:cts:greekLit:tlg0012)
- Returns
The newly created object, or None if it already existed.
- Return type
Optional[surf.resource.Resource]
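For readers unfamiliar with the identifiers used throughout this page, a CTS URN consists of colon-separated fields: the `urn:cts` prefix, a namespace, a work component and an optional passage reference. The following standalone sketch splits a URN string into those fields. `CtsUrn` and `parse_cts_urn` are illustrative names that do not exist in hucitlib, and the check is deliberately minimal rather than a full validation.

```python
from typing import NamedTuple, Optional


class CtsUrn(NamedTuple):
    namespace: str           # e.g. "greekLit"
    work_component: str      # e.g. "tlg0012" or "tlg0012.tlg001"
    passage: Optional[str]   # e.g. "1", or None when absent


def parse_cts_urn(urn_string: str) -> CtsUrn:
    """Split a CTS URN into namespace, work component and passage."""
    parts = urn_string.split(":")
    if len(parts) not in (4, 5) or parts[:2] != ["urn", "cts"]:
        raise ValueError(f"not a CTS URN: {urn_string!r}")
    # An empty trailing passage field is treated as "no passage".
    passage = (parts[4] or None) if len(parts) == 5 else None
    return CtsUrn(namespace=parts[2], work_component=parts[3], passage=passage)


author = parse_cts_urn("urn:cts:greekLit:tlg0012")
book_one = parse_cts_urn("urn:cts:greekLit:tlg0012.tlg001:1")
print(author)
print(book_one)
```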
- create_text_element(work: surf.resource.Resource, urn_string: str, element_type: surf.resource.Resource, source_uri: str = None)
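Creates a new text element for the given work.
- Parameters
work (surf.resource.Resource) – Work to which the text element belongs.
urn_string (str) – Text element’s URN.
element_type (surf.resource.Resource) – Text element type.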
- Returns
The newly created text element.
- Return type
type
>>> iliad = kb.get_resource_by_urn("urn:cts:greekLit:tlg0012.tlg001")
>>> etype_book = kb.get_textelement_type("book")
>>> ts = iliad.structure
>>> ts.create_element(
...     "urn:cts:greekLit:tlg0012.tlg001:1",
...     element_type=etype_book,
...     following_urn="urn:cts:greekLit:tlg0012.tlg001:2",
... )
- get_author_label(urn)
Get the label corresponding to the author identified by the CTS URN.
The following are tried in order:
- a lang=en label (if multiple labels exist in this language, pick the shortest);
- a lang=la label (if multiple labels exist in this language, pick the shortest);
- a lang=None label (if multiple labels exist in this language, pick the shortest).
Returns None if no name is found.
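The fallback order described above can be sketched in plain Python. This is an illustration of the selection rule only, not the library's implementation: `pick_label` is a hypothetical function, and labels are modelled as `(text, language)` tuples rather than RDF literals.

```python
from typing import List, Optional, Sequence, Tuple

# A label is modelled as (text, language tag); the tag is None when untagged.
Label = Tuple[str, Optional[str]]


def pick_label(
    labels: List[Label],
    languages: Sequence[Optional[str]] = ("en", "la", None),
) -> Optional[str]:
    """Return the shortest label in the first language that has any label."""
    for lang in languages:
        candidates = [text for text, tag in labels if tag == lang]
        if candidates:
            return min(candidates, key=len)
    return None


labels = [("Homerus", "la"), ("Homer of Chios", "en"), ("Homer", "en")]
print(pick_label(labels))
```

Languages outside the fallback list (Italian, say) are ignored, so a resource that only has such labels yields None.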
- get_authors() -> List[hucitlib.surfext.HucitAuthor]
Lists all authors contained in the knowledge base.
- Returns
A list of authors.
- Return type
List[HucitAuthor]
- get_opus_maximum_of(author_cts_urn)
Return the author’s opus maximum (None if not available).
Given the CTS URN of an author, this method returns their opus maximum. If none is available, it returns None.
- Parameters
author_cts_urn – the author’s CTS URN.
- Returns
an instance of surfext.HucitWork or None
- get_resource_by_urn(urn)
Fetch the resource corresponding to the input CTS URN.
Currently supports only HucitAuthor and HucitWork.
- Parameters
urn – the CTS URN of the resource to fetch
- Returns
either an instance of HucitAuthor or of HucitWork
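In the examples on this page, an author is addressed by a URN whose work component has a single part (`urn:cts:greekLit:tlg0012`), and a work by one with two dot-separated parts (`urn:cts:greekLit:tlg0012.tlg001`). Assuming that convention holds across the knowledge base, the sketch below predicts which kind of resource a URN refers to before it is passed to the method. `resource_kind` is a hypothetical helper, not part of hucitlib.

```python
def resource_kind(urn: str) -> str:
    """Guess whether a CTS URN points to an author, a work, or neither.

    Assumption: authors use a one-part work component, works a two-part one.
    """
    parts = urn.split(":")
    if len(parts) < 4 or parts[:2] != ["urn", "cts"]:
        raise ValueError(f"not a CTS URN: {urn!r}")
    has_passage = len(parts) > 4 and parts[4] != ""
    depth = len(parts[3].split("."))
    if has_passage:
        return "unsupported"
    if depth == 1:
        return "author"
    if depth == 2:
        return "work"
    return "unsupported"


for urn in ("urn:cts:greekLit:tlg0012", "urn:cts:greekLit:tlg0012.tlg001"):
    print(urn, "->", resource_kind(urn))
```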
- get_statistics() -> Dict[str, int]
Gather basic stats about the Knowledge Base and its contents.
Note
This method currently has some performance issues.
- Returns
a dictionary
- get_textelement_type(label: str) -> Optional[surf.resource.Resource]
Returns a TextElementType (instance of E55_Type) if present.
Note
The lowercased label is used to create the URI (http://purl.org/hucit/kb/types/{label}).
- Parameters
label (str) – Label of the text element type (e.g. "book").
- Returns
The matching TextElementType, or None if it is not present.
- Return type
Optional[surf.resource.Resource]
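The URI rule given in the note above (lowercase the label, then substitute it into the template) can be written out as a short standalone sketch. `textelement_type_uri` is a hypothetical helper; how the library treats labels containing spaces or other special characters is not documented here, so the sketch leaves them untouched.

```python
TYPE_URI_TEMPLATE = "http://purl.org/hucit/kb/types/{label}"


def textelement_type_uri(label: str) -> str:
    """Build the type URI from a label, lowercasing it first."""
    return TYPE_URI_TEMPLATE.format(label=label.lower())


uri = textelement_type_uri("Book")
print(uri)
```

Because of the lowercasing, "Book" and "book" resolve to the same URI.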
- get_textelement_types() -> List[surf.resource.Resource]
Returns all TextElementTypes defined in the knowledge base.
- Returns
A list of TextElementType resources.
- Return type
List[surf.resource.Resource]
- get_work_label(urn)
Get the label corresponding to the work identified by the input CTS URN.
The following are tried in order:
- a lang=en label;
- a lang=la label;
- a lang=None label.
Returns None if no title is found.
- get_works()
Lists all works contained in the knowledge base.
- Returns
a list of HucitWork instances.
- search(search_string: str) -> List[Tuple[str, surf.resource.Resource]]
Searches for a given string through the resources’ labels.
- Parameters
search_string (str) – String to look for in the resources’ labels.
- Returns
Description of returned object.
- Return type
List[Tuple[str, Resource]]
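The return type is a list of `(string, resource)` tuples. The documentation does not state what the string holds; the sketch below assumes it is the label that matched, and shows one way to collect the resources under each string. It uses stand-in values in place of real resources, and `group_hits` is a hypothetical helper.

```python
from typing import Any, Dict, List, Tuple


def group_hits(results: List[Tuple[str, Any]]) -> Dict[str, List[Any]]:
    """Collect resources under the string they were returned with."""
    grouped: Dict[str, List[Any]] = {}
    for text, resource in results:
        grouped.setdefault(text, []).append(resource)
    return grouped


# Stand-in data with the documented shape; plain strings replace resources.
fake_results = [("Iliad", "resource-1"), ("Ilias", "resource-2"), ("Iliad", "resource-3")]
hits = group_hits(fake_results)
print(hits)
```

With a live instance, `fake_results` would be replaced by the value returned from `kb.search(...)`.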
- to_json()
Serialises the content of the KnowledgeBase as JSON.
- Returns
TODO