Knowledge base
The class KnowledgeBase is the main access point to all resources described in the knowledge base (e.g. HucitAuthor, HucitWork). Its methods can be divided into the following high-level groups:
methods that concern the knowledge base as a whole;
methods to access top-level resources;
“factory methods”, i.e. methods that create new objects (i.e. entries).
- class hucitlib.KnowledgeBase(config_file: Optional[str] = None)
KnowledgeBase is a class that allows for accessing a HuCit knowledge base in an object-oriented fashion. The abstraction layer it provides means that you can use, search and modify its content without having to worry about the underlying modelling of data in RDF.
- Parameters
config_file (str) – Path to the configuration file containing the parameters to connect to the triple store whose data will be accessible via the KnowledgeBase object.
- Returns
Description of returned object.
- Return type
None
Note
By default (i.e. when no configuration file is specified) a new KnowledgeBase instance will be created that reads data directly from the triple store hosted at Druid. NB: all methods that modify entries in the KB won’t work, as that triple store is read-only.
Example of usage:
>>> from hucitlib import KnowledgeBase
>>> kb = KnowledgeBase()
>>> homer = kb.get_resource_by_urn('urn:cts:greekLit:tlg0012')
>>> print(homer.rdfs_label.one)
- add_textelement_type(label: str, lang: str = 'en') Optional[surf.resource.Resource]
Adds a new TextElementType to the Knowledge base if not yet present.
- Parameters
label (str) – Label of the new text element type (e.g. "book").
lang (str) – Language of the label (defaults to 'en').
- Returns
The resource representing the text element type, or None.
- Return type
Optional[surf.resource.Resource]
# this will work only when connecting to a triple store
# where you have write access
>>> from hucitlib import KnowledgeBase
>>> kb = KnowledgeBase()
>>> element_type_obj = kb.add_textelement_type("book")
- add_textelement_types(types: List[str]) None
Adds each of the given text element types, unless it already exists.
- Parameters
types (List[str]) – a list of strings (e.g. ["book", "poem", "line"])
- Returns
Nothing.
- Return type
None
# this will work only when connecting to a triple store
# where you have write access
>>> from hucitlib import KnowledgeBase
>>> kb = KnowledgeBase()
>>> kb.add_textelement_types(["book", "line"])
- property author_names: Dict[str, str]
Returns a dictionary like this:
{
    "urn:cts:greekLit:tlg0012$$n1": "Homer",
    "urn:cts:greekLit:tlg0012$$n2": "Omero",
    ...
}
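The keys in the example above appear to combine an author's CTS URN with a name index, separated by `$$`. As a minimal sketch under that assumption, the name variants can be regrouped per author. The helper `group_names_by_urn` is hypothetical and not part of hucitlib, and the sample data is copied from the example rather than fetched from a knowledge base.

```python
from collections import defaultdict
from typing import Dict, List


def group_names_by_urn(author_names: Dict[str, str]) -> Dict[str, List[str]]:
    """Group name variants by CTS URN.

    Assumes keys look like "<urn>$$n<index>", as in the example above.
    """
    grouped: Dict[str, List[str]] = defaultdict(list)
    for key, name in author_names.items():
        urn, _, _ = key.partition("$$")  # drop the "$$n1"-style suffix
        grouped[urn].append(name)
    return dict(grouped)


# Sample data taken from the documented example; with a live connection
# one would pass kb.author_names instead.
sample = {
    "urn:cts:greekLit:tlg0012$$n1": "Homer",
    "urn:cts:greekLit:tlg0012$$n2": "Omero",
}
names = group_names_by_urn(sample)
print(names)
```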
- create_cts_urn(resource: surf.resource.Resource, urn_string: str) Optional[surf.resource.Resource]
Creates a CTS URN object and assigns it to a given resource.
- Parameters
resource (surf.resource.Resource) – KB entry to be identified by the CTS URN.
urn_string (str) – CTS URN identifier (e.g. urn:cts:greekLit:tlg0012)
- Returns
The newly created object or None if it already existed.
- Return type
Optional[surf.resource.Resource]
- create_text_element(work: surf.resource.Resource, urn_string: str, element_type: surf.resource.Resource, source_uri: str = None)
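Creates a new text element for a given work.
- Parameters
work (surf.resource.Resource) – Work to which the text element belongs.
urn_string (str) – Text element’s URN.
element_type (surf.resource.Resource) – Text element type.
source_uri (str) – Optional source URI; defaults to None.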
- Returns
The newly created text element.
- Return type
type
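# this will work only when connecting to a triple store
# where you have write access
>>> from hucitlib import KnowledgeBase
>>> kb = KnowledgeBase()
>>> iliad = kb.get_resource_by_urn("urn:cts:greekLit:tlg0012.tlg001")
>>> etype_book = kb.get_textelement_type("book")
>>> ts = iliad.structure
>>> ts.create_element(
...     "urn:cts:greekLit:tlg0012.tlg001:1",
...     element_type=etype_book,
...     following_urn="urn:cts:greekLit:tlg0012.tlg001:2",
... )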
- get_author_label(urn)
Get the label corresponding to the author identified by the CTS URN.
The label is chosen as follows (if several labels exist in the same language, the shortest one is picked):
try to get a lang=en label;
try to get a lang=la label;
try to get a lang=None label.
Returns None if no name is found.
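The fallback order described above can be sketched in plain Python. This is an illustration of the selection rule only, not the library's implementation: the function `pick_label`, the `(text, language)` tuple representation and the sample labels are all assumptions made for the example.

```python
from typing import List, Optional, Tuple

# Each label is a (text, language) pair; language is None when untagged.
Label = Tuple[str, Optional[str]]


def pick_label(labels: List[Label]) -> Optional[str]:
    """Pick a label by language preference: "en", then "la", then untagged."""
    for lang in ("en", "la", None):
        candidates = [text for text, label_lang in labels if label_lang == lang]
        if candidates:
            # Several labels in the same language: keep the shortest one.
            return min(candidates, key=len)
    return None  # no label in any of the preferred languages


# Illustrative labels, not taken from the knowledge base.
labels = [("Homerus", "la"), ("Homer (poet)", "en"), ("Homer", "en")]
chosen = pick_label(labels)
print(chosen)
```

Labels in other languages (for example Italian) are ignored by this sketch, which matches the three-step order listed above.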
- get_authors() List[hucitlib.surfext.HucitAuthor]
Lists all authors contained in the knowledge base.
- Returns
A list of authors.
- Return type
List[HucitAuthor]
- get_opus_maximum_of(author_cts_urn)
Returns the opus maximum of the author identified by the given CTS URN, or None if it is not available.
- Parameters
author_cts_urn – the author’s CTS URN.
- Returns
an instance of surfext.HucitWork or None
- get_resource_by_urn(urn)
Fetch the resource corresponding to the input CTS URN.
Currently supports only HucitAuthor and HucitWork.
- Parameters
urn – the CTS URN of the resource to fetch
- Returns
either an instance of HucitAuthor or of HucitWork
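The URNs used in this page suggest how the two supported cases differ: an author is identified by a bare textgroup (`urn:cts:greekLit:tlg0012`), a work by textgroup plus work (`urn:cts:greekLit:tlg0012.tlg001`). The sketch below classifies a URN string on that basis. The helper `classify_cts_urn` is hypothetical, and whether the library dispatches this way internally is an assumption.

```python
def classify_cts_urn(urn: str) -> str:
    """Return "author" or "work" based on the shape of a CTS URN.

    Hypothetical helper: a bare textgroup (e.g. tlg0012) is treated as an
    author, textgroup.work (e.g. tlg0012.tlg001) as a work.
    """
    parts = urn.split(":")
    # Expect exactly: "urn", "cts", namespace, work component (no passage).
    if len(parts) != 4 or parts[:2] != ["urn", "cts"] or not parts[3]:
        raise ValueError(f"not an author or work CTS URN: {urn!r}")
    depth = len(parts[3].split("."))
    if depth == 1:
        return "author"
    if depth == 2:
        return "work"
    raise ValueError(f"unsupported CTS URN: {urn!r}")


print(classify_cts_urn("urn:cts:greekLit:tlg0012"))
print(classify_cts_urn("urn:cts:greekLit:tlg0012.tlg001"))
```

URNs carrying a passage reference, such as `urn:cts:greekLit:tlg0012.tlg001:1`, are rejected here because only authors and works are supported.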
- get_statistics() Dict[str, int]
Gather basic stats about the Knowledge Base and its contents.
Note
This method currently has some performance issues.
- Returns
a dictionary
- get_textelement_type(label: str) Optional[surf.resource.Resource]
Returns a TextElementType (instance of E55_Type) if present.
Note
label (lowercased) is used to create the URI (http://purl.org/hucit/kb/types/{label}).
- Parameters
label (str) – Label of the text element type (e.g. "book").
- Returns
The TextElementType if present, otherwise None.
- Return type
Optional[surf.resource.Resource]
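The URI pattern given in the note can be illustrated with a small helper. The function name `textelement_type_uri` is hypothetical. Only the base URI and the lowercasing come from the note above.

```python
BASE_URI = "http://purl.org/hucit/kb/types/"


def textelement_type_uri(label: str) -> str:
    """Build the type URI from a label, lowercasing it as the note describes."""
    return f"{BASE_URI}{label.lower()}"


uri = textelement_type_uri("Book")
print(uri)
```

Because the label is lowercased, "Book" and "book" resolve to the same URI.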
- get_textelement_types() List[surf.resource.Resource]
Returns all TextElementTypes defined in the knowledge base.
- Returns
A list of all TextElementTypes defined in the knowledge base.
- Return type
List[surf.resource.Resource]
- get_work_label(urn)
Get the label corresponding to the work identified by the input CTS URN.
The label is chosen as follows:
try to get a lang=en label;
try to get a lang=la label;
try to get a lang=None label.
Returns None if no title is found.
- get_works()
Lists all works contained in the knowledge base.
- Returns
a list of HucitWork instances.
- search(search_string: str) List[Tuple[str, surf.resource.Resource]]
Searches for a given string through the resources’ labels.
- Parameters
search_string (str) – String to look for in the resources’ labels.
- Returns
A list of tuples, each pairing a string with a matching resource.
- Return type
List[Tuple[str, Resource]]
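To make the shape of the return value concrete, here is a rough in-memory sketch of a label search. It makes several assumptions not stated in this page: that matching is a case-insensitive substring test, that the string in each tuple is the matched label, and that resources can be stood in for by their URNs. The function `search_labels` and the `index` dictionary are hypothetical, and the real method queries the triple store and may match differently.

```python
from typing import Dict, List, Tuple


def search_labels(labels: Dict[str, str], search_string: str) -> List[Tuple[str, str]]:
    """Return (label, resource) pairs whose label contains the search string.

    Hypothetical stand-in: `labels` maps a label to a resource identifier.
    """
    needle = search_string.casefold()
    return [
        (label, resource)
        for label, resource in labels.items()
        if needle in label.casefold()  # case-insensitive substring match
    ]


# Toy index; URNs stand in for surf.resource.Resource objects.
index = {
    "Homer": "urn:cts:greekLit:tlg0012",
    "Omero": "urn:cts:greekLit:tlg0012",
    "Iliad": "urn:cts:greekLit:tlg0012.tlg001",
}
hits = search_labels(index, "hom")
print(hits)
```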
- to_json()
Serialises the content of the KnowledgeBase as JSON.
- Returns
TODO