Knowledge base
The class KnowledgeBase is the main access point to all resources
described in the knowledge base (e.g. HucitAuthor,
HucitWork, etc.). Its methods can be divided into the following high-level groups:
methods that concern the knowledge base as a whole:
methods to access top-level resources:
“factory methods”, i.e. methods that create new objects (i.e. entries):
-
class hucitlib.KnowledgeBase(config_file: str = None)
KnowledgeBase is a class that allows for accessing a HuCit knowledge base in an object-oriented fashion. The abstraction layer it provides means that you can use, search and modify its content without having to worry about the underlying modelling of data in RDF.
Parameters: config_file (str) – Path to the configuration file containing the parameters to connect to the triple store whose data will be accessible via the KnowledgeBase object.
Note
By default (i.e. when no configuration file is specified) a new KnowledgeBase instance will be created that reads data directly from the triple store hosted at Druid. NB: methods that modify entries in the KB won’t work in this case, as that triple store is read-only.
Example of usage:
>>> from hucitlib import KnowledgeBase
>>> kb = KnowledgeBase()
>>> homer = kb.get_resource_by_urn('urn:cts:greekLit:tlg0012')
>>> print(homer.rdfs_label.one)
-
add_textelement_type(label: str, lang: str = 'en') → Optional[surf.resource.Resource]
Adds a new TextElementType to the knowledge base if not yet present.
Parameters: - label (str) – Label of the text element type (e.g. "book").
- lang (str) – Language of the label (defaults to 'en').
Returns: Description of returned object.
Return type: Optional[surf.resource.Resource]
# this will work only when connecting to a triple store
# where you have write access
>>> from hucitlib import KnowledgeBase
>>> kb = KnowledgeBase()
>>> element_type_obj = kb.add_textelement_type("book")
-
add_textelement_types(types: List[str]) → None
Adds each of the given text element types if it doesn’t exist yet.
Parameters: types (List[str]) – a list of strings (e.g. ["book", "poem", "line"])
Returns: Description of returned object.
Return type: None
# this will work only when connecting to a triple store
# where you have write access
>>> from hucitlib import KnowledgeBase
>>> kb = KnowledgeBase()
>>> kb.add_textelement_types(["book", "line"])
Returns a dictionary like this:
{
    "urn:cts:greekLit:tlg0012$$n1": "Homer",
    "urn:cts:greekLit:tlg0012$$n2": "Omero",
    ...
}
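The keys in the dictionary above appear to combine a CTS URN with a name counter, separated by `$$`. The following is a minimal sketch of how such a dictionary could be regrouped by URN, assuming that key layout (it is inferred only from the example shown here). `group_names_by_urn` is a hypothetical helper, not part of the library.

```python
from collections import defaultdict
from typing import Dict, List


def group_names_by_urn(names: Dict[str, str]) -> Dict[str, List[str]]:
    """Group name variants under the CTS URN they belong to.

    Assumes keys of the form "<urn>$$n<counter>", as in the example above.
    """
    grouped: Dict[str, List[str]] = defaultdict(list)
    for key, name in names.items():
        # Keep only the part before "$$", i.e. the CTS URN.
        urn, _, _ = key.partition("$$")
        grouped[urn].append(name)
    return dict(grouped)


sample = {
    "urn:cts:greekLit:tlg0012$$n1": "Homer",
    "urn:cts:greekLit:tlg0012$$n2": "Omero",
}
print(group_names_by_urn(sample))
# {'urn:cts:greekLit:tlg0012': ['Homer', 'Omero']}
```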
-
create_cts_urn(resource: surf.resource.Resource, urn_string: str) → Optional[surf.resource.Resource]
Creates a CTS URN object and assigns it to a given resource.
Parameters: - resource (surf.resource.Resource) – KB entry to be identified by the CTS URN.
- urn_string (str) – CTS URN identifier (e.g. urn:cts:greekLit:tlg0012)
Returns: The newly created object or None if it already existed.
Return type: Optional[surf.resource.Resource]
-
create_text_element(work: surf.resource.Resource, urn_string: str, element_type: surf.resource.Resource, source_uri: str = None)
Creates a new text element.
Parameters: - urn_string (str) – Text element’s URN.
- element_type (surf.resource.Resource) – Text element type.
Returns: The newly created text element.
Return type: type
>>> iliad = kb.get_resource_by_urn("urn:cts:greekLit:tlg0012.tlg001")
>>> etype_book = kb.get_textelement_type("book")
>>> ts = iliad.structure
>>> ts.create_element(
...     "urn:cts:greekLit:tlg0012.tlg001:1",
...     element_type=etype_book,
...     following_urn="urn:cts:greekLit:tlg0012.tlg001:2"
... )
Get the label corresponding to the author identified by the CTS URN.
- try to get a lang=en label (if multiple labels in this lang exist pick the shortest)
- try to get a lang=la label (if multiple labels in this lang exist pick the shortest)
- try to get a lang=None label (if multiple labels in this lang exist pick the shortest)
returns None if no name is found
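The fallback order described above can be sketched in plain Python. This is only an illustration of the selection rule, not the library's implementation: labels are modelled here as `(language tag, text)` tuples, and `pick_label` is a hypothetical name.

```python
from typing import List, Optional, Tuple

# (language tag or None, label text): a stand-in for the labels stored in the KB
Label = Tuple[Optional[str], str]


def pick_label(labels: List[Label]) -> Optional[str]:
    """Try lang=en, then lang=la, then untagged labels; shortest label wins."""
    for lang in ("en", "la", None):
        candidates = [text for tag, text in labels if tag == lang]
        if candidates:
            # Several labels in the same language: keep the shortest one.
            return min(candidates, key=len)
    # No label in any of the accepted languages.
    return None


labels = [("la", "Homerus"), ("en", "Homer of Chios"), ("en", "Homer"), (None, "Omero")]
print(pick_label(labels))                                # Homer
print(pick_label([("la", "Homerus"), (None, "Omero")]))  # Homerus
print(pick_label([("it", "Omero")]))                     # None
```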
Lists all authors contained in the knowledge base.
Returns: A list of authors.
Return type: List[HucitAuthor]
-
get_opus_maximum_of(author_cts_urn)
Return the author’s opus maximum (None otherwise).
Given the CTS URN of an author, this method returns the author’s opus maximum. If not available, it returns None.
Parameters: author_cts_urn – the author’s CTS URN.
Returns: an instance of surfext.HucitWork or None
-
get_resource_by_urn(urn)
Fetch the resource corresponding to the input CTS URN.
Currently supports only HucitAuthor and HucitWork.
Parameters: urn – the CTS URN of the resource to fetch
Returns: either an instance of HucitAuthor or of HucitWork
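In the examples on this page, an author is addressed by a textgroup-level URN (`urn:cts:greekLit:tlg0012`) and a work by a work-level URN (`urn:cts:greekLit:tlg0012.tlg001`). The sketch below shows how a caller might tell the two apart before fetching. It assumes that mapping holds in general, and both `parse_cts_urn` and `kind_of` are hypothetical helpers that do not belong to the library. Version and exemplar identifiers are ignored.

```python
from typing import NamedTuple, Optional


class CtsUrn(NamedTuple):
    namespace: str
    textgroup: str
    work: Optional[str]
    passage: Optional[str]


def parse_cts_urn(urn: str) -> CtsUrn:
    """Split a CTS URN into namespace, textgroup, work and passage."""
    parts = urn.split(":")
    if len(parts) < 4 or parts[0] != "urn" or parts[1] != "cts":
        raise ValueError(f"not a CTS URN: {urn!r}")
    namespace = parts[2]
    # The work component is dot-separated: textgroup[.work[.version...]]
    ids = parts[3].split(".")
    textgroup = ids[0]
    work = ids[1] if len(ids) > 1 else None
    # An optional fifth component holds the passage reference.
    passage = (parts[4] or None) if len(parts) > 4 else None
    return CtsUrn(namespace, textgroup, work, passage)


def kind_of(urn: str) -> str:
    """Guess whether a URN points to an author or a work (assumption, see above)."""
    return "work" if parse_cts_urn(urn).work else "author"


print(kind_of("urn:cts:greekLit:tlg0012"))         # author
print(kind_of("urn:cts:greekLit:tlg0012.tlg001"))  # work
```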
-
get_statistics() → Dict[str, int]
Gather basic stats about the Knowledge Base and its contents.
Note
This method currently has some performance issues.
Returns: a dictionary
-
get_textelement_type(label: str) → Optional[surf.resource.Resource]
Returns a TextElementType (instance of E55_Type) if present.
Note
label (lowercased) is used to create the URI (http://purl.org/hucit/kb/types/{label}).
Parameters: label (str) – Description of parameter label.
Returns: Description of returned object.
Return type: surf.resource.Resource
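The URI pattern in the note can be illustrated with a small sketch. `textelement_type_uri` is a hypothetical helper written for this example. It only lowercases the label, as the note states. Whether the library escapes spaces or other characters is not documented here, so no escaping is attempted.

```python
BASE_TYPE_URI = "http://purl.org/hucit/kb/types/"


def textelement_type_uri(label: str) -> str:
    """Build the type URI from a label by lowercasing it."""
    return BASE_TYPE_URI + label.lower()


print(textelement_type_uri("Book"))
# http://purl.org/hucit/kb/types/book
```

Since the label is lowercased, "Book" and "book" would resolve to the same URI under this scheme.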
-
get_textelement_types() → List[surf.resource.Resource]
Returns all TextElementTypes defined in the knowledge base.
Returns: Description of returned object.
Return type: List[surf.resource.Resource]
-
get_work_label(urn)
Get the label corresponding to the work identified by the input CTS URN.
- try to get a lang=en label
- try to get a lang=la label
- try to get a lang=None label
returns None if no title is found
-
get_works()
Return all works contained in the knowledge base.
Returns: a list of HucitWork instances.
-
search(search_string: str) → List[Tuple[str, surf.resource.Resource]]
Searches for a given string through the resources’ labels.
Parameters: search_string (str) – Description of parameter search_string.
Returns: Description of returned object.
Return type: List[Tuple[str, Resource]]
-
to_json()
Serialises the content of the KnowledgeBase as JSON.
Returns: TODO
-