Using ReN with ANNIS
Please note that this is only a short introduction. The German version is more comprehensive.
User interface
The user interface of ANNIS is divided into three parts:
- Input field for the queries
- Corpus list / search options
- Content area
The corpus names appear in the corpus list. For ReN, the name “ReN” is followed by an underscore and the publication date of the relevant corpus version (e.g. ReN_2016-08-23 for the first version that was released). Please note that the list also contains further corpora of the HZSK. Only corpora whose names begin with the abbreviation “ReN” belong to ReN. In the corpus list there are two options to choose from:
- Corpus information: shows information about the relevant corpus; the metadata about the particular texts can be found here.
- Text list: displays a list of the texts included in the corpus; here users can call up the full text view.
To search a particular corpus, that corpus must first be selected in this list.
Queries
Simple queries
Among other things, it is possible to search for a certain part of speech, i.e. for a specific PoS-tag. To find, for example, all proper names in the corpus (PoS-tag: NE), the query is written as follows:
pos="NE"
To limit the query to a single text of the corpus, e.g. the text “Griseldis”, a metadata condition is added, here the short title of the relevant text (doc):
pos="NE" & meta::doc="Griseldis"
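The two queries above follow a simple pattern: a search term of the form key="value", optionally followed by a metadata condition. As a minimal sketch, the pattern could be assembled in Python as shown below. The function names aql_term and restrict_to_doc are hypothetical helpers for illustration and are not part of ANNIS; the sketch also assumes that the values contain no double quotes.

```python
def aql_term(key, value):
    """Return a single AQL search term such as pos="NE".

    Assumption: value contains no double quotes (no escaping is done).
    """
    return f'{key}="{value}"'


def restrict_to_doc(query, doc):
    """Append a metadata condition that limits the query to one text."""
    return f'{query} & meta::doc="{doc}"'


# All proper names (PoS-tag NE), limited to the text "Griseldis".
query = restrict_to_doc(aql_term("pos", "NE"), "Griseldis")
print(query)  # pos="NE" & meta::doc="Griseldis"
```

The resulting string can be pasted into the input field for the queries.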
Combination of multiple criteria
For a more detailed search it is helpful to choose specific annotation levels or to combine information from different annotation levels. For example, a user who wants to search for the preposition “in” while excluding the homonymous adverb “in” may use the lemma with word-sense disambiguation (lemma_wsd), which distinguishes homonyms by numbering them:
lemma_wsd="in²"
But it is also possible to connect the lemma in a simplified form (“in”) with the PoS-tag (APPR):
lemma="in" & pos="APPR" & #1 _=_ #2
The phrase #1 _=_ #2 stands for the relation between the two combined terms: the operator _=_ (built around the equality sign) means that the search criteria in the first term (#1, which corresponds to lemma="in") and in the second term (#2, which corresponds to pos="APPR") refer to the same text unit.
Lemmas can be connected not only with PoS-tags but also with morphological information. To find evidence for e.g. the preposition “in” with the accusative case, the following query can be used:
lemma_wsd="in²" & morph="Akk" & #1 _=_ #2
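The combined queries in this section share one structure: a list of search terms, followed by _=_ relations that tie the numbered terms to the same text unit. The following Python sketch generates that structure for any number of terms. The function name same_span_query is a hypothetical helper and not part of ANNIS; chaining consecutive terms (#1 _=_ #2, #2 _=_ #3, ...) is one possible way to express that all terms cover the same unit, and the values are assumed to contain no double quotes.

```python
def same_span_query(*terms):
    """Build an AQL query whose terms all refer to the same text unit.

    terms: (key, value) pairs, e.g. ("lemma", "in").
    Consecutive terms are linked with the _=_ operator.
    Assumption: values contain no double quotes (no escaping is done).
    """
    parts = [f'{key}="{value}"' for key, value in terms]
    # AQL numbers the search terms from 1 in the order they appear.
    relations = [f"#{i} _=_ #{i + 1}" for i in range(1, len(terms))]
    return " & ".join(parts + relations)


# Simplified lemma combined with the PoS-tag.
print(same_span_query(("lemma", "in"), ("pos", "APPR")))

# Disambiguated lemma combined with morphological information.
print(same_span_query(("lemma_wsd", "in²"), ("morph", "Akk")))
```

The two calls reproduce the two combined queries shown in this section.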
Further information
- ANNIS User Guide: http://corpus-tools.org/annis/resources/ANNIS_User_Guide_3.4.3.pdf