By Charly Moerth
Last update: 2024-06-24
This dictionary project is part of the isiZulu course taught at the Institute of African Studies of the University of Vienna. It is highly experimental in nature. It has also been used as a test-bed for the development of the lexicographic editor, the ˂TEI˃Enricher, and experiments in semi-automatic data acquisition/creation for digital dictionaries.
The focus in compiling the dictionary has been on contemporary urban isiZulu. Data collection has proceeded along two lines. Firstly, we have tried to provide everything needed for teaching and learning isiZulu, including the high-frequency lexical items used in textbooks and typically needed in teaching the language. Secondly, we have tried to collect data not available in other dictionaries, paying particular attention to neologisms and loan words. For the time being, we do not attempt to achieve anything like complete coverage. We do not include data from other dictionaries whose existence could not be verified in our corpora.
In compiling the dictionary, we have used a corpus of digital isiZulu texts put together from various Internet resources, in particular online newspapers. A few contemporary literary works have also been used. The TEI-encoded data has been made searchable via NoSke. In October 2021, the corpus contained 12,727,434 tokens.
As of 3.6.2024, the published dictionary contained roughly 15,417 lemmata (single and multi-word units) and 13,280 sample sentences.
IsiZulu is a Southern Niger-Congo language and belongs to the Nguni branch of languages spoken in southern Africa. The amaZulu people (about 12 million L1 speakers) primarily inhabit the province of KwaZulu-Natal in the Republic of South Africa. Spoken at home by approximately a quarter of the country's population, isiZulu is the most widely used home language in South Africa. It is one of South Africa's 11 official languages.
Through the query interface, you can search for words or groups of words in the dictionary. To run a query, enter a word and press the Enter key on your keyboard. Results matching your query are displayed below the input field.
Note that all queries are case-sensitive.
When you start typing in the input field, the preview option shows a list of tokens that start with the characters you have entered so far.
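As a rough illustration of the preview behaviour described above, the following Python sketch filters a token list by prefix. The function name `preview`, the `limit` parameter, and the sample token list are assumptions made for this example only. The sketch also assumes that the preview is case-sensitive like the queries themselves; the actual interface may work differently.

```python
def preview(prefix, tokens, limit=10):
    """Return up to `limit` tokens that start with `prefix` (case-sensitive)."""
    matches = sorted(t for t in tokens if t.startswith(prefix))
    return matches[:limit]


# Illustrative token list; "Hamba" is included only to show case sensitivity.
tokens = ["hamba", "uhambo", "Hamba", "isiZulu"]

print(preview("ham", tokens))
print(preview("Ham", tokens))
```

Because the comparison is case-sensitive, typing `ham` and typing `Ham` produce different suggestion lists.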
It is possible to search in particular fields of the dictionary. Wildcards are applied at the token level.
Query string | Explanation
hamba | Find the string hamba.
The interface also supports a simple query language. The field names can be found in the field selector below the input field.
lem=uhambo | Find the lemma with the string uhambo.
pos=conjunction | Find all conjunctions.
dom=kinship term | Find all entries with a domain label kinship term.
dom=botany | Find all entries with a domain label botany.
It is possible to use wildcards in the queries.
ham.* | All entries containing the string ham.
infl=hambo.* | Find all inflected forms (e.g. plurals) containing the string hambo.
^ham.* | The circumflex (^) anchors the term at the beginning of a token, so the query matches only tokens that begin with ham, such as hamba.
.*yu$ | The dollar sign ($) anchors the string at the end of the token.
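The wildcard patterns above resemble regular expressions. The following Python sketch shows how such patterns behave when applied to individual tokens, assuming that the interface follows standard regular-expression semantics; that is an assumption, not something stated here. The helper `matching_tokens` and the token list are made up for the example, and `abcyu` is a placeholder string rather than a real word.

```python
import re


def matching_tokens(pattern, tokens):
    """Return the tokens in which the pattern is found (case-sensitive)."""
    compiled = re.compile(pattern)
    return [t for t in tokens if compiled.search(t)]


# Illustrative tokens; "abcyu" is a placeholder, not a dictionary entry.
tokens = ["hamba", "uhambo", "Hamba", "abcyu"]

print(matching_tokens("ham.*", tokens))   # unanchored: "ham" anywhere in the token
print(matching_tokens("^ham.*", tokens))  # anchored at the beginning of the token
print(matching_tokens(".*yu$", tokens))   # anchored at the end of the token
```

Under these assumptions, the unanchored pattern finds both `hamba` and `uhambo`, the anchored pattern finds only `hamba`, and neither finds the capitalised `Hamba`.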
You can also combine queries.
dom=zoology+lem=^ub.* | All zoology entries whose lemma starts with the string ub.
[pos=adjective]+[lem=an.*] | All adjectives whose lemma contains the string an.
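To make the combined queries more concrete, here is a minimal Python sketch of one way such a query string could be split into field conditions and checked against an entry. It is not the interface's actual implementation. The functions `parse_query` and `entry_matches`, the dictionary-shaped entries, and the sample data are all hypothetical, and the sketch assumes that `+` means that every condition must hold.

```python
import re


def parse_query(query):
    """Split a combined query into (field, pattern) pairs."""
    conditions = []
    for part in query.split("+"):
        part = part.strip()
        # Accept both notations shown above: lem=x and [lem=x].
        if part.startswith("[") and part.endswith("]"):
            part = part[1:-1]
        field, sep, pattern = part.partition("=")
        if not sep:
            # No field given: treat the whole part as a pattern on any field.
            field, pattern = None, part
        conditions.append((field, pattern))
    return conditions


def entry_matches(entry, conditions):
    """Return True if every condition is found in the entry (AND is assumed)."""
    for field, pattern in conditions:
        values = entry.values() if field is None else [entry.get(field, "")]
        if not any(re.search(pattern, value) for value in values):
            return False
    return True


# Made-up sample entries for illustration; not taken from the dictionary.
entries = [
    {"lem": "ubhejane", "pos": "noun", "dom": "zoology"},
    {"lem": "uhambo", "pos": "noun", "dom": ""},
]

conditions = parse_query("dom=zoology+lem=^ub.*")
print(conditions)
print([e["lem"] for e in entries if entry_matches(e, conditions)])
```

Splitting on `+` keeps the sketch short, but it would break a pattern that uses `+` as a quantifier; a real parser would need to handle that case.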
Abbreviations used in the dictionary:
imp. | imperative
inf. | infinitive
loc. | locative
NP | nominal phrase
pl. | plural
sg. | singular
VP | verbal phrase (used for any phrase containing a verb)