Technologies and Supported Languages
In total, 61 existing core technologies are available through web services, grouped into the seven categories below (the number of tools in each category is given in parentheses). See the relevant topics for more information on these technologies.
• Language identifier (1)
• OCR engines (10)
• Tokenisers (10)
• Sentence boundary detectors (10)
• Part-of-speech (POS) taggers (10)
• Named-entity recognisers (10)
• Phrase chunkers (10)
The core technologies fully support all official South African languages, with the exception of English. English part-of-speech tagging, named-entity recognition, and phrase chunking are not included, as these tools are readily available elsewhere. The label "Undefined" applies only to language identification, where it is used when the language identifier (LID) cannot determine the language of a given document or line (see Language ID).