English and isiZulu parallel corpora aligned on sentence level through a combination of automatic and manual alignment techniques. The parallel corpora were obtained from the SA government domain.
Product details
Word count | Text: 35,490 sentences (tokens) |
Annotations | UTF8, Aligned, Sentence segmented |
Data format | txt |
Documentation | Readme available with download |
Owner | North-West University, Centre for Text Technology (CTexT) |
Funder | Department of Arts and Culture |
License type | Creative Commons Attribution-NonCommercial-ShareAlike 2.5 South Africa |
Commissioned by | Department of Arts and Culture |
Languages | English, isiZulu |
Version | 1.0 |
Download details
File | |
---|---|
20150804_Autshumato_English-isuZulu_Parallel_Corpora_1.0.zip |
- Number of files 1
- Number of downloads 1
- File size 1.89 MB
- Date posted 02/09/2020
- Last updated 22/07/2021
- Version