English and Sepedi parallel corpora, aligned at the sentence level through a combination of automatic and manual alignment techniques. The parallel corpora were obtained from the South African government domain.
Product details
| Number of words | Text: 44.981 sentences (tokens) |
| Annotations | UTF8, Aligned, Sentence segmented |
| Data format | txt |
| Documentation | Readme available with download |
| Owner | North-West University, Centre for Text Technology (CTexT) |
| Funder | Department of Arts and Culture |
| License type | Creative Commons Attribution-NonCommercial-ShareAlike 2.5 South Africa |
| Commissioned by | Department of Arts and Culture |
| Languages | English, Sesotho sa Leboa (Sepedi) |
| Version | 1.0 |
Download details
| File | |
|---|---|
| 20150804_Autshumato_English-Sesotho_sa_Leboa_Parallel_Corpora_1.0.zip |
- Number of files 1
- Number of downloads 1
- File size 2.37 MB
- Date posted 02/09/2020
- Last updated 22/07/2021
- Version