BasiLex-corpus - INT Taalmaterialen

The BasiLex corpus is an annotated collection of texts written for primary school-aged children. The corpus contains 13.5 million tokens, of which 11.5 million are words. About 40% of the tokens come from educational materials, 40% from children's literature and 20% from media.

For commercial use, see the commercial product page.

This product is free, but signing a license agreement is required. The download contains the license and further instructions for placing an order.

Product details

Operating system	Linux, Windows
Data format	XML (FoLiA)
Target audience	Primarily teachers, developers of teaching materials and tests, authors of children's literature, publishers and researchers.
Owner	Radboud University
Funder	NWO
Year	2015
Original publications	Tellings, A., Hulsbosch, M., Vermeer, A. & van den Bosch, A. (2015). BasiLex: an 11.5-million words corpus of Dutch texts written for children. Computational Linguistics in the Netherlands Journal 4, 191-208
Project	WIC-CorD: a Dutch Written Input for Children Corpus, POS-tagged and lemmatized, with a derived lexicon tagged for frequency and linguistic characteristics
Citation	Tellings, A. E. J. M. (2015), BasiLex-corpus (Version 1.0) [Data set]. Available at the Dutch Language Institute: http://hdl.handle.net/10032/tm-a2-n4
Languages	Dutch
Version	1.0
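
Because the corpus is distributed as FoLiA XML with part-of-speech tags and lemmas, a short reading example may help. The following is a minimal sketch, not the project's own tooling: it assumes the usual FoLiA layout in which each token is a `w` element with a `t` (text) child and `pos` and `lemma` children carrying a `class` attribute. The sample document, its tag values, and the helper names `read_tokens` and `count_words` are hypothetical, and the rule used to separate words from other tokens (at least one letter or digit) is an assumption rather than the definition used for the corpus statistics.

```python
import xml.etree.ElementTree as ET

# FoLiA elements live in this XML namespace.
FOLIA_NS = "{http://ilk.uvt.nl/folia}"

# Hypothetical miniature document; tag values are illustrative only.
SAMPLE = """<FoLiA xmlns="http://ilk.uvt.nl/folia" xml:id="example">
  <text xml:id="example.text">
    <s xml:id="example.s.1">
      <w xml:id="example.s.1.w.1"><t>De</t><pos class="LID"/><lemma class="de"/></w>
      <w xml:id="example.s.1.w.2"><t>kat</t><pos class="N"/><lemma class="kat"/></w>
      <w xml:id="example.s.1.w.3"><t>.</t><pos class="LET"/><lemma class="."/></w>
    </s>
  </text>
</FoLiA>"""


def read_tokens(xml_text):
    """Return a list of (text, pos, lemma) tuples, one per <w> element."""
    root = ET.fromstring(xml_text)
    tokens = []
    for w in root.iter(FOLIA_NS + "w"):
        text = w.find(FOLIA_NS + "t")
        pos = w.find(FOLIA_NS + "pos")
        lemma = w.find(FOLIA_NS + "lemma")
        tokens.append((
            text.text if text is not None else None,
            pos.get("class") if pos is not None else None,
            lemma.get("class") if lemma is not None else None,
        ))
    return tokens


def count_words(tokens):
    """Count tokens containing a letter or digit (assumed word criterion)."""
    return sum(
        1 for text, _pos, _lemma in tokens
        if text and any(ch.isalnum() for ch in text)
    )


if __name__ == "__main__":
    toks = read_tokens(SAMPLE)
    print(len(toks), "tokens,", count_words(toks), "words")
```

For real corpus files, `ET.parse(path).getroot()` can replace `ET.fromstring`; a dedicated FoLiA library may be preferable for anything beyond simple token extraction.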

Download details

File
BP_BasiLex-corpus_NC.zip

Number of files 1
Number of downloads 146
File size 52.32 KB
Date posted 17/07/2020
Last updated 15/12/2025
Version 1.0