DReaM

Standard reference

Shafqat Virk, Harald Hammarström, Markus Forsberg, Søren Wichmann (2020): The DReaM Corpus: A Multilingual Annotated Corpus of Grammars for the World’s Languages. In Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020), Marseille, 11–16 May 2020. Editors: Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis.

Data citation

Språkbanken (2020). DReaM (updated: 2020-11-09). [Data set]. Enriched and distributed by Språkbanken. https://doi.org/10.23695/jeyf-1m55

A multilingual corpus of linguistic descriptions of the world's natural languages.

This resource contains an openly available, multilingual, digitized collection of thousands of documents describing the natural languages of the world. The corpus is annotated with various meta-, word-, and text-level attributes. More details about the data and annotations can be found in the standard reference.

There is also a password-protected part of the corpus (the restricted datasets listed under "Datasets in this collection").

Accessible through

Access	Platform	Licence
https://spraakbanken.gu.se/korp/?mode=dream		CC-BY-4.0

Download

File	Size	Modified	Licence
dream.zip.bz2 corpus (XML)	188.83 MB	2020-11-11	CC-BY-4.0
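
The file name suggests a zip archive wrapped in bzip2 compression, but the page does not say so explicitly. The following is a minimal sketch under that assumption, using only the Python standard library; the helper names `unpack_bz2` and `list_members` and the local path are hypothetical, and the layout of the XML inside the archive is not assumed.

```python
import bz2
import shutil
import zipfile
from pathlib import Path


def unpack_bz2(src: Path, dest: Path) -> Path:
    """Stream-decompress a bzip2 file to dest without loading it into memory."""
    with bz2.open(src, "rb") as fin, open(dest, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dest


def list_members(path: Path) -> list[str]:
    """Return member names if path is a zip archive, else an empty list."""
    if not zipfile.is_zipfile(path):
        return []
    with zipfile.ZipFile(path) as zf:
        return zf.namelist()


if __name__ == "__main__":
    # Assumed local path of the downloaded file.
    src = Path("dream.zip.bz2")
    if src.exists():
        # Dropping the last suffix gives "dream.zip".
        inner = unpack_bz2(src, src.with_suffix(""))
        print(list_members(inner))
```

If `list_members` returns an empty list, the decompressed file is not a zip archive and may be the XML itself; inspecting its first bytes is a reasonable next step.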

Datasets in this collection

Number of hits: 13

Resource	Type	Language	Access
DReaM-de-open German open part of the dataset from the project DReaM: The Dictionary/Grammar Reading Machine.	Corpus	German	Word statistics: stats_DREAM-DE-OPEN.txt.zip 2025-04-22 – 14.28 MB – CC-BY-4.0
DReaM-de-restricted German restricted part of the dataset from the project DReaM: The Dictionary/Grammar Reading Machine.	Corpus	German	Word statistics: stats_DREAM-DE-RESTRICTED.txt.zip 2025-04-22 – 24.3 MB – CC-BY-4.0
DReaM-en-open English open part of the dataset from the project DReaM: The Dictionary/Grammar Reading Machine.	Corpus	English	Word statistics: stats_DREAM-EN-OPEN.txt.zip 2025-04-22 – 14.64 MB – CC-BY-4.0
DReaM-en-restricted English restricted part of the dataset from the project DReaM: The Dictionary/Grammar Reading Machine.	Corpus	English	Word statistics: stats_DREAM-EN-RESTRICTED.txt.zip 2025-04-22 – 182.85 MB – CC-BY-4.0
DReaM-es-open Spanish open part of the dataset from the project DReaM: The Dictionary/Grammar Reading Machine.	Corpus	Spanish	Word statistics: stats_DREAM-ES-OPEN.txt.zip 2025-04-22 – 9.23 MB – CC-BY-4.0
DReaM-es-restricted Spanish restricted part of the dataset from the project DReaM: The Dictionary/Grammar Reading Machine.	Corpus	Spanish	Word statistics: stats_DREAM-ES-RESTRICTED.txt.zip 2025-04-22 – 20.32 MB – CC-BY-4.0
DReaM-fr-open French open part of the dataset from the project DReaM: The Dictionary/Grammar Reading Machine.	Corpus	French	Word statistics: stats_DREAM-FR-OPEN.txt.zip 2025-04-22 – 8.98 MB – CC-BY-4.0
DReaM-fr-restricted French restricted part of the dataset from the project DReaM: The Dictionary/Grammar Reading Machine.	Corpus	French	Word statistics: stats_DREAM-FR-RESTRICTED.txt.zip 2025-04-22 – 46.58 MB – CC-BY-4.0
DReaM-it-open Italian open part of the dataset from the project DReaM: The Dictionary/Grammar Reading Machine.	Corpus	Italian	Word statistics: stats_DREAM-IT-OPEN.txt.zip 2025-04-22 – 2.06 MB – CC-BY-4.0
DReaM-it-restricted Italian restricted part of the dataset from the project DReaM: The Dictionary/Grammar Reading Machine.	Corpus	Italian	Word statistics: stats_DREAM-IT-RESTRICTED.txt.zip 2025-04-22 – 4.53 MB – CC-BY-4.0
DReaM-nl-open Dutch open part of the dataset from the project DReaM: The Dictionary/Grammar Reading Machine.	Corpus	Dutch	Word statistics: stats_DREAM-NL-OPEN.txt.zip 2025-04-22 – 2.03 MB – CC-BY-4.0
DReaM-nl-restricted Dutch restricted part of the dataset from the project DReaM: The Dictionary/Grammar Reading Machine.	Corpus	Dutch	Word statistics: stats_DREAM-NL-RESTRICTED.txt.zip 2025-04-22 – 3.39 MB – CC-BY-4.0
DReaM-ru-open Russian open part of the dataset from the project DReaM: The Dictionary/Grammar Reading Machine.	Corpus	Russian	Word statistics: stats_DREAM-RU-OPEN.txt.zip 2025-04-22 – 1.31 MB – CC-BY-4.0
