Språkbanken Text is a department within Språkbanken.

Word Embeddings trained on English Wikipedia

Citation Information

Språkbanken Text (2024). Word Embeddings trained on English Wikipedia (updated: 2024-01-25). [Data set]. Språkbanken Text. https://doi.org/10.23695/z9cm-xc45

See https://zenodo.org/record/6542975

Caveats

Machine learning models trained on uncurated data inevitably learn biases, both hidden and obvious. As a result, the models shared here might exhibit sexism, racism, antisemitism, homophobia, and other unacceptable biases. I encourage anyone using these models to make sure such biases are actually removed before deploying them in production settings (see e.g. https://aclanthology.org/N19-1061/).
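
One common way to probe an embedding for this kind of bias is to project a word vector onto a direction defined by a contrasting word pair. The sketch below is illustrative only: the function names (cosine, bias_score) are hypothetical, and the three-dimensional vectors are invented toy values, not taken from the models on this page. With the real models, the vectors would instead be looked up in the downloaded files. A score near zero on one such probe does not show that a model is free of bias.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def bias_score(word_vec, vec_a, vec_b):
    """Cosine of word_vec with the direction vec_a - vec_b.

    Positive values lean towards vec_a, negative values towards vec_b.
    """
    direction = [a - b for a, b in zip(vec_a, vec_b)]
    return cosine(word_vec, direction)


# Toy vectors invented purely for illustration (NOT from the real models).
toy = {
    "he": [1.0, 0.0, 0.2],
    "she": [-1.0, 0.0, 0.2],
    "engineer": [0.6, 0.8, 0.1],
    "nurse": [-0.6, 0.8, 0.1],
}

for word in ("engineer", "nurse"):
    score = bias_score(toy[word], toy["he"], toy["she"])
    print(f"{word}: {score:+.3f}")
```

In this toy setup "engineer" scores positive and "nurse" negative along the he/she direction, which is the kind of pattern to look for when auditing a real model.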

References

  • Hengchen, Simon. (2022). Word2vec models trained on English Wikipedia [Data set]. Zenodo. https://doi.org/10.5281/zenodo.6542975

File Size Modified Licence
112.01 MB 2024-01-25 CC BY 4.0
3.75 GB 2024-01-25 CC BY 4.0
3.75 GB 2024-01-25 CC BY 4.0
28.04 MB 2024-01-25 CC BY 4.0
949.26 MB 2024-01-25 CC BY 4.0
949.26 MB 2024-01-25 CC BY 4.0

Type

  • Model

Language

English

Updated

2024-01-25

Contact

Språkbanken
sb-info@svenska.gu.se