Pretrained word embeddings for English Wikipedia
Data citation
Språkbanken Text (2024). English word embeddings (updated: 2024-01-25). [Data set]. Språkbanken Text. https://doi.org/10.23695/z9cm-xc45
Additional ways to cite the dataset.
Caveats
Machine learning models trained on uncurated data inevitably learn hidden or obvious biases. As a result, the models shared here might exhibit sexism, racism, antisemitism, homophobia, and other unacceptable biases. I encourage anyone using these models to make sure such biases are actually removed before deploying them in production settings (see, e.g., https://aclanthology.org/N19-1061/).
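As a rough illustration of how such a bias might be probed before use, the sketch below compares how strongly a target word associates with two attribute words, using cosine similarity. The function names and the toy three-dimensional vectors are invented for illustration and are not taken from the shared models; with the real models, the vectors would come from the downloaded files instead of a hand-written dictionary. A gap near zero for a handful of words does not show that a model is free of bias.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def association_gap(vectors, target, attr_a, attr_b):
    """Return cos(target, attr_a) - cos(target, attr_b).

    A positive value means the target sits closer to attr_a,
    a negative value means it sits closer to attr_b.
    """
    t = vectors[target]
    return cosine(t, vectors[attr_a]) - cosine(t, vectors[attr_b])


# Toy vectors, made up for this example (NOT from the shared models).
toy_vectors = {
    "he": [1.0, 0.0, 0.2],
    "she": [0.0, 1.0, 0.2],
    "engineer": [0.8, 0.3, 0.5],
    "nurse": [0.2, 0.9, 0.4],
    "table": [0.1, 0.1, 1.0],
}

for word in ("engineer", "nurse", "table"):
    gap = association_gap(toy_vectors, word, "he", "she")
    print(f"{word}: {gap:+.3f}")
```

In this toy setup, "engineer" leans toward "he", "nurse" leans toward "she", and "table" is neutral. Running the same comparison over occupation or identity terms in a real model gives a first, coarse signal of which associations deserve closer inspection.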
References
Hengchen, Simon. (2022). Word2vec models trained on English Wikipedia [Data set]. Zenodo. https://doi.org/10.5281/zenodo.6542975
Download
| File | Size | Modified | License |
|---|---|---|---|
| | 112.01 MB | 2024-01-25 | CC-BY-4.0 |
| | 3.75 GB | 2024-01-25 | CC-BY-4.0 |
| | 3.75 GB | 2024-01-25 | CC-BY-4.0 |
| | 28.04 MB | 2024-01-25 | CC-BY-4.0 |
| | 949.26 MB | 2024-01-25 | CC-BY-4.0 |
| | 949.26 MB | 2024-01-25 | CC-BY-4.0 |