Academic Journal

W2C – Web to Corpus – Corpora

Bibliographic Details
Title: W2C – Web to Corpus – Corpora
Authors: Majliš, Martin
Publication Information: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Publication Year: 2013
Collection: OLAC: Open Language Archives Community
Subject Terms: multilingual corpora
Description: A set of corpora for 120 languages, automatically collected from Wikipedia and the web. Collected using the W2C toolset: http://hdl.handle.net/11858/00-097C-0000-0022-60D6-1Test
Document Type: text
Language: Afrikaans
unknown
Albanian
Amharic
Arabic
Aragonese
Egyptian (Ancient)
Asturian; Bable; Leonese; Asturleonese
Azerbaijani
Belarusian
Bengali
Bosnian
Breton
Buginese
Bulgarian
Catalan; Valencian
Cebuano
Czech
Chuvash
Corsican
Welsh
Danish
German
Greek, Modern (1453-)
English
Esperanto
Estonian
Basque
Faroese
Persian
Finnish
French
Western Frisian
Chinese
Gaelic; Scottish Gaelic
Irish
Galician
Gujarati
Haitian; Haitian Creole
Hebrew
Hindi
Croatian
Sorbian languages
Hungarian
Armenian
Interlingua (International Auxiliary Language Association)
Indonesian
Icelandic
Italian
Javanese
Japanese
Kannada
Georgian
Korean
Kurdish
Latin
Latvian
Lithuanian
Malayalam
Marathi
Macedonian
Malagasy
Mongolian
Maori
Malay
Burmese
Low German; Low Saxon; German, Low; Saxon, Low
Nepali
Nepal Bhasa; Newari
Dutch; Flemish
Norwegian Nynorsk; Nynorsk, Norwegian
Norwegian
Occitan (post 1500)
Polish
Portuguese
Quechua
Romanian; Moldavian; Moldovan
Russian
Yakut
Sicilian
Scots
Slovak
Slovenian
Spanish; Castilian
Serbian
Swahili
Swedish
Tamil
Tatar
Telugu
Tajik
Tagalog
Thai
Turkish
Ukrainian
Urdu
Uzbek
Vietnamese
Waray
Yiddish
Yoruba
Relation: http://hdl.handle.net/11858/00-097C-0000-0022-6133-9Test
Availability: http://hdl.handle.net/11858/00-097C-0000-0022-6133-9Test
Rights: Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0); http://creativecommons.org/licenses/by-sa/3.0Test/
Accession Number: edsbas.DEAB723C
Database: BASE