The expansion of the Gigafida corpus: internet content

Vesna Mikolič

2017

book part

Abstract

The paper discusses the expansion of Gigafida, a reference corpus of Slovenian, to include Internet content, i.e. web pages and user-generated content (tweets, blogs, forums and comments on news portals). It reviews the available resources and tools best suited to this objective and presents the web crawling methodology used for this purpose.

DOI

10.4312/9789612379131

Archive

http://hdl.handle.net/11368/3007023

https://e-knjige.ff.uni-lj.si/znanstvena-zalozba/catalog/book/2

Rights

metadata only access

Subjects

reference corpus, Slo...
