SinMin news - Text classification


SinMin contains texts of different genres and styles in both modern and old Sinhala. The main sources of electronic texts for the corpus are online Sinhala newspapers, online Sinhala news sites, Sinhala school textbooks available online, online Sinhala magazines, Sinhala Wikipedia, Sinhala fiction available online, the Mahawansa, Sinhala blogs, Sinhala subtitles, and the Sri Lankan gazette.

Language - Sinhala

Authors - D. Upeksha, C. Wijayarathna, M. Siriwardena, L. Lasandun, C. Wimalasuriya, N. H. N. D. De Silva, and G. Dias

Reference - https://nisansads.staff.uom.lk/#DataSets

Citation - D. Upeksha, C. Wijayarathna, M. Siriwardena, L. Lasandun, C. Wimalasuriya, N. H. N. D. De Silva, and G. Dias, "Implementing a Corpus for Sinhala Language," in Symposium on Language Technology for South Asia 2015, 2015.