This paper presents a co-occurrence dictionary based on Thai linguistic phenomena. The theoretical background, the data structure, the dictionary development, and the word collocation information are described in detail. At present, 75,000 word collocations have been added to the co-occurrence dictionary with the help of linguists, who put considerable effort into encoding the linguistic information. We hope that the word collocation information presented in this paper will be a useful resource for natural language processing research and second language acquisition.
This paper describes the semi-automatic construction of the Thai WordNet and the application of the method to the Asian WordNet. Based on the Princeton WordNet® (PWN), we develop a method for generating a WordNet from an existing bilingual dictionary. We automatically align PWN synsets to bilingual dictionary entries through their English equivalents and parts of speech (POS). Manual translation is also employed after the alignment. We also develop a web-based collaborative workbench, called KUI (Knowledge Unifying Initiator), for revising the results of synset assignment, and provide a framework for creating the Asian WordNet via linkage through PWN synsets.
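
The alignment step can be illustrated with a minimal sketch. It assumes that each synset is a record with an identifier, a POS tag, and a list of English lemmas, and that each bilingual dictionary entry lists its English equivalents together with a POS tag. The data, the identifiers (`syn-001` and so on), and the function names `build_index` and `align` are hypothetical and are not taken from the system described here; the sketch only shows the general idea of matching on the pair (English equivalent, POS).

```python
from collections import defaultdict

# Hypothetical toy data: the synset IDs and dictionary entries are
# illustrative only and do not come from PWN or any real dictionary.
SYNSETS = [
    {"id": "syn-001", "pos": "n", "lemmas": ["dog", "domestic_dog"]},
    {"id": "syn-002", "pos": "v", "lemmas": ["dog", "chase"]},
    {"id": "syn-003", "pos": "n", "lemmas": ["cat"]},
]

BILINGUAL = [
    {"headword": "หมา", "pos": "n", "english": ["dog"]},
    {"headword": "แมว", "pos": "n", "english": ["cat"]},
    {"headword": "ไล่ตาม", "pos": "v", "english": ["chase", "pursue"]},
]


def build_index(synsets):
    """Map each (English lemma, POS) pair to the IDs of synsets containing it."""
    index = defaultdict(set)
    for synset in synsets:
        for lemma in synset["lemmas"]:
            index[(lemma.lower(), synset["pos"])].add(synset["id"])
    return index


def align(entries, synsets):
    """Propose candidate synsets for every bilingual dictionary entry.

    An entry is matched to a synset when one of its English equivalents
    appears among the synset's lemmas and both carry the same POS tag.
    """
    index = build_index(synsets)
    candidates = {}
    for entry in entries:
        matched = set()
        for english in entry["english"]:
            # Assumption: multiword lemmas are joined with underscores.
            key = (english.lower().replace(" ", "_"), entry["pos"])
            matched |= index.get(key, set())
        candidates[(entry["headword"], entry["pos"])] = sorted(matched)
    return candidates


if __name__ == "__main__":
    for (headword, pos), synset_ids in align(BILINGUAL, SYNSETS).items():
        print(headword, pos, synset_ids)
```

Matching on POS as well as on the English equivalent keeps the noun entry for "dog" from being assigned to the verb synset. Entries with several candidates, or with none, would still need the manual revision mentioned above.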