Dictionaries

Co-occurrence dictionary: technical report (1995) (academic paper)

This paper presents a co-occurrence dictionary based on Thai phenomena. The theoretical background, the data structure, the dictionary development and the word collocation information are described in detail. At present, 75,000 word collocations have been added to the co-occurrence dictionary with the help of linguists, who put much effort into encoding the linguistic information. Hopefully, the word collocation information presented in this paper will be a useful resource for natural language processing studies and second language acquisition.

Princeton University WordNet

WordNet® is a large lexical database of the English language maintained at the Cognitive Science Laboratory of Princeton University. The database groups nouns, verbs, adjectives and adverbs into sets of synonyms (synsets), each of which expresses a distinct concept.
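The synset idea can be illustrated with a minimal sketch. The identifiers, glosses and the `Synset` class below are toy assumptions made up for illustration; they are not the actual WordNet file format or API.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Synset:
    synset_id: str   # hypothetical identifier, not a real WordNet offset
    pos: str         # part of speech: "n", "v", "a" or "r"
    lemmas: tuple    # words that share this one concept
    gloss: str       # short definition (toy text)


# Toy data: two distinct concepts share the word "car".
SYNSETS = [
    Synset("car.n.01", "n", ("car", "auto", "automobile"), "a motor vehicle"),
    Synset("car.n.02", "n", ("car", "railcar"), "a wheeled rail vehicle"),
    Synset("run.v.01", "v", ("run",), "move fast on foot"),
]


def build_index(synsets):
    """Map each (word, part of speech) pair to the synsets containing it."""
    index = defaultdict(list)
    for synset in synsets:
        for lemma in synset.lemmas:
            index[(lemma, synset.pos)].append(synset)
    return index


INDEX = build_index(SYNSETS)


def synonyms(word, pos):
    """Collect every other word that shares at least one synset with `word`."""
    found = set()
    for synset in INDEX.get((word, pos), []):
        found.update(synset.lemmas)
    found.discard(word)
    return sorted(found)


print(synonyms("car", "n"))
```

Because a word can belong to several synsets, looking up "car" returns synonyms drawn from both of its concepts.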

Thai WordNet Construction (academic paper)

Abstract

This paper describes the semi-automatic construction of Thai WordNet and the method applied for Asian WordNet. Based on the Princeton WordNet® (PWN), we develop a method for generating a WordNet by using an existing bilingual dictionary. We automatically align each PWN synset to a bilingual dictionary through the English equivalent and its part of speech (POS). Manual translation is also employed after the alignment. We also develop a web-based collaborative workbench, called KUI (Knowledge Unifying Initiator), for revising the results of synset assignment, and provide a framework for creating Asian WordNet via linkage through PWN synsets.
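A rough sketch of aligning synsets to a bilingual dictionary through the English equivalent and POS might look like the following. It is only an illustration under stated assumptions: the synset identifiers, the dictionary entries and the `align` function are hypothetical, and the paper's actual procedure may differ.

```python
from collections import defaultdict

# Toy PWN-like data: synset id -> (part of speech, English lemmas).
PWN = {
    "dog.n.01": ("n", {"dog", "domestic dog"}),
    "bank.n.01": ("n", {"bank"}),
    "bank.n.02": ("n", {"bank", "depository financial institution"}),
    "run.v.01": ("v", {"run"}),
}

# Toy bilingual dictionary: (Thai word, English equivalent, part of speech).
DICTIONARY = [
    ("สุนัข", "dog", "n"),
    ("ธนาคาร", "bank", "n"),
    ("วิ่ง", "run", "v"),
    ("แมว", "cat", "n"),
]


def align(pwn, dictionary):
    """Assign candidate synsets to each Thai word by matching the
    English equivalent and part of speech against synset lemmas."""
    by_key = defaultdict(list)
    for synset_id, (pos, lemmas) in pwn.items():
        for lemma in lemmas:
            by_key[(lemma, pos)].append(synset_id)

    aligned = {}
    for thai, english, pos in dictionary:
        aligned[thai] = sorted(by_key.get((english, pos), []))
    return aligned


ALIGNED = align(PWN, DICTIONARY)

# Entries with zero or several candidates are left for manual revision.
NEEDS_REVIEW = {thai for thai, ids in ALIGNED.items() if len(ids) != 1}

for thai, ids in ALIGNED.items():
    print(thai, ids)
```

In this sketch, words with exactly one candidate synset are assigned automatically, while ambiguous or unmatched words are set aside, mirroring the manual step that the abstract says follows the alignment.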

Download Thai <-> English dictionary data files in XML format

From this page you can download Thai to English and English to Thai dictionary data in XML format. The original data comes from NECTEC LEXiTRON 2.0 data files and was converted to XML (Extensible Markup Language) by Bangkok Library. XML is an open standard for creating custom markup languages and sharing data. The converted data files are valid and well-formed XML documents and can be imported into Microsoft Access 2003 where you can build queries and forms to create your own Thai <-> English dictionary program.
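As an alternative to Microsoft Access, XML dictionary data of this kind can also be read with a short script. The sketch below uses Python's standard `xml.etree.ElementTree` module; the element names (`dictionary`, `entry`, `english`, `thai`, `pos`) and the sample content are assumptions made for illustration and may not match the structure of the downloadable files.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the real files may use different element names.
SAMPLE_XML = """
<dictionary>
  <entry><english>cat</english><thai>แมว</thai><pos>N</pos></entry>
  <entry><english>run</english><thai>วิ่ง</thai><pos>V</pos></entry>
</dictionary>
"""


def load_entries(xml_text):
    """Parse the XML text and return {english: (thai, pos)}."""
    root = ET.fromstring(xml_text.strip())
    entries = {}
    for entry in root.iter("entry"):
        english = entry.findtext("english")
        entries[english] = (entry.findtext("thai"), entry.findtext("pos"))
    return entries


ENTRIES = load_entries(SAMPLE_XML)
print(ENTRIES["cat"])
```

For a file on disk, `ET.parse(path).getroot()` could replace `ET.fromstring`, with the element names adjusted to whatever the downloaded files actually contain.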

LEXiTRON English-Thai dictionary data imported into Microsoft Access database

Download LEXiTRON v.2.0 Thai <-> English dictionary data files

LEXiTRON is a free Thai <-> English dictionary produced by NECTEC. Its dictionary data files have been used and adapted by several websites and other software programs. Version 2.0 includes two plain text files (telex and etlex) that contain Thai to English and English to Thai translations respectively.

Screenshot of Thai to English dictionary data file opened in Microsoft Visual Studio 2003

Download LEXiTRON v.2.1 beta for Windows

LEXiTRON v.2.1 beta is a very early version of the free Thai <-> English dictionary produced by the National Electronics and Computer Technology Center (NECTEC), Thailand. The software includes a license agreement dated 2003.

WARNING: THIS IS OUTDATED BETA SOFTWARE. BETA SOFTWARE CAN BE UNSTABLE AND CAUSE CRASHES OR DATA LOSS. YOU ARE ADVISED NOT TO INSTALL THIS PROGRAM. YOU CAN DOWNLOAD A MORE RECENT VERSION OF LEXITRON HERE.

NECTEC LEXiTRON Dictionary 3.0 available for download soon

March 12, 2009: The NECTEC LEXiTRON website continues to have user registration and login problems. We phoned NECTEC today to report these problems and find out more about LEXiTRON version 3.0 beta. Apparently, LEXiTRON version 3.0 refers to the new-look website and not to a new version of the Windows program. A new version of the Windows program may be released in about 2 to 4 months. Until then, we suggest you continue using the most current version, LEXiTRON 2.6.
