Saturday, November 11, 2006

datrie for a large dictionary

datrie is a double-array trie implementation in C by thep. Its main purpose is to assist word segmentation in libthai, for which 16-bit array indexes are enough. However, I want to use datrie in an experiment where the dictionary is quite large, so I tried expanding the array indexes to 32 bits. Making the patch was fairly easy because thep had already planned for this extension. The patch is here. Ideally, I think one should be able to choose 16 or 32 bits through a configure (autoconf) parameter. To make switching between 16 and 32 bits easy, though, there are still some issues awaiting discussion here (in Thai). In any case, datrie32 (without 16/32 switching) should already be sufficient for my experiment.
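
To illustrate the kind of 16/32-bit switching I have in mind, here is a minimal sketch in C. It is not taken from datrie or from the patch: the macro `TRIE_INDEX_BITS`, the type names `trie_index_t` and `da_cell_t`, and the helper functions are all hypothetical, and the use of signed integer types is my assumption. The idea is only that a configure option could define one macro, and the index type used by the double array would follow from it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical macro: a configure option could pass -DTRIE_INDEX_BITS=16.
   It defaults to 32 here so that large dictionaries fit. */
#ifndef TRIE_INDEX_BITS
#define TRIE_INDEX_BITS 32
#endif

/* Select the index type at compile time (signed types are an assumption). */
#if TRIE_INDEX_BITS == 16
typedef int16_t trie_index_t;
#define TRIE_INDEX_MAX INT16_MAX
#elif TRIE_INDEX_BITS == 32
typedef int32_t trie_index_t;
#define TRIE_INDEX_MAX INT32_MAX
#else
#error "TRIE_INDEX_BITS must be 16 or 32"
#endif

/* One cell of a double array: a base/check pair of indexes. */
typedef struct {
    trie_index_t base;
    trie_index_t check;
} da_cell_t;

/* Returns 1 if position n can be represented by the chosen index type. */
static int da_index_fits(int64_t n)
{
    return n >= 0 && n <= (int64_t)TRIE_INDEX_MAX;
}

/* Narrows n to the index type, aborting if it does not fit. */
static trie_index_t da_checked_index(int64_t n)
{
    assert(da_index_fits(n));
    return (trie_index_t)n;
}
```

With 16-bit signed indexes the array cannot address more than 32,767 positions, which is why a large dictionary needs the wider type. The cost is that every cell doubles in size.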
