Table of Contents

  1. Introduction
  2. Architecture
  3. Conversion Functions

Introduction

We identify six elements of the Chinese language that play a key part in applications. These elements are shown in Figure 1:


Figure 1. Key Elements of Chinese Language
ZuYin     BoPoMoFo symbols
Yin       a meaningful pronunciation
Zhi       a Chinese character
Tsi       a Chinese word
TsiYin    compound of Yin for a Tsi
Chu       a sentence

Table 1 lists the kinds of software that may require each of the elements:

ZuYin     computer aided education, input method engine
Yin       input method engine, speaking, voice recognition
Zhi       input method engine, word processing
Tsi       input method engine, word processing, checker
TsiYin    speaking, voice recognition
Chu       word processor

Table 1. Potential software for each element

With libtabe, an application gains the ability to process each element and to convert between them. For example, by calling functions provided by libtabe, an application can query the frequency of a single character or a word, look up the pronunciation of a character or a word, and perform the reverse lookup from a pronunciation. The complexity of processing each element increases from left to right and from top to bottom in Figure 1.

Architecture


Figure 2. Basic Architecture of libtabe

Figure 2 depicts the basic architecture of libtabe. A solid line with a one-way arrow indicates a strong relationship between two elements, meaning that one can be converted into the other. All of these conversions are supported by libtabe. Detailed explanations of how the conversions are done can be found in the Conversion Functions section.


Figure 3. Extended Architecture of libtabe

Besides those elements and conversion functions, libtabe implements three tables and one database. Two of the three tables are used by the conversion functions; the third provides character frequencies. The Tsi database holds more than 140,000 words, together with their pronunciations and frequencies. We plan to add more tables and databases where they are appropriate and useful. The tables and the database are shown in Figure 3.

Conversion Functions

There are eleven conversion functions in the basic architecture. Each of them converts one element into another. Applications frequently convert between elements to make use of the underlying semantics of the language.

Number  From    To      Description
1       ZuYin   Yin     encoding
2       Yin     ZuYin   decoding
3       Yin     TsiYin  concatenation
4       TsiYin  Yin     decoupling
5       Yin     Zhi     table lookup (1-to-many)
6       Zhi     Yin     table lookup (1-to-many)
7       TsiYin  Tsi     database query (1-to-many)
8       Tsi     TsiYin  database query (1-to-many)
9       Zhi     Tsi     concatenation
10      Tsi     Zhi     decoupling
11      Chu     Tsi     word segmentation

Table 2. Description of Conversion Functions
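
To make the "encoding" and "decoding" rows of Table 2 more concrete, the following is a minimal sketch of how a group of ZuYin symbols could be packed into a single integer Yin and unpacked again. The bit layout, the type names (Syllable, YinCode) and the function names (encode_yin, decode_yin) are all hypothetical and chosen only for illustration; they are not taken from libtabe, whose actual encoding is described in its own documentation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of one syllable as four ZuYin slots.
 * In each slot, 0 means "no symbol in this position". */
typedef struct {
    unsigned initial; /* 0..31: initial consonant slot (5 bits) */
    unsigned medial;  /* 0..3:  medial slot            (2 bits) */
    unsigned rhyme;   /* 0..15: final (rhyme) slot     (4 bits) */
    unsigned tone;    /* 0..7:  tone mark slot         (3 bits) */
} Syllable;

/* Hypothetical integer code for one Yin (not libtabe's own type). */
typedef uint16_t YinCode;

/* Encoding (ZuYin -> Yin): pack the four slots into one integer.
 * Assumed layout: | initial:5 | medial:2 | rhyme:4 | tone:3 | */
static YinCode encode_yin(Syllable s)
{
    assert(s.initial < 32 && s.medial < 4 && s.rhyme < 16 && s.tone < 8);
    return (YinCode)((s.initial << 9) | (s.medial << 7) |
                     (s.rhyme << 3) | s.tone);
}

/* Decoding (Yin -> ZuYin): unpack the integer back into its slots. */
static Syllable decode_yin(YinCode y)
{
    Syllable s;
    s.initial = (y >> 9) & 0x1Fu;
    s.medial  = (y >> 7) & 0x03u;
    s.rhyme   = (y >> 3) & 0x0Fu;
    s.tone    = y & 0x07u;
    return s;
}
```

Because every slot has a fixed position and width in this layout, decoding an encoded syllable always returns the original slots, which is the property that makes the two conversions inverses of each other.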

Functions 3, 4, 9, and 10 are trivial because the underlying elements all have a fixed length. The document "BoPoMoFo system and encoding in libtabe" provides useful information about conversion functions 1 and 2.
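
The reason fixed-length elements make concatenation and decoupling trivial can be sketched in a few lines of C. The example assumes that every Zhi is stored as exactly two bytes and that a Tsi is simply those bytes laid end to end; the width ZHI_BYTES and the helper names (tsi_zhi_count, tsi_get_zhi, tsi_append_zhi) are assumptions made for this illustration and are not libtabe functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption for this sketch: every Zhi is a fixed 2-byte code. */
#define ZHI_BYTES 2

/* Number of Zhi in a Tsi that occupies len_bytes bytes. */
static size_t tsi_zhi_count(size_t len_bytes)
{
    return len_bytes / ZHI_BYTES;
}

/* Decoupling (Tsi -> Zhi): the i-th Zhi starts at a known offset,
 * so extracting it is a bounds check plus a fixed-size copy.
 * Returns 0 on success, -1 on a malformed length or bad index. */
static int tsi_get_zhi(const unsigned char *tsi, size_t len_bytes,
                       size_t index, unsigned char out[ZHI_BYTES])
{
    if (len_bytes % ZHI_BYTES != 0 || index >= tsi_zhi_count(len_bytes))
        return -1;
    memcpy(out, tsi + index * ZHI_BYTES, ZHI_BYTES);
    return 0;
}

/* Concatenation (Zhi -> Tsi): append one Zhi to the end of a Tsi buffer.
 * Returns 0 on success, -1 if the buffer has no room left. */
static int tsi_append_zhi(unsigned char *tsi, size_t *len_bytes, size_t cap,
                          const unsigned char zhi[ZHI_BYTES])
{
    if (*len_bytes + ZHI_BYTES > cap)
        return -1;
    memcpy(tsi + *len_bytes, zhi, ZHI_BYTES);
    *len_bytes += ZHI_BYTES;
    return 0;
}
```

The same pattern would apply to functions 3 and 4 if each Yin is a fixed-width integer: a TsiYin is then an array of Yin, and decoupling is plain array indexing.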


$Id: arch.shtml,v 1.1.1.1 2000/02/23 05:51:20 shawn Exp $
Copyright 2000, libtabe Project. All rights reserved.