HKIUG CJK/Unicode Resources

A. About the HKIUG Unicode Task Force

The HKIUG Unicode Task Force was officially established by the HKIUG Standing Committee in February 2005 to maintain the CJK/Unicode resources produced by the HKIUG Unicode Project in 2003. It is also responsible for developing new resources such as the TSVCC tables; facilitating the searching, display and retrieval of CJK records in library catalogs; and assisting member libraries in migrating from EACC-based character encoding to Unicode.

Members of the Task Force include:

Past Task Force members:

B. HKIUG Unicode Project and the Code Table

In 2003, HKIUG member libraries were rolling out Innovative's Unicode-based Millennium modules and testing the CJK and Unicode support in the web-based Online Public Access Catalog. It became clear that a collaborative effort among INNOPAC/Millennium users was needed to fix the retrieval and storage problems caused by incorrect mappings among the Unicode (UTF-8), EACC and BIG5 encodings.

In July 2003, a working group of catalogers and systems librarians from HKIUG member libraries was established to study the issues and develop a joint proposal for the vendor. After two months of effort, the group completed its study, produced an EACC/Unicode Mapping Table and submitted the proposal to the vendor for implementation.

The EACC/Unicode Mapping Table is a useful CJK/Unicode resource. It supplements the Library of Congress's East Asian Code Tables in two main respects:

You can download the HKIUG EACC/Unicode Mapping Table in HTML and XML formats from the following links:

C. TSVCC Tables

Attempts to create TSVCC (Traditional, Simplified, Variant Chinese Characters) links for Chinese characters began in 2004. TSVCC linking allows retrieval systems to implement search logic so that searching for one form of a character also retrieves all of its other forms. This linking information is particularly essential for native Unicode database systems. Unlike EACC-based systems, which can make use of EACC's internal structure for linking, Unicode-based systems have to rely on an external resource in order to implement such linking logic.
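The linking logic described above can be sketched in a few lines of Python. This is only an illustration of the general idea, not the method used by any particular library system: the variant groups below are hand-picked examples and are not taken from the HKIUG TSVCC Table, and the names `VARIANT_GROUPS`, `build_index`, `normalize` and `matches` are hypothetical. Every character in a group is folded to one representative form, so that a query and a catalog title compare equal whichever form each one uses.

```python
# Illustrative variant groups only; NOT taken from the HKIUG TSVCC Table.
# Each set holds the traditional, simplified and variant forms that
# should be treated as equivalent when searching.
VARIANT_GROUPS = [
    {"學", "学"},
    {"國", "国", "囯"},
]


def build_index(groups):
    """Map every character in a group to one representative form."""
    index = {}
    for group in groups:
        # Assumption: the lowest code point serves as the representative.
        representative = min(group)
        for ch in group:
            index[ch] = representative
    return index


VARIANT_INDEX = build_index(VARIANT_GROUPS)


def normalize(text, index=VARIANT_INDEX):
    """Fold each character to its representative; others pass through."""
    return "".join(index.get(ch, ch) for ch in text)


def matches(query, title):
    """True if the query occurs in the title once both are folded."""
    return normalize(query) in normalize(title)


if __name__ == "__main__":
    # A simplified-form query finds a traditional-form title.
    print(matches("国学", "中國學術史"))
```

A real system would more likely apply the same folding when building its index rather than at query time, but the external lookup table plays the same role in either design.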

There are two versions of the HKIUG TSVCC Table, one for EACC-based systems and the other for Unicode-based systems. You can download them from the following links:

D. Romanization and Radical/Stroke Table of Unihan Characters

This table is useful for anyone who wants to know the Pinyin and Wade-Giles romanization, as well as the radicals and stroke counts, of Chinese characters as found in Unicode's Unihan database.
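As a rough sketch of how such information can be read from Unihan data, the Python below parses lines in the tab-separated form used by the Unihan data files (code point, field name, value). It assumes the Unihan fields `kMandarin`, `kRSUnicode` and `kTotalStrokes`; the format of the HKIUG table itself may differ, and Wade-Giles readings are not covered here. The names `SAMPLE`, `parse_unihan` and `radical_and_strokes` are hypothetical.

```python
# A few lines in Unihan's tab-separated layout, embedded for illustration.
SAMPLE = """\
U+4E2D\tkMandarin\tzhōng
U+4E2D\tkRSUnicode\t2.3
U+4E2D\tkTotalStrokes\t4
"""


def parse_unihan(lines):
    """Collect Unihan fields into a dict keyed by character."""
    table = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        codepoint, field, value = line.split("\t", 2)
        char = chr(int(codepoint[2:], 16))  # "U+4E2D" -> "中"
        table.setdefault(char, {})[field] = value
    return table


def radical_and_strokes(entry):
    """Return (radical number, residual strokes) from kRSUnicode."""
    first = entry["kRSUnicode"].split()[0]  # use the first value listed
    radical, residual = first.split(".")
    # Some radical numbers carry apostrophes marking variant radical forms.
    return int(radical.rstrip("'")), int(residual)


if __name__ == "__main__":
    table = parse_unihan(SAMPLE.splitlines())
    entry = table["中"]
    print(entry["kMandarin"], radical_and_strokes(entry), entry["kTotalStrokes"])
```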

E. Presentations

Please send comments and enquiries to the Chair of the HKIUG Unicode Task Force.

Last revised on 14 November 2013