Armenian and Thai Scripts in Catalog Records Initiated


By Jessalyn Zoom
This spring, the Acquisitions and Bibliographic Access Directorate (ABA) expanded the use of Armenian and Thai scripts in Voyager bibliographic records. This enhancement allows researchers and public users to access Library [of Congress] collections in the Armenian and Thai languages not only through romanized text but also in the original scripts, greatly improving the discoverability of the Library’s collections.
“The Library’s online catalog has just added two more original scripts,” ABA’s director, Beacher Wiggins, said. “This major milestone is an important step forward for cataloging foreign-language materials.”
In 1983, the Library introduced records cataloged using original script in Chinese, Japanese and Korean. In 1991, it added Arabic, Persian, Hebrew and Yiddish, enabling users to enter, search, display and retrieve records cataloged in those scripts. In 2007, Cyrillic joined the non-Latin scripts available in Voyager.
Multiple divisions and offices collaborated over a year and a half to complete the Armenian and Thai languages project. From January to December 2022, staff members from the Asian and Middle Eastern Division (ASME), the Cairo and Jakarta overseas offices and the Asian Division conducted tests while colleagues in the Collections Discovery and Metadata Service (CDMS) and ABA assisted or observed.
Testers were given authorization to use the Voyager practice database so as not to interrupt production cataloging. CDMS metadata specialists and a former ASME chief, now a volunteer, completed crucial preparation before the test started.
Using a task list CDMS developed with input from ABA, participating staff members tested all functionality to the extent possible, including original cataloging input, record update, external record import, printing, searching and display of characters in the Library’s online public catalog and linked data service.
CDMS metadata specialists worked with the testers to analyze test records. In addition, CDMS evaluated records in a production environment to ensure no surprises for external customers who receive and import the Library’s catalog records through the Cataloging Distribution Service.
Testers also resolved issues related to using Armenian and Thai punctuation and numbers to ensure correct transliteration and ease of comprehension.
Upon finding the test results favorable, ABA and CDMS moved the Armenian and Thai work from testing into production. On May 16, the Office of the Chief Information Officer deployed workstation applications, and bibliographic records in Armenian and Thai began to contain original script.
“Having Thai script in the catalog is essential for people who read Thai,” said Ryan Wolfson-Ford, a Southeast Asian reference librarian in the Asian Division.
“I’m excited to provide greater access to Library materials for Armenian speakers,” ASME’s Armenian and Georgian librarian, Brigita Sebald, said.
The project has opened doors to fuller Unicode functionality in Voyager and to the implementation of Unicode in the Library Collection Access Platform, now under development. Unicode is a standard for character encoding and representation that accommodates most of the world’s writing systems.
The Library holds over 175 million items from around the world. Many of its book and serial collections are in languages written in non-Latin scripts. The Library’s non-Latin script collections reside in custodial divisions, including the Asian; African and Middle Eastern; and Latin American, Caribbean and European divisions, as well as in special collections divisions.
Note: This article was first published in the Library of Congress Gazette.