Developer Steve Wolter has released Version 1.0 of Libhyphenate, his C++ hyphenation library, and has also written a sample application.
Libhyphenate implements the hyphenation algorithm used by the TeX typesetting system, which Frank Liang described in his thesis "Word Hy-phen-a-tion by Com-put-er". The library currently ships with hyphenation pattern files for English, German and French. According to Wolter, patterns for more languages can be generated from the corresponding TeX files. The current release fixes issues with UTF-8 encoding in German texts that affected the previous version.
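To give an idea of how pattern-based hyphenation of this kind works, here is a minimal C++ sketch. It is not Libhyphenate's code or API: the names Pattern, parse_pattern and hyphenate are hypothetical, and the patterns are supplied by the caller rather than read from TeX files. Each pattern consists of letters with digits in the gaps between them; wherever a pattern's letters match part of the word, the highest digit seen for each gap wins, and an odd result marks a permitted break.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper type: the letters to match, plus one priority per gap.
// values[k] is the digit written before letters[k]; the last entry is the
// digit after the final letter. Missing digits count as 0.
struct Pattern {
    std::string letters;
    std::vector<int> values;
};

// Splits a pattern such as "hy3ph" into letters "hyph" and values {0,0,3,0,0}.
Pattern parse_pattern(const std::string& text) {
    Pattern p;
    p.values.push_back(0);
    for (char c : text) {
        if (std::isdigit(static_cast<unsigned char>(c))) {
            p.values.back() = c - '0';
        } else {
            p.letters.push_back(c);
            p.values.push_back(0);
        }
    }
    return p;
}

// Returns `word` with `separator` inserted at every permitted break.
// Assumes a lowercase word in a single-byte encoding.
std::string hyphenate(const std::string& word,
                      const std::vector<std::string>& raw_patterns,
                      const std::string& separator) {
    std::vector<Pattern> patterns;
    for (const std::string& raw : raw_patterns) {
        patterns.push_back(parse_pattern(raw));
    }

    // Dots mark the word boundaries so patterns can anchor to them.
    const std::string padded = "." + word + ".";
    // points[j] holds the highest priority for the gap before padded[j].
    std::vector<int> points(padded.size() + 1, 0);

    for (std::size_t i = 0; i < padded.size(); ++i) {
        for (const Pattern& p : patterns) {
            if (padded.compare(i, p.letters.size(), p.letters) != 0) {
                continue;
            }
            for (std::size_t k = 0; k < p.values.size(); ++k) {
                points[i + k] = std::max(points[i + k], p.values[k]);
            }
        }
    }

    std::string out;
    for (std::size_t j = 0; j < word.size(); ++j) {
        // word[j] is padded[j + 1]; an odd priority permits a break before it.
        if (j > 0 && points[j + 1] % 2 == 1) {
            out += separator;
        }
        out += word[j];
    }
    return out;
}
```

Called with the word "hyphenation", the separator "-" and the illustrative patterns hy3ph, he2n, hena4, hen5at, 1na, n2at, 1tio, 2io and o2n, the sketch returns "hy-phen-ation". TeX additionally enforces a minimum number of letters before and after a break, which is left out here.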
Wolter has also written a small application based on his C++ library: XHTML Hyphenate, version 1.0 of which was also released recently, adds hyphenation to XHTML documents. To do so, it inserts the soft hyphen character U+00AD, encoded as UTF-8, into text content (apart from titles) at possible hyphenation points; this character is interpreted by many browsers, but ignored by Firefox. The program determines the correct hyphenation locale from the "xml:lang" attribute.
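The insertion step can be pictured with a short C++ sketch. This is an illustration only, not code from XHTML Hyphenate: the function insert_soft_hyphens and the constant kSoftHyphen are hypothetical names, and the break positions are assumed to come from a separate hyphenation step. In UTF-8, U+00AD is encoded as the two bytes 0xC2 0xAD.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// U+00AD SOFT HYPHEN encoded as UTF-8 (two bytes: 0xC2 0xAD).
const std::string kSoftHyphen = "\xC2\xAD";

// Inserts a soft hyphen before each byte offset listed in `breaks`.
// Assumes `breaks` is sorted in ascending order and that every offset
// falls on a UTF-8 character boundary inside `text`.
std::string insert_soft_hyphens(const std::string& text,
                                const std::vector<std::size_t>& breaks) {
    std::string out;
    std::size_t prev = 0;
    for (std::size_t pos : breaks) {
        // Skip offsets at the start, past the end, or out of order.
        if (pos <= prev || pos >= text.size()) {
            continue;
        }
        out.append(text, prev, pos - prev);
        out += kSoftHyphen;
        prev = pos;
    }
    out.append(text, prev, std::string::npos);
    return out;
}
```

For the text "hyphenation" and the offsets 2 and 6, the result is "hy", a soft hyphen, "phen", a soft hyphen, and "ation". A browser that honours the character shows a hyphen only where it actually breaks the line there.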
XHTML Hyphenate is released under the GPL, and the Libhyphenate library under the LGPL. Both are available from the author’s homepage as source code archives.