FOSSPicks
Text processor
pandoc 3.1
While we're in the realm of text processing, there's one tool above all others that writers turn to when they need to convert documentation from one format to another: pandoc. Pandoc has been around since 2006 and has just celebrated the major milestone release of 3.0, formalizing more than five years of development work. Pandoc is the text equivalent of FFmpeg, capable not just of converting a huge number of input formats to a huge number of output formats, but also of transforming the text as it passes through its various processors. It can convert between Markdown variants, HTML, LaTeX, and DOCX, and output to PDF, while handling tables, definition lists, footnotes, and even mathematical and R notation. We used it extensively when converting magazine articles from XML to EPUB.
What makes pandoc special isn't so much the input and output formats it supports, but how finely you can control the conversion, often iterating over details in substantial sets of documentation that would take considerable effort to convert any other way. This power comes from what pandoc calls filters, small programs that modify the document as it passes from the input reader to the output writer. Many are already available, and you can easily write your own, either in Lua or in any language that can read and write pandoc's JSON representation of a document. But the big change in this release is how pandoc runs, because you now have a choice in how you access its functionality: as a traditional command-line tool, as a server, or as a library. This makes it easier to incorporate into whatever system you use to translate your docs. It can even be used as a Lua interpreter, much as you might use Python interactively, so you can quickly check the effects of processing commands without editing and running a script. This is a great way to get started if you've ever been put off by pandoc's complexity, and it's definitely worth the time investment.
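To give a feel for what a filter does, here is a minimal sketch of a JSON-style filter in Python. It assumes pandoc's JSON document layout, in which every element is an object with a type key "t" and a content key "c", and a header's content is a list of level, attributes, and inline text. The function names (walk, demote_header, filter_stream) and the sample document are our own illustrations rather than part of pandoc, and the API version number in the sample is an assumption about what pandoc 3.x emits.

```python
import io
import json


def walk(node, action):
    """Return a copy of node with action applied to every AST element.

    Elements are the dicts that carry a "t" (type) key.
    """
    if isinstance(node, list):
        return [walk(item, action) for item in node]
    if isinstance(node, dict):
        if "t" in node:
            node = action(node)
        return {key: walk(value, action) for key, value in node.items()}
    return node


def demote_header(elem):
    """Push each header one level down, stopping at level 6."""
    if elem["t"] == "Header":
        # Assumed layout of a Header's content: [level, attributes, inlines]
        level, attr, inlines = elem["c"]
        return {"t": "Header", "c": [min(level + 1, 6), attr, inlines]}
    return elem


def filter_stream(infile, outfile):
    """Read a JSON document, transform its blocks, and write it back out."""
    doc = json.load(infile)
    doc["blocks"] = walk(doc["blocks"], demote_header)
    json.dump(doc, outfile)


if __name__ == "__main__":
    # Hypothetical sample document standing in for what pandoc would send.
    # The API version is an assumption, not something this sketch checks.
    sample = {
        "pandoc-api-version": [1, 23],
        "meta": {},
        "blocks": [
            {"t": "Header", "c": [1, ["intro", [], []],
                                  [{"t": "Str", "c": "Intro"}]]},
            {"t": "Para", "c": [{"t": "Str", "c": "Some text."}]},
        ],
    }
    result = io.StringIO()
    filter_stream(io.StringIO(json.dumps(sample)), result)
    print(result.getvalue())
```

To use something like this as a real filter, you would call filter_stream(sys.stdin, sys.stdout) instead of running the demo, save the script under a name of your choosing (say, demote.py), and pass it to pandoc with the --filter option. The same transformation is typically shorter as a Lua filter, which also avoids the round trip through JSON.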
Project Website
TV recorder