Format Writer Documents with Any Markup

Dmitri Popov

Productivity Sauce

Sep 25, 2009 GMT

Although I use OpenOffice.org Writer for all my writing, most publishers I write for require plain text files with a dash of markup. As a result, almost none of my articles are delivered as .odt files: before I can send an article, I have to format it using one of many markup dialects. For example, all my blog posts must be formatted using the markup supported by the eZ Publish software, while articles for Linux Pro Magazine must use a special in-house markup.

Formatting documents manually is both tedious and time-consuming, so I wrote an OpenOffice.org Basic macro that does the donkey work for me. The way the macro works is pretty simple: it searches for formatted text fragments in the current Writer document and wraps every occurrence in the appropriate tags. For example, when the macro finds a text fragment in bold, it wraps the fragment in the <B> and </B> tags, so "this is bold" becomes <B>this is bold</B>. Besides character formatting, the macro also supports paragraph styles, so it adds the appropriate header tags to paragraphs formatted as Heading 1, Heading 2, and Heading 3. Finally, the macro converts hyperlinks into anchor tags.

Sub HTMLMarkup

' Wrap heading paragraphs in the matching header tags
MarkupHeadingsFunc("Heading 1", "<H1>", "</H1>")
MarkupHeadingsFunc("Heading 2", "<H2>", "</H2>")
MarkupHeadingsFunc("Heading 3", "<H3>", "</H3>")

' Wrap bold and italic fragments; & stands for the matched text
MarkupTextFunc("CharWeight", com.sun.star.awt.FontWeight.BOLD, "<B>&</B>")
MarkupTextFunc("CharPosture", com.sun.star.awt.FontSlant.ITALIC, "<I>&</I>")

' Turn hyperlinks into anchor tags
MarkupURLFunc

End Sub

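Function MarkupHeadingsFunc (StyleName, StartTag, EndTag)
' Wraps the text portions of every paragraph that uses StyleName in StartTag and EndTag
ThisDoc=ThisComponent
ThisText=ThisDoc.Text
ParaEnum=ThisText.createEnumeration
While ParaEnum.hasMoreElements
  Para=ParaEnum.nextElement
  ' The enumeration also returns tables, which have no text portions to enumerate
  If Para.supportsService("com.sun.star.text.Paragraph") Then
    PortionEnum=Para.createEnumeration
    While PortionEnum.hasMoreElements
      Portion=PortionEnum.nextElement
      If Portion.ParaStyleName = StyleName Then
        Portion.String = StartTag + Portion.String + EndTag
      End If
    Wend
  End If
Wend
End Function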

Function MarkupTextFunc(SearchAttrName, SearchAttrValue, ReplaceStr)
' Finds all text that carries the given character attribute and replaces it with ReplaceStr
Dim SearchAttributes(0) As New com.sun.star.beans.PropertyValue
ThisDoc=ThisComponent
SearchAttributes(0).Name=SearchAttrName
SearchAttributes(0).Value=SearchAttrValue
ReplaceObj=ThisDoc.createReplaceDescriptor
ReplaceObj.SearchRegularExpression=true
ReplaceObj.searchStyles=false
ReplaceObj.searchAll=true
ReplaceObj.SetSearchAttributes(SearchAttributes)
' .* matches any run of text with the attribute; & in ReplaceStr inserts the match
ReplaceObj.SearchString=".*"
ReplaceObj.ReplaceString=ReplaceStr
ThisDoc.replaceAll(ReplaceObj)
End Function

Sub MarkupURLFunc
' Wraps every hyperlinked text portion in an HTML anchor tag
ThisDoc=ThisComponent
ThisText=ThisDoc.Text
ParaEnum=ThisText.createEnumeration
While ParaEnum.hasMoreElements
  Para=ParaEnum.nextElement
  ' Skip tables, which have no text portions to enumerate
  If Para.supportsService("com.sun.star.text.Paragraph") Then
    PortionEnum=Para.createEnumeration
    While PortionEnum.hasMoreElements
      Portion=PortionEnum.nextElement
      If Portion.HyperlinkURL <> "" Then
        Portion.String = "<A HREF=""" + Portion.HyperlinkURL + """>" + Portion.String + "</A>"
      End If
    Wend
  End If
Wend
End Sub
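
Since publishers want plain text rather than .odt files, the tagged document still has to be exported. The following is a minimal sketch of one way to do that, not part of the original macro: the ExportAsPlainText name and the /tmp/article.txt path are made up for illustration, and the routine relies on Writer's plain-text export filter, which is named "Text".

```basic
Sub ExportAsPlainText
  ' Hypothetical output location; change it to suit your setup
  Dim TargetURL As String
  TargetURL = ConvertToURL("/tmp/article.txt")

  ' Select Writer's plain-text export filter
  Dim Args(0) As New com.sun.star.beans.PropertyValue
  Args(0).Name = "FilterName"
  Args(0).Value = "Text"

  ' storeToURL writes a copy and leaves the open document's own file untouched
  ThisComponent.storeToURL(TargetURL, Args())
End Sub
```

Run HTMLMarkup first and ExportAsPlainText second. Because the markup macro changes the open document in place, it is probably safest to work on a copy of the article.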

The macro formats the active Writer document using HTML markup, but you can easily adapt it for other markups. For example, if you want the macro to format the text using DokuWiki markup, replace the parameters in the MarkupHeadingsFunc and MarkupTextFunc calls with DokuWiki tags:

MarkupHeadingsFunc("Heading 1", "====== ", " ======")
MarkupHeadingsFunc("Heading 2", "===== ", " =====")
MarkupHeadingsFunc("Heading 3", "==== ", " ====")

MarkupTextFunc("CharWeight", com.sun.star.awt.FontWeight.BOLD, "**&**")
MarkupTextFunc("CharPosture", com.sun.star.awt.FontSlant.ITALIC, "//&//")

You can also extend the macro to support other character attributes. The following two statements format underlined and struck-through text fragments using the appropriate DokuWiki tags:

MarkupTextFunc("CharUnderline", com.sun.star.awt.FontUnderline.SINGLE, "__&__")
MarkupTextFunc("CharStrikeout", com.sun.star.awt.FontStrikeout.SINGLE, "<del>&</del>")
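
Put together, a DokuWiki variant of the macro might look like the sketch below. The names DokuWikiMarkup and MarkupDokuWikiURLFunc are hypothetical. The second routine is a copy of MarkupURLFunc that emits DokuWiki's [[URL|text]] link syntax, because the original routine hard-codes the HTML anchor tag. The sketch assumes that MarkupHeadingsFunc and MarkupTextFunc from the listing above are in the same module.

```basic
Sub DokuWikiMarkup
  ' Headings: DokuWiki uses six equals signs for the top level
  MarkupHeadingsFunc("Heading 1", "====== ", " ======")
  MarkupHeadingsFunc("Heading 2", "===== ", " =====")
  MarkupHeadingsFunc("Heading 3", "==== ", " ====")

  ' Character formatting; & stands for the matched text
  MarkupTextFunc("CharWeight", com.sun.star.awt.FontWeight.BOLD, "**&**")
  MarkupTextFunc("CharPosture", com.sun.star.awt.FontSlant.ITALIC, "//&//")
  MarkupTextFunc("CharUnderline", com.sun.star.awt.FontUnderline.SINGLE, "__&__")
  MarkupTextFunc("CharStrikeout", com.sun.star.awt.FontStrikeout.SINGLE, "<del>&</del>")

  ' Hyperlinks, using the hypothetical DokuWiki variant below
  MarkupDokuWikiURLFunc
End Sub

Sub MarkupDokuWikiURLFunc
  ' Same traversal as MarkupURLFunc, but writes [[URL|text]] instead of an anchor tag
  ParaEnum = ThisComponent.Text.createEnumeration
  While ParaEnum.hasMoreElements
    Para = ParaEnum.nextElement
    ' Skip tables, which have no text portions to enumerate
    If Para.supportsService("com.sun.star.text.Paragraph") Then
      PortionEnum = Para.createEnumeration
      While PortionEnum.hasMoreElements
        Portion = PortionEnum.nextElement
        If Portion.HyperlinkURL <> "" Then
          Portion.String = "[[" + Portion.HyperlinkURL + "|" + Portion.String + "]]"
        End If
      Wend
    End If
  Wend
End Sub
```

Keeping the DokuWiki calls in a separate routine means the HTML and DokuWiki versions can live side by side, so you can pick whichever one a given publisher needs.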

This macro saves me a lot of time and work. If you find yourself in a similar situation and need to apply specific markup to your Writer documents, this simple tool can help you, too.
