Jsoup 1.2.3 processes HTML 5

Aug 04, 2010

Jsoup, a free Java library for processing HTML, is available in version 1.2.3 with enhanced HTML 5 support.

Although the parser has always implicitly accepted HTML 5 tags, it now knows the element definitions of the new standard. The tool can also generate an HTML 5 compliant parse tree of a page for further processing.

The second important innovation: Jsoup now automatically detects the character set of a document it reads and decodes the input before parsing. The release also brings new selectors as well as small fixes and improvements.
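To illustrate the general idea of detecting a character set before parsing, here is a minimal sketch that uses only the Java standard library. It is not Jsoup's actual implementation: the class name `CharsetSniffer`, its methods, the 1024-byte look-ahead, and the UTF-8 fallback are all assumptions made for this example. It also targets a newer Java than the 1.5 that Jsoup itself requires, since it relies on `StandardCharsets`.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Jsoup: sniffs a charset declaration
// from the raw bytes of an HTML document and decodes the bytes with it.
public class CharsetSniffer {

    // Matches charset=... inside a <meta> tag, for example
    //   <meta charset="utf-8">
    //   <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
    private static final Pattern META_CHARSET = Pattern.compile(
            "<meta[^>]*?charset\\s*=\\s*[\"']?([^\\s\"'/>;]+)",
            Pattern.CASE_INSENSITIVE);

    // Returns the declared charset, or the fallback if none is declared
    // or the declared name is unknown to this JVM.
    public static Charset detect(byte[] raw, Charset fallback) {
        // ISO-8859-1 maps every byte to exactly one char, so the ASCII
        // markup survives this provisional decode whatever the real encoding is.
        int len = Math.min(raw.length, 1024); // assumed look-ahead window
        String head = new String(raw, 0, len, StandardCharsets.ISO_8859_1);
        Matcher m = META_CHARSET.matcher(head);
        if (m.find()) {
            try {
                return Charset.forName(m.group(1));
            } catch (IllegalArgumentException e) {
                // Illegal or unsupported charset name: fall through to the fallback.
            }
        }
        return fallback;
    }

    // Decodes the document with the detected charset (UTF-8 if none is declared).
    public static String decode(byte[] raw) {
        return new String(raw, detect(raw, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        byte[] latin1 = "<html><head><meta charset=\"ISO-8859-1\"></head><body>caf\u00e9</body></html>"
                .getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(detect(latin1, StandardCharsets.UTF_8));
        System.out.println(decode(latin1));
    }
}
```

The decoded string can then be handed to a parser. A real detector would typically also consider HTTP headers and byte order marks, which this sketch leaves out.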

Jsoup runs on Java 1.5 and is released under the MIT/X license. The Jsoup homepage offers JAR files for download, cookbook-style instructions, and the API reference.
