Haru PDF Library

John Tytgat joty at netsurf-browser.org
Tue May 27 04:04:59 BST 2008

In message <f0f413440805260819n5b40e128m7ed150df9ddc8842 at mail.gmail.com>
          "Adam Blokus" <adamblokus at gmail.com> wrote:

> The provided functionality is enough to cover all the plotting functions of
> the plotters_table interface. There may be some problems:
> - with font handling (didn't work with some fonts for me)

Ah, do you have more details?  Which ones didn't work?

> - with font encoding (as I understand it: there is some Unicode support with
> Chinese and Japanese characters, but multibyte encoding seems not to be
> supported in general; this can be done by converting from utf8 to something
> else first?)

I only looked very briefly at the Haru API, and a possible approach could be
that you switch the encoding when needed.  That means the font used for
that encoding has to be switched as well, since I understand from the Haru
API that you can only select a certain set of fonts for each category of
encodings.
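
To make the switching idea a bit more concrete, here is a rough Python
sketch (not Haru code) that splits a string into runs which each fit one
encoding, so a plotter could switch encoding and font once per run.  The
candidate list uses Python codec names, not Haru encoder names, and the
function names are made up for illustration.

```python
# Hypothetical candidate encodings, tried in order. These are Python codec
# names chosen for illustration, not Haru encoder names.
CANDIDATE_ENCODINGS = ["latin-1", "shift_jis", "gb2312"]


def can_encode(ch, enc):
    """Return True if the single character `ch` is representable in `enc`."""
    try:
        ch.encode(enc)
        return True
    except UnicodeEncodeError:
        return False


def split_into_runs(text, candidates=CANDIDATE_ENCODINGS):
    """Split `text` into (encoding, substring) runs.

    A run is extended for as long as its encoding can represent the next
    character, which keeps the number of font/encoding switches low.
    Characters that fit no candidate end up in runs tagged with None.
    """
    runs = []  # list of [encoding, string]
    for ch in text:
        # Prefer staying in the current run's encoding.
        if runs and runs[-1][0] is not None and can_encode(ch, runs[-1][0]):
            runs[-1][1] += ch
            continue
        # Otherwise pick the first candidate that can represent the character.
        enc = next((e for e in candidates if can_encode(ch, e)), None)
        if runs and runs[-1][0] == enc:  # only reachable when enc is None
            runs[-1][1] += ch
        else:
            runs.append([enc, ch])
    return [(enc, s) for enc, s in runs]
```

Each run would then be plotted with whatever font is allowed for that
run's encoding; runs tagged None need some fallback (e.g. a replacement
glyph).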

That also means you break the link between the fonts used in the PDF
output (plotter API) and the ones used in the render/layout code in NS.
I don't think this is a dealbreaker, but it does mean that the text chunk
widths used during layout have to take into account the fonts used by the
currently selected plotter.

I think this could be a required change, as certainly on RISC OS you can't
guarantee that the fonts used for render/layout will be the same ones used
for embedding, unless we somehow teach Haru to embed RISC OS fonts
(which is going to be outside the scope of this project).
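
As a toy illustration of why layout has to ask the selected plotter for
text widths, here is a small Python sketch.  The plotter class and its
fixed per-glyph advance are invented for the example (real fonts have
per-glyph metrics), and nothing here is NetSurf or Haru code.

```python
class FixedAdvancePlotter:
    """Hypothetical plotter whose font advances `advance_em` ems per glyph."""

    def __init__(self, advance_em):
        self.advance_em = advance_em

    def text_width(self, text, font_size):
        # Made-up metric: every character has the same advance width.
        return len(text) * self.advance_em * font_size


def break_lines(words, max_width, plotter, font_size):
    """Greedy line breaking that measures text with the given plotter."""
    lines, current = [], ""
    for word in words:
        candidate = word if not current else current + " " + word
        if current and plotter.text_width(candidate, font_size) > max_width:
            lines.append(current)  # candidate is too wide: close this line
            current = word
        else:
            current = candidate
    if current:
        lines.append(current)
    return lines


# Same text, same box width, different fonts behind the plotter.
screen = FixedAdvancePlotter(0.5)   # stand-in for the on-screen font
pdf = FixedAdvancePlotter(0.75)     # stand-in for a wider PDF font
words = ["one", "two", "three"]
print(break_lines(words, 80, screen, 10))
print(break_lines(words, 80, pdf, 10))
```

With these made-up metrics the text fits on one line for the screen font
but needs two lines for the PDF font, so a layout computed with the
wrong plotter's widths would overflow the box.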

Perhaps an alternative approach: I'm wondering if we can't teach Haru about
the UCS2 CMap encoding; if the rendering makes use of TT fonts (which have a
Unicode cmap table inside), Haru should be able to use those TT fonts.
When the rendering does not make use of TT fonts, perhaps we could use the
so-called Unicode TT fonts (i.e. TrueType fonts which have a large
character set covering a lot of well-known encodings).  But here as well,
the width of text chunks in those fonts will be different from the ones used
for layout.
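
For the UCS2 route, the text would have to be converted from UTF-8 to
2-byte codes before being handed over.  A minimal Python sketch of that
conversion follows, assuming big-endian UCS-2 is what the CMap would
expect; the function name is made up, and this is not something Haru
provides as far as I know.

```python
def utf8_to_ucs2_be(data):
    """Convert UTF-8 bytes to big-endian UCS-2 bytes (2 bytes per character).

    UCS-2 only covers the Basic Multilingual Plane, so characters above
    U+FFFF are rejected instead of being emitted as surrogate pairs.
    """
    text = data.decode("utf-8")
    for ch in text:
        if ord(ch) > 0xFFFF:
            raise ValueError("U+%04X is outside the UCS-2 range" % ord(ch))
    # For BMP-only text, UTF-16BE and big-endian UCS-2 are byte-identical.
    return text.encode("utf-16-be")
```

Characters outside the BMP would need separate handling (replacement
glyph, or a different encoding for that run).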

> - with images - there is support for png, raw and jpeg, but, as rjek pointed
> out, not for gif (this one can be added based on what we already have in
> NetSurf)
> There are still some developers working on this library, so these issues (esp.
> the first one; the second and third can be worked around in some ways) should
> be fixed someday. For now there are no critical obstacles.

I don't think those issues should rule out Haru, and I believe they are
solvable.

> Other possible choices, that I see, are:
> - using cairo (has pdf surfaces, but with more limited functionality; rjek
> complained about its dependencies)

I agree.

> - making a .ps and converting it with GS

The font/encoding aspect in PS writing is a bit more difficult than in PDF...

> - hacking some sources of a program with a suitable licence to extract the
> needed functions
> - finding another library ( although I made some research when writing my
> proposal and during the last few days and I didn't find anything better )
> So I am proposing to use Haru - what do you think about it?

Yes, we should go ahead with Haru.

John Tytgat
joty at netsurf-browser.org
