Source scrubbing
by Franz Korntner
Hi,
I first contacted Rob Kendrick who suggested I join this list. Attached
is the mail I originally sent him and the response to his mail he
forwarded to this mailing list.
In the meantime I noticed that my contribution was incomplete as I
missed out the riscos subdirectory. I work under linux and have no
access to a platform running riscos. The full patch netsurf-1.1a4.patch.gz
is some 10000-odd lines, and I have made a selection of interesting
changes in netsurf-1.1a4.fixes.gz which might be considered
bugfixes. [The a4 is my internal version number.]
The intention was to scrub the sources so as to analyze them for one of our
porting projects. The result should compile with extra
switches such as -ansi -Wall -Wextra -Werror. You should also be able to
pull it through a C++ compiler with -x c++. I require the latter to
allow for extra checking (to relax porting requirements) and to use
function and operator overloading to analyze data flow and scoping
through the application.
Finally I constructed autoconf/automake templates as I require certain
functionality not delivered by the supplied makefile. I suggest you
include these files so that package/distribution builders can choose
which system to use. For the same reasons I suggest you include
precompiled versions of lemon/re2c generated files.
At the moment I'm analyzing the rendering coordinate system, the numeric
operations on it, the mixed use of ints and floats, the knowledge that
an FPU is absent, and the propagation of rounding errors. It makes for
interesting reading and I see possibilities to drastically reduce FP
operations in the rendering engine while gaining accuracy.
Franz.
---------------------------------
Hi,
At this moment I am busy porting netsurf (specifically the rendering
engine) to a different platform and environment. A part of that process
is to ensure that the source code is of a certain level of maturity so
that our methods and tools can get their teeth properly into it. As a
result the netsurf sources have been scrubbed down. The process mainly
involves checking type hinting, prototyping, scoping, const usage,
signedness, use of floating point and the C89 standard.
I would hereby like to contribute the applied changes. I have also
included automake/autoconf templates. However as I am developing under
linux and have no access to riscos, I am not sure about the
desirability. To enable autoconf/automake, just issue the commands
'automake -c -a --foreign; autoconf'. Otherwise just drop configure.ac,
aclocal.m4 and all the Makefile.am.
The file netsurf-1.1a.patch.gz contains the complete modifications.
The patch is quite large. It contains mostly type hinting and code
restructuring, which by themselves should not affect program
functionality. The modifications that do are considered 'real' patches
and I have isolated them in the file netsurf-1.1a.patchonly.gz for
visual reference.
The patch also contains many casts to int. This is because the source
values are floats and the assignment causes loss of precision. I sometimes get
the impression that the effect might cascade (especially when scaling)
and I feel that it might be wiser to use floats throughout the whole
renderer and convert to ints in the backend. This not only maximizes
precision, but will most likely run faster.
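The cascade described above can be illustrated with a minimal sketch. The
helper names scale_via_int and scale_via_float are hypothetical and are not
taken from the netsurf sources; they only contrast truncating to int after
every step with keeping a float until the end.

```c
/* Hypothetical helper: scale a length repeatedly, storing every
 * intermediate result back into an int (the fraction is discarded
 * on each pass). */
static int scale_via_int(int length, float factor, int steps)
{
    int i;
    int value = length;

    for (i = 0; i < steps; i++)
        value = (int)(value * factor);
    return value;
}

/* Hypothetical helper: the same computation, but the value stays a
 * float throughout and is rounded to int only once, at the end. */
static int scale_via_float(int length, float factor, int steps)
{
    int i;
    float value = (float)length;

    for (i = 0; i < steps; i++)
        value *= factor;
    return (int)(value + 0.5f);
}
```

Scaling 10 by 0.75 four times gives 2 through the int path and 3 through the
float path, since the exact result is 3.1640625.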
I also very strongly suggest that you include copies of the generated
files css/parser.c css/parser.h css/scanner.c in the distribution. It
relaxes the need to install lemon and re2c. The templates rarely change
and most people just compile and run without wanting a full development
environment.
I needed to rename css/css_enums to css_enum.enum for makefile dependency
reasons. Hope you don't mind.
Finally... I might just be tempted to get a javascript engine running.
But I'm not sure whether to use spidermonkey or ecmascript. Both have
pros and cons and besides, I still need a better 'feeling' for how
changes to css/xml propagate through the system. I'm more easily tempted to
pass certain validations such as acid2 and the complexspiral (perhaps
even more) as the rendering of certain sites leaves something to be desired.
Cheers,
Franz.
--------------------------
Hi Rob,
Thanks for your surprisingly fast response. I haven't connected to
the forum yet and at this point feel it nicer to communicate in a more
personal manner. I would like to give a quick response so you might get
a better impression of what I stand for.
To begin with, it is not my intention to educate or reform anything or
anybody netsurf related. I selected netsurf as the candidate
for one of our projects. As a result it will get tweaked. It would be a
pity if the tweaks stayed internal as they improve the quality level of
the sources. That is, our quality level. It is up to you and the development
team to decide which issues are part of your QC. You can import the ones
that share the same base. Browse through the patchfile and select the
ones to your liking. All patch snippets have a very local scope, except
for css/css.h where the scoping of the enums and structs requires a more
rigorous rewrite.
> NetSurf is written in C99, not C89.
>
>
Hmm. Please don't get me wrong but this is turning the world around.
It's ok to say that 'I *need* certain functionality that is only
available in C99'. What I am experiencing is 'We set the standard to C99
and we are free to use the extra available functionality'.
Browsing the code I have found nothing that requires C99. I have found
only some programming shortcuts that are handy if you have C99, but a
simple rewrite can avoid this requirement.
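As an illustration of such a rewrite: the mail does not say which C99
shortcuts the netsurf sources use, so the example below assumes one of the
most common, a variable declared inside the for statement. The function
total_width is hypothetical.

```c
#include <stddef.h>

/* C99 form, shown only as a comment because it is rejected by -ansi:
 *
 *     for (size_t i = 0; i < n; i++)
 *         total += widths[i];
 *
 * C89 form: every declaration sits at the top of the block. */
static int total_width(const int *widths, size_t n)
{
    size_t i;
    int total = 0;

    for (i = 0; i < n; i++)
        total += widths[i];
    return total;
}
```

The C89 form behaves identically; only the placement of the declaration
changes.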
I have chosen netsurf because it's a great candidate for a small-footprint
multi-platform/multi-language environment. For that to be easy you need
sources that rely on a small base of functionality which is simple to
convert to the target systems. Exercising the 'freedom' of a more
advanced standard not only makes such a conversion more complex, it
reduces the 'unique selling point' of netsurf. And arguing for such a
reduction in favor of programming shortcuts -- which can be easily
avoided -- just isn't right. [Note: the patches supplied make netsurf
fully C89 conformant!]
> I notice your patch includes casting void *s to other pointer types -
> this isn't necessary, and just complicates the code.
>
Yes and no. What happens is that the source argument is of type void*
and you are assigning it to a typed pointer. The compiler has to guess
which type that might be. Some languages allow it, some don't. Some
standards allow it, some don't. It does not complicate the source, it
illustrates. Why make the code less portable because of a programmer's
shortcut to drop a typecast at places where the compiler (and also the
reader) has to guess? Besides, such casts are very useful as they guard
against certain changes of prototypes and definitions. If not present,
the compiler will swallow nearly anything, even undesired changes.
I haven't mentioned it before, but I have made the source code C++
compliant. Why? The GNU C++ compiler has tons of extra checks not
available under C. And C++ will choke if you drop explicit casts when
assigning from void*.
So, you should be able to compile with make CC='gcc -ansi -Wall -Werror'
You should also be able to compile with make CC='gcc -x c++ -Wall -Werror'
This makes the code a very good candidate for porting in many
directions!!!
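The void* point can be sketched as follows. The struct box and the function
box_new are made up for this example and are not the netsurf definitions.
The explicit cast on the malloc result is optional in C but required in C++,
so with it the same file should compile under both.

```c
#include <stdlib.h>

/* Hypothetical struct, for illustration only. */
struct box {
    int x, y, width, height;
};

static struct box *box_new(int width, int height)
{
    /* Plain C accepts "struct box *b = malloc(sizeof *b);" without a
     * cast. A C++ compiler rejects that line, so the cast is spelled
     * out to keep both compilers satisfied. */
    struct box *b = (struct box *)malloc(sizeof *b);

    if (b == NULL)
        return NULL;
    b->x = 0;
    b->y = 0;
    b->width = width;
    b->height = height;
    return b;
}
```

The caller owns the returned pointer and releases it with free().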
You might think I'm pedantic or even worse. I have over 30 years of
experience in the core development of embedded industrial design,
programming languages and their environments. I have ported loads of stuff
from one language and platform to another. My knowledge and experience
have taught me what might be harmful and what might be helpful.
> (The issue is almost all the chars etc on RISC OS are of the
> same signedness, but under UNIX they tend not to be, so you get a
> great deal more warnings.)
>
>
Aha, so that's why for loops cater for unsigned ints. I guess you must
have loads of problems when porting unix packages to riscos!!
No problem for me, as I'm signedness aware. It's good to know, so I'll
be prepared.
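A small sketch of what being signedness aware means in practice. Whether
plain char is signed is implementation-defined in C, so a comparison such as
"*s > 127" can give different answers on different compilers. The function
count_non_ascii is a hypothetical example, not netsurf code.

```c
#include <stddef.h>

/* Count the bytes outside the 7-bit ASCII range in a string. */
static size_t count_non_ascii(const char *s)
{
    size_t n = 0;

    while (*s != '\0') {
        /* Convert through unsigned char so the comparison does not
         * depend on whether plain char is signed on this compiler. */
        if ((unsigned char)*s > 127)
            n++;
        s++;
    }
    return n;
}
```

Without the conversion, a compiler with signed char sees the high bytes as
negative values and the test never succeeds.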
> Autoconf/automake/libtool will never make it into the official version.
> Almost all of the developers are extremely against it. Also, NetSurf
> builds on most UNIXes out of the box assuming the dependencies are
> installed. If it doesn't, there's a bug that needs fixing, not autoconf
> adding.
>
>
Hmm. I used to think like that, but changed my opinion due to experience.
Have you compared the Makefile.am with your makefile and the
functionality they deliver?
I work in a non-standard gnu/linux environment. I do not use standard
distributions but maintain proprietary distributions for our headless
systems. autoconf/automake is a lifesaver. Your makefile certainly does
not work in our development environment. And it's not easy
to fix.
If I may suggest: include the autoconf/automake templates in some
directory. For that I can construct 3 files, configure.ac Makefile.in
and aclocal.m4. They are by themselves harmless. This way the developer
has the *choice* to use the makefile that best fits their needs. Please
don't make that choice for them.
> I'm not entirely sure why we mix things here. We use floats where we
> actually want them, but due to NetSurf's native platform almost always
> lacking an FPU, we've historically avoided using them everywhere.
>
>
To start with, I'm familiar with FPU issues and needed that to port
linux and gnu to StrongArm based embedded systems. That processor family
lacks FPU support. As you say, you use floats where you need them. You
pull values out of ints, perform float operations and store the results into
ints again with loss of precision. Some point later you pull the value
out again, perform other operations and store with even more loss of
precision. 32 bit floats aren't that accurate by themselves and using
this model makes things even worse.
At the moment I'm examining the possibility of an alternative model
where intermediate results do not result in loss of precision. But don't
worry, I'm highly experienced in this field so I am confident enough to
state that I know what I'm doing.
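The mail does not describe the alternative model. Purely as an illustration
of one technique often used on hardware without an FPU, the sketch below
assumes fixed-point arithmetic with 8 fractional bits. The type fixed_t and
the helpers are hypothetical, handle non-negative values only, and are not
a claim about what was actually implemented.

```c
/* Hypothetical fixed-point type: the real value is raw / 256. */
typedef long fixed_t;
#define FIXED_ONE 256L

static fixed_t fixed_from_int(int v)
{
    return (fixed_t)v * FIXED_ONE;
}

/* Multiply two fixed-point values using integer operations only. */
static fixed_t fixed_mul(fixed_t a, fixed_t b)
{
    return (a * b) / FIXED_ONE;
}

/* Round a non-negative fixed-point value to the nearest int. */
static int fixed_round(fixed_t v)
{
    return (int)((v + FIXED_ONE / 2) / FIXED_ONE);
}
```

Intermediate values keep their fractional part between operations, and the
conversion to int happens once, at the point where pixels are needed.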
> We've considered this. It feels unclean. Certainly, lemon and re2c are
> packaged in most distributions.
>
In the past I also shared this view. But experience has changed my opinion.
> I don't see any advantage of including them, given the list of build
> dependencies are clearly stated, and you're going to have to fetch
> some things anyway as they're not all that widely-used.
>
>
How can you ensure that future versions of lemon/re2c will generate
compatible code?
How can you then ensure that older versions of lemon/re2c stay available?
I sincerely hope you never get into the situation where you find that
you cannot get an (older) package working because the requirements
cannot be met. And you cannot upgrade to a newer version. And that you
are in a business critical environment and *need* to get it working, but
can't. And that your boss will not take no for an answer. And he wants to
see an answer NOW!
I have been freelancing in the past to assist in such situations. People
running around screaming and pulling their hair out is
enough to make even the boldest shiver.
>> I needed to rename css/css_enums to css_enum.enum for makefile
>> dependency reasons. Hope you don't mind.
>>
>
> Can you clarify?
>
>
It's got to do with make pattern rules. Like building parser.c from
parser.y. The basename stays the same, only the extension changes. You
build css_enum.c and css_enum.h from css_enums. This greatly complicates
pattern matching. By renaming css_enums to css_enum.<something> you
match the model and have much more freedom with implicit building of
intermediate files, simplifying your makefiles.
> The current layout engine cannot cope with the document changing.
> There's also no DOM support to speak of. John Mark Bell has recently
> been writing a new HTML parser and a DOM library that should hopefully
> be complete within a few months. That would make JavaScript somewhat
> easier.
>
>
Good to know.
I have a good feeling that DOM functionality can easily be injected into
the current HTML parser. I am looking for a small footprint package and
get nervous at the prospect of a complete and/or standalone DOM
component being introduced. This might make things easier.
> We've looked at several JavaScript libraries in the past. The choices
> are narrow when you have the requirements that it be linkable to C,
> compilable on RISC OS, and have a GPL2-compatible licence. Performance
> is also an issue: Many users of NetSurf use it on machines with CPUs
> less than 500MHz, and with no FPU.
>
>
Ok. I'll also take that into account. I'm leaning more towards spidermonkey
as it seems that it can be made lightweight.
> I've looked into using Lua, which is a very very fast interpreter and
> language which is also lightweight. Many of JavaScript's features can
> be easily mapped onto Lua, but I've not examined the possibility in any
> depth.
>
>
I haven't examined this one. I will certainly put it on the candidate
list and give it an internal review.
> Thanks again for your patch. Alas, it's not my decision if any of it
> should go in, but I have forwarded it for the perusal of the other
> developers. You can expect a reply from them.
>
>
Cheers!
> You might want to join the developer's mailing list and continue the
> discussion there - there's a link from our website
> (http://www.netsurf-browser.org/)
>
>
Yes, certainly, but not at this time. I'm too distracted as it is and
you really don't want to know my working hours.
Franz.
14 years, 8 months
Re: netsurf
by Rob Kendrick
(I've read through the mail, and given the patch a brief gander.
Here is the reply I sent to Franz. Please excuse any errors in
clarifying our policy on some matters - I've never been clear on
them myself.)
On Sun, 2007-10-14 at 14:16 +0200, Franz Korntner wrote:
> Hi,
Hi there. I've forwarded your email onto the developer's mailing list,
but I thought I should just acknowledge the receipt of it, and point a
few things out.
> The process mainly
> involves checking type hinting, prototyping, scoping, const usage,
> signedness, use of floating point and the C89 standard.
NetSurf is written in C99, not C89. I notice your patch includes casting
void *s to other pointer types - this isn't necessary, and just
complicates the code. The casts of ints etc are something I've been
planning on doing for some time - good work there. (The issue is almost
all the chars etc on RISC OS are of the same signedness, but under UNIX
they tend not to be, so you get a great deal more warnings.)
> I would hereby like to contribute the applied changes. I have also
> included automake/autoconf templates. However as I am developing under
> linux and have no access to riscos, I am not sure about the
> desirability. To enable autoconf/automake, just issue the commands
> 'automake -c -a --foreign; autoconf'. Otherwise just drop configure.ac,
> aclocal.m4 and all the Makefile.am.
Autoconf/automake/libtool will never make it into the official version.
Almost all of the developers are extremely against it. Also, NetSurf
builds on most UNIXes out of the box assuming the dependencies are
installed. If it doesn't, there's a bug that needs fixing, not autoconf
adding.
> The patch also contains many casts to int. This is because the source
> values are floats and the assignment causes loss of precision. I sometimes get
> the impression that the effect might cascade (especially when scaling)
> and I feel that it might be wiser to use floats throughout the whole
> renderer and convert to ints in the backend. This not only maximizes
> precision, but will most likely run faster.
I'm not entirely sure why we mix things here. We use floats where we
actually want them, but due to NetSurf's native platform almost always
lacking an FPU, we've historically avoided using them everywhere.
> I also very strongly suggest that you include copies of the generated
> files css/parser.c css/parser.h css/scanner.c in the distribution. It
> relaxes the need to install lemon and re2c. The templates rarely change
> and most people just compile and run without wanting a full development
> environment.
We've considered this. It feels unclean. Certainly, lemon and re2c are
packaged in most distributions. The obvious counter-example to this is
Red Hat, which doesn't ship Lemon, either separately or with SQLite. A
friend in Red Hat is sorting this out as we speak. I don't see any
advantage of including them, given the list of build dependencies are
clearly stated, and you're going to have to fetch some things anyway as
they're not all that widely-used.
> I needed to rename css/css_enums to css_enum.enum for makefile dependency
> reasons. Hope you don't mind.
Can you clarify?
> Finally... I might just be tempted to get a javascript engine running.
> But I'm not sure whether to use spidermonkey or ecmascript. Both have
> pros and cons and besides, I still need a better 'feeling' for how
> changes to css/xml propagate through the system. I'm more easily tempted to
> pass certain validations such as acid2 and the complexspiral (perhaps
> even more) as the rendering of certain sites leaves something to be desired.
The current layout engine cannot cope with the document changing.
There's also no DOM support to speak of. John Mark Bell has recently
been writing a new HTML parser and a DOM library that should hopefully
be complete within a few months. That would make JavaScript somewhat
easier.
We've looked at several JavaScript libraries in the past. The choices
are narrow when you have the requirements that it be linkable to C,
compilable on RISC OS, and have a GPL2-compatible licence. Performance
is also an issue: Many users of NetSurf use it on machines with CPUs
less than 500MHz, and with no FPU.
I've looked into using Lua, which is a very very fast interpreter and
language which is also lightweight. Many of JavaScript's features can
be easily mapped onto Lua, but I've not examined the possibility in any
depth.
Thanks again for your patch. Alas, it's not my decision if any of it
should go in, but I have forwarded it for the perusal of the other
developers. You can expect a reply from them.
You might want to join the developer's mailing list and continue the
discussion there - there's a link from our website
(http://www.netsurf-browser.org/)
B.
14 years, 8 months
[Fwd: netsurf]
by Rob Kendrick
I was sent the attached directly. I have not examined the patch at all,
and only glanced at the mail itself.
B.
14 years, 8 months