Hi,
On 10 May 2011 at 14:32, Vincent Sanders wrote:
> I have explained *at length* why I disagree with your approach. I
> shall briefly outline those arguments once more. I have already stated
> I do not wish to discuss this further as you simply will not accept my
> arguments; this review/summary is provided as a pure courtesy.
I'll keep the branch around as a reference and rewrite it on the -v2 one, but here is
what I have to say.
I have also tried to explain the reasons why I went this route; I'll summarize them
again below... Besides, I started this as a joke to see if it would work, so I originally
just wanted something that worked before getting things cleaned up.
I actually started with the fetcher approach, inside the curl fetcher, and it almost worked.
However, the data chunks sent by the server did not always end on a line boundary, so the
simple parser missed entire items. Refusing part of the data by reporting fewer bytes
consumed than presented also resulted in the fetch being aborted (it would actually be
handy if the core just kept the unconsumed data for the next call). Fixing this would have
meant adding caching in the fetcher itself. I looked at how llcache could be used for
this, but I couldn't find any way to use it without presenting data that would end up in a
content object.
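For the record, what I would have needed is a small carry-over buffer between fetch
callbacks; a minimal sketch, with all names hypothetical (this is not existing NetSurf
API):

#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Pending bytes left over from the previous chunk. */
struct line_buffer {
	char *data;
	size_t len;
};

/* Append a chunk, hand every complete CRLF-terminated line to the
 * item parser, and keep whatever trails the last CRLF for the next
 * call instead of losing it. */
static bool feed_chunk(struct line_buffer *lb,
		const char *chunk, size_t size,
		void (*parse_line)(const char *line, size_t len))
{
	size_t start = 0, i;
	char *nbuf = realloc(lb->data, lb->len + size);
	if (nbuf == NULL)
		return false;
	memcpy(nbuf + lb->len, chunk, size);
	lb->data = nbuf;
	lb->len += size;

	for (i = 0; i + 1 < lb->len; i++) {
		if (lb->data[i] == '\r' && lb->data[i + 1] == '\n') {
			parse_line(lb->data + start, i - start);
			start = i + 2;
			i++; /* skip the LF as well */
		}
	}
	/* keep the partial tail line for the next chunk */
	memmove(lb->data, lb->data + start, lb->len - start);
	lb->len -= start;
	return true;
}

That's exactly the caching I didn't want to bolt onto the fetcher itself.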
So I tried the content handler route, and was directed at the old plaintext renderer,
which used the same trick as the current code. Being able to get all the data at once
simplifies the parser as well, though it is probably not as fast.
While it might not follow the current design goal, it does the job, which was the idea for
a first version.
Also, concerning the fetcher vs handler question, I think putting things in the fetcher is
just as bad, if not semantically incorrect. Putting something that massages *content* in a
*fetcher* seems wrong. While I agree it makes some sense for file:, gopher data has a
standardized format, which makes it a content that needs to be *handled*.
Having to dig out the HTML handler's functions and masquerade as it is just a consequence
of not being able to create content objects ex nihilo from arbitrary data instead of from
an llcache object tied to a URL.
Besides, "fetching" html content from gopher directories has side effects like,
one won't be able to view the source data for example. Not very important but
still...
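To illustrate why I consider gopher data a content format: per RFC 1436, each menu item
is a single tab-separated record, like this (with <TAB> standing in for the actual tab
character; the selector here is made up):

1Floodgap Home<TAB>/home<TAB>gopher.floodgap.com<TAB>70<CR><LF>

with the item type as the first character of the display field and the whole menu
terminated by a line containing only ".". That's a structured document format, not a
transport detail.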
> We have made extensive efforts to streamline the data flow through the
> browser in the recent past, starting with jmb implementing the low
> level cache, us making the fetcher interface less troublesome, and
> culminating in the ongoing content handler refactor. I say this as you
> seem to think this ongoing work is directed at you personally somehow.
No, I just happened to pop up at the wrong time and got a bit irritated.
I understand the effort and will try to follow the same route for the v2 branch, but I
still think it's not totally logical either to put gopher HTML generation in the
"fetcher".
> We have come to understand that the curl library simply should not be
> relied upon beyond its http/https implementations. We have implemented
> fetchers directly for things like file: explicitly.
At least for gopher I didn't find any bug in curl.
The fact that the data comes in chunks is most likely just the server sending it that way.
The only thing I thought was a bug in curl was actually a bug in the server at
floodgap.com, which has been fixed since then.
While curl could have tried to map gopher errors to HTTP codes, they don't do it,
probably because it's just a heuristic of checking whether the first character is '3'
(the error item type), which doesn't really tell much.
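The whole heuristic would boil down to something like this sketch (a hypothetical
helper, not something curl provides):

#include <stddef.h>

/* A reply whose first character is '3' is an error item; there is no
 * way to tell a missing selector from a server failure, so whichever
 * HTTP code we pick is a guess. */
static int gopher_guess_http_code(const char *reply, size_t len)
{
	if (len > 0 && reply[0] == '3')
		return 404;
	return 200;
}

Not much better than just rendering the error item as text.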
By the way, is there anything I could use to handle the caching of partial item lines, or
will I have to write it myself?
> Because the curl libraries in actual common use have numerous bugs and
> unwanted behaviours (an example would be curl applying proxy settings
> to file: urls) and the huge memory footprint of curl *in our use case*,
> it looks increasingly like a poor fit.
I don't have much experience with cURL, so I can't tell. I actually didn't check
how proxies should be used for gopher (as opposed to gopher-to-http proxies); at least
some browsers provide a setting for it, but gopher doesn't have any provision for vhosts...
> Because of this we have gone to great efforts to *remove* fetcher
> support from the curl library. As discussed at the developer meetings
> previously, Daniel Silverstone is working on creating an http library
> to eventually completely remove our dependency on curl.
http://source.netsurf-browser.org/branches/dsilvers/netsurf-remove-curl/ ?
I should probably have a look then...
If possible, it should have a lower-level API for doing socket I/O, to be reused in
non-HTTP things.
Meanwhile I'll try to isolate the curl dependency in my code.
> Content
> -------
> As mentioned to you, the way we have chosen for things that generate
> output (primarily directory listings, but also the about: handler) that
> the browser should display as HTML is that the *fetcher* layer returns
> the data as generated HTML. We have created helper functions for this
> generation for directory listings.
Again this goes beyond the job of a *fetcher*, but anyway.
> If the formatted output was not to your liking, these content
> generation routines should have been extended and the formatting
> applied using the internal.css.
I already added things to internal.css for gopher.
> The method of inventing a content type and content handler which then
> rummages around inside a different content handler is full of
> problems. Indeed we have worked to completely remove this method of
> operation and built a data model where it is not supported.
cf. above.
> + * Copyright 2006 James Bursa <bursa(a)users.sourceforge.net>
> + * Copyright 2006 Adrian Lees <adrianl(a)users.sourceforge.net>
> + * Copyright 2011 François Revol <mmu_man(a)users.sourceforge.net>
> odd you credit James and Adrian in a new file?
I originally forked textplain.c as a template, though indeed I didn't keep much of it,
and also reused stuff from dirlist.c.
> +/* HACK HACK */
> +extern const content_handler html_content_handler;
> right there is why you have got no other reviews, by the way: not just
> because it *is* a hack, but because you knew it and did nothing about
> it, nor at least provided a rational explanation.
That is because I was hoping some cleaner way would be introduced, like a content delegate
object of some sort. As I said, it's the result of not being able to create a content
without an llcache backing object, which would have been cleaner.
> +static const char gopher_faked_type[] = "text/x-gopher-directory";
> +static lwc_string *gopher_faked_mime_type;
> as a comment, we are trying to move away from forward declarations;
> secondly, every function should be doc-commented somewhere. If it's
> static, in its scope here. If it's external, a comment saying in which
> header it is documented.
I originally forked textplain.c ...
> + { 'p', "image/png"}, /* at least on gopher://namcub.accelera-labs.com/1/pics */
> + { 0, NULL }
> +};
> arghhhh magic letters! no! an enum if you please
Ok... But they are not magic though; the protocol explicitly uses letters.
Will use an enum next time.
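Something like this, with the values taken from RFC 1436 plus the extensions mentioned
above:

/* Sketch: keep the protocol letters, but name them. */
typedef enum {
	GOPHER_TYPE_TEXT      = '0',
	GOPHER_TYPE_DIRECTORY = '1',
	GOPHER_TYPE_ERROR     = '3',
	GOPHER_TYPE_GIF       = 'g',
	GOPHER_TYPE_HTML      = 'h', /* common extension, not in RFC 1436 */
	GOPHER_TYPE_IMAGE     = 'I',
	GOPHER_TYPE_PNG       = 'p'  /* non-standard, cf. the comment above */
} gopher_item_type;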
> +
> +/**
> + * Create a CONTENT_GOPHER.
> there are six parameters here...
Right, pure laziness, I admit.
> +static char *html_escape_string(char *str)
> +{
> +}
> BLETCH
> is there not already a utility function to do this? If not, create
> one rather than spooging this in, and really this ought to be less
> iffy.
Didn't find one at first sight; I just stripped things from gen_nice_title to get it
done.
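For reference, the kind of shared helper I'd put in utils.c would look about like this
(a minimal sketch; it only escapes the three characters that matter for the generated
markup):

#include <stdlib.h>
#include <string.h>

/* Return a freshly malloc()ed copy of str with HTML metacharacters
 * escaped, or NULL on allocation failure. */
static char *html_escape_string(const char *str)
{
	size_t len = 0;
	const char *p;
	char *res, *q;

	/* first pass: compute the escaped length */
	for (p = str; *p != '\0'; p++) {
		switch (*p) {
		case '&': len += 5; break; /* &amp; */
		case '<':
		case '>': len += 4; break; /* &lt; / &gt; */
		default:  len++; break;
		}
	}
	res = malloc(len + 1);
	if (res == NULL)
		return NULL;
	/* second pass: copy, substituting the entities */
	for (p = str, q = res; *p != '\0'; p++) {
		switch (*p) {
		case '&': memcpy(q, "&amp;", 5); q += 5; break;
		case '<': memcpy(q, "&lt;", 4);  q += 4; break;
		case '>': memcpy(q, "&gt;", 4);  q += 4; break;
		default:  *q++ = *p; break;
		}
	}
	*q = '\0';
	return res;
}

Attribute values would also need " escaped, of course.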
> +
> +
> +static char *gen_nice_title(const char *path)
> +{
> didn't you just have this code above?
No, this one comes from dirlist.c, and gave birth to the one above. I agree it should
probably be moved to utils.c.
> + *
> + * gopher_generate_top()
> + * gopher_generate_title()
> + * gopher_generate_row() -- call 'n' times for 'n' rows
> + * gopher_generate_bottom()
> document this in the caller, not in every function
I originally forked textplain.c ...
but yeah.
> + /*"<!-- base href=\"%s\" -->\n"*//* XXX: needs the
content url */
> + /* Don't do that:
> + * seems to trigger a reparsing of the gopher data itself as html...
> + * "<meta http-equiv=\"Content-Type\" content=\"text/html;
charset=UTF-8\" />\n"
> + */
> yuk
What?
Failed attempts.
> + /* TODO: move this to clean CSS in internal.css */
> well why have you not?
Actually I did; I forgot to remove the TODO.
> +static bool gopher_generate_row_internal(char type, char *fields[5],
> what's this magic number? not an enum, so fields[] has meaning?
Hmm ok...
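Something like this would name both the indices and the count (a sketch, inferred from
how fields[] is used in the snippets here):

/* The four tab-separated fields of an RFC 1436 menu line, plus one
 * slot for the trailing Gopher+ flag. */
enum gopher_field {
	FIELD_NAME = 0,    /* display string, type char already stripped */
	FIELD_SELECTOR,
	FIELD_HOST,
	FIELD_PORT,
	FIELD_GOPHER_PLUS, /* "+" on Gopher+ servers */
	FIELD_COUNT        /* = 5, replaces the magic number */
};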
> + /* XXX: outputting \n generates better looking html code,
> + * but currently screws up indentation due to a bug.
> what bug? why isn't it fixed?
Don't think so.
It's visible on:
http://revolf.free.fr/beos/shots/shot_netsurf_gopher_vs_fx2_001.png
lines like "(plus using..." and "Get your own..." are shifted on the
right by a parasite whitespace.
I don't have much knowledge of the html renderer to fix that.
> + */
> +#define HTML_LF
> +/*#define HTML_LF "\n"*/
> +
> yuk
until someone fixes it :p
> + case '0': /* text/plain link */
> + error = snprintf(buffer, buffer_length,
> + "<a href=\"gopher://%s%s%s/%c%s\">"HTML_LF
> + "<span
class=\"text\">%s</span></a>"HTML_LF
> + "<br/>"HTML_LF,
> + fields[2],
> + alt_port ? ":" : "",
> + alt_port ? fields[3] : "",
> + type, fields[1], nice_text);
> way too much magic in one printf.
I can do much worse :P
> + error = snprintf(buffer, buffer_length,
> + "<form method=\"get\"
action=\"gopher://%s%s%s/%c%s\">"HTML_LF
> + "<span class=\"query\">"
> + "<label>%s "
> + "<input name=\"\" type=\"text\"
align=\"right\" />"
> + "</label>"
> + "</span></form>"HTML_LF
> + "<br/>"HTML_LF,
> + fields[2],
> + alt_port ? ":" : "",
> + alt_port ? fields[3] : "",
> + type, fields[1], nice_text);
> seriously? what? someone else might have to read this stuff; please,
> less and more readable
When I said I could do much worse :D
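More seriously, building the item URL once in a helper would already remove most of the
magic; a sketch, with hypothetical names (not what the branch currently does):

#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Format the gopher URL for one menu item into buf; returns false if
 * it did not fit. */
static bool gopher_item_url(char *buf, size_t len, char type,
		const char *selector, const char *host, const char *port)
{
	bool alt_port = (strcmp(port, "70") != 0);
	int n = snprintf(buf, len, "gopher://%s%s%s/%c%s",
			host,
			alt_port ? ":" : "",
			alt_port ? port : "",
			type, selector);
	return n >= 0 && (size_t)n < len;
}

The anchor and form cases would then interpolate a single %s for the URL, which is much
easier to proofread.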
> + case 'g':
> + case 'I':
> + case 'p':
> + /* quite dangerous, cf. gopher://namcub.accela-labs.com/1/pics */
> dangerous? explain!
Some search queries return gopher directories with a lot of images, which tend to stall
the GUI while fetching if we enable inlining them (with <img>).
Some clients, like the OverbiteFF Firefox plugin, inline them inside an iframe that is
only shown (and fetched) when clicked, but we can't do this yet.
An alternative would be to only inline the first N images...
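...which could be as simple as this sketch (the cap and the counter are hypothetical):

#include <stdbool.h>

enum { GOPHER_MAX_INLINE_IMAGES = 10 }; /* arbitrary cap */

/* Decide whether this image row still gets an <img>; everything past
 * the cap would be emitted as a plain link instead. */
static bool gopher_should_inline_image(unsigned *inlined_count)
{
	if (*inlined_count >= GOPHER_MAX_INLINE_IMAGES)
		return false;
	(*inlined_count)++;
	return true;
}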
> + if (error < 0 || error >= buffer_length)
> + /* Error or buffer too small */
> + return false;
> + else
> + /* OK */
> + return true;
> +}
> overall this function was too long, too complex, with too many "magic" values
It kinda grew organically...
> + if (type == '.' && field == 0 && p == s) {
> + ;/* XXX: signal end of page? For now we just ignore it. */
> + }
> is that supposed to be a todo?
No, it's an undecided part; it seems we don't really need it, but answering this was part
of the point of the review.
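If we decide to honour it, detecting the RFC 1436 terminator is trivial (sketch):

#include <stdbool.h>
#include <stddef.h>

/* An RFC 1436 menu ends with a line containing a single "."; we could
 * stop parsing there instead of silently swallowing it. */
static bool gopher_is_terminator(const char *line, size_t len)
{
	return len == 1 && line[0] == '.';
}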
> +#ifndef _NETSURF_RENDER_GOPHER_H_
> +#define _NETSURF_RENDER_GOPHER_H_
> it has recently been pointed out to us that macros beginning with _
> belong to the compiler so we ought not to use them plain.
> NETSURF_RENDER_GOPHER_H would be ok
Rather, to the OS vendor IIRC, but yeah.
netsurf/render/textplain.h should be fixed then...
> Index: render/html.c
> ===================================================================
> --- render/html.c (revision 12321)
> +++ render/html.c (working copy)
> @@ -101,7 +101,7 @@
> unsigned int depth);
> #endif
>
> -static const content_handler html_content_handler = {
> +const content_handler html_content_handler = {
> very rude word
"... not being able to create a content object without llcache object ..."
> is this useful in the generic url handling?
You tell me; that's what the review is for.
I don't think it has any use outside gopher, but it was more related to URL handling,
so...
> { "http_proxy_auth_pass", OPTION_STRING,
&option_http_proxy_auth_pass },
> + { "gopher_inline_images", OPTION_BOOL, &option_gopher_inline_images
},
> { "font_size", OPTION_INTEGER, &option_font_size },
> { "font_min_size", OPTION_INTEGER, &option_font_min_size },
> { "font_sans", OPTION_STRING, &option_font_sans },
> would this not have been better as an option to extend the generic
> directory listing to show images inline?
Maybe, but as I said, doing this for gopher is dangerous, at least in the current state.
> various eww in the curl handler... it really is supposed to be only
> for http(s)
Then why is it named *curl.c*? ;-)
>
> +void * fetch_curl_setup_gopher(struct fetch *parent_fetch, const char *url,
> + bool only_2xx, const char *post_urlenc,
> + const struct fetch_multipart_data *post_multipart,
> + const char **headers)
> +{
> + struct curl_fetch_info *f;
> + const char *mime;
> + char type;
> + f = fetch_curl_setup(parent_fetch, url, only_2xx, post_urlenc,
> + post_multipart, headers);
> + if (url_gopher_type(url, &type) == URL_FUNC_OK && type) {
> + f->gopher_type = type;
> + } else {
> + f->http_code = 404;
> + fetch_set_http_code(f->fetch_handle, f->http_code);
> + }
> +
> + mime = gopher_type_to_mime(type);
> + /* TODO: add a fetch_mimetype_by_ext() or fetch_mimetype_sniff_data() */
> err, we already have a way of doing this for file:; perhaps you can use that
Right, by extension, though I'm not sure every platform's code actually likes being called
with non-existent files; it isn't obvious from the documentation.
For example, the BeOS code tries to open the file to read its MIME type attribute.
I added some hardcoded types for known extensions, before noticing the
BMimeType::GuessMimeType() methods that take a filename and a data buffer.
Maybe we should have one for real files and one that only checks extensions?
On the other hand, we could just assume the file might not exist and fall back to
extension matching for those.
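The extension-only fallback could look about like this (a sketch; the table is
illustrative and the helper name is just the one proposed in my TODO comment):

#include <string.h>
#include <strings.h>

/* Illustrative extension-to-MIME table. */
static const struct {
	const char *ext;
	const char *mime;
} ext_map[] = {
	{ "txt",  "text/plain" },
	{ "html", "text/html"  },
	{ "gif",  "image/gif"  },
	{ "png",  "image/png"  },
	{ NULL,   NULL }
};

/* Guess a MIME type from the path's extension alone, never touching
 * the filesystem. */
static const char *fetch_mimetype_by_ext(const char *path)
{
	const char *dot = strrchr(path, '.');
	int i;

	if (dot != NULL) {
		for (i = 0; ext_map[i].ext != NULL; i++) {
			if (strcasecmp(dot + 1, ext_map[i].ext) == 0)
				return ext_map[i].mime;
		}
	}
	return "application/octet-stream"; /* safe default */
}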
> @@ -771,6 +820,8 @@
> LOG(("done %s", f->url));
>
> if (abort_fetch == false && result == CURLE_OK) {
> + //if (f->gopher_type)
> + //fetch_curl_gopher_data(NULL, 0, 0, f);
> /* fetch completed normally */
> if (f->stopped ||
> (!f->had_headers &&
> this is C not C++; appropriate comment syntax please, and if it's not
> useful, *remove* it
Dead code leftover.
> + /* TODO: try to guess better from the string ?
> + * like "3 '/bcd' doesn't exist!"
> + * TODO: what about other file types ?
> + */
> + f->http_code = 404;
> hmm do we not have an enum for these? bad project, no biscuit
I like numbers :p
François.