I shall do a brief content review because you did take the time to
make a patch.
I have explained *at length* why I disagree with your approach. I
shall briefly outline those arguemnts once more. I have already stated
I do not wish to discuss this futher as you simply will not accept my
arguments, this review/summary is provided as a pure courtesy.
We have made extensive efforts to streamline the data flow through the
browser in the recent past, starting with jmb implementing the low
level cache, us making the fetcher interface less troublesome and
culminating in the ongoing content handler refactor. I say this as you
seem to think this ongoing work is directed at you personally soemhow.
Curl
----
We have come to understand that the curl library simply should not be
relied upon beyond its http/https implementations. We have implemented
fetchers directly for things like file: explicitly.
Because the curl libraries in actual common use have numerous bugs and
unwanted behaviours (an example would be curl applies proxy settings
to file: urls) and the huge memory footprint of curl *in our use case*
it looks increasingly like a poor fit.
Because of this we have gone to great efforts to *remove* fetcher
support from the curl library. As discussed at the develoepr meetings
previously Daniel Silverstone is working on creating an http library
to eventually completely remove our dependancy on curl.
Content
-------
As mentioned to you, The way we have chosen for things that generate
output (primarily directory listings but the about: handler also) that
the browser should display as HTML is that the *fetcher* layer returns
the data as generated HTML. We have created helper functions for this
generation for directory listings.
If the formatted output was not to your liking these content
generation routines should have been extended and the formatitng
applied using the internal.css.
The method of inventing a content type and content handler which then
rummages around inside a different content handler is full of
problems. Indeed we have worked to completely remove this method of
operation and built a data model where it is not supported.
Review
------
Will be inline with code thats ok elided, as usual.
Added files
Index: render/gopher.c
===================================================================
--- /dev/null 2011-05-08 22:46:04.000000000 +0200
+++ render/gopher.c 2011-05-08 22:41:11.000000000 +0200
@@ -0,0 +1,753 @@
+/*
+ * Copyright 2006 James Bursa <bursa(a)users.sourceforge.net>
+ * Copyright 2006 Adrian Lees <adrianl(a)users.sourceforge.net>
+ * Copyright 2011 François Revol <mmu_man(a)users.sourceforge.net>
odd you credit james and adrian in a new file?
+
+/* HACK HACK */
+extern const content_handler html_content_handler;
right there is why you have got no other reviews by the way, not just
because it *is* a hack but because you knew it and did nothing about
it, or at least provided a rational explanation.
+
+nserror gopher_create(const content_handler *handler,
+ lwc_string *imime_type, const http_parameter *params,
+ llcache_handle *llcache, const char *fallback_charset,
+ bool quirks, struct content **c);
+bool gopher_convert(struct content *c);
+content_type gopher_content_type(lwc_string *mime_type);
+
+static char *gen_nice_title(const char *path);
+static bool gopher_generate_top(char *buffer, int buffer_length);
+static bool gopher_generate_title(const char *title, char *buffer, int buffer_length);
+static bool gopher_generate_row(const char **data, size_t *size,
+ char *buffer, int buffer_length);
+static bool gopher_generate_bottom(char *buffer, int buffer_length);
+
+static const char gopher_faked_type[] = "text/x-gopher-directory";
+static lwc_string *gopher_faked_mime_type;
as a comment we are trying to move away from forward declaration,
secondly every function should be doccommented somewhere. if its
static, in its scope here. If its external a comment saying in which
header it is docuemnted.
+
+static struct {
+ char type;
+ const char *mime;
+} gopher_type_map[] = {
+ /* these come from
http://tools.ietf.org/html/rfc1436 */
+ { '0', "text/plain" },
+ { '1', "text/x-gopher-directory;charset=UTF-8" }, /* gopher directory
*/
+ /* 2 CSO search */
+ /* 3 error message */
+ /* 4 binhex encoded text */
+ /* 5 binary archive file */
+ /* 6 uuencoded text */
+ { '7', "text/x-gopher-directory;charset=UTF-8" }, /* search query */
+ /* 8 telnet: */
+ /* 9 binary */
+ { 'g', "image/gif" },
+ { 'h', "text/html" },
+ /* i information text */
+ /* I image (depends, usually jpeg) */
+ /* s audio (wav?) */
+ /* T tn3270 session */
+
+ /* those are not standardized */
+ { 'd', "application/pdf" }, /* display?? seems to be only for PDF
files so far */
+ { 'p', "image/png"}, /* at least on
gopher://namcub.accelera-labs.com/1/pics */
+ { 0, NULL }
+};
+
arghhhh magic letters! no! an enum if you please
+static const content_handler gopher_content_handler = {
+ gopher_create,
+ NULL,
+ gopher_convert,
+ NULL,
+ NULL,
+ NULL,
+ NULL,
+ NULL,
+ NULL,
+ NULL,
+ NULL,
+ NULL,
+ NULL,
+ gopher_content_type,
+ true
+};
+
yes i know the changes were only applied recently, but named
initialisors pls
+
+/**
+ * Create a CONTENT_GOPHER.
there are six parameters here...
+ */
+
+nserror gopher_create(const content_handler *handler,
+ lwc_string *imime_type, const http_parameter *params,
+ llcache_handle *llcache, const char *fallback_charset,
+ bool quirks, struct content **c)
+
+
+static char *html_escape_string(char *str)
+{
+ char *nice_str, *cnv, *tmp;
+
+ if (str == NULL) {
+ return NULL;
+ }
+
+ /* Convert str for display */
+ nice_str = malloc(strlen(str) * SLEN("&") + 1);
+ if (nice_str == NULL) {
+ return NULL;
+ }
+
+ /* Escape special HTML characters */
+ for (cnv = nice_str, tmp = str; *tmp != '\0'; tmp++) {
+ if (*tmp == '<') {
+ *cnv++ = '&';
+ *cnv++ = 'l';
+ *cnv++ = 't';
+ *cnv++ = ';';
+ } else if (*tmp == '>') {
+ *cnv++ = '&';
+ *cnv++ = 'g';
+ *cnv++ = 't';
+ *cnv++ = ';';
+ } else if (*tmp == '&') {
+ *cnv++ = '&';
+ *cnv++ = 'a';
+ *cnv++ = 'm';
+ *cnv++ = 'p';
+ *cnv++ = ';';
+ } else {
+ *cnv++ = *tmp;
+ }
+ }
+ *cnv = '\0';
+
+ return nice_str;
+}
BLETCH
is there not already a utility function to do this, and if not create
one rather than spooging this in and really this ought to be less
iffy.
+
+
+static char *gen_nice_title(const char *path)
+{
+ const char *tmp;
+ char *nice_path, *cnv;
+ char *title;
+ int title_length;
+
+ /* Convert path for display */
+ nice_path = malloc(strlen(path) * SLEN("&") + 1);
+ if (nice_path == NULL) {
+ return NULL;
+ }
+
+ /* Escape special HTML characters */
+ for (cnv = nice_path, tmp = path; *tmp != '\0'; tmp++) {
+ if (*tmp == '<') {
+ *cnv++ = '&';
+ *cnv++ = 'l';
+ *cnv++ = 't';
+ *cnv++ = ';';
+ } else if (*tmp == '>') {
+ *cnv++ = '&';
+ *cnv++ = 'g';
+ *cnv++ = 't';
+ *cnv++ = ';';
+ } else if (*tmp == '&') {
+ *cnv++ = '&';
+ *cnv++ = 'a';
+ *cnv++ = 'm';
+ *cnv++ = 'p';
+ *cnv++ = ';';
+ } else {
+ *cnv++ = *tmp;
+ }
+ }
+ *cnv = '\0';
didnt you just have this code above?
+
+bool gopher_need_generate(char type)
+{
+ switch (type) {
+ case '1':
+ case '7':
+ return true;
+ default:
+ return false;
+ }
+}
+
magic letters again
+
+/**
+ * Generates the top part of an HTML directory listing page
+ *
+ * \return true iff buffer filled without error
+ *
+ * This is part of a series of functions. To generate a complete page,
+ * call the following functions in order:
+ *
+ * gopher_generate_top()
+ * gopher_generate_title()
+ * gopher_generate_row() -- call 'n' times for 'n' rows
+ * gopher_generate_bottom()
docuemnt this in the caller, not in every function
+ */
+
+static bool gopher_generate_top(char *buffer, int buffer_length)
+{
+ int error = snprintf(buffer, buffer_length,
+ "<html>\n"
+ "<head>\n"
+ /*"<!-- base href=\"%s\" -->\n"*//* XXX: needs the content
url */
+ /* Don't do that:
+ * seems to trigger a reparsing of the gopher data itself as html...
+ * "<meta http-equiv=\"Content-Type\" content=\"text/html;
charset=UTF-8\" />\n"
+ */
yuk
+ /* TODO: move this to clean CSS in internal.css */
well why have you not?
+ "<link rel=\"stylesheet\"
title=\"Standard\" "
+ "type=\"text/css\"
href=\"resource:internal.css\">\n");
+
+ if (error < 0 || error >= buffer_length)
+ /* Error or buffer too small */
+ return false;
+ else
+ /* OK */
+ return true;
+}
+
+
+
+/**
+ * Internal worker called by gopher_generate_row().
+ */
+
+static bool gopher_generate_row_internal(char type, char *fields[5],
whats this magic number? not an enum so fileds[] has meaning?
+ char *buffer, int buffer_length)
+{
+ char *nice_text;
+ char *redirect_url = NULL;
+ int error;
+ bool alt_port = false;
+ char *username = NULL;
+
+ if (fields[3] && strcmp(fields[3], "70"))
+ alt_port = true;
+
+ /* escape html special characters */
+ nice_text = html_escape_string(fields[0]);
+
+ /* XXX: outputting \n generates better looking html code,
+ * but currently screws up indentation due to a bug.
what bug? why isnt it fixed?
+ */
+#define HTML_LF
+/*#define HTML_LF "\n"*/
+
yuk
+ switch (type) {
+ case '.':
+ /* end of the page */
+ *buffer = '\0';
+ break;
+ case '0': /* text/plain link */
+ error = snprintf(buffer, buffer_length,
+ "<a href=\"gopher://%s%s%s/%c%s\">"HTML_LF
+ "<span class=\"text\">%s</span></a>"HTML_LF
+ "<br/>"HTML_LF,
+ fields[2],
+ alt_port ? ":" : "",
+ alt_port ? fields[3] : "",
+ type, fields[1], nice_text);
way too much magic in one printf.
+ break;
+ case '9': /* binary */
+ error = snprintf(buffer, buffer_length,
+ "<a href=\"gopher://%s%s%s/%c%s\">"HTML_LF
+ "<span class=\"binary\">%s</span></a>"HTML_LF
+ "<br/>"HTML_LF,
+ fields[2],
+ alt_port ? ":" : "",
+ alt_port ? fields[3] : "",
+ type, fields[1], nice_text);
+ break;
ditto
+ case '1':
+ /*
+ * directory link
+ */
+ error = snprintf(buffer, buffer_length,
+ "<a href=\"gopher://%s%s%s/%c%s\">"HTML_LF
+ "<span class=\"dir\">%s</span></a>"HTML_LF
+ "<br/>"HTML_LF,
+ fields[2],
+ alt_port ? ":" : "",
+ alt_port ? fields[3] : "",
+ type, fields[1], nice_text);
+ break;
and again
+ case '3':
+ /* Error
+ */
+ error = snprintf(buffer, buffer_length,
+ "<span
class=\"error\">%s</span><br/>"HTML_LF,
+ nice_text);
+ break;
+ case '7':
+ /* TODO: handle search better.
+ * For now we use an unnamed input field and accept sending ?=foo
+ * as it seems at least Veronica-2 ignores the = but it's unclean.
+ */
handle it or a better comment
+ error = snprintf(buffer, buffer_length,
+ "<form method=\"get\"
action=\"gopher://%s%s%s/%c%s\">"HTML_LF
+ "<span class=\"query\">"
+ "<label>%s "
+ "<input name=\"\" type=\"text\"
align=\"right\" />"
+ "</label>"
+ "</span></form>"HTML_LF
+ "<br/>"HTML_LF,
+ fields[2],
+ alt_port ? ":" : "",
+ alt_port ? fields[3] : "",
+ type, fields[1], nice_text);
seriously? what? someome else might thave to read this stuff, please, less and more
readable
+ break;
+ case '8':
+ /* telnet: links
+ * cf. gopher://78.80.30.202/1/ps3
+ * -> gopher://78.80.30.202:23/8/ps3/new -> new(a)78.80.30.202
+ */
+ alt_port = false;
+ if (fields[3] && strcmp(fields[3], "23"))
+ alt_port = true;
+ username = strrchr(fields[1], '/');
+ if (username)
+ username++;
+ error = snprintf(buffer, buffer_length,
+ "<a href=\"telnet://%s%s%s%s%s\">"HTML_LF
+ "<span class=\"dir\">%s</span></a>"HTML_LF
+ "<br/>"HTML_LF,
+ username ? username : "",
+ username ? "@" : "",
+ fields[2],
+ alt_port ? ":" : "",
+ alt_port ? fields[3] : "",
+ nice_text);
+ break;
ditto
+ case 'g':
+ case 'I':
+ case 'p':
+ /* quite dangerous, cf.
gopher://namcub.accela-labs.com/1/pics */
dangerous? explain!
+ if (option_gopher_inline_images) {
+ error = snprintf(buffer, buffer_length,
+ "<a href=\"gopher://%s%s%s/%c%s\">"HTML_LF
+ "<span class=\"img\">%s "HTML_LF /*
</span><br/> */
+ //"<span class=\"img\" >"HTML_LF
+ "<img src=\"gopher://%s%s%s/%c%s\"
alt=\"%s\"/>"HTML_LF
+ "</span>"
+ "</a>"
+ "<br/>"HTML_LF,
+ fields[2],
+ alt_port ? ":" : "",
+ alt_port ? fields[3] : "",
+ type, fields[1],
+ nice_text,
+ fields[2],
+ alt_port ? ":" : "",
+ alt_port ? fields[3] : "",
+ type, fields[1],
+ nice_text);
+ break;
+ }
complex
+ /* fallback to default, link them */
+ error = snprintf(buffer, buffer_length,
+ "<a href=\"gopher://%s%s%s/%c%s\">"HTML_LF
+ "<span class=\"dir\">%s</span></a>"HTML_LF
+ "<br/>"HTML_LF,
+ fields[2],
+ alt_port ? ":" : "",
+ alt_port ? fields[3] : "",
+ type, fields[1], nice_text);
+ break;
+ case 'h':
+ if (fields[1] && strncmp(fields[1], "URL:", 4) == 0)
SLEN()
+ redirect_url = fields[1] + 4;
+ /* cf. gopher://pineapple.vg/1 */
+ if (fields[1] && strncmp(fields[1], "/URL:", 5) == 0)
+ redirect_url = fields[1] + 5;
+ if (redirect_url) {
+ error = snprintf(buffer, buffer_length,
+ "<a href=\"%s\">"HTML_LF
+ "<span class=\"link\">%s</span></a>"HTML_LF
+ "<br/>"HTML_LF,
+ redirect_url,
+ nice_text);
+ } else {
+ /* cf.
gopher://sdf.org/1/sdf/classes/ */
+ error = snprintf(buffer, buffer_length,
+ "<a href=\"gopher://%s%s%s/%c%s\">"HTML_LF
+ "<span class=\"dir\">%s</span></a>"HTML_LF
+ "<br/>"HTML_LF,
+ fields[2],
+ alt_port ? ":" : "",
+ alt_port ? fields[3] : "",
+ type, fields[1], nice_text);
+ }
+ break;
+ case 'i':
+ error = snprintf(buffer, buffer_length,
+ "<span class=\"info\">%s</span><br/>"HTML_LF,
+ nice_text);
+ break;
+ default:
+ LOG(("warning: unknown gopher item type 0x%02x '%c'\n", type,
type));
+ error = snprintf(buffer, buffer_length,
+ "<a href=\"gopher://%s%s%s/%c%s\">"HTML_LF
+ "<span class=\"dir\">%s</span></a>"HTML_LF
+ "<br/>"HTML_LF,
+ fields[2],
+ alt_port ? ":" : "",
+ alt_port ? fields[3] : "",
+ type, fields[1], nice_text);
+ break;
+ }
+
+ free(nice_text);
+
+ if (error < 0 || error >= buffer_length)
+ /* Error or buffer too small */
+ return false;
+ else
+ /* OK */
+ return true;
+}
overall this function was too long, too complex, too many "magic" value
+
+static bool gopher_generate_row(const char **data, size_t *size,
+ char *buffer, int buffer_length)
+{
+ bool ok = false;
+ char type = 0;
+ int field = 0;
+ /* name, selector, host, port, gopher+ flag */
+ char *fields[5] = { NULL, NULL, NULL, NULL, NULL };
magic value
+ const char *s = *data;
+ const char *p = *data;
+ int i;
+
+ for (; *size && *p; p++, (*size)--) {
+ if (!type) {
+ type = *p;
+ if (!type || type == '\n' || type == '\r') {
if type is not
what? our idom is if (type == NULL)
+ LOG(("warning: invalid gopher item type 0x%02x\n",
type));
+ }
+ s++;
+ continue;
+ }
+ switch (*p) {
+ case '\n':
+ if (field > 0) {
+ LOG(("warning: unterminated gopher item '%c'\n", type));
+ }
+ //FALLTHROUGH
no c++ comemnts please
+ case '\r':
+ if (*size < 1 || p[1] != '\n') {
+ LOG(("warning: CR without LF in gopher item '%c'\n", type));
+ }
+ if (field < 3 && type != '.') {
+ LOG(("warning: unterminated gopher item '%c'\n", type));
+ }
+ fields[field] = malloc(p - s + 1);
+ memcpy(fields[field], s, p - s);
+ fields[field][p - s] = '\0';
+ if (type == '.' && field == 0 && p == s) {
+ ;/* XXX: signal end of page? For now we just ignore it. */
+ }
is that supposed to be a todo?
+ ok = gopher_generate_row_internal(type, fields, buffer,
buffer_length);
+ for (i = 0; i < 5; i++) {
magic numbers and please can you try and make variable names
meaningful, it is my personal pet peeve that we are writing C not
fortran but good variable names are important.
+ free(fields[i]);
+ fields[i] = NULL;
+ }
+ (*size)--;
+ p++;
+ if (*size && *p == '\n') {
+ p++;
+ (*size)--;
+ }
+ *data = p;
+ field = 0;
+ return ok;
+ case '\x09':
+ if (field >= 4) {
+ LOG(("warning: extra tab in gopher item '%c'\n", type));
+ break;
+ }
+ fields[field] = malloc(p - s + 1);
+ memcpy(fields[field], s, p - s);
+ fields[field][p - s] = '\0';
not strndup?
Index: render/gopher.h
===================================================================
--- /dev/null 2011-05-08 22:46:04.000000000 +0200
+++ render/gopher.h 2011-05-08 22:41:11.000000000 +0200
+
+#ifndef _NETSURF_RENDER_GOPHER_H_
+#define _NETSURF_RENDER_GOPHER_H_
it has recently been pointe dout to us that macros beginning _ belong
to the compiler so we ought not to use them plain
NETSURF_RENDER_GOPHER_H would be ok
+
+#include <stddef.h>
+
+struct content;
+struct http_parameter;
+
+nserror gopher_init(void);
+void gopher_fini(void);
+
+const char *gopher_type_to_mime(char type);
+bool gopher_need_generate(char type);
docuemntation?
Index: render/html.c
===================================================================
--- render/html.c (revision 12321)
+++ render/html.c (working copy)
@@ -101,7 +101,7 @@
unsigned int depth);
#endif
-static const content_handler html_content_handler = {
+const content_handler html_content_handler = {
very rude word
html_create,
html_process_data,
html_convert,
Index: utils/url.c
===================================================================
--- utils/url.c (revision 12321)
+++ utils/url.c (working copy)
@@ -856,6 +856,38 @@
}
/**
+ * Extract the gopher document type from an URL
+ *
+ * \param url an absolute URL
+ * \param result pointer to buffer to hold result
+ * \return URL_FUNC_OK on success
+ */
+
+url_func_result url_gopher_type(const char *url, char *result)
+{
+ url_func_result status;
+ struct url_components components;
+
+ assert(url);
+
+ status = url_get_components(url, &components);
+ if (status == URL_FUNC_OK) {
+ if (!components.path) {
+ status = URL_FUNC_FAILED;
+ } else {
+ if (strlen(components.path) < 2)
+ *result = '1';
+ else if (components.path[0] == '/')
+ *result = components.path[1];
+ else
+ status = URL_FUNC_FAILED;
+ }
+ }
+ url_destroy_components(&components);
+ return status;
+}
is this useful in the generic url handling?
Index: desktop/options.c
===================================================================
--- desktop/options.c (revision 12321)
+++ desktop/options.c (working copy)
@@ -69,6 +69,8 @@
char *option_http_proxy_auth_user = 0;
/** Proxy authentication password */
char *option_http_proxy_auth_pass = 0;
+/** Inline image items in Gopher pages. Dangerous. */
+bool option_gopher_inline_images = false;
/** Default font size / 0.1pt. */
int option_font_size = 128;
/** Minimum font size. */
@@ -248,6 +250,7 @@
{ "http_proxy_auth", OPTION_INTEGER, &option_http_proxy_auth },
{ "http_proxy_auth_user", OPTION_STRING, &option_http_proxy_auth_user },
{ "http_proxy_auth_pass", OPTION_STRING, &option_http_proxy_auth_pass },
+ { "gopher_inline_images", OPTION_BOOL, &option_gopher_inline_images },
{ "font_size", OPTION_INTEGER, &option_font_size },
{ "font_min_size", OPTION_INTEGER, &option_font_min_size },
{ "font_sans", OPTION_STRING, &option_font_sans },
would this not have been better as an option to extend the generic
directory listing to show images inline?
Index: content/fetchers/curl.c
===================================================================
--- content/fetchers/curl.c (revision 12321)
+++ content/fetchers/curl.c (working copy)
various eww in teh curl handler...it really is supposed to be only for http(s)
+void * fetch_curl_setup_gopher(struct fetch *parent_fetch, const char *url,
+ bool only_2xx, const char *post_urlenc,
+ const struct fetch_multipart_data *post_multipart,
+ const char **headers)
+{
+ struct curl_fetch_info *f;
+ const char *mime;
+ char type;
+ f = fetch_curl_setup(parent_fetch, url, only_2xx, post_urlenc,
+ post_multipart, headers);
+ if (url_gopher_type(url, &type) == URL_FUNC_OK && type) {
+ f->gopher_type = type;
+ } else {
+ f->http_code = 404;
+ fetch_set_http_code(f->fetch_handle, f->http_code);
+ }
+
+ mime = gopher_type_to_mime(type);
+ /* TODO: add a fetch_mimetype_by_ext() or fetch_mimetype_sniff_data() */
err we already have a way of doing this for file: perhaps you can use that
+ /*
+ if (mime == NULL)
+ mime = "application/octet-stream";
+ */
+
ummm?
@@ -771,6 +820,8 @@
LOG(("done %s", f->url));
if (abort_fetch == false && result == CURLE_OK) {
+ //if (f->gopher_type)
+ //fetch_curl_gopher_data(NULL, 0, 0, f);
/* fetch completed normally */
if (f->stopped ||
(!f->had_headers &&
this is C not C++, apropriate comment syntax please and if its not
useful *remove* it
@@ -978,6 +1029,23 @@
struct curl_fetch_info *f = _f;
CURLcode code;
+ /* gopher data receives special treatment */
+ if (f->gopher_type && gopher_need_generate(f->gopher_type)) {
+ /* type 3 items report an error */
+ if (!f->http_code) {
+ if (data[0] == '3') {
magic *again* blergh
+ /* TODO: try to guess better from the string ?
+ * like "3 '/bcd' doesn't exist!"
+ * TODO: what about other file types ?
+ */
+ f->http_code = 404;
hmm do we not have an enum for there? bad project, no biccuit
+ } else {
+ f->http_code = 200;
+ }
+ fetch_set_http_code(f->fetch_handle, f->http_code);
+ }
+ }
+
/* ensure we only have to get this information once */
if (!f->http_code)
{
Even leaving aside the structural issues, this needs a *lot* of work before its merged.
--
Regards Vincent