Better API for fetch_filetype/fetch_mimetype
by François Revol
We've been having some discussion on IRC over the current mimetype detection...
There are several issues with the current call:
- it assumes a local file, so implementations try to read xattrs,
- it returns "text/html" on failure instead of NULL, which has some side effects.
In several places I need similar calls for URLs or data buffers, but I can't use the existing call correctly there.
For example, with gopher some item types don't have a specific MIME type, so it must be probed either by extension or by sniffing the incoming data. But fetch_filetype() returns text/html by default, so using it makes the browser try to display binary files...
BeOS has had a MIME sniffer for 15 years, and other "sane" OSes have one too.
XDG seems to have something specified as well:
http://www.freedesktop.org/wiki/Specifications/shared-mime-info-spec
There seems to be an attempt at normalizing some mime sniffing as part of HTML5 also:
http://code.google.com/p/mimesniff/
http://tools.ietf.org/html/draft-abarth-mime-sniff-03
What I'd propose is to have something like:
[const?] char *fetch_mime_localfile(const char *path);
[const?] char *fetch_mime_by_ext(const char *filename);
[const?] char *fetch_mime_by_data(const void *data, size_t size);
The last two, the new ones, are easily implemented on BeOS, for example with the BMimeType::GuessMimeType() variants:
http://cvincent.pagesperso-orange.fr/bebook/Release Notes/StorageKit.html#GuessMimeType()
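To make the idea concrete, here is a rough, portable sketch of the two new calls. It assumes the `const char *` variant of the return type and that NULL means "unknown"; the extension table and the handful of magic numbers are only illustrative, and a real front end would ask the OS sniffer instead.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Illustrative table only; not the list any front end actually uses. */
static const struct {
	const char *ext;
	const char *mime;
} ext_map[] = {
	{ "html", "text/html" },
	{ "txt",  "text/plain" },
	{ "gif",  "image/gif" },
	{ "png",  "image/png" },
	{ "jpg",  "image/jpeg" },
	{ "pdf",  "application/pdf" },
};

/* Case-insensitive comparison of two extensions. */
static int ext_equal(const char *a, const char *b)
{
	while (*a != '\0' && *b != '\0') {
		if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
			return 0;
		a++;
		b++;
	}
	return *a == '\0' && *b == '\0';
}

/* Guess a MIME type from a file name or URL path; NULL if unknown. */
const char *fetch_mime_by_ext(const char *filename)
{
	const char *dot;
	size_t i;

	assert(filename != NULL);

	dot = strrchr(filename, '.');
	if (dot == NULL || strchr(dot, '/') != NULL)
		return NULL;	/* no extension in the last path segment */

	for (i = 0; i < sizeof(ext_map) / sizeof(ext_map[0]); i++) {
		if (ext_equal(dot + 1, ext_map[i].ext))
			return ext_map[i].mime;
	}
	return NULL;
}

/* Guess a MIME type from the first bytes of the data; NULL if unknown. */
const char *fetch_mime_by_data(const void *data, size_t size)
{
	const unsigned char *p = data;

	if (p == NULL)
		return NULL;
	if (size >= 6 && (memcmp(p, "GIF87a", 6) == 0 ||
			memcmp(p, "GIF89a", 6) == 0))
		return "image/gif";
	if (size >= 8 && memcmp(p, "\x89PNG\r\n\x1a\n", 8) == 0)
		return "image/png";
	if (size >= 3 && memcmp(p, "\xff\xd8\xff", 3) == 0)
		return "image/jpeg";
	if (size >= 5 && memcmp(p, "%PDF-", 5) == 0)
		return "application/pdf";
	return NULL;
}
```

The point is that a caller gets NULL and can decide for itself what to do (try the other call, fall back to a download), rather than being handed text/html.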
François.
12 years, 3 months
OSX URI declaration in the bundle
by François Revol
I'm wondering why we use org.netsurf-browser.NetSurf.URI as CFBundleURLName...
Shouldn't it be a human-readable name?
It seems Safari just uses "Web site URL".
Firefox seems to separate http and https though:
<dict>
    <key>CFBundleURLIconFile</key>
    <string>document.icns</string>
    <key>CFBundleURLName</key>
    <string>http URL</string>
    <key>CFBundleURLSchemes</key>
    <array>
        <string>http</string>
    </array>
</dict>
<dict>
    <key>CFBundleURLIconFile</key>
    <string>document.icns</string>
    <key>CFBundleURLName</key>
    <string>https URL</string>
    <key>CFBundleURLSchemes</key>
    <array>
        <string>https</string>
    </array>
</dict>
François.
12 years, 4 months
More static analysis
by Vincent Sanders
jmb made some corrections after yesterday's run, so I repeated it this morning:
http://www.kyllikki.org/scan-build-2011-05-17-1.tar.gz
Log was:
scan-build: 'clang' executable not found in '/home/vince/clang/llvm/tools/clang/tools/scan-build/bin'.
scan-build: Using 'clang' from path: /home/vince/clang/build/Debug+Asserts/bin//clang
M.CONFIG: JPEG (libjpeg) enabled (NETSURF_USE_JPEG := YES)
M.CONFIG: JNG/MNG/PNG (libmng) disabled (NETSURF_USE_MNG := NO)
M.CONFIG: PDF export (haru) disabled (NETSURF_USE_HARU_PDF := NO)
M.CONFIG: glibc internal iconv enabled (NETSURF_USE_LIBICONV_PLUG := YES)
M.CONFIG: SVG (librsvg-2.0) auto-enabled (NETSURF_USE_RSVG := AUTO)
M.CONFIG: SVG (libsvgtiny) disabled (NETSURF_USE_NSSVG := NO)
M.CONFIG: Sprite (librosprite) auto-disabled (NETSURF_USE_ROSPRITE := AUTO)
M.CONFIG: BMP (libnsbmp) enabled (NETSURF_USE_BMP := YES)
M.CONFIG: GIF (libnsgif) enabled (NETSURF_USE_GIF := YES)
M.CONFIG: PNG (libpng) enabled (NETSURF_USE_PNG := YES)
M.CONFIG: WebP (libwebp) disabled (NETSURF_USE_WEBP := NO)
MKDIR: build-Linux-gtk
MKDIR: build-Linux-gtk/deps
COMPILE: utils/utils.c
COMPILE: utils/utf8.c
COMPILE: utils/useragent.c
COMPILE: utils/url.c
COMPILE: utils/talloc.c
COMPILE: utils/messages.c
COMPILE: utils/log.c
COMPILE: utils/locale.c
COMPILE: utils/http.c
COMPILE: utils/hashtable.c
COMPILE: utils/filepath.c
COMPILE: utils/filename.c
COMPILE: utils/container.c
COMPILE: utils/base64.c
COMPILE: render/textplain.c
COMPILE: render/table.c
render/table.c:419:4: warning: Value stored to 'a_src' is never read
a_src = b_src;
^ ~~~~~
render/table.c:365:4: warning: Value stored to 'a_src' is never read
a_src = b_src;
^ ~~~~~
render/table.c:602:4: warning: Value stored to 'a_src' is never read
a_src = b_src;
^ ~~~~~
render/table.c:646:23: warning: Access to field 'parent' results in a dereference of a null pointer (loaded from variable 'row')
struct box *group = row->parent;
^~~
render/table.c:687:4: warning: Value stored to 'a_src' is never read
a_src = b_src;
^ ~~~~~
5 warnings generated.
COMPILE: render/list.c
COMPILE: render/layout.c
render/layout.c:660:9: warning: Access to field 'next' results in a dereference of a null pointer (loaded from variable 'box')
box = box->next;
^~~
render/layout.c:3950:41: warning: Division by zero
extra = 1 + (cell->min_width - min) /
^
2 warnings generated.
COMPILE: render/imagemap.c
COMPILE: render/hubbub_binding.c
render/hubbub_binding.c:471:15: warning: Assigned value is always the same as the existing value
n->_private = (void *) (++count);
~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~
render/hubbub_binding.c:476:15: warning: Assigned value is always the same as the existing value
n->_private = (void *) (++count);
~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~
render/hubbub_binding.c:492:15: warning: Assigned value is always the same as the existing value
n->_private = (void *) (--count);
~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~
render/hubbub_binding.c:499:15: warning: Assigned value is always the same as the existing value
n->_private = (void *) (--count);
~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~
4 warnings generated.
COMPILE: render/html_redraw.c
COMPILE: render/html_interaction.c
render/html_interaction.c:248:4: warning: Value stored to 'overflow' is never read
overflow = css_computed_overflow(box->style);
^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 warning generated.
COMPILE: render/html.c
COMPILE: render/form.c
render/form.c:913:3: warning: Value stored to 'scroll' is never read
scroll = 0;
^ ~
render/form.c:928:6: warning: Value stored to 'scroll' is never read
scroll = (i + 1) *
^ ~~~~~~~~~
2 warnings generated.
COMPILE: render/font.c
COMPILE: render/box_normalise.c
COMPILE: render/box_construct.c
COMPILE: render/box.c
render/box.c:128:41: warning: The left operand to '|' is always 0
box->flags = style_owned ? (box->flags | STYLE_OWNED) : box->flags;
~~~~~~~~~~ ^
1 warning generated.
COMPILE: image/webp.c
COMPILE: image/svg.c
COMPILE: image/rsvg.c
COMPILE: image/png.c
COMPILE: image/nssprite.c
COMPILE: image/mng.c
COMPILE: image/jpeg.c
COMPILE: image/image.c
COMPILE: image/ico.c
COMPILE: image/gif.c
COMPILE: image/bmp.c
COMPILE: gtk/window.c
COMPILE: gtk/treeview.c
COMPILE: gtk/toolbar.c
gtk/toolbar.c:1058:2: warning: Value stored to 'ii' is never read
ii = BACK_BUTTON;
^ ~~~~~~~~~~~
gtk/toolbar.c:1057:2: warning: Value stored to 'i' is never read
i = BACK_BUTTON;
^ ~~~~~~~~~~~
2 warnings generated.
COMPILE: gtk/thumbnail.c
COMPILE: gtk/throbber.c
COMPILE: gtk/theme.c
COMPILE: gtk/tabs.c
COMPILE: gtk/system_colour.c
COMPILE: gtk/sexy_icon_entry.c
gtk/sexy_icon_entry.c:105:2: warning: Value stored to 'entry_class' is never read
entry_class = GTK_ENTRY_CLASS(klass);
^ ~~~~~~~~~~~~~~~~~~~~~~
gtk/sexy_icon_entry.c:409:2: warning: Value stored to 'gtkentry' is never read
gtkentry = GTK_ENTRY(widget);
^ ~~~~~~~~~~~~~~~~~
gtk/sexy_icon_entry.c:471:10: warning: Access to field 'allocation' results in a dereference of a null pointer (loaded from variable 'widget')
widget->allocation = *allocation;
~~~~~~ ^
gtk/sexy_icon_entry.c:643:24: warning: Access to field 'style' results in a dereference of a null pointer (loaded from variable 'widget')
gtk_paint_flat_box(widget->style, icon_info->window,
^~~~~~
4 warnings generated.
COMPILE: gtk/selection.c
COMPILE: gtk/search.c
COMPILE: gtk/schedule.c
COMPILE: gtk/scaffolding.c
COMPILE: gtk/save.c
COMPILE: gtk/print.c
COMPILE: gtk/plotters.c
COMPILE: gtk/menu.c
COMPILE: gtk/login.c
COMPILE: gtk/hotlist.c
COMPILE: gtk/history.c
COMPILE: gtk/gui.c
gtk/gui.c:438:2: warning: Value stored to 'bw' is never read
bw = browser_window_create(addr, 0, 0, true, false);
^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 warning generated.
COMPILE: gtk/font_pango.c
gtk/font_pango.c:271:2: warning: Value stored to 'context' is never read
context = nsfont_pango_context;
^ ~~~~~~~~~~~~~~~~~~~~
1 warning generated.
COMPILE: gtk/filetype.c
COMPILE: gtk/download.c
COMPILE: gtk/dialogs/source.c
COMPILE: gtk/dialogs/options.c
COMPILE: gtk/dialogs/about.c
COMPILE: gtk/cookies.c
COMPILE: gtk/completion.c
COMPILE: gtk/compat.c
COMPILE: gtk/bitmap.c
COMPILE: desktop/version.c
COMPILE: desktop/tree_url_node.c
COMPILE: desktop/tree.c
desktop/tree.c:469:2: warning: Value stored to 'width' is never read
width = node->box.width;
^ ~~~~~~~~~~~~~~~
desktop/tree.c:1064:7: warning: Access to field 'flags' results in a dereference of a null pointer (loaded from variable 'tree')
if ((tree->flags & TREE_DELETE_EMPTY_DIRS) && parent != NULL &&
^~~~
2 warnings generated.
COMPILE: desktop/thumbnail.c
COMPILE: desktop/textinput.c
desktop/textinput.c:856:6: warning: The left operand to '-=' is always 0
dx -= text_box->x;
~~ ^
desktop/textinput.c:849:6: warning: Assigned value is always the same as the existing value
dx = text_box->x;
~~ ^ ~~~~~~~~~~~
2 warnings generated.
COMPILE: desktop/textarea.c
COMPILE: desktop/sslcert.c
COMPILE: desktop/selection.c
COMPILE: desktop/searchweb.c
COMPILE: desktop/search.c
desktop/search.c:477:7: warning: Assigned value is garbage or undefined
ss = context[top].ss;
^ ~~~~~~~~~~~~~~~
1 warning generated.
COMPILE: desktop/scrollbar.c
desktop/scrollbar.c:220:3: warning: Value stored to 'well_length' is never read
well_length *= scale;
^ ~~~~~
1 warning generated.
COMPILE: desktop/save_text.c
COMPILE: desktop/save_pdf/pdf_plotters.c
COMPILE: desktop/save_pdf/font_haru.c
COMPILE: desktop/save_complete.c
COMPILE: desktop/print.c
COMPILE: desktop/plot_style.c
COMPILE: desktop/options.c
COMPILE: desktop/netsurf.c
COMPILE: desktop/mouse.c
COMPILE: desktop/knockout.c
COMPILE: desktop/hotlist.c
COMPILE: desktop/history_global_core.c
desktop/history_global_core.c:158:2: warning: Value stored to 'node' is never read
node = tree_create_URL_node_shared(global_history_tree,
^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 warning generated.
COMPILE: desktop/history_core.c
COMPILE: desktop/frames.c
COMPILE: desktop/download.c
COMPILE: desktop/cookies.c
COMPILE: desktop/browser.c
COMPILE: css/utils.c
COMPILE: css/select.c
COMPILE: css/internal.c
COMPILE: css/dump.c
css/dump.c:598:2: warning: Value stored to 'val' is never read
val = css_computed_counter_reset(style, &counter);
^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
css/dump.c:578:2: warning: Value stored to 'val' is never read
val = css_computed_counter_increment(style, &counter);
^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2 warnings generated.
COMPILE: css/css.c
COMPILE: content/urldb.c
COMPILE: content/llcache.c
content/llcache.c:1241:15: warning: Access to field 'next' results in a dereference of a null pointer (loaded from field 'prev')
user->prev->next = user->next;
~~~~ ^
1 warning generated.
COMPILE: content/hlcache.c
COMPILE: content/fetchers/resource.c
COMPILE: content/fetchers/file.c
COMPILE: content/fetchers/data.c
COMPILE: content/fetchers/curl.c
TESTMENT: utils/testament.h
COMPILE: content/fetchers/about.c
COMPILE: content/fetch.c
COMPILE: content/dirlist.c
COMPILE: content/content_factory.c
COMPILE: content/content.c
M.CONFIG: JPEG (libjpeg) enabled (NETSURF_USE_JPEG := YES)
M.CONFIG: JNG/MNG/PNG (libmng) disabled (NETSURF_USE_MNG := NO)
M.CONFIG: PDF export (haru) disabled (NETSURF_USE_HARU_PDF := NO)
M.CONFIG: glibc internal iconv enabled (NETSURF_USE_LIBICONV_PLUG := YES)
M.CONFIG: SVG (librsvg-2.0) auto-enabled (NETSURF_USE_RSVG := AUTO)
M.CONFIG: SVG (libsvgtiny) disabled (NETSURF_USE_NSSVG := NO)
M.CONFIG: Sprite (librosprite) auto-disabled (NETSURF_USE_ROSPRITE := AUTO)
M.CONFIG: BMP (libnsbmp) enabled (NETSURF_USE_BMP := YES)
M.CONFIG: GIF (libnsgif) enabled (NETSURF_USE_GIF := YES)
M.CONFIG: PNG (libpng) enabled (NETSURF_USE_PNG := YES)
M.CONFIG: WebP (libwebp) disabled (NETSURF_USE_WEBP := NO)
TESTMENT: unchanged
LINK: nsgtk
scan-build: 33 bugs found.
scan-build: Run 'scan-view /tmp/scan-build-2011-05-17-1' to examine bug reports.
12 years, 4 months
Building LibCSS on RISC OS
by Steve Fryatt
I had an email from Chris Martin last month, suggesting that it was no
longer possible to build LibCSS on RISC OS due to the changes made for code
autogeneration. He wrote:
"So I had a go at updating my sources and rebuilding on my Iyonix.
Straight-forward except for one thing: the autogeneration of property
parsing code in libcss. The makefile relies on command substitution, so I
wrote a little RiscLua 4 script to do the equivalent job on RISC OS. I've
attached the script and my modified makefile so you can see exactly what I
mean.
"But now I'm not sure what to do with this modification. It lets
NetSurf be built on RISC OS as before (with the additional RiscLua 4
requirement). But is it a happy accident that NetSurf can still be
built on RISC OS at all? I'd guess that all other developers are now
using Linux exclusively and so don't have the problem of a shell
lacking command substitution capabilities."
This (LibCSS and the build system) is outside the areas that I'm familiar
with, and I haven't hit the problem as I don't build natively on RISC OS
anyway, so I don't know enough to answer Chris's second paragraph. I'm
happy to check the patch below in for him if those familiar with the
affected code are OK with it; failing that, if there's another way to fix
the issue then I'm sure Chris would be happy to go with that.
Comments from those who know LibCSS better than me are welcome. :-)
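For reference, the part of the makefile rule that relies on the shell, `$(GREP) "^$1:" $(DIR)properties.gen`, only selects the line for one property. A rough C sketch of that selection step follows. The helper name `find_property_line` is hypothetical and not part of Chris's patch or of LibCSS, and it assumes every line fits in the caller's buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Copy the first line of `in` that starts with "<name>:" into `out`,
 * without the trailing newline. Returns false if no line matches.
 * Simplification: lines are assumed to be shorter than `out_size`.
 */
bool find_property_line(FILE *in, const char *name, char *out,
		size_t out_size)
{
	size_t name_len;

	assert(in != NULL && name != NULL && out != NULL && out_size > 0);
	name_len = strlen(name);

	while (fgets(out, (int)out_size, in) != NULL) {
		/* equivalent of the "^name:" anchor in the grep pattern */
		if (strncmp(out, name, name_len) == 0 &&
				out[name_len] == ':') {
			out[strcspn(out, "\r\n")] = '\0';
			return true;
		}
	}
	out[0] = '\0';
	return false;
}
```

Something along these lines inside gen_parser itself would be one way to avoid needing command substitution at all, but whether that is acceptable is for the LibCSS maintainers to say.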
Index: src/parse/properties/Makefile
===================================================================
--- src/parse/properties/Makefile (revision 12411)
+++ src/parse/properties/Makefile (working copy)
@@ -35,8 +35,13 @@
define gen_prop_parser
$(DIR)autogenerated_$1.c: $(DIR)properties.gen $(BUILDDIR)/gen_parser
- $$(VQ)$$(ECHO) $$(ECHOFLAGS) "GENERATE: $$@"
+ $$(VQ)$$(ECHO) $$(ECHOFLAGS) "GENERATE: $$@ $1"
+ifneq ($(HOST),riscos)
$$(Q)$$(BUILDDIR)/gen_parser -o $$@ '$(shell $(GREP) "^$1:" $(DIR)properties.gen)'
+else
+# I have written a RiscLua 4 script to do this job -- Christopher Martin
+ $$(Q)lua -- src.parse.properties.generate $$@ $1
+endif
AUTOGEN_SOURCES := $$(AUTOGEN_SOURCES) autogenerated_$1.c
Index: src/parse/properties/generate
===================================================================
--- src/parse/properties/generate (revision 0)
+++ src/parse/properties/generate (revision 0)
@@ -0,0 +1,40 @@
+#!risclua4
+--[[
+This RiscLua 4 script performs the equivalent job of the following command
+run from the Makefile in this same folder:
+
+ $$(BUILDDIR)/gen_parser -o $$@ '$(shell $(GREP) "^$1:" $(DIR)properties.gen)'
+
+Invoke it through the Makefile as follows:
+
+ lua -- src.parse.properties.generate $$@ $1
+
+Note that RiscLua doesn't understand Unix paths; we have to use RISC OS paths
+instead. This is a clunky solution but it runs well enough and emulates the
+original shell command. Remember that the Makefile is run when the root
+<libcss> folder is the CSD.
+--]]
+do
+ local !,r in swi
+ local xtndcl
+ do
+ local tgt = ("^%s:"):format(arg[2])
+ io.input("src.parse.properties.properties/gen")
+ for line in io.lines() do
+ if line:match(tgt) then
+ xtndcl = line
+ break
+ end
+ end
+ io.input():close()
+ end
+ if xtndcl then
+ xtndcl = ("-o %s '%s'"):format(arg[1], xtndcl)
+ r[0] = 1 + #xtndcl
+ !"DDEUtils_SetCLSize"
+ r[0] = xtndcl
+ !"DDEUtils_SetCL"
+ r[0] = "run build-riscos-riscos-release-lib-static.gen_parser"
+ !"OS_CLI"
+ end
+end
--
Steve Fryatt - Leeds, England
http://www.stevefryatt.org.uk/
12 years, 4 months
[PATCH][RFC] Add support for doi: URI scheme
by François Revol
I was bored the other day and added support for the doi: URI scheme...
Quite simple as it only needs to redirect to an http website.
cf. http://tools.ietf.org/html/draft-paskin-doi-uri
Oddly they don't seem to push much for this scheme though...
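The mapping itself is just a prefix swap. As a standalone sketch (the function name `doi_to_http_url` is made up for illustration; the patch below does the same thing inside the fetcher's setup callback with sprintf()), assuming a case-sensitive "doi:" prefix:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Build the HTTP URL that a "doi:" URL redirects to. Returns a
 * malloc()ed string the caller must free(), or NULL if the input
 * does not start with "doi:" or memory runs out.
 */
char *doi_to_http_url(const char *url)
{
	static const char scheme[] = "doi:";
	static const char base[] = "http://dx.doi.org/";
	const size_t scheme_len = sizeof(scheme) - 1;
	const size_t base_len = sizeof(base) - 1;
	size_t rest_len;
	char *out;

	assert(url != NULL);
	if (strncmp(url, scheme, scheme_len) != 0)
		return NULL;

	rest_len = strlen(url + scheme_len);
	out = malloc(base_len + rest_len + 1);
	if (out == NULL)
		return NULL;

	memcpy(out, base, base_len);
	/* rest_len + 1 also copies the terminating NUL */
	memcpy(out + base_len, url + scheme_len, rest_len + 1);
	return out;
}
```

For example, "doi:10.1000/182" becomes "http://dx.doi.org/10.1000/182".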
Comments?
François.
Index: netsurf/Makefile.sources
===================================================================
--- netsurf/Makefile.sources (revision 12411)
+++ netsurf/Makefile.sources (working copy)
@@ -7,7 +7,7 @@
S_CONTENT := content.c content_factory.c dirlist.c fetch.c hlcache.c \
llcache.c urldb.c
-S_FETCHERS := curl.c data.c file.c about.c resource.c
+S_FETCHERS := curl.c data.c doi.c file.c about.c resource.c
S_CSS := css.c dump.c internal.c select.c utils.c
Index: netsurf/content/fetchers/doi.h
===================================================================
--- netsurf/content/fetchers/doi.h (revision 0)
+++ netsurf/content/fetchers/doi.h (revision 0)
@@ -0,0 +1,38 @@
+/*
+ * Copyright 2011 François Revol <mmu_man(a)users.sourceforge.net>
+ *
+ * This file is part of NetSurf.
+ *
+ * NetSurf is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; version 2 of the License.
+ *
+ * NetSurf is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+/** \file
+ * doi: URL method handler.
+ *
+ * The doi fetcher is intended to provide a redirection of doi URLs
+ * to the canonical doi website accessible via HTTP.
+ * cf. http://tools.ietf.org/html/draft-paskin-doi-uri
+ *
+ */
+
+#ifndef NETSURF_CONTENT_FETCHERS_FETCH_DOI_H
+#define NETSURF_CONTENT_FETCHERS_FETCH_DOI_H
+
+/**
+ * Register the doi scheme.
+ *
+ * Should only be called from the fetch initialise code.
+ */
+void fetch_doi_register(void);
+
+#endif
Index: netsurf/content/fetchers/doi.c
===================================================================
--- netsurf/content/fetchers/doi.c (revision 0)
+++ netsurf/content/fetchers/doi.c (revision 0)
@@ -0,0 +1,207 @@
+/*
+ * Copyright 2011 François Revol <mmu_man(a)users.sourceforge.net>
+ *
+ * This file is part of NetSurf.
+ *
+ * NetSurf is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; version 2 of the License.
+ *
+ * NetSurf is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+/* doi: URL handling. Based on the resource fetcher by Vincent Sanders */
+
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <assert.h>
+#include <errno.h>
+#include <stdbool.h>
+#include <inttypes.h>
+#include <string.h>
+#include <strings.h>
+#include <time.h>
+#include <stdio.h>
+#include <dirent.h>
+#include <limits.h>
+#include <stdarg.h>
+
+#include "utils/config.h"
+#include "content/dirlist.h"
+#include "content/fetch.h"
+#include "content/fetchers/doi.h"
+#include "content/urldb.h"
+#include "desktop/netsurf.h"
+#include "desktop/options.h"
+#include "utils/log.h"
+#include "utils/messages.h"
+#include "utils/url.h"
+#include "utils/utils.h"
+#include "utils/ring.h"
+
+struct fetch_doi_context;
+
+/** Context for a doi fetch */
+struct fetch_doi_context {
+ struct fetch_doi_context *r_next, *r_prev;
+
+ struct fetch *fetchh; /**< Handle for this fetch */
+
+ bool aborted; /**< Flag indicating fetch has been aborted */
+ bool locked; /**< Flag indicating entry is already entered */
+
+ char redirect_url[1]; /**< The url the fetch redirects to */
+};
+
+static struct fetch_doi_context *ring = NULL;
+
+static const char *fetch_doi_redirect_base = "http://dx.doi.org/";
+
+/** issue fetch callbacks with locking */
+static inline bool fetch_doi_send_callback(fetch_msg msg,
+ struct fetch_doi_context *ctx, const void *data,
+ unsigned long size, fetch_error_code errorcode)
+{
+ ctx->locked = true;
+ fetch_send_callback(msg, ctx->fetchh, data, size, errorcode);
+ ctx->locked = false;
+
+ return ctx->aborted;
+}
+
+static bool fetch_doi_redirect_handler(struct fetch_doi_context *ctx)
+{
+ /* content is going to return redirect */
+ fetch_set_http_code(ctx->fetchh, 302);
+
+ fetch_doi_send_callback(FETCH_REDIRECT, ctx, ctx->redirect_url, 0,
+ FETCH_ERROR_NO_ERROR);
+
+ return true;
+}
+
+
+/** callback to initialise the doi fetcher. */
+static bool fetch_doi_initialise(const char *scheme)
+{
+ return true;
+}
+
+/** callback to finalise the doi fetcher. */
+static void fetch_doi_finalise(const char *scheme)
+{
+}
+
+/** callback to set up a resource fetch context. */
+static void *
+fetch_doi_setup(struct fetch *fetchh,
+ const char *url,
+ bool only_2xx,
+ const char *post_urlenc,
+ const struct fetch_multipart_data *post_multipart,
+ const char **headers)
+{
+ struct fetch_doi_context *ctx;
+
+ ctx = calloc(1, sizeof(*ctx) + strlen(fetch_doi_redirect_base) +
+ strlen(url) + 1);
+ if (ctx == NULL)
+ return NULL;
+
+ sprintf(ctx->redirect_url, "%s%s", fetch_doi_redirect_base,
+ url + SLEN("doi:"));
+
+ ctx->fetchh = fetchh;
+
+ RING_INSERT(ring, ctx);
+
+ return ctx;
+}
+
+/** callback to free a resource fetch */
+static void fetch_doi_free(void *ctx)
+{
+ struct fetch_doi_context *c = ctx;
+ RING_REMOVE(ring, c);
+ free(ctx);
+}
+
+/** callback to start a resource fetch */
+static bool fetch_doi_start(void *ctx)
+{
+ return true;
+}
+
+/** callback to abort a resource fetch */
+static void fetch_doi_abort(void *ctx)
+{
+ struct fetch_doi_context *c = ctx;
+
+ /* To avoid the poll loop having to deal with the fetch context
+ * disappearing from under it, we simply flag the abort here.
+ * The poll loop itself will perform the appropriate cleanup.
+ */
+ c->aborted = true;
+}
+
+
+/** callback to poll for additional resource fetch contents */
+static void fetch_doi_poll(const char *scheme)
+{
+ struct fetch_doi_context *c, *next;
+
+ if (ring == NULL) return;
+
+ /* Iterate over ring, processing each pending fetch */
+ c = ring;
+ do {
+ /* Ignore fetches that have been flagged as locked.
+ * This allows safe re-entrant calls to this function.
+ * Re-entrancy can occur if, as a result of a callback,
+ * the interested party causes fetch_poll() to be called
+ * again.
+ */
+ if (c->locked == true) {
+ next = c->r_next;
+ continue;
+ }
+
+ /* Only process non-aborted fetches */
+ if (c->aborted == false) {
+ /* resource fetches can be processed in one go */
+ fetch_doi_redirect_handler(c);
+ }
+
+ /* Compute next fetch item at the last possible moment
+ * as processing this item may have added to the ring
+ */
+ next = c->r_next;
+
+ fetch_remove_from_queues(c->fetchh);
+ fetch_free(c->fetchh);
+
+ /* Advance to next ring entry, exiting if we've reached
+ * the start of the ring or the ring has become empty
+ */
+ } while ( (c = next) != ring && ring != NULL);
+}
+
+void fetch_doi_register(void)
+{
+ fetch_add_fetcher("doi",
+ fetch_doi_initialise,
+ fetch_doi_setup,
+ fetch_doi_start,
+ fetch_doi_abort,
+ fetch_doi_free,
+ fetch_doi_poll,
+ fetch_doi_finalise);
+}
Index: netsurf/content/fetch.c
===================================================================
--- netsurf/content/fetch.c (revision 12411)
+++ netsurf/content/fetch.c (working copy)
@@ -42,6 +42,7 @@
#include "content/fetchers/about.h"
#include "content/fetchers/curl.h"
#include "content/fetchers/data.h"
+#include "content/fetchers/doi.h"
#include "content/fetchers/file.h"
#include "content/urldb.h"
#include "desktop/netsurf.h"
@@ -112,6 +113,7 @@
{
fetch_curl_register();
fetch_data_register();
+ fetch_doi_register();
fetch_file_register();
fetch_resource_register();
fetch_about_register();
12 years, 4 months
Review: Merge branch mmu_man/netsurf-gopher-support-v2
by François Revol
Precis:
Gopher browsing is now mostly functional.
Because the full content must arrive before it is converted to HTML, large pages such as search results may take a few seconds to load, but otherwise it works reasonably well.
Supported item-types:
- 0 (plain text)
- 1 (gopher dir) converted to html for display
- 3 (error; when it is the first item it should behave like an HTTP 404; not much tested)
- 7 (search) converted to html for display
- 8 generates a link to a telnet: URI
- g (GIF) adds a link (or optionally inlines the image as <img>)
- h (html) selectors beginning with URL: are automatically converted to direct links to the specified URL, which is then not limited to http: either.
- i displays a text line
- I (image) links to the item but tries to display as text.
- 9 (binary) links to the item but tries to display as text.
- d (PDF, unofficial)
- p (seems to be PNG, unofficial)
- all other unknown types are just linked as is, and will probably work when the mime type is correctly sniffed.
Unsupported item-types:
- 2 (CSO search) I failed to find a sample link, it requires a complex and separate protocol.
- T (TN3270) Failed to find a sample link, it's an antique competitor to telnet.
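For readers not familiar with the format: per RFC 1436, each line of a gopher directory is the item-type character followed by the display string, selector, host and port, separated by tabs and ended by CR LF. The sketch below shows one way to split such a line. It is not the code in render/gopher.c; `struct gopher_item` and `gopher_parse_line` are names made up for this example, and missing trailing fields are simply returned as empty.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* One parsed menu line; pointers reference the caller's buffer. */
struct gopher_item {
	char type;
	const char *name;     size_t name_len;
	const char *selector; size_t selector_len;
	const char *host;     size_t host_len;
	const char *port;     size_t port_len;
};

/* Return one tab-terminated field; the last one runs to end of line. */
static const char *next_field(const char *p, const char *end,
		const char **field, size_t *len)
{
	const char *tab = memchr(p, '\t', (size_t)(end - p));

	*field = p;
	*len = (size_t)((tab != NULL ? tab : end) - p);
	return tab != NULL ? tab + 1 : end;
}

/* Split one menu line; returns false for an empty line. */
bool gopher_parse_line(const char *line, size_t len,
		struct gopher_item *item)
{
	const char *p, *end;

	assert(line != NULL && item != NULL);

	/* strip the trailing CR LF (or a bare LF) */
	while (len > 0 && (line[len - 1] == '\n' || line[len - 1] == '\r'))
		len--;
	if (len == 0)
		return false;

	end = line + len;
	item->type = line[0];
	p = next_field(line + 1, end, &item->name, &item->name_len);
	p = next_field(p, end, &item->selector, &item->selector_len);
	p = next_field(p, end, &item->host, &item->host_len);
	(void)next_field(p, end, &item->port, &item->port_len);
	return true;
}
```

The fields are returned as pointer/length pairs rather than NUL-terminated copies, so nothing is allocated per line.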
Missing stuff, which will be considered in a second iteration:
- document the feature.
- better error handling on non-existing files (will require a heuristic since gopher doesn't have out-of-band signaling).
- make the parser more robust.
- fix url escaping in generated html code, gopher selectors can include any character except tab, though no tested servers ever tried using reserved html chars yet.
- add a search icon.
- links to ambiguous items (like type 'I' which can be JPEG or PNG or another image type, and binary types like '9') will attempt to load the target as a text file. Handling them correctly probably requires implementing LLCACHE_RETRIEVE_SNIFF_TYPE. Archive types could probably get a fake Content-disposition header forcing a download though.
- make gopher: URI scheme handling explicit to OS front-ends (done for BeOS, OSX, and AmigaOS (untested)).
- fix the search field sending an extra = in the URL (maybe add a "GOPHER" form method?), though all tested search engines just skip it.
- maybe Gopher+ support but it seems most clients just ignore it.
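On the URL escaping item, one possible approach is to percent-encode every selector byte that is not an RFC 3986 unreserved character or '/', before it goes into the generated href. The following is only a sketch of that idea, not what the branch currently does, and `gopher_escape_selector` is a hypothetical name.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Percent-encode a selector so it can sit inside a quoted href
 * attribute. Returns a malloc()ed string the caller must free(),
 * or NULL on allocation failure.
 */
char *gopher_escape_selector(const char *sel, size_t len)
{
	static const char hex[] = "0123456789ABCDEF";
	static const char safe[] =
		"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
		"abcdefghijklmnopqrstuvwxyz"
		"0123456789-._~/";
	char *out, *o;
	size_t i;

	assert(sel != NULL);

	/* worst case: every byte becomes "%XX" */
	out = malloc(len * 3 + 1);
	if (out == NULL)
		return NULL;

	for (i = 0, o = out; i < len; i++) {
		unsigned char c = (unsigned char)sel[i];

		/* strchr() would match the terminator for c == 0 */
		if (c != '\0' && strchr(safe, c) != NULL) {
			*o++ = (char)c;
		} else {
			*o++ = '%';
			*o++ = hex[c >> 4];
			*o++ = hex[c & 0x0f];
		}
	}
	*o = '\0';
	return out;
}
```

Since '<', '>', '&' and '"' all get encoded, the result needs no further HTML escaping when written into the attribute.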
Added files
Index: render/gopher.c
===================================================================
--- /dev/null 2011-05-08 22:46:04.000000000 +0200
+++ render/gopher.c 2011-05-08 22:41:11.000000000 +0200
@@ -0,0 +1,753 @@
+/*
+ * Copyright 2006 James Bursa <bursa(a)users.sourceforge.net>
+ * Copyright 2006 Adrian Lees <adrianl(a)users.sourceforge.net>
+ * Copyright 2011 François Revol <mmu_man(a)users.sourceforge.net>
+ *
+ * This file is part of NetSurf, http://www.netsurf-browser.org/
+ *
+ * NetSurf is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; version 2 of the License.
+ *
+ * NetSurf is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+/** \file
+ * Content for text/x-gopher-directory (implementation).
+ */
+
+#include <errno.h>
+#include <stddef.h>
+#include <string.h>
+#include <strings.h>
+#include <math.h>
+
+#include "content/content_protected.h"
+#include "desktop/gui.h"
+#include "desktop/options.h"
+#include "render/gopher.h"
+#include "utils/http.h"
+#include "utils/log.h"
+#include "utils/messages.h"
+#include "utils/utils.h"
+
+/* HACK HACK */
+extern const content_handler html_content_handler;
+
+nserror gopher_create(const content_handler *handler,
+ lwc_string *imime_type, const http_parameter *params,
+ llcache_handle *llcache, const char *fallback_charset,
+ bool quirks, struct content **c);
+bool gopher_convert(struct content *c);
+content_type gopher_content_type(lwc_string *mime_type);
+
+static char *gen_nice_title(const char *path);
+static bool gopher_generate_top(char *buffer, int buffer_length);
+static bool gopher_generate_title(const char *title, char *buffer, int buffer_length);
+static bool gopher_generate_row(const char **data, size_t *size,
+ char *buffer, int buffer_length);
+static bool gopher_generate_bottom(char *buffer, int buffer_length);
+
+static const char gopher_faked_type[] = "text/x-gopher-directory";
+static lwc_string *gopher_faked_mime_type;
+
+static struct {
+ char type;
+ const char *mime;
+} gopher_type_map[] = {
+ /* these come from http://tools.ietf.org/html/rfc1436 */
+ { '0', "text/plain" },
+ { '1', "text/x-gopher-directory;charset=UTF-8" }, /* gopher directory */
+ /* 2 CSO search */
+ /* 3 error message */
+ /* 4 binhex encoded text */
+ /* 5 binary archive file */
+ /* 6 uuencoded text */
+ { '7', "text/x-gopher-directory;charset=UTF-8" }, /* search query */
+ /* 8 telnet: */
+ /* 9 binary */
+ { 'g', "image/gif" },
+ { 'h', "text/html" },
+ /* i information text */
+ /* I image (depends, usually jpeg) */
+ /* s audio (wav?) */
+ /* T tn3270 session */
+
+ /* those are not standardized */
+ { 'd', "application/pdf" }, /* display?? seems to be only for PDF files so far */
+ { 'p', "image/png"}, /* at least on gopher://namcub.accelera-labs.com/1/pics */
+ { 0, NULL }
+};
+
+static const content_handler gopher_content_handler = {
+ gopher_create,
+ NULL,
+ gopher_convert,
+ NULL,
+ NULL,
+ NULL,
+ NULL,
+ NULL,
+ NULL,
+ NULL,
+ NULL,
+ NULL,
+ NULL,
+ gopher_content_type,
+ true
+};
+
+
+nserror gopher_init(void)
+{
+ lwc_error lerror;
+ nserror error;
+
+ lerror = lwc_intern_string(gopher_faked_type,
+ strlen(gopher_faked_type),
+ &gopher_faked_mime_type);
+ if (lerror != lwc_error_ok) {
+ error = NSERROR_NOMEM;
+ goto error;
+ }
+
+ error = content_factory_register_handler(gopher_faked_mime_type,
+ &gopher_content_handler);
+ if (error != NSERROR_OK)
+ goto error;
+
+ return NSERROR_OK;
+
+error:
+ gopher_fini();
+
+ return error;
+}
+
+
+void gopher_fini(void)
+{
+ if (gopher_faked_mime_type)
+ lwc_string_unref(gopher_faked_mime_type);
+}
+
+
+/**
+ * Create a CONTENT_GOPHER.
+ */
+
+nserror gopher_create(const content_handler *handler,
+ lwc_string *imime_type, const http_parameter *params,
+ llcache_handle *llcache, const char *fallback_charset,
+ bool quirks, struct content **c)
+{
+ nserror error;
+
+ error = html_content_handler.create(handler, imime_type, params,
+ llcache, fallback_charset, quirks, c);
+ return error;
+}
+
+
+/**
+ * Convert a CONTENT_GOPHER for display.
+ */
+
+bool gopher_convert(struct content *c)
+{
+ char *title;
+ char buffer[1024];
+ const char *data;
+ unsigned long size;
+ const char *p;
+ unsigned long left;
+ bool ok;
+
+ data = content__get_source_data(c, &size);
+
+ p = data;
+ left = size;
+ if (data == NULL || size == 0)
+ return false;
+
+ if (gopher_generate_top(buffer, sizeof(buffer))) {
+ ok = html_content_handler.process_data(c, buffer, strlen(buffer));
+ if (!ok)
+ return false;
+ }
+ title = gen_nice_title(content__get_url(c));
+ if (gopher_generate_title(title, buffer, sizeof(buffer))) {
+ ok = html_content_handler.process_data(c, buffer, strlen(buffer));
+ if (!ok)
+ return false;
+ }
+ free(title);
+
+ while (gopher_generate_row(&p, &left, buffer, sizeof(buffer))) {
+ ok = html_content_handler.process_data(c, buffer, strlen(buffer));
+ if (!ok)
+ return false;
+ gui_multitask();
+ }
+ if (gopher_generate_bottom(buffer, sizeof(buffer))) {
+ ok = html_content_handler.process_data(c, buffer, strlen(buffer));
+ if (!ok)
+ return false;
+ }
+
+ /* finally make it HTML so we don't have to bother for other calls */
+ c->handler = &html_content_handler;
+
+ /* forward to the HTML handler */
+ return html_content_handler.convert(c);
+}
+
+/**
+ * Compute the type of a content
+ *
+ * \param mime_type MIME type of the content (unused)
+ * \return CONTENT_HTML
+ */
+content_type gopher_content_type(lwc_string *mime_type)
+{
+ /* we will end up with HTML content anyway */
+ return CONTENT_HTML;
+}
+
+
+static char *html_escape_string(char *str)
+{
+ char *nice_str, *cnv, *tmp;
+
+ if (str == NULL) {
+ return NULL;
+ }
+
+ /* Convert str for display */
+ nice_str = malloc(strlen(str) * SLEN("&amp;") + 1);
+ if (nice_str == NULL) {
+ return NULL;
+ }
+
+ /* Escape special HTML characters */
+ for (cnv = nice_str, tmp = str; *tmp != '\0'; tmp++) {
+ if (*tmp == '<') {
+ *cnv++ = '&';
+ *cnv++ = 'l';
+ *cnv++ = 't';
+ *cnv++ = ';';
+ } else if (*tmp == '>') {
+ *cnv++ = '&';
+ *cnv++ = 'g';
+ *cnv++ = 't';
+ *cnv++ = ';';
+ } else if (*tmp == '&') {
+ *cnv++ = '&';
+ *cnv++ = 'a';
+ *cnv++ = 'm';
+ *cnv++ = 'p';
+ *cnv++ = ';';
+ } else {
+ *cnv++ = *tmp;
+ }
+ }
+ *cnv = '\0';
+
+ return nice_str;
+}
+
+
+static char *gen_nice_title(const char *path)
+{
+ const char *tmp;
+ char *nice_path, *cnv;
+ char *title;
+ int title_length;
+
+ /* Convert path for display */
+ nice_path = malloc(strlen(path) * SLEN("&amp;") + 1);
+ if (nice_path == NULL) {
+ return NULL;
+ }
+
+ /* Escape special HTML characters */
+ for (cnv = nice_path, tmp = path; *tmp != '\0'; tmp++) {
+ if (*tmp == '<') {
+ *cnv++ = '&';
+ *cnv++ = 'l';
+ *cnv++ = 't';
+ *cnv++ = ';';
+ } else if (*tmp == '>') {
+ *cnv++ = '&';
+ *cnv++ = 'g';
+ *cnv++ = 't';
+ *cnv++ = ';';
+ } else if (*tmp == '&') {
+ *cnv++ = '&';
+ *cnv++ = 'a';
+ *cnv++ = 'm';
+ *cnv++ = 'p';
+ *cnv++ = ';';
+ } else {
+ *cnv++ = *tmp;
+ }
+ }
+ *cnv = '\0';
+
+ /* Construct a localised title string */
+ title_length = (cnv - nice_path) + strlen(messages_get("FileIndex"));
+ title = malloc(title_length + 1);
+
+ if (title == NULL) {
+ free(nice_path);
+ return NULL;
+ }
+
+ /* Set title to localised "Index of <nice_path>" */
+ snprintf(title, title_length, messages_get("FileIndex"), nice_path);
+
+ free(nice_path);
+
+ return title;
+}
+
+
+/**
+ * Convert the gopher item type to mime type
+ *
+ * \return MIME type string
+ *
+ */
+
+const char *gopher_type_to_mime(char type)
+{
+ int i;
+
+ for (i = 0; gopher_type_map[i].type; i++)
+ if (gopher_type_map[i].type == type)
+ return gopher_type_map[i].mime;
+ return NULL;
+}
+
+
+/**
+ * Tells if the gopher item type needs to be converted to html
+ *
+ * \return true iff the item must be converted
+ *
+ */
+
+bool gopher_need_generate(char type)
+{
+ switch (type) {
+ case '1':
+ case '7':
+ return true;
+ default:
+ return false;
+ }
+}
+
+
+/**
+ * Generates the top part of an HTML directory listing page
+ *
+ * \return true iff buffer filled without error
+ *
+ * This is part of a series of functions. To generate a complete page,
+ * call the following functions in order:
+ *
+ * gopher_generate_top()
+ * gopher_generate_title()
+ * gopher_generate_row() -- call 'n' times for 'n' rows
+ * gopher_generate_bottom()
+ */
+
+static bool gopher_generate_top(char *buffer, int buffer_length)
+{
+ int error = snprintf(buffer, buffer_length,
+ "<html>\n"
+ "<head>\n"
+ /*"<!-- base href=\"%s\" -->\n"*//* XXX: needs the content url */
+ /* Don't do that:
+ * seems to trigger a reparsing of the gopher data itself as html...
+ * "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=UTF-8\" />\n"
+ */
+ /* TODO: move this to clean CSS in internal.css */
+ "<link rel=\"stylesheet\" title=\"Standard\" "
+ "type=\"text/css\" href=\"resource:internal.css\">\n");
+
+ if (error < 0 || error >= buffer_length)
+ /* Error or buffer too small */
+ return false;
+ else
+ /* OK */
+ return true;
+}
+
+
+/**
+ * Generates the part of an HTML directory listing page that contains the title
+ *
+ * \param title title to use
+ * \param buffer buffer to fill with generated HTML
+ * \param buffer_length maximum size of buffer
+ * \return true iff buffer filled without error
+ *
+ * This is part of a series of functions. To generate a complete page,
+ * call the following functions in order:
+ *
+ * gopher_generate_top()
+ * gopher_generate_title()
+ * gopher_generate_row() -- call 'n' times for 'n' rows
+ * gopher_generate_bottom()
+ */
+
+static bool gopher_generate_title(const char *title, char *buffer, int buffer_length)
+{
+ int error;
+
+ if (title == NULL)
+ title = "";
+
+ error = snprintf(buffer, buffer_length,
+ "<title>%s</title>\n"
+ "</head>\n"
+ "<body id=\"gopher\">\n"
+ "<h1>%s</h1>\n",
+ title, title);
+ if (error < 0 || error >= buffer_length)
+ /* Error or buffer too small */
+ return false;
+ else
+ /* OK */
+ return true;
+}
+
+/**
+ * Internal worker called by gopher_generate_row().
+ */
+
+static bool gopher_generate_row_internal(char type, char *fields[5],
+ char *buffer, int buffer_length)
+{
+ char *nice_text;
+ char *redirect_url = NULL;
+ int error;
+ bool alt_port = false;
+ char *username = NULL;
+
+ if (fields[3] && strcmp(fields[3], "70"))
+ alt_port = true;
+
+ /* escape html special characters */
+ nice_text = html_escape_string(fields[0]);
+
+ /* XXX: outputting \n generates better looking html code,
+ * but currently screws up indentation due to a bug.
+ */
+#define HTML_LF
+/*#define HTML_LF "\n"*/
+
+ switch (type) {
+ case '.':
+ /* end of the page */
+ *buffer = '\0';
+ break;
+ case '0': /* text/plain link */
+ error = snprintf(buffer, buffer_length,
+ "<a href=\"gopher://%s%s%s/%c%s\">"HTML_LF
+ "<span class=\"text\">%s</span></a>"HTML_LF
+ "<br/>"HTML_LF,
+ fields[2],
+ alt_port ? ":" : "",
+ alt_port ? fields[3] : "",
+ type, fields[1], nice_text);
+ break;
+ case '9': /* binary */
+ error = snprintf(buffer, buffer_length,
+ "<a href=\"gopher://%s%s%s/%c%s\">"HTML_LF
+ "<span class=\"binary\">%s</span></a>"HTML_LF
+ "<br/>"HTML_LF,
+ fields[2],
+ alt_port ? ":" : "",
+ alt_port ? fields[3] : "",
+ type, fields[1], nice_text);
+ break;
+ case '1':
+ /*
+ * directory link
+ */
+ error = snprintf(buffer, buffer_length,
+ "<a href=\"gopher://%s%s%s/%c%s\">"HTML_LF
+ "<span class=\"dir\">%s</span></a>"HTML_LF
+ "<br/>"HTML_LF,
+ fields[2],
+ alt_port ? ":" : "",
+ alt_port ? fields[3] : "",
+ type, fields[1], nice_text);
+ break;
+ case '3':
+ /* Error
+ */
+ error = snprintf(buffer, buffer_length,
+ "<span class=\"error\">%s</span><br/>"HTML_LF,
+ nice_text);
+ break;
+ case '7':
+ /* TODO: handle search better.
+ * For now we use an unnamed input field and accept sending ?=foo
+ * as it seems at least Veronica-2 ignores the = but it's unclean.
+ */
+ error = snprintf(buffer, buffer_length,
+ "<form method=\"get\" action=\"gopher://%s%s%s/%c%s\">"HTML_LF
+ "<span class=\"query\">"
+ "<label>%s "
+ "<input name=\"\" type=\"text\" align=\"right\" />"
+ "</label>"
+ "</span></form>"HTML_LF
+ "<br/>"HTML_LF,
+ fields[2],
+ alt_port ? ":" : "",
+ alt_port ? fields[3] : "",
+ type, fields[1], nice_text);
+ break;
+ case '8':
+ /* telnet: links
+ * cf. gopher://78.80.30.202/1/ps3
+ * -> gopher://78.80.30.202:23/8/ps3/new -> new(a)78.80.30.202
+ */
+ alt_port = false;
+ if (fields[3] && strcmp(fields[3], "23"))
+ alt_port = true;
+ username = strrchr(fields[1], '/');
+ if (username)
+ username++;
+ error = snprintf(buffer, buffer_length,
+ "<a href=\"telnet://%s%s%s%s%s\">"HTML_LF
+ "<span class=\"dir\">%s</span></a>"HTML_LF
+ "<br/>"HTML_LF,
+ username ? username : "",
+ username ? "@" : "",
+ fields[2],
+ alt_port ? ":" : "",
+ alt_port ? fields[3] : "",
+ nice_text);
+ break;
+ case 'g':
+ case 'I':
+ case 'p':
+ /* quite dangerous, cf. gopher://namcub.accela-labs.com/1/pics */
+ if (option_gopher_inline_images) {
+ error = snprintf(buffer, buffer_length,
+ "<a href=\"gopher://%s%s%s/%c%s\">"HTML_LF
+ "<span class=\"img\">%s "HTML_LF /* </span><br/> */
+ //"<span class=\"img\" >"HTML_LF
+ "<img src=\"gopher://%s%s%s/%c%s\" alt=\"%s\"/>"HTML_LF
+ "</span>"
+ "</a>"
+ "<br/>"HTML_LF,
+ fields[2],
+ alt_port ? ":" : "",
+ alt_port ? fields[3] : "",
+ type, fields[1],
+ nice_text,
+ fields[2],
+ alt_port ? ":" : "",
+ alt_port ? fields[3] : "",
+ type, fields[1],
+ nice_text);
+ break;
+ }
+ /* fallback to default, link them */
+ error = snprintf(buffer, buffer_length,
+ "<a href=\"gopher://%s%s%s/%c%s\">"HTML_LF
+ "<span class=\"dir\">%s</span></a>"HTML_LF
+ "<br/>"HTML_LF,
+ fields[2],
+ alt_port ? ":" : "",
+ alt_port ? fields[3] : "",
+ type, fields[1], nice_text);
+ break;
+ case 'h':
+ if (fields[1] && strncmp(fields[1], "URL:", 4) == 0)
+ redirect_url = fields[1] + 4;
+ /* cf. gopher://pineapple.vg/1 */
+ if (fields[1] && strncmp(fields[1], "/URL:", 5) == 0)
+ redirect_url = fields[1] + 5;
+ if (redirect_url) {
+ error = snprintf(buffer, buffer_length,
+ "<a href=\"%s\">"HTML_LF
+ "<span class=\"link\">%s</span></a>"HTML_LF
+ "<br/>"HTML_LF,
+ redirect_url,
+ nice_text);
+ } else {
+ /* cf. gopher://sdf.org/1/sdf/classes/ */
+ error = snprintf(buffer, buffer_length,
+ "<a href=\"gopher://%s%s%s/%c%s\">"HTML_LF
+ "<span class=\"dir\">%s</span></a>"HTML_LF
+ "<br/>"HTML_LF,
+ fields[2],
+ alt_port ? ":" : "",
+ alt_port ? fields[3] : "",
+ type, fields[1], nice_text);
+ }
+ break;
+ case 'i':
+ error = snprintf(buffer, buffer_length,
+ "<span class=\"info\">%s</span><br/>"HTML_LF,
+ nice_text);
+ break;
+ default:
+ LOG(("warning: unknown gopher item type 0x%02x '%c'\n", type, type));
+ error = snprintf(buffer, buffer_length,
+ "<a href=\"gopher://%s%s%s/%c%s\">"HTML_LF
+ "<span class=\"dir\">%s</span></a>"HTML_LF
+ "<br/>"HTML_LF,
+ fields[2],
+ alt_port ? ":" : "",
+ alt_port ? fields[3] : "",
+ type, fields[1], nice_text);
+ break;
+ }
+
+ free(nice_text);
+
+ if (error < 0 || error >= buffer_length)
+ /* Error or buffer too small */
+ return false;
+ else
+ /* OK */
+ return true;
+}
+
+
+/**
+ * Generates the part of an HTML directory listing page that displays a row
+ * of the gopher data
+ *
+ * \param data pointer to the data buffer pointer
+ * \param size pointer to the remaining data size
+ * \param buffer buffer to fill with generated HTML
+ * \param buffer_length maximum size of buffer
+ * \return true iff buffer filled without error
+ *
+ * This is part of a series of functions. To generate a complete page,
+ * call the following functions in order:
+ *
+ * gopher_generate_top()
+ * gopher_generate_title()
+ * gopher_generate_row() -- call 'n' times for 'n' rows
+ * gopher_generate_bottom()
+ */
+
+static bool gopher_generate_row(const char **data, size_t *size,
+ char *buffer, int buffer_length)
+{
+ bool ok = false;
+ char type = 0;
+ int field = 0;
+ /* name, selector, host, port, gopher+ flag */
+ char *fields[5] = { NULL, NULL, NULL, NULL, NULL };
+ const char *s = *data;
+ const char *p = *data;
+ int i;
+
+ for (; *size && *p; p++, (*size)--) {
+ if (!type) {
+ type = *p;
+ if (!type || type == '\n' || type == '\r') {
+ LOG(("warning: invalid gopher item type 0x%02x\n", type));
+ }
+ s++;
+ continue;
+ }
+ switch (*p) {
+ case '\n':
+ if (field > 0) {
+ LOG(("warning: unterminated gopher item '%c'\n", type));
+ }
+ //FALLTHROUGH
+ case '\r':
+ if (*p == '\r' && (*size < 2 || p[1] != '\n')) {
+ LOG(("warning: CR without LF in gopher item '%c'\n", type));
+ }
+ if (field < 3 && type != '.') {
+ LOG(("warning: unterminated gopher item '%c'\n", type));
+ }
+ fields[field] = malloc(p - s + 1);
+ memcpy(fields[field], s, p - s);
+ fields[field][p - s] = '\0';
+ if (type == '.' && field == 0 && p == s) {
+ ;/* XXX: signal end of page? For now we just ignore it. */
+ }
+ ok = gopher_generate_row_internal(type, fields, buffer, buffer_length);
+ for (i = 0; i < 5; i++) {
+ free(fields[i]);
+ fields[i] = NULL;
+ }
+ (*size)--;
+ p++;
+ if (*size && *p == '\n') {
+ p++;
+ (*size)--;
+ }
+ *data = p;
+ field = 0;
+ return ok;
+ case '\x09':
+ if (field >= 4) {
+ LOG(("warning: extra tab in gopher item '%c'\n", type));
+ break;
+ }
+ fields[field] = malloc(p - s + 1);
+ memcpy(fields[field], s, p - s);
+ fields[field][p - s] = '\0';
+ field++;
+ s = p + 1;
+ break;
+ default:
+ break;
+ }
+ }
+
+ return false;
+}
+
+
+/**
+ * Generates the bottom part of an HTML directory listing page
+ *
+ * \return true iff buffer filled without error
+ *
+ * This is part of a series of functions. To generate a complete page,
+ * call the following functions in order:
+ *
+ * gopher_generate_top()
+ * gopher_generate_title()
+ * gopher_generate_row() -- call 'n' times for 'n' rows
+ * gopher_generate_bottom()
+ */
+
+static bool gopher_generate_bottom(char *buffer, int buffer_length)
+{
+ int error = snprintf(buffer, buffer_length,
+ "</div>\n"
+ "</body>\n"
+ "</html>\n");
+ if (error < 0 || error >= buffer_length)
+ /* Error or buffer too small */
+ return false;
+ else
+ /* OK */
+ return true;
+}
+
+
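For readers following the row parser above: the core of gopher_generate_row() is splitting one menu line into an item type plus up to five tab-separated fields (name, selector, host, port, gopher+ flag). Below is a minimal standalone sketch of that step, not the patch's code. The helper name split_gopher_line() and the in-place splitting approach are assumptions made for illustration; the patch itself copies each field with malloc().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define GOPHER_MAX_FIELDS 5

/* Hypothetical helper, not part of the patch: splits one menu line in
 * place. `line` must be a writable, NUL-terminated string holding a
 * single item; everything from the first CR or LF onwards is dropped.
 * Stores the item type in *type and returns the number of fields found
 * (0 if the line is empty). Unused slots in fields[] are set to NULL. */
static int split_gopher_line(char *line, char *type,
		char *fields[GOPHER_MAX_FIELDS])
{
	int n = 0;
	char *p;

	assert(line != NULL && type != NULL && fields != NULL);

	for (int i = 0; i < GOPHER_MAX_FIELDS; i++)
		fields[i] = NULL;

	/* strip the line terminator */
	line[strcspn(line, "\r\n")] = '\0';

	*type = line[0];
	if (*type == '\0')
		return 0;

	/* the first field starts right after the type character */
	p = line + 1;
	while (n < GOPHER_MAX_FIELDS) {
		char *tab = strchr(p, '\t');

		fields[n++] = p;
		if (tab == NULL)
			break;
		*tab = '\0';
		p = tab + 1;
	}

	return n;
}
```

Splitting in place avoids the per-field allocations, at the cost of needing a writable copy of the line. Anything after a fifth tab is silently dropped here, whereas the patch logs a warning.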
Index: render/gopher.h
===================================================================
--- /dev/null 2011-05-08 22:46:04.000000000 +0200
+++ render/gopher.h 2011-05-08 22:41:11.000000000 +0200
@@ -0,0 +1,39 @@
+/*
+ * Copyright 2006 James Bursa <bursa(a)users.sourceforge.net>
+ * Copyright 2006 Adrian Lees <adrianl(a)users.sourceforge.net>
+ * Copyright 2011 François Revol <mmu_man(a)users.sourceforge.net>
+ *
+ * This file is part of NetSurf, http://www.netsurf-browser.org/
+ *
+ * NetSurf is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; version 2 of the License.
+ *
+ * NetSurf is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+/** \file
+ * Content for text/x-gopher-directory (interface).
+ */
+
+#ifndef _NETSURF_RENDER_GOPHER_H_
+#define _NETSURF_RENDER_GOPHER_H_
+
+#include <stddef.h>
+
+struct content;
+struct http_parameter;
+
+nserror gopher_init(void);
+void gopher_fini(void);
+
+const char *gopher_type_to_mime(char type);
+bool gopher_need_generate(char type);
+
+#endif
Changed files
!NetSurf/Resources/internal.css,f79 | 30 ++++++++++++++
Docs/BUILDING-Cocoa | 10 ++++
Makefile.sources | 3 -
amiga/dist/Install | 1
beos/beos_res.rdef | 2
cocoa/res/NetSurf-Info.plist | 1
content/fetchers/curl.c | 74 ++++++++++++++++++++++++++++++++++--
desktop/netsurf.c | 6 ++
desktop/options.c | 3 +
desktop/options.h | 1
render/html.c | 2
utils/url.c | 32 +++++++++++++++
utils/url.h | 1
13 files changed, 161 insertions(+), 5 deletions(-)
Index: render/html.c
===================================================================
--- render/html.c (revision 12321)
+++ render/html.c (working copy)
@@ -101,7 +101,7 @@
unsigned int depth);
#endif
-static const content_handler html_content_handler = {
+const content_handler html_content_handler = {
html_create,
html_process_data,
html_convert,
Index: cocoa/res/NetSurf-Info.plist
===================================================================
--- cocoa/res/NetSurf-Info.plist (revision 12321)
+++ cocoa/res/NetSurf-Info.plist (working copy)
@@ -35,6 +35,7 @@
<string>org.netsurf-browser.NetSurf.URI</string>
<key>CFBundleURLSchemes</key>
<array>
+ <string>gopher</string>
<string>http</string>
<string>https</string>
</array>
Index: !NetSurf/Resources/internal.css,f79
===================================================================
--- !NetSurf/Resources/internal.css,f79 (revision 12321)
+++ !NetSurf/Resources/internal.css,f79 (working copy)
@@ -174,3 +174,33 @@
body#configlist .null-content {
font-style: italic; }
+
+/*
+ * gopher listing style
+ */
+
+body#gopher {
+ /* margin: 10px;*/
+ font-size: 100%;
+ padding-bottom: 2em; }
+
+body#gopher h1 {
+ padding: 5mm;
+ margin: 0;
+ border-bottom: 2px solid #777; }
+
+/* XXX: white-space: pre-wrap would be better but is currently buggy */
+body#gopher span {
+ margin-left: 1em;
+ padding-left: 2em;
+ font-family: Courier, monospace;
+ white-space: pre; }
+
+body#gopher span.dir {
+ background-image: url('resource:Icons/directory.png');
+ background-repeat: no-repeat;
+ background-position: bottom left; }
+
+body#gopher span.text {
+ background-image: url('resource:Icons/content.png');
+ background-repeat: no-repeat; background-position: bottom left; }
Index: Docs/BUILDING-Cocoa
===================================================================
--- Docs/BUILDING-Cocoa (revision 12321)
+++ Docs/BUILDING-Cocoa (working copy)
@@ -20,6 +20,16 @@
In both cases the actual build process is controlled by the Makefile.
+ Gopher support requires a recent libcurl from MacPorts since the one Apple
+ ships lacks the gopher handler. Building with MacPorts breaks the self-
+ contained nature of the .app bundle though. Install libcurl with MacPorts:
+
+ $ sudo port install curl
+
+ Then build with:
+
+ $ make TARGET=cocoa WITH_MACPORTS=1
+
Obtaining NetSurf's build dependencies
========================================
Index: beos/beos_res.rdef
===================================================================
--- beos/beos_res.rdef (revision 12321)
+++ beos/beos_res.rdef (working copy)
@@ -98,9 +98,11 @@
"types" = "image/jpeg",
"types" = "application/x-vnd.Be-bookmark",
"types" = "text",
+ "types" = "text/x-gopher-directory",
"types" = "application/x-vnd.Be-doc_bookmark",
"types" = "application/x-vnd.Be.URL.file",
"types" = "application/x-vnd.Be.URL.ftp",
+ "types" = "application/x-vnd.Be.URL.gopher",
"types" = "application/x-vnd.Be.URL.http",
"types" = "application/x-vnd.Be.URL.https"
};
Index: Makefile.sources
===================================================================
--- Makefile.sources (revision 12321)
+++ Makefile.sources (working copy)
@@ -13,7 +13,8 @@
S_RENDER := box.c box_construct.c box_normalise.c \
font.c form.c html.c html_interaction.c html_redraw.c \
- hubbub_binding.c imagemap.c layout.c list.c table.c textplain.c
+ hubbub_binding.c imagemap.c layout.c list.c table.c textplain.c \
+ gopher.c
S_UTILS := base64.c filename.c hashtable.c http.c locale.c messages.c \
talloc.c url.c utf8.c utils.c useragent.c filepath.c log.c
Index: utils/url.c
===================================================================
--- utils/url.c (revision 12321)
+++ utils/url.c (working copy)
@@ -856,6 +856,38 @@
}
/**
+ * Extract the gopher document type from a URL
+ *
+ * \param url an absolute URL
+ * \param result pointer to buffer to hold result
+ * \return URL_FUNC_OK on success
+ */
+
+url_func_result url_gopher_type(const char *url, char *result)
+{
+ url_func_result status;
+ struct url_components components;
+
+ assert(url);
+
+ status = url_get_components(url, &components);
+ if (status == URL_FUNC_OK) {
+ if (!components.path) {
+ status = URL_FUNC_FAILED;
+ } else {
+ if (strlen(components.path) < 2)
+ *result = '1';
+ else if (components.path[0] == '/')
+ *result = components.path[1];
+ else
+ status = URL_FUNC_FAILED;
+ }
+ }
+ url_destroy_components(&components);
+ return status;
+}
+
+/**
* Attempt to find a nice filename for a URL.
*
* \param url an absolute URL
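The path rule in url_gopher_type() above can be tried out in isolation. This is a minimal sketch that assumes the path component has already been extracted; the name gopher_type_from_path() is hypothetical and the function does not use NetSurf's url_get_components().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical standalone variant of the path rule in url_gopher_type():
 * a path shorter than two characters (e.g. "" or "/") means the root
 * menu, which is a directory (type '1'); otherwise the type is the
 * character that follows the leading '/'. Returns false when the path
 * is NULL or does not start with '/'. */
static bool gopher_type_from_path(const char *path, char *type)
{
	assert(type != NULL);

	if (path == NULL)
		return false;

	if (strlen(path) < 2) {
		*type = '1';
		return true;
	}

	if (path[0] != '/')
		return false;

	*type = path[1];
	return true;
}
```

With this rule "/0/readme.txt" yields '0', "/" yields '1', and a path such as "foo/bar" is rejected.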
Index: utils/url.h
===================================================================
--- utils/url.h (revision 12321)
+++ utils/url.h (working copy)
@@ -60,6 +60,7 @@
url_func_result url_path(const char *url, char **result);
url_func_result url_leafname(const char *url, char **result);
url_func_result url_fragment(const char *url, char **result);
+url_func_result url_gopher_type(const char *url, char *result);
url_func_result url_compare(const char *url1, const char *url2,
bool nofrag, bool *result);
Index: desktop/options.h
===================================================================
--- desktop/options.h (revision 12321)
+++ desktop/options.h (working copy)
@@ -49,6 +49,7 @@
extern int option_http_proxy_auth;
extern char *option_http_proxy_auth_user;
extern char *option_http_proxy_auth_pass;
+extern bool option_gopher_inline_images;
extern int option_font_size;
extern int option_font_min_size;
extern char *option_accept_language;
Index: desktop/netsurf.c
===================================================================
--- desktop/netsurf.c (revision 12321)
+++ desktop/netsurf.c (working copy)
@@ -43,6 +43,7 @@
#include "desktop/gui.h"
#include "desktop/options.h"
#include "desktop/searchweb.h"
+#include "render/gopher.h"
#include "render/html.h"
#include "render/textplain.h"
#include "utils/log.h"
@@ -142,6 +143,10 @@
if (error != NSERROR_OK)
return error;
+ error = gopher_init();
+ if (error != NSERROR_OK)
+ return error;
+
error = html_init();
if (error != NSERROR_OK)
return error;
@@ -212,6 +217,7 @@
textplain_fini();
image_fini();
html_fini();
+ gopher_fini();
css_fini();
LOG(("Closing utf8"));
Index: desktop/options.c
===================================================================
--- desktop/options.c (revision 12321)
+++ desktop/options.c (working copy)
@@ -69,6 +69,8 @@
char *option_http_proxy_auth_user = 0;
/** Proxy authentication password */
char *option_http_proxy_auth_pass = 0;
+/** Inline image items in Gopher pages. Dangerous. */
+bool option_gopher_inline_images = false;
/** Default font size / 0.1pt. */
int option_font_size = 128;
/** Minimum font size. */
@@ -248,6 +250,7 @@
{ "http_proxy_auth", OPTION_INTEGER, &option_http_proxy_auth },
{ "http_proxy_auth_user", OPTION_STRING, &option_http_proxy_auth_user },
{ "http_proxy_auth_pass", OPTION_STRING, &option_http_proxy_auth_pass },
+ { "gopher_inline_images", OPTION_BOOL, &option_gopher_inline_images },
{ "font_size", OPTION_INTEGER, &option_font_size },
{ "font_min_size", OPTION_INTEGER, &option_font_min_size },
{ "font_sans", OPTION_STRING, &option_font_sans },
Index: content/fetchers/curl.c
===================================================================
--- content/fetchers/curl.c (revision 12321)
+++ content/fetchers/curl.c (working copy)
@@ -44,6 +44,7 @@
#include "content/urldb.h"
#include "desktop/netsurf.h"
#include "desktop/options.h"
+#include "render/gopher.h"
#include "utils/log.h"
#include "utils/messages.h"
#include "utils/schedule.h"
@@ -72,6 +73,7 @@
bool abort; /**< Abort requested. */
bool stopped; /**< Download stopped on purpose. */
bool only_2xx; /**< Only HTTP 2xx responses acceptable. */
+ char gopher_type; /**< Indicates the type of document for gopher: url */
char *url; /**< URL of this fetch. */
char *host; /**< The hostname of this fetch. */
struct curl_slist *headers; /**< List of request headers. */
@@ -111,6 +113,10 @@
bool only_2xx, const char *post_urlenc,
const struct fetch_multipart_data *post_multipart,
const char **headers);
+static void * fetch_curl_setup_gopher(struct fetch *parent_fetch, const char *url,
+ bool only_2xx, const char *post_urlenc,
+ const struct fetch_multipart_data *post_multipart,
+ const char **headers);
static bool fetch_curl_start(void *vfetch);
static bool fetch_curl_initiate_fetch(struct curl_fetch_info *fetch,
CURL *handle);
@@ -220,14 +226,19 @@
data = curl_version_info(CURLVERSION_NOW);
for (i = 0; data->protocols[i]; i++) {
+ fetcher_setup_fetch setup_hook;
/* Ignore non-http(s) protocols */
- if (strcmp(data->protocols[i], "http") != 0 &&
- strcmp(data->protocols[i], "https") != 0)
+ if (strcmp(data->protocols[i], "http") == 0 ||
+ strcmp(data->protocols[i], "https") == 0)
+ setup_hook = fetch_curl_setup;
+ else if (strcmp(data->protocols[i], "gopher") == 0)
+ setup_hook = fetch_curl_setup_gopher;
+ else
continue;
if (!fetch_add_fetcher(data->protocols[i],
fetch_curl_initialise,
- fetch_curl_setup,
+ setup_hook,
fetch_curl_start,
fetch_curl_abort,
fetch_curl_free,
@@ -339,6 +350,7 @@
fetch->abort = false;
fetch->stopped = false;
fetch->only_2xx = only_2xx;
+ fetch->gopher_type = 0;
fetch->url = strdup(url);
fetch->headers = 0;
fetch->host = host;
@@ -410,6 +422,43 @@
}
+void * fetch_curl_setup_gopher(struct fetch *parent_fetch, const char *url,
+ bool only_2xx, const char *post_urlenc,
+ const struct fetch_multipart_data *post_multipart,
+ const char **headers)
+{
+ struct curl_fetch_info *f;
+ const char *mime;
+ char type = 0;
+ f = fetch_curl_setup(parent_fetch, url, only_2xx, post_urlenc,
+ post_multipart, headers);
+ if (f == NULL)
+ return NULL;
+ if (url_gopher_type(url, &type) == URL_FUNC_OK && type) {
+ f->gopher_type = type;
+ } else {
+ f->http_code = 404;
+ fetch_set_http_code(f->fetch_handle, f->http_code);
+ }
+
+ mime = gopher_type_to_mime(type);
+ /* TODO: add a fetch_mimetype_by_ext() or fetch_mimetype_sniff_data() */
+ /*
+ if (mime == NULL)
+ mime = "application/octet-stream";
+ */
+
+ if (mime) {
+ char s[80];
+ snprintf(s, sizeof s, "Content-type: %s\r\n", mime);
+ fetch_send_callback(FETCH_HEADER, f->fetch_handle, s, strlen(s),
+ FETCH_ERROR_NO_ERROR);
+ }
+
+ return f;
+}
+
+
/**
* Dispatch a single job
*/
@@ -771,6 +820,8 @@
LOG(("done %s", f->url));
if (abort_fetch == false && result == CURLE_OK) {
+ //if (f->gopher_type)
+ //fetch_curl_gopher_data(NULL, 0, 0, f);
/* fetch completed normally */
if (f->stopped ||
(!f->had_headers &&
@@ -978,6 +1029,23 @@
struct curl_fetch_info *f = _f;
CURLcode code;
+ /* gopher data receives special treatment */
+ if (f->gopher_type && gopher_need_generate(f->gopher_type)) {
+ /* type 3 items report an error */
+ if (!f->http_code) {
+ if (data[0] == '3') {
+ /* TODO: try to guess better from the string ?
+ * like "3 '/bcd' doesn't exist!"
+ * TODO: what about other file types ?
+ */
+ f->http_code = 404;
+ } else {
+ f->http_code = 200;
+ }
+ fetch_set_http_code(f->fetch_handle, f->http_code);
+ }
+ }
+
/* ensure we only have to get this information once */
if (!f->http_code)
{
Index: amiga/dist/Install
===================================================================
--- amiga/dist/Install (revision 12321)
+++ amiga/dist/Install (working copy)
@@ -559,6 +559,7 @@
; necessary to ask whether the user wants to add NetSurf to launch-handler.
(working "Adding NetSurf to launch-handler config")
(p_fitr "ENVARC:launch-handler/URL/FILE.LH" "ClientName=\"NETSURF\" ClientPath=\"APPDIR:NETSURF\" CMDFORMAT=\"URL=*\"file:///%s*\"\"")
+ (p_fitr "ENVARC:launch-handler/URL/GOPHER.LH" "ClientName=\"NETSURF\" ClientPath=\"APPDIR:NETSURF\" CMDFORMAT=\"URL=*\"gopher://%s*\"\"")
(p_fitr "ENVARC:launch-handler/URL/HTTP.LH" "ClientName=\"NETSURF\" ClientPath=\"APPDIR:NETSURF\" CMDFORMAT=\"URL=*\"http://%s*\"\"")
(p_fitr "ENVARC:launch-handler/URL/HTTPS.LH" "ClientName=\"NETSURF\" ClientPath=\"APPDIR:NETSURF\" CMDFORMAT=\"URL=*\"https://%s*\"\"")
(p_fitr "ENVARC:launch-handler/URL/WWW.LH" "ClientName=\"NETSURF\" ClientPath=\"APPDIR:NETSURF\" CMDFORMAT=\"URL=*\"http://www.%s*\"\"")
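To illustrate how the fetcher side of the patch turns an item type into a faked Content-type header, here is a minimal sketch of a sentinel-terminated lookup that returns NULL for unknown types, as gopher_type_to_mime() does. The table entries and the names example_map, example_type_to_mime() and build_content_type() are assumptions for illustration only; the patch's real gopher_type_map is not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

struct type_map {
	char type;        /* gopher item type character */
	const char *mime; /* MIME type to report for it */
};

/* Illustrative entries only, terminated by a zero type. */
static const struct type_map example_map[] = {
	{ '0', "text/plain" },
	{ '1', "text/x-gopher-directory" },
	{ 'g', "image/gif" },
	{ 'h', "text/html" },
	{ 0, NULL }
};

/* Returns the MIME type for an item type, or NULL when it is unknown,
 * so the caller can decide to probe by extension or by sniffing. */
static const char *example_type_to_mime(char type)
{
	for (int i = 0; example_map[i].type; i++)
		if (example_map[i].type == type)
			return example_map[i].mime;
	return NULL;
}

/* Formats a Content-type header line into buf. Returns false when the
 * type is unknown or the buffer is too small; buf is then not usable. */
static bool build_content_type(char type, char *buf, size_t len)
{
	const char *mime = example_type_to_mime(type);
	int n;

	if (mime == NULL)
		return false;

	n = snprintf(buf, len, "Content-type: %s\r\n", mime);
	return n >= 0 && (size_t)n < len;
}
```

Returning NULL rather than a default such as text/html leaves the fallback decision to the caller, which is the behaviour the fetch_filetype() discussion earlier in this thread asks for.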
Conflicted files
Removed files
12 years, 4 months
Current state of things
by Michael Drake
I've just done some profiling:
http://www.netsurf-browser.org/temp/profiles/r12380-bbc.png
http://www.netsurf-browser.org/temp/profiles/r12380-bbc-news.png
http://www.netsurf-browser.org/temp/profiles/r12380-gamespot.png
All three tests include browser startup and quit, which are too quick to
really show up. I didn't do anything once the pages loaded other than
quit, so it's really just showing where time is spent between requesting a
URL and it finishing showing on screen.
As seen in the profiles, most of the time is spent in css_select_style.
On BBC News and Gamespot front pages, css_select_style takes 55% of the
time, and on BBC homepage, it takes 42% of the time.
On the BBC homepage JPEGs cause jpeg_convert to take 13% of the time.
On the other two pages jpeg_convert took 4-5% of the time.
Other image formats like PNG and GIF don't really show up.
Returning to css_select_style, and taking BBC News as an example:
Function            Inclusive    Self
css_select_style    55.6%        7.8%
match_details       40.9%        11.9%
I'm not sure how much those functions can be optimised themselves, so
their Self values may be difficult to reduce, but the stuff below
match_details should be improved massively when we are able to switch from
libxml to libdom. (FWIW, that's 40.9 - 11.9 = 29% ish.)
Also, I've fixed some issues with the scrollbar widget, which was a
frames-handled-in-core prerequisite.
--
Michael Drake (tlsa) http://www.netsurf-browser.org/
12 years, 4 months