Cramp: web to gemini proxy

November 19th, 2021 on ols.wtf

Like many people interested in the “small web”, I have been looking at the Gemini project for some time now. My Gemini capsule (note the scheme¹) has been up and running for over a year, and has slowly gathered more content. Until now, though, I hadn’t made use of any of the protocol’s (intentionally) limited interactive elements.

I decided to write a Gemini “server” whose sole job is to fetch a given web URL, convert the page to Gemtext, and return it to the user. It uses the INPUT status (code 10) from the Gemini spec to ask the user for the URL they want to view.
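To illustrate the INPUT flow, here is a minimal sketch in Go. It is not taken from Cramp's source: the helper name `responseHeader`, the prompt text, and the use of status 59 for malformed requests are my own assumptions. The idea is that a request without a query string gets a status 10 prompt, and the client then repeats the request with the user's answer percent-encoded in the query.

```go
package main

import (
	"fmt"
	"net/url"
)

// responseHeader is a hypothetical helper: given the URL from a Gemini
// request line (CRLF already stripped), it returns the response header to
// send and, if the user supplied one, the web URL to fetch.
func responseHeader(rawRequest string) (header string, target string) {
	u, err := url.Parse(rawRequest)
	if err != nil {
		return "59 Bad request\r\n", ""
	}
	if u.RawQuery == "" {
		// Status 10 (INPUT): the client shows the prompt and re-requests
		// the same URL with the user's answer as the query string.
		return "10 Enter a web URL\r\n", ""
	}
	// PathUnescape decodes %XX sequences but leaves '+' untouched,
	// which matters because '+' can legitimately appear in a URL.
	target, err = url.PathUnescape(u.RawQuery)
	if err != nil {
		return "59 Bad request\r\n", ""
	}
	return "20 text/gemini\r\n", target
}

func main() {
	for _, req := range []string{
		"gemini://example.org/",
		"gemini://example.org/?https%3A%2F%2Fexample.com%2F",
	} {
		header, target := responseHeader(req)
		fmt.Printf("%q -> header %q, target %q\n", req, header, target)
	}
}
```

In a real server the second case would go on to fetch and convert the target before writing the body after the header.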

What happens under the hood can be boiled down to the following:

1. The page at the URL is fetched and parsed for readability, much as the “Reader Mode” in most modern web browsers strips out unnecessary content, styles, and scripts
2. The resulting HTML is converted to Gemtext
3. A success header, followed by the generated Gemtext, is returned

One thing that needed some thought was hyperlinks. In HTML, links can be scattered throughout a paragraph of text. In Gemini, a link to another document must be on a line of its own (similar to Gopher). The approach taken here is to number the links on an HTML page like footnotes (i.e. [1]) and then gather them at the end of a block of text, that is, before the next heading.
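The footnote-style numbering can be sketched roughly as follows. This is an illustration, not Cramp's actual code: the `Inline` type and `renderBlock` function are hypothetical, and I am assuming the numbering carries on across blocks rather than restarting at each heading.

```go
package main

import (
	"fmt"
	"strings"
)

// Inline is a hypothetical run of text from the parsed HTML.
// A non-empty Href means the run was wrapped in an <a> element.
type Inline struct {
	Text string
	Href string
}

// renderBlock writes the text of one block with [n] markers after each
// link, then lists the links as Gemtext link lines ("=> URL label").
// It returns the rendered block and the next free footnote number.
func renderBlock(inlines []Inline, next int) (string, int) {
	var text, links strings.Builder
	for _, in := range inlines {
		text.WriteString(in.Text)
		if in.Href == "" {
			continue
		}
		fmt.Fprintf(&text, "[%d]", next)
		fmt.Fprintf(&links, "=> %s [%d] %s\n", in.Href, next, in.Text)
		next++
	}
	out := text.String() + "\n"
	if links.Len() > 0 {
		// A blank line separates the text from its collected links.
		out += "\n" + links.String()
	}
	return out, next
}

func main() {
	block := []Inline{
		{Text: "Read the "},
		{Text: "Gemini FAQ", Href: "gemini://example.org/faq.gmi"},
		{Text: " or the "},
		{Text: "spec", Href: "https://example.org/spec"},
		{Text: "."},
	}
	out, _ := renderBlock(block, 1)
	fmt.Print(out)
}
```

Calling `renderBlock` once per section, just before emitting the next heading, keeps each group of links close to the text that refers to it.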

The most challenging aspect of this application was handling images. Because the MIME type is passed as part of the first line of the response in Gemini (e.g. 20 image/png), serving an image means reading it into memory, determining its MIME type, and then using that to form the response header before writing out the raw image data. This was challenging primarily because I was using Go and didn’t realise that an io.Reader can only be consumed once, so after checking the MIME type, the HTTP response body came back empty when I tried to read it again.

You can see a demo by pointing your Gemini client to ols.wtf:5555 which will prompt you to enter a URL. You could even view this page through the proxy directly by visiting this link.

The source code is available on sr.ht as usual.

¹ You will need a client capable of opening gemini:// links; a number are recommended on this page. Alternatively, you can view my capsule via an HTTP proxy at gemini.ols.wtf.

Do you have a comment to make on this content? Start a discussion in my public inbox by emailing ~ols/public-inbox@lists.sr.ht. You can see the inbox here.