Cramp: web to gemini proxy
November 19th, 2021 on ols.wtf
Like many people interested in the “small web”, I have been looking at the Gemini project for some time now. My Gemini capsule (note the scheme) has been up and running for over a year and has slowly gathered more content. Until now, I hadn’t made use of any of the protocol’s intentionally limited interactive elements.
I decided that I would write a Gemini “server” whose sole job is to fetch a given web URL, convert it to Gemtext, and return the page to the user. This would make use of the INPUT directive in the Gemini spec to get the requested URL from the user.
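To illustrate the INPUT flow, here is a minimal Go sketch; it is not Cramp's actual code. It assumes the server has already read the request line, and the function name handle and the prompt text are made up for this example. With no query string the server answers with status 10 and a prompt; once the client re-requests with the user's answer as the query, the percent-encoded query is decoded into the web URL to fetch.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// handle inspects one Gemini request line (hypothetical helper).
// It returns either a response header to send immediately, or the
// decoded web URL that the proxy should go on to fetch.
func handle(request string) (string, string, error) {
	u, err := url.Parse(strings.TrimSpace(request))
	if err != nil {
		return "59 bad request\r\n", "", err
	}
	if u.RawQuery == "" {
		// Status 10 asks the client to prompt the user and re-request
		// the same URL with the answer as the query string.
		return "10 Enter a web URL\r\n", "", nil
	}
	// The answer arrives percent-encoded; PathUnescape leaves '+' alone.
	target, err := url.PathUnescape(u.RawQuery)
	if err != nil {
		return "59 bad request\r\n", "", err
	}
	return "", target, nil
}

func main() {
	for _, req := range []string{
		"gemini://example.org/\r\n",
		"gemini://example.org/?https%3A%2F%2Fexample.com%2F\r\n",
	} {
		header, target, err := handle(req)
		fmt.Printf("header=%q target=%q err=%v\n", header, target, err)
	}
}
```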
What happens under the hood can be boiled down to the following:
- The page at the URL is fetched and parsed for readability, much as the “Reader Mode” in most modern web browsers strips out unnecessary content, styles, and scripts
- The resulting HTML is converted to Gemtext
- A success header, followed by the generated Gemtext, is returned
One consideration was how to handle hyperlinks. In HTML, links can be scattered within a paragraph of text. In Gemini, a link to another document must be on a line of its own (similar to Gopher). The approach taken here is to number the links in an HTML page as you would footnotes, and then aggregate them at the end of each block of text, that is, before the next heading.
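The footnote-style link handling can be sketched like this in Go. The span type, the renderBlock function, and the [n] marker format are all assumptions made for this example; Cramp's real markers and data structures may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// span is one run of inline content (hypothetical type): plain text
// when href is empty, otherwise link text together with its target.
type span struct {
	text string
	href string
}

// renderBlock flattens one block of text into Gemtext. Each inline link
// gets a numbered marker in the prose, and the matching "=>" link lines
// are collected after the text. next is the number for the first link;
// the number for the following block is returned.
func renderBlock(spans []span, next int) (string, int) {
	var body, links strings.Builder
	for _, s := range spans {
		body.WriteString(s.text)
		if s.href == "" {
			continue
		}
		fmt.Fprintf(&body, "[%d]", next)
		fmt.Fprintf(&links, "=> %s [%d] %s\n", s.href, next, s.text)
		next++
	}
	return body.String() + "\n" + links.String(), next
}

func main() {
	out, _ := renderBlock([]span{
		{text: "Gemini is described in "},
		{text: "the spec", href: "https://example.com/spec"},
		{text: " and has "},
		{text: "many clients", href: "https://example.com/clients"},
		{text: "."},
	}, 1)
	fmt.Print(out)
}
```

Passing the returned counter into the next call keeps the numbering continuous across blocks; starting again from 1 for each block would work equally well.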
The most challenging aspect of this application was handling images. Because the MIME type is passed as part of the first line of the response in Gemini (e.g. 20 image/png), the process for serving an image is to read the image into memory, determine its MIME type, and then use that to form the response header before writing out the raw image data. This was challenging primarily because I was using Go and didn’t realise that an io.Reader can only be read once: checking the MIME type consumed the HTTP response body, so by the time I came to write out the actual contents there was nothing left to read.
The source code is available on sr.ht as usual.