Sitemap Generator

I’ve made a tool that will generate a sitemap, available at https://silverbullet.domain/_/sitemap.xml

To use it, paste the following code into any page. I’ve put mine under Library/Me/Script/sitemap. You can change the defaults if you want, but it’s not required.

Question for anyone viewing this: is there a better way to access JS-land from Space Lua? I’m encoding an export default foobar module into a data URI and then loading it with js.import, but that feels very hacky.

---
tags: meta
---

```space-lua
do
  local defaults = {
    domain = "o.stag.lol"; -- base absolute domain (fallback if not in http header)
    cf = "weekly"; -- default for <changefreq>
    cfname = "changefreq"; -- frontmatter attribute to be used
  }

  -- js: export default encodeURIComponent
  local encodeURI = js.import("data:text/javascript,export%20default%20encodeURIComponent")

  local function gensitemap(host)
    local sitemap = ""
  
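    for page in each(space.listPages()) do
      -- skip library pages and all-caps pages (names with no lowercase letters)
      if not (
        page.name:startsWith("Library/") or
        page.name == page.name:upper()
      ) then
        -- append with `..`, Lua's string concatenation operator
        sitemap = sitemap .. spacelua.interpolate([[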
          <url>
            <loc>https://${host}/${encodeURI(page.name)}</loc>
            <lastmod>${page.lastModified}</lastmod>
            <changefreq>${page[defaults.cfname] or defaults.cf}</changefreq>
          </url>
        ]], {host=host;encodeURI=encodeURI;page=page;defaults=defaults})
      end
    end
  
    return spacelua.interpolate([[
      <?xml version="1.0" encoding="UTF-8"?>
      <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
        ${sitemap}
      </urlset>
    ]], {sitemap=sitemap})
  end
  
  event.listen {
    name = "http:request:/sitemap.xml";
    run = function(event)
      local stat, res = pcall(function()
        return gensitemap(event.data.headers.host or defaults.domain)
      end)
      
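      return {
        -- report failures as 500 so clients don't mistake the error text for a sitemap
        status = stat and 200 or 500;
        headers = {
          ["Content-Type"] = stat and "application/xml" or "text/plain";
        };
        -- the error value is not guaranteed to be a string, so convert it first
        body = stat and res or ("Error: " .. tostring(res));
      }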
    end;
  }
end
```

This is really cool, thanks! When I tried it with Firefox, it returned this error:

XML Parsing Error: XML or text declaration not at start of entity
Location: https://silverbullet.example.com/_/sitemap.xml
Line Number 2, Column 7:

The parser is picky here, but it has a point: an XML declaration has to be the very first thing in the document. This can be fixed by removing the whitespace preceding the XML declaration, like this:

    return spacelua.interpolate([[<?xml version="1.0" encoding="UTF-8"?>
      <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      -- (...)

No wonder it feels hacky; I’m impressed you made it work! If I understand correctly, a big reason for introducing Lua was to prevent access to arbitrary JavaScript from Space Scripts and to allow only Plugs to do it. Since working with URLs seems like a reasonably popular thing to do in a web app, maybe we’ll get the function as a built-in one day?

I raise you an even hackier (or not?) solution: the algorithms for UTF-8 or URI encoding aren’t terribly complicated and they have good documentation…

encodeURIComponent reimplemented in Space Lua

    function encodeURIComponent(uriComponent)
      -- this works better on v2
      -- Percent-encode every character except the ones encodeURIComponent leaves alone.
      -- In a Lua pattern set, %- is a literal hyphen.
      -- Assumes string.byte returns the character's code point, not a raw UTF-8 byte.
      local text = string.gsub(uriComponent, "([^A-Za-z0-9%-_.!~*'()])", function(character)
        local c = string.byte(character)
        local bytes = {}
        if c <= 127 then -- c <= 0x7F, regular ASCII
          bytes[1] = c
        elseif c <= 2047 then -- c <= 0x7FF, two UTF-8 bytes
          bytes[2] = 128 + (c % 64) -- 0x80 + (c & 0x3F)
          c = c // 64 -- c = c >> 6
          bytes[1] = 192 + c -- 0xC0 + c
        elseif c <= 65535 then -- c <= 0xFFFF, three UTF-8 bytes
          bytes[3] = 128 + (c % 64)
          c = c // 64
          bytes[2] = 128 + (c % 64)
          c = c // 64
          bytes[1] = 224 + c -- 0xE0 + c
        elseif c <= 1114111 then -- c <= 0x10FFFF, four UTF-8 bytes
          bytes[4] = 128 + (c % 64)
          c = c // 64
          bytes[3] = 128 + (c % 64)
          c = c // 64
          bytes[2] = 128 + (c % 64)
          c = c // 64
          bytes[1] = 240 + c -- 0xF0 + c
        end
        -- emit each UTF-8 byte as %XX
        local out = ""
        for _, b in ipairs(bytes) do
          out = out .. string.format("%%%02X", b)
        end
        return out
      end)
      return text
    end
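
For comparison, the function above converts code points to UTF-8 by hand. In stock Lua (outside SilverBullet), strings are plain byte sequences, so a string that is already UTF-8 can be percent-encoded one byte at a time with no code point arithmetic. A minimal sketch under that assumption; the name encodeURIComponentBytes is made up for this example, and I haven't checked how it behaves inside Space Lua:

```lua
-- Hypothetical helper for stock Lua: percent-encode a UTF-8 string byte by byte.
-- Assumes the input is already valid UTF-8 and that strings are byte sequences.
local function encodeURIComponentBytes(s)
  -- Every byte outside the unreserved set becomes %XX.
  -- The outer parentheses drop gsub's second return value (the match count).
  return (string.gsub(s, "[^A-Za-z0-9%-_.!~*'()]", function(ch)
    return string.format("%%%02X", string.byte(ch))
  end))
end

print(encodeURIComponentBytes("Folder/My Page"))
```

Because multi-byte characters are just runs of bytes with values of 128 or more, each of those bytes falls outside the unreserved set and is escaped on its own, which is the same result the UTF-8 arithmetic produces.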

This sounds awesome! Such a clever way of making this work, thanks for sharing!

When I try to use it I get an error:

No custom endpoint handler is handling this path

I’ve done System: Reload, reindexed, and restarted Silverbullet a couple times, but no luck. Not sure what I’ve managed to break :sweat_smile:

That’s weird. Try double-checking that these lines are correct:

 event.listen {
    name = "http:request:/sitemap.xml";

and that you’re accessing it at silverbullet/_/sitemap.xml.

Maybe the event listener isn’t being registered for some reason; check that it’s in a space-lua block. Also, logs from DevTools (Ctrl+Shift+I) will be helpful.
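
One way to narrow it down is to register a throwaway endpoint and see whether it responds. A minimal sketch that reuses the same event.listen shape as the sitemap script; the /ping path and the "pong" body are made up for this test:

```space-lua
-- Hypothetical test endpoint: if this responds, space-lua blocks on the page are being loaded.
event.listen {
  name = "http:request:/ping";
  run = function(event)
    return {
      status = 200;
      headers = {
        ["Content-Type"] = "text/plain";
      };
      body = "pong";
    }
  end;
}
```

If silverbullet/_/ping also answers with “No custom endpoint handler is handling this path”, the block isn’t being picked up at all, so the problem is with how the page is loaded, not with the sitemap code.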