My reliability story with silverbullet has been less than stellar. Although I haven’t experienced significant data loss (…partly because I auto-backup everything with a git cron job), I’m noticing a lot of “I want to open silverbullet and right now it kinda just doesn’t wanna”.
I’m talking:
Surprise logouts while editing
Never being quite sure that your edit has been sent to the server
Never being quite sure if your edits from another device have been received on this one
More recent phenomenon: “Client needs reloading to update the cache, required syscalls are not available in this version. This message may appear a few times. Reloading now.” - this feels like an infinite boot loop (…it does seem to be finite so far though)
Surprise re-indexes that take very long
Some space lua not working unless I reload or hard-reload, in ways that are sometimes super awkward e.g. on Android (and might trigger a long re-index) - this is a big problem as my main page is only space lua
Silverbullet works just fine for “it’s open and I’m working within it for a duration”, but the aforementioned is problematic for things like:
I need my child’s medical ID real quick
I want to make a quick note of something
I showed up at a meeting with my laptop and I want to start typing notes
These are specific issues, that I’m sure in isolation have bothered people, but have me actively glancing to other PKM systems, which are nowhere as nice feature-wise, but have a better reliability story. And it seems… silly. Specifically the “Client needs reloading”, which is only a couple of weeks old.
Is this resonating with other folks? Is this just my fault from running from main?
I had similar issues with logouts and uncertainty about whether my edits were saving, so I stopped using SB for a while.
But I couldn’t find anything I liked more, or worked as well for me, so I set it up again and this time bypassed Authelia in favour of just protecting it behind Tailscale.
So if you’re using SB behind an auth system I’d suggest looking at that - things might’ve changed in that respect too (in terms of changes to code/documentation) - or think about accessing it some other way.
Since I did this I’ve not had any issues. The cache reloads I believe happened recently due to an indexing change but I expect (and don’t mind this) if it means I can live on the bleeding edge like you are too.
Yeah I also tried that, but at least the auth stuff I can figure out (I even contributed some fixes)… and the thing prompting me to write this was seeing “Client needs reloading to update the cache, required syscalls are not available in this version. This message may appear a few times. Reloading now” on silverbullet.md - that is, the official site, with no auth required.
YES, this has been one of my biggest gripes with Silverbullet and frankly, with the current architecture, there isn’t a lot we can do. Let me explain what I think are the underlying issues here:
1. Silverbullet v0 was something entirely different from Silverbullet v2. The code base has been stirred a few times by now and has a lot of patchwork code. I guess this is kind of normal and to be expected, but it really doesn’t help make the project manageable and thus reliable.
2. Silverbullet has a very, very large API surface, which makes it immovable. “Normal” applications only interface with the user through a defined UI, while the data is stored in formats hidden from the user. This means the application controls how the data is laid out and how the user interacts with it (think Gmail, Apple Notes, whatever). Silverbullet has the “problem” that it gives the user direct access to the data, and the data format is very complex (counting stuff like SpaceLua scripts, which are part of the data). This means that once something has to change, every user may have to change their data, reformat it, etc. This gets especially bad when bugs start to become features, which I’ve seen happen way too many times in Silverbullet. Apps like Gmail can just reformat the data using some database script to fix a bad data layout without the user noticing; Silverbullet can’t. This makes it very hard to work on anything in Silverbullet. Zef, for example, has lately been working on indexing, trying to improve speed and reliability (I think), and has been facing problems because dropping anything from the API (e.g. dropping paragraph indexing) could mean breaking the space of some users. There are way more subtle examples, which together create a very big problem.
3. The client-centered model. The first problem is synchronization: I’ve not seen an application achieve synchronization between clients reliably, not - a - single - one. I think this is an unsolved problem in computer science. Lots of state means lots of stuff that can go wrong. That doesn’t mean Silverbullet is doing the best job it could right now, but it’s very, very, very hard to improve. The second problem with the client approach is its heaviness: browsers just aren’t built to potentially keep and access megabytes of data using some lisp derived mutated language. (Chromebook people would probably disagree …)
I think two and three are the core issues. Obviously you could put more care into the development of Silverbullet, but in the end I think these issues persist and won’t ever lead to Silverbullet being very reliable.
The only approach to solve this in the long run is a full rewrite, in my opinion. Every piece of software develops a certain attitude over time, which influences future code written for it, and I don’t think it’s a good one for Silverbullet right now.
This all sounds very pessimistic, but I think it’s realistic. Silverbullet currently is at 75% polishedness and I think it has hit a dead end in the labyrinth of software development.
Thank you! I agree with a lot of your assessment. I think 2 is a result of 1; v2 is basically “the lua rewrite”, but the API still feels unstable. I’m also convinced that a rewrite won’t necessarily solve this (…I mean, there kinda were 2 rewrites already) if it doesn’t have reliability as a priority - and hopefully this thread helps.
I disagree about the client-centered model being unworkable. I’ve had good experience with Google Docs (which, admittedly, has huge resources behind it), and I think Yjs does kinda the same thing? Zef made a conscious choice to both add and remove Y.js (see discussion here) - I don’t know for certain how good Y.js’s reliability was, but the reason for removal was complexity.
I also personally think that too much of the processing currently happens on the client. In v1 (and early v2?) the server was nodejs, which led to poor separation between client and server. The server is now in Go (which I think is an improvement), but now purposely does very little. I personally would love to see the server handle the indexing so that client startup could be faster (just download a pre-built index)… but I realize that would make offline mode tricky. I don’t have a great solution for this.
I use the SB stable version only, self-hosted behind a VPN tunnel. I use an advanced git Lua plug to back up my space in more or less real time. My clients are desktop browsers only.
I have never had a client disconnection.
Multi-client synchronization was horrible, but v2 improved it and made it more stable.
For me, the big deal is regressions in new releases and migration to new versions. Many times, it has been necessary to wipe all my clients (twice) to reach stability. I pray I won’t have to rewrite my custom scripts.
But really, since the 2.0 release, Zef has made a lot of effort.
For me, SB on mobile is useless. I have not found a way to use it with the current ergonomics and erratic synchronization.
With 2.0, to my mind, the future roadmap could be:
stability: bug fixes
regression tests and code coverage
Have you tried changing the hosting architecture to limit network layers?
silvernote.dev looks like it’s hosting the exact same frontend code, with the exact same sync mechanism. I can’t see anything about it that would mitigate any of the issues I’m describing.
First of all, thanks for this feedback; it’s always helpful. Since I don’t see crash reports, collect metrics or anything, I have a very constrained view of what issues you all hit unless you tell me, rather than quietly moving on out of frustration. So thanks for that.
I think there’s a bunch of different issues being reported here, some more related to my development and release “strategy”, some more inherent to the project and technical/philosophical decisions, some probably more related to technical debt.
I don’t have time for a detailed response and possible solutions now, but will get back to this early next week.
I kind of went on a pessimistic ramble I guess. I do that sometimes, but to list some of the day-to-day issues I face:
Styles don’t load, or don’t load quickly, and the default styling is sometimes not the best.
Silverbullet sometimes hits some kind of dead state, i.e. something doesn’t work: widgets aren’t rendered, I can’t navigate, the page is empty and doesn’t load, etc. Some of these are fixed after a quick reload, some need a full client wipe, which is now possible and already a big improvement. I think this is the most annoying issue and the one I face the most. It even happens on silverbullet.md.
Constant reindexing, indexing getting stuck. This kind of adds to the previous point, but the number of times I’ve seen the indexing just stop at the top is weird, idk. I’ve never really investigated it, as it’s fixed by a reload most of the time, but it’s still weird.
Mobile is a mess for me, I just don’t use it anymore. I haven’t really put too much time into configuring it properly, so that may be the culprit, but e.g. the two finger click and three finger click are highly unreliable. The action bar is better, but still not a great way to do stuff.
The selection based widget rendering can feel very weird/bad sometimes with the document jumping around, etc.
I guess I can’t really complain if I don’t open issues or PRs, but these things alone are often very minor or I encounter them at unpleasant times, so I am rarely able to open an issue. I guess that’s the main problem with reliability problems after all…
Before I go point by point later, a few shower thoughts on the topic of sync not being reliable and (some believe) fundamentally being impossible to make truly reliable.
This raises the (valid) question: why have sync at all? Why not just rip it out, or make it optional?
I never summarized the User Survey results (although I probably should), but one of my takeaways was that for you, offline support is not that important. For me personally, it’s critical, but clearly this is not the case for many of you. I’ve also read this here and there anecdotally, from people who only use SB in a networked setting anyway. And if some of you say you don’t use SB on mobile at all (separate topic to address), that kind of fits in there.
The only reason for sync to be there is offline support, so what if you can simply disable sync? What would you lose and what would you gain?
The way this can be implemented (since I decoupled the sync engine into the service worker a few releases back) is that effectively I could disable 99% of the service worker logic (or disable the service worker altogether) in this mode.
What you would lose:
Offline support. SilverBullet would only operate when Internet connected and error out in various ways if not.
Some performance: page loads and saves (and possibly initial indexing) would be a bit slower.
What you would gain:
No sync conflicts, because no sync is happening.
No uncertainty about whether your edits made it to the server (you’d see an error instantly in case of errors)
Probably all authentication proxy issues would go away — I hope I solved most of these over time, but any of them that keep appearing are usually rooted in the service worker setup that SB has. If the service worker is off, this problem would go away.
Less data stored locally (the index would still live in the browser, but there would no longer be a full copy of your space)
The way I could introduce this (fairly easily) is to add e.g. a SB_DISABLE_SYNC environment variable (historical reference) on the server end, and that would be that. Sync off everywhere. Hypothetically I could make this configurable on a per-client basis, but that is asking for a bunch of edge case scenarios, especially when you start to toggle modes, that I prefer not to live through again (had this issue in v1 already).
What this would not affect is how (and where) indexing happens. Every client would still have to build up and maintain its own local index in the browser, and on first boot (so in principle once, except for reindex triggers) effectively download all files (once) for indexing purposes. It would not keep a copy locally though (of the files), just the index. Changing this is a whole different (level) project.
Would that seem like something to introduce and help?
What rubs me the wrong way about this approach is that it feels a bit like “giving up on sync”. Personally I do believe sync can be “perfected” (not fully, but at least more than today) and offering this option feels a bit like giving up and a get out of jail free card: “if you have sync issues, well you can just disable it. See yas!” But ok.
By the way, on the topic of “sync is fundamentally an unsolvable problem” — I think this can be addressed by switching to something like CRDTs, which also make multi-player support (concurrent editing, Google Docs style) very feasible.
However, this is a project that is sufficient in scope that I’d only consider tackling it if somebody knocks on my door with a sack of money to hire a team to build and maintain this. This is outside single-person side-project territory (for the foreseeable future) for a project already on the verge of becoming too complex.
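For anyone who hasn't met CRDTs before, here is a minimal sketch of the core idea, using a last-writer-wins register. This is nowhere near what a text CRDT such as Yjs or Automerge does, and none of the names below come from SilverBullet; it only illustrates why replicas can merge in any order and still converge.

```typescript
// A last-writer-wins (LWW) register: one of the simplest state-based CRDTs.
// All names here are illustrative; nothing below is SilverBullet code.
interface LwwRegister {
  value: string;
  timestamp: number; // logical or wall-clock time of the write
  clientId: string; // tie-breaker when two writes share a timestamp
}

// Merge keeps the "later" write. The result depends only on the two states,
// not on the order in which they are merged, so every replica ends up with
// the same value. Assumption: (timestamp, clientId) identifies a write uniquely.
function mergeLww(a: LwwRegister, b: LwwRegister): LwwRegister {
  if (a.timestamp !== b.timestamp) {
    return a.timestamp > b.timestamp ? a : b;
  }
  return a.clientId >= b.clientId ? a : b;
}

const laptop: LwwRegister = { value: "Buy milk", timestamp: 10, clientId: "laptop" };
const phone: LwwRegister = { value: "Buy oat milk", timestamp: 12, clientId: "phone" };

// Each device merges the other's state; both arrive at the same register.
const mergedOnLaptop = mergeLww(laptop, phone);
const mergedOnPhone = mergeLww(phone, laptop);
console.log(mergedOnLaptop.value === mergedOnPhone.value);
```

The catch is that "later write wins" silently discards the earlier edit. Merging concurrent edits inside one page needs a much richer data structure, which is where the scope concern above comes from.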
Personally, I believe offline mode is valuable enough, and disabling it would go against what I’m describing - now I’d be reliant on the network reliability too. Especially for the “I just have a quick note” situation, having things work offline is super helpful. (In fact, I currently use HTTP Shortcuts to add quick notes directly over the API… and I’ve run into the limitation that it doesn’t work offline, and am working on building a Google Notes Sync based workaround for that).
As for CRDTs - yep, I already mentioned our discussion about Y.js.
I would argue that the project simply already is too complex. Features are added, ensuring a smooth upgrade path is hard (even more so with a tiny team), edge cases multiply, reliability goes down. This means that, to get (back) to a high level of reliability, active effort needs to be made - at the expense of effort on new features. That’s not an easy trade to make, and the reason I started the thread is to try and assess if I’m just an outlier and it “just works well enough” for others.
Every one of the specific points I brought up can probably be tackled directly and fixed (especially visual sync indicators). The question is more about preventing backslides. Silverbullet seems to have plenty of unit tests… but nothing (I think) which would look at “hey we’ve changed the index format, how many times is the browser going to reload?”. I’m pretty much the opposite of a frontend specialist, but… [sheepishly] playwright tests?
Yes, this is a topic I’m very sensitive to. This feels a bit like a “stop the world” type of situation: stop feature development and focus on stability and reliability for a while, reduce complexity where possible before adding more stuff. Intuitively this feels like the right thing to do right now. I think SB is not missing major big features, so allocating my time to making (and keeping) this thing rock solid makes sense. As I’ve gone along over the last weeks, I have added more tests to areas where they were completely missing before. However, not everything is covered yet. And your specific issues (syscall-based reloading) are really browser cache related, so that would be quite tricky to detect in a test anyway.
I don’t know if it still causes “deadlocks” sometimes, but if something gets stuck and I see it in the top bar, it’s indexing. (I thought there was a 5s interval to reset it, but it doesn’t seem to go away right now. Weirdly, not even reloading fixes it.) So I think sync is not even that big of a problem at the moment, but I’m not sure.
Maybe something that would help is putting up an “Are you sure you want to close this page?” message (unsure what the browser API is called) during stages where closing Silverbullet would be catastrophic, e.g. not saved, etc. Unsure if that’s too annoying.
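The browser mechanism for this kind of prompt is the `beforeunload` event: calling `preventDefault()` on it asks the browser to show its own generic confirmation dialog. A minimal sketch follows, assuming a hypothetical `hasUnsavedEdits` flag (the thread doesn't say how SilverBullet tracks this). The small interfaces are there only so the snippet type-checks without DOM typings.

```typescript
// Minimal shapes so the sketch compiles without the DOM type library.
interface UnloadEventLike {
  preventDefault(): void;
}
interface UnloadTargetLike {
  addEventListener(
    type: "beforeunload",
    listener: (event: UnloadEventLike) => void,
  ): void;
}

// Hypothetical client state; SilverBullet's real state is not modelled here.
interface ClientState {
  hasUnsavedEdits: boolean;
}

// Registers a guard that asks for confirmation before the tab closes,
// but only while there are unsaved edits, to keep the prompt rare.
function installCloseGuard(target: UnloadTargetLike, state: ClientState): void {
  target.addEventListener("beforeunload", (event) => {
    if (state.hasUnsavedEdits) {
      // Requests the browser's built-in "leave site?" dialog.
      event.preventDefault();
    }
  });
}
```

In a browser this would be wired up with something like `installCloseGuard(window, state)`. Browsers don't allow custom wording in the dialog and may skip it if the user hasn't interacted with the page, so it is a safety net rather than a guarantee.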
I’m a user who uses the traditional built in auth, sync, and mobile. I’m typically connected to the Internet but I don’t like waiting for syncs to happen before I start writing, especially if a lot changed.
That said, currently SB has met essentially all my needs. Unless I want to work on the same doc I was just working on on desktop, I don’t need to wait for the sync. In fact, I typically don’t bother trying to remember if I edited a page recently on another platform, I just start writing. Sync is fast enough that I haven’t had a conflicted md file since, like, v2.0.
The mobile experience has been totally fine for me. I used to miss the bottom bar, because I had that in Logseq, but really I only ever used the buttons to indent left and right, and I got a library from this community that allowed me to swipe left and right on mobile to do the same thing, so I haven’t wished for the bottom bar in a long time now.
That was all to say, I actually find SB to be quite reliable as-is! Part of me would be interested in looking at CRDTs, but I’ve been playing with prosemirror and automerge on my own project and agree it’s not just a trivial feature to add.
I didn’t want to weigh in here previously because I really don’t have any preference on what gets worked on next, as I’m quite content with the current state of SB, but reading all the work that would be involved in allowing users to disable sync and then having to maintain a synced and not synced flow going forward sounds like a lot and I just wanted to weigh in that some of us are using the app, both on desktop and mobile, without any sync issues currently!
What is your setup? Do you use SilverBullet’s native authentication, or additional authentication layers on top?
The thing with the current architecture is that it’s offline first: all data can be served and persisted locally, and SilverBullet will only notice you’re logged out when it attempts any server operation (which could be a file save), and will then prompt you to log in again. This should never lead to data loss, though. Has that happened? I get that it’s annoying, and it has happened to me too that I’m typing and all of a sudden a dialog is blocking the UI. But I’m not sure what to do differently here.
I hear you. I’ve never really experienced my edits not making it to the server unless I saw an explicit error, at least not since I moved the sync engine to the service worker (2.1 or 2.2?), where sync can even continue for a bit if I lock my phone (which would be the main case where I’d be afraid my changes are not synced). Has this happened to you recently?
As to not knowing whether the edits have made it to your device: yeah, for your currently open page this syncing happens quite aggressively (every 5s or so), but for other files it can take longer (20s or a bit more). Wondering if there’s anything we can do in the UI to make this more clear.
Yeah I’m not proud of this “solution”. What I’m doing right now is move some logic around. Logic that was previously built into plugs, I’m now moving to SB core (for reasons). The result is that things that used to be plug-provided syscalls, now become SB core syscalls. This should be fine if I could guarantee that browsers wouldn’t have different caching behaviors regarding each of those. The situation I’ve run into is that the user gets the new plug code, but has not loaded the new SB client bundle yet, resulting in a ton of syscall errors. My “fix” was to check on the plug side if a syscall is available and if not, force another reload hoping the browser cache would refresh. This seems to mostly work and should be a one-time thing (per client). Once I’ve moved over all this type of logic, this should not happen again.
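To make the check described above concrete, here is a rough sketch of a plug-side guard. It is not the actual SilverBullet code: the `ClientInfo` shape, the syscall names and the reload cap are all assumptions. The cap in particular is not something the thread describes; it is added here only to show one way of making sure the "reload and hope" loop stays finite.

```typescript
// Hypothetical description of what the currently loaded client bundle exposes.
interface ClientInfo {
  availableSyscalls: Set<string>;
}

type ReloadDecision = "ok" | "reload" | "give-up";

// Decide what a plug should do when the syscalls it needs may be missing
// because the browser is still serving a stale client bundle.
function checkSyscalls(
  client: ClientInfo,
  required: string[],
  reloadsSoFar: number,
  maxReloads: number = 3, // assumed cap, not from the thread
): { decision: ReloadDecision; missing: string[] } {
  const missing = required.filter((name) => !client.availableSyscalls.has(name));
  if (missing.length === 0) {
    return { decision: "ok", missing };
  }
  if (reloadsSoFar >= maxReloads) {
    // Stop reloading and surface an error instead of looping forever.
    return { decision: "give-up", missing };
  }
  return { decision: "reload", missing };
}

// Example: new plug code running against an old client bundle.
const staleClient: ClientInfo = {
  availableSyscalls: new Set(["example.oldSyscall"]),
};
const result = checkSyscalls(
  staleClient,
  ["example.oldSyscall", "example.newSyscall"],
  0,
);
console.log(result.decision, result.missing);
```

The reload counter would have to survive the reload itself (for example in session storage), which is left out of the sketch.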
Generally, if you are annoyed by these things I recommend people stick to the “stable” releases (:latest) rather than the edge builds. The edge builds will be more noisy in this sense.
There could be two sources of reindexes:
I explicitly bumped the required index number, forcing a full reindex on all clients. I try to do this only when I really need to (e.g. when the index logic has changed dramatically which could result in breaking inconsistencies), but I could still be more conservative in requiring this.
Some sort of bug, which would cause this.
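The first source boils down to a version comparison. A minimal sketch, assuming the client stores the index version it last built alongside the index (the function and parameter names are made up for illustration):

```typescript
// Returns true when the locally stored index is missing or older than the
// version the current client code requires, i.e. a full reindex is due.
function needsFullReindex(
  storedIndexVersion: number | undefined,
  requiredIndexVersion: number,
): boolean {
  return (
    storedIndexVersion === undefined ||
    storedIndexVersion < requiredIndexVersion
  );
}

// A client that last indexed at version 6 opens a build requiring version 7.
console.log(needsFullReindex(6, 7));
// A brand-new client has no stored version at all.
console.log(needsFullReindex(undefined, 7));
// An up-to-date client can keep its existing index.
console.log(needsFullReindex(7, 7));
```

Under this model, a reindex that happens without any version bump points at the second source, a bug, which is why knowing whether an upgrade preceded it is useful.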
Such surprise reindexes, do they only happen after you upgrade SilverBullet, or even without an upgrade?
If you can somehow pinpoint when this happens, this would be helpful. Space script is usually temporarily disabled when space reindexes need to happen (and ideally auto enable after a full reindex has completed, but this may break if you start to manually reload). Other than that space lua should work consistently, and if not it would be great to see JavaScript logs to see if we can pinpoint the issue. On Android this is hard, though.
Anyway, any hint as to the scenarios in which this seems to happen would be great: upgrades, reindexes, clients not being opened for a long time, only on particular devices or browsers, etc.
Some general comments on my release/development approach right now that are probably helpful to understand and I’m happy to take feedback on:
SilverBullet is primarily a single-developer project. This means that some good practices I would enforce (e.g. in my “regular” job) are not in place: specifically, developing features on branches, issuing pull requests, getting code reviews. Instead I develop something, test it locally and push it to main directly. A fancy way to describe this would be “trunk based development”. In the past I sometimes worked on a bigger change on a branch and then merged it in one go. Lately I try to do this less, because other people also issue PRs, and then I keep having to merge changes to code that I’m also touching on my branch. Instead, what I try to do now is make smaller commits, straight to main. Since I update my own main instance of SilverBullet with :v2 (edge), I also hope to notice if I break something myself in a real use setting.
From time to time I break things, from time to time I introduce a change that I later change again or revert altogether. This can produce some noise. It can happen your SilverBullet instance will break in some way. Again, I’m on this build myself so I will (hopefully) experience the same and fix it quickly, but it’s a risk. If you don’t like this noise or risk, I recommend staying on the “stable” channel, that is :latest and release builds.
On the topic of auth proxies: this is a super annoying and persistent problem. As of about a month ago I’m behind an auth proxy myself too (Pangolin, highly recommended) and as part of that I have, I think, added better checks and behaviors to deal with auth proxies having auth kick in, or doing other weird things. I also found some cases where errors would occur but not propagate to the UI, so you wouldn’t see them. I hope I fixed those too.
I hope that as of 2.4 this problem should now have been solved, but please let me know if you still run into issues. The challenge here is that there are a bunch of authentication proxies (Authentik, Authelia, CF Zero Trust, Pangolin) mixed with various reverse proxies that each have their own quirks and I do not have the capacity to deploy and test them all…
Yes, this is a concern. In the early days I had an agreement with myself to once every month or so stop everything, and go through essentially every line of code and clean things up based on the current state of things. I have stopped doing that, because the code base has gotten too big.
During the v2 rapture I deleted a lot, but never really managed to go through everything properly and see what still made sense and what didn’t. I sometimes encounter stuff now and use it as an opportunity to redo bits and pieces, but there’s still quite a bit of technical debt left for sure. I have to find a way to do this in a more structured manner.
Yes, I think here you’re hitting on a fundamental choice/problem. I took the “power to the people” route at some point, meaning I offer sharp knives (as they say in the Ruby community). You can do a lot, but you can also break a lot. This does not make my life easier either, because certain things are hard to fix/change because people have come to rely on how things work. I still have to find better ways of communicating such (proposed) changes and implementing them in less disruptive ways.
I commented on the “can sync be done well” thing before. I think it probably can be.
I think the fundamental sync algorithm SB uses right now is “provably correct”, but it does not deal with conflicts at all, beyond making conflicting copies. So if you want to avoid that, another approach needs to be taken. Then there are a number of mechanisms built on top of this algorithm that add complexity and edge cases, and I think there are still issues to be uncovered here. I also still hit cases where the sync engine does things that I cannot explain. It hasn’t really led to data loss, though.
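To illustrate what "no conflict handling beyond conflicting copies" means in practice, here is a generic snapshot-based sync decision. This is a common textbook approach and only an assumption about the general shape; the thread doesn't describe SilverBullet's actual algorithm, and all names are illustrative.

```typescript
type SyncAction = "nothing" | "upload" | "download" | "conflict-copy";

// Content hashes for one file: the local copy, the server copy, and the
// hash recorded the last time both sides were known to agree.
interface FileState {
  localHash: string;
  remoteHash: string;
  lastSyncedHash: string;
}

function decideSync(file: FileState): SyncAction {
  // Both sides already hold the same content, whatever happened in between.
  if (file.localHash === file.remoteHash) return "nothing";

  const localChanged = file.localHash !== file.lastSyncedHash;
  const remoteChanged = file.remoteHash !== file.lastSyncedHash;

  if (localChanged && !remoteChanged) return "upload";
  if (!localChanged && remoteChanged) return "download";

  // Both sides changed in different ways. This scheme cannot merge them,
  // so it keeps both versions by writing one out as a separate copy.
  return "conflict-copy";
}

// Edited on the laptop only: safe to upload.
console.log(decideSync({ localHash: "b", remoteHash: "a", lastSyncedHash: "a" }));
// Edited on two devices since the last sync: a conflict copy is needed.
console.log(decideSync({ localHash: "b", remoteHash: "c", lastSyncedHash: "a" }));
```

The decision logic itself is easy to reason about. The hard parts tend to be everything around it: keeping the snapshot accurate across crashes, reloads and partial syncs.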
As I mentioned, I think a more fundamental rebuild could be done using CRDTs. However, this is too big of a project for me to take on right now.
The last option is to not do sync at all, which I may offer as an option (and you seem to effectively already emulate this behavior today).
I think browsers have improved a lot on this topic over the years. There are few web apps that rely on this tech as heavily as SilverBullet does, and as a result we hit more browser bugs and weird behaviors, that’s true. Were browsers originally built to do this? No. Have they evolved enough that they can handle it? I think probably yes.
And assuming you want to build an offline capable app, with all platforms out there today, web apps are probably the most viable “operating system” to build this type of application, without going native on all platforms or building yet another Electron app.
A lot of these problems would go away if SilverBullet were to pivot (back) to a more traditional server-oriented model. That would be a big project, and we’d lose offline capability.
When you say the API feels unstable, do you mean that APIs don’t respond reliably, or they change all the time?
Happy to hear that synchronization seems to have improved with v2 releases.
As for release regression: yes I hear you on this and I’m not happy about it.
Here’s the issue: over the last three years a lot of things have changed in SilverBullet. I ran into issues or wanted to do things, users had ideas. I implemented some of them, but then later concluded they were the wrong approach. Then, what should I do, keep the old implementation and offer a “fixed” solution in addition, or rip out the old one and force users to switch to the new one?
While I hope that in time this issue will occur less and less, it is a fundamental issue. There’s a bunch of experimental features in SilverBullet and from time to time I have an “aha!” moment of how to approach something differently. However, by that stage people may already have started to rely on it, and I have no way of knowing. I cannot do a remote code scan on your spaces to see what APIs you use and how.
What I’ll try to do:
Keep old APIs working as before as much as possible, but add deprecation warnings (in documentation but also console logs maybe)
Mark experimental ideas as such in the docs, I already have a #maturity/experimental and #maturity/beta tag on silverbullet.md I’ll try to leverage that more. You can use these features, but beware they may break on the next release if I had a new insight on how to do things better/differently. Many projects have clear rules about APIs exposed and whether you can rely on them long term. SilverBullet hasn’t had that, but maybe it’s time.
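The "keep the old API working but warn" idea from the first point could look roughly like the wrapper below. It is written in TypeScript as a language-neutral sketch; the API names are invented, and how this would be exposed to Space Lua is not something the thread specifies.

```typescript
// Wraps an old API function so it keeps working as before, but logs a
// deprecation warning the first time it is called.
function deprecated<Args extends unknown[], R>(
  name: string,
  hint: string,
  fn: (...args: Args) => R,
  warn: (message: string) => void = (message) => console.warn(message),
): (...args: Args) => R {
  let warned = false;
  return (...args: Args): R => {
    if (!warned) {
      warned = true; // warn once per session to avoid flooding the console
      warn(`[deprecated] ${name}: ${hint}`);
    }
    return fn(...args);
  };
}

// Hypothetical old and new APIs; the old one simply forwards to the new one.
const newQuery = (tag: string): string => `results for ${tag}`;
const oldQuery = deprecated(
  "example.oldQuery",
  "use example.newQuery instead",
  newQuery,
);

console.log(oldQuery("meeting")); // warns once, then behaves like newQuery
console.log(oldQuery("todo")); // no second warning
```

Warning only once keeps the console usable for spaces that call a deprecated function on every page render.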
On one hand I like the elegance of the server being pretty “dumb” right now, making it easy to replace with something else or, as some have done, reimplement it in some way. On the other hand, it does mean clients have to do all the heavy lifting, over and over again. I’ve thought about finding more of a middle ground, e.g. by storing the index on the server, but maintaining offline support there would indeed be hard. And replicating/moving indexing logic to the server would require significant architectural changes (again). We’ve been here before with the 0.x series and it was too complex.