The Search function does not support Chinese characters

I self-host SilverBullet on CentOS and use Syncthing to sync all files to my Synology NAS (everything works well). I wanted to use it to replace Trilium Notes, but I just found that SilverBullet’s search (Ctrl-Shift-F) doesn’t support searching for Chinese characters.

I think the reason might be that the search is only matching full words (between spaces), which I assume doesn’t work the way you need for Chinese language. When I search a full phrase it does work for me (same with Polish diacritics).
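To illustrate the guess above, here is a minimal sketch of whole-word matching. It is not SilverBullet's actual search code; `tokenize` and `word_search` are hypothetical names, and the delimiter set is an assumption. Because Chinese text has no spaces between words, the whole sentence becomes one token, so only the full phrase matches.

```python
import re

# Assumed delimiter set: whitespace plus common Western and Chinese punctuation.
DELIMITERS = r"[\s,.;:!?，。！？、；：]+"


def tokenize(text: str) -> set[str]:
    """Split text into lowercase whole-word tokens."""
    return {t.lower() for t in re.split(DELIMITERS, text) if t}


def word_search(query: str, text: str) -> bool:
    """Match only when every query token equals a whole token of the text."""
    return tokenize(query) <= tokenize(text)


english = "I cannot speak Chinese"
chinese = "我不会说中文"

print(word_search("speak", english))         # a whole word, so it matches
print(word_search("不会", chinese))           # part of one long token, no match
print(word_search("我不会说中文", chinese))    # the full phrase is the token
```

The same thing would happen with any partial word in a space-delimited language: searching for part of a token fails, while the whole token succeeds.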

I suppose the Search Plug could be changed to optionally do “dumb” search, just looking for a substring of characters without any of this tokenization logic.
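A "dumb" search of this kind could look roughly like the sketch below. It is only an illustration, not the Search Plug's code: `substring_search` and the `pages` dictionary (page name to page text) are made up for the example. It scans every page for the query as a plain substring, with no tokenization at all.

```python
def substring_search(query: str, pages: dict[str, str]) -> list[str]:
    """Return the names of pages whose text contains the query as a substring."""
    needle = query.casefold()
    if not needle:
        return []  # an empty query would otherwise match every page
    return sorted(
        name for name, text in pages.items() if needle in text.casefold()
    )


# Hypothetical page contents, keyed by page name.
pages = {
    "notes/a": "我不会说中文",
    "notes/b": "今天天气很好",
    "notes/c": "Zażółć gęślą jaźń",
}

print(substring_search("不会", pages))
print(substring_search("会说中", pages))
print(substring_search("gęś", pages))
```

The trade-off is that every search reads every page, so it is slower than an index lookup on a large space, but it works the same way for any script.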

Thanks for your response. You’re right, it seems to work only with a whole sentence or phrase delimited by spaces, commas, full stops, etc.
Taking the phrase you mentioned, “我不会说中文”, we need to be able to search for any run of one, two, or more characters, such as “我”, “不会”, “我不会”, or “会说中”.
I have tried quite a few note-taking apps, such as dokuwiki, mediawiki, wiki.js, boostack, trilium, logseq, obsidian, siyuan, roamedit, remote, OneNote, Evernote, simplenote, synology note, etc. I prefer something self-hosted and web-based, which is how I found SilverBullet. The only thing that bothers me is that the search function does not work well with Chinese. I hope this issue can be solved.

I have the same need. It would be great if I could configure this option myself. Thank you very much :grinning:

I am not familiar with programming. I pasted the code of the search function into Google Gemini, but it could not give any useful suggestions either. It only told me to convert Chinese phrases to Pinyin (similar to alphabet letters) for searching, which would be inaccurate and painful.

Hi there,

I was wondering if there’s any chance that Chinese search support could be implemented in the future?

I’ve been using Silverbullet for a few months now and absolutely love it, but the lack of Chinese search functionality has been a bit of a challenge for me.

I even tried asking ChatGPT how to create such a feature myself. I put in a lot of effort, but since I’m not a programmer, I eventually had to give up.

Thank you!

I implemented a partial solution in which every character is treated as a separate word. I don’t speak Chinese, so I can’t judge whether this makes sense for your usage.

I will explain what I mean by using capital letters instead of actual characters, since that’s the keyboard I have. When you search for “ABC” (if they are ideograms) you would get in the results:

  • “ABC”
  • “CBA”
  • “AB”
  • “xxABxxxxxCx”
  • “xxCxxxBxxx”

This should keep the good performance of word-based search through the entire space, at the cost of giving you many false positives. It could probably work if you search for rare keywords? Again, I don’t know the language well enough to judge myself.
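A rough sketch of the behaviour described above, using capital letters as stand-ins for ideograms. This is not the actual change; `char_search` is a hypothetical name, and ranking results by the number of matched characters is an assumption made for the example. Each query character is treated as its own "word", and a text is returned if it contains any of them.

```python
def char_search(query: str, texts: list[str]) -> list[str]:
    """Return texts sharing at least one character with the query.

    Results are ordered by how many distinct query characters they contain
    (an assumed ranking); ties keep their input order.
    """
    wanted = set(query)
    scored = [(len(wanted & set(text)), text) for text in texts]
    hits = [(score, text) for score, text in scored if score > 0]
    hits.sort(key=lambda pair: -pair[0])  # stable sort keeps tie order
    return [text for _, text in hits]


texts = ["ABC", "CBA", "AB", "xxABxxxxxCx", "xxCxxxBxxx", "xyz"]
print(char_search("ABC", texts))
```

Character order and adjacency are ignored here, which is where the false positives such as “CBA” and “xxCxxxBxxx” come from.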

Thank you very much for your help!
I think a simple substring match will suffice for most of our needs.
For example, when I search for “ABC,” I would like to match all documents containing this substring, such as:

“ABC”
“ABCD”
“XXABCXX”

rather than matching each character as a separate word, such as:
“A B C”
“C B A”

Thanks again for your help!