Grep Plug for alternate search implementation

Hi everyone, inspired by a few posts on the forum, I just implemented an alternative search command via Plug. Features:

  • search substring of a file (instead of word by word)
  • search using a regular expression
  • search inside folder of currently opened page
  • configurable case sensitivity

Result presentation:

  • group by file, sorted by match count
  • link to the matched location (once you get links to line #988)
  • quote the full line that matched the search
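
As a rough illustration of the "group by file, sorted by match count" presentation, here is a minimal sketch. It assumes the search tool prints rows shaped like `path:line:text` (the shape `git grep -n` produces). The names `Match`, `FileResult`, and `groupGrepOutput` are hypothetical and are not taken from the plug's source.

```typescript
// Hypothetical types for illustration; not the plug's actual data model.
interface Match {
  line: number; // 1-based line number reported by the search tool
  text: string; // the full line that matched
}

interface FileResult {
  file: string;
  matches: Match[];
}

// Parses rows of the form "path:line:text" and groups them by file,
// with the files that have the most matches first.
function groupGrepOutput(output: string): FileResult[] {
  const byFile = new Map<string, Match[]>();
  for (const row of output.split("\n")) {
    // Shortest possible path, then a numeric line number, then the line text.
    const m = /^(.+?):(\d+):(.*)$/.exec(row);
    if (!m) continue; // skip blank or unexpected rows
    const [, file, line, text] = m;
    const list = byFile.get(file) ?? [];
    list.push({ line: Number(line), text });
    byFile.set(file, list);
  }
  return [...byFile.entries()]
    .map(([file, matches]) => ({ file, matches }))
    .sort((a, b) => b.matches.length - a.matches.length);
}
```

A file name that itself contains something like `:12:` would be split in the wrong place by this regex, so a real implementation would likely want a safer separator.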

Drawbacks:

  • noticeably slower than the built-in search, since it doesn’t use the index but reads the files instead
  • requires git to be installed on the server, but that’s already part of the Docker image

The repository which includes installation instructions is here:

The Plug system was a pleasure to work with, and using well-tested external software was possible thanks to one of SilverBullet’s superpowers: “it’s just files on disk bro”. Once again, congratulations on the overall design of the app!

11 Likes

It would be fairly trivial to just add ripgrep to the distributed Docker image. We can’t keep adding random CLI tools to it, but if there is demand we can add a few useful ones. That’s why git is there. By the way, have you considered using git grep?

I was planning on using git diff similarly, but didn’t even know git grep exists, thanks!

The only downside would be losing the links to a specific location, since we can’t link to a line number, and I don’t see a byte offset option in git grep.

…Unless I PR in links like GitHub’s, in the form of [[Page#L123]], which would go to the line if you don’t have a # L123 header on the page. Could be useful for other command line tools.

Git grep is pretty nice, also in that it only considers files in git (so no generated stuff) and it’s pretty damn fast.

Yes, this is worth adding I think. It’s potentially ambiguous with headers indeed, but that may be worth it.

How about [[Page@L123]]? That’s consistent with what’s already in SilverBullet, avoids collision with headers, and won’t change behaviour for existing notes. Let’s .gitignore GitHub in this case.

And optionally [[Page@L123C12]] to place the cursor in a given column, or at the start of the line if omitted.
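
To make the proposed syntax concrete, here is a minimal sketch of a parser for links like [[Page@L123]] and [[Page@L123C12]]. It is only an illustration of the idea discussed above, not SilverBullet's actual page reference parser. The names `LineRef` and `parseLineRef` are hypothetical, and treating line and column as 1-based is an assumption.

```typescript
// Hypothetical shape for a parsed link; not SilverBullet's real type.
interface LineRef {
  page: string;
  line?: number; // assumed 1-based
  column?: number; // assumed 1-based; start of line when omitted in the link
}

// Parses "[[Page]]", "[[Page@L123]]" and "[[Page@L123C12]]".
// Returns null for anything else, including other uses of "@".
function parseLineRef(link: string): LineRef | null {
  const m = /^\[\[([^\]@]+)(?:@L(\d+)(?:C(\d+))?)?\]\]$/.exec(link);
  if (!m) return null;
  const ref: LineRef = { page: m[1] };
  if (m[2] !== undefined) {
    ref.line = Number(m[2]);
    // Column defaults to the start of the line, as suggested in the thread.
    ref.column = m[3] !== undefined ? Number(m[3]) : 1;
  }
  return ref;
}
```

Because the page name may contain `#`, a link such as [[Page#L123]] parses as a plain page name here, which sidesteps the header ambiguity mentioned earlier.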

Yes, I later came to think that overloading @ makes sense too. You may have already found this, but the main place to add this is silverbullet/plug-api/lib/page_ref.ts at main · silverbulletmd/silverbullet · GitHub

Not sure how this should be implemented but in Logseq, any bullet you link to gets an id:: uuid added to it and then you can refer to that bullet using ((uuid)). They have a nice UI to hide the uuid and replace it with the bullet text. Sub-bullets show up when you hover. This is non-standard markdown of course.

Not sure if line number is perfect cause changing the note can change the location.

+1 for adding [[Page@L123C12]] syntax. I think line numbers make more sense than byte positions a lot of the time, and look better imo for search results.
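
For comparison between the two addressing schemes, here is a small sketch that converts a line and column into a character offset within a page's text. The function name `offsetOf` is hypothetical, and the 1-based convention is an assumption. Note that the result is an offset in JavaScript string units, which is not the same as a byte offset once the text contains non-ASCII characters.

```typescript
// Converts an assumed 1-based line and column into a 0-based character offset.
// Returns null when the line does not exist in the text.
function offsetOf(text: string, line: number, column = 1): number | null {
  const lines = text.split("\n");
  if (line < 1 || line > lines.length) return null;
  let offset = 0;
  for (let i = 0; i < line - 1; i++) {
    offset += lines[i].length + 1; // +1 for the newline that split() removed
  }
  // Clamp the column so it never runs past the end of the target line.
  const col = Math.min(Math.max(column, 1), lines[line - 1].length + 1);
  return offset + col - 1;
}
```

Either scheme goes stale when the note is edited above the target, which is the concern raised earlier in the thread; line numbers are just easier to read in a list of search results.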

@shashlick In Logseq, do those id:: attributes get added to the actual markdown file? I’m not sure I’d like them being added automatically, but maybe Anchors can be used for something similar? Or something like [[Page?id:blah]] where id could be any attribute.

Yes they get added to the markdown but not visible to users in the app.

File1

* Test bullet
  id:: 61f01da2-6a48-47ae-9c45-7c7a5fce7cc9
  * Sub-bullet content

File1 shows up as:

* Test bullet
  * Sub-bullet content

File2

* Another test
  * ((61f01da2-6a48-47ae-9c45-7c7a5fce7cc9)) - reference

File2 shows up as:

* Another test
  * Test bullet - reference

“Test bullet” above is underlined and can be clicked to take you there. Hover shows you the Sub-bullet content in a popup.
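
The behaviour shown in the File1/File2 example can be approximated with a short sketch. This is not Logseq's implementation, only an illustration under simple assumptions: an `id::` line belongs to the bullet on the line directly above it, and a reference is written as `((id))`. The function names `indexBlockIds` and `renderRefs` are hypothetical.

```typescript
// Builds a map from block id to the text of the bullet that owns it.
// Assumes the "id:: <value>" line directly follows its bullet line.
function indexBlockIds(markdown: string): Map<string, string> {
  const ids = new Map<string, string>();
  const lines = markdown.split("\n");
  for (let i = 1; i < lines.length; i++) {
    const id = /^\s*id::\s*(\S+)\s*$/.exec(lines[i]);
    const bullet = /^\s*\*\s+(.*)$/.exec(lines[i - 1]);
    if (id && bullet) ids.set(id[1], bullet[1]);
  }
  return ids;
}

// Replaces each ((id)) with the referenced bullet text.
// Unknown ids are left untouched.
function renderRefs(markdown: string, ids: Map<string, string>): string {
  return markdown.replace(
    /\(\(([^()\s]+)\)\)/g,
    (whole: string, id: string) => ids.get(id) ?? whole,
  );
}
```

With the two files from the example, indexing File1 and rendering File2 would turn the `((61f01da2-…))` reference into "Test bullet".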

There are very attractive reasons to start putting unique IDs in documents. Even to give unique IDs to pages, or just name them by ID. It would simplify a lot of things. It would also clutter up the markdown files and affect portability. To some degree you’re already locking yourself in to SilverBullet by using any non-standard Markdown features (especially all the templates and queries), but starting to litter documents with IDs would make that even worse.

LogSeq decided to do this anyway (and I think Obsidian also has some sort of block reference feature that uses IDs), which is their choice, but I prefer to avoid it in SilverBullet. You can indeed use Markdown/Anchors, but those are more opt in.

2 Likes

Hey, I just released v2.0.0, which depends only on git and uses the brand-new links to line and column. I updated the original post and screenshot.

1 Like

Just installed this and it’s nice! I may just rebind my full-text search keyboard shortcut to this one :slight_smile:

What I’m not 100% sure about is the GREP RESULT page. This works, but it’s a different pattern from the one used in full-text search (creating a page on the fly). Is that on purpose?

I did start with this solution because it was simpler to set up, but afterwards I noticed I like it this way:

The regex pattern can get quite elaborate, essentially a program of its own that I debug a bit before it does what I want. Also, this isn’t as instantaneous as your index-based search. For both of these reasons I found it nice to be able to see the page again after I browse one of the results, like a single-entry command history. And thanks to Meta Pages I don’t feel that I’m cluttering the space this way.

2 Likes

You might want to recommend adding it to .gitignore in the README. I just realized I’ve been committing multiple versions of the GREP RESULTS page to git.

Maybe SB can have a standard _tmp directory which is added to .gitignore by default, similar to _plug.

How about a solution like this, since you’re already the second person asking about the feature:

Roadmap:

  • Setting to make the results page a temporary file (like the included Search plug)

Not sure what exactly you mean by temporary file in this context. Does SB store it separately or in-mem?

He means like the current search pages, which are virtual, created on the fly and not kept on disk, for instance: 🔍 note

3 Likes

I really like this plugin, and it also solves the issue of Chinese search. I have a little suggestion regarding the layout of the search results:

Original Layout

Suggested Layout

We only need to keep the name of the source page for each search result. We can give the page a |🔗 alias, which will maintain the functionality of jumping to the corresponding position upon clicking.

There should be a dividing line between different sources to make it easier to read.

Just letting you know that I released version 2.3.0, the screenshot below shows new features.

  • Multiple results in one line are now counted properly and emphasized with configurable markers
  • Shows the result in a virtual page by default as requested by @zef and @meain
  • Uses Space Config now
  • More readable results presentation, thanks to @daydaya’s suggestions :slight_smile:
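
The first item in the list (counting several matches in one line and wrapping each in markers) could look roughly like the sketch below. It is not the plug's code: the function name `emphasize` is hypothetical, and the default markers simply mirror the `>>>` and `<<<` mentioned for the `surround` setting.

```typescript
// Wraps every match of `pattern` in `line` with the given markers and
// reports how many matches were found.
function emphasize(
  line: string,
  pattern: RegExp,
  left = ">>>",
  right = "<<<",
): { text: string; count: number } {
  // Force the global flag so that all matches in the line are visited.
  const flags = pattern.flags.includes("g") ? pattern.flags : pattern.flags + "g";
  const re = new RegExp(pattern.source, flags);
  let count = 0;
  const text = line.replace(re, (match: string) => {
    if (match === "") return match; // ignore empty matches
    count++;
    return left + match + right;
  });
  return { text, count };
}
```

Passing `"=="` for both markers gives the Markdown highlight style that the configuration in this thread uses.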

14 Likes
grep:
  # by default shows results in a virtual page, like the built-in search
  saveResults: true

  # visually distinguish >>>matched part<<< in shown context
  surround:
    left: "=="
    right: "=="

I’m very comfortable with this, having found that changing the surround markers from >>> and <<< to == lets me highlight the sought word in the results:
image

Still not quite sure what the saveResults setting does though. What’s a “virtual page”?

2 Likes