Who is using `paragraph` for queries?

I’m back with another one of those “is anybody using this, or can I remove it?” type questions.

A while ago somebody contributed paragraph indexing (so you can write queries like `query[[from index.tag "paragraph" where ...]]`). I never saw a use case for this myself, but accepted it for consistency (it makes sense to index all the things). The thing with paragraph indexing is that it significantly increases the index size, especially if you write a lot of prose (= paragraphs). As the name implies, a full copy of the paragraph text is stored in the index (and there’s already a copy in the synced files database), and it makes additional copies for every tag you put in it.

The other slight annoyance this causes is that it’s a bit counterintuitive which tags apply to paragraphs and which to pages. The rule is that if a paragraph contains ONLY tags, those tags apply to the page; if there’s also text, they apply to that paragraph only. This can be explained, but it is not great and not how many other note-taking apps interpret tags inside paragraphs.

So, my question is: who is using paragraph-based queries, and if you do, what for, and what do those queries look like? Perhaps we can find alternative or more optimal solutions.

Ctrl + S finds that the virtual-tag page is using “paragraph” as itags.

:man_raising_hand: Never used it. Didn’t even know it existed :sweat_smile:

(I use it for some listings, although not critical.)

I understand that indexing full paragraphs leads to a heavyweight index, but I wouldn’t drop it. Instead, what if we made certain indices optional? I think that would be the best solution. Or we could optimize it (index only page + from/to and tags for this area). I could then at least use the page + from/to to extract tagged paragraphs manually (which is better than nothing). We could provide tooling for this too.

It might also be interesting to run a survey here and ask people which indices they use, or can imagine using, in general. Whatever the majority of users use would become the set of indices turned on by default, and the rest would be opt-in.

How difficult would it be to make some indices optional? @zef

Not at all, it’s already there actually: there’s an `index.paragraph.all` config you can set to false right now to stop indexing all paragraphs. We can default that to false. I just thought we could simplify more by removing it altogether (also to simplify the “what does this hash tag refer to?” question).
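
For anyone who wants to try that setting, here is a minimal sketch of what it might look like in a Space Lua block. The key `index.paragraph.all` comes from the post above; applying it through `config.set` with a dotted path is an assumption here, so check the configuration docs for your version.

```lua
-- Assumption: config.set accepts a dotted path; the key name is taken from the post above.
-- Stop indexing every paragraph to keep the index smaller.
config.set("index.paragraph.all", false)
```

Paragraph objects that are already indexed will probably stay in the index until the space is reindexed.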

I use it to compile information that usually ends up scattered around my space.

For instance, I keep pages for meeting notes split by people (so, I have a bunch of People/<name> pages) and, sometimes, during these meetings I get feedback about work I delivered recently. So I add #feedback to that paragraph.

When it’s performance review season at my job, I use queries like this to refer back to the feedback I got and its relevant context:

query[[
  from index.tag "paragraph"
  where table.includes(_.tags, "feedback") and
        table.includes(_.itags, "project_name")
  order by _.ref
]]

That being said, if you feel indexing paragraphs is not worth the maintenance burden, it’s ok to remove it.

I think it’s a pretty cool feature, but it’s not a make or break thing for me.

It’s not a maintenance burden, it just adds a lot of size to the database. From your example @laurybueno I see that you’re effectively only searching for tagged paragraphs. I think that could be a compromise: index only paragraphs that contain a tag by default, and offer a config to index all.

That sounds great to me!

@zef I am not convinced about this at all!

How would we construct a query showing all paragraphs containing, e.g., some particular term or phrase? I commonly manipulate text in Lua (e.g., “smart” truncating) for special review queries/pages. Also, a global search like SilverSearch (which is otherwise a great tool) is simply not the same thing, and it is insufficient for my particular needs and aesthetics.

You simply cannot write every unique word in each paragraph as a tag. So I strongly oppose the variant that indexes only tagged paragraphs; for me it’s a real “NO-GO”. If indexing all paragraphs is opted out of (the new default, as you also proposed), the variant with indexing is a natural “GO” as the new default. It makes sense to me: lightweight default, heavy opt-in full paragraph indexing. That’s a perfect compromise, IMHO. :slightly_smiling_face:

Ok I’m a bit confused about that last part: you’re saying you’re ok with no paragraph indexing by default, and full indexing as opt in, regardless of tags?

I do not use the feature.

@zef

  • default: index tagged paragraphs only
  • opt-in: index all paragraphs

(And if the option changes, then a full reindex would be the best solution to avoid unnecessary code complexity… ?)

This might be a bit personal (sorry in advance): I use it throughout my journal, and it is one of the main reasons I picked SB over Obsidian (yes, Dataview/Datacore exist, but they are painfully slow). My primary use is simple: I tag good moments and have a simple query on a monthly summary page to display those moments.

In the past I have used it to help me track various health-related items in a similar fashion: tag a comment, write a query in a note related to that “thing” (which usually contains a lot of other information), export, and hand it over to my doctors.

I run two spaces and I only use this feature in my personal space.

Thank you, I’ve just done this and my indexing got much faster.

I agree with others that have said that they’re okay making it optional and setting the default to false. I like having the option, but I agree that for most it’s a limited use case.

Thank you Zef!

Agreed, this is doing absolute wonders on my mobile device.