-
-
Notifications
You must be signed in to change notification settings - Fork 711
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Mechanism for ranking results from SQLite full-text search #268
Comments
I did a bunch of research relevant to this a while ago: https://simonwillison.net/2019/Jan/7/exploring-search-relevance-algorithms-sqlite/ |
I want this on the table page - but that means that the table page will need to run a slightly more complex query since it needs access to a BUT... that join needs to be constructed in a way that keeps existing filters, with original as (
select
rowid,
*
from
items
)
select
original.*,
items_fts.rank as items_fts_rank
from
original
join items_fts on original.rowid = items_fts.rowid
where
items_fts match escape_fts(:search)
order by
items_fts_rank desc
limit
10 |
Worth noting that joining to get the rank works for FTS5 but not for FTS4 - see comment here: simonw/sqlite-utils#192 (comment) Easiest solution would be to only support sort-by-rank for FTS5 tables. Alternative would be to depend on https://github.com/simonw/sqlite-fts4 |
I should depend on |
Part of the challenge here is that this is the first time the datasette/datasette/views/table.py Lines 628 to 636 in 5eb8e9b
|
In FTS5, I think doing an FTS search is actually much easier than doing a join against the main table like datasette does now. In fact, FTS5 external content tables provide a transparent interface back to the original table or view. Here's what I'm currently doing:
Unfortunately, datasette doesn't currently seem happy being coerced into doing a real query on an fts5 table. This works: But this doesn't work in the datasette SQL query interface: For what datasette is doing right now, I think you could just use contentless fts5 tables ( I guess if you want to follow this suggestion, you'd need a somewhat different code path for fts5. |
It's kind of an ugly hack, but you can try out what using the fts5 table as an actual datasette-accessible table looks like without changing any datasette code by creating yet another view on top of the fts5 table:
That's now visible from datasette, just like any other view, but you can use This is only good as a proof of concept because you're inefficiently going from view -> fts5 external content table -> view -> data table. However, it does show it works. |
That's a deliberate feature (albeit controversial, see #759) - part of the main problem here is that it's easy to construct a SQLite full-text search string which results in a database error. This is a bad user-experience! You can opt-in to raw SQL queries by appending But maybe there should be an option for turning that on by default without needing the query string? |
I do like the idea of there being a option for turning that on by default so that you could use those terms in the default "Search" bar presented when you browse to a table where FTS has been enabled. Maybe even a small inline pop up with a short bit explaining the FTS feature and the keywords (e.g. case matters). What are the side-effects of turning that on in the query string, or even by default as you suggested? I see that you stated in the docs... "to ensure they do not cause any confusion for users who are not aware of them", but I'm not sure what those could be. Isn't it the case that those keywords are only picked up by sqlite in where you're using the MATCH clause? Seems like a really powerful feature (even though there are a lot of hurdles around setting it up in the sqlite db ... sqlite-utils makes that so simple by the way!) |
Mainly that it's possible to generate SQL queries that crash with an error. This was the example that convinced me to default to escaping:
|
... though interestingly I can't replicate that error on That's running https://latest.datasette.io/-/versions SQLite 3.35.4 whereas https://www.niche-museums.com/-/versions is running 3.27.2 (the most recent version available with Vercel) - but there's nothing in the SQLite changelog between those two versions that suggests changes to how the FTS5 parser works. https://www.sqlite.org/changes.html |
This isn't particularly straight-forward - all the more reason for Datasette to implement it for you. This article is helpful: http://charlesleifer.com/blog/using-sqlite-full-text-search-with-python/
The text was updated successfully, but these errors were encountered: