TopK parameter not working for values larger than the default (Elasticsearch), only working as a client-side limit #1938

Open
jcgouveia opened this issue Dec 15, 2024 · 1 comment · May be fixed by #2025

Comments

@jcgouveia

When creating a request for a query using the Elasticsearch store:

		SearchRequest request = SearchRequest.query(query);
		List<Document> docs = store.similaritySearch(request);

The default query size from ES is 10 results
SearchRequest.query(query) => 10 results

If I set a smaller value, e.g. 5:
SearchRequest.query(query).withTopK(5); => 5 results (OK)

If I set a larger value, e.g. 100:
SearchRequest.query(query).withTopK(100); => 10 results (not OK, I was expecting 100)

This happens because the size parameter of the ES SearchRequest (co.elastic.clients.elasticsearch.core.SearchRequest) is never set; topK is applied only after the query results come back, which I think is wrong.
Running the query directly against the ES _search REST API, I get the intended response of 100 results (although this is not a similarity query), e.g.:

{
  "query": {
    "match_all": {}
  },
  "size": 100
}

If I change the value of SearchRequest.size to 100 in the debugger, I also get the intended result.
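
To make the difference concrete, here is a minimal, self-contained sketch of the two behaviours. It does not use Spring AI or the Elasticsearch client: `TopKSketch`, `serverSearch`, `clientSideTopK` and `serverSideTopK` are hypothetical names, and the 250 matching documents are an assumption chosen only for illustration. The only fact taken from the report above is that the server returns 10 hits when no size is sent.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical stand-in for the scenario in this issue; not real Spring AI or Elasticsearch code.
class TopKSketch {
    static final int DEFAULT_SIZE = 10;   // hits returned when "size" is not set
    static final int INDEXED_DOCS = 250;  // assumed number of matching documents

    // Simulates the server: returns at most `size` hits, or the default when size is null.
    static List<Integer> serverSearch(Integer size) {
        int limit = Math.min(size == null ? DEFAULT_SIZE : size, INDEXED_DOCS);
        return IntStream.range(0, limit).boxed().collect(Collectors.toList());
    }

    // Behaviour described above: size is never sent, topK only truncates the response.
    static List<Integer> clientSideTopK(int topK) {
        return serverSearch(null).stream().limit(topK).collect(Collectors.toList());
    }

    // Expected behaviour: topK is forwarded as the request size.
    static List<Integer> serverSideTopK(int topK) {
        return serverSearch(topK);
    }

    public static void main(String[] args) {
        System.out.println(clientSideTopK(5).size());    // 5
        System.out.println(clientSideTopK(100).size());  // 10
        System.out.println(serverSideTopK(100).size());  // 100
    }
}
```

With client-side truncation, topK can only shrink the default page of 10 hits and never grow it, which matches the 5 / 10 results I observe.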

@karol-antczak

I have the same problem. I ended up writing a custom VectorStore implementation.
