ChromaVectorStore write() succeeds in Spring Boot 3.3.4, fails in Spring Boot 3.4.1 #2019

Open
cpage-pivotal opened this issue Dec 29, 2024 · 6 comments

@cpage-pivotal

Bug description
I'm running code that creates embeddings and writes them to a Chroma vector store. The code works on Spring Boot 3.3.4, but after I upgrade the app to 3.4.1, the write fails with:

2024-12-29T11:37:29.960-06:00 ERROR 72456 --- [legweb] [         task-1] .a.i.SimpleAsyncUncaughtExceptionHandler : Unexpected exception occurred invoking async method: public void org.knowyourgov.legweb.bill.BillIndexService.embed(java.lang.String)

org.springframework.web.client.HttpClientErrorException$UnprocessableEntity: 422 Unprocessable Entity: "{"detail":[{"type":"missing","loc":["body"],"msg":"Field required","input":null}]}"
	at org.springframework.web.client.HttpClientErrorException.create(HttpClientErrorException.java:133) ~[spring-web-6.2.1.jar:6.2.1]
	at org.springframework.web.client.StatusHandler.lambda$defaultHandler$3(StatusHandler.java:86) ~[spring-web-6.2.1.jar:6.2.1]
	at org.springframework.web.client.StatusHandler.handle(StatusHandler.java:146) ~[spring-web-6.2.1.jar:6.2.1]
	at org.springframework.web.client.DefaultRestClient$DefaultResponseSpec.applyStatusHandlers(DefaultRestClient.java:826) ~[spring-web-6.2.1.jar:6.2.1]
	at org.springframework.web.client.DefaultRestClient$DefaultResponseSpec.lambda$toBodilessEntity$3(DefaultRestClient.java:789) ~[spring-web-6.2.1.jar:6.2.1]
	at org.springframework.web.client.DefaultRestClient$DefaultRequestBodyUriSpec.exchangeInternal(DefaultRestClient.java:574) ~[spring-web-6.2.1.jar:6.2.1]
	at org.springframework.web.client.DefaultRestClient$DefaultRequestBodyUriSpec.exchange(DefaultRestClient.java:535) ~[spring-web-6.2.1.jar:6.2.1]
	at org.springframework.web.client.RestClient$RequestHeadersSpec.exchange(RestClient.java:677) ~[spring-web-6.2.1.jar:6.2.1]
	at org.springframework.web.client.DefaultRestClient$DefaultResponseSpec.executeAndExtract(DefaultRestClient.java:809) ~[spring-web-6.2.1.jar:6.2.1]
	at org.springframework.web.client.DefaultRestClient$DefaultResponseSpec.toBodilessEntity(DefaultRestClient.java:787) ~[spring-web-6.2.1.jar:6.2.1]
	at org.springframework.ai.chroma.vectorstore.ChromaApi.upsertEmbeddings(ChromaApi.java:182) ~[spring-ai-chroma-store-1.0.0-M5.jar:1.0.0-M5]
	at org.springframework.ai.chroma.vectorstore.ChromaVectorStore.doAdd(ChromaVectorStore.java:182) ~[spring-ai-chroma-store-1.0.0-M5.jar:1.0.0-M5]
	at org.springframework.ai.vectorstore.observation.AbstractObservationVectorStore.lambda$add$1(AbstractObservationVectorStore.java:91) ~[spring-ai-core-1.0.0-M5.jar:1.0.0-M5]
	at io.micrometer.observation.Observation.observe(Observation.java:498) ~[micrometer-observation-1.14.2.jar:1.14.2]
	at org.springframework.ai.vectorstore.observation.AbstractObservationVectorStore.add(AbstractObservationVectorStore.java:91) ~[spring-ai-core-1.0.0-M5.jar:1.0.0-M5]
	at org.springframework.ai.vectorstore.VectorStore.accept(VectorStore.java:53) ~[spring-ai-core-1.0.0-M5.jar:1.0.0-M5]
	at org.springframework.ai.vectorstore.VectorStore.accept(VectorStore.java:38) ~[spring-ai-core-1.0.0-M5.jar:1.0.0-M5]
	at org.springframework.ai.document.DocumentWriter.write(DocumentWriter.java:30) ~[spring-ai-core-1.0.0-M5.jar:1.0.0-M5]

Environment
Spring AI 1.0.0-M5, ChromaDB 0.5.20

Expected behavior
The app should continue to work after upgrading to Spring Boot 3.4.1.

Minimal Complete Reproducible example
Code snippet that triggers issue:

        Resource resource = new ByteArrayResource(billText.getFullText().getBytes());
        TextReader reader = new TextReader(resource);
        List<Document> split = tokenSplitter.split(reader.read());
        for (Document document : split) {
            document.getMetadata().put(BILL_METADATA, billText.getBillId());
        }
        vectorStore.write(split);

Bofutw commented Jan 1, 2025

I’m experiencing the same issue. Could it possibly be related to RestClient?

@ilayaperumalg ilayaperumalg self-assigned this Jan 2, 2025
ilayaperumalg added a commit to ilayaperumalg/spring-ai-tests that referenced this issue Jan 2, 2025
ilayaperumalg added a commit to ilayaperumalg/spring-ai-tests that referenced this issue Jan 2, 2025
@ilayaperumalg
Member

Hi @cpage-pivotal @Bofutw, it would be helpful to have a GitHub example to investigate this issue. I just tried to replicate the issue via this example, and it works fine with Spring Boot 3.4.1. Let me know what's missing in my test to reproduce the issue. Thanks.

@cpage-pivotal
Author

This repo replicates the problem. Hit the /bill/text endpoint. Upgrade from Spring Boot 3.3.4 to Spring Boot 3.4.1 to trigger the bug.

https://github.com/cepage/spring-ai-chroma-bug

@markpollack
Member

Just a note: if you drop back to Chroma 0.5.15, this works with Spring Boot 3.4.1 and Spring AI 1.0.0-M5.

@ilayaperumalg ilayaperumalg added the invalid This doesn't seem right label Jan 9, 2025
@ilayaperumalg
Member

Closing this as invalid: I learned that the issue came from the underlying code passing an empty document list in some cases.

@cpage-pivotal
Author

Just to clarify, the code passed an empty document list only in the case where you weren't able to replicate the bug: specifically, when you passed a String of 5 or fewer characters, like "hello".

The bug is reliably replicated in any real-world use case, where the string being tokenized is longer than 5 characters.
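
For the separate empty-list case mentioned above, a caller-side guard might look like the sketch below. It is only an illustration: `EmptyBatchGuard` and `writeIfNotEmpty` are hypothetical names, not part of Spring AI, and the helper does nothing about the 422 seen with non-empty input on Spring Boot 3.4.1.

```java
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper (not part of Spring AI): skips a write when there is nothing to write. */
final class EmptyBatchGuard {

    private EmptyBatchGuard() {
    }

    /**
     * Hands the documents to the writer only if the list is non-null and non-empty.
     *
     * @return true if the writer was invoked, false if the batch was skipped
     */
    static <T> boolean writeIfNotEmpty(List<T> documents, Consumer<List<T>> writer) {
        if (documents == null || documents.isEmpty()) {
            // Nothing to send, so no request is made at all.
            return false;
        }
        writer.accept(documents);
        return true;
    }
}
```

Assuming the vector store can be used as a `Consumer<List<Document>>`, as the `VectorStore.accept` frames in the stack trace suggest, the reproduction snippet could call `EmptyBatchGuard.writeIfNotEmpty(split, vectorStore)` in place of `vectorStore.write(split)`.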

@ilayaperumalg ilayaperumalg reopened this Jan 20, 2025