High concurrency can cause the result generation to fail, and some fields in the result file may be empty. #51
Comments
This system is built by the company using a third-party Azure service, and I've noticed a curious phenomenon: I suspect that the actual concurrency level of the requests is not what I set (it might depend on the number of …). Is there a way to limit the actual concurrency? I also noticed in the log output that some data progress bars show as completed, but they still print …
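A client-side way to bound the effective concurrency, independent of whatever the Azure deployment does, is to gate every request through a semaphore. A minimal sketch, where `fetch_embedding` is a hypothetical stand-in for the real embedding request:

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the real embedding request.
def fetch_embedding(text: str) -> list[float]:
    return [float(len(text))]

# Cap the number of in-flight requests regardless of the thread-pool size.
MAX_IN_FLIGHT = 2
_sem = threading.Semaphore(MAX_IN_FLIGHT)

def fetch_embedding_limited(text: str) -> list[float]:
    with _sem:  # blocks while MAX_IN_FLIGHT requests are already running
        return fetch_embedding(text)

with ThreadPoolExecutor(max_workers=10) as pool:
    # pool.map preserves input order even though calls run concurrently
    results = list(pool.map(fetch_embedding_limited, ["a", "bb", "ccc"]))
```

Even with `max_workers=10`, at most `MAX_IN_FLIGHT` requests hit the service at once, which is useful when the backend's real limit is lower than the thread count.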
I see, then I am not sure how it is handled. I do agree it's probably dependent on the concurrency limit of the Azure service. If it's not possible for you to switch to the default OpenAI service (which we use to reproduce the results), my suggestion is to run it instance by instance with some timeout in between, or to implement some caching (if even one instance is too many tokens for the interface to handle).
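The instance-by-instance idea with a pause and a simple on-disk cache could be sketched like this (`run_instance`, the `cache/` directory, and the JSON layout are all assumptions for illustration, not the repo's actual mechanism):

```python
import json
import time
from pathlib import Path

# Hypothetical stand-in for running one benchmark instance.
def run_instance(instance_id: str) -> dict:
    return {"instance_id": instance_id, "result": "ok"}

CACHE_DIR = Path("cache")
CACHE_DIR.mkdir(exist_ok=True)

def run_with_cache(instance_id: str, delay: float = 1.0) -> dict:
    cache_file = CACHE_DIR / f"{instance_id}.json"
    if cache_file.exists():  # skip instances we already processed
        return json.loads(cache_file.read_text())
    result = run_instance(instance_id)
    cache_file.write_text(json.dumps(result))
    time.sleep(delay)  # pause between instances to stay under rate limits
    return result
```

Because results are cached per instance, a run that dies partway through can be restarted and will only re-issue requests for the instances that are still missing.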
Yes, the number of embedding tokens is only accurate when the number of threads is set to 1 (this is specified in the help documentation). The reason is again a concurrency issue: the embedding token count gets overwritten once we use multiple threads. To ensure accurate counting, you can use `--num_threads 1`.
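The overwriting described here is a classic read-modify-write race; guarding the counter with a lock would keep it accurate under multiple threads. A generic sketch of the pattern, not the repo's actual counter:

```python
import threading
from concurrent.futures import ThreadPoolExecutor

token_count = 0
_lock = threading.Lock()

def add_tokens(n: int) -> None:
    global token_count
    with _lock:  # serialize the read-modify-write so no update is lost
        token_count += n

with ThreadPoolExecutor(max_workers=8) as pool:
    list(pool.map(add_tokens, [10] * 100))

print(token_count)  # 1000 with the lock; without it, updates could be lost
```

Without the lock, two threads can both read the same old value, each add their tokens, and write back, so one thread's increment silently disappears, which matches the undercounting seen with multiple threads.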
I tried setting it and compared my results with the official ones: …
It's definitely possible for there to be slight differences across runs due to non-deterministic embeddings. Also, it doesn't seem like there is any difference in the last example you posted.
The `--num_threads` parameter has puzzled me for a long time. When set to 1, the results are vastly different from when set to 10, with the file size alone differing by several times. This issue occurs with both the embedding and LLM interfaces. In the code, requests that fail due to excessive concurrency are simply written out as empty, without any error being raised. These empty responses are then combined with the partially successful ones, causing the output to almost completely mismatch the provided result files.

Especially when calling the embedding interface, even though I set `--num_threads=1`, I passed the maximum possible values for `max_retries` and `timeout` as follows:

```python
embed_model = OpenAIEmbedding(model_name="text-embedding-3-small", max_retries=120, timeout=300)
```
https://github.com/OpenAutoCoder/Agentless/blob/5ce5888b9f149beaace393957a55ea8ee46c9f71/agentless/fl/Index.py#L260
This is caused by the embedding interface response failure, which returns `None`. When encountering the above error, I can only keep trying with larger `max_retries` and `timeout` values. Can you suggest a more elegant fix for this issue?
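One possible fix along these lines is to retry with backoff and raise once the response is still `None`, instead of silently writing an empty field into the result file. A sketch, where `get_embedding` is a hypothetical stand-in assumed to return `None` on failure:

```python
import time

class EmbeddingError(RuntimeError):
    """Raised when the embedding service keeps failing after retries."""

def get_embedding(text: str):
    # Hypothetical stand-in; the real call is assumed to return None on failure.
    return [0.0] if text else None

def get_embedding_checked(text: str, retries: int = 3, backoff: float = 0.0):
    for attempt in range(retries):
        result = get_embedding(text)
        if result is not None:
            return result
        time.sleep(backoff * (2 ** attempt))  # exponential backoff between retries
    # Fail loudly instead of writing an empty field into the result file.
    raise EmbeddingError(f"embedding failed after {retries} retries for: {text!r}")
```

Failing loudly at the point of the `None` response means a bad run aborts immediately rather than producing a result file that silently mismatches the official outputs.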