Poor performance of Ansible operator after upgrade to v1.31.0 #17

Open
jtruskow opened this issue Sep 18, 2023 · 4 comments

Comments

@jtruskow

Type of question

General operator-related help

Question

Poor performance of Ansible operator v1.31.0

What did you do?

Upon updating the operator to the latest version (v1.31.0) I'm seeing a serious performance degradation. The reconcile loop takes ~5x longer to complete compared to v1.30.0.

Everything remains the same, except for the operator SDK version (and the change to Python 3.9). I followed the upgrade guide here: https://sdk.operatorframework.io/docs/upgrading-sdk-version/v1.31.0/

I don't think it's a bug in the operator-sdk because I installed the memcached operator using both v1.30.0 and v1.31.0 on my cluster and they perform similarly. I'm hoping to get some advice on how to debug this further.

I've attached a file showing a diff of runtime for each task. The tasks are pretty consistently slower (no single task or small group of tasks is responsible for the slowdown).

sdkv1.31.0_speech_timediffs.txt
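
For anyone who wants to run the same kind of comparison on their own operator, here is a minimal sketch of how per-task runtimes from two runs could be compared to see whether the slowdown is uniform or driven by a few tasks. The task names and timings are hypothetical placeholders, not values from the attached file, and the helper names (`slowdown_ratios`, `outliers`) and the 0.5 tolerance are assumptions made for illustration.

```python
from statistics import median


def slowdown_ratios(baseline, candidate):
    """Return {task: candidate_seconds / baseline_seconds} for tasks present in both runs."""
    return {
        task: candidate[task] / baseline[task]
        for task in baseline
        if task in candidate and baseline[task] > 0
    }


def outliers(ratios, tolerance=0.5):
    """Return tasks whose ratio differs from the median ratio by more than `tolerance` (relative)."""
    mid = median(ratios.values())
    return {task: r for task, r in ratios.items() if abs(r - mid) / mid > tolerance}


# Hypothetical per-task timings in seconds (illustrative only, not from the attachment).
v1_30 = {"create configmap": 1.0, "deploy service": 2.0, "wait for rollout": 4.0}
v1_31 = {"create configmap": 5.0, "deploy service": 10.0, "wait for rollout": 20.0}

ratios = slowdown_ratios(v1_30, v1_31)
print("median slowdown:", median(ratios.values()))
print("outlier tasks:", outliers(ratios))
```

An empty outlier set alongside a large median ratio points at a slowdown common to every task (for example, per-task overhead in the runtime) rather than at one expensive task.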

What did you expect to see?

Similar runtimes across tasks

What did you see instead? Under which circumstances?

Significant slowdown on latest version (v1.31.0)

Environment

Operator type:

/language ansible

Kubernetes cluster type:

OpenShift 4.10 and OpenShift 4.12

$ operator-sdk version

operator-sdk version: "v1.31.0", commit: "e67da35ef4fff3e471a208904b2a142b27ae32b1", kubernetes version: "1.26.0", go version: "go1.19.11", GOOS: "darwin", GOARCH: "arm64"

$ kubectl version

Client Version: version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.4", GitCommit:"872a965c6c6526caa949f0c6ac028ef7aff3fb78", GitTreeState:"clean", BuildDate:"2022-11-09T13:36:36Z", GoVersion:"go1.19.3", Compiler:"gc", Platform:"darwin/arm64"}
Kustomize Version: v4.5.7
Server Version: version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.10+8c21020", GitCommit:"379f6fe03321f9149edea7f20e11ce88f8d99c25", GitTreeState:"clean", BuildDate:"2023-06-12T16:07:59Z", GoVersion:"go1.19.9", Compiler:"gc", Platform:"linux/amd64"}

Additional context

@everettraven
Collaborator

everettraven commented Sep 22, 2023

@jtruskow Thanks for raising this issue! v1.31.0 of the ansible-operator base image uses Ansible 2.15.0 instead of 2.9.z. I'm not familiar with the performance impact of that change, so I reached out to some folks I know who are involved with the Ansible project. They mentioned that ansible/ansible#81643 may be a culprit here and that a fix is expected to release as part of Ansible 2.15.5 on October 9th.

@jtruskow
Author

@everettraven Awesome! That seems like a likely culprit. I assume once that is fixed upstream, we'll need to wait for another operator-sdk release to pick it up.

I wasn't able to find a roadmap. Is there any plan in the works for 1.31.1 or 1.32.0?

@everettraven
Collaborator

There is a v1.32.0 release in the works, but I don't have an ETA for when that release is coming or whether it will include the fixed Ansible version.

@openshift-ci

openshift-ci bot commented Oct 5, 2023

@jtruskow: The label(s) language/ansible cannot be applied, because the repository doesn't have them.


@everettraven everettraven transferred this issue from operator-framework/operator-sdk Oct 5, 2023