Standardize the Cloud Controller Manager Build/Release Process #36

Open
andrewsykim opened this issue Jun 18, 2019 · 18 comments
Labels
help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. P1 Priority 1

@andrewsykim
Member

andrewsykim commented Jun 18, 2019

Right now each provider is building/releasing the external cloud controller manager in their own way. It might be beneficial to standardize this going forward or at least set some guidelines on what is expected from a cloud controller manager build/release.

Some questions to consider:

  • What should a CCM release include? Docker image? Binaries? Source Code?
  • What base images are acceptable for a CCM build? Does it even matter?

We've had this discussion multiple times at KubeCons and on SIG calls; it would be great to get some of those ideas vocalized here and formalize them in a doc going forward.

cc @cheftako @jagosan @hogepodge @frapposelli @yastij @dims @justaugustus

@NeilW

NeilW commented Jun 20, 2019

First thing to sort out is how to update the modules, so that Go module updates work correctly.

The standard main.go has dependencies on 'k8s.io/kubernetes' and 'k8s.io/component-base'.

Component base isn't semantically versioned properly, and fetching the main kubernetes module causes a load of version failures, because the 'replace' entries in its go.mod that redirect to the staging directories don't apply when it is consumed from an external module.

@NeilW

NeilW commented Jun 20, 2019

$ go get k8s.io/[email protected]
go: finding k8s.io/apiextensions-apiserver v0.0.0
go: finding k8s.io/apiserver v0.0.0
go: finding k8s.io/kube-proxy v0.0.0
go: finding k8s.io/cloud-provider v0.0.0
go: finding k8s.io/kube-scheduler v0.0.0
go: finding k8s.io/cluster-bootstrap v0.0.0
go: finding k8s.io/csi-translation-lib v0.0.0
go: finding k8s.io/client-go v0.0.0
go: finding k8s.io/kubelet v0.0.0
go: finding k8s.io/sample-apiserver v0.0.0
go: k8s.io/[email protected]: unknown revision v0.0.0
go: k8s.io/[email protected]: unknown revision v0.0.0
go: k8s.io/[email protected]: unknown revision v0.0.0
go: k8s.io/[email protected]: unknown revision v0.0.0
go: k8s.io/[email protected]: unknown revision v0.0.0
go: k8s.io/[email protected]: unknown revision v0.0.0
go: k8s.io/[email protected]: unknown revision v0.0.0
go: k8s.io/[email protected]: unknown revision v0.0.0
go: k8s.io/[email protected]: unknown revision v0.0.0
go: finding k8s.io/apimachinery v0.0.0
go: finding k8s.io/kube-controller-manager v0.0.0
go: finding k8s.io/kube-aggregator v0.0.0
go: finding k8s.io/metrics v0.0.0
go: k8s.io/[email protected]: unknown revision v0.0.0
go: finding k8s.io/code-generator v0.0.0
go: finding k8s.io/cri-api v0.0.0
go: finding k8s.io/legacy-cloud-providers v0.0.0
go: finding k8s.io/component-base v0.0.0
go: finding k8s.io/cli-runtime v0.0.0
go: finding k8s.io/api v0.0.0
go: k8s.io/[email protected]: unknown revision v0.0.0
go: k8s.io/[email protected]: unknown revision v0.0.0
go: k8s.io/[email protected]: unknown revision v0.0.0
go: k8s.io/[email protected]: unknown revision v0.0.0
go: k8s.io/[email protected]: unknown revision v0.0.0
go: k8s.io/[email protected]: unknown revision v0.0.0
go: k8s.io/[email protected]: unknown revision v0.0.0
go: k8s.io/[email protected]: unknown revision v0.0.0
go: k8s.io/[email protected]: unknown revision v0.0.0
go: k8s.io/[email protected]: unknown revision v0.0.0
go: error loading module requirements

@andrewsykim
Member Author

Thanks @NeilW! I agree that removing imports of k8s.io/kubernetes will help the case here. There were some discussions in the past to move k8s.io/kubernetes/cmd/cloud-controller-manager to either k8s.io/cloud-provider/cmd/cloud-controller-manager or k8s.io/cloud-controller-manager. The tricky thing with that is that all cloud-specific controllers would then also need to move to an external repo, since you can't import k8s.io/kubernetes from a staging repo. Would love your thoughts here on what would be ideal for your provider. cc @timoreimann for feedback from DigitalOcean

re: k8s.io/component-base not being semantically versioned, can you open an issue in kubernetes/kubernetes for that?

@NeilW

NeilW commented Jun 21, 2019

I've spent a day struggling with 1.15 and I've still not managed to get the dependencies sorted out for the cloud-provider. It looks like I'll have to manually code 'replace' entries for all the repos in the 'staging' area of the kubernetes repo. So we definitely have a problem.

However, that does open up a possibility for making cloud-providers more standard. If you built a dummy provider that responded to end-to-end tests and was published in a standard way, but didn't actually do anything, then you could 'replace' that provider's interface repo path with a path to a provider's repo that implements the same interface.

That allows you to simply replicate the standard repo as, say, 'brightbox-cloud-provider' and just change the 'replace' entry in the 'go.mod' to point to, say, 'brightbox/brightbox-cloud-provider-interface'. Then you can follow the same automated integration testing and deployment/publishing process as the standard dummy provider.

And on the interface repo that people like me maintain, we can run unit tests and set up the dependencies with our own 'go.mod' completely decoupled from the cloud-provider 'shell' the interface will be compiled into.

@NeilW

NeilW commented Jun 21, 2019

In terms of a publishing process, the one I use with HashiCorp to publish our Terraform provider is a good one. I go on a Slack channel and ask them to roll a new release, and after a few manual checks the maintainer of the central repo holding the providers hits the go button on the automated release system.

Now HashiCorp have staff managing that central provider repo (https://github.com/terraform-providers), and that may not work with k8s given the nature of the project. But it's something to consider.

@timoreimann
Contributor

I haven't upgraded DigitalOcean's CCM to 1.15 yet, but I do remember that moving to the 1.14 deps was quite a hassle. For instance, it required adding a replace directive for apimachinery which wasn't obvious for me to spot.

I noticed that the latest client-go v12.0.0 (which appears to correspond to Kubernetes 1.15) now encodes these replace directives in its go.mod file. My guess is that if cloud-provider followed the same pattern of accurately pinning dependencies for each release through Go modules, consuming cloud-provider would become easier.

@NeilW's idea of providing a dummy provider is interesting, though I'm not sure I fully grasped yet how that'd be consumed. In general, I'd definitely appreciate a sample provider that describes the canonical way of setting up a custom cloud provider. Last time I went over some of the available implementations from the different clouds, they all had slight variations. That could easily be because their development cycles can't possibly be synchronized perfectly, or maybe there are legitimate reasons to have divergent setups?

It'd be great to have a "source of truth" that outlines one or more recommended setups (similar to client-go's sample directory).

@timoreimann
Contributor

@andrewsykim

> There were some discussions in the past to move k8s.io/kubernetes/cmd/cloud-controller-manager to either k8s.io/cloud-provider/cmd/cloud-controller-manager or k8s.io/cloud-controller-manager.

I'm all in favor of removing any dependencies on k8s.io/kubernetes that are currently still in cloud provider since those tend to pull in a fair number of transitive packages (which are presumably not all required?).

What's the benefit of moving the cloud provider command part into a new, separate repository? My gut feeling is that it would be easier to reuse the existing k8s.io/cloud-provider repository we have today.
Is there any prior discussion available that would give more context on the various pros and cons?

@NeilW

NeilW commented Jun 21, 2019

> @NeilW's idea of providing a dummy provider is interesting, though I'm not sure I fully grasped yet how that'd be consumed.

Less that we would consume cloud-provider and more that it would consume us.

  1. Copy cloud-provider to a new repo digitalocean-cloud-provider within a k8s organisation that holds and publishes the cloud-providers.
  2. Alter the go.mod and add a replace that says k8s.io/cloud-provider-interface => github.com/digitalocean/digitalocean-cloud-provider-interface vX.Y.Z
  3. Run the release process on that repo, which compiles, builds and tests the cloud-provider then publishes the container somewhere central.

We then just build our provider interface libraries to the published Go Interface.
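As a rough illustration of the split described above, here is a minimal single-file Go sketch. Everything in it is hypothetical: `Provider`, `dummyProvider`, `NewProvider` and `runShell` are illustrative names, not the real k8s.io/cloud-provider API. In the proposed scheme the interface and the dummy implementation would live in their own module, and the `replace` entry in go.mod would decide which module supplies `NewProvider`; the shell code would stay identical for every provider.

```go
package main

import "fmt"

// Provider is a hypothetical, heavily reduced stand-in for the Go interface
// a cloud provider would implement. It is not the real cloud-provider API.
type Provider interface {
	Name() string
	// LoadBalancerAddress reports the address for a service, or ok=false
	// when the provider has no load balancer for it.
	LoadBalancerAddress(service string) (addr string, ok bool)
}

// dummyProvider satisfies Provider but deliberately does nothing, so the
// shell can be built, tested and published without any real cloud.
type dummyProvider struct{}

func (dummyProvider) Name() string { return "dummy" }

func (dummyProvider) LoadBalancerAddress(service string) (string, bool) {
	return "", false
}

// NewProvider is the only symbol the shell depends on. Under the proposal,
// a go.mod replace entry would point the module that defines it at a real
// provider's repository instead of the dummy one.
func NewProvider() Provider { return dummyProvider{} }

// runShell stands in for the shared controller-manager code. It only talks
// to the Provider interface and never to a concrete implementation.
func runShell(p Provider, services []string) []string {
	var report []string
	for _, svc := range services {
		if addr, ok := p.LoadBalancerAddress(svc); ok {
			report = append(report, fmt.Sprintf("%s: %s -> %s", p.Name(), svc, addr))
		} else {
			report = append(report, fmt.Sprintf("%s: %s has no load balancer", p.Name(), svc))
		}
	}
	return report
}

func main() {
	for _, line := range runShell(NewProvider(), []string{"web", "api"}) {
		fmt.Println(line)
	}
}
```

The design point is that `runShell` never changes between providers, which is what would let a central release process build and test every provider the same way.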

@NeilW

NeilW commented Jun 21, 2019

In terms of updating to 1.15:

  • Strip your go.mod down to just the requires for your provider
  • Auto generate the k8s.io require and replace entries using something like this

Hope that saves somebody a lot of time.
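The script referred to in that comment is not reproduced in this thread. As a rough sketch of the general idea only, and not the script that was linked, the Go program below reads the text of a kubernetes go.mod, collects every module that is redirected into `./staging/src/`, and prints a `replace` line for each one. The function names `stagingModules` and `replaceLines` are made up for this example, and the version passed in is a placeholder: the real version for each staging module would have to be resolved separately for the release you are targeting.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// stagingModules returns the module paths that the given go.mod text
// redirects to a local ./staging/src/ directory via replace entries.
func stagingModules(goMod string) []string {
	var mods []string
	sc := bufio.NewScanner(strings.NewReader(goMod))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		// Accept both the block form and the single-line "replace a => b" form.
		line = strings.TrimPrefix(line, "replace ")
		parts := strings.Split(line, " => ")
		if len(parts) != 2 {
			continue
		}
		if !strings.HasPrefix(strings.TrimSpace(parts[1]), "./staging/src/") {
			continue
		}
		// The left side may be "k8s.io/api" or "k8s.io/api v0.0.0".
		fields := strings.Fields(parts[0])
		if len(fields) == 0 {
			continue
		}
		mods = append(mods, fields[0])
	}
	return mods
}

// replaceLines builds one replace entry per module, pinning each module to
// the caller-supplied version (assumed here to be the same for all modules).
func replaceLines(mods []string, version string) []string {
	out := make([]string, 0, len(mods))
	for _, m := range mods {
		out = append(out, fmt.Sprintf("replace %s => %s %s", m, m, version))
	}
	return out
}

func main() {
	// A shortened, illustrative go.mod; the real file lists many more modules.
	sample := `module k8s.io/kubernetes

replace (
	k8s.io/api => ./staging/src/k8s.io/api
	k8s.io/client-go => ./staging/src/k8s.io/client-go
)
`
	// Placeholder version only; substitute the version resolved for your release.
	const placeholder = "v0.0.0-00010101000000-000000000000"
	for _, l := range replaceLines(stagingModules(sample), placeholder) {
		fmt.Println(l)
	}
}
```

The printed lines can be appended to a provider's own go.mod once the placeholder has been swapped for real versions.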

@andrewsykim
Member Author

/assign @yastij

@andrewsykim andrewsykim added this to the v1.16 milestone Jul 10, 2019
@andrewsykim andrewsykim added the P1 Priority 1 label Jul 10, 2019
@andrewsykim
Member Author

For v1.16: consensus on what the build/release process for CCM should look like.

@yastij
Member

yastij commented Jul 12, 2019

A couple of things:

  • Should we rely on Prow, plus the requirement that the release machinery be open source, for the CCMs hosted under k/k? I would say yes. This would let users know how the artefacts they're using are built.

  • I think that the process and its outputs should be as transparent as possible.

  • As for the base image, I think we should follow what we're doing upstream (i.e. images should be based on distroless).

Also, I think we should start publishing binaries stripped of in-tree cloud providers; this would help to drive adoption. cc @kubernetes/release-engineering

@andrewsykim andrewsykim modified the milestones: v1.16, v1.17 Oct 2, 2019
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 31, 2019
@cheftako
Member

cheftako commented Jan 2, 2020

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 2, 2020
@cheftako
Member

cheftako commented Jan 2, 2020

/lifecycle frozen

@k8s-ci-robot k8s-ci-robot added the lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. label Jan 2, 2020
@andrewsykim andrewsykim modified the milestones: v1.17, Next Feb 24, 2020
@andrewsykim andrewsykim modified the milestones: Next, v1.19 Apr 15, 2020
@andrewsykim
Member Author

/help

@k8s-ci-robot
Contributor

@andrewsykim:
This request has been marked as needing help from a contributor.

Please ensure the request meets the requirements listed here.

If this request no longer meets these requirements, the label can be removed
by commenting with the /remove-help command.

In response to this:

/help

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. label Apr 15, 2020
@andrewsykim
Member Author

@cheftako to put together a short proposal for v1.19


8 participants