Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Flux CLI tool doesn't respect context namespace #3453

Open
1 task done
sharkymcdongles opened this issue Dec 30, 2022 · 13 comments
Open
1 task done

Flux CLI tool doesn't respect context namespace #3453

sharkymcdongles opened this issue Dec 30, 2022 · 13 comments

Comments

@sharkymcdongles
Copy link

Describe the bug

Say a Kubeconfig has the following block:

contexts:
- context:
    cluster: cluster
    namespace: services
    user: user
  name: cluster
current-context: cluster

When you run flux commands it will not target the namespace your context is currently using. It always targets flux-system unless the namespace flag is specified. For continuity sake between all kubernetes cli tools e.g. kubectl and helm the cli should use namespace context for CLI calls.

Steps to reproduce

  1. Set your kubecontext to a namespace containing a helmrelease e.g.
namespace: services
helmrelease: service
  1. Run
    flux reconcile hr service

  2. See this error
    ✗ helmreleases.helm.toolkit.fluxcd.io "service" not found

  3. Run
    flux reconcile hr service --namespace service

It works

Expected behavior

  1. Set your kubecontext to a namespace containing a helmrelease e.g.
namespace: services
helmrelease: service
  1. Run
    flux reconcile hr service
    it works

Screenshots and recordings

No response

OS / Distro

Fedora

Flux version

v0.38.2

Flux check

N/A

Git provider

No response

Container Registry provider

No response

Additional context

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct
@kingdonb
Copy link
Member

kingdonb commented Jan 1, 2023

I agree with this report in principle, however I'd like to note as a point of information that this is not a regression. This is how flux CLI has always worked. You can use the FLUX_SYSTEM_NAMESPACE environment variable to set flux CLI to work in a different namespace, but, generally, it is anticipated that on most clusters, most or all Flux resources will remain in the flux-system namespace, or at least that Flux resources will likely be centralized within a single namespace (eg. dev-team) for any given tenant.

You can use targetNamespace to target resources for apply into a different namespace. Otherwise, the flux cli only switches namespace when you pass it an -n – this diverges from the kubectl behavior in that we do not honor current-context in any way in the flux CLI. Maybe it should not diverge this way?

I think it would be very inconvenient for most users if the default flux-system namespace pin were removed, given understanding this likely constraint on a single-tenant system. It's hard to say whether most Flux installations will be single or multi-tenant installations at large.

@sharkymcdongles
Copy link
Author

@kingdonb I can better understand the way most people use flux from your perspective, and what you said makes perfect sense.

Perhaps there could be an environment variable for following the current context or not as a compromise? We do keep most resources within flux-system e.g. helmcharts, helmrepos, kustomizations, gitrepos etc., but having helmreleases in the intended namespace makes the most sense for how we interact with flux because if a helmrelease is having issues it's nice to be able to see the helm installation via helm commands to fix it. This is why having context follow would be nice since most of our flux interactions involve working with helmreleases.

@kingdonb
Copy link
Member

Thanks for that added context,

an environment variable for following the current context or not as a compromise

That sounds like a good compromise! Only, flux already follows the context, it's just namespace that isn't followed.

If FLUX_FOLLOW_CURRENT_NAMESPACE is true, then flux get ... will always use the namespace from kubectl's context – does that sound like it would help resolve the issue? (Does FOLLOW sound like the right verb here, any other feedback about this as a variable name?)

@sharkymcdongles
Copy link
Author

@kingdonb I think FOLLOW or INHERIT make sense. Either that or maybe FLUX_INHERIT_CONTEXT_NAMESPACE. Maybe say CONTEXT instead of CURRENT because the namespace comes from current context.

@kingdonb
Copy link
Member

kingdonb commented Jan 11, 2023

The rubyist in me wants every name to be as close to a complete and fully expressive sentence as possible, now it's clear the concept is a bit obscure; I don't think inherit quite captures what we're saying, (and I'm afraid of using big/overloaded words like inherit that somebody may incorrectly infer that this has something to do with object oriented inheritance.)

FLUX_NAMESPACE_FOLLOWS_KUBE_CONTEXT=true
or how about FLUX_NS_FOLLOW_KUBECONTEXT maybe?

Not to put the cart in front of the horse either, let's figure out who can contribute the PR for this and whether it could merge, or if there's an even better option I haven't considered, before I spend too much time on what color to paint the bikeshed 🙃

@sharkymcdongles
Copy link
Author

@kingdonb honestly I don't really mind what it is called so long as it is documented and I can use it.

I was just giving some ideas. I think either of your suggestions works for me. <3

@makkes
Copy link
Member

makkes commented Aug 8, 2023

I took a stab at implementing this today as this feature might be especially helpful in multi-tenant scenarios where each tenant has their Flux resources in a dedicated Namespace, different from the default of "flux-system".

What makes this incredibly hard to do is that the upstream library we use has no (easy) way to determine if the user explicitly requested a certain Namespace or not where in the latter case we would default to "flux-system". There is the Namespace() method on ClientConfig but that always returns "default" in case no Namespace is explicitly requested so we can't know if we should override that with "flux-system" or not. 🤔 At the same time the library doesn't provide enough access to its struct fields in which case we could re-implement parts of the Namespace() method ourselves.

Desired order of precedence:

flux-system (default setting) < kubeconfig context < FLUX_SYSTEM_NAMESPACE environment variable < --namespace flag

@makkes
Copy link
Member

makkes commented Aug 8, 2023

Ok, I found a way but it's much more involved than what I have time for right now. If anyone wants to pick my current progress up, here's the commit.

@makkes
Copy link
Member

makkes commented Aug 8, 2023

P.S.: Global variables are a PITA!

@kingdonb
Copy link
Member

kingdonb commented Nov 28, 2024

I kind of feel badly, like I'm here defending chesterton's fence, I would be OK with just fixing the bug if we consider it a bug.

There is an argument that it's a breaking change, for people who have built automation on the flux CLI, which is part of our GA (v1) semver contract; I do feel that we should not make a breaking change without proper notification. But it seems like the risk is minimal as long as we do some form of proper notification in the changelogs.

I would feel a bit badly if we made this change and people who wanted to set a fixed namespace (retain the old behavior) had to set FLUX_SYSTEM_NAMESPACE from that point forward, if only because then it becomes a poorly labeled thing. It makes sense right now to call it FLUX_SYSTEM_NAMESPACE as we don't follow namespace from the kubeconfig. If we do follow in a future version by default, then it may be better to call it FLUX_OVERRIDE_KUBECONFIG_NAMESPACE after this change (but then again that would be another breaking change.)

If we add another knob (eg. FLUX_NAMESPACE_FOLLOWS_KUBE_CONTEXT=true) then there is the possibility that a user has both variables set, which would be in conflict, and that's undesirable as well. Maybe the CLI should balk and quit if someone has set both variables in conflict. If we make users set to false in order to retain the original behavior down the line, I think that's actually better, it doesn't put anyone out; most new users won't be surprised if we follow the kubeconfig context namespace. App admins will expect follow behavior even if single-tenant cluster admins do not. It may be a question of which audience is bigger, or more likely to read the docs.

If users must set FLUX_NS_FOLLOWS_KUBE_CONTEXT to true to opt into the new behavior, then that's not a breaking change. We could change this to true by default whenever Flux v3 comes and then there's no conflict with SemVer. But it will make the "correct behavior" less discoverable for potentially a long time, if we consider this a bug. We could also introduce this variable in Flux 2.5, keep the current behavior, but announce that we will "fix the bug" in 2.6.

We're talking at the dev meeting right now about adding a temporary warning in the CLI for users who don't have either variable set (a warning emitted by the CLI in a minor release, for users who don't have the variable set, that behavior will change in the following minor release), so that way users get notified if they are reading the CLI output whether or not they are reading the changelogs.

I think that's overkill, technically BreakVer (not SemVer) but this all seems like a lot more ceremony than such a small change should require, even though it does accomplish the goal of getting the word out as long as people don't skip a minor release.

Then again maybe I am underestimating the number of users who will read neither the CLI output nor the changelogs.

@darkowlzz
Copy link
Contributor

darkowlzz commented Dec 13, 2024

This was discussed again in the dev meeting and this time everyone agreed that this as a bug. If kubeconfig context namespace is set, it should be respected.
Also, whenever this is fixed, there must be a note in the changelog highlighting this change in behavior.

@makkes
Copy link
Member

makkes commented Dec 13, 2024

This was discussed again in the dev meeting and this time everyone agreed that this as a bug.

I'm not sure if this was ever seriously disputed by anyone. Anyhow, here is my initial attempt of a fix.

@darkowlzz
Copy link
Contributor

@makkes in the previous meetings, concerns were expressed about whether to consider it a bug or an intentional design of the CLI which we shouldn't change. Some of it is in #5028 (comment) . But yesterday, everyone agreed to consider it as a bug and fix it by changing the behavior.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants