Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Validation: remove CEL for PolicyTargetRef to allow vendor extensions #3414

Merged
merged 4 commits into from
Jan 24, 2025

Conversation

ilrudie
Copy link
Contributor

@ilrudie ilrudie commented Jan 16, 2025

This PR removes CEL validation from PolicyTargetRef for two primary reasons:

  1. Allow vendor extensions by allowing Istio policy to target non-Istio references
  2. Allow Istio policy to target a GatewayClass

@ilrudie ilrudie requested a review from a team as a code owner January 16, 2025 19:05
@istio-testing istio-testing added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label Jan 16, 2025
@ilrudie ilrudie changed the base branch from target-ref-gwclass to master January 16, 2025 19:09
@istio-testing istio-testing added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Jan 16, 2025
//
// When binding to a GatewayClass resource using PolicyTargetReference, your policy must be in the root namespace.
//
// Supported kinds are core/Service, networking.istio.io/ServiceEntry, gateway.networking.k8s.io/Gateway, and gateway.networking.k8s.io/GatewayClass
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is documented on the individual usages. Like

networking/v1alpha3/envoy_filter.proto
871:  repeated istio.type.v1beta1.PolicyTargetReference targetRefs = 6;

We should update those. We might want to only include it there, in case there are per-policy supported types? If we know all will be the same then here is probably good

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good shout. Actually, looking at this more closely it's probably most clear now with no CEL to simply not mention supported group/kind of the PolicyTargetRef message at all. Support likely varies by where it is used and should be mentioned where this message is used going forward.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would we want to allow binding resources which are not Authorization Policy to a GatewayClass?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved supported group/kind from being centralized on the message to being in our higher lever resources where the PolicyTargetRef is used. TBH, after the move I'm feeling less certain about doing it this way but it is an improvement over having inconsistent group/kind information in different contexts.

Copy link
Member

@howardjohn howardjohn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM

- https://github.com/istio/istio/issues/54696
releaseNotes:
- |
**Removed** CEL validation of group/kind for PolicyTargetReference to enable vendor extensions
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're removing the CEL validation, should we add this to the OSS ValidatingWebhook?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe status, "you tried to bind to a beep.boop.bop.io/beepboop and Istiod has not honored this"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that would work; just some indication since CEL isn't going to be enforcing it now

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got this in the VWH in istio/istio#54780

@ilrudie ilrudie requested a review from keithmattix January 23, 2025 14:59
@LiorLieberman
Copy link

@ilrudie I followed up internally with a couple more Gateway folks and I actually share some of the concerns. The main one being - that AuthorizationPolicy lives in a namespace and it will have a global effect if you target a gatewayclass - which sounds very odd. Ideally, we want to encourage cluster-scope resources triggering global effect.

I know we have prior art for it in istio with the root namespace, but was hoping to avoid it with gateway apis if we can.

@howardjohn
Copy link
Member

@LiorLieberman the only way to achieve that is to duplicate every single CRD we have to add a ClusterXXX which is pretty untenable.

@LiorLieberman
Copy link

The conversation during last week's meeting branched into talking about how to achieve default deny and the current implementation (what Louis brought up). If we come back to the specific usecase that led to this discussion, I think if we were to implement what Louis said (default deny is presented when introducing waypoint) - it should be enough to satisfy that usecase.

@howardjohn
Copy link
Member

These are pretty different goals and requirements. There is a general need for ANY Istio policy to be able to be applied cluster-wide

@LiorLieberman
Copy link

I agree, but I think it deserves more thought. And perhaps to be solved in gateway. If there was a concrete need to implement it right away, and we (istio) did not want or did not have the time to wait for gateway to catch up - this would have make sense. But - I think the correct way to solve the specific problem that brought this up - is not by doing this. (Plus, as said - this will have wider implication which we may be able to avoid - or design differently)

@howardjohn
Copy link
Member

Are there other thoughts on how to achieve this? We have been thinking about this for like 3 years, I haven't actually heard any proposal except this.

@keithmattix
Copy link
Contributor

Are there other thoughts on how to achieve this? We have been thinking about this for like 3 years, I haven't actually heard any proposal except this.

+1 - regardless of what Gateway decides specifically, the only way to get around duplicating all of your CRDs is to limit GatewayClass binding to a privileged namespace. That's what we're doing here.

@LiorLieberman
Copy link

LiorLieberman commented Jan 23, 2025

I am not trying to present a strong opposition to this idea. Instead, I think this is not the right solution for the problem.

As a user, when I want to have default deny - I want to have one policy - I dont want to worry about creating new policies when presenting waypoint in the middle. So ideally I will have one authorization policy that denies everything (maintaining the status-quo today - in the istio-system namespace, not trying to change that). This is much a simpler UX, and a step forward to resolve current confusions around authzpolicies today in ambient.

I definitely dont think thats the only way, for example, (and beyond the proposed paramsRef approach) putting it in meshConfig is another option - I dont advocate for it, and it has probably been suggested in the past - this is just the first alternative came to mind.

@LiorLieberman
Copy link

@howardjohn The two objections that were presented in the meeting for ParametersRef are:

  1. The GatewayClass is something that the implementation is managing and the user should not touch it.

I think thats also somehow valid for the inverse (whats being proposed in this PR). I agree it is slightly better that the user should not edit it, but only reference it for this concern. But in anycase the user has to be fully aware of this.

  1. The need to bind multiple things in ParametersRef. If thats the concern, we can argue to make this a list upstream. I dont see any problem with this.

I think the paramsref approach maintains cluster-scope object triggers cluster-wide effects, which is more straightforward and has better UX IMO.

@howardjohn
Copy link
Member

As a user, when I want to have default deny - I want to have one policy - I dont want to worry about creating new policies when presenting waypoint in the middle. So ideally I will have one authorization policy that denies everything (maintaining the status-quo today - in the istio-system namespace, not trying to change that). This is much a simpler UX, and a step forward to resolve current confusions around authzpolicies today in ambient.

Ah I see and agree. This is not the purpose of this proposal, we are also planning to have a different change where waypoint applies L4 policies like ztunnel and then L7 policies from targetRef. More details coming here very soon

I think thats also somehow valid for the inverse (whats being proposed in this PR).

If so, users shouldn't reference a GatewayClass in Gateway, but we force them to.

@LiorLieberman
Copy link

LiorLieberman commented Jan 23, 2025

If so, users shouldn't reference a GatewayClass in Gateway, but we force them to.

Right, not saying its perfect, I am saying that this happens today, and it is not great. The fact that it exists does not mean its a great to build on it. I was convinced that ParametersRef are better approach here. And more importantly, since the user already is aware of the GatewayClass (cause they need to reference it from Gateways), they can as well edit the paramsRef, I dont see a huge regression here

Can you relate to why you against it? I referred to two of the concerns you brought up in the meeting. Anything else I did not consider?

@LiorLieberman
Copy link

Ah I see and agree. This is not the purpose of this proposal, we are also planning to have a different change where waypoint applies L4 policies like ztunnel and then L7 policies from targetRef. More details coming here very soon

Thats great, and again, I think this change is the right way to solve the usecase @ilrudie brought up.

@istio-testing istio-testing merged commit 03360c1 into istio:master Jan 24, 2025
5 checks passed
@ilrudie ilrudie deleted the target-ref-gwclass branch January 25, 2025 00:08
@ilrudie
Copy link
Contributor Author

ilrudie commented Jan 27, 2025

/cherry-pick release-1.25

@istio-testing
Copy link
Collaborator

@ilrudie: new pull request created: #3425

In response to this:

/cherry-pick release-1.25

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@ilrudie ilrudie added the cherrypick/release-1.25 Set this label on a PR to auto-merge it to the release-1.25 branch label Jan 27, 2025
@istio-testing
Copy link
Collaborator

In response to a cherrypick label: new pull request could not be created: failed to create pull request against istio/api#release-1.25 from head istio-testing:cherry-pick-3414-to-release-1.25: status code 422 not one of [201], body: {"message":"Validation Failed","errors":[{"resource":"PullRequest","code":"custom","message":"A pull request already exists for istio-testing:cherry-pick-3414-to-release-1.25."}],"documentation_url":"https://docs.github.com/rest/pulls/pulls#create-a-pull-request","status":"422"}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cherrypick/release-1.25 Set this label on a PR to auto-merge it to the release-1.25 branch size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants