GEP-1762: In Cluster Gateway Deployments #1757

Merged
merged 6 commits into from
Sep 27, 2023
294 changes: 294 additions & 0 deletions geps/gep-1762.md
@@ -0,0 +1,294 @@
# GEP-1762: In Cluster Gateway Deployments

* Status: Provisional

## Overview

Gateway API provides a common abstraction over different implementations, whether they are implemented by cloud load balancers, in-cluster deployments, or other mechanisms. However, many in-cluster implementations have solved some of the same problems in different ways.

Related discussions:
* [Support cluster-local Gateways](https://github.com/kubernetes-sigs/gateway-api/discussions/1247)
* [Scaling Gateway Resources](https://github.com/kubernetes-sigs/gateway-api/discussions/1355)
* [Manual deployments](https://github.com/kubernetes-sigs/gateway-api/issues/1687)
* [Merging Gateways](https://github.com/kubernetes-sigs/gateway-api/pull/1863/)
* [Per-Gateway Infrastructure](https://github.com/kubernetes-sigs/gateway-api/pull/1757)

## Goals

* Provide prescriptive guidance for how in-cluster implementations should behave.
* Provide requirements (tested by conformance) for how in-cluster implementations should behave.
Member

It seems like most if not all of this GEP would be pretty difficult to write conformance tests for, do you have any ideas for how this would work?

Contributor Author

That is a good point. Not really... any ideas?

Member

I think this is unfortunately going to be a feature that doesn't/can't have conformance tests.

Member

Suggested change:
- * Provide requirements (tested by conformance) for how in-cluster implementations should behave.
+ * Provide requirements for how in-cluster implementations should behave.


Note that some changes will be suggestions, while others will be requirements.

## Non-Goals

* Provide guidance on how out-of-cluster implementations should behave. Rather, this document aims to bring consistency between these types.

## Terminology

This document uses a few terms throughout. To ensure consistency, they are defined below:

* In-cluster deployment: refers to an implementation that actuates a `Gateway` by running a data plane in the cluster.
This is *often*, but not necessarily, by deploying a `Deployment`/`DaemonSet` and `Service`.
* Automated deployment: refers to an implementation that automatically deploys the data plane based on a `Gateway`.
That is, the user simply creates a `Gateway` resource and the rest is handled behind the scenes by the implementation.
* Manual deployment: refers to an implementation that does NOT automatically deploy the data plane based on a `Gateway`.
This may require, for example, a user to manually create a `Deployment` and `Service` for the data plane, then link this to a `Gateway`.

## Design

This GEP does not introduce new API fields, but rather standardizes how implementations should behave when implementing the existing API.

### Automated Deployments

A simple `Gateway`, as configured below, is assumed to be an automated deployment:

```yaml
apiVersion: gateway.networking.k8s.io/v1beta1
kind: Gateway
metadata:
  name: my-gateway
spec:
  gatewayClassName: example
  listeners:
  - name: default
    port: 80
    protocol: HTTP
```

With this configuration, an implementation:
* MUST mark the Gateway as `Programmed` and provide an address in `Status.Addresses` where the Gateway can be reached on each configured port.
* MUST label all generated resources (Service, Deployment, etc) with `gateway.networking.k8s.io/metadata.name: my-gateway` (where `my-gateway` is the name of the Gateway resource).
* MUST provision generated resources in the same namespace as the Gateway if they are namespace scoped resources.
  * Cluster scoped resources are not recommended.
* MUST name all generated resources `my-gateway-example` (`<NAME>-<GATEWAY CLASS>`).
Contributor

How should this handle multiple resources of the same kind without colliding? Was this intended to be something like "MUST use as a name prefix for generated resources"?

We, for example, automatically create multiple Deployments (the proxy instances and a configuration manager) and multiple Services (admin traffic, proxy traffic, and some utility/status APIs), some associated with one Deployment and some with the others.

We currently do something like the prefix approach. IIRC it's caused some issues because of the name character limit, and I don't think we've found a better way around that than telling users to choose shorter names.

Contributor Author

Currently it does not. Interesting case. One benefit of deterministic names is that a lot of other policies attach via name, for example HPA. So having a common name is pretty handy. That is tricky if we have multiple though...

Member

Is this dilemma something we can capture in a TODO section so that we can merge, but don't have to hold up this specific PR?

Member

I remain in favor of creating a TODO item for this in the GEP, with the intention that we don't move to implementable until we resolve this, but to allow us to move forward with this first large iteration and resolve this separately as its own focus.

Member

I think it's going to be very difficult to require this behavior without being disruptive to all in-cluster implementations. I think suggesting a pattern, and stating that it may just serve as a prefix (ie generateName instead of name) may be sufficient here.

The generated name is not simply `NAME`, to reduce the chance of conflicts with existing resources.
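
The naming and labeling rules above can be sketched in Go. This is a minimal illustration, not code from any implementation: `GeneratedName` and `GeneratedLabels` are hypothetical helper names, and the label key is the one given in this revision of the GEP. The sketch does not address the open review questions about multiple resources of the same kind or the name length limit.

```go
package main

import "fmt"

// gatewayNameLabel is the label key this revision of the GEP requires
// on every generated resource.
const gatewayNameLabel = "gateway.networking.k8s.io/metadata.name"

// GeneratedName returns the <NAME>-<GATEWAY CLASS> name used for
// generated resources. Hypothetical helper, for illustration only.
func GeneratedName(gatewayName, gatewayClassName string) string {
	return gatewayName + "-" + gatewayClassName
}

// GeneratedLabels returns the label that ties a generated resource
// back to its Gateway. Hypothetical helper, for illustration only.
func GeneratedLabels(gatewayName string) map[string]string {
	return map[string]string{gatewayNameLabel: gatewayName}
}

func main() {
	// Values taken from the example Gateway above.
	fmt.Println(GeneratedName("my-gateway", "example"))
	fmt.Println(GeneratedLabels("my-gateway"))
}
```

Keeping the name derivation in one place makes it easier to switch to a prefix-based scheme later if the naming requirement is relaxed, as suggested in review.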

### Manual Deployments

Managing "Manual Deployments" is currently out of scope for this GEP, but may be added in the future.

### Customizations

With any in-cluster deployment, customization requirements will arise.

Some common requirements would be:
* `Service.spec.type`, to control whether a service is a `ClusterIP` or `LoadBalancer`.
* The IP address to assign to the Service.
* Arbitrary labels and annotations on generated resources.
* Any other arbitrary fields; the list is unbounded. Some examples would be:
  * CPU and memory requests
  * Service `externalTrafficPolicy`
  * Affinity rules

#### Gateway Type

This is handled by [GEP-1651](https://github.com/kubernetes-sigs/gateway-api/pull/1653), so won't be described here.

#### Gateway IP

For setting a specific IP address in a generated `Service`, the `.Spec.Addresses` field can be used.

For example:

```yaml
apiVersion: gateway.networking.k8s.io/v1beta1
kind: Gateway
metadata:
  name: my-gateway
spec:
  addresses:
  - type: IPAddress
    value: 1.1.1.1
  gatewayClassName: example
  listeners:
  - name: default
    port: 80
    protocol: HTTP
```

This would generate a `Service` with `clusterIP` or `loadBalancerIP`, depending on the Service type.
Member

I can't really tell if this is trying to tell implementations what they should do, or just provide an example of how some implementations handle this.

Contributor Author

The intent here got muddled through the revisions. It's meant to be "If they provide an IP, you better use it", and whether you use it as the clusterIP vs the loadBalancerIP is based on the type of service, which was determined by routability previously.

Member

I think this is already covered reasonably well in the spec:

// Addresses requested for this Gateway. This is optional and behavior can
// depend on the implementation. If a value is set in the spec and the
// requested address is invalid or unavailable, the implementation MUST
// indicate this in the associated entry in GatewayStatus.Addresses.
//
// The Addresses field represents a request for the address(es) on the
// "outside of the Gateway", that traffic bound for this Gateway will use.
// This could be the IP address or hostname of an external load balancer or
// other networking infrastructure, or some other address that traffic will
// be sent to.
//
// The .listener.hostname field is used to route traffic that has already
// arrived at the Gateway to the correct in-cluster destination.
//
// If no Addresses are specified, the implementation MAY schedule the
// Gateway in an implementation-specific manner, assigning an appropriate
// set of Addresses.
//
// The implementation MUST bind all Listeners to every GatewayAddress that
// it assigns to the Gateway and add a corresponding entry in
// GatewayStatus.Addresses.
//
// Support: Extended
//
// +optional
// <gateway:validateIPAddress>
// +kubebuilder:validation:MaxItems=16
// +kubebuilder:validation:XValidation:message="IPAddress values must be unique",rule="self.all(a1, a1.type == 'IPAddress' ? self.exists_one(a2, a2.type == a1.type && a2.value == a1.value) : true )"
// +kubebuilder:validation:XValidation:message="Hostname values must be unique",rule="self.all(a1, a1.type == 'Hostname' ? self.exists_one(a2, a2.type == a1.type && a2.value == a1.value) : true )"
Addresses []GatewayAddress `json:"addresses,omitempty"`

It may be worth describing how in-cluster implementations can accomplish this in this GEP, but I don't think we need to add any new requirements here. Just trying to avoid introducing too many unique sources of requirements in case they need to change in the future.

Contributor Author

I made it more clear, I think. LMK what you think


This follows the same behavior as out-of-cluster Gateway implementations.
Member

I'm not really sure I see the parallel here? They may be providing similar functionality, but I don't think the way it's accomplished is very similar.

Note: this differs from the selection of "manual deployment", which doesn't use `IPAddress` type.
Instead, these Gateways attach to some existing infrastructure (such as Service) which provides an address.
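
The mapping from a requested address to the generated `Service` can be sketched as follows. This is a minimal sketch under stated assumptions: `ServiceSpec` here is a simplified local stand-in for the Kubernetes Service spec (only the three fields that matter for this example), and `ApplyRequestedIP` is a hypothetical helper, not part of Gateway API or any implementation.

```go
package main

import "fmt"

// ServiceSpec is a simplified stand-in for the Kubernetes Service spec,
// keeping only the fields relevant to this example.
type ServiceSpec struct {
	Type           string // "ClusterIP" or "LoadBalancer"
	ClusterIP      string
	LoadBalancerIP string
}

// ApplyRequestedIP copies an IPAddress requested in the Gateway's
// spec.addresses onto the Service field that matches the Service type.
// It returns an error for Service types this sketch does not handle.
func ApplyRequestedIP(svc *ServiceSpec, ip string) error {
	switch svc.Type {
	case "ClusterIP":
		svc.ClusterIP = ip
	case "LoadBalancer":
		svc.LoadBalancerIP = ip
	default:
		return fmt.Errorf("unsupported Service type %q", svc.Type)
	}
	return nil
}

func main() {
	// The address comes from the example Gateway above.
	svc := ServiceSpec{Type: "LoadBalancer"}
	if err := ApplyRequestedIP(&svc, "1.1.1.1"); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", svc)
}
```

If the requested address cannot be used, the existing spec text quoted in review requires the implementation to indicate this in `GatewayStatus.Addresses`; that reporting is left out of the sketch.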

#### Labels and Annotations

Labels and annotations for generated resources are specified in `infrastructure`:

```yaml
apiVersion: gateway.networking.k8s.io/v1beta1
kind: Gateway
metadata:
  name: my-gateway
spec:
  infrastructure:
    labels:
      foo: bar
    annotations:
      name: my-annotation
```

These are both `map[string]string` types, just like in `ObjectMeta`.

Any labels or annotations here are added to all generated resources.
Note this may mean an annotation intended for a `Service` may end up on a `Deployment` (for example).
This is typically not a concern; however, if an implementation is aware of specific meanings of certain labels or annotations, it MAY
exclude these from irrelevant resources.

This is intended to support integration with the kitchen sink of Kubernetes extensions which rely on labels and annotations,
such as Prometheus scraping, object grouping/organization, etc.
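
As a rough illustration of "added to all generated resources", the sketch below copies the `infrastructure` labels and annotations onto every generated object. `ObjectMeta` and `Infrastructure` are simplified local stand-ins for the real Kubernetes and Gateway API types, and `ApplyInfrastructure` is a hypothetical helper. The optional exclusion of labels or annotations from irrelevant resources is not modeled.

```go
package main

import "fmt"

// ObjectMeta is a simplified stand-in for Kubernetes object metadata.
type ObjectMeta struct {
	Name        string
	Labels      map[string]string
	Annotations map[string]string
}

// Infrastructure mirrors the labels and annotations under spec.infrastructure.
type Infrastructure struct {
	Labels      map[string]string
	Annotations map[string]string
}

// copyInto adds every entry of src to dst, allocating dst when it is nil
// and there is something to copy. Existing keys in dst are overwritten.
func copyInto(dst, src map[string]string) map[string]string {
	if dst == nil && len(src) > 0 {
		dst = make(map[string]string, len(src))
	}
	for k, v := range src {
		dst[k] = v
	}
	return dst
}

// ApplyInfrastructure adds the Gateway's infrastructure labels and
// annotations to every generated object passed in.
func ApplyInfrastructure(infra Infrastructure, objs ...*ObjectMeta) {
	for _, o := range objs {
		o.Labels = copyInto(o.Labels, infra.Labels)
		o.Annotations = copyInto(o.Annotations, infra.Annotations)
	}
}

func main() {
	infra := Infrastructure{
		Labels:      map[string]string{"foo": "bar"},
		Annotations: map[string]string{"name": "my-annotation"},
	}
	service := &ObjectMeta{Name: "my-gateway-example"}
	deployment := &ObjectMeta{Name: "my-gateway-example"}
	ApplyInfrastructure(infra, service, deployment)
	fmt.Println(service.Labels, service.Annotations)
	fmt.Println(deployment.Labels, deployment.Annotations)
}
```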

#### Arbitrary Customization

GEP-1867 introduces a new `infrastructure` field, which allows customization of some common configurations (version, size, etc)
and allows a per-Gateway generic `parametersRef`.
This can be utilized for the remainder of customizations.

### Resource Attachment

Resources generated in response to the `Gateway` are guaranteed to have two attributes:
* A `gateway.networking.k8s.io/metadata.name: <NAME>` label.
* A name `<NAME>-<GATEWAY CLASS>`.

The generated resources MUST be in the same namespace as the `Gateway`.

Implementations MAY set `ownerReferences` to the `Gateway` in most cases, as well, but this is not required
as some implementations may have different cleanup mechanisms.


Other resources can rely on these two attributes for attachment.
While "Policy attachment" in Gateway API attaches to the `Gateway` resource itself,
many existing resources (such as `HorizontalPodAutoscaler` and `PodDisruptionBudget`) attach only to resources like `Deployment` or `Service`.

An example using these:
```yaml
apiVersion: gateway.networking.k8s.io/v1beta1
kind: Gateway
metadata:
  name: gateway
spec:
  gatewayClassName: example
  listeners:
  - name: default
    hostname: "example.com"
    port: 80
    protocol: HTTP
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: gateway
spec:
  # Match the generated Deployment by reference
  # Note: Do not use `kind: Gateway`.
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: gateway-example
  minReplicas: 2
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: gateway
spec:
  minAvailable: 1
  selector:
    # Match the generated Deployment by label
    matchLabels:
      gateway.networking.k8s.io/metadata.name: gateway
```

Note: there is [discussion](https://github.com/kubernetes-sigs/gateway-api/discussions/1355) around a way to attach an HPA to a Gateway directly.

## API

This GEP extends the `infrastructure` API introduced in [GEP-1867](https://gateway-api.sigs.k8s.io/geps/gep-1867).

```go
type GatewayInfrastructure struct {
	// Labels that should be applied to any resources created in response to this Gateway.
	//
	// For implementations creating other Kubernetes objects, this should be the `metadata.labels` field on resources.
	// For other implementations, this refers to any relevant (implementation specific) "labels" concepts.
	//
	// If Labels is set on the GatewayClass as well, the labels are merged with the Gateway's labels taking precedence.
	//
Member

Can we just start with this on the Gateway and consider support on GatewayClass as future work? I think we could make a good argument for precedence to go in either direction here and/or to recreate the overrides and defaults concept from inherited policy attachment in the GatewayClass.spec.infrastructure field to enable both. I think that's a sufficiently complex conversation that it can be saved for a follow up and just removed from the initial scope of this GEP.

Contributor Author

I had it like that at first and @youngnick suggested it would be a good requirement. I don't have a preference either way so will do whatever.

Member

@youngnick's traveling for a bit so may take some time to respond, but I did get some time to talk with him about this yesterday and he was on board with only targeting Gateway to start.

Contributor

Yes, this is on me, I think that having this work for Gateway first is better. Let's get something first, and add the GatewayClass later if needed.

	// Support: Implementation-specific
	Labels map[string]string `json:"labels,omitempty"`
	// Annotations that should be applied to any resources created in response to this Gateway.
	//
	// For implementations creating other Kubernetes objects, this should be the `metadata.annotations` field on resources.
	// For other implementations, this refers to any relevant (implementation specific) "annotations" concepts.
	//
	// If Annotations is set on the GatewayClass as well, the annotations are merged with the Gateway's annotations taking precedence.
	//
	// Support: Implementation-specific
	Annotations map[string]string `json:"annotations,omitempty"`
	// ParametersRef is a reference to a resource that contains the configuration
	// parameters corresponding to the Gateway. This is optional if the
	// controller does not require any additional configuration.
	//
	// ParametersRef can reference a standard Kubernetes resource, i.e. ConfigMap,
	// or an implementation-specific custom resource. The resource must be namespace-scoped
	// and live in the same namespace.
	//
	// If ParametersRef is set on both a Gateway and the corresponding GatewayClass, the merging
	// behavior between the two is implementation-specific.
	//
	// Support: Implementation-specific
	//
	// +optional
	ParametersRef *LocalParametersReference `json:"parametersRef,omitempty"`
...
}

type GatewayClassInfrastructure struct {
	// Labels that should be applied to any resources created in response to this GatewayClass.
	//
	// For implementations creating other Kubernetes objects, this should be the `metadata.labels` field on resources.
	// For other implementations, this refers to any relevant (implementation specific) "labels" concepts.
	//
	// If Labels is set on the Gateway as well, the labels are merged with the Gateway's labels taking precedence.
	//
	// Support: Implementation-specific
	Labels map[string]string `json:"labels,omitempty"`
	// Annotations that should be applied to any resources created in response to this GatewayClass.
	//
	// For implementations creating other Kubernetes objects, this should be the `metadata.annotations` field on resources.
	// For other implementations, this refers to any relevant (implementation specific) "annotations" concepts.
	//
	// If Annotations is set on the Gateway as well, the annotations are merged with the Gateway's annotations taking precedence.
	//
	// Support: Implementation-specific
	Annotations map[string]string `json:"annotations,omitempty"`
...
}

// LocalParametersReference identifies an API object containing controller-specific
// configuration resource within the cluster.
type LocalParametersReference struct {
	// Group is the group of the referent.
	Group Group `json:"group"`

	// Kind is kind of the referent.
	Kind Kind `json:"kind"`

	// Name is the name of the referent.
	//
	// +kubebuilder:validation:MinLength=1
	// +kubebuilder:validation:MaxLength=253
	Name string `json:"name"`
}
```

Member

I don't think this is needed anymore.

Contributor Author

It's in GatewayInfrastructure, so probably pending decision on #1757 (comment)?

Member

Yeah, I'd personally rather keep this GEP as focused as possible since we're really only a couple of weeks away from trying to be feature complete for v1.0. I don't really have anything against ParamsRef here, but similar to the comment above, it introduces some form of merging logic between GWC and GW, and I want to make sure we take the time to get it right so would rather leave it out of scope for now.
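
The merge rule stated in the `Labels` and `Annotations` doc comments of this revision (GatewayClass values merged with Gateway values, Gateway taking precedence) could be sketched as below. `MergeWithGatewayPrecedence` is a hypothetical helper name, and the review discussion suggests GatewayClass-level support may be deferred, so treat this as an illustration of the rule as written rather than settled behavior.

```go
package main

import "fmt"

// MergeWithGatewayPrecedence merges GatewayClass-level and Gateway-level
// labels (or annotations) into a new map. When a key is set at both
// levels, the Gateway's value wins. Neither input map is modified.
func MergeWithGatewayPrecedence(class, gateway map[string]string) map[string]string {
	merged := make(map[string]string, len(class)+len(gateway))
	for k, v := range class {
		merged[k] = v
	}
	// Gateway entries are copied last so they overwrite GatewayClass entries.
	for k, v := range gateway {
		merged[k] = v
	}
	return merged
}

func main() {
	// Example values are made up for illustration.
	classLabels := map[string]string{"team": "infra", "env": "dev"}
	gatewayLabels := map[string]string{"env": "prod"}
	fmt.Println(MergeWithGatewayPrecedence(classLabels, gatewayLabels))
}
```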