
Experimental config with merged density and load #1008

Merged (1 commit) on Feb 7, 2020

Conversation

@mm4tt (Contributor) commented Feb 3, 2020

Ref. #1007

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Feb 3, 2020
@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 3, 2020
@mm4tt mm4tt force-pushed the get_rid_of_density branch 2 times, most recently from 122cffd to a17ac5e Compare February 4, 2020 15:53
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Feb 4, 2020
@mm4tt mm4tt force-pushed the get_rid_of_density branch from a17ac5e to 0025dfb Compare February 5, 2020 11:45
@mm4tt mm4tt changed the title from "<WIP> Get rid of density test" to "Merge density test into load test" Feb 5, 2020
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 5, 2020
@mm4tt (Contributor, Author) commented Feb 5, 2020

I ran some manual 100 node tests and things look good.

/hold
I'll run it at 5k node scale today before merging

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 5, 2020
@mm4tt (Contributor, Author) commented Feb 5, 2020

/assign @wojtek-t

# BEGIN scheduler-throughput section
# Min number of pods per deployment to be used for measuring scheduler throughput
# to get enough samples and accurate measurements in small clusters.
{{$MIN_PODS_PER_DEPLOYMENT_TO_MEASURE_SCHEDULER_THROUGHPUT := 250}}
Member:

Where is 250 coming from? Only the fact that it's a divisor of 500? :D

What I'm mostly interested in is ensuring that we test large deployments (which is what we are doing in the existing density test) - there are deployments of size 3000 there.
However, I think that constraining it to "cluster size" is actually reasonable - we will test 100-pod deployments in a 100-node cluster and 5000-pod deployments in 5k-node clusters.

So I personally think that instead of 250 here, I would simply use ".Nodes"; I would just create many more namespaces there, to ensure that at least N (=500?) pods will be created.
[+ ensure that there are at least 2]
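
A minimal sketch of what that suggestion could look like in the config (variable names are illustrative, and it assumes the template library provides DivideInt alongside the MaxInt/AddInt helpers already used in this file):

# Size each scheduler-throughput deployment by the cluster size, and spread the load over
# enough namespaces (at least 2) to still get a minimum number of pods in small clusters.
{{$MIN_SCHEDULER_THROUGHPUT_PODS := 500}}
{{$schedulerThroughputPodsPerDeployment := .Nodes}}
{{$schedulerThroughputNamespaces := MaxInt 2 (DivideInt $MIN_SCHEDULER_THROUGHPUT_PODS $schedulerThroughputPodsPerDeployment)}}

With 100 nodes this would yield 5 namespaces of 100-pod deployments (500 pods total); with 5000 nodes, 2 namespaces of 5000-pod deployments.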

# BEGIN pod-startup-latency section
# Min number of pods to be used for measuring pod startup latency to get enough
# samples and accurate measurements in small clusters.
{{$MIN_PODS_TO_MEASURE_STARTUP_LATENCY := 500}}
Member:

Can we make this a more generic constant?
And use it both to determine the min number of latency pods and the min number of pods needed for scheduler throughput?

Contributor Author:

Done.
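
For illustration, a single shared constant for both sections could look roughly like this (the names below are illustrative, not necessarily what the PR settled on):

# Min number of pods used by both the pod-startup-latency and scheduler-throughput
# sections, to get enough samples and accurate measurements in small clusters.
{{$MIN_PODS_IN_SMALL_CLUSTERS := 500}}
{{$totalLatencyPods := MaxInt $MIN_PODS_IN_SMALL_CLUSTERS .Nodes}}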

Params:
  action: start
  labelSelector: group = scheduler-throughput
  threshold: 60s
Member:

Why 60s?

Contributor Author:

In 100-node tests the 99th percentile was around 10s; I extrapolated that to 1 min for the 5k-node test :)
But I'm changing it to 1h to match what we currently have for pods in the load test, and I added a TODO for that - basically to see whether we can get rid of these artificial pod-startup-latency setups and measure it (and assert on it) across the whole test.
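
As a sketch, the measurement discussed here is a PodStartupLatency start/gather pair scoped to the scheduler-throughput pods via the label selector (the identifier name below is illustrative):

- Identifier: SchedulerThroughputPodStartupLatency
  Method: PodStartupLatency
  Params:
    action: start
    labelSelector: group = scheduler-throughput
    threshold: 1h
# ... the scheduler-throughput deployments are created and awaited in between ...
- Identifier: SchedulerThroughputPodStartupLatency
  Method: PodStartupLatency
  Params:
    action: gather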

- namespaceRange:
    min: 1
    max: {{$namespaces}}
  replicasPerNamespace: {{$latencyReplicas}} # TODO
Member:

What TODO?

Contributor Author:

Good catch :) Removed.

@mm4tt mm4tt force-pushed the get_rid_of_density branch from 93aac3e to d43144e Compare February 5, 2020 15:08
@mm4tt (Contributor, Author) left a comment:
PTAL

@mm4tt mm4tt force-pushed the get_rid_of_density branch from d43144e to 192a1dd Compare February 5, 2020 15:25
@mm4tt (Contributor, Author) commented Feb 6, 2020

I ran the tests at scale; the baseline was the gce-performance-scale run from 02-04 - https://prow.k8s.io/view/gcs/kubernetes-jenkins/logs/ci-kubernetes-e2e-gce-scale-performance/1224739883763896323

Below is a comparison of the relevant results from both tests.

Pod startup latency

Baseline

{
  "version": "1.0",
  "dataItems": [
    {
      "data": {
        "Perc50": 0,
        "Perc90": 0,
        "Perc99": 1000
      },
      "unit": "ms",
      "labels": {
        "Metric": "create_to_schedule"
      }
    },
    {
      "data": {
        "Perc50": 1000,
        "Perc90": 1000,
        "Perc99": 2000
      },
      "unit": "ms",
      "labels": {
        "Metric": "schedule_to_run"
      }
    },
    {
      "data": {
        "Perc50": 1138.736383,
        "Perc90": 1752.331418,
        "Perc99": 2060.449444
      },
      "unit": "ms",
      "labels": {
        "Metric": "run_to_watch"
      }
    },
    {
      "data": {
        "Perc50": 2119.650554,
        "Perc90": 2752.734037,
        "Perc99": 3131.267012
      },
      "unit": "ms",
      "labels": {
        "Metric": "schedule_to_watch"
      }
    },
    {
      "data": {
        "Perc50": 2134.250446,
        "Perc90": 2773.996255,
        "Perc99": 3228.573599
      },
      "unit": "ms",
      "labels": {
        "Metric": "pod_startup"
      }
    }
  ]
}

This PR

{
  "version": "1.0",
  "dataItems": [
    {
      "data": {
        "Perc50": 2435.624141,
        "Perc90": 3059.753256,
        "Perc99": 3718.617312
      },
      "unit": "ms",
      "labels": {
        "Metric": "schedule_to_watch"
      }
    },
    {
      "data": {
        "Perc50": 2457.899606,
        "Perc90": 3130.831216,
        "Perc99": 3983.652041
      },
      "unit": "ms",
      "labels": {
        "Metric": "pod_startup"
      }
    },
    {
      "data": {
        "Perc50": 0,
        "Perc90": 0,
        "Perc99": 1000
      },
      "unit": "ms",
      "labels": {
        "Metric": "create_to_schedule"
      }
    },
    {
      "data": {
        "Perc50": 1000,
        "Perc90": 2000,
        "Perc99": 2000
      },
      "unit": "ms",
      "labels": {
        "Metric": "schedule_to_run"
      }
    },
    {
      "data": {
        "Perc50": 1157.777974,
        "Perc90": 1742.486806,
        "Perc99": 2192.646648
      },
      "unit": "ms",
      "labels": {
        "Metric": "run_to_watch"
      }
    }
  ]
}

Scheduler Throughput

Baseline

{
  "average": 99.00990099009915,
  "perc50": 100.6,
  "perc90": 102,
  "perc99": 104.8
}

This PR

{
  "average": 95.23809523809524,
  "perc50": 99.2,
  "perc90": 101.2,
  "perc99": 115
}

API-Call-Latency

Baseline

W0205 04:51:05.880] I0205 04:51:05.879556   14942 prometheus.go:108] Executing "histogram_quantile(0.99, sum(rate(apiserver_request_duration_seconds_bucket{resource!=\"events\", verb!~\"WATCH|WATCHLIST|PROXY|proxy|CONNECT\"}[579m])) by (resource, subresource, verb, scope, le))" at 2020-02-05T04:50:52Z
W0205 04:51:13.491] I0205 04:51:13.490485   14942 prometheus.go:108] Executing "sum(increase(apiserver_request_duration_seconds_count{resource!=\"events\", verb!~\"WATCH|WATCHLIST|PROXY|proxy|CONNECT\"}[579m])) by (resource, subresource, scope, verb)" at 2020-02-05T04:50:52Z
W0205 04:51:13.710] I0205 04:51:13.710240   14942 api_responsiveness_prometheus.go:282] APIResponsivenessPrometheusSimple: Top latency metric: {Resource:pods Subresource: Verb:LIST Scope:namespace Latency:perc50: 33.872901ms, perc90: 92.702588ms, perc99: 3.094022988s Count:49727}; threshold: 5s
W0205 04:51:13.711] I0205 04:51:13.711072   14942 api_responsiveness_prometheus.go:282] APIResponsivenessPrometheusSimple: Top latency metric: {Resource:deployments Subresource: Verb:LIST Scope:cluster Latency:perc50: 50ms, perc90: 1.45s, perc99: 1.495s Count:4}; threshold: 30s
W0205 04:51:13.711] I0205 04:51:13.711503   14942 api_responsiveness_prometheus.go:282] APIResponsivenessPrometheusSimple: Top latency metric: {Resource:services Subresource: Verb:LIST Scope:cluster Latency:perc50: 135.646387ms, perc90: 194.890109ms, perc99: 1.339166666s Count:393}; threshold: 30s
W0205 04:51:13.712] I0205 04:51:13.711817   14942 api_responsiveness_prometheus.go:282] APIResponsivenessPrometheusSimple: Top latency metric: {Resource:nodes Subresource: Verb:LIST Scope:cluster Latency:perc50: 281.161616ms, perc90: 339.464751ms, perc99: 1.151083333s Count:3087}; threshold: 30s
W0205 04:51:13.712] I0205 04:51:13.712131   14942 api_responsiveness_prometheus.go:282] APIResponsivenessPrometheusSimple: Top latency metric: {Resource:persistentvolumes Subresource: Verb:LIST Scope:cluster Latency:perc50: 28.921078ms, perc90: 65.846153ms, perc99: 405.249999ms Count:1158}; threshold: 30s


This PR

W0206 03:24:30.481] I0206 03:24:30.481494   13663 prometheus.go:108] Executing "sum(increase(apiserver_request_duration_seconds_count{resource!=\"events\", verb!~\"WATCH|WATCHLIST|PROXY|proxy|CONNECT\"}[612m])) by (resource, subresource, scope, verb)" at 2020-02-06T03:24:12Z
W0206 03:24:30.658] I0206 03:24:30.658187   13663 api_responsiveness_prometheus.go:282] APIResponsivenessPrometheusSimple: Top latency metric: {Resource:deployments Subresource: Verb:LIST Scope:cluster Latency:perc50: 1.59057971s, perc90: 1.969354838s, perc99: 3.09s Count:246}; threshold: 30s
W0206 03:24:30.659] I0206 03:24:30.658242   13663 api_responsiveness_prometheus.go:282] APIResponsivenessPrometheusSimple: Top latency metric: {Resource:pods Subresource: Verb:LIST Scope:namespace Latency:perc50: 36.181327ms, perc90: 93.696929ms, perc99: 3.027289416s Count:59781}; threshold: 5s
W0206 03:24:30.659] I0206 03:24:30.658254   13663 api_responsiveness_prometheus.go:282] APIResponsivenessPrometheusSimple: Top latency metric: {Resource:nodes Subresource: Verb:LIST Scope:cluster Latency:perc50: 284.575371ms, perc90: 346.103999ms, perc99: 1.167596153s Count:3257}; threshold: 30s
W0206 03:24:30.659] I0206 03:24:30.658262   13663 api_responsiveness_prometheus.go:282] APIResponsivenessPrometheusSimple: Top latency metric: {Resource:statefulsets Subresource: Verb:DELETE Scope:namespace Latency:perc50: 26.612903ms, perc90: 47.903225ms, perc99: 800.999999ms Count:99}; threshold: 1s
W0206 03:24:30.659] I0206 03:24:30.658270   13663 api_responsiveness_prometheus.go:282] APIResponsivenessPrometheusSimple: Top latency metric: {Resource:services Subresource: Verb:LIST Scope:cluster Latency:perc50: 138.931297ms, perc90: 193.414634ms, perc99: 672ms Count:412}; threshold: 30s


Summary

I think the results look reasonable enough to merge this PR. We see some regression in pod-startup-latency (from 3.2s to 3.9s), but this is still well within the 5s SLO. Moreover, we'd like to get rid of this artificial way of measuring it (see #1024), and this small regression shouldn't block us from doing that; quite the opposite, it should encourage us to debug and improve it. On the other hand, we see improvements in api-call-latency, and the whole test (density + load) now takes 2h less.

# failure won't fail the test. See https://github.com/kubernetes/kubernetes/issues/73461#issuecomment-467338711
{{$saturationDeploymentHardTimeout := MaxInt $saturationDeploymentTimeout 1200}}

# TODO(https://github.com/kubernetes/perf-tests/issues/1007): Get rid of this file
Member:

There is also the high-density test, which relies on it.
Medium-term, we should modify the new load test to support it (e.g., instead of creating 2 deployments, create N), but short-term, maybe let's leave this test.

Contributor Author:

Sent you PRs to address it

{{$schedulerThroughputThreshold := DefaultParam .CL2_SCHEDULER_THROUGHPUT_THRESHOLD 0}}
# END scheduler-throughput section

# TODO(https://github.com/kubernetes/perf-tests/issues/1024): Ideally, we wouldn't need this section.
Member:

nit:
s/Ideally .../Investigate and get rid of this section./

Contributor Author:

Done.
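
For context on the DefaultParam line in the snippet above: CL2_SCHEDULER_THROUGHPUT_THRESHOLD falls back to 0 unless a job overrides it, which can be done through a clusterloader2 overrides file passed via --testoverrides. A hypothetical override:

# overrides.yaml, passed with --testoverrides=overrides.yaml
CL2_SCHEDULER_THROUGHPUT_THRESHOLD: 100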

  min: {{AddInt $namespaces 1}}
  max: {{AddInt $namespaces $schedulerThroughputNamespaces}}
replicasPerNamespace: 1
tuningSet: PodThroughputParallel
Member:

The PodThroughputParallel name is misleading, as in fact we create deployments with that throughput. Given those are fairly big ones, the pod throughput may in fact be much higher.

I don't have a good name though...

Contributor Author:

Doh, I meant to write SchedulerThroughputParallel, meaning that this is a dedicated tuningSet for SchedulerThroughput that results in fully parallel creation of deployments.
Changed and documented better.
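
A sketch of what the dedicated tuning set could look like, assuming the parallelismLimitedLoad tuning set type; the limit value below is a placeholder, not taken from this PR:

tuningSets:
- name: SchedulerThroughputParallel
  parallelismLimitedLoad:
    parallelismLimit: 1000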

mm4tt added a commit to mm4tt/perf-tests that referenced this pull request Feb 6, 2020
This is to allow deprecating density tests in all other places except high-density. See kubernetes#1008 (comment)
@mm4tt mm4tt force-pushed the get_rid_of_density branch from 192a1dd to 6c5052d Compare February 6, 2020 10:59
@wojtek-t (Member) commented Feb 6, 2020

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 6, 2020
@mm4tt (Contributor, Author) commented Feb 6, 2020

/hold
Will merge once #1026 gets merged

@mm4tt (Contributor, Author) commented Feb 6, 2020

@oxddr (current scalability oncall) FYI

I'm merging this today so we can check it tomorrow and revert before the weekend if needed.

@mm4tt (Contributor, Author) commented Feb 6, 2020

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 6, 2020
@mm4tt (Contributor, Author) commented Feb 6, 2020

/test pull-perf-tests-clusterloader2

@mm4tt mm4tt force-pushed the get_rid_of_density branch from 6c5052d to 0d48e27 Compare February 6, 2020 15:23
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 6, 2020
@wojtek-t (Member) commented Feb 6, 2020

@mm4tt - what has changed?

@mm4tt mm4tt force-pushed the get_rid_of_density branch from 0d48e27 to e1bb781 Compare February 7, 2020 11:38
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Feb 7, 2020
@mm4tt mm4tt changed the title from "Merge density test into load test" to "Experimental config with merged density and load" Feb 7, 2020
@mm4tt (Contributor, Author) commented Feb 7, 2020

Given some flakes I noticed in the presubmits, I changed the code to make it a no-op for existing tests.
I'm forking the load test into a separate config where I'll make my changes and enable it as an experimental job. Once we get enough results and confidence that the new test works as expected, I'll start enabling it in CI/CD jobs.

@mm4tt mm4tt force-pushed the get_rid_of_density branch from e1bb781 to 1f5b392 Compare February 7, 2020 11:41
@mm4tt (Contributor, Author) commented Feb 7, 2020

@wojtek-t, PTAL

This should be a no-op now.

@wojtek-t (Member) commented Feb 7, 2020

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 7, 2020
@k8s-ci-robot (Contributor) commented:

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mm4tt, wojtek-t

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment
