chore: rebase merge branch migrate-metrics-dev into master #1758

Merged
45 commits
0b4f0bb
feat(new_metrics): add replica-level metric entity (#1345)
empiredan Feb 13, 2023
eb93e04
feat(new_metrics): migrate replica-level metrics for write service (#…
empiredan Feb 23, 2023
1d59384
feat(new_metrics): migrate replica-level metrics for pegasus_server_i…
empiredan Mar 7, 2023
ae21e4a
feat(new_metrics): migrate replica-level metrics for pegasus_server_i…
empiredan Mar 10, 2023
982bb24
feat(new_metrics): migrate replica-level metrics for capacity_unit_ca…
empiredan Mar 14, 2023
28c8a93
feat(new_metrics): migrate replica-level metrics for replica class (#…
empiredan Mar 22, 2023
eeb8b51
feat(new_metrics): migrate replica-level metrics for pegasus_event_li…
empiredan Mar 23, 2023
1108d62
feat(new_metrics): migrate replica-level metrics for pegasus_mutation…
empiredan Mar 24, 2023
326757e
feat(new_metrics): add server-level metric entity (#1415)
empiredan Mar 25, 2023
565f951
feat(new_metrics): migrate built-in server-level metrics (#1418)
empiredan Mar 29, 2023
c6dcb7b
feat(new_metrics): migrate server-level metrics for nfs (#1421)
empiredan Mar 30, 2023
0db5649
feat(new_metrics): add disk-level metric entity and migrate disk-leve…
empiredan Apr 6, 2023
537612d
feat(new_metrics): add table-level metric entity and migrate table-le…
empiredan Apr 10, 2023
fc2351b
feat(new_metrics): add partition-level metric entity and migrate part…
empiredan Apr 14, 2023
f36cb58
feat(new_metrics): migrate server-level metrics for meta_service (#1437)
empiredan Apr 14, 2023
4d1f7f1
feat(new_metrics): add backup-policy-level metric entity and migrate …
empiredan Apr 15, 2023
1a37411
feat(new_metrics): migrate partition-level metrics for partition_guar…
empiredan Apr 16, 2023
447c61c
feat(new_metrics): migrate replica-level metrics for pegasus_manual_c…
empiredan Apr 17, 2023
66ae781
feat(new_metrics): migrate metrics for replica_stub (part 1) (#1455)
empiredan Apr 19, 2023
571cc42
feat(new_metrics): migrate metrics for replica_stub (part 2) (#1459)
empiredan Apr 26, 2023
679f56f
feat(collector): migrate the collector from pegasus-kv/collector (#1461)
acelyc111 Apr 27, 2023
adf4012
feat(new_metrics): migrate metrics for replica_stub (part 3) (#1462)
empiredan Apr 27, 2023
55efb07
feat(new_metrics): migrate metrics for replica_stub (part 4) (#1463)
empiredan May 5, 2023
3642edb
feat(new_metrics): migrate metrics for replica_stub (part 5) (#1469)
empiredan May 11, 2023
ba2136a
feat(new_metrics): migrate metrics for replica_stub (part 6) (#1474)
empiredan May 12, 2023
8811066
feat(new_metrics): migrate metrics for replica_stub (part 7) (#1475)
empiredan May 17, 2023
f607528
feat(new_metrics): migrate metrics for some duplication class (#1482)
empiredan May 22, 2023
c8f93f1
feat(new_metrics): migrate metrics for task queue (#1484)
empiredan May 25, 2023
6f29fa6
refactor(new_metrics): refactor enum definition for metric types and …
empiredan May 31, 2023
8cced34
feat(new_metrics): migrate metrics for failure detector (#1502)
empiredan Jun 1, 2023
b892f0c
feat(new_metrics): migrate metrics for network (#1504)
empiredan Jun 1, 2023
e147334
feat(new_metrics): migrate server-level metrics of rocksdb (#1506)
empiredan Jun 2, 2023
72af442
feat: Aggregate table/server level metrics (#1517)
xinghuayu007 Jun 7, 2023
2d2f632
feat(new_metrics): migrate metrics for profiler (#1524)
empiredan Jun 12, 2023
c028c1d
fix(new_metrics): profiled tasks are measured by the wrong metrics (#…
empiredan Jun 13, 2023
23b1261
feat(new_metrics): remove all table-level perf-counters for each repl…
empiredan Jun 14, 2023
5739d8b
refactor(new_metrics): remove perf-counters that are still used in sh…
empiredan Jun 14, 2023
917e7ef
feat(new_metrics): migrate metrics for latency tracer (#1537)
empiredan Jun 21, 2023
274c0fb
fix(new_metrics): total_capacity_mb/total_available_mb are not atomic…
empiredan Jun 25, 2023
3584394
feat(new_metrics): remove http service for perf counters (#1540)
empiredan Jun 25, 2023
d5563c7
feat(new_metrics): remove pegasus_counter_reporter (#1548)
empiredan Jun 27, 2023
c3d65bb
feat(new_metrics): remove perf counter since shared log has been dep…
empiredan Dec 8, 2023
481d9de
feat(new_metrics): remove deleted header files introduced in source …
empiredan Dec 8, 2023
69f5385
fix(new_metrics): fix unit tests in verifying values of backup_reques…
empiredan Dec 8, 2023
8b9221a
fix(IWYU): fix the suggestions reported by IWYU while rebase merging …
empiredan Dec 8, 2023
72 changes: 72 additions & 0 deletions .github/workflows/lint_and_test_collector.yml
@@ -0,0 +1,72 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
---
# workflow name
name: Golang Lint and Unit Test - collector

# on events
on:
  # run on each pull request
  pull_request:
    types: [ synchronize, reopened, opened ]
    branches:
      - master
      - 'v[0-9]+.*' # release branch
      - ci-test # testing branch for github action
      - '*dev'
    paths:
      - collector/**

  # for manually triggering workflow
  workflow_dispatch:

# workflow tasks
jobs:
  lint:
    name: Lint
    runs-on: ubuntu-20.04
    steps:
      - name: Checkout
        uses: actions/checkout@v3
        with:
          fetch-depth: 1
      - name: Set up Go
        uses: actions/setup-go@v4
        with:
          go-version: 1.14
          cache: false
      - name: Lint
        uses: golangci/golangci-lint-action@v3
        with:
          version: v1.29
          working-directory: ./collector

  build:
    name: Build
    runs-on: ubuntu-20.04
    steps:
      - name: Checkout
        uses: actions/checkout@v3
        with:
          fetch-depth: 1
      - name: Set up Go
        uses: actions/setup-go@v2
        with:
          go-version: 1.14
      - name: Build
        working-directory: ./collector
        run: make
30 changes: 24 additions & 6 deletions .github/workflows/lint_and_test_cpp.yaml
@@ -198,7 +198,11 @@ jobs:
- base_api_test
- base_test
- bulk_load_test
- detect_hotspot_test
# TODO(wangdan): Since the hotspot detection depends on the perf-counters system which
# is being replaced with the new metrics system, its test will fail. Temporarily disable
# the test and re-enable it after the hotspot detection is migrated to the new metrics
# system.
# - detect_hotspot_test
- dsn_aio_test
- dsn_block_service_test
- dsn_client_test
@@ -207,7 +211,9 @@ jobs:
- dsn_meta_state_tests
- dsn.meta.test
- dsn_nfs_test
- dsn_perf_counter_test
# TODO(wangdan): Since builtin_counters (memused.virt and memused.res) for perf-counters
# have been removed and dsn_perf_counter_test depends on them, disable it.
# - dsn_perf_counter_test
- dsn_replica_backup_test
- dsn_replica_bulk_load_test
- dsn_replica_dup_test
@@ -335,7 +341,11 @@ jobs:
- base_api_test
- base_test
- bulk_load_test
- detect_hotspot_test
# TODO(wangdan): Since the hotspot detection depends on the perf-counters system which
# is being replaced with the new metrics system, its test will fail. Temporarily disable
# the test and re-enable it after the hotspot detection is migrated to the new metrics
# system.
# - detect_hotspot_test
- dsn_aio_test
- dsn_block_service_test
- dsn_client_test
@@ -344,7 +354,9 @@ jobs:
- dsn_meta_state_tests
- dsn.meta.test
- dsn_nfs_test
- dsn_perf_counter_test
# TODO(wangdan): Since builtin_counters (memused.virt and memused.res) for perf-counters
# have been removed and dsn_perf_counter_test depends on them, disable it.
# - dsn_perf_counter_test
- dsn_replica_backup_test
- dsn_replica_bulk_load_test
- dsn_replica_dup_test
@@ -477,7 +489,11 @@ jobs:
# - base_api_test
# - base_test
# - bulk_load_test
# - detect_hotspot_test
# # TODO(wangdan): Since the hotspot detection depends on the perf-counters system which
# # is being replaced with the new metrics system, its test will fail. Temporarily disable
# # the test and re-enable it after the hotspot detection is migrated to the new metrics
# # system.
# # - detect_hotspot_test
# - dsn_aio_test
# - dsn_block_service_test
# - dsn_client_test
@@ -486,7 +502,9 @@ jobs:
# - dsn_meta_state_tests
# - dsn.meta.test
# - dsn_nfs_test
# - dsn_perf_counter_test
# # TODO(wangdan): Since builtin_counters (memused.virt and memused.res) for perf-counters
# # have been removed and dsn_perf_counter_test depends on them, disable it.
# # - dsn_perf_counter_test
# - dsn_replica_backup_test
# - dsn_replica_bulk_load_test
# - dsn_replica_dup_test
2 changes: 2 additions & 0 deletions .github/workflows/module_labeler_conf.yml
@@ -19,6 +19,8 @@ github:
- .github/**/*
admin-cli:
- admin-cli/**/*
collector:
- collector/**/*
docker:
- docker/**/*
go-client:
3 changes: 3 additions & 0 deletions .gitignore
@@ -271,3 +271,6 @@ thirdparty/output/

#macOS
.DS_Store

#collector
collector/collector
24 changes: 24 additions & 0 deletions collector/Makefile
@@ -0,0 +1,24 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

build:
	go mod tidy
	go mod verify
	go build -o collector

fmt:
	go fmt ./...
28 changes: 28 additions & 0 deletions collector/README.md
@@ -0,0 +1,28 @@
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->

# Pegasus Collector

[Chinese documentation]

Collector is a part of the Pegasus ecosystem that serves as:

1. the service availability detector
2. the hotkey detector
3. the capacity units recorder
92 changes: 92 additions & 0 deletions collector/aggregate/aggregatable.go
@@ -0,0 +1,92 @@
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.

package aggregate

var v1Tov2MetricsConversion = map[string]string{
	"replica*app.pegasus*get_qps": "get_qps",
	"replica*app.pegasus*multi_get_qps": "multi_get_qps",
	"replica*app.pegasus*put_qps": "put_qps",
	"replica*app.pegasus*multi_put_qps": "multi_put_qps",
	"replica*app.pegasus*remove_qps": "remove_qps",
	"replica*app.pegasus*multi_remove_qps": "multi_remove_qps",
	"replica*app.pegasus*incr_qps": "incr_qps",
	"replica*app.pegasus*check_and_set_qps": "check_and_set_qps",
	"replica*app.pegasus*check_and_mutate_qps": "check_and_mutate_qps",
	"replica*app.pegasus*scan_qps": "scan_qps",
	"replica*eon.replica*backup_request_qps": "backup_request_qps",
	"replica*app.pegasus*duplicate_qps": "duplicate_qps",
	"replica*app.pegasus*dup_shipped_ops": "dup_shipped_ops",
	"replica*app.pegasus*dup_failed_shipping_ops": "dup_failed_shipping_ops",
	"replica*app.pegasus*get_bytes": "get_bytes",
	"replica*app.pegasus*multi_get_bytes": "multi_get_bytes",
	"replica*app.pegasus*scan_bytes": "scan_bytes",
	"replica*app.pegasus*put_bytes": "put_bytes",
	"replica*app.pegasus*multi_put_bytes": "multi_put_bytes",
	"replica*app.pegasus*check_and_set_bytes": "check_and_set_bytes",
	"replica*app.pegasus*check_and_mutate_bytes": "check_and_mutate_bytes",
	"replica*app.pegasus*recent.read.cu": "recent_read_cu",
	"replica*app.pegasus*recent.write.cu": "recent_write_cu",
	"replica*app.pegasus*recent.expire.count": "recent_expire_count",
	"replica*app.pegasus*recent.filter.count": "recent_filter_count",
	"replica*app.pegasus*recent.abnormal.count": "recent_abnormal_count",
	"replica*eon.replica*recent.write.throttling.delay.count": "recent_write_throttling_delay_count",
	"replica*eon.replica*recent.write.throttling.reject.count": "recent_write_throttling_reject_count",
	"replica*app.pegasus*disk.storage.sst(MB)": "sst_storage_mb",
	"replica*app.pegasus*disk.storage.sst.count": "sst_count",
	"replica*app.pegasus*rdb.block_cache.hit_count": "rdb_block_cache_hit_count",
	"replica*app.pegasus*rdb.block_cache.total_count": "rdb_block_cache_total_count",
	"replica*app.pegasus*rdb.index_and_filter_blocks.memory_usage": "rdb_index_and_filter_blocks_mem_usage",
	"replica*app.pegasus*rdb.memtable.memory_usage": "rdb_memtable_mem_usage",
	"replica*app.pegasus*rdb.estimate_num_keys": "rdb_estimate_num_keys",
	"replica*app.pegasus*rdb.bf_seek_negatives": "rdb_bf_seek_negatives",
	"replica*app.pegasus*rdb.bf_seek_total": "rdb_bf_seek_total",
	"replica*app.pegasus*rdb.bf_point_positive_true": "rdb_bf_point_positive_true",
	"replica*app.pegasus*rdb.bf_point_positive_total": "rdb_bf_point_positive_total",
	"replica*app.pegasus*rdb.bf_point_negatives": "rdb_bf_point_negatives",
}

var aggregatableSet = map[string]interface{}{
	"read_qps": nil,
	"write_qps": nil,
	"read_bytes": nil,
	"write_bytes": nil,
}

// aggregatable reports whether the counter should be aggregated by the collector,
// i.e. whether its name is a key of v1Tov2MetricsConversion or aggregatableSet.
// A counter with a v1 name is renamed in place to its v2 name.
func aggregatable(pc *partitionPerfCounter) bool {
	v2Name, found := v1Tov2MetricsConversion[pc.name]
	if found { // v1 name: rewrite it to the v2 name
		pc.name = v2Name
		return true // every metric listed in the conversion table is aggregatable
	}
	_, found = aggregatableSet[pc.name]
	return found
}

// AllMetrics returns the names of all metrics tracked by this collector.
// The cluster-level and table-level metric sets are identical.
func AllMetrics() (res []string) {
	for _, newName := range v1Tov2MetricsConversion {
		res = append(res, newName)
	}
	for name := range aggregatableSet {
		res = append(res, name)
	}
	return res
}
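
Reviewer-style sketch, not part of the PR: the standalone program below shows how `aggregatable` both filters counters and renames v1 names in place, using a two-entry excerpt of the conversion table. The one-field `partitionPerfCounter` struct and the `sortedMetrics` helper are assumptions made for illustration; the real struct is defined elsewhere in the collector and is not shown in this diff. `sortedMetrics` is included because Go does not specify map iteration order, so a slice built the way `AllMetrics` builds it may come back in a different order on each call.

```go
package main

import (
	"fmt"
	"sort"
)

// partitionPerfCounter is a hypothetical stand-in for the collector's type;
// only the name field is known from the diff.
type partitionPerfCounter struct {
	name string
}

// Two-entry excerpt of the conversion table from aggregatable.go.
var v1Tov2MetricsConversion = map[string]string{
	"replica*app.pegasus*get_qps":        "get_qps",
	"replica*app.pegasus*recent.read.cu": "recent_read_cu",
}

// One-entry excerpt of the set of names that are aggregatable as-is.
var aggregatableSet = map[string]interface{}{
	"read_qps": nil,
}

// aggregatable follows the logic in the diff: rename v1 names to v2 names,
// otherwise accept only names found in aggregatableSet.
func aggregatable(pc *partitionPerfCounter) bool {
	if v2Name, found := v1Tov2MetricsConversion[pc.name]; found {
		pc.name = v2Name
		return true
	}
	_, found := aggregatableSet[pc.name]
	return found
}

// sortedMetrics is a hypothetical variant of AllMetrics that sorts the names
// so that callers see a stable order.
func sortedMetrics() []string {
	res := make([]string, 0, len(v1Tov2MetricsConversion)+len(aggregatableSet))
	for _, newName := range v1Tov2MetricsConversion {
		res = append(res, newName)
	}
	for name := range aggregatableSet {
		res = append(res, name)
	}
	sort.Strings(res)
	return res
}

func main() {
	counters := []partitionPerfCounter{
		{name: "replica*app.pegasus*get_qps"}, // v1 name, gets renamed
		{name: "read_qps"},                    // already a tracked name
		{name: "replica*app.pegasus*unknown"}, // not tracked, left untouched
	}
	for i := range counters {
		pc := &counters[i]
		ok := aggregatable(pc)
		fmt.Printf("%s aggregatable=%v\n", pc.name, ok)
	}
	fmt.Println(sortedMetrics())
}
```

Note that `aggregatable` mutates its argument, so callers that still need the v1 name would have to copy it before the call.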