metricbeat/module/mongodb: Improve logic to calculate oplog info and window #42224

Open
wants to merge 12 commits into base: main
Conversation

@shmsr (Member) commented Jan 6, 2025

Proposed commit message

This change further improves an earlier implementation made some time back. Those previous changes helped a lot, but several users have continued to report high CPU usage, so this PR aims to improve things further. We now follow the approach MongoDB itself recommends for calculating the oplog window (i.e., lastTs - firstTs of the oplog): the query again leverages the natural order of the oplog, together with a Limit (to restrict the result to just one document) and a Projection. The only remaining expensive operation is sorting the oplog in reverse, i.e., $natural: -1. We have also upgraded the client library to further reduce client-side issues related to querying, etc.

Please see #42224 (comment) for a detailed comparison. Also, please read the inline comments in the code itself to understand the implemented logic; I have documented it thoroughly for future reference.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding changes to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Disruptive User Impact

Author's Checklist

  • Match the implementation with other oplog window calculators
  • Test the source
  • Benchmark

How to test this PR locally

  • Tested with a MongoDB replica set

Related issues

@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Jan 6, 2025
@mergify mergify bot assigned shmsr Jan 6, 2025

mergify bot commented Jan 6, 2025

This pull request does not have a backport label.
If this is a bug or security fix, could you label this PR @shmsr? 🙏.
For such, you'll need to label your PR with:

  • The upcoming major version of the Elastic Stack
  • The upcoming minor version of the Elastic Stack (if you're not pushing a breaking change)

To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-8./d is the label to automatically backport to the 8./d branch. /d is the digit


mergify bot commented Jan 6, 2025

backport-8.x has been added to help with the transition to the new branch 8.x.
If you don't need it please use backport-skip label and remove the backport-8.x label.

@mergify mergify bot added the backport-8.x Automated backport to the 8.x branch with mergify label Jan 6, 2025
@shmsr shmsr changed the title [DRAFT/ DO NOT REVIEW]: metricbeat/module/mongodb: Improve logic metricbeat/module/mongodb: Improve logic to calculate oplog info Jan 8, 2025
@shmsr shmsr changed the title metricbeat/module/mongodb: Improve logic to calculate oplog info metricbeat/module/mongodb: Improve logic to calculate oplog info and window Jan 8, 2025
@shmsr shmsr added the Team:Obs-InfraObs Label for the Observability Infrastructure Monitoring team label Jan 8, 2025
@botelastic botelastic bot removed the needs_team Indicates that the issue/PR needs a Team:* label label Jan 8, 2025
@shmsr shmsr marked this pull request as ready for review January 8, 2025 06:48
@shmsr shmsr requested review from a team as code owners January 8, 2025 06:48
@shmsr (Member, Author) commented Jan 9, 2025

So, to test the CPU and memory consumption of the changes made to calculate the oplog info/window:

Let's consider 3 cases that we want to compare:


We will use the following script to track docker stats (i.e., CPU and memory usage over time) for the MongoDB replica-set setup with 3 nodes:

#!/bin/bash

COMPOSE_PROJECT_DIR="$1"
OUTPUT_FILE="$2"
INTERVAL=1

# Set default values
if [ -z "$COMPOSE_PROJECT_DIR" ]; then
    COMPOSE_PROJECT_DIR="."
fi

if [ -z "$OUTPUT_FILE" ]; then
    OUTPUT_FILE="cpu_memory_usage.log"
fi

cd "$COMPOSE_PROJECT_DIR" || exit 1

# Clear previous logs
> "$OUTPUT_FILE"

while true; do
    echo "----------------------------------------"
    echo "Docker Container Usage - $(date)"
    echo "----------------------------------------"

    docker compose ps -q | while read -r container_id; do
        container_name=$(docker inspect --format '{{.Name}}' "$container_id" | sed 's/\///')
        stats=$(docker stats --no-stream --format "{{.CPUPerc}}\t{{.MemPerc}}" "$container_id")
        cpu_usage=$(echo "$stats" | cut -f1)
        mem_usage=$(echo "$stats" | cut -f2)
        timestamp=$(date +%s)

        echo "$timestamp,$container_name,$cpu_usage,$mem_usage" >> "$OUTPUT_FILE"
    done

    sleep "$INTERVAL"
done

We will use a custom benchmarking suite:

func runBenchmarkModeXXX(client *mongo.Client) {
	start := time.Now()
	iterations := 10000
	workers := 10

	var wg sync.WaitGroup
	errChan := make(chan error, iterations)

	batchSize := iterations / workers

	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
		defer wg.Done()
			for i := 0; i < batchSize; i++ {
				if _, err := getReplicationInfoXXX(client); err != nil {
					errChan <- fmt.Errorf("iteration failed: %w", err)
				}
			}
		}()
	}

	go func() {
		wg.Wait()
		close(errChan)
	}()

	// Process errors
	for err := range errChan {
		log.Printf("Error: %v", err)
	}

	duration := time.Since(start)
	fmt.Printf("Benchmark Results:\n")
	fmt.Printf("Total iterations: %d\n", iterations)
	fmt.Printf("Total time: %v\n", duration)
	fmt.Printf("Average time per operation: %v\n", duration/time.Duration(iterations))
	fmt.Printf("Operations per second: %.2f\n", float64(iterations)/duration.Seconds())
}

To run the benchmark and observe the resource usage, I extracted the logic for all 3 cases into standalone programs, so that each change could be benchmarked in isolation.


Case 1:
(Chart: container_usage1 — CPU/memory usage over time)

Notice that CPU peaks at 800%+ and memory peaks at 80%+, which is bad; hence the many issues we received claiming that this calculation causes memory spikes.


Case 2:
(Chart: container_usage2 — CPU/memory usage over time)

This greatly improved memory consumption, but only memory: the introduction of the aggregation pipeline did a good job there. Memory now peaks at 1.6%, while CPU still peaks at around 800%. Accordingly, we stopped getting issues reported for memory but started getting issues reported for CPU spikes.


Case 3:

(Chart: container_usage3 — CPU/memory usage over time)

Here, CPU peaks at 50%, and only briefly, and memory peaks at 0.36% — a considerable improvement on both fronts.

Please also note that for Case 1 and Case 2 the benchmark did not even complete after a minute in my setup, whereas Case 3 finished in under 10 seconds and reported the final stats.

From this, we can safely conclude that Case 3 (the changes in this PR) is a massive improvement in calculating the oplog window.

Labels
backport-8.x Automated backport to the 8.x branch with mergify Team:Obs-InfraObs Label for the Observability Infrastructure Monitoring team