panic: runtime error: invalid memory address or nil pointer dereference #12502

girgen · 2023-01-12T17:17:19Z

### Relevant telegraf.conf

# Configuration for telegraf agent

# Global tags can be specified here in key="value" format.
[global_tags]
  dc = "custom" # will tag all metrics with dc=pionen
  prod = "true"

# add this to rc.conf and put relevant files there
# telegraf_flags="-config-directory /usr/local/etc/telegraf.d"

[agent]
  ## Default data collection interval for all inputs
  interval = "10s"
  ## Rounds collection interval to 'interval'
  ## ie, if interval="10s" then always collect on :00, :10, :20, etc.
  round_interval = true

  ## Telegraf will cache metric_buffer_limit metrics for each output, and will
  ## flush this buffer on a successful write.
  metric_buffer_limit = 1000
  ## Flush the buffer whenever full, regardless of flush_interval.
  flush_buffer_when_full = true

  ## Collection jitter is used to jitter the collection by a random amount.
  ## Each plugin will sleep for a random time within jitter before collecting.
  ## This can be used to avoid many plugins querying things like sysfs at the
  ## same time, which can have a measurable effect on the system.
  collection_jitter = "1s"

  ## Default flushing interval for all outputs. You shouldn't set this below
  ## interval. Maximum flush_interval will be flush_interval + flush_jitter
  flush_interval = "10s"
  ## Jitter the flush interval by a random amount. This is primarily to avoid
  ## large write spikes for users running a large number of telegraf instances.
  ## ie, a jitter of 5s and interval 10s means flushes will happen every 10-15s
  flush_jitter = "5s"

  ## Run telegraf in debug mode
  debug = false
  ## Run telegraf in quiet mode
  quiet = true
  ## Override default hostname, if empty use os.Hostname()
  hostname = "xxx-hostname.yyy.zzz"

  ## Log target controls the destination for logs and can be one of "file",
  ## "stderr" or, on Windows, "eventlog".  When set to "file", the output file
  ## is determined by the "logfile" setting.
  logtarget = "file"

  ## Name of the file to be logged to when using the "file" logtarget.  If set to
  ## the empty string then logs are written to stderr.
  logfile = "/var/log/telegraf/telegraf.log"

  ## The logfile will be rotated after the time interval specified.  When set
  ## to 0 no time based rotation is performed.  Logs are rotated only when
  ## written to, if there is no log activity rotation may be delayed.
  logfile_rotation_interval = "0h"

  ## The logfile will be rotated when it becomes larger than the specified
  ## size.  When set to 0 no size based rotation is performed.
  logfile_rotation_max_size = "1MB"

  ## Maximum number of rotated archives to keep, any older logs are deleted.
  ## If set to -1, no archives are removed.
  logfile_rotation_max_archives = 5


# Configuration for influxdb server to send metrics to
[[outputs.influxdb]]
  ## The full HTTP or UDP endpoint URL for your InfluxDB instance.
  ## Multiple urls can be specified as part of the same cluster,
  ## this means that only ONE of the urls will be written to each interval.
  # urls = ["udp://192.168.1.244:8090"] # UDP endpoint example
  urls = ["https://influx.cxx.zzz:8086"] # required
  ## The target database for metrics (telegraf will create it if not exists).
  database = "pp_prod" # required
  ## Retention policy to write to.
  retention_policy = "default"
  ## Precision of writes, valid values are "ns", "us" (or "µs"), "ms", "s", "m", "h".
  ## note: using "s" precision greatly improves InfluxDB compression.
  precision = "s"

  ## Write timeout (for the InfluxDB client), formatted as a string.
  ## If not provided, will default to 5s. 0s means no timeout (not recommended).
  timeout = "5s"

  username = "xxxxx"
  password = "sEcReT"
  ## Set the user agent for HTTP POSTs (can be useful for log differentiation)
  # user_agent = "telegraf"
  ## Set UDP payload size, defaults to InfluxDB UDP Client default (512 bytes)
  # udp_payload = 512


# Statsd Server
[[inputs.statsd]]
  ## Address and port to host UDP listener on
  service_address = "localhost:8125"
  ## Delete gauges every interval (default=false)
  delete_gauges = true
  ## Delete counters every interval (default=false)
  delete_counters = true
  ## Delete sets every interval (default=false)
  delete_sets = true
  ## Delete timings & histograms every interval (default=true)
  delete_timings = true
  ## Percentiles to calculate for timing & histogram stats
  percentiles = [70.0, 90.0]

  ## separator to use between elements of a statsd metric
  metric_separator = "_"

  ## Number of UDP messages allowed to queue up, once filled,
  ## the statsd server will start dropping packets
  allowed_pending_messages = 10000

  ## Number of timing/histogram values to track per-measurement in the
  ## calculation of percentiles. Raising this limit increases the accuracy
  ## of percentiles but also increases the memory usage and cpu time.
  percentile_limit = 1000

  ## UDP packet size for the server to listen for. This will depend on the size
  ## of the packets that the client is sending, which is usually 1500 bytes.
  udp_packet_size = 1500



### Logs from Telegraf

```text
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x98b91f]

goroutine 1 [running]:
github.com/awnumar/memguard/core.Purge.func1(0xc000155930)
	github.com/awnumar/[email protected]/core/exit.go:23 +0x3f
github.com/awnumar/memguard/core.Purge()
	github.com/awnumar/[email protected]/core/exit.go:51 +0x25
github.com/awnumar/memguard/core.Panic({0x5093f80, 0xc0001119a0})
	github.com/awnumar/[email protected]/core/exit.go:85 +0x25
github.com/awnumar/memguard/core.NewBuffer(0x20)
	github.com/awnumar/[email protected]/core/buffer.go:73 +0x2d5
github.com/awnumar/memguard/core.NewCoffer()
	github.com/awnumar/[email protected]/core/coffer.go:30 +0x34
github.com/awnumar/memguard/core.init.0()
	github.com/awnumar/[email protected]/core/enclave.go:15 +0x2e
```

### System info

Telegraf 1.25.0, FreeBSD-13.1

### Docker

_No response_

### Steps to reproduce

1. install latest telegraf 1.25.0 from ports or package
2. start
3. see it crash and restart in a loop (due to the start script's daemon restart)



### Expected behavior

not panic :)

### Actual behavior

it panics due to segmentation fault

### Additional info

1.24.x works fine. The problem was introduced with telegraf 1.25.0

I am the packager for FreeBSD, btw.


powersj · 2023-01-12T17:31:39Z

This sounds very similar to #12403, which is due to awnumar/memguard#144

powersj · 2023-01-12T17:54:36Z

@girgen - was this run in a jail or some other type of container? Or was this on a vanilla FreeBSD system?

girgen · 2023-01-12T20:38:35Z

Ah, yes, in a jail. Sorry, forgot to mention that. I run most stuff in jails.

girgen · 2023-01-13T00:50:44Z

Is there a simple, or at least not too hard, way to opt out of that module at build time? That would mean opting out of the secret store feature, but as a short-term solution that would be preferred.

My other alternative is to let the start script fail and inform the user to reconfigure the jail if it does not have the allow.sysvipc=1 jail parameter.

A third alternative would be to downgrade the port until the problem is fixed.

The first alternative is preferred. Can we create a patch for the source code that opts out of that module?

Best regards,
Palle

powersj · 2023-01-13T03:42:12Z

> Is there a simple, or at least not too hard, way to opt out of that module at build time?

The panic occurs during an init() that runs even with an empty import of memguard. As a result, the fix looks like it needs to be in memguard.

Using golang.org/x/sys/unix and checking the error from the following may be enough:

err := unix.Mlockall(unix.MCL_FUTURE | unix.MCL_CURRENT)

I'm not sure how that behaves on non-Linux/Unix systems, though; I need to experiment with this further.

girgen · 2023-01-13T21:07:36Z

Mm, yeah, something like

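package core

import (
        "golang.org/x/sys/unix"
)

func init() {
        // Try to lock all current and future pages; an error here means
        // memory locking is not permitted (e.g. in a restricted jail).
        if err := unix.Mlockall(unix.MCL_FUTURE | unix.MCL_CURRENT); err != nil {
                // mlock is unavailable here
        }
}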

but then what? :) How can I opt out in that case? 🤔

powersj · 2023-01-18T13:55:55Z

> but then what? :) How can I opt out in that case? 🤔

My first goal was to see how that library could avoid panicking, which would allow us to continue using it. The opt-out would then not be necessary: we could safely import the library and would only need to return an error if someone tried to use the secret-store features when the jail/container/etc. did not have the correct privilege.

For now, I believe what you should document is the need to add the minimum required privileges. I think that is the allow.mlock parameter. Please let me know whether that is indeed the minimum required or whether you need additional parameters!

owlcall · 2023-01-22T00:18:51Z

Adding allow.mlock = 1; in the jail config resolved the panic. Thank you. No other changes needed to be made other than the jail config.

Details from my issue (for posterity/relevance):

Running FreeBSD 13.1 RELEASE (telegraf-1.25), problems started suddenly around a month ago. Configs have been untouched for a very long time, but telegraf updates are automated.

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x98b91f]

goroutine 1 [running]:
github.com/awnumar/memguard/core.Purge.func1(0xc000143930)
        github.com/awnumar/[email protected]/core/exit.go:23 +0x3f
github.com/awnumar/memguard/core.Purge()
        github.com/awnumar/[email protected]/core/exit.go:51 +0x25
github.com/awnumar/memguard/core.Panic({0x5093f80, 0xc0000f59b0})
        github.com/awnumar/[email protected]/core/exit.go:85 +0x25
github.com/awnumar/memguard/core.NewBuffer(0x20)
        github.com/awnumar/[email protected]/core/buffer.go:73 +0x2d5
github.com/awnumar/memguard/core.NewCoffer()
        github.com/awnumar/[email protected]/core/coffer.go:30 +0x34
github.com/awnumar/memguard/core.init.0()
        github.com/awnumar/[email protected]/core/enclave.go:15 +0x2e

Errors above are identical to those seen in issue #12403.

girgen · 2023-03-01T08:59:12Z

Hi,

While this is a workaround, I would still like to pursue the idea of actually changing the code to avoid using mlock. Is mlock really necessary?

powersj · 2023-03-01T14:01:56Z

> I would still like to pursue the idea of actually changing the code to avoid using mlock. Is mlock really necessary?

The memguard library is used by Telegraf in the secret store functionality, and that is not a feature we are going to remove. If you have an idea for how to work around the import of the library when it is not necessary, please do put up a PR.

girgen added the bug unexpected problem or unintended behavior label Jan 12, 2023

powersj added the waiting for response waiting for response from contributor label Jan 12, 2023

telegraf-tiger bot removed the waiting for response waiting for response from contributor label Jan 12, 2023

powersj mentioned this issue Mar 10, 2023

FreeBSD invalid memory address or nil pointer dereference #12838

Closed

irwinsun mentioned this issue Apr 23, 2023

Goagent版本优化 TencentBlueKing/bk-ci#8671

Closed

srebhan mentioned this issue Aug 1, 2023

docs(config): Add notes for secret-store pitfalls #13707

Merged

powersj closed this as completed in #13707 Aug 1, 2023

powersj mentioned this issue Oct 16, 2023

panic: runtime error: invalid memory address or nil pointer dereference #14118

Closed
