[Bug]: Server creation from the same snapshot sometimes takes an abnormal time #966

Open
frederic-arr opened this issue Jul 28, 2024 · 2 comments

What happened?

When creating several servers from the same snapshot, the provider sometimes stays in the "Still creating..." state for an additional minute even though the server has already been ready for a while.

# Omitted
hcloud_server.arm_small[2]: Creation complete after 1m9s [id=50992996]
hcloud_server.arm_small[0]: Creation complete after 1m9s [id=50992998]
hcloud_server.arm_small[1]: Creation complete after 1m10s [id=50992999]
hcloud_server.arm_small[4]: Still creating... [1m10s elapsed]
hcloud_server.arm_small[3]: Still creating... [1m10s elapsed]
# The servers are ready soon after this message
# Omitted
hcloud_server.arm_small[4]: Creation complete after 2m11s [id=50992997]
hcloud_server.arm_small[3]: Creation complete after 2m11s [id=50993000]

What did you expect to happen?

The creation should be marked as complete as soon as possible (without exceeding the API rate limit).

Please provide a minimal working example

terraform {
  required_providers {
    hcloud = {
      source  = "hetznercloud/hcloud"
      version = "1.48.0"
    }
  }
}

variable "hcloud_token" {
  type      = string
  sensitive = true
}

variable "image_id_arm" {
  type = string
}

provider "hcloud" {
  token = var.hcloud_token
}

resource "hcloud_server" "arm_small" {
  name        = "arm-small-${count.index}"
  image       = var.image_id_arm
  server_type = "cax11"
  location    = "nbg1"
  public_net {
    ipv4_enabled = false
    ipv6_enabled = true
  }

  count = 5
}

var.image_id_arm is the image ID of my snapshot. The snapshot was created in NBG1 and is 350 MB in size. It was built with the following Packer file:

# hcloud.pkr.hcl

packer {
  required_plugins {
    hcloud = {
      source  = "github.com/hetznercloud/hcloud"
      version = "~> 1"
    }
  }
}

variable "talos_version" {
  type    = string
  default = "v1.7.5"
}

variable "arch" {
  type    = string
  default = "arm64"
}

variable "server_type" {
  type    = string
  default = "cax11"
}

variable "server_location" {
  type    = string
  default = "nbg1"
}

locals {
  image = "https://github.com/siderolabs/talos/releases/download/${var.talos_version}/hcloud-${var.arch}.raw.xz"
}

source "hcloud" "talos" {
  rescue       = "linux64"
  image        = "debian-12"
  location     = var.server_location
  server_type  = var.server_type
  ssh_username = "root"

  snapshot_name   = "talos system disk - ${var.arch} - ${var.talos_version}"
  snapshot_labels = {
    type    = "infra",
    os      = "talos",
    version = var.talos_version,
    arch    = var.arch,
  }
}

build {
  sources = ["source.hcloud.talos"]

  provisioner "shell" {
    inline = [
      "apt-get install -y wget",
      "wget -O /tmp/talos.raw.xz ${local.image}",
      "xz -d -c /tmp/talos.raw.xz | dd of=/dev/sda && sync",
    ]
  }
}
@apricote apricote self-assigned this Aug 8, 2024

apricote commented Aug 8, 2024

The API client currently uses exponential backoff while waiting for actions. The version used in the provider has no maximum interval, so the delay can grow to the point where minutes pass between checks.
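
To illustrate the effect, here is a rough simulation, not the provider's actual code. It assumes the delay doubles on every retry (the factor of 2 is an assumption), starts from the 500ms default interval, and ignores request latency. The function names are made up for this sketch.

```python
from itertools import count


def exponential_delays(base=0.5, factor=2.0):
    """Yield uncapped delays in seconds: base, base*factor, base*factor**2, ..."""
    # base=0.5 mirrors the 500ms default; factor=2.0 is an assumption.
    for retry in count():
        yield base * factor ** retry


def first_check_at_or_after(ready_at, delays):
    """Return the elapsed time of the first poll made once the action is done."""
    elapsed = 0.0
    for delay in delays:
        elapsed += delay  # sleep, then poll
        if elapsed >= ready_at:
            return elapsed


# An action finishing at 60s is noticed by the poll at 63.5s,
# but one finishing at 70s is only noticed at 127.5s.
print(first_check_at_or_after(60.0, exponential_delays()))  # 63.5
print(first_check_at_or_after(70.0, exponential_delays()))  # 127.5
```

Under these assumptions, a server that becomes ready shortly after the poll at about 64s is not seen again until about 128s, which would be consistent with the gap between the 1m10s and 2m11s completions in the log above.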

We have added an alternative "truncated" backoff in a recent version of hcloud-go: hetznercloud/hcloud-go#459
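
As a general illustration of the idea (this is not the hcloud-go API; the function name, the factor of 2, and the 10s cap are all hypothetical), a truncated backoff grows like the exponential one but never exceeds a fixed maximum:

```python
def truncated_exponential_delays(base=0.5, factor=2.0, cap=10.0, retries=12):
    """Return the first `retries` delays in seconds, each limited to `cap`."""
    # cap=10.0 is an illustrative value, not a documented default.
    return [min(base * factor ** retry, cap) for retry in range(retries)]


delays = truncated_exponential_delays()
print(delays[:7])   # [0.5, 1.0, 2.0, 4.0, 8.0, 10.0, 10.0]
print(max(delays))  # 10.0
```

With a cap in place, the worst-case lag between an action finishing and the next check is bounded by the cap rather than growing with every retry.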

Unfortunately, the hcloud-go version used in the Terraform provider is stuck on the v1 branch (see #877).

For now, you can switch to a constant backoff by modifying your provider configuration:

provider "hcloud" {
  token = var.hcloud_token

  poll_function = "constant"
  poll_interval = "5s" # default is 500ms, which might cause rate-limit issues with the constant polling
}
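
A quick back-of-the-envelope sketch of the trade-off behind the interval choice. It assumes one request per poll and that the first poll happens after one full interval; the helper name is made up for illustration.

```python
import math


def constant_polling(ready_at, interval=5.0):
    """Return (number of polls, time of the poll that sees the finished action)."""
    polls = math.ceil(ready_at / interval)
    return polls, polls * interval


# 5s interval: at most 5s of lag, 14 requests for an action taking about 70s.
print(constant_polling(70.0))                # (14, 70.0)
print(constant_polling(66.0))                # (14, 70.0)
# 500ms interval: same action, but 140 requests per server.
print(constant_polling(70.0, interval=0.5))  # (140, 70.0)
```

With several servers created in parallel, the request count multiplies accordingly, which is why the longer interval is suggested alongside constant polling.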


github-actions bot commented Nov 6, 2024

This issue has been marked as stale because it has not had recent activity. The bot will close the issue if no further action occurs.

@github-actions github-actions bot added the stale label Nov 6, 2024
@jooola jooola added pinned and removed stale labels Nov 6, 2024