
Commit

Merge branch 'rachel-stage' into stage

stu-clark committed Nov 15, 2024
2 parents 6623371 + 6edf6f1 commit fff2bb8
Showing 41 changed files with 514 additions and 172 deletions.
1 change: 1 addition & 0 deletions content/cumulus-netq-411/_index.md
@@ -9,6 +9,7 @@ cascade:
version: "4.11"
imgData: cumulus-netq
siteSlug: cumulus-netq
old: true
---

NVIDIA® NetQ™ is a network operations tool set that provides visibility into your overlay and underlay networks, enabling troubleshooting in real-time. NetQ delivers data and statistics about the health of your data center—from the container, virtual machine, or host, all the way to the switch and port. NetQ correlates configuration and operational status, and tracks state changes while simplifying management for the entire Linux-based data center. With NetQ, network operations change from a manual, reactive, node-by-node approach to an automated, informed, and agile one. Visit {{<exlink url="https://www.nvidia.com/en-us/networking/ethernet-switching/netq/" text="Network Operations with NetQ">}} to learn more.
@@ -75,19 +75,19 @@ If you restore NetQ data to a server with an IP address that is different from t
{{</notice>}}

```
cumulus@netq-appliance:~$ sudo vm-backuprestore.sh --restore --backupfile /home/cumulus/backup-netq-standalone-onprem-4.9.0-2024-02-06_12_37_29_UTC.tar
cumulus@netq-appliance:~$ sudo vm-backuprestore.sh --restore --backupfile /home/cumulus/backup-netq-standalone-onprem-4.10.0-2024-02-06_12_37_29_UTC.tar
Mon Feb 6 12:39:57 2024 - Please find detailed logs at: /var/log/vm-backuprestore.log
Mon Feb 6 12:39:57 2024 - Starting restore of data
Mon Feb 6 12:39:57 2024 - Extracting release file from backup tar
Mon Feb 6 12:39:57 2024 - Cleaning the system
Mon Feb 6 12:39:57 2024 - Restoring data from tarball /home/cumulus/backup-netq-standalone-onprem-4.9.0-2024-02-06_12_37_29_UTC.tar
Mon Feb 6 12:39:57 2024 - Restoring data from tarball /home/cumulus/backup-netq-standalone-onprem-4.10.0-2024-02-06_12_37_29_UTC.tar
Data restored successfully
Please follow the below instructions to bootstrap the cluster
The config key restored is EhVuZXRxLWVuZHBvaW50LWdhdGVfYXkYsagDIix2OUJhMUpyekMwSHBBaitUdTVDaTRvbVJDR3F6Qlo4VHhZRytjUUhLZGJRPQ==, alternately the config key is available in file /tmp/config-key
Pass the config key while bootstrapping:
Example(standalone): netq install standalone full interface eth0 bundle /mnt/installables/NetQ-4.11.0.tgz config-key EhVuZXRxLWVuZHBvaW50LWdhdGV3YXkYsagDIix2OUJhMUpyekMwSHBbaitUdTVDaTRvbVJDR3F6Qlo4VHhZRytjUUhLZGJRPQ==
Example(cluster): netq install cluster full interface eth0 bundle /mnt/installables/NetQ-4.11.0.tgz config-key EhVuZXRxLWVuZHBvaW50LWdhdGV3YXkYsagDIix2OUJhMUpyekMwSHBbaitUdTVDaTRvbVJDR3F6Qlo4VHhZRytjUUhLZGJRPQ==
Example(standalone): netq install standalone full interface eth0 bundle /mnt/installables/NetQ-4.12.0.tgz config-key EhVuZXRxLWVuZHBvaW50LWdhdGV3YXkYsagDIix2OUJhMUpyekMwSHBbaitUdTVDaTRvbVJDR3F6Qlo4VHhZRytjUUhLZGJRPQ==
Example(cluster): netq install cluster full interface eth0 bundle /mnt/installables/NetQ-4.12.0.tgz config-key EhVuZXRxLWVuZHBvaW50LWdhdGV3YXkYsagDIix2OUJhMUpyekMwSHBbaitUdTVDaTRvbVJDR3F6Qlo4VHhZRytjUUhLZGJRPQ==
Alternately you can setup config-key post bootstrap in case you missed to pass it during bootstrap
Example(standalone): netq install standalone activate-job config-key EhVuZXRxLWVuZHBvaW50LWdhdGV3YXkYsagDIix2OUJhMUpyekMwSHBbaitUdTVDaTRvbVJDR3F6Qlo4VHhZRytjUUhLZGJRPQ==
Example(cluster): netq install cluster activate-job config-key EhVuZXRxLWVuZHBvaW50LWdhdGV3YXkYsagDIix2OUJhMUpyekMwSHBbaitUdTVDaTRvbVJDR3F6Qlo4VHhZRytjUUhLZGJRPQ==
@@ -14,13 +14,15 @@ Consider the following deployment options and requirements before you install th
| Single Server | High-Availability Cluster| High-Availability Scale Cluster |
| --- | --- | --- |
| On-premises or cloud | On-premises or cloud | On-premises only |
| Low scale<ul><li>Single server supports up to TKTK devices</li></ul>| Medium scale<ul><li>3-node deployment supports up to 100 devices and 12,800 interfaces</li></ul>| High scale<ul><li>3-node deployment supports up to 1000 devices and TKTK interfaces</li></ul>|
| Network size: small<ul></ul>| Network size: medium<ul><li>Supports up to 100 switches and 128 interfaces per switch*</li></ul>| Network size: large<ul><li>Supports up to 1,000 switches and 125,000 interfaces* </li></ul>|
| KVM or VMware hypervisor | KVM or VMware hypervisor | KVM or VMware hypervisor |
| System requirements<br><br> On-premises: 16 virtual CPUs, 64GB RAM, 500GB SSD disk<br><br>Cloud: 4 virtual CPUs, 8GB RAM, 64GB SSD disk | System requirements (per node)<br><br> On-premises: 16 virtual CPUs, 64GB RAM, 500GB SSD disk<br><br>Cloud: 4 virtual CPUs, 8GB RAM, 64GB SSD disk | System requirements (per node)<br><br>On-premises: 48 virtual CPUs, 512GB RAM, 3.2TB SSD disk|
| All features supported | All features supported| No support for:<ul><li>Network snapshots</li><li>Trace requests</li><li>Flow analysis</li><li>Duplicate IP address validations</li><li>MAC commentary</li><li>Link health view</li></ul> Limited support for:<ul><li>Topology validations</li></ul>|
| All features supported | All features supported| No support for:<ul><li>Network snapshots</li><li>Trace requests</li><li>Flow analysis</li><li>Duplicate IP address validations</li><li>MAC commentary</li><li>Link health view</li></ul>|

*Exact device support counts can vary based on multiple factors, such as the number of links, routes, and IP addresses in your network. Contact NVIDIA for assistance in selecting the appropriate deployment model for your network.

NetQ is also available through NVIDIA Base Command Manager. To get started, refer to the {{<exlink url="https://docs.nvidia.com/base-command-manager/#product-manuals" text="Base Command Manager administrator and containerization manuals">}}.
## Deployment Type: On-premises or Cloud

## Deployment Type: On-Premises or Cloud

**On-premises deployments** are hosted at your location and require the in-house skill set to install, configure, back up, and maintain NetQ. This model is a good choice if you want very limited or no access to the internet from switches and hosts in your network.

@@ -30,20 +30,28 @@ In all deployment models, the NetQ Agents reside on the switches and hosts they

## Server Arrangement: Single or Cluster

A **single server** is easier to set up, configure, and manage, but can limit your ability to scale your network monitoring quickly. Deploying multiple servers is more complicated, but you limit potential downtime and increase availability by having more than one server that can run the software and store the data. Select the standalone, single-server arrangements for smaller, simpler deployments.
A **single server** is easier to set up, configure, and manage, but limits your ability to scale your network monitoring. Deploying multiple servers allows you to limit potential downtime and increase availability by having more than one server that can run the software and store the data. Select the standalone, single-server arrangement for smaller, simpler deployments.

Select the **high-availability cluster** deployment for greater device support and high availability for your network. The clustering implementation comprises three servers: one master and two workers. NetQ supports high availability server-cluster deployments using a virtual IP address. Even if the master node fails, NetQ services remain operational. However, keep in mind that the master hosts the Kubernetes control plane so anything that requires connectivity with the Kubernetes cluster&mdash;such as upgrading NetQ or rescheduling pods to other workers if a worker goes down&mdash;will not work.
The **high-availability cluster** deployment supports a greater number of switches and provides high availability for your network. The clustering implementation comprises three servers: one master node and two worker nodes. NetQ supports high availability server-cluster deployments using a virtual IP address. Even if the master node fails, NetQ services remain operational. However, keep in mind that the master hosts the Kubernetes control plane, so anything that requires connectivity with the Kubernetes cluster&mdash;such as upgrading NetQ or rescheduling pods to other workers if a worker goes down&mdash;will not work.

During the installation process, you configure a virtual IP address that enables redundancy for the Kubernetes control plane. In this configuration, the majority of nodes must be operational for NetQ to function. For example, a three-node cluster can tolerate a one-node failure, but not a two-node failure. For more information, refer to the {{<exlink url="https://etcd.io/docs/v3.3/faq/" text="etcd documentation">}}.
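To confirm that all three nodes are present and healthy after the cluster is up, you can query Kubernetes from the master node. This is an illustrative check rather than a required installation step; node names and status output will vary by deployment:

```
cumulus@netq-master:~$ kubectl get nodes
```

If fewer than two nodes report a `Ready` status, the cluster has lost quorum and NetQ services might be unavailable.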

The **high-availability scale cluster** deployment provides support for the greatest number of devices and provides an extensible framework for greater scalability. <!--As the number of devices in your network grows, you can add additional nodes to the cluster to support the additional devices. 4.12 supports only 3-node cluster-->
The **high-availability scale cluster** deployment provides the same benefits as the high-availability cluster deployment, but supports larger networks of up to 1,000 switches. NVIDIA recommends this option for networks that have over 100 switches and at least 100 interfaces per switch. It offers the highest level of scalability, allowing you to adjust NetQ's network monitoring capacity as your network expands.

Tabular data in the UI is limited to 10,000 rows. For large networks, NVIDIA recommends downloading and exporting the tabular data as a CSV or JSON file and opening it in a spreadsheet program for further analysis. Refer to the installation overview table at the beginning of this section for additional information about HA scale cluster deployment support.
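After exporting a table from the UI, you can also perform a quick inspection from any Linux host before opening the file in a spreadsheet program. This is a generic sketch; the downloaded file name is illustrative:

```
cumulus@host:~$ wc -l ~/Downloads/netq-events.csv
cumulus@host:~$ head -5 ~/Downloads/netq-events.csv | column -s, -t
```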

<!--As the number of devices in your network grows, you can add additional nodes to the cluster to support the additional devices. 4.12 supports only 3-node cluster-->

### Cluster Deployments and Load Balancers

As an alternative to the three-node cluster deployment with a virtual IP address, you can use an external load balancer to provide high availability for the NetQ API and the NetQ UI.

However, be mindful of where you {{<link title="Install a Custom Signed Certificate" text="install the certificates">}} for the NetQ UI (port 443); otherwise, you cannot access the NetQ UI. If you are using a load balancer in your deployment, NVIDIA recommends that you install the certificates directly on the load balancer for SSL offloading. If you install the certificates on the master node instead, configure the load balancer to allow for SSL passthrough.
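One way to verify which certificate clients receive&mdash;the one on the load balancer (SSL offloading) or the one on the master node (SSL passthrough)&mdash;is to inspect the certificate presented on port 443. This is a hedged example using standard `openssl` tooling; replace the placeholder address and hostname with your own values:

```
cumulus@host:~$ openssl s_client -connect <load-balancer-ip>:443 -servername <netq-ui-hostname> </dev/null 2>/dev/null | openssl x509 -noout -subject -issuer
```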

## Base Command Manager

NetQ is also available through NVIDIA Base Command Manager. To get started, refer to the {{<exlink url="https://docs.nvidia.com/base-command-manager/#product-manuals" text="Base Command Manager administrator and containerization manuals">}}.

## Next Steps

After you've decided on your deployment type, you're ready to {{<link title="Install the NetQ System" text="install NetQ">}}.
@@ -5,16 +5,16 @@ weight: 227
toc: 5
bookhidden: true
---
Follow these steps to set up and configure your VM on a cluster of servers in an on-premises deployment. First configure the VM on the master node, and then configure the VM on *each* additional node. NVIDIA recommends installing the virtual machines on different servers to increase redundancy in the event of a hardware failure.
Follow these steps to set up and configure your VM on a cluster of servers in an on-premises deployment. First configure the VM on the master node, and then configure the VM on each additional node. NVIDIA recommends installing the virtual machines on different servers to increase redundancy in the event of a hardware failure.

{{%notice note%}}
NetQ 4.12.0 only supports a 3-node HA scale cluster consisting of one master and 2 additional HA worker nodes.
NetQ 4.12.0 supports a 3-node HA scale cluster consisting of 1 master and 2 additional HA worker nodes.
{{%/notice%}}
- - -

## System Requirements

Verify that each node in your cluster meets the VM requirements.
Verify that *each node* in your cluster meets the VM requirements.

| Resource | Minimum Requirements |
| :--- | :--- |
@@ -199,9 +199,6 @@ cumulus@netq-server:~$ vim /tmp/cluster-install-config.json
    {
      "ip": "<INPUT>"
    },
    {
      "ip": "<INPUT>"
    },
    {
      "ip": "<INPUT>"
    }
@@ -215,7 +212,34 @@ cumulus@netq-server:~$ vim /tmp/cluster-install-config.json
| `cluster-vip` | The cluster virtual IP address must be an unused IP address allocated from the same subnet assigned to the default interface for your master and worker nodes. |
| `master-ip` | The IP address assigned to the interface on your master node used for NetQ connectivity. |
| `is-ipv6` | Set the value to `true` if your network connectivity and node address assignments are IPv6. |
| `ha-nodes` | The IP addresses of each of the HA nodes in your cluster, including the `master-ip`. |
| `ha-nodes` | The IP addresses of each of the HA nodes in your cluster. |

{{%notice note%}}

NetQ uses the 10.244.0.0/16 (`pod-ip-range`) and 10.96.0.0/16 (`service-ip-range`) networks for internal communication by default. If you are using these networks, you must override each range by specifying new subnets for these parameters in the cluster configuration JSON file:

```
cumulus@netq-server:~$ vim /tmp/cluster-install-config.json
{
  "version": "v2.0",
  "interface": "eth0",
  "cluster-vip": "10.176.235.101",
  "master-ip": "10.176.235.50",
  "is-ipv6": false,
  "pod-ip-range": "192.168.0.1/32",
  "service-ip-range": "172.168.0.1/32",
  "ha-nodes": [
    {
      "ip": "10.176.235.51"
    },
    {
      "ip": "10.176.235.52"
    }
  ]
}
```

{{%/notice%}}
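If you are unsure whether the default ranges conflict with addresses already in use, you can check for existing routes in those prefixes on the server before deciding whether an override is necessary (a generic check, not a NetQ-specific command):

```
cumulus@netq-server:~$ ip route | grep -E '10\.244\.|10\.96\.'
```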

{{< /tab >}}
{{< tab "Completed JSON Example ">}}
@@ -229,9 +253,6 @@ cumulus@netq-server:~$ vim /tmp/cluster-install-config.json
"master-ip": "10.176.235.50",
"is-ipv6": false,
"ha-nodes": [
{
"ip": "10.176.235.50"
},
{
"ip": "10.176.235.51"
},
@@ -247,7 +268,7 @@ cumulus@netq-server:~$ vim /tmp/cluster-install-config.json
| `cluster-vip` | The cluster virtual IP address must be an unused IP address allocated from the same subnet assigned to the default interface for your master and worker nodes. |
| `master-ip` | The IP address assigned to the interface on your master node used for NetQ connectivity. |
| `is-ipv6` | Set the value to `true` if your network connectivity and node address assignments are IPv6. |
| `ha-nodes` | The IP addresses of each of the HA nodes in your cluster, including the `master-ip`. |
| `ha-nodes` | The IP addresses of each of the HA nodes in your cluster. |
{{< /tab >}}
{{< /tabs >}}
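Before running the install command, you can optionally confirm that the configuration file is valid JSON. This is a minimal check that assumes `python3` is available on the server:

```
cumulus@netq-server:~$ python3 -m json.tool /tmp/cluster-install-config.json
```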

@@ -258,15 +279,6 @@ cumulus@netq-server:~$ vim /tmp/cluster-install-config.json
cumulus@<hostname>:~$ netq install cluster bundle /mnt/installables/NetQ-4.12.0.tgz /tmp/cluster-install-config.json
```

<!-- ## It's unclear how a user would override these settings in HA scale cluster. The "netq install cluster bundle" command doesn't allow for service-ip / pod-ip range options
<div class="notices note"><p></p><p>NetQ uses the 10.244.0.0/16 (<code>pod-ip-range</code>) and 10.96.0.0/16 (<code>service-ip-range</code>) networks for internal communication by default. If you are using these networks, you must override each range by specifying new subnets for these parameters in the install command:</p>
<pre><div class="copy-code-img"><img src="https://icons.cumulusnetworks.com/01-Interface-Essential/29-Copy-Paste/copy-paste-1.svg" width="20" height="20"></div>cumulus@hostname:~$ netq install cluster full interface eth0 bundle /mnt/installables/NetQ-4.11.0.tgz workers &lt;worker-1-ip&gt; &lt;worker-2-ip&gt; pod-ip-range &lt;pod-ip-range&gt; service-ip-range &lt;service-ip-range&gt;</pre><p>You can specify the IP address of the server instead of the interface name using the <code>ip-addr &lt;ip-address&gt;</code> argument:</p>
<pre><div class="copy-code-img"><img src="https://icons.cumulusnetworks.com/01-Interface-Essential/29-Copy-Paste/copy-paste-1.svg" width="20" height="20"></div>cumulus@hostname:~$ netq install cluster full ip-addr &lt;ip-address&gt; bundle /mnt/installables/NetQ-4.11.0.tgz workers &lt;worker-1-ip&gt; &lt;worker-2-ip&gt;</pre><p>If you change the server IP address or hostname after installing NetQ, you must reset the server with the <code>netq bootstrap reset keep-db</code> command and rerun the install command.</p>
<p></p></div>
-->

<div class="notices tip"><p>If this step fails for any reason, run <code>netq bootstrap reset</code> and then try again.</p></div>

## Verify Installation Status
@@ -189,7 +189,7 @@ deb https://apps3.cumulusnetworks.com/repos/deb CumulusLinux-4 netq-latest
```

{{<notice tip>}}
You can specify a NetQ CLI version in the repository configuration. The following example shows the repository configuration to retrieve NetQ CLI v4.3: <pre>deb https://apps3.cumulusnetworks.com/repos/deb CumulusLinux-4 netq-4.3</pre>
You can specify a NetQ CLI version in the repository configuration. The following example shows the repository configuration to retrieve NetQ CLI v4.12: <pre>deb https://apps3.cumulusnetworks.com/repos/deb CumulusLinux-4 netq-4.12</pre>
{{</notice>}}


@@ -206,7 +206,7 @@ You can specify a NetQ CLI version in the repository configuration. The followin
cumulus@switch:~$ dpkg-query -W -f '${Package}\t${Version}\n' netq-apps
```
<!-- vale off -->
{{<netq-install/cli-version version="4.11" opsys="cl">}}
{{<netq-install/cli-version version="4.12" opsys="cl">}}
<!-- vale on -->
4. Continue with NetQ CLI configuration in the next section.

@@ -227,7 +227,7 @@ You can specify a NetQ CLI version in the repository configuration. The followin
root@ubuntu:~# dpkg-query -W -f '${Package}\t${Version}\n' netq-apps
```
<!-- vale off -->
{{<netq-install/cli-version version="4.11" opsys="ub">}}
{{<netq-install/cli-version version="4.12" opsys="ub">}}
<!-- vale on -->
3. Continue with NetQ CLI configuration in the next section.
