RFD 0011: IPv6 and multiple IP addresses support in SDC
melloc committed Oct 21, 2015
1 parent 0c284f9 commit 8f009dd
Showing 2 changed files with 261 additions and 0 deletions.
1 change: 1 addition & 0 deletions README.md
@@ -53,6 +53,7 @@ formal writing that it has come to represent.)
| draft | [RFD 7 Datalink LLDP and State Tracking](./rfd/0007/README.md) |
| publish | [RFD 9 sdcadm fabrics management](./rfd/0009/README.md) |
| predraft | [RFD 10 Sending GZ Docker Logs to Manta](./rfd/0010/README.md) |
| draft | [RFD 11 IPv6 and multiple IP addresses support in SDC](./rfd/0011/README.md) |
## Contents of an RFD

The following is a way to help you think about and structure an RFD
260 changes: 260 additions & 0 deletions rfd/0011/README.md
@@ -0,0 +1,260 @@
---
authors: Cody Mello
state: draft
---

# RFD 11 IPv6 and multiple IP addresses support in SDC

# Introduction

This proposal lays out the work that needs to be done to add support for IPv6 to
SmartOS and SmartDataCenter (SDC). As the Internet grows and more people come
online around the globe, ISPs and online businesses are looking towards a future
where they will need to support IPv6. Major ISPs like AT\&T, Verizon and Comcast
have already started giving customers IPv6 service and IPv6-enabled modems and
routers.

Joyent customers and members of the SmartOS community are interested in using
IPv6 with their VMs, so this will be a useful feature and selling point to have.
(Note that it has been possible for some time to use IPv6 with [vmadm(1M)] but
[it required extra effort](https://digitalelf.net/2014/10/ipv6-the-smartos-way/).)
Parts of the stack have made room for future IPv6 support, but almost everything
that touches networking will require modifications and testing.

This proposal also suggests a change to the relationship between Network
Interface Cards (NICs) and IP addresses in SDC. Instead of always having one
address per NIC, it should be possible for multiple IP addresses to be placed on
a single NIC. We call this interface-centric provisioning. The idea is that when
a machine is being created, it should be possible to request multiple IP
addresses for each NIC (where addresses can be picked by the requester or left
up to the API based on what's available), as long as they are on the same VLAN
or overlay.

This document is approximately sorted in the order in which each step will need
to be implemented. Basic tooling will come first, and then the plumbing through
SDC. Once that is finished and tested, IPv6 support can be exposed through user
interfaces to end users and operators.

# Support in tooling

## vmadm(1M)

[vmadm(1M)] is the SmartOS tool for creating and managing virtual machines. It makes
use of illumos' ZFS and zone management tools to provision instances of the
images managed with [imgadm(1M)]. [vmadm(1M)] will need to be modified to appropriately
call out to [zonecfg(1M)] and set up the zone to use IPv6.

[vmadm(1M)] creates and updates objects based on an input JSON object. The following
fields will need to be modified to accommodate IPv6 and to allow specifying
multiple IP addresses:

- **nic.\*.ips** will accept an array of the following items:
  - An IPv4 or IPv6 address using CIDR notation to indicate the routing prefix
  - One of the strings `dhcp` or `addrconf`; note that `dhcp` and `addrconf`
    can each appear at most once in the list, and there cannot be more than 32
    items in total between **ips** and **allowed\_ips**, since the kernel does
    not allow the `allowed-ips` property of a NIC to contain more entries than
    this
- **nic.\*.gateways** will accept an array of IPv4 and IPv6 addresses
- **nic.\*.network6\_uuid** will be a UUID for an IPv6 network
- **routes** will accept both IPv4 and IPv6 networks and addresses as keys and
values (address family must match for each key-value pair, of course)
- **resolvers** will accept both IPv4 and IPv6 addresses
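
The constraints on **nic.\*.ips** can be sketched as a small validator. This is only an illustration in Python, not the [vmadm(1M)] implementation: the function name `validate_nic_ips` and the sample NIC fragment are hypothetical, and only the rules stated in the list above are checked.

```python
import ipaddress

# Kernel limit on the NIC's allowed-ips property, as stated in the list above.
MAX_ALLOWED_IPS = 32
AUTOCONF_KEYWORDS = {"dhcp", "addrconf"}


def validate_nic_ips(ips, allowed_ips=()):
    """Return the parsed entries of a nic.*.ips array, or raise ValueError."""
    if len(ips) + len(allowed_ips) > MAX_ALLOWED_IPS:
        raise ValueError("more than 32 entries between ips and allowed_ips")

    seen_keywords = set()
    parsed = []
    for item in ips:
        if item in AUTOCONF_KEYWORDS:
            if item in seen_keywords:
                raise ValueError(f"{item!r} may appear only once")
            seen_keywords.add(item)
            parsed.append(item)
            continue
        if "/" not in item:
            raise ValueError(f"{item!r} is missing a CIDR prefix length")
        # ip_interface accepts both address families and keeps the host bits.
        parsed.append(ipaddress.ip_interface(item))
    return parsed


# Hypothetical NIC fragment mixing both address families.
nic = {"ips": ["10.0.0.5/24", "2001:db8::5/64", "addrconf"]}
entries = validate_nic_ips(nic["ips"])
```

Parsing both families through one code path mirrors the goal of keeping IPv4 and IPv6 in the same fields rather than creating parallel IPv6-only ones.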

The fields **nic.\*.ip**, **nic.\*.netmask** and **nic.\*.gateway** will
continue to exist as is, and will be returned by `vmadm get`, but will be
considered deprecated. Note that these new fields allow IPv4 and IPv6 to be
mixed. This is to avoid creating a split in configuration and creating an IPv6
dual for everything, like `ping` vs. `ping6` or `traceroute` vs. `traceroute6`
in other systems.

Much of this work has been finished in [OS-2994].

## fwadm(1M) and fwrule(5)

[fwadm(1M)] is the SmartOS tool for creating and managing firewall rules. Under
the hood, it makes use of [ipfilter(5)], and converts rules written in the
[fwrule(5)] domain-specific language to the language defined in [ipf(4)]. Like
[vmadm(1M)], [fwadm(1M)] takes a JSON object as input when creating or modifying
rules. Since it passes off the firewall rules field to the [fwrule(5)] parser,
IPv6 will only need to be accounted for in the **ips** field of remote VMs, and
the **ip** field of NIC objects.

The parser for the firewall rules is generated by the [Jison parser
generator](http://zaach.github.io/jison/). The grammar will need to be adjusted
to accept and validate IPv6 addresses, and the consumers will need to make sure
that it gets fed appropriately to [ipf(1M)].

## Zone \& KVM Images and Setup

The startup scripts and programs for zones and KVM images will need to be
updated to make use of their IPv6 networking stack. For SmartOS and LX-branded
zones, much of this work has been done in [OS-4582] and [OS-4741]. KVM instances
that are assigned static IP addresses currently make use of the DHCPv4 server
embedded in QEMU, which is less than ideal. In order to respond appropriately to
the guest, QEMU needs to inspect each packet sent over the network. Instead, it
would be better to have the images check on boot (or more likely when the
system's networking service starts up) for what IP addresses they should be
using, and on what NICs. This data can be fetched via [mdata-get(1M)], and used
to configure the system.
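
As a rough sketch of the guest-side logic, the following Python function turns NIC metadata into a per-interface list of addresses to configure. It assumes the guest has already read a JSON array of NIC objects through [mdata-get(1M)] (the `sdc:nics` key is an assumption here), and that those objects carry the same `interface`, `ips` and legacy `ip` fields described for [vmadm(1M)]. The function name is hypothetical.

```python
import json


def addresses_by_interface(nics_json):
    """Map each interface name to the list of addresses it should configure."""
    plan = {}
    for nic in json.loads(nics_json):
        # Prefer the proposed "ips" array; fall back to the legacy single "ip".
        ips = nic.get("ips")
        if ips is None and "ip" in nic:
            ips = [nic["ip"]]
        plan[nic["interface"]] = list(ips or [])
    return plan


# Hypothetical metadata, shaped like the NIC objects described for vmadm.
sample = json.dumps([
    {"interface": "net0", "ips": ["10.0.0.5/24", "2001:db8::5/64"]},
    {"interface": "net1", "ip": "192.168.1.7"},
])
plan = addresses_by_interface(sample)
```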

The logic in the brand scripts that prepare the NICs for zones will also need to
be updated, so that properties like **allowed-ips** and **dhcp-nospoof** will
get set correctly.

# Support in SDC APIs

## Networking API (napi)

The Networking API is responsible for managing data about networks within SDC.
It also takes care of contacting VMAPI and CNAPI for adding, removing and
updating NICs.

NAPI will need to be modified to support the following:

- Accept and validate new IPv6 networks (see [NAPI-308](https://devhub.joyent.com/jira/browse/NAPI-308)); a new field, **address\_family**, will need to be stored in Moray so that people can search for networks of a specific address family
- Manage and search IPv6 addresses
- Ensure that all networks placed on a NIC share the same VLAN tag
- Manage IPv6 network pools, and ensure that they are validated (a network pool
is either IPv4 or IPv6)
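
The validation rules above might look roughly like the following. This is a Python sketch and not NAPI code: the function names, the `"ipv4"`/`"ipv6"` values for **address\_family**, and the use of a `vlan_id` key on network objects are assumptions made for illustration.

```python
import ipaddress


def address_family(subnet):
    """Return "ipv4" or "ipv6" for a subnet given in CIDR notation."""
    return "ipv%d" % ipaddress.ip_network(subnet).version


def validate_pool(subnets):
    """A network pool must contain networks of exactly one address family."""
    families = {address_family(s) for s in subnets}
    if len(families) != 1:
        raise ValueError(
            "a network pool is either IPv4 or IPv6, got %s" % sorted(families))
    return families.pop()


def validate_nic_networks(networks):
    """All networks placed on one NIC must share the same VLAN tag."""
    vlans = {n["vlan_id"] for n in networks}
    if len(vlans) > 1:
        raise ValueError("networks on a NIC must share a VLAN tag")
```

Deriving the address family from the subnet, rather than trusting a client-supplied value, keeps the stored field consistent with the network it describes.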

## Firewall API (fwapi)

The Firewall API is responsible for creating and managing firewall rules. Since
rules are pushed off to the same library used by [fwadm(1M)], there is not much
that should require updating beyond searching. The following changes for the
ListRules endpoint will need to be made:

- **ip** will need to accept IPv6 addresses
- **subnet** will need to accept IPv6 subnets
- **address\_family** will need to be added, so that it is possible to search
for rules affecting one of IPv4 or IPv6

# Compute Node Agents

The SDC headnode communicates with each of the compute nodes through agents
running in their Global Zone. These agents perform a variety of tasks locally,
ranging from updating the headnode, to making sure that the state on the compute
node matches what the headnode has stored.

The only agent that should need to be updated is the Compute Node Agent
(cn-agent), which manipulates NICs and associated information. Currently, it
assumes that IP addresses are IPv4 and converts them into 32-bit numbers. After
reviewing the source code for the Firewall Agent (firewaller) and for the
Networking Agent (net-agent), it looks like neither one should need to be
updated, since they don't manipulate IP addresses themselves, but instead pass
them off to other services like NAPI.
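
To illustrate why the 32-bit assumption breaks, the Python sketch below converts one address of each family to an integer. It is not taken from cn-agent; `to_number` is a hypothetical helper.

```python
import ipaddress


def to_number(addr):
    """Convert an IPv4 or IPv6 address string to (family, integer)."""
    ip = ipaddress.ip_address(addr)
    return ip.version, int(ip)


v4 = to_number("10.0.0.5")
v6 = to_number("2001:db8::5")

# An IPv4 address always fits in 32 bits; an IPv6 address needs up to 128.
fits_in_32_bits = [number < 2**32 for _, number in (v4, v6)]
```

Any code path that stores or compares addresses as 32-bit numbers therefore needs either a wider representation or a family-aware address type.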

# Overlay Networks

## Fabrics

Work on adding IPv6 support to fabrics will occur during a second phase once
standard zones and networks are working. Once support is added, we will assign
users /64 subnets located within the fd00::/8 private network. RFC 4193
recommends randomizing allocations within this space. We should probably give
the customer the option of either picking the prefix or having it randomized.
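
A minimal sketch of randomized allocation in Python follows. It draws the 40-bit Global ID from the system's random source, which is a simplification and not the sample algorithm given in RFC 4193, and both function names are hypothetical.

```python
import ipaddress
import secrets


def random_ula_prefix():
    """Pick a random /48 inside fd00::/8 (40 random Global ID bits)."""
    global_id = secrets.randbits(40)
    # Layout: 8-bit fd prefix, 40-bit Global ID, then 80 bits left at zero.
    base = (0xFD << 120) | (global_id << 80)
    return ipaddress.IPv6Network((base, 48))


def fabric_subnet(prefix48, subnet_id):
    """Return the /64 with the given 16-bit subnet ID inside a /48."""
    if not 0 <= subnet_id < 2**16:
        raise ValueError("subnet ID must fit in 16 bits")
    base = int(prefix48.network_address) | (subnet_id << 64)
    return ipaddress.IPv6Network((base, 64))


prefix = random_ula_prefix()
subnet = fabric_subnet(prefix, 1)
```

A customer-chosen prefix would simply replace the call to `random_ula_prefix()`, with the /64 carved out of it in the same way.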

Since customers will most likely end up wanting private IP addresses that can
access the rest of the internet, we may need to explore implementing IPv6
support in [ipnat(1M)], and possibly 6to4 options. These will require further
evaluation in the future to determine if they're worth implementing, or leaving
up to the network operator. Protocols like NAT64 require a lot of configuration
as well as a cooperating DNS64 server, which may not be worth investing
resources in.

# Operator- and User-facing support

## Operations Portal (adminui)

The Operations Portal is the web interface for managing SDC and provisioning new
compute nodes and virtual machines. There are several things that will need to
be updated here:

- When managing NICs on a VM, the interface will need to allow for assigning
multiple IP addresses to the NIC
- Tests for validating input IP addresses and query parameters will need to
accommodate IPv6
- The interface for creating new networks should make it clear that IPv6 can be
used by giving example input
- When creating new network pools, once the address family is decided, only
networks of the same type should be suggested

## CloudAPI

CloudAPI will need to be extended to allow provisioning with IPv6 addresses, and
to also accept multiple addresses per NIC. Currently, it accepts the fields
**ipv4\_uuid** and **ipv4\_count**, but they cannot be used to assign multiple
IP addresses. We will want it to support the following fields and allow them to
be used for assigning multiple IP addresses:

- **ipv4\_uuid** is the UUID of the IPv4 network to use (VLAN/VXLAN ID must match IPv6 network)
- **ipv4\_count** specifies how many IPv4 addresses should be selected from the pool and assigned to this NIC; it is currently restricted to 1
- **ipv4\_ips** is an array of IPv4 addresses that should be assigned to this NIC
- **ipv6\_uuid** is the UUID of the IPv6 network to use (VLAN/VXLAN ID must match IPv4 network)
- **ipv6\_count** specifies how many IPv6 addresses should be selected from the pool and assigned to this NIC
- **ipv6\_ips** is an array of IPv6 addresses that should be assigned to this NIC
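
One way to picture the proposed fields is a validator for a single NIC request, sketched here in Python. It is not CloudAPI code: the function name, the rule that a network UUID is required whenever addresses are requested, and the placeholder UUID strings are assumptions, and how a count combines with an explicit address list is left open.

```python
import ipaddress


def check_nic_request(req):
    """Validate the ipv4_* and ipv6_* fields of one hypothetical NIC request."""
    summary = {}
    for version in (4, 6):
        prefix = "ipv%d_" % version
        count = req.get(prefix + "count", 0)
        ips = [ipaddress.ip_address(a) for a in req.get(prefix + "ips", [])]
        if not isinstance(count, int) or count < 0:
            raise ValueError(prefix + "count must be a non-negative integer")
        if any(ip.version != version for ip in ips):
            raise ValueError(prefix + "ips has an address of the wrong family")
        if (count or ips) and prefix + "uuid" not in req:
            raise ValueError(prefix + "uuid is required to assign addresses")
        summary[version] = {"explicit": ips, "from_pool": count}
    return summary


# Hypothetical request: one pooled IPv4 address and two explicit IPv6 ones.
request = {
    "ipv4_uuid": "ipv4-network-uuid",
    "ipv4_count": 1,
    "ipv6_uuid": "ipv6-network-uuid",
    "ipv6_ips": ["2001:db8::10", "2001:db8::11"],
}
summary = check_nic_request(request)
```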

We began moving towards this schema and mindset in [ZAPI-598]. With the work
laid out in this proposal, we will finish it up.

CloudAPI will also need to be extended to allow for managing firewall rules that
apply to IPv6 networks and addresses.

## Docker

Docker and our APIs for supporting it will need to gain the appropriate support
and plumbing. The Docker Inspect API call only allows a single IP address to be
returned. We will either need to pick a single representative IP address, or the
API will need to be improved to allow returning multiple IPv4 or IPv6 addresses.

# IPv6 on the admin network

Making it possible for SDC to have an IPv6 admin network would be a nice feature
to offer, but it is not essential. Since the admin network is usually a
non-routable private network, there will probably never be a real need for it to
support IPv6. As a result, some of these features may be put off for a while.
They are enumerated here though as a point of reference. Note that as of
[OS-4802], SmartOS hosts can use IPv6 from the global zone, but this cannot be
used within SDC.

The simplest path to assigning IPv6 addresses to nodes on the admin network
would be to run [in.ndpd(1M)] alongside Booter, and send out Router
Advertisements with the autonomous bit set, so that everyone else performs
SLAAC. If additional information is to be sent though, or if more control over
assigning addresses is needed, then it may be better to use DHCPv6.
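
For a sense of what SLAAC produces, the Python sketch below derives an address from an advertised /64 prefix and a MAC address using the standard modified EUI-64 interface identifier. It is an illustration only: hosts may derive interface identifiers by other means, and the function name, prefix and MAC address are made up for the example.

```python
import ipaddress


def slaac_address(prefix, mac):
    """Combine a /64 prefix with a modified EUI-64 interface ID from a MAC."""
    net = ipaddress.IPv6Network(prefix)
    if net.prefixlen != 64:
        raise ValueError("EUI-64 interface IDs need a /64 prefix")
    octets = bytes(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    # Flip the universal/local bit and insert ff:fe between the two halves.
    eui64 = bytes([octets[0] ^ 0x02]) + octets[1:3] + b"\xff\xfe" + octets[3:]
    return net[int.from_bytes(eui64, "big")]


# Hypothetical admin-network prefix and compute node MAC address.
addr = slaac_address("2001:db8:1:2::/64", "90:b8:d0:12:34:56")
```

Because the address follows from the prefix and the MAC address, no per-node state is needed on the server side, which is what makes this the simplest path; DHCPv6 trades that simplicity for control.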

## Binder

Binder is the DNS server used within SDC for locating admin services and compute
nodes. Currently, it only serves up IPv4 A records. Before SDC can be run on an
IPv6 admin network, Binder will need to gain support for serving IPv6 AAAA
records, so that various programs can continue to look up services on the admin
network via DNS.

## Booter

Booter is the DHCP and TFTP server used in SDC for assigning compute nodes IP
addresses and PXE booting them. In order for IPv6 to be used on the admin
network, Booter will need to gain support for DHCPv6 so that compute nodes can
get an IPv6 address and be sent the appropriate options and information to know
how to properly boot.

<!-- Manual page links -->
[in.ndpd(1M)]: https://smartos.org/man/1M/in.ndpd
[ipf(1M)]: https://smartos.org/man/1M/ipf
[ipnat(1M)]: https://smartos.org/man/1M/ipnat
[fwadm(1M)]: https://smartos.org/man/1M/fwadm
[imgadm(1M)]: https://smartos.org/man/1M/imgadm
[mdata-get(1M)]: https://smartos.org/man/1M/mdata-get
[vmadm(1M)]: https://smartos.org/man/1M/vmadm
[zonecfg(1M)]: https://smartos.org/man/1M/zonecfg
[ipf(4)]: https://smartos.org/man/4/ipf
[fwrule(5)]: https://smartos.org/man/5/fwrule
[ipfilter(5)]: https://smartos.org/man/5/ipfilter

<!-- Issue links -->
[OS-2994]: https://smartos.org/bugview/OS-2994
[OS-4582]: https://smartos.org/bugview/OS-4582
[OS-4741]: https://smartos.org/bugview/OS-4741
[OS-4802]: https://smartos.org/bugview/OS-4802
[ZAPI-598]: https://smartos.org/bugview/ZAPI-598
