Discussion:
[gPXE-devel] 802.1Q support for gpxe
michael-dev
2010-03-04 19:18:16 UTC
Permalink
Hi,

I've been happy using gpxe to boot linux for a while and now'd like to
deploy it in an environment where support 802.1q is essential.
Namely, I'd like to deploy access points (based on an alix2 board) on
ports where the untagged vlan is already in use.

Therefore, I've had a quick glance at the gpxe source and it looked
quite easy to add. I just changed net/ethernet.c to add/verify 802.1q
headers and a 802.1q header declaration to if_ether.h .
No changes to legacy.c are required, as it looks like a packet of kind
0x8100 to it and will split and join it correctly.

To configure the vlan identifiers, use
set net0/vid 10,11,12
which means tagged with vlan 10, 11 and 12 where vlan 10
is the outer identity.

Please find a patch attached, feedback is welcomed.

Though, I could test it only with a single vlan header added (the
switches here don't support nested vlans, yet) and did it by embedding
set net0/vid 514
autoboot
into the image.

Sincerely,
M. Braun

-------------- next part --------------
A non-text attachment was scrubbed...
Name: vlan.diff
Type: text/x-patch
Size: 6520 bytes
Desc: not available
Url : http://etherboot.org/pipermail/gpxe-devel/attachments/20100304/f1d06251/attachment.bin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 260 bytes
Desc: OpenPGP digital signature
Url : http://etherboot.org/pipermail/gpxe-devel/attachments/20100304/f1d06251/attachment-0001.bin
michael-dev
2010-03-10 14:19:59 UTC
Permalink
Hi,

I've reworked the patch to be fully optional and less invasive.

The receiving path is from the raw device via 802.1Q network protocol
into net_rx. The sending path is via linklayer of the proxy device
which adds 802.1Q header and the forwards the iobuf to the link layer
of the raw device. The transmit as well as all other commands to the
proxy device (except pull) are forwarded to the raw device.

Further, there is a vconfig command to add/remove vlan interfaces just
like linux does. In order to have network device names like net0.VID,
I increased the len of the netdevice name to 10 characters and let the
driver decide upon the name.
In order to achieve autobooting from the vlan device without trying to
boot from the raw device first, I optionally added a netboot command.
The code uses netdev_put and netdev_get commands to lock the raw device.

The most tricky part is the irq enable/disable code as well as the
open/close code, the patch just forwards these commands which just works
though there should be open/close counters.

Sincerely,
M. Braun

-------------- next part --------------
A non-text attachment was scrubbed...
Name: gpxevlans.diff
Type: text/x-patch
Size: 18230 bytes
Desc: not available
Url : http://etherboot.org/pipermail/gpxe-devel/attachments/20100310/7449f402/attachment.bin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 260 bytes
Desc: not available
Url : http://etherboot.org/pipermail/gpxe-devel/attachments/20100310/7449f402/attachment-0001.bin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 260 bytes
Desc: OpenPGP digital signature
Url : http://etherboot.org/pipermail/gpxe-devel/attachments/20100310/7449f402/attachment-0002.bin
Michael Brown
2010-03-10 14:59:47 UTC
Permalink
Post by michael-dev
I've reworked the patch to be fully optional and less invasive.
The receiving path is from the raw device via 802.1Q network protocol
into net_rx. The sending path is via linklayer of the proxy device
which adds 802.1Q header and the forwards the iobuf to the link layer
of the raw device. The transmit as well as all other commands to the
proxy device (except pull) are forwarded to the raw device.
Further, there is a vconfig command to add/remove vlan interfaces just
like linux does. In order to have network device names like net0.VID,
I increased the len of the netdevice name to 10 characters and let the
driver decide upon the name.
In order to achieve autobooting from the vlan device without trying to
boot from the raw device first, I optionally added a netboot command.
The code uses netdev_put and netdev_get commands to lock the raw device.
The most tricky part is the irq enable/disable code as well as the
open/close code, the patch just forwards these commands which just works
though there should be open/close counters.
I like the look of this approach much more than the previous patch.

A few (non-exhaustive) comments from a quick scan:

char name[] in struct net_device may as well be 12 rather than 10, since it's
going to get dword-aligned anyway.

Why is ieee8021q_transmit() creating a new iobuf?

Device open counts should be implemented in net/netdevice.c within
netdev_open() and netdev_close(); you can use ib_open() and ib_close() in
net/infiniband.c as a reference. The NETDEV_OPEN flag must be removed and
replaced with something like a static inline netdev_is_open() which just
checks the netdev's open count. This change should probably be in its own
separate commit.

I would try to avoid calling fetch_uintz_setting() for each transmitted
packet. Better to hold the VLAN tag in a structure member. For example:

struct vlan_device {
/* Underlying net device */
struct net_device *netdev;
/* VLAN tag */
uint32_t tag;
};

with the struct vlan_device forming the private data of the net_device you
create.

Lastly, is 802.1Q Ethernet-specific? In that case, you may as well just treat
it as its own static struct ll_protocol, rather than trying to dynamically
create a ll_protocol based on the underlying net_device. (gPXE isn't really
set up to handle a struct ll_protocol that might be freed; they're assumed to
be static and permanent.)

Michael
michael-dev
2010-03-10 19:27:59 UTC
Permalink
Hi Michael,

thanks for that quick reply.
Post by Michael Brown
char name[] in struct net_device may as well be 12 rather than 10, since it's
going to get dword-aligned anyway.
done
Post by Michael Brown
Why is ieee8021q_transmit() creating a new iobuf?
As netdev_tx registers the iobuf in netdev->tx_queue, though
the iobuf cannot be registered in multiple lists, and because in case
of an error, netdev_tx of the raw device freeed the iobuf and netdev_tx
of the vlan device will also free the buffer.
Post by Michael Brown
Device open counts should be implemented in net/netdevice.c within
netdev_open() and netdev_close(); you can use ib_open() and ib_close() in
net/infiniband.c as a reference.
done
Post by Michael Brown
The NETDEV_OPEN flag must be removed and
replaced with something like a static inline netdev_is_open() which just
checks the netdev's open count. This change should probably be in its own
separate commit.
done
Post by Michael Brown
I would try to avoid calling fetch_uintz_setting() for each transmitted
struct vlan_device {
/* Underlying net device */
struct net_device *netdev;
/* VLAN tag */
uint32_t tag;
};
with the struct vlan_device forming the private data of the net_device you
create.
done
Post by Michael Brown
Lastly, is 802.1Q Ethernet-specific? In that case, you may as well just treat
it as its own static struct ll_protocol, rather than trying to dynamically
create a ll_protocol based on the underlying net_device. (gPXE isn't really
set up to handle a struct ll_protocol that might be freed; they're assumed to
be static and permanent.)
802.1Q can be used with Ethernet and FDDI. For gpxe, I wanted to support
ethernet and 802.11 link layer structs. Anyway, the ll_protocol of
netdev only is setup and freed when the device is created / destroyed,
so nobody should hold a reference to the virtual netdev anymore when the
ll_protocol is freed or changed.

Michael
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 01netopen.diff
Type: text/x-patch
Size: 9719 bytes
Desc: not available
Url : http://etherboot.org/pipermail/gpxe-devel/attachments/20100310/83eb3e5b/attachment-0003.bin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 02vlans.diff
Type: text/x-patch
Size: 21987 bytes
Desc: not available
Url : http://etherboot.org/pipermail/gpxe-devel/attachments/20100310/83eb3e5b/attachment-0004.bin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 260 bytes
Desc: OpenPGP digital signature
Url : http://etherboot.org/pipermail/gpxe-devel/attachments/20100310/83eb3e5b/attachment-0005.bin
Stefan Hajnoczi
2010-04-16 08:42:30 UTC
Permalink
Cool patch :).

Please format the code according to gPXE coding style. For reference,
see include/gpxe/netdevice.h and net/netdevice.c.

Can you split the 'netboot' command into a completely separate patch?
This is a useful command that is unrelated to vlans, it breaks up
monolithic 'autoboot' into a finer-grained command that can be used
from scripts.

+static void ieee8021q_poll ( struct net_device* dev __unused) {
+ netdev_poll(((struct vlan_device_priv*)dev->priv)->netdev);
+}
__unused should not be there. Also this implementation has the side-effect of
calling netdev_poll() on the underlying device 1 + n times in the network stack
processing function, net_step() (where n is the number of vlan devices). I
think we won't have many vlan devices and will be okay though.

+static void ieee8021q_irq ( struct net_device* dev, int enable ) {
+ /* a little bit unsure about side effects */
+ return netdev_irq(((struct vlan_device_priv*)dev->priv)->netdev, enable);
+}
netdev_irq() is used by the PXE stack because the UNDI API involves hardware
interrupts. Only one active netdev is supported by the active PXE stack and
therefore side effects don't matter in this case. There can be no interference
because only one netdev will use irqs at a given time.

I still have some more to read and will let you know.

Stefan
Stefan Hajnoczi
2010-04-19 19:26:24 UTC
Permalink
+static const char * vid_ntoa ( const void *net_addr ) {
+ static char buf[2]; /* "00" */
+ const uint8_t eth_addr = * (uint8_t*)net_addr;
+
+ sprintf ( buf, "%02x", eth_addr);
+ return buf;
+}
The array should be buf[3] so there is space for the NUL terminator.

How about something like the following to eliminate memory management
and simplify things (I have not tried compiling this code):

struct vlan_device {
struct net_device* target; /* the underlying raw device */
struct net_device* netdev; /* the vlan device */
struct list_head list; /* vlan devices */

uint16_t vlanid;
uint8_t prio;

struct ll_protocol ll_protocol;
};

static LIST_HEAD ( ieee8021q_devices );

struct net_device * ieee8021q_create ( struct net_device* target,
uint16_t vid, uint8_t prio ) {
struct vlan_device *vdev;
struct net_device* netdev;

netdev = alloc_netdev ( sizeof ( *vdev ) );
if ( !netdev )
return NULL;

netdev_init ( netdev, &ieee8021q_operations );
vdev = netdev->priv;
vdev->target = target;

memcpy ( &vdev->ll_protocol, target->ll_protocol, sizeof (
*target->ll_protocol ) );
vdev->ll_protocol.push = ieee8021q_push;
netdev->ll_protocol = &vdev->ll_protocol;

netdev->ll_broadcast = target->ll_broadcast;
netdev->max_pkt_len = target->max_pkt_len;
memcpy(netdev->hw_addr,target->hw_addr,sizeof(netdev->hw_addr));
netdev->state = target->state;

snprintf ( netdev->name, sizeof ( netdev->name ), "%s.%d", target->name, vid);

/* Mark as link up; we don't yet handle link state */
netdev_link_up ( netdev );

/* Register network device */
if ( register_netdev ( netdev ) != 0 )
goto err_register_netdev;

/* write-back ll_addr init */
memcpy(netdev->ll_addr,target->ll_addr,sizeof(netdev->ll_addr));

/* add netdev to list */
vdev->netdev = netdev;
list_add(&vdev->list, &ieee8021q_devices);

/* set target device */
vdev->target = netdev_get(target);
vdev->vlanid = vid;
vdev->prio = prio;

DBG("Added device %s.\n", netdev->name);

return netdev;

err_register_netdev:
netdev_nullify ( netdev );
netdev_put ( netdev );
return NULL;
}

int ieee8021q_remove ( struct net_device* netdev) {
struct vlan_device* vdev = netdev->priv;

if (netdev->op->open != ieee8021q_open) {
DBG("Sorry, but this does not look like a ieee8021q device.\n"
"So I won't remove it!\n");
return -EINVAL;
}

unregister_netdev ( netdev );
netdev_put ( vdev->target );
netdev_nullify ( netdev );
netdev_put ( netdev );
return 0;
}

Some minor changes also added above were using NULL instead of 0 for a pointer
value to follow gPXE coding style. I also suggest adding error code to
ieee8021q_remove() so that the user can get error feedback if they tried to
remove a non-802.1q device.

As for the zero-copy transmit approach, I think it could be done with
a iob_orphan()
function which calls list_del() on the io_buffer. It might require
always returning
success from the transmit function though.

Another side effect is that transmit statistics will be missing for
the vlan netdevice.
I think we already don't collect receive statistics on the vlan device
because the
io_buffers are received by the target device. So perhaps that is not
a big loss.

Stefan
Stefan Hajnoczi
2010-05-15 11:29:25 UTC
Permalink
Hi Michael,
Any comments or new patches? I think your patches are close to being
ready for merge.

Stefan
Andreas Kotes
2010-10-14 13:02:27 UTC
Permalink
Hi all,
Any comments or new patches? I think your patches are close to being
ready for merge.
have they been merged / will they be merged?

I'd love them to be!

Cheers,

Andreas
--
Andreas Kotes, CISSP, CCNA - flatline IT services - ISP & IT Consulting
"Love many things, for therein lies the true strength, and whosoever
loves much performs much, and can accomplish much, and what is done
in love is done well." -- Vincent van Gogh
Loading...