Posts in Category: opensolaris

opensolaris, ntpd, transparent routing and sendto problems

I’ve been running an ntpd server as part of the UK pool since 2007, but since upgrading from OpenSolaris 2009.06 to snv_129 my score has been very poor. So poor that I’ve been dropped from the CNAME for more than a few weeks 🙁

The problem (after I fixed the missing ptys) manifested itself as a series of entries in /var/adm/messages with varying IP addresses but all of the form:

sendto( (fd=53): Not owner

and a random delay (or packet drop) to the time responses that meant I was deemed to be unreliable.

I spent a long time with truss and Google and didn’t come up with anything useful, but I narrowed the behaviour down to something peculiar in my routing. I have three NICs in my OpenSolaris box: one of them with a public IP and one with a NATed one, although both end up at the same router (no, best you don’t ask why). To prevent NTP requests arriving on my public IP and then departing by the default route (via the NAT), I use an odd-looking IPFilter rule for transparent routing, which sends packets matching the rule out of a specific NIC. In this case, all packets with a source address matching my public IP were being forced back out of the public NIC regardless of the routing table entries. This had worked for months on 2009.06 and, after a lot of poking, appeared to be doing the right thing on snv_129 as well.

Most of the comments Google turned up suggested that it’s perfectly acceptable to ignore sendto errors in most cases, but I couldn’t figure out where they were being sent from. Then I started poking around with ndd (in a failed attempt to find source-based routing for UDP packets), and tucked away in the /dev/udp collection was exactly the setting I needed. After issuing:

pfexec ndd -set /dev/udp udp_sendto_ignerr 1

the time started to flow again, and so far over 12 monitoring periods the step has generally been under 0.005. With a nice stable ADSL line overnight I should be back in the UK pool by morning 🙂 I’m not sure what changed in the network, as I haven’t gone back to my 2009.06 BE to look at the original ndd settings, but I was never as happy with my ntp score on OpenSolaris as I had been with my Qube 2, so this could have been the reason all along.
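For anyone wanting to repeat this, here is a minimal sketch that checks the tunable before changing it. The function name ensure_sendto_ignerr is made up for illustration, and it assumes that ndd prints a bare 0 or 1 when queried without -set. It calls ndd directly, so it needs to run from a shell that already has the privileges pfexec would grant.

```shell
#!/bin/sh
# Sketch only: ensure_sendto_ignerr is a hypothetical helper name.
# Assumes `ndd /dev/udp udp_sendto_ignerr` prints the current value (0 or 1).

ensure_sendto_ignerr() {
    # Query the current value; without -set, ndd reads the parameter.
    current=$(ndd /dev/udp udp_sendto_ignerr) || return 1

    if [ "$current" = "1" ]; then
        echo "udp_sendto_ignerr already 1"
    else
        echo "udp_sendto_ignerr was $current, setting to 1"
        ndd -set /dev/udp udp_sendto_ignerr 1
    fi
}

# Usage (in a privileged shell):
#   ensure_sendto_ignerr
```

Values set with ndd do not survive a reboot, so the setting has to be reapplied at boot if it is meant to stick.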

opensolaris zoneadm attach detach problems

After getting so excited about figuring out what was up with the upgrade to 2009.06, I ran into another, stickier problem. I rushed into reattaching the zones I’d had to detach to get beadm working, using:

zoneadm -z zonename attach -F

Oh dear: that was a bad idea. The zone appeared to attach, but zoneadm -z zonename boot failed, and then I discovered it was impossible to delete, rename or reconfigure the zone.

After a few attempts to recover things, the correct answer turned out to be to manually edit /etc/zones/index and change the state of the zone to configured, after which it’s trivial to reattach the zone with:

zoneadm -z zonename attach -u -d path/to/zonename/ROOT/zbe

at which point it automatically upgrades the zone to 2009.06.
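The index edit described above can be sketched as a small shell function. This is an illustration rather than a tested recovery tool: set_zone_state is a made-up name, and it assumes each line of the index file has the colon-separated form zonename:state:zonepath:uuid, with the state in the second field. It keeps a .bak copy of the file before rewriting it.

```shell
#!/bin/sh
# Sketch only: set_zone_state is a hypothetical helper.
# Assumes index lines look like  zonename:state:zonepath:uuid

# set_zone_state INDEX_FILE ZONE STATE
set_zone_state() {
    index=$1
    zone=$2
    state=$3

    # Keep a backup so a bad edit can be undone.
    cp "$index" "$index.bak" || return 1

    # Rewrite only the second field of the line that starts with "ZONE:".
    sed "s/^${zone}:[^:]*:/${zone}:${state}:/" "$index.bak" > "$index"
}

# Usage (in a privileged shell, zonename as in the post):
#   set_zone_state /etc/zones/index zonename configured
```

Lines for other zones, including global, are copied through unchanged, as are comment lines.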

opensolaris 2008.11 to 2009.06 upgrade fails on beadm

New OpenSolaris release: Yay !

Updater fails on beadm create, and manual attempts also fail: Boo !

After a lot of grumbling and Googling down plenty of dead ends, it appears that beadm in 2008.11 gets very upset when there are zones attached. A set of zoneadm detach commands later, the updater completed without any problems at all.
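With more than a couple of zones, the detach step can be scripted. The sketch below is an assumption-laden illustration, not what I actually ran: installed_zones is a made-up helper, and it relies on the parsable output of zoneadm list -cp having the zone name in the second colon-separated field and the state in the third. Only zones in the installed state (that is, halted) are listed, since a running zone has to be halted before it can be detached.

```shell
#!/bin/sh
# Sketch only: installed_zones is a hypothetical helper.
# Reads `zoneadm list -cp` style lines (zoneid:zonename:state:...) on stdin
# and prints the names of non-global zones in the "installed" state.

installed_zones() {
    while IFS=: read -r id name state rest; do
        if [ "$name" != "global" ] && [ "$state" = "installed" ]; then
            echo "$name"
        fi
    done
    return 0
}

# Usage on the real system (not executed here):
#   zoneadm list -cp | installed_zones | while read -r z; do
#       pfexec zoneadm -z "$z" detach
#   done
```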