Monthly Archives: January 2010

opensolaris, ntpd, transparent routing and sendto problems

I’ve been running an ntpd server as part of the UK pool since 2007, but since upgrading from OpenSolaris 2009.06 to snv_129 I’ve had a very poor score. So poor that for more than a few weeks I’ve been dropped from the uk.pool.ntp.org CNAME :(

The problem (after I fixed the missing ptys) manifested itself as a series of entries in /var/adm/messages with varying IP addresses, all of the form:

sendto(1.2.3.4) (fd=53): Not owner

along with random delays (or dropped packets) in the time responses, which meant I was deemed to be unreliable.
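To get a feel for how many clients are affected, the entries can be tallied per destination address. This is only a rough sketch: it assumes the log lines look exactly like the example above, and the function name `count_sendto_errors` is made up for illustration.

```shell
#!/bin/sh
# Tally "sendto(IP) ... Not owner" entries per destination address.
# Assumes the log format shown above. Reads the named file(s),
# or stdin when no file is given.
count_sendto_errors() {
    grep 'sendto(.*Not owner' "$@" |
        sed 's/.*sendto(\([^)]*\)).*/\1/' |
        sort | uniq -c | sort -rn
}
```

Something like `count_sendto_errors /var/adm/messages` would then list each address with its error count, busiest first.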

I spent a long time with truss and Google without finding anything useful, but I did narrow the behaviour down to something peculiar in my routing. I have three NICs in my OpenSolaris box; one of them has a public IP and one a NATed IP, although both end up at the same router (no, best you don’t ask why). To prevent NTP requests arriving on my public IP and then departing by the default route (via the NAT), I use an odd-looking IPFilter rule for transparent routing, which sends packets matching the rule out of a specific NIC. In this case, all packets with a source address matching my public IP are forced back out of the public NIC regardless of the routing table entries. This had worked for months on 2009.06 and, after a lot of poking, appeared to be doing the right thing on snv_129 as well.
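For anyone who hasn’t seen this kind of rule, it might look roughly like the output of the sketch below. This is not the exact rule from my ipf.conf: the helper name `transparent_route_rule`, the interface names and the addresses are all placeholders, and the rule relies on IPFilter’s `to interface:address` option on a `pass out` rule.

```shell
#!/bin/sh
# Print an IPFilter rule that forces packets from one source address out of a
# chosen NIC via a chosen next hop, whatever the routing table says.
# All names and addresses passed in are placeholders for illustration.
transparent_route_rule() {
    nat_if=$1   # interface the default route would otherwise use
    pub_if=$2   # interface the packet should leave by instead
    gateway=$3  # next hop reachable through pub_if
    src_ip=$4   # public source address to match
    printf 'pass out quick on %s to %s:%s from %s to any\n' \
        "$nat_if" "$pub_if" "$gateway" "$src_ip"
}
```

With made-up values, `transparent_route_rule e1000g1 e1000g0 192.0.2.1 192.0.2.10` prints a rule that could be added to the IPFilter rule file or piped to `pfexec ipf -f -` for a quick test.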

Most of what Google turned up suggested that it’s perfectly acceptable to ignore sendto errors in most cases, but I couldn’t figure out where they were coming from until I started poking around with ndd (in a failed attempt to find source-based routing for UDP packets). Tucked away in the /dev/udp collection was exactly the setting I needed, so after issuing:

pfexec ndd -set /dev/udp udp_sendto_ignerr 1

the time started to flow again, and so far, over 12 monitoring periods, the step has generally been under 0.005. With a nice stable ADSL line overnight I should be back in the UK pool by morning :) I’m not sure what changed in the network, as I haven’t gone back to my 2009.06 BE to look at the original ndd settings, but I was never as happy with my NTP score on OpenSolaris as I had been with my Qube 2, so this could have been the reason all along.
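One caveat: values set with ndd are lost at reboot, so the setting has to be reapplied each time the box comes up. A minimal sketch of a check-and-set helper follows; the function name `ensure_udp_sendto_ignerr` is invented here, and hooking it into the boot sequence is left out.

```shell
#!/bin/sh
# Re-apply udp_sendto_ignerr if it is not already switched on.
# ndd changes do not survive a reboot, so something like this would
# need to run at boot (how it gets run is not covered here).
ensure_udp_sendto_ignerr() {
    # Read the current value; give up if ndd itself fails.
    current=$(ndd /dev/udp udp_sendto_ignerr) || return 1
    if [ "$current" != "1" ]; then
        pfexec ndd -set /dev/udp udp_sendto_ignerr 1
    fi
}
```

Calling `ensure_udp_sendto_ignerr` when the value is already 1 does nothing, so it is safe to run repeatedly.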
