Diary of a geek

June 2005

Andrew Pollock


Thursday, 30 June 2005

More fun and games: running Sarge in a UML

Once I'd got the underlying UML host system installed, I set about creating a couple of UML instances: one for the 2006 linux.conf.au guys, and one for mucking around in, which will go to the 2007 guys when they want to get the ball rolling.

Next I ran into some bizarre problems with UML. The host is running a 2.6.8 kernel with the SKAS patch, and the UMLs themselves are running 2.6.10. I used rootstrap to create the initial filesystem, then jumped on the console to install OpenSSH so that I could SSH in to do the rest of the configuration.

The problem was I got struck by #298427, which is indeed a bizarre little bug. I initially worked around this by reconfiguring the SSH daemon not to use PAM and setting PasswordAuthentication to yes. I later hit some strange segmentation faults with BIND 9 as well, so I tried the alternative workaround I had subsequently found, which was to move /lib/tls out of the way.

Given that this seems to fix the problem, and that it's bigger than just SSH, I'm guessing the real culprit is some sort of libc6 + UML + 2.6.10 interaction. It's a bit odd, though, because I haven't had this problem previously with the exact same UML kernel and host kernel. Maybe it's a Celeron thing or something. I don't know what the implications are of having /lib/tls nonexistent at the moment either, but it can't be ideal.

[03:08] [tech] [permalink]

Fun and games: installing Debian into a chroot

I was added to the Linux Australia sysadmin team shortly after linux.conf.au. To date, I've just been picking around the edges of stuff that needs doing on digital, but this week I got to really sink my teeth into stuff.

The 2006 linux.conf.au guys didn't have a server to host the website on, so LA acquired one specifically for this purpose. What we decided to do was use User Mode Linux to provide a virtual server for the 2006 guys, with a second one on standby for the 2007 guys.

The physical server is living with Jon Oxer in Adelaide at Internet Vision Technologies. I had the job of installing Linux on it.

Jon had done a quick install of Ubuntu on the first disk (it has two SATA disks) and so what I decided to do (as I wanted to use Sarge) was install Debian onto the second disk. I also wanted to mirror the disks, so I created a degraded RAID-1 array out of the second disk, created an LVM physical volume out of one of the arrays (with the associated volume group and logical volumes), and proceeded to try and install Sarge in a chroot onto this.

This went fairly well, except I ran into a few issues with the initrd. When Jon swapped things around to boot from the second disk, what was /dev/sdb became /dev/sda, and everything that had hardcoded devices in it promptly freaked out on bootup. I ended up dealing with this by making the initrd myself with the -k option, and tweaking the script that assembled the RAID array, as well as making sure the right devices existed. Poor old Jon had to switch back to the temporary install on the first disk a few times for me until I successfully got everything to work, but I really enjoyed working through the issues and further improving my understanding of Debian's initrd. It really does make the bootup a bit... fragile.

Once I managed to successfully boot from the degraded RAID array on /dev/sda (previously /dev/sdb), I copied the partition table across, and hot-added the second disk to the array and let it rebuild.

[03:02] [tech] [permalink]

Tuesday, 28 June 2005

sar breakage

I recently decided to take a bit more of an interest in what my server (which was starting to chug a bit) was doing, and started running sar. Then I thought I should actually look at what sar was collecting, so I wrote a script to throw it into a database.

Then I upgraded sysstat to 6.0.0, and the -H option promptly disappeared. That option produced output in a delimited format, which made throwing the data into PostgreSQL trivial. This made me sad. I'm curious as to why said option disappeared.

Update:

I've discovered that in 6.0 there's a new command called sadf, which is like a wrapper around sar. Run with the -d option, it produces the equivalent of the old sar -H output.
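For what it's worth, delimited output like this is easy to pick apart before loading it anywhere. Here is a minimal sketch in Python, assuming semicolon-separated fields and comment lines that start with "#". The sample rows, the hostname and the column layout are made up for illustration, not captured from a real sadf run:

```python
import csv
import io

# Hypothetical sample of semicolon-delimited output. The column layout
# (host; interval; timestamp; CPU; metric values...) is an assumption.
SAMPLE = """\
# hostname;interval;timestamp;CPU;%user;%system;%idle
myhost;600;2005-06-28 06:10:00 UTC;-1;12.50;3.25;84.25
myhost;600;2005-06-28 06:20:00 UTC;-1;40.00;10.00;50.00
"""


def parse_delimited(text, delimiter=";"):
    """Split delimited text into rows, skipping blank and '#' comment lines."""
    rows = []
    for fields in csv.reader(io.StringIO(text), delimiter=delimiter):
        if not fields or fields[0].startswith("#"):
            continue
        rows.append(fields)
    return rows


rows = parse_delimited(SAMPLE)
for row in rows:
    print(row)
```

Each row comes back as a plain list of strings, which can then be handed to whatever does the database insert.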

[16:20] [tech] [permalink]

Sunday, 26 June 2005

There's no outage like an unscheduled outage

At about 21:15 on Friday night daedalus seems to have shat itself. It was still pingable, and attempts to make a TCP connection on port 22 resulted in a connection being made and then unceremoniously closed before an SSH banner was sent. HTTP requests just timed out. It looked like a good bit of resource starvation to me. I had an SSH connection open, and attempts to get it to do anything resulted in the packets being acknowledged, but no actual response.
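The "connects, then closes before the banner" symptom can be told apart from a refused or timed-out connection with a few lines of Python. This is only a sketch. The function name and its return convention are my own, and it only reads the first line the server sends:

```python
import socket


def read_ssh_banner(host, port=22, timeout=5.0):
    """Return the first line the server sends after connecting.

    Returns '' if the server accepts the connection and then closes it
    without sending anything, and None if the connection fails or times out.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.settimeout(timeout)
            data = b""
            # An SSH identification string is at most 255 bytes and ends in a newline.
            while b"\n" not in data and len(data) < 255:
                chunk = sock.recv(255)
                if not chunk:  # peer closed the connection
                    break
                data += chunk
    except OSError:  # refused, unreachable, reset or timed out
        return None
    return data.split(b"\n", 1)[0].decode("ascii", "replace").strip()
```

Calling read_ssh_banner with the host name of the box would return something beginning with "SSH-" from a healthy server, and an empty string for the behaviour described above.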

Friday outages always suck because generally the earliest someone can reboot the box is Monday morning. Fortunately, Ben was kind enough to go in for me on Sunday morning and kick it in the guts. Ah the joys of being 1000 kilometres from my box... I think it might have also got wind that I was thinking of replacing it with something a bit newer and gruntier, and got offended or something.

There's no good evidence of what actually happened. It looks as if it was mostly dead from around 21:15. No cron jobs ran, no log entries, nothing. These are the worst "crashes" to try and diagnose. I know for a fact it's short on RAM, and a UDMA-66 IDE cable would help reduce I/O bottlenecks.

[00:34] [tech] [permalink]

Thursday, 23 June 2005

Quotable quote

Mikal: Dude, I run everything as root. Do you think I'd be running a firewall?

Said while trying to get a point-to-point Ethernet connection to work between our laptops at this month's meeting.

[02:18] [clug] [permalink]

Tuesday, 21 June 2005

Complying with Policy 10.1 is harder than it looks

I've been bashing on dhcp3 just a bit lately. The current thing I'm working on is bringing the Standards-Version up to the present day. Using the upgrading checklist that comes with debian-policy (which is mighty handy I might add), I've gotten to the stuff in section 10.1 first referred to by Policy version 3.5.7.0, namely the stuff about supporting building packages with the optional use of DEB_BUILD_OPTIONS.

The snippet in 10.1 makes it look a lot easier than it really is.

Certainly, I think the argus-server source package already had a pile of

     CFLAGS = -Wall -g
     INSTALL = install
     INSTALL_FILE    = $(INSTALL) -p    -o root -g root  -m  644
     INSTALL_PROGRAM = $(INSTALL) -p    -o root -g root  -m  755
     INSTALL_SCRIPT  = $(INSTALL) -p    -o root -g root  -m  755
     INSTALL_DIR     = $(INSTALL) -p -d -o root -g root  -m  755
     
     ifneq (,$(findstring noopt,$(DEB_BUILD_OPTIONS)))
     CFLAGS += -O0
     else
     CFLAGS += -O2
     endif
     ifeq (,$(findstring nostrip,$(DEB_BUILD_OPTIONS)))
     INSTALL_PROGRAM += -s
     endif
type stuff in debian/rules when I inherited it, but I only discovered how ineffectual this is on its own when I tried to add it to the debian/rules for dhcp3, and did some closer inspection of what was really going on.

Basically, setting CFLAGS in debian/rules isn't worth a pinch of shit unless you pass it to the call to ./configure like

./configure CFLAGS="$(CFLAGS)"
Even then, all bets are off as to how it's going to work, depending on the upstream Makefile.

Again, using the Argus source package as my benchmark, I'd always casually eyeballed the build logs, seen lots of "gcc -O2" going on, and assumed it was because of my CFLAGS in debian/rules.

Wrong.

As it happened, the upstream Makefile was already using -O2. On closer inspection, there was no -Wall, so my CFLAGS in debian/rules wasn't being used at all.

So I fixed it (for Argus) by passing CFLAGS to the ./configure invocation. The next problem is that because the upstream Makefile is using -O2, I'm now actually passing -O2 twice. The downside of this is that if a user builds the package with "DEB_BUILD_OPTIONS=noopt", it's going to pass "CFLAGS=-Wall -g -O0" to gcc, along with the -O2 in the upstream Makefile, and $DEITY only knows what optimisation level it's going to actually use.
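As far as I can tell from the GCC documentation, when several -O options are given, the last one on the command line is the one that takes effect. So the outcome hinges on where the upstream Makefile puts its own -O2 relative to CFLAGS. Here is a rough sketch of that reasoning in Python. The function names are mine, and the upstream flag list is hypothetical:

```python
def cflags_for(deb_build_options):
    """Mirror the debian/rules logic: start with -Wall -g, then add -O0 if
    'noopt' appears anywhere in DEB_BUILD_OPTIONS, otherwise -O2."""
    flags = ["-Wall", "-g"]
    flags.append("-O0" if "noopt" in deb_build_options else "-O2")
    return flags


def effective_opt_level(flags):
    """Return the last -O flag in the list, or None if there is none."""
    level = None
    for flag in flags:
        if flag.startswith("-O"):
            level = flag
    return level


ours = cflags_for("noopt")
upstream = ["-O2"]  # hypothetical: what the upstream Makefile adds on its own

print(effective_opt_level(upstream + ours))  # our CFLAGS come last
print(effective_opt_level(ours + upstream))  # upstream's -O2 comes last
```

If the upstream flags end up after ours, a noopt build would quietly still be optimised, which is a good argument for patching the -O2 out.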

I think to do things properly, I'm going to have to patch the -O2 out of the upstream Makefile (for Argus).

Coming back to the DHCP package, it's weird. The configure script isn't actually a GNU autoconf configure script, so passing it a CFLAGS argument does diddley-squat. If I invoke make with a CFLAGS environment variable, it overrides the options passed to the compiler in the Makefile, which most importantly includes a whole bunch of -I's, so the build fails completely. So I'm not quite sure how I'm going to implement DEB_BUILD_OPTIONS support for this sucker just yet.

[06:28] [debian] [permalink]

Tuesday, 14 June 2005

The power of procrastination

I'm sure if I look back over my Debian activities, I'll always find a spike around exams, when I should be studying. Problem is, I have the attention span of a newt, and doing some QA or general packaging work is so much more appealing.

With that in mind, I present to you 3.0.2-1 of dhcp3.

I'll do some testing of it after my exams, and then upload it to unstable.

Now, it's time for a hotdog...

[20:28] [debian] [permalink]

Saturday, 11 June 2005

Cool cat...

[16:18] [life] [permalink]

Monday, 06 June 2005

You know you've been doing too much Java

when you start writing object-oriented PHP...

That said, it is kinda cool... I've written a class that tells you how many days or how many weeks until a specified date (no prize for guessing what I'm using it for).
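The PHP itself isn't reproduced here, but the idea fits in a few lines. This is a sketch in Python rather than PHP. The class and method names are invented for illustration, and the target date in the usage note is arbitrary:

```python
from datetime import date


class Countdown:
    """Count the days or whole weeks remaining until a target date."""

    def __init__(self, target):
        self.target = target

    def days_until(self, today=None):
        # Allow 'today' to be injected so the result is reproducible.
        today = today or date.today()
        return (self.target - today).days

    def weeks_until(self, today=None):
        # Whole weeks only; any leftover days are dropped.
        return self.days_until(today) // 7
```

For example, Countdown(date(2005, 12, 25)).days_until(date(2005, 12, 4)) gives 21 days, and weeks_until for the same dates gives 3 whole weeks.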

[05:25] [uni] [permalink]

Sunday, 05 June 2005

Configuring the timezone on a Cyclades AlterPath ACS

So I noticed that the timezone on these puppies at work was wrong, so I went about trying to fix it. I happened upon this page in German, from which, after translation (thanks Babel Fish!), I managed to decipher the following:

The factory default contents of /etc/TIMEZONE are:

GST+7DST+6,M4.1.0,M10.5.0
and what I believe to be the desired contents for Australia/NSW or ACT:
EST+10EDT+11,M10.5.0/02:00,M3.4.0/03:00
From my understanding, this translates to something like "The default timezone is GMT+10, but when we're in daylight saving time we want GMT+11. Daylight saving kicks in on the last Sunday of the tenth month at 2am and ends on the last Sunday of the third month at 3am." (So M10.5.0 means "day zero of week 5 of month 10".) I presume this is a BusyBox thing. The things people do to save space.
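To sanity-check a rule like M10.5.0, it helps to turn it into an actual calendar date. The sketch below assumes the POSIX-style Mm.w.d convention, where day 0 is Sunday and week 5 means "the last such day in the month". The function name is my own:

```python
import calendar
from datetime import date


def rule_date(year, month, week, weekday):
    """Resolve an Mm.w.d rule to a date.

    weekday: 0 = Sunday .. 6 = Saturday
    week:    1..4 = nth occurrence, 5 = last occurrence in the month
    """
    first = date(year, month, 1)
    # Python counts Monday as 0; shift so that Sunday is 0.
    first_dow = (first.weekday() + 1) % 7
    day = 1 + (weekday - first_dow) % 7 + 7 * (week - 1)
    days_in_month = calendar.monthrange(year, month)[1]
    while day > days_in_month:  # week 5 may overshoot: step back to the last one
        day -= 7
    return date(year, month, day)


print(rule_date(2005, 10, 5, 0))  # M10.5.0 in 2005
print(rule_date(2006, 3, 5, 0))   # M3.5.0 in 2006
```

For 2005 and 2006 this lands on 30 October and 26 March respectively, both of which are Sundays.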

Update

It's a uClibc thing. Some good documentation on the TZ variable is at http://www.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap08.html

Update

I swear drugs are involved here. I had to use

EST-10,EDT-11,M10.5.0/02:00,M3.5.0/03:00
to get things to work as expected, which from my interpretation of some other documentation, really means I'm saying we're 10 hours behind UTC, which isn't the case, but yields a correct time. Go figure.
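One possible explanation, going by the Open Group page linked above: the offset in TZ is defined as the value added to local time to arrive at UTC, so a negative offset means east of Greenwich, that is, ahead of UTC. If uClibc follows that convention, EST-10 is the right way to say "ten hours ahead". Here is a small sketch of the sign flip in Python. The function name is mine, and it only handles the simple name-plus-offset part of the string, not the DST rules:

```python
import re
from datetime import timedelta


def utc_offset_from_posix(std_offset):
    """Convert a 'NAME[+-]hh[:mm]' TZ fragment to (name, offset from UTC).

    The POSIX offset is what you add to local time to get UTC, so the
    conventional "UTC+10" style offset is its negation.
    """
    match = re.fullmatch(r"([A-Za-z]{3,})([+-]?)(\d{1,2})(?::(\d{2}))?", std_offset)
    if match is None:
        raise ValueError("unrecognised TZ fragment: %r" % std_offset)
    name, sign, hours, minutes = match.groups()
    posix = timedelta(hours=int(hours), minutes=int(minutes or 0))
    if sign == "-":
        posix = -posix
    return name, -posix


print(utc_offset_from_posix("EST-10"))  # ten hours ahead of UTC
print(utc_offset_from_posix("EST+10"))  # ten hours behind UTC
```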

[17:22] [work] [permalink]

Saturday, 04 June 2005

Warwalking

Well kind of.

I've been in discussions with Sean Moroney, the General Manager of University House about setting up a wireless LAN.

This morning I went out armed with a couple of laptops and my Linksys WAP54G access point to see how good the signal would be from various parts of the building.

The whole exercise took a lot less time than I expected, and the signal from just the one access point at the top of one of the blocks was quite good, so I reckon about 4 access points strategically placed throughout the building should be enough to saturate it and the surrounding grounds.

[18:16] [tech] [permalink]

Carnage and mayhem

This morning when I went to use my laptop, everything seemed a bit kaput. The wireless adapter was associated with the access point, but nothing was happening. I went to my wired desktop PC and tried to bring up the web interface for the access point, and it wouldn't come up. This struck me as weird. Then I noticed that the stack of computing gear I have crammed into the wardrobe of the study seemed a bit quieter than usual. This was because caesar, my PC which is my "everything" (firewall, ADSL termination point, file server, fax gateway) box was off.

I checked the power to it, and couldn't get it to switch on. If I pulled the power cable and reinserted it, it'd flash some power lights briefly but then die again.

I pulled the lid off, and poked around and concluded that perhaps the power supply had died. Luckily there was a computer fair today, so I headed out there to try and find preferably a new micro-NLX case, as I didn't like my chances of getting just a power supply of the right size and shape.

Lucked out on the case, so I bought a $25 ATX power supply with the intention of using it and leaving the lid off and generally doing lots of dodgy power stuff to power the box from the ATX power supply externally.

Got home, tried that, and a pop and a bad smell later, I declared that if it actually wasn't the motherboard that had been dead before, it probably was now, as the box was still exhibiting the same behaviour with the external power supply.

So I raced back to the computer fair for a second time, and managed to get there within the last 30 minutes before closing time and bought a 1.7 GHz Pentium IV Compaq Evo with 256 MB of RAM and a 20 GB hard drive for $350. I got home, slapped the old hard drive in it, and everything just worked, which was really nice.

I've got to say that apart from the stupid proprietary screws, this box is so nice to work on. You can get the lid off without using any tools, and then the CDROM, floppy and hard drives are on a tray that flips up and lets you get at the CPU and DIMM slots. The hard drive takes some wide proprietary screws that allow it to just slot into a holder for it, that has a quick release catch. I really liked that, and was prepared to put up with the stupid screws for that feature.

The PCI slots are on a riser card, and that is attached to a separate chassis, and you can just rip that straight out, riser and all, and seat the PCI cards comfortably and then reseat the whole box and dice when you're done.

All in all, I think it was a good buy, so if I can get another one as a sacrificial play box at the next computer fair I probably will.

[06:47] [tech] [permalink]

Shotgun wedding

Well not quite.

The job overseas is looking like more of a vague possibility, and so for the potential purposes of visa applications, we have decided to bring forward the date of our wedding to July 23 rather than April 22 next year.

So now we have the joy of trying to arrange a wedding in 7 weeks or something.

[06:25] [life] [permalink]

Thursday, 02 June 2005

Firewall-1: Great when it works, utter poo when it doesn't

Let me just start off by saying that CheckPoint Firewall-1 is probably my preferred EPLed packet-filtering firewall. The GUI is good, the fact that it is "object-oriented" is also good. What is not good is the complexity of a deployment and the ability to troubleshoot it when things go wrong. When it's broke, you are left with your pants down.

I've just deployed some new (non-production) firewalls. I've got a management station (which is also an enforcement module), which is multi-homed (5 interfaces). There are two enforcement modules managed by this management station, one on each of two of the 5 interfaces.

I'm able to push a policy from the management station to the enforcement modules. That's all good. If I try to make the enforcement module fetch a policy from the management station, it decides it would like to talk to it on one of the unrelated interfaces, rather than on the interface that is directly connected, or to the interface with the IP address of the management station object in the rulebase. Bonkers.

Furthermore, if I do issue a

fw fetch master
this works fine. It's just trying to log to the wrong interface of the management station, and fetch its policy (by default) from the wrong interface. Highly vexing stuff. As it's non-production, I'm getting close to reinstalling Firewall-1, but I wouldn't mind determining the cause for future reference in situations where this might not be an option.

[17:54] [work] [permalink]

Wednesday, 01 June 2005

Mmm Cyclades AlterPath ACS good

Disclaimer: I like Cyclades products. They Just Work™.

Yesterday and today I got to have a fiddle with the new AlterPath ACS product, and it's very special. I'd previously used the TS product a few years ago, and been quite happy with it, especially after having to use Digi's cheap imitation, which didn't Just Work™.

I got LDAP authentication for individual ports working reasonably painlessly, and for good measure, then enabled local LDAP authentication to the terminal server itself, with similar ease.

The web GUI is a bit sluggish, but was still useable.

The userspace environment on the box was a bit stripped out for my liking, but still useable. A non-BusyBox vi would go a long way, as would iproute, but other than that, I really can't complain.

[18:55] [work] [permalink]