Sysadmin at work
When I heard about the 4th Linux Showdown, sponsored by TrueAbility, I was pretty excited. I’m a pretty competitive guy, so the idea of competing in a sysadmin’s challenge sounded like fun.
In the Linux Showdown, you get 30 minutes to complete a certain number of sysadmin tasks. Some of the tasks are pretty simple, while some of the others become more difficult. I entered the first day and managed to get 9th place with a score of 100% and a time of just under 17 minutes.
The second day I ran into trouble. One of the tasks was to reset the mysql root password, and, though I followed the directions here, twice, I was never able to log into mysql as root. The commands seemed to be running correctly, but I was locked out.
In my day-job as the system administrator for a school, I would keep bashing away at the problem until I figured out what I was doing wrong. In the competition, I ran out of time after fifteen minutes of debugging and ended up with a lousy 40%. Ouch!
I was frustrated, but figured the third day’s competition should fit a bit better. The hint said that it was a scripting competition, and my python foo is pretty decent. Sure enough, day three involved finding files with modification times between two dates, adding them to a database, and then tarring them up.
I came up with a python script that found the necessary files and added them to the database. Except my clever ‘INSERT’ statement didn’t actually work. If I manually copied and pasted it into mysql, it worked perfectly, but it didn’t run from the script. Grrr. I spent ten minutes debugging… and my time was up!
Well, that sucked. This time I got an impressive 20%. Double ouch!
After finishing the test, I went to bed and spent fifteen minutes ranting to my poor wife. The next day, after cooling off, I decided I was done. The hint for the last competition said that it had something to do with security, and I wouldn’t call myself an expert on that. If I’m getting 20% in the areas that I’m relatively good at, then what should I expect in areas that I’m less comfortable with.
Then it hit me. If I’m not comfortable with it, why not just do it for fun? If I know I’m probably going to get a zero, who cares? I checked the leaderboard, and the highest score at the time was 67%, so my zero wouldn’t be so bad. I went ahead and started the last competition.
Step one, secure the mail server. We don’t run our own mail servers here at the school and I know nothing about postfix, so I spent ten minutes or so Googling for some kind of solution, typed in what I thought was a partial fix, and then decided to give up.
Step two, secure a page on the webserver. This is something I have to do quite often, so I was able to get it done in five minutes or so.
Finally, step three, secure an FTP server. Who still uses FTP? We don’t! I wasn’t even sure what the ftp daemon’s name was, so I ran a ‘ps aux | grep ftp’. This was the only reason that I noticed that the ftp daemon wasn’t using the config file in /etc, but rather some config file in someone’s home directory. I did what I thought would secure the ftp server in both config files, and saw that I had a little over two minutes left.
Ok, I could have spent some more time on postfix, but I knew nothing about it, so I decided that I was finished. Worst case, I’d get 33% for the webserver (which was the only fix I’d actually tested). Best case, 67% for the ftp server, which I was pretty sure I’d fixed. If so, I might actually get in the top twenty. So, I logged in to the leaderboard, checked my ranking… First!??!? With 100%? What?
Apparently the random lines from Google that I put into my postfix config had secured it. Pure luck. As I followed the leaderboard for the rest of the day, it became obvious that many people with a lot of experience with apache, postfix and ftp were whipping right through the contest, missing the ftp config file in the home directory, and getting 67%, while I kept sitting on top with the lone 100%. I felt like such a fraud.
Finally, in the last hour before the contest ending, someone else found the solution five minutes faster than I did and got first place. Praise God! I still felt like a fraud, but at least first place was going to someone who knew what they were doing.
So, in four days of competitions, I got the highest score in the areas I was weakest in and the lowest score in the areas I was strongest in. That seems to indicate either that I don’t know what my strengths and weaknesses are, or that the competition needs some tweaking. Well, I think I’m at least reasonably aware of my strengths and weaknesses, and I’m very aware of how much of a role chance played in all four days of competition. So how can this competition be tweaked?
The strengths of the competition are pretty obvious. The whole point of TrueAbility is to winnow out people who talk the talk, but can’t walk the walk. When you get a résumé, you don’t know whether the applicant can actually do all the things they claim to be able to do, so, with TrueAbility, you give someone a VM and a list of tasks, and see whether or not they can do them. TrueAbility doesn’t care how they do the tasks, they just check that the tasks are completed. Brilliant!
The biggest weakness in the competition is the time limit. A vast majority of the problems we face as sysadmins need to be fixed quickly, but rarely does a complex problem need to be solved within 30 minutes. This time limit in the competition introduces a bias against those who work methodically. While hiring fast workers is always nice, basing hiring decisions based on how fast someone can code rather than how well they code is not wise.
In addition, the marking (especially for the last few days) was extremely coarse, so ranking was heavily dependent on how quickly you finished. This was especially noticeable in the first day, where the only difference between 1st place and 28th place was whether you took 10 minutes to finish the job or 30 minutes. As was obvious in the last day’s competition, this emphasis on time caused people to rush so much that they made mistakes. Time makes a lousy basis for ranking.
So what’s the solution? I see two complementary things that could be done to improve the competition. The first is to break down the grading even more, and assign different values to the different tasks. I’d even add in some standard tasks (with a total score of a maximum of 20%) along the lines of “Make sure that you close any ports not needed for your task”, “Disallow password logins over ssh and set up the server to trust your ssh key”, and “Replace your Ubuntu install with the real sysadmin’s OS: Fedora”. Ok, I’m half joking on that last one, but you get the idea. The key thing is that it should be almost impossible to get 100%, but a mediocre sysadmin should be able to hit 70% with only minor difficulty, and a talented sysadmin shouldn’t have much trouble reaching 90%.
The other thing that would help would be a removal of the hard deadline. Instead, allow candidates to continue working beyond the time limit, with a deduction of 1-2% for every minute. This introduces a cost to breaking the deadline without causing the candidate to completely fail because they needed ten more minutes.
With these two adjustments, time should become secondary to doing the job right. If I spend 10 minutes getting 90%, I’ll still get a lower score than someone who takes their time to do it right in 30 minutes. And, if I spend 40 minutes reaching 90%, I’ll only lose 20% for going over and end with a score of 70%, rather than sitting at zero because I just couldn’t finish my script within the deadline.
TrueAbility, thank you for the time and effort you’ve put into developing the problems for this competition, and thank you for the creative idea of a sysadmin’s competition in the first place.
And I really want to congratulate those who were able to consistently get high scores under the tough time limits.
Now I’m off to get some sleep before our first day of school.
Messy wires credit – Cisco Spaghetti by CHRISTOPHER MACSURAK. Used under the CC-BY 2.0 license.