Odds and ends

Picture of the steles at Nahr el Kalb

 

Since I last posted, there have been a number of small updates, but nothing that seemed big enough to write about. So I figured it might be worth posting a short summary of what I’ve been up to over the last couple of months.

In no particular order:

FOSDEM
I had the opportunity to visit FOSDEM for the first time last month. Saw lots of cool things, met lots of cool people and even managed to bag a LibreOffice hoodie. Most importantly, it was a chance to build friendships, which have a far higher value than any code ever will.

Wireless access points
I should probably write a proper post about this sometime, but a number of years ago we bought about 30 TP-LINK WR741ND wireless APs and slapped a custom build of OpenWRT on them. We installed the last spare a couple of months ago and ran into problems finding a decent replacement (specific hardware revisions can be quite difficult to find in Lebanon). After much searching, we managed to get ahold of a TP-LINK WR1043ND for testing and our OpenWRT build works great on it. Even better, it has a four-port gigabit switch which will give us much better performance than the old 100Mbps ones.

LizardFS patches
I ran into a couple of performance issues that I wrote some patches to fix. One is in the process of being accepted upstream, while the other has been deemed too invasive, given that upstream would like to deal with the problem in a different way. For the moment, I’m using both on the school system, and they’re working great.

Kernel patch (tools count, right?)
After the F26 mass rebuild, I ran into problems building the USB/IP userspace tools with GCC 7. Fixing the bugs was relatively simple, and, since the userspace tools are part of the kernel git repository, I got to submit my first patches to the LKML. The difference between a working kernel patch and a good kernel patch can be compared to the difference between a Volkswagen Beetle and the Starship Enterprise. I really enjoyed the iterative process, and, after four releases, we finally had something good enough to go into the kernel. A huge thank you goes out to Peter Senna, who looked over my code before I posted it and made sure I didn’t completely embarrass myself. (Peter’s just a great guy anyway. If you ever get the chance to buy him a drink, definitely do so.)

Ancient history
As of about three weeks ago, I am teaching history. Long story as to how it happened, but I’m enjoying a few extra hours per week with my students, and history, especially ancient history, is a subject that I love. To top it off, there aren’t many places in the world where you can take your students on a field trip to visit the things you’re studying. On Wednesday, we did a trip to Nahr el-Kalb (the Dog River) where there are stone monuments erected by the ancient Assyrian, Egyptian, and Babylonian kings among others. I love Lebanon.

Multiseat systems and the NVIDIA binary driver

Building mesa

Ever since our school switched to Fedora on the desktop, I’ve either used the onboard Intel graphics or AMD Radeon cards, since both are supported out of the box in Fedora. With our multiseat systems, we now need three external video cards on top of the onboard graphics on each system, so we’ve bought a large number of Radeon cards over the last few years.

Unfortunately, our local supplier has greatly reduced the number of AMD cards that they stock. In their latest price lists, they have a grand total of two Radeon cards in our price range, and one of them is almost seven years old!

This has led me to take a second look at NVIDIA cards, and I’m slowly coming back around to the concept of buying them and maybe even using their binary drivers. Our needs have changed since we first started using Linux, and NVIDIA’s binary driver does offer some unique benefits.

As we’ve started teaching 3D modeling using Blender, render time has become a real bottleneck for some of our students. We allow students to use the computers before and after school, but some of them don’t have much flexibility in their transportation and need to get their rendering done during the school breaks. Having two or three students all trying to render at the same time on a single multiseat system can lead to a sluggish system and very slow rendering. The easiest way to fix this is to do the rendering in the GPU, which Blender does support, but only using NVIDIA’s binary driver.

So about a month ago, I ordered a cheap NVIDIA card for testing purposes. I swapped it with an AMD card on one of our multiseat systems and powered it up. Fedora recognized the card using the open-source nouveau driver and everything just worked. Beautiful!

Then, a few hours later, I noticed the system had frozen. I rebooted it, and, after a few hours, it had frozen again. I moved the NVIDIA card into a different system, and, after a few hours, it froze while the original system just kept running.

Some research showed that the nouveau driver sometimes has issues with multiple video cards on the same system. There was some talk about extracting the binary driver’s firmware and using it in nouveau, but I decided to see if I could get the binary driver working without breaking our other Intel and AMD seats.

The first thing I did was upgrade the test system to Fedora 25 in hopes of taking advantage of the work done to make mesa and the NVIDIA binary driver coexist. I then installed the binary NVIDIA drivers from this repository (mainly because his version of blender already has the CUDA kernels compiled in). The NVIDIA seat came up just fine, but I quickly found that mesa in Fedora 25 isn’t built with libglvnd enabled, so none of the seats based on open drivers came up. (libglvnd is a shim that sits between your applications and either the mesa or the NVIDIA OpenGL implementation, depending on which card you’re using.) But, even when it was enabled, I ran into this bug, so I ended up extending this patch so it would also work with Gallium drivers and applying it.

This took me several steps closer, but apparently the X11 GLX module is not part of libglvnd and NVIDIA sets the Files section in xorg.conf to use its own GLX module (which, oddly enough, doesn’t work with the open drivers). I finally worked around this via the ugly hack of creating two different xorg.conf.d directories and telling lightdm to use the NVIDIA one when loading the NVIDIA seat.

Voilà! We now have a multiseat system with one Intel built-in card using the mesa driver, two AMD cards using the mesa Gallium driver, and one NVIDIA card using the NVIDIA binary driver. And it only cost me eight hours and my sanity.

So what needs to happen to make this Just Work™? Either libglvnd needs to also include the X11 GLX module or we need a different shim to accomplish the same thing. And Fedora needs to build mesa with libglvnd enabled (but not until this bug is fixed!)

My mesa build is here and the source rpm is here. There is a manual “Provides: libGL.so.1()(64bit)” in there that isn’t technically correct, but I really didn’t want to recompile negativo17’s libglvnd to add it in and my mesa build requires that libglvnd implementation.

My xorg configs are here and my lightdm configuration is here. Please note that the xorg configs have my specific PCI paths; yours may differ.

And I do plan to write a script to automate the xorg and lightdm configs. I’ll update this post when I’ve done so.
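Until that script exists, here is a rough sketch of the kind of thing it might generate. It is not my actual configuration: the seat names, the directory /etc/X11/xorg-nvidia.conf.d, and the choice of lightdm's xserver-command key combined with Xorg's -configdir option are all assumptions made for illustration.

```python
def seat_section(seat_name, configdir=None):
    """Return a lightdm.conf section for one seat.

    If configdir is given, the X server for that seat is started with
    Xorg's -configdir option so it reads a separate xorg.conf.d directory.
    """
    lines = [f"[Seat:{seat_name}]"]
    if configdir is not None:
        lines.append(f"xserver-command=X -configdir {configdir}")
    return "\n".join(lines) + "\n"


def lightdm_config(seats):
    """Build seat sections from a mapping of seat name -> configdir (or None)."""
    return "\n".join(seat_section(name, cfgdir) for name, cfgdir in seats.items())


if __name__ == "__main__":
    # Hypothetical layout: seat0 keeps the default directory,
    # seat-1 is the NVIDIA seat with its own config directory.
    print(lightdm_config({
        "seat0": None,
        "seat-1": "/etc/X11/xorg-nvidia.conf.d",
    }))
```

Only the NVIDIA seat gets an override here, so the Intel and AMD seats keep reading the stock xorg.conf.d directory.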

Sidenote: As I was looking through my old posts to see if I had anything on NVIDIA, I came across a comment by Seth Vidal. He was an excellent example of what the Fedora community is all about, and I really miss him.

Update: Configuration has become much simpler. An updated post is here.

Notes on a mass upgrade to Fedora 23

Picture of Fedora 23 desktop

Fedora 23

One of the hardest parts of running Fedora in a school setting is keeping on top of the upgrades, and I ended up falling a few months behind. Fedora 23 was released back in November, and it took me until February to start the upgrade process.

For our provisioning process, we’ve switched from a custom koji instance to ansible (with our plays on github), and this release was the first time I was really able to take advantage of it. I changed our default kickstart to point to the Fedora 23 repositories, installed it on a test system, ran ansible on it, and voilà, I had a working Fedora 23 setup, running perfectly with all our school’s customizations. It was the easiest upgrade experience I’ve ever had!

Well, mostly.

As usual, the moment you think everything is perfect is the moment everything goes wrong. On our multiseat systems, we have three external AMD graphics cards along with the internal Intel graphics. The first bug I noticed was that the Intel card wasn’t doing any graphics acceleration. It turns out that VGA arbitration is automatically turned on if you have more than one video card, and Intel cards don’t support it in DRI2. DRI3 does handle arbitration just fine, but it was (and still is) disabled in the latest xorg-x11-drv-intel in the updates repository. Luckily for me, there’s a build in koji that re-enables DRI3. Problem solved.

The second bug was…odd. While we use gnome-shell as the default desktop environment in the school, we use lightdm for logging in, mainly because of its flexibility. We run xscreensaver in the login screen (and only in the login screen) to make it clear which computers are off, which are on, and which are logged in. GDM doesn’t support xscreensaver, but lightdm does. And this brings us back to the bug. On the Intel seat, moving the mouse or pressing a key would stop the screensaver as expected, but the screen would remain black except for the username control. It seems that the “VisibilityNotify” event isn’t being honored by the driver (though don’t ask me why it should be passed down to the driver). I filed a bug, and then finally figured out that fading xscreensaver back in works around the problem.

The third bug is even stranger. On the teacher’s machine, we have a small script that starts x11vnc (giving no control to anyone connecting to it) so the teacher can give a demonstration to the students. But after installing Fedora 23 on the teacher’s machine, the demo kept showing the same three frames over and over. The teacher’s system isn’t multiseat and is using the builtin Intel graphics, so, oddly enough, disabling DRI3 fixed the problem. I filed another bug.

When upgrading the staff room systems, I ran into a bug in which cups runs screaming into the night (ok, slight exaggeration) if you have a server announcing printers over both the old cups and new dnssd protocols. Since we don’t have any pre-F21 systems any more, I’ve just disabled the old cups protocol on the server.

And, finally, my principal, who teaches computers to grades 11 and 12, came in to ask me why LibreOffice was crashing for a couple (and only a couple) of his students when they were formatting cells on a spreadsheet that he gave them. After some fancy footwork involving rm’d .config/libreoffice directories and files saved into random odd formats and then back into ods, we finally managed to format the cells without a crash. Lovely.

All this brings me back to ansible. In each of the bugs that required changes to the workstations, all I had to do was update the ansible scripts and push the changes out. Talk about painless! Ansible has made this job so much easier!

And I do want to finish by saying that these bugs are part of the reason that I love Fedora. With Fedora, I have the freedom to fix these problems myself. For both the cups bug and the xscreensaver bug, I was able to dig into the source code to start tracking down where the problem lay and come up with a workaround. And if I can just get the LibreOffice bug to reproduce, I could get a crash dump off of it and possibly figure it out too. Hurrah for source code!

How do you rank a sysadmin?

Sysadmin at work

When I heard about the 4th Linux Showdown, sponsored by TrueAbility, I was pretty excited. I’m a pretty competitive guy, so the idea of competing in a sysadmin’s challenge sounded like fun.

In the Linux Showdown, you get 30 minutes to complete a certain number of sysadmin tasks. Some of the tasks are pretty simple, while some of the others become more difficult. I entered the first day and managed to get 9th place with a score of 100% and a time of just under 17 minutes.

The second day I ran into trouble. One of the tasks was to reset the mysql root password, and, though I followed the directions here, twice, I was never able to log into mysql as root. The commands seemed to be running correctly, but I was locked out.

In my day-job as the system administrator for a school, I would keep bashing away at the problem until I figured out what I was doing wrong. In the competition, I ran out of time after fifteen minutes of debugging and ended up with a lousy 40%. Ouch!

I was frustrated, but figured the third day’s competition should fit a bit better. The hint said that it was a scripting competition, and my python-fu is pretty decent. Sure enough, day three involved finding files with modification times between two dates, adding them to a database, and then tarring them up.

I came up with a python script that found the necessary files and added them to the database. Except my clever ‘INSERT’ statement didn’t actually work. If I manually copied and pasted it into mysql, it worked perfectly, but it didn’t run from the script. Grrr. I spent ten minutes debugging… and my time was up!
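For the curious, the shape of that task looks roughly like the sketch below. This is not the script from the competition: it uses sqlite3 from the standard library instead of mysql so that it is self-contained, and the table name and column layout are made up. I never did find out why my INSERT failed from the script; a forgotten commit is a classic cause with Python database drivers, but that is only a guess, which is why the sketch commits explicitly.

```python
import os
import sqlite3
import tarfile
from datetime import datetime


def files_modified_between(root, start, end):
    """Yield paths under root whose mtime falls within [start, end]."""
    lo, hi = start.timestamp(), end.timestamp()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if lo <= os.path.getmtime(path) <= hi:
                yield path


def record_and_archive(root, start, end, db_path, tar_path):
    """Insert matching paths into a table, then add them to a gzipped tarball."""
    paths = sorted(files_modified_between(root, start, end))
    conn = sqlite3.connect(db_path)
    try:
        # Hypothetical schema, for illustration only.
        conn.execute("CREATE TABLE IF NOT EXISTS files (path TEXT, mtime REAL)")
        # Parameterized INSERT: the driver handles quoting of odd filenames.
        conn.executemany(
            "INSERT INTO files (path, mtime) VALUES (?, ?)",
            [(p, os.path.getmtime(p)) for p in paths],
        )
        conn.commit()  # without an explicit commit the rows are not saved
    finally:
        conn.close()
    with tarfile.open(tar_path, "w:gz") as tar:
        for p in paths:
            tar.add(p)
    return paths
```

A call such as record_and_archive("/srv/data", datetime(2013, 1, 1), datetime(2013, 12, 31), "files.db", "files.tar.gz") would cover one year; the paths here are placeholders.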

Well, that sucked. This time I got an impressive 20%. Double ouch!

After finishing the test, I went to bed and spent fifteen minutes ranting to my poor wife. The next day, after cooling off, I decided I was done. The hint for the last competition said that it had something to do with security, and I wouldn’t call myself an expert on that. If I’m getting 20% in the areas that I’m relatively good at, then what should I expect in areas that I’m less comfortable with?

Then it hit me. If I’m not comfortable with it, why not just do it for fun? If I know I’m probably going to get a zero, who cares? I checked the leaderboard, and the highest score at the time was 67%, so my zero wouldn’t be so bad. I went ahead and started the last competition.

Step one, secure the mail server. We don’t run our own mail servers here at the school and I know nothing about postfix, so I spent ten minutes or so Googling for some kind of solution, typed in what I thought was a partial fix, and then decided to give up.

Step two, secure a page on the webserver. This is something I have to do quite often, so I was able to get it done in five minutes or so.

Finally, step three, secure an FTP server. Who still uses FTP? We don’t! I wasn’t even sure what the ftp daemon’s name was, so I ran a ‘ps aux | grep ftp’. This was the only reason that I noticed that the ftp daemon wasn’t using the config file in /etc, but rather some config file in someone’s home directory. I did what I thought would secure the ftp server in both config files, and saw that I had a little over two minutes left.
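The same check can be scripted. The sketch below is a Linux-only illustration that reads /proc rather than parsing ps output; the function names are my own invention. The point is simply that looking at a daemon's full command line shows which config file it was really started with.

```python
import os


def matching_cmdlines(cmdlines, needle):
    """Return the (pid, argv) pairs whose command line mentions needle."""
    return [(pid, argv) for pid, argv in cmdlines
            if any(needle in arg for arg in argv)]


def read_proc_cmdlines(proc="/proc"):
    """Collect (pid, argv) for every process readable under /proc (Linux only)."""
    result = []
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "cmdline"), "rb") as fh:
                raw = fh.read()
        except OSError:
            continue  # process exited or is not readable
        # Arguments in /proc/<pid>/cmdline are separated by NUL bytes.
        argv = [a.decode(errors="replace") for a in raw.split(b"\0") if a]
        if argv:
            result.append((int(entry), argv))
    return result


if __name__ == "__main__":
    for pid, argv in matching_cmdlines(read_proc_cmdlines(), "ftp"):
        print(pid, " ".join(argv))
```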

Ok, I could have spent some more time on postfix, but I knew nothing about it, so I decided that I was finished. Worst case, I’d get 33% for the webserver (which was the only fix I’d actually tested). Best case, 67% for the ftp server, which I was pretty sure I’d fixed. If so, I might actually get in the top twenty. So, I logged in to the leaderboard, checked my ranking… First!??!? With 100%? What?

Apparently the random lines from Google that I put into my postfix config had secured it. Pure luck. As I followed the leaderboard for the rest of the day, it became obvious that many people with a lot of experience with apache, postfix and ftp were whipping right through the contest, missing the ftp config file in the home directory, and getting 67%, while I kept sitting on top with the lone 100%. I felt like such a fraud.

Finally, in the last hour before the contest ended, someone else found the solution five minutes faster than I did and got first place. Praise God! I still felt like a fraud, but at least first place was going to someone who knew what they were doing.

So, in four days of competitions, I got the highest score in the areas I was weakest in and the lowest score in the areas I was strongest in. That seems to indicate either that I don’t know what my strengths and weaknesses are, or that the competition needs some tweaking. Well, I think I’m at least reasonably aware of my strengths and weaknesses, and I’m very aware of how much of a role chance played in all four days of competition. So how can this competition be tweaked?

The strengths of the competition are pretty obvious. The whole point of TrueAbility is to winnow out people who talk the talk, but can’t walk the walk. When you get a résumé, you don’t know whether the applicant can actually do all the things they claim to be able to do, so, with TrueAbility, you give someone a VM and a list of tasks, and see whether or not they can do them. TrueAbility doesn’t care how they do the tasks, they just check that the tasks are completed. Brilliant!

The biggest weakness in the competition is the time limit. A vast majority of the problems we face as sysadmins need to be fixed quickly, but rarely does a complex problem need to be solved within 30 minutes. This time limit in the competition introduces a bias against those who work methodically. While hiring fast workers is always nice, basing hiring decisions on how fast someone can code rather than how well they code is not wise.

In addition, the marking (especially for the last few days) was extremely coarse, so ranking was heavily dependent on how quickly you finished. This was especially noticeable in the first day, where the only difference between 1st place and 28th place was whether you took 10 minutes to finish the job or 30 minutes. As was obvious in the last day’s competition, this emphasis on time caused people to rush so much that they made mistakes. Time makes a lousy basis for ranking.

So what’s the solution? I see two complementary things that could be done to improve the competition. The first is to break down the grading even more, and assign different values to the different tasks. I’d even add in some standard tasks (with a total score of a maximum of 20%) along the lines of “Make sure that you close any ports not needed for your task”, “Disallow password logins over ssh and set up the server to trust your ssh key”, and “Replace your Ubuntu install with the real sysadmin’s OS: Fedora”. Ok, I’m half joking on that last one, but you get the idea. The key thing is that it should be almost impossible to get 100%, but a mediocre sysadmin should be able to hit 70% with only minor difficulty, and a talented sysadmin shouldn’t have much trouble reaching 90%.

The other thing that would help would be a removal of the hard deadline. Instead, allow candidates to continue working beyond the time limit, with a deduction of 1-2% for every minute. This introduces a cost to breaking the deadline without causing the candidate to completely fail because they needed ten more minutes.

With these two adjustments, time should become secondary to doing the job right. If I spend 10 minutes getting 90%, I’ll still get a lower score than someone who takes their time to do it right in 30 minutes. And, if I spend 40 minutes reaching 90%, I’ll only lose 20% for going over and end with a score of 70%, rather than sitting at zero because I just couldn’t finish my script within the deadline.
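To make the arithmetic concrete, here is a small sketch of the two proposals. The function names and the default 2% per minute rate are my own choices for illustration (the proposal above only says 1-2%), and TrueAbility's real marking scheme is not something I know.

```python
def final_score(raw_percent, minutes_taken, time_limit=30, penalty_per_minute=2.0):
    """Apply a per-minute deduction for time spent beyond the limit."""
    overtime = max(0, minutes_taken - time_limit)
    return max(0.0, raw_percent - penalty_per_minute * overtime)


def weighted_percent(task_results, weights):
    """Combine per-task completion (0.0-1.0) into a percentage using task weights."""
    total = sum(weights.values())
    earned = sum(weights[name] * task_results.get(name, 0.0) for name in weights)
    return 100.0 * earned / total


if __name__ == "__main__":
    # 90% of the work done in 40 minutes: 10 minutes over, 20 points off.
    print(final_score(90, 40))
```

With a 30 minute limit and 2% per minute, the 40 minute run at 90% comes out at 70%, while finishing inside the limit leaves the raw score untouched.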

TrueAbility, thank you for the time and effort you’ve put into developing the problems for this competition, and thank you for the creative idea of a sysadmin’s competition in the first place.

And I really want to congratulate those who were able to consistently get high scores under the tough time limits.

Now I’m off to get some sleep before our first day of school.

Messy wires credit – Cisco Spaghetti by CHRISTOPHER MACSURAK. Used under the CC-BY 2.0 license.

Fedora 18 – A Sysadmin’s view

Road leading down into clouds

The road less traveled

At our school we have around 100 desktops, a vast majority of which run Fedora, and somewhere around 900 users. We switched from Windows to Fedora shortly after Fedora 8 was released and we’ve hit 8, 10, 13, 16, and 17 (deploying a local koji instance has made it easier to upgrade).

As I finished putting together our new Fedora 18 image, there were a few things I wanted to mention.

The Good

  1. Offline updates: Traditionally, our systems automatically updated on shutdown. In the 16-17 releases, that became very fragile as any systemctl scriptlets in the updates would block because systemd was in the process of shutting down. Now, with systemd’s support for offline updates, we can download the updates on shutdown, reboot the computer, and install the updates in a minimal system environment. I’ve packaged my offline updater here.
  2. btrfs snapshots: This isn’t new in Fedora 18, but, with the availability of offline updates, we’ve finally been able to take proper advantage of it. One problem we have is that we have impatient students who think the reset button is the best way to get access to a computer that’s in the middle of a large update. Now, if some genius reboots the computer while it’s updating, it reverts to its pre-update state, and then attempts the update again. If, on the other hand, the update fails due to a software fault, the computer reverts to its pre-update state and boots normally. Either way, the system won’t be the half-updated zombie that so many of my Fedora 17 desktops are.
  3. dconf mandatory settings: Over the years we’ve moved from gconf to dconf, and I love the easy way that dconf allows us to set mandatory settings for Gnome. This continued working with only a small modification from Fedora 17 to Fedora 18, available here and here.
  4. Javascript config for polkit: I love how flexible this is. We push out the same Fedora image to our school laptops, but the primary difference compared to the desktop is that we allow our laptop users to suspend, hibernate and shut down their laptops, while our desktop users can’t do any of the above. What I would really like to do is have the JS config check for the existence of a file (say /etc/sysconfig/laptop), and do different things based on that, but I haven’t managed to work out how to do that yet. My first attempt is here.
  5. systemd: This isn’t a new feature in 18, but systemd deserves a shout-out anyway. It does a great job of making my workstations boot quickly and has greatly simplified my initscripts. It’s so nice to be able to easily prevent the display manager from starting before we have mounted our network directories.
  6. Gnome Shell: We actually started experimenting with Gnome Shell when it was first included in Fedora, and I switched to it as the default desktop in Fedora 13. As we’ve moved from 13 to 16, then 17, and now 18, it’s been a nice clean evolution for our users. When I first enabled Gnome Shell in our Fedora 13 test environment, the feedback from our students was very positive. “It doesn’t look like Windows 98 any more!” was the most common comment. As we’ve upgraded, our users have only become more happy with it.
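The rollback behaviour described in the btrfs item boils down to a small decision table. The sketch below only models that decision; it is not the packaged updater, and the state names are hypothetical.

```python
from enum import Enum


class UpdateState(Enum):
    NOT_STARTED = "not started"
    IN_PROGRESS = "in progress"  # marker left behind if the machine was reset mid-update
    FAILED = "failed"            # the update itself reported an error
    SUCCEEDED = "succeeded"


def next_boot_action(state):
    """Return (rollback_to_snapshot, retry_update) for the next boot."""
    if state is UpdateState.IN_PROGRESS:
        return True, True    # interrupted: revert, then attempt the update again
    if state is UpdateState.FAILED:
        return True, False   # software fault: revert and boot normally
    return False, False      # nothing to undo
```

Either way the machine ends up on a consistent snapshot, never on a half-applied update.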

The Bad

The bad in Fedora 18 mainly comes down to the one area where Linux in general, and Fedora specifically, is weak – being backwards-compatible. This was noticeable in two very specific places:

  1. Javascript config for polkit: While I was impressed with the new javascript config’s flexibility, I was most definitely not impressed that my old pkla files were completely ignored. As a system administrator, I find it frustrating when I have to completely rewrite my configuration files because “now we have a better way”. I’ve read the blog post explaining the reasoning behind the switch to the JS config, but how hard would it have been to either keep the old pkla interpreter, or, if it was really desired, rewrite the pkla interpreter in javascript? The ironic part of this is that the “old” pkla configuration was itself a non-backwards-compatible change from the even older PolicyKit configuration a little less than four years ago.
  2. dconf mandatory settings: With the version of dconf in Fedora 18, we now have the ability to have multiple user dconf databases. This is a great feature, but it requires a change in the format of the database profile files, which meant my database profile files from Fedora 17 no longer worked correctly. In fact, they caused gnome-settings-daemon to crash, which crashed Gnome and left users unable to log in. Oops. To be fair, this was a far less annoying change because I only had to change a couple of lines, but I’m still not impressed that dconf couldn’t just read my old db profile files.

As a developer, I totally understand the “I have a better way” mindset, but I think backwards compatibility is still vital. That’s why I love rsync and systemd, but have very little time for unison (three different versions in the Fedora repositories because newer versions don’t speak the same language as older versions).

I know some people will say, “If you want stability, just use RHEL.” That’s fine, but I’m not necessarily looking for stability. I like the rate of change in Fedora. What I dislike is when things break because someone wanted to do something different.

All in all, I’ve been really happy with Fedora as our school’s primary OS, and each new release’s features only make me happier. Now I need to go fix a regression in yum-presto that popped up because of some changes we made because we wanted to do something different.

GlusterFS Madness

Background
As mentioned in Btrfs on the server, we have been using btrfs as our primary filesystem for our servers for the last year and a half or so, and, for the most part, it’s been great. There have only been a few times that we’ve needed the snapshots that btrfs gives us for free, but when we did, we really needed them.

At the end of the last school year, we had a bit of a problem with the servers and came close to losing most of our shared data, despite using DRBD as a network mirror. In response to that, we set up a backup server which has the sole job of rsyncing the data from our primary servers nightly. The backup server is also using btrfs and doing nightly snapshots, so one of the major use-cases behind putting btrfs on our file servers has become redundant.
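In outline, a nightly job of that kind looks something like the sketch below. The host name, paths, and rsync flags are placeholders chosen for illustration, not our real setup; the only fixed idea is "rsync first, then take a read-only btrfs snapshot named after the date".

```python
import subprocess
from datetime import date


def backup_commands(source, dest_subvol, snapshot_dir, day=None):
    """Return the rsync and btrfs commands for one nightly backup run."""
    day = day or date.today()
    snapshot = f"{snapshot_dir}/{day.isoformat()}"
    return [
        # -a preserves permissions and times; --delete mirrors removals too.
        ["rsync", "-a", "--delete", source, dest_subvol],
        # -r makes the snapshot read-only so it cannot be altered later.
        ["btrfs", "subvolume", "snapshot", "-r", dest_subvol, snapshot],
    ]


def run_backup(source, dest_subvol, snapshot_dir):
    """Run the commands in order, stopping if either one fails."""
    for cmd in backup_commands(source, dest_subvol, snapshot_dir):
        subprocess.run(cmd, check=True)
```

Because --delete mirrors removals, it is the dated snapshots, not the live copy, that protect against files vanishing on the primary server.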

The one major problem we’ve had with our file servers is that, as the number of systems on the network has increased, our user data server can’t handle the load. The configuration caching filesystem (CCFS) I wrote has helped, but even with CCFS, our server was regularly hitting a load of 10 during breaks and occasionally getting as high as 20.

Switching to GlusterFS
With all this in mind, I decided to do some experimenting with GlusterFS. While we may have had high load on user data server, our local mirror and shared data servers both had consistently low loads, and I was hoping that GlusterFS would help me spread the load between the three servers.

The initial testing was very promising. When using GlusterFS over ext4 partitions using SSD journaling on just one server, the speed was just a bit below NFS over btrfs over DRBD. Given the distributed nature of GlusterFS, adding more servers should increase the speed linearly.

So I went ahead and broke the DRBD mirroring for our eight 2TB drives and used the four secondary DRBD drives to set up a production GlusterFS volume. Our data was migrated over, and we used GlusterFS for a week without any problems. Last Friday, we declared the transition to GlusterFS a success, wiped the four remaining DRBD drives, and added them to the GlusterFS volume.

I started the rebalance process for our GlusterFS volume Friday after school, and it continued to rebalance over the weekend and through Monday. On Monday night, one of the servers crashed. I went over to the school to power cycle the server, and, when it came back up, continued the rebalance.

Disaster!
Tuesday morning, when I checked on the server, I realized that, as a result of the crash, the rebalance wasn’t working the way it should. Files were being removed from the original drives but not being moved to the new drives, so we were losing files all over the place.

After an emergency meeting with the principal (who used to be the school’s sysadmin before becoming principal), we decided to ditch GlusterFS and go back to NFS over ext4 over DRBD. We copied over the files from the GlusterFS partitions, and then filled in the gaps from our backup server. Twenty-four sleepless hours later, the user data was back up and the shared data was up twenty-four sleepless hours after that.
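"Filling in the gaps" is essentially a set difference between two directory trees. The sketch below shows the idea with made-up function names; it is an illustration, not the commands we actually ran that night, and it defaults to a dry run so the list of missing files can be reviewed before anything is copied.

```python
import os
import shutil


def relative_files(root):
    """Return the set of file paths under root, relative to root."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            found.add(os.path.relpath(os.path.join(dirpath, name), root))
    return found


def fill_gaps(restored_root, backup_root, dry_run=True):
    """Copy files that exist in the backup but are missing from the restored tree."""
    missing = sorted(relative_files(backup_root) - relative_files(restored_root))
    if not dry_run:
        for rel in missing:
            target = os.path.join(restored_root, rel)
            os.makedirs(os.path.dirname(target), exist_ok=True)
            # copy2 keeps modification times along with the contents.
            shutil.copy2(os.path.join(backup_root, rel), target)
    return missing
```

Files that exist in both trees are left alone, so anything recovered from the live partitions wins over the older backup copy.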

Lessons learned

  1. Keep good backups. Our backups allowed us to restore almost all of the files that the GlusterFS rebalance had deleted. The only files lost were the ones created on Monday.
  2. Be conservative about what you put into production. I’m really not good at this. I like to try new things and to experiment with new ideas. The problem is that I can sometimes put things into production without enough testing, and this is one result.
  3. Have a fallback plan. In this case, our fallback was to wipe the server and restore all the data from the backup. It didn’t quite come to that as we were able to recover most of the data off of GlusterFS, but we did have a plan if it did.
  4. Avoid GlusterFS. Okay, maybe this isn’t what I should have learned, but I’ve already had one bad experience with GlusterFS a couple of years ago where its performance just wasn’t up to scratch. For software that’s supposedly at a 3.x.x release, it still seems very beta-quality.

The irony of this whole experience is that by switching the server filesystems from btrfs to ext4 with SSD journals, the load on our user data server has dropped to below 1.0. If I’d just made that switch, I could have avoided two days of downtime and a few sleepless nights.

Nuclear explosion credit – Licorne by Pierre J. Used under the CC-BY-NC 2.0 license.

Goodbye GDM (for the moment)

Our school system has been running Fedora on our desktops since early 2008. During that time, our login screen has been managed by GDM and our desktop session has been GNOME. It doesn’t look like our desktop session is going to change any time soon, as we transitioned to GNOME Shell in Fedora 13 and the students and teachers have overwhelmingly preferred it to GNOME 2.

At our school we have a couple of IT policies that affect our login sessions. All lab computers that aren’t logged in have some form of screensaver running (not a black screen) as it helps students identify which computers are on and which aren’t at a glance. It also helps IT see which computers need to be checked. Logged in computers should never have a screensaver running and screen-locking is disabled as we have far more users than computers. Some may argue that these policies should be amended, but, for the moment, they are what they are.

In older versions of Fedora, gnome-screensaver was set to run in gdm with the floating Fedora bubbles coming on after a minute of disuse. The screensaver was inhibited during the login session (I experimented with changing the gconf settings so it didn’t come on for 100 hours and other such nonsense, but inhibiting the screensaver was the only way I found that worked reliably over long periods of time).

With Fedora 16 we now have a much more beautiful new version of GDM, but, unfortunately, the gnome-screensaver that comes with it no longer allows you to actually show a screensaver. I decided to try using xscreensaver instead, but it cannot run in GDM. It keeps complaining that something else is grabbing the keyboard, and I can only assume that something is GDM. Finally, I can’t even write a simple screensaver program in python as it seems I can’t even run a full-screen app over the GDM screen.

Add to all that the fact that we have 1000+ students in the school who are able to log into any lab computer and GDM lists all users who ever logged into the computer. Which theoretically could be 1000. Urgh!

So for our Fedora 16 system, I’ve switched over to lxdm. A quick configuration change to tell it to boot gnome-shell as its default session (and some hacks so it doesn’t try to remember what language the last user used to log in) and it was set. Xscreensaver runs just fine over it and we now have some pretty pictures of Lebanon and the school in a carousel as our login screensaver.

It looks like the screensaver functionality will get merged straight into gnome-shell, and, if it does, we may be able to have extensions that actually implement the screensaver. If that happens, and if GDM re-acquires the ability to not show the user list, we’ll switch back to GDM. Until then, we’ll stick with lxdm.

Now I just need to work out how to inhibit gnome-screensaver during login as gnome-screensaver --inhibit no longer works. I’m sure there was a good reason for removing that code, but for the life of me I can’t work out what it was…