Author Archives: jdieter

About jdieter

I am a teacher and system administrator for the Lebanon Evangelical School in Loueizeh, on the outskirts of Beirut, Lebanon. The views and opinions in this blog are my own and may not necessarily reflect those of the school.

Locks in the classroom – 2017

For the fifth year now, our grade nine students have been doing 3D modeling using Blender. Our students finished up their first assignments over a month ago, but it’s taken this long for me to get the top models together. So, without further delay, here are the top models from each of the three grade nine classes (click on the pictures for Full HD renders).

We start with this nice mix of color and reflection.

Lock by A. Badr – CC BY (Source)

Next is a double-lock combination.

Lock by Amanda S. – CC BY (Source)

And I love the addition of a key in this lock.

Lock by christian07 – CC BY-SA (Source)

Next is the missing link. I really like the mix of pink and black.

Lock by fromdawn02 – CC BY-SA (Source)

Here we have safari-themed locks. Nice choice of textures!

Lock by A. Ayvazian – CC BY-SA (Source)

And here’s a lock in a cave. The cave itself is a proper model, not just a background image.

Lock by A. Zamroud – CC BY-SA (Source)

Here we have the lock on the Pearly Gates. I wish the lock texture was of better quality, but I really appreciate how @david190 has shaped the lock to match the texture, especially on the top.

Lock by @david190 – CC BY-SA (Source)

And here we have a beautiful lock in a velvety box. Very nice!

Lock by Waad H – CC BY-SA (Source)

We can’t have a lock assignment without the obligatory pirate’s treasure chest. I love the attention to detail!

Lock by Sarah B. – CC BY-SA (Source)

And, our final submission includes a monkey. Because… monkeys rock? What a wonderful level of detail in this scene.

Lock by JP Kazzi – CC BY-SA (Source)

Bare-metal Kubernetes

A few years ago, I attended my first Linux conference, DevConf 2014. Many of the speakers talked about containers and how wonderful they were, and my interest was piqued, but I’ve never really had an opportunity to use them.

As the sysadmin for a school, I just don’t have much need for the scalability that containers provide. Our internal web site runs on a single VM, and the short downtimes required for system updates and upgrades are not a problem, especially if I plan them for the weekends. On the flip side, having something that we can use to spin up web services quickly isn’t a bad idea, so, over the last few months, I’ve been experimenting with Kubernetes.

My main goal was to get something running that was easy to set up and Just Works™. Well, my experience setting up Kubernetes was anything but easy, but, now that it’s up, it does seem to just work. My main problem was that I wanted to use my three oVirt nodes (running CentOS 7) as both Kubernetes nodes and masters, which meant the tutorials took a bit of finessing to get working.

I mostly followed this guide for the initial setup, and then this guide, but I did run into a few problems that I’d like to document. The first was that my containers were inaccessible on their cluster IP range, the primary symptom being that the kube-dashboard service couldn’t connect to the kubernetes service. It turned out that I, rather stupidly, forgot to start kube-proxy, which does all the iptables magic to direct traffic to the correct destination.
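
For anyone hitting the same symptom, the check and the fix were roughly what's sketched below. This assumes kube-proxy was installed from the CentOS 7 kubernetes packages, which ship it as a systemd unit; adjust to taste.

# Make sure kube-proxy is actually running on every node.
systemctl enable kube-proxy
systemctl start kube-proxy
systemctl status kube-proxy
# kube-proxy is what maintains the iptables rules for the cluster IP range,
# so once it's running the KUBE-SERVICES chain should be populated:
iptables -t nat -L KUBE-SERVICES -n | head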

The second problem I ran into was that I couldn’t get pretty graphs in kube-dashboard: the heapster service wouldn’t start because I hadn’t set up the cluster DNS service, kube-dns. To be fair, the instructions for doing so are pretty unclear. In the end, I downloaded skydns-rc.yaml.sed and skydns-svc.yaml.sed, and replaced $DNS_DOMAIN and $DNS_SERVER_IP with the values I wanted to use.
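
The substitution itself is nothing fancier than sed. A rough sketch is below; "local" matches the $DNS_DOMAIN I mention further down, while 172.30.0.10 is just an example address inside our cluster IP range, not necessarily the one I actually used.

# Fill in the placeholders in the skydns templates and load them.
sed -e 's/\$DNS_DOMAIN/local/g' -e 's/\$DNS_SERVER_IP/172.30.0.10/g' skydns-rc.yaml.sed > skydns-rc.yaml
sed -e 's/\$DNS_DOMAIN/local/g' -e 's/\$DNS_SERVER_IP/172.30.0.10/g' skydns-svc.yaml.sed > skydns-svc.yaml
kubectl create -f skydns-rc.yaml -f skydns-svc.yaml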

The final problem I ran into was that I’m using our school’s local Certificate Authority for all the certificates we use, and I’ve had to keep adding new subject alternative names to the server cert and regenerating it. At the moment, it’s got the following:
DNS:kubernetes.example.com
DNS:node01.example.com
DNS:node02.example.com
DNS:node03.example.com
DNS:localhost
DNS:localhost.localdomain
DNS:kubernetes.default.svc.local
# Where I replaced $DNS_DOMAIN with “local”
DNS:kubernetes.local
DNS:kubernetes.default
IP Address:127.0.0.1
IP Address:172.30.0.1
# Where our cluster IP range is 172.30.0.0/16

I suspect I could now get rid of some of those hostnames/addresses, and I’m not even sure if this method is best practice, but at least it’s all working.
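
In case it’s useful to anyone, below is a rough sketch of how a SAN list like this can be fed to openssl. The file names and the CA signing step are placeholders for whatever your own CA workflow looks like, not my actual config.

# Hypothetical example only — adjust names and the signing step for your own CA.
cat > apiserver-san.cnf <<'EOF'
[req]
distinguished_name = dn
req_extensions = v3_req
[dn]
[v3_req]
subjectAltName = @alt_names
[alt_names]
DNS.1 = kubernetes.example.com
DNS.2 = node01.example.com
DNS.3 = kubernetes.default.svc.local
DNS.4 = kubernetes.default
IP.1 = 127.0.0.1
IP.2 = 172.30.0.1
EOF
openssl req -new -key apiserver.key -config apiserver-san.cnf -subj '/CN=kubernetes.example.com' -out apiserver.csr
# Then sign apiserver.csr with the local CA, making sure the v3_req
# extensions (the SANs) are copied into the final certificate.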

So I’m at the point now where I need to see if I can set up our MikroTik router as a load balancer, and then see if I can get our web-based marking system, LESSON, moved over to a container with multiple replicas. Hurrah for redundancy!
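
If that works out, the first pass will probably just be a Deployment with a few replicas behind a service. A rough sketch is below; the image name and port are placeholders, and I’m assuming a kubectl that still supports --replicas on kubectl run (which creates a Deployment).

# Hypothetical sketch — image name and port are placeholders.
kubectl run lesson --image=registry.example.com/lesson:latest --replicas=3 --port=80
kubectl expose deployment lesson --port=80 --type=NodePort
kubectl get pods -l run=lesson -o wide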

Broken Wagon Wheel by Kevin Casper is in the public domain / used under a CC0 license

Odds and ends

Picture of the steles at Nahr el Kalb

Since I last posted, there have been a number of small updates, but nothing that seemed big enough to write about. So I figured it might be worth posting a short summary of what I’ve been up to over the last couple of months.

In no particular order:

FOSDEM
I had the opportunity to visit FOSDEM for the first time last month. Saw lots of cool things, met lots of cool people and even managed to bag a LibreOffice hoodie. Most importantly, it was a chance to build friendships, which have a far higher value than any code ever will.

Wireless access points
I should probably write a proper post about this sometime, but a number of years ago we bought about 30 TP-LINK WR741ND wireless APs and slapped a custom build of OpenWRT on them. We installed the last spare a couple of months ago and ran into problems finding a decent replacement (specific hardware revisions can be quite difficult to find in Lebanon). After much searching, we managed to get ahold of a TP-LINK WR1043ND for testing and our OpenWRT build works great on it. Even better, it has a four-port gigabit switch which will give us much better performance than the old 100Mbps ones.

LizardFS patches
I ran into a couple of performance issues that I wrote some patches to fix. One is in the process of being accepted upstream, while the other has been deemed too invasive, given that upstream would like to deal with the problem in a different way. For the moment, I’m using both on the school system, and they’re working great.

Kernel patch (tools count, right?)
After the F26 mass rebuild, I ran into problems building the USB/IP userspace tools with GCC 7. Fixing the bugs was relatively simple, and, since the userspace tools are part of the kernel git repository, I got to submit my first patches to the LKML. The difference between a working kernel patch and a good kernel patch can be compared to the difference between a Volkswagen Beetle and the Starship Enterprise. I really enjoyed the iterative process, and, after four releases, we finally had something good enough to go into the kernel. A huge thank you goes out to Peter Senna, who looked over my code before I posted it and made sure I didn’t completely embarrass myself. (Peter’s just a great guy anyway. If you ever get the chance to buy him a drink, definitely do so.)

Ancient history
As of about three weeks ago, I am teaching history. Long story as to how it happened, but I’m enjoying a few extra hours per week with my students, and history, especially ancient history, is a subject that I love. To top it off, there aren’t many places in the world where you can take your students on a field trip to visit the things you’re studying. On Wednesday, we did a trip to Nahr el-Kalb (the Dog River) where there are stone monuments erected by the ancient Assyrian, Egyptian, and Babylonian kings among others. I love Lebanon.

Scratch group projects – 2017

Scratch

It’s January, so it must be time for this year’s Scratch projects from my grade 10 students. We’re moving on to Python, but I’ve posted their projects at http://scratch.lesbg.com. Feel free to play them and rate them. This is a first attempt for the students, so do please be gentle with the ratings.

One of my personal favorites is Gravity Clash, which is strangely addicting, given how simple it is.

If you want to check out previous years’ projects, they’re also available at the links at the top left. If you have any comments or suggestions for the site itself, please leave them below.

Multiseat systems and the NVIDIA binary driver (update)

fireworks

Last month I wrote about using the NVIDIA binary driver with multiseat systems. There were a number of crazy tweaks that we had to use to make it work, but with some recent updates, the most egregious ones are no longer necessary. Hans de Goede posted about an Xorg update that removes the requirement for a separate Xorg configuration folder for the NVIDIA card, and I’ve created a pull request for negativo17.org’s NVIDIA driver that uses the updated Xorg configs in a way that’s friendly to multiseat systems.

To make it Just Work™, all you should need is xorg-x11-server-1.19.0-3.fc25 from F25 updates-testing, my mesa build (source here, Fedora’s mesa rebuilt with libglvnd enabled), my NVIDIA driver build (source here), and the negativo17.org nvidia repository enabled.
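
If you haven’t used the negativo17.org repository before, enabling it is just a matter of adding their repo file. Double-check their site for the current URL, but it should look something like this:

# Check negativo17.org for the current repo file URL before running this.
dnf config-manager --add-repo=https://negativo17.org/repos/fedora-nvidia.repo
dnf repolist | grep -i nvidia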

With the above packages, Xorg should use the correct driver automagically with whatever video card you have.

Multiseat systems and the NVIDIA binary driver

Building mesa

Ever since our school switched to Fedora on the desktop, I’ve either used the onboard Intel graphics or AMD Radeon cards, since both are supported out of the box in Fedora. With our multiseat systems, we now need three external video cards on top of the onboard graphics on each system, so we’ve bought a large number of Radeon cards over the last few years.

Unfortunately, our local supplier has greatly reduced the number of AMD cards that they stock. In their latest price lists, they have a grand total of two Radeon cards in our price range, and one of them is almost seven years old!

This has led me to take a second look at NVIDIA cards, and I’m slowly coming back around to the concept of buying them and maybe even using their binary drivers. Our needs have changed since we first started using Linux, and NVIDIA’s binary driver does offer some unique benefits.

As we’ve started teaching 3D modeling using Blender, render time has become a real bottleneck for some of our students. We allow students to use the computers before and after school, but some of them don’t have much flexibility in their transportation and need to get their rendering done during the school breaks. Having two or three students all trying to render at the same time on a single multiseat system can lead to a sluggish system and very slow rendering. The easiest way to fix this is to do the rendering on the GPU, which Blender does support, but only with NVIDIA’s binary driver.

So about a month ago, I ordered a cheap NVIDIA card for testing purposes. I swapped it with an AMD card on one of our multiseat systems and powered it up. Fedora recognized the card using the open-source nouveau driver and everything just worked. Beautiful!

Then, a few hours later, I noticed the system had frozen. I rebooted it, and, after a few hours, it had frozen again. I moved the NVIDIA card into a different system, and, after a few hours, it froze while the original system just kept running.

Some research showed that the nouveau driver sometimes has issues with multiple video cards on the same system. There was some talk about extracting the binary driver’s firmware and using it in nouveau, but I decided to see if I could get the binary driver working without breaking our other Intel and AMD seats.

The first thing I did was upgrade the test system to Fedora 25, in hopes of taking advantage of the work done to make mesa and the NVIDIA binary driver coexist. I then installed the binary NVIDIA drivers from this repository (mainly because his version of Blender already has the CUDA kernels compiled in). The NVIDIA seat came up just fine, but I quickly found that mesa in Fedora 25 isn’t built with libglvnd enabled (libglvnd is a shim that dispatches to either the mesa or the NVIDIA OpenGL implementation, depending on which card an application is running on), so none of the seats using the open drivers came up. But even when libglvnd was enabled, I ran into this bug, so I ended up extending this patch so that it would also work with Gallium drivers and applying it.

This took me several steps closer, but apparently the X11 GLX module is not part of libglvnd, and NVIDIA sets the Files section in xorg.conf to use its own GLX module (which, oddly enough, doesn’t work with the open drivers). I finally worked around this with the ugly hack of creating two different xorg.conf.d directories and telling lightdm to use the NVIDIA one when loading the NVIDIA seat.
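
To give a feel for the hack, the lightdm side looks something like the sketch below. The seat name and directory path are placeholders rather than my actual configuration (which is linked further down).

# Hypothetical sketch — seat name and paths are placeholders.
cat > /etc/lightdm/lightdm.conf.d/50-nvidia-seat.conf <<'EOF'
[Seat:seat-1]
xserver-command=X -configdir /etc/X11/xorg-nvidia.conf.d
EOF
# The other seats keep the default X server command and pick up the normal
# /etc/X11/xorg.conf.d directory with the open-driver configs.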

Voilà! We now have a multiseat system with one Intel built-in card using the mesa driver, two AMD cards using the mesa Gallium driver, and one NVIDIA card using the NVIDIA binary driver. And it only cost me eight hours and my sanity.

So what needs to happen to make this Just Work™? Either libglvnd needs to also include the X11 GLX module, or we need a different shim to accomplish the same thing. And Fedora needs to build mesa with libglvnd enabled (but not until this bug is fixed!).

My mesa build is here and the source rpm is here. There is a manual “Provides: libGL.so.1()(64bit)” in there that isn’t technically correct, but I really didn’t want to recompile negativo17’s libglvnd to add it in, and my mesa build requires that libglvnd implementation.

My xorg configs are here and my lightdm configuration is here. Please note that the xorg configs have my specific PCI paths; yours may differ.

And I do plan to write a script to automate the xorg and lightdm configs. I’ll update this post when I’ve done so.

Sidenote: As I was looking through my old posts to see if I had anything on NVIDIA, I came across a comment by Seth Vidal. He was an excellent example of what the Fedora community is all about, and I really miss him.

Update: Configuration has become much simpler. An updated post is here.

From NFS to LizardFS

If you’ve been following me for a while, you’ll know that we started our data servers out using NFS on ext4 mirrored over DRBD, hit some load problems, switched to btrfs, hit load problems again, tried a hacky workaround, ran into problems, dropped DRBD for glusterfs, had a major disaster, switched back to NFS on ext4 mirrored over DRBD, hit more load problems, and finally dropped DRBD for ZFS.

As of March 2016, our network looked something like this:

Old server layout

Our NFS over ZFS system worked great for three years, especially after we added SSD cache and log devices to our ZFS pools, but we were starting to overload our ZFS servers and I realized that we didn’t really have any way of scaling up.

This pushed me to investigate distributed filesystems yet again. As I mentioned here, distributed filesystems have been a holy grail for me, but I never found one that would work for us. Our problem is that our home directories (including config directories) are stored on our data servers, and there might be over one hundred users logged in simultaneously. Linux desktops tend to do a lot of small reads and writes to the config directories, and any latency bottlenecks tend to cascade. This leads to an unresponsive network, which then leads to students acting out the Old Testament practice of stoning the computer. GlusterFS was too slow (and almost lost all our data), CephFS still seems too experimental (especially for the features I want), and there didn’t seem to be any other reasonable alternatives… until I looked at LizardFS.

LizardFS (a completely open source fork of MooseFS) is a distributed filesystem that has one fascinating twist: All the metadata is stored in RAM. It gets written out to the hard drive regularly, but all of the metadata must fit into the RAM. The main result is that metadata lookups are rocket-fast. Add to that the ability to direct different paths (say, perhaps, config directories) to different storage types (say, perhaps, SSDs), and you have a filesystem that is scalable and fast.
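
Roughly, that path-to-storage mapping is done with labelled chunkservers and custom goals. The labels, goal names, and paths below are made up for illustration, so check the LizardFS documentation for the real syntax before copying anything.

# Hypothetical sketch — labels, goal names, and paths are examples only.
# 1. Label each SSD-backed chunkserver in its mfschunkserver.cfg:
#      LABEL = ssd
# 2. Define a goal on the master that only uses ssd-labelled servers (mfsgoals.cfg):
#      11 fast_ssd: ssd ssd
# 3. Pin the config directories to that goal from a client mount:
lizardfs setgoal -r fast_ssd /mnt/lizardfs/home/student01/.config
lizardfs getgoal /mnt/lizardfs/home/student01/.config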

LizardFS does have its drawbacks. You can run hot backups of your metadata servers, but only one will ever be the active master at any one time. If it goes down, you have to manually switch one of the replicas into master mode. LizardFS also has a very complicated upgrade procedure: first the metadata replicas must be upgraded, then the master, and finally the clients. And finally, there are some corner cases where replication is not as robust as I would like it to be, but they seem to be well understood and really only seem to affect very new blocks.
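
In practice the upgrade order looks roughly like the sketch below. The package and service names are what the upstream LizardFS RPMs use as far as I know, so treat this as an illustration rather than instructions.

# Rough illustration of the order only — package/service names are assumptions.
# 1. On each shadow master / metadata replica:
yum upgrade lizardfs-master && systemctl restart lizardfs-master
# 2. Then on the active master (expect a brief metadata outage while it restarts):
yum upgrade lizardfs-master && systemctl restart lizardfs-master
# 3. Finally on the chunkservers and clients once the masters are done:
yum upgrade lizardfs-chunkserver lizardfs-client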

So, given the potential benefits and drawbacks, we decided to run some tests. The results were instant… and impressive. A single user’s login time on a server with no load… doubled. Instead of five seconds, it took ten for them to log in. Not good. But when a whole class logged in simultaneously, it took only 15 seconds for them to all log in, down from three to five minutes. We decided that a massive speed gain in the multiple user scenario was well worth the speed sacrifice in the single-user scenario.

Another bonus is that we’ve gone from two separate data servers with two completely different filesystems (only one of which ever had high load) to five data servers sharing the load while serving out one massive filesystem, giving us a system that now looks like this:

New server layout

So, six months on, LizardFS has served us well, and will hopefully continue to serve us for the next (few? many?) years. The main downside is that Fedora doesn’t have LizardFS in its repositories, but I’m thinking about cleaning up my spec and putting in a review request.

Updated to add graphics of old and new server layouts, info about Fedora packaging status, LizardFS bug links, and remove some grammatical errors

Updated 12 April 2017 I’ve just packaged up LizardFS to follow Fedora’s guidelines and the review request is here.