Archive for December, 2005

Comprehensive Exam Results

December 23rd 2005

Well, the Comprehensive Exam for Nuclear Power School was a success.

I finished with a score of 3.62 / 4.00, which is very good considering my average was 3.57 heading into the test. Most people score below their average, not above. ;-)

The Comp is broken up in the grading into individual classes so you can see what areas you were weak in. I did surprisingly poorly in one course that I thought I was awesome at, but then again I did very well in some courses where I wasn’t expecting much. I even got a perfect score on one section of the test, which was nice.

The graduation was yesterday, and it was FREEZING. I even told my wife not to go (she just missed about 2 hours’ worth of names being called and 1 hour of speeches). But it’s nice to have another certificate to take with me.

My friend got a nice plaque for scoring second among all Officer students. First place got the Vice Admiral Behrens award, and was claimed by a Naval Aviator who hadn’t cracked a textbook in quite a few years. He is seriously some kind of genius. :)

As far as the car goes, we managed to get a very nice “Career Starter Loan” through USAA: $25,000 with an interest rate of 2%. Using the money, we were able to pay off our current car and pay for another decent car for my wife (a 2004 Saturn L-300), with only a very small increase in net payments per month.

Unfortunately I still haven’t had a ton of time for KDE. I did fix a bug in the --refresh-build option for kdesvn-build’s /trunk version.

Posted by mpyne under Uncategorized

Comprehensive Exam

December 19th 2005

Well, it’s finally here. I take the Comprehensive Examination for Nuclear Power School tomorrow. Pretty soon I’ll finally be done with this place. :-)

The problem is that after it’s over, I still need to look for a car for my wife and then get my Christmas shopping done in a hurry. She has already bought her presents and has had them wrapped for like a week, but I’ve been too busy to get out and around for the shopping.

Hopefully I’ll be able to make some software releases during the break as well. Abakus and kdesvn-build haven’t gotten much work lately.

Posted by mpyne under Uncategorized

Tale of a computer bug

December 2nd 2005

I finally fixed a bug that has been pestering me for literally months. It is bug 117541. The story of how I finally tracked down the cause may be interesting, and should certainly be a cautionary tale for C++ programmers. Allow me to relate the story. (Less technically inclined readers: I will try to make it understandable, but this is a bit dry.)

I first noticed the bug when I upgraded some packages on my system around May 2005. The upgrades included a new glibc (the GNU C Library, one of the core pieces the system is built around). They also brought a handy feature to detect programs which were using memory incorrectly, and to abort them pre-emptively (since the memory corruption could be caused by an attack). The problem was that when I ran JuK from Konsole (where you can see the debugging output), it would always display a glibc crash message when I shut down JuK. Actually, JuK would always crash, but normally you wouldn’t see the message, and the nice little “Your program crashed” window would never show up for some reason, so I hadn’t noticed before.

I poked around for a bit, but I couldn’t discover the cause. This was when I was working on the improved cover art support for JuK, and one thing that was necessary was to save covers on shutdown. The crash was preventing that from happening. Since I couldn’t find the bug, I worked around it by forcing the covers to save before the point where the shutdown crash seemed to happen.

Since the bug wasn’t exactly serious, and since it seemed to be only affecting me (I was the only one who seemed to get the crash), I just let it go. I didn’t even report it in the KDE Bug database.

A few days ago however, I got a report from a couple of different sources that they were noticing crashes on shutdown. And with the same error message I got! Now that I knew I wasn’t the only one getting the bug, I decided to try again.

First, I realized I needed to bring out the big guns for debugging. I installed the latest valgrind (and disabled the exec-stack protection which somehow found its way onto my system :). Now, valgrind is nice because it is a tool that can track each and every single bit of memory your application uses, and tell you exactly how it is being used incorrectly. Better yet, it will tell you which part of the code the error came from.

I reproduced the bug under valgrind, and received the following output:

JuK valgrind log

So what was happening is that the PlaylistItem::Pointer class, which guards the PlaylistItem, was trying to write to memory it no longer had access to. And then, even worse, JuK was trying to free memory that had already been freed.

But looking at the PlaylistItem::Pointer code, it looked to be perfect. There were certainly no obvious bugs where valgrind was pointing.

So what I did was start to add a debugging harness around PlaylistItem::Pointer. I even went so far as to add a function to JuK’s DCOP interface to allow me to print out the status of the PlaylistItem::Pointer internal data structures at any time.

This helped a little bit. I discovered that a great many PlaylistItems were being pointed to, but didn’t know that they were being pointed to.

The “obvious” thing to do from there was to figure out what was causing the PlaylistItem to reset and think it wasn’t being guarded. But there were only a few lines in all of JuK that could cause such a reset, and they all looked very innocuous. There was PlaylistItem::Pointer, which as I mentioned was damn near perfect. The only other option was the PlaylistItem construction code. But that code is only run once, and runs before the PlaylistItem::Pointer starts to point at it! There should have been no way to get the results I was getting.

I was stumped, even though I could have figured it out here. What I ended up doing was adding even more debugging. I added checks to every function in PlaylistItem::Pointer to make sure the data was still consistent. When the data became inconsistent, the check would force a crash. Eventually I was able to get JuK to crash in the very function call that made the data inconsistent.
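For the curious, the "check everything and crash on purpose" approach looks roughly like the sketch below. This is not JuK's actual code: TrackedItem, ItemGuard, guardCount and checkInvariant() are all names I made up for illustration, and the real PlaylistItem::Pointer bookkeeping is more involved.

```cpp
#include <cassert>
#include <cstdlib>
#include <iostream>

// Hypothetical stand-in for the guarded item: it remembers how many
// guards it believes are pointing at it.
struct TrackedItem {
    int guardCount = 0;
};

// Hypothetical stand-in for a guarding pointer class.
class ItemGuard {
public:
    explicit ItemGuard(TrackedItem *item) : m_item(item) {
        assert(item != nullptr);
        ++m_item->guardCount;
        checkInvariant("ItemGuard()");
    }

    ~ItemGuard() {
        checkInvariant("~ItemGuard()");
        --m_item->guardCount;
    }

    ItemGuard(const ItemGuard &) = delete;
    ItemGuard &operator=(const ItemGuard &) = delete;

    TrackedItem *get() const {
        checkInvariant("get()");
        return m_item;
    }

    // While this guard is alive, the item must know about at least one guard.
    bool consistent() const { return m_item->guardCount > 0; }

private:
    // Called from every member function: crash right where the data
    // is first seen to be inconsistent, so the backtrace is useful.
    void checkInvariant(const char *where) const {
        if (!consistent()) {
            std::cerr << "guard invariant broken in " << where << '\n';
            std::abort();
        }
    }

    TrackedItem *m_item;
};
```

The point of aborting inside the check is that the debugger then stops close to the code that broke the bookkeeping, instead of much later at shutdown.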

What I discovered was surprising: The PlaylistItem constructor really was being run after its PlaylistItem::Pointer was made. Huh?

It made sense when I saw the backtrace of calls leading to the failure though. This had occurred while the PlaylistItem was being created. What happens is that when a PlaylistItem is created, it first has to call some code that it inherits from Qt and KDE. This happens before the PlaylistItem initializes its own data.

The Qt code (in QListViewItem) takes care of adding the newly created item into its parent list on-screen, using a function called insertItem() in the parent. JuK had its own implementation of that function, which was used to wrap the new item in a PlaylistItem::Pointer and then add it to a list of items to process when the results were needed. This way items could be added quickly, even if the user deleted the item in the meantime before the result was needed. But, this was happening from the Qt code. The PlaylistItem constructor hadn’t been run yet.

So what happened is that the new PlaylistItem::Pointer correctly set a flag in the new item to indicate that it was being guarded. Then, once insertItem() and the rest of Qt were done, the PlaylistItem constructor finally ran, and accidentally reset that flag to indicate no guards. Eventually the PlaylistItem was deleted, and since it had no guards, it didn’t indicate to the PlaylistItem::Pointer mechanism that it needed to be cleaned up.
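Here is a stripped-down sketch of that ordering problem, without any Qt. All of the class names (ListView, ListItem, GuardingView, PlaylistEntry) are hypothetical stand-ins, and I have deliberately kept the flag in the base class so the example stays well-defined C++. In JuK the flag lived in the derived class and was reached through a static_cast, which is worse.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class ListItem;

// Hypothetical stand-in for the parent list view.
class ListView {
public:
    virtual ~ListView() = default;
    virtual void insertItem(ListItem *item) { m_items.push_back(item); }
    std::size_t count() const { return m_items.size(); }

private:
    std::vector<ListItem *> m_items;
};

// Stand-in for the toolkit's item class: its constructor registers the
// new item with the parent before any derived constructor has run.
class ListItem {
public:
    explicit ListItem(ListView *parent) { parent->insertItem(this); }
    virtual ~ListItem() = default;

    bool guarded = false;
};

// The application's override marks every inserted item as guarded.
class GuardingView : public ListView {
public:
    void insertItem(ListItem *item) override {
        ListView::insertItem(item);
        item->guarded = true;
    }
};

// The application's item class "initializes" its state in its own
// constructor, which runs last and wipes out the flag set above.
class PlaylistEntry : public ListItem {
public:
    explicit PlaylistEntry(ListView *parent) : ListItem(parent) {
        guarded = false;
    }
};

// Returns whether the guarded flag is still set after construction.
bool flagSurvivesConstruction() {
    GuardingView view;
    PlaylistEntry entry(&view);
    assert(view.count() == 1);  // the item really was inserted
    return entry.guarded;
}
```

In this sketch flagSurvivesConstruction() returns false: the base constructor runs first and triggers insertItem(), and only then does the PlaylistEntry constructor body run and reset the flag.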

On shutdown, JuK’s remaining PlaylistItem::Pointer objects would try to delete the items they were pointing to, but they had already been deleted without being cleaned up. This caused the two error messages detected in valgrind. This same error was detected by glibc, which forced the crash.


The code had to defeat a C++ mechanism meant to stop this kind of error: it used static_cast to force the compiler to treat the QListViewItem as a PlaylistItem. If we had used dynamic_cast instead, which checks whether the conversion is valid during program execution, we would have caught the error much sooner.
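To show why dynamic_cast would have caught it, here is a small sketch, again with made-up names (View, Item, Entry, CheckingView) rather than the real Qt and JuK classes. While the base class constructor is still running, C++ treats the object as being of the base type, so a checked cast to the derived type yields a null pointer instead of silently "succeeding".

```cpp
#include <cassert>

class Item;

// Hypothetical stand-in for the parent list view.
class View {
public:
    virtual ~View() = default;
    virtual void insertItem(Item *item) = 0;
};

// Stand-in for the toolkit's item class; it registers itself with the
// parent from inside its own constructor.
class Item {
public:
    explicit Item(View *parent) { parent->insertItem(this); }
    virtual ~Item() = default;
};

// Stand-in for the application's derived item class.
class Entry : public Item {
public:
    explicit Entry(View *parent) : Item(parent) {}
};

class CheckingView : public View {
public:
    void insertItem(Item *item) override {
        // During Item's constructor the object is not an Entry yet,
        // so the checked cast reports failure with a null pointer.
        sawEntryDuringInsert = (dynamic_cast<Entry *>(item) != nullptr);
    }

    bool sawEntryDuringInsert = false;
};

// Returns what the checked cast reported while the item was being built.
bool castSucceedsDuringConstruction() {
    CheckingView view;
    Entry entry(&view);

    // Once construction has finished, the same cast works as expected.
    Item *asBase = &entry;
    assert(dynamic_cast<Entry *>(asBase) != nullptr);

    return view.sawEntryDuringInsert;
}
```

Here castSucceedsDuringConstruction() returns false. An insertItem() override that checked for the null result could have skipped the guard setup or complained loudly, rather than scribbling on an object that did not fully exist yet.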

So the moral of the story is: if a QListView method you’re overriding says that it takes a QListViewItem*, that may be literally true even if you never create a QListViewItem directly! Be careful, and use dynamic_cast when possible.

Posted by mpyne under Uncategorized