Technical Analysis

I thought I could use my prodigious financial know-how to do an analysis of the Emacs Open Bugs chart.

First of all, we have clearly defined positive trend channels reaching back to 2009, broken by a period of recession.

But lately, this growth has been curbed and we’ve seen a clear development of a resistance line at 4100. It looked like we’d broken out of it in 2018, but then came a reversal, and judging by this clear resistance, we’ll soon see a bounce-back.

Zooming even closer into the last couple of years, we see the emergence of a linear Fibonacci reversal trend coupled with a reverse Head & Shoulders® pattern, complete with magnificent bubbles appearing and not the least hint of dandruff sloughing off of the chart. There’s also a 10% (450 bugs) reduction due to me procrastinating a lot over the past month and a half and going through old bug reports, but now there’s a two-week vacation from the vacation coming up.

This can only mean one thing, and I don’t have to spell it out for you, I think: Sell, sell, S-E-L SELL!

Your Emacs Statistics Service

I’ve been plugging away at the Emacs bug database the last month or so. I’m the kind of person who gets obsessed with something for a time and then I do something completely different the next month. So I’m away from Emacs developments for months on end (and this time it’s basically been a couple of years Due To Circumstances), but when I’m back and hacking away, I’m always wondering:

Are things getting better or are things getting worse?

Things are the same! The chart showing the number of open Emacs bugs has been remarkably flat over the last three years; three cheers for the Emacs maintainers and all their helpers!

But is this because there are fewer bugs reported?

Hm… No, that red line (opened bugs) is almost a straight line, so it’s about fixing things and closing bug reports at a steady pace. (Interactive charts here.)

So let’s look at the number of commits and contributors:

Surprisingly enough the number of commits per month hasn’t been very large; about 300 per month. (The red line is when Emacs moved to git from bzr.) Or perhaps that’s not surprising; the fewer commits, the fewer new bugs?

In a longer perspective:

And here are the equivalent charts for the number of contributors:

Pretty steady at about 35 per month. (The recent spike is just me spelunking through the bugs database and applying a bunch of older patches that were submitted and not applied at the time they were submitted.) Or as the Emacs bug tracker graphs put it:

And in other Emacs-related news, I thought it was really amusing to see The New York Times using Emacs as an example when educating people about commit messages, and it was doubly amusing that the author used a commit message of mine when doing so.

So now I can say that I’m a New York Times-published author.

If it wasn’t for the small detail that that would be grossly misleading.

In conclusion:

Here’s a picture of some potatoes.

Towards a Cleaner Emacs Build

I’m planning on getting back into Emacs development after being mostly absent for a couple of years. One thing that’s long annoyed me when tinkering with the Lisp bits of Emacs is the huge number of compilation warnings. The C parts of Emacs were fixed up at least a decade ago, but this is what compiling the Lisp directory looks like:

For me, this has been a source of having to go slower when coding: I make a change, look at the output from the compilation window, and then do a double take when I see some warning about something I didn’t think I had touched.

And then it turns out to be an old warning about something completely different.

The number of warnings in an Emacs build has been fluctuating, but sort of growing. There were 440 Warning: lines output by the build process, totalling 1800 lines on stderr. That’s kinda a lot to look at when doing a build.

There’s a number of reasons that the Emacs build looks like this, but the most important is perhaps the somewhat unique way the Emacs Lisp code has traditionally been developed: Many of the major modules have been maintained out-of-tree, and often support a huge number of Emacs versions dating back to the 1980s. Not to mention XEmacs.

This leads to there being conditional calls to code that doesn’t exist in modern Emacs, and code that doesn’t use new calling conventions.

The other is that, well, Emacs has a long history, but the Emacs Lisp language is evolving constantly, what with lexical binding and all. What was good code in 1993 now uses outmoded idioms, and these idioms trigger compilation warnings.

The development situation has changed somewhat over the last few years: Now most of the code in the Emacs tree is developed in the Emacs git repository, and the external packages are instead distributed using the Emacs package system. So the half-external/half-internal development isn’t as big an issue any more (although there are still (very) significant packages developed this way, like CC mode).

And XEmacs compatibility isn’t a major issue any more for many people.

So I thought now was the time to roll up my t-shirt sleeves and get stuck in to the code and get organisised.

90% of the warnings took 10% of the time: They were easy syntactic changes to bring code up to date with the new Emacs Lisp standards. The next 9% took 90% of the time. And then the last 1% took another 90% of the time.

So there were a lot of questions asked and some new tests implemented to ensure that the changes didn’t break anything.

And a lot of questions answered by all the smart people on emacs-devel.

But now it’s over! That is, there’s one single Warning: left, and that’s being pondered.

The total output from a “make bootstrap” is down from 5200 lines to 2900 lines (on this machine; it may vary somewhat), which is a reduction of about 44%. Looking at Emacs compiling now is a calmer experience.

I also added some new progress messaging in parts where a single section takes so long that it looks like it’s crashed or something, so it’s not purely a “get rid of lines” project.

Virtually all of the warnings fixed were valid warnings (i.e., they were about things we’d rather not see in the Emacs Lisp code), but some warnings were false positives. For instance, Emacs has a method to mark functions as obsolete, and then you get a warning if you load that code, which is a nice way of letting users know that something is going to disappear in a few years. (And Emacs has a very conservative removal policy; obsolete functions are kept around for like a decade.)

But functions are sometimes obsoleted in groups, so you may have one obsolete function calling another obsolete function in that group, and that will issue a compilation warning… and it shouldn’t.

So we’ve introduced a new macro

(with-suppressed-warnings ((obsolete an-old-function))
  (an-old-function :foo :bar))

to let the byte compiler know that we know about this, so that it doesn’t issue any warnings about it. (And there are similar things about the number of arguments to the functions.)

I had to use the macro in about a dozen places, which isn’t a lot, percentage-wise.
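To make the group case concrete, here is a minimal sketch, assuming an Emacs recent enough to have `with-suppressed-warnings` (Emacs 27 or later). Every function name in it (`my-new-helper`, `my-old-helper`, `my-old-entry-point`) is made up for illustration and doesn’t exist in Emacs.

```elisp
;;; -*- lexical-binding: t -*-
;; Hypothetical names throughout; none of these functions exist in Emacs.

(defun my-new-helper (x)
  "Return X doubled."
  (* 2 x))

(defun my-old-helper (x)
  "Obsolete spelling of `my-new-helper'."
  (my-new-helper x))
(make-obsolete 'my-old-helper 'my-new-helper "27.1")

(defun my-old-entry-point (x)
  "Obsolete function that still calls its obsolete sibling."
  ;; The wrapper tells the byte compiler that this call to the
  ;; obsolete `my-old-helper' is deliberate, so it shouldn't warn.
  (with-suppressed-warnings ((obsolete my-old-helper))
    (my-old-helper x)))
(make-obsolete 'my-old-entry-point 'my-new-helper "27.1")

(my-old-entry-point 21) ; => 42
```

The suppression is scoped to the wrapped form and to the named function only, so other obsolete calls in the same file are still reported.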

I hope the somewhat less daunting compilation output will help developers, old and new, to get more stuff done. At least a little bit.

Ununicode

I’ve been messing around to see whether running a WordPress installation is fun or not (spoilers: it’s really not), and all of a sudden my test blog articles had turned a strange shade of non-UTF-8.

For instance, some texts I had quoted used that strange apostrophe in “it’s”, and that had turned into “it’s”.

Now, that sequence of characters (which are Unicode code points 0xE2, 0x20AC and 0x2122) bears no resemblance to the code point for ’, which is 0x2019. But the UTF-8 for ’ is #xE2 #x80 #x99, and that’s the clue: In Windows Code Page 1252, the Euro sign is #x80 and the TM sign is #x99, so what I had on my hands was UTF-8 interpreted as CP1252, and then output as UTF-8 by WordPress.

*phew*

I wondered whether any series of calls to `{en,de}code-coding-region’ coupled with `string-{as,to}-unibyte’ would possibly allow me to un-destroy the text, but that made my head hurt, so I wrote undecodify.el and put it on Microsoft Github.

(undecodify "it’s") => "it’s"
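For the simple case of a single round of damage, the reversal can be sketched with built-in functions alone. This is not how undecodify.el is implemented, just an illustration of the idea; `my-unmojibake` is a made-up name, and the sketch assumes every character in the input can be encoded in windows-1252.

```elisp
;; Sketch only: a hypothetical helper, not the undecodify.el code.
(defun my-unmojibake (string)
  "Undo one round of UTF-8 text misread as windows-1252 in STRING."
  (decode-coding-string
   ;; Turn the characters back into the bytes they were misread from...
   (encode-coding-string string 'windows-1252)
   ;; ...and then read those bytes as the UTF-8 they were all along.
   'utf-8))

(my-unmojibake "it’s") ; => "it’s"
```

Here â, € and ™ encode to the bytes #xE2 #x80 #x99, which decode as UTF-8 to the single character ’.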

It’s trivial, but at least that fixed the blog articles.

Now I just have to wait for the next thing to go wrong with WordPress…