Digital Audio Extraction from Emacs

Triple Threat
SATA Multilane Connector

So my CD ripping situation is that I put a CD into the CD reading thing there (more about that in a thrilling later blog article), hit a key in Emacs, slap the CD cover onto the scanner, hit another key in Emacs to say what the format is (usually RET), inspect the CDDB data that Emacs presents me with, hit C-c C-c, and then repeat.  Since I have three CD players, I can do three CDs in parallel.  The time to process one CD is about two minutes, but with the parallelism going on, I can usually process about twenty CDs in ten minutes.  (Unless the CDDB info is missing and I have to type stuff in.)

I started on this journey in 1997, when mp3s first became viable.  So over the years, I’ve ripped CDs as I bought them.  The earliest mp3-encoded albums started sounding pretty crappy to me, since the early mp3 encoders were pretty crappy.  By 2007, disks had gotten so cheap that it was viable to store the music in a lossless format, so I decided to re-rip them all and store the music in flac.

Now, I have around 4K CDs.  Ripping them all in the traditional, sequential way would just take too long.  I don’t really deal well with repetitive, boring, manual tasks.  And since I had ripped all these CDs before, all the data was already in freedb, so it would be a totally mindless manual job.

So I bought the Addonics cabinet seen above, which has a SATA multilane connector, and put three DVD readers into it.  The only problem in getting it to work reliably was that since all the readers were identical (I mean totally), and the SATA cabinet would bring them up in random order, the poor Linux udev system would name them /dev/scdX at random.  So I would never know how to address the top one until after trying 0, 1, 2.

Until I came up with the brilliant idea of uploading a different version of the available Optiarc firmware onto each DVD reader.  They worked just as well with any firmware, but the differing firmware versions allowed me to create udev rules to tell them apart.

scsi 7:0:0:0: CD-ROM            Optiarc  DVD RW AD-7170S  1.02 PQ: 0 ANSI: 5
scsi 5:0:0:0: CD-ROM            Optiarc  DVD RW AD-7170S  1.00 PQ: 0 ANSI: 5
scsi 6:0:0:0: CD-ROM            Optiarc  DVD RW AD-7170S  1.03 PQ: 0 ANSI: 5
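
The same trick can be sketched outside udev.  This is a minimal Python sketch, not the actual udev rules: it assumes the drives show up as sr* under /sys/block and that the kernel exposes each drive's firmware revision in the device/rev sysfs file.  The mapping from revision to slot name is made up for illustration; the source only gives the three revision numbers.

```python
from pathlib import Path

# Hypothetical mapping from firmware revision to a stable drive name.
# The revisions are the ones from the kernel log; which revision sits in
# which slot is an assumption made for this sketch.
REV_TO_NAME = {"1.00": "top", "1.02": "middle", "1.03": "bottom"}


def drives_by_firmware(sysfs_block="/sys/block"):
    """Return {stable name: device node} for the sr* drives found in sysfs."""
    found = {}
    for dev in sorted(Path(sysfs_block).glob("sr*")):
        rev_file = dev / "device" / "rev"
        if not rev_file.is_file():
            continue
        # The revision file holds a short string such as "1.02".
        rev = rev_file.read_text().strip()
        name = REV_TO_NAME.get(rev)
        if name is not None:
            found[name] = "/dev/" + dev.name
    return found
```

Calling drives_by_firmware() would then return something like a dictionary from "top" to whichever /dev/srN happened to get that drive on this boot.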

With the 3-way parallel ripping setup I think I did all 4K over four nights, if I remember correctly.  While listening to music very loudly, and watching some tv series on DVD (Smallville?), and being totally shit-faced drunk.

Ah, fun times.  At least I think it was fun.  I can’t really remember, for some reason or other.

Anyway, here’s the source code for the Emacs parallel DAE interface.

Useful Consumer Review

I got a new phone today.  The Nokia E7.  And look!  It’s perfect!  It runs Gnus under ssh! Look how pretty Gnus is on the phone!

(The only thing that would have been perfecter would be if it actually ran Emacs on the phone itself, but I guess that’ll have to wait until somebody produces a useful MeeGo phone.)

I think I’ve got the appliance review thing down now.  This blog will be the next Engadget.

Scanning Record Sleeves

A CD Rippin’ Cupboard with an A3 Scanner

In the continuing story of bits and pieces related to my music playing Emacs@Home installation, here’s the sleeve scanning function.  It’s basically just a tiny database of common CD/LP/tape sleeve sizes.  There are a lot of sizes, unfortunately.
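
To give a rough idea of what such a size database amounts to, here is a small Python sketch.  It is not the actual Emacs function: the format names, the millimetre values and the 300 dpi default are all assumptions picked for illustration.

```python
# Illustrative sleeve sizes in millimetres (width, height).  These numbers
# are rough assumptions for the sketch, not the values the real function uses.
SLEEVE_SIZES_MM = {
    "cd": (120, 120),
    "lp": (315, 315),
    "7inch": (180, 180),
    "tape": (101, 65),
}

MM_PER_INCH = 25.4


def mm_to_px(mm, dpi):
    """Convert a length in millimetres to whole pixels at the given dpi."""
    return round(mm / MM_PER_INCH * dpi)


def scan_area(fmt, dpi=300):
    """Return (width_px, height_px) to crop for a sleeve format."""
    width_mm, height_mm = SLEEVE_SIZES_MM[fmt]
    return mm_to_px(width_mm, dpi), mm_to_px(height_mm, dpi)
```

A fixed table like this is exactly why a sleeve that is slightly larger than its nominal size gets cropped: the crop area depends only on the format name, never on the scan itself.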

But what I really wanted to have was something that could detect the image area automatically.  Why doesn’t that exist?  I mean, I couldn’t find it when I googled for it half a decade ago, so it can’t possibly exist now.

It should be pretty easy to detect the image area, you’d think.  Record sleeves are usually kinda square.  So you could use…  rectangle detection…  to find the image. On the other hand, I have CD sleeves that aren’t rectangular.  And I have sleeves that have a square border, and then blackness outside the border, so just detecting the square might over-crop stuff.

I thought about using green screen techniques.  If I painted the inside cover of the scanner cover a particular green colour, then I could probably whip up a technique to…  do something.  But I fear that there’d be colour leakage, with the reflected green light giving off a green tinge to paper sleeves that aren’t very thick.
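
For what it's worth, the "do something" could be as simple as cropping to everything that doesn't look like the lid.  This Python sketch assumes the scan is available as rows of (r, g, b) tuples and that the lid is a known, uniform colour; the function names and the tolerance value are invented here.  It does nothing about the colour leakage problem, and a single speck of dust on the glass would widen the box.

```python
def is_background(pixel, background, tolerance):
    """True if every channel is within tolerance of the lid colour."""
    return all(abs(c - b) <= tolerance for c, b in zip(pixel, background))


def content_bbox(pixels, background=(0, 255, 0), tolerance=40):
    """Return (left, top, right, bottom) of the non-background area.

    pixels is a list of rows of (r, g, b) tuples.  right and bottom are
    exclusive.  Returns None if the whole scan looks like background.
    """
    height = len(pixels)
    width = len(pixels[0]) if pixels else 0
    rows = [y for y in range(height)
            if any(not is_background(p, background, tolerance)
                   for p in pixels[y])]
    if not rows:
        return None
    cols = [x for x in range(width)
            if any(not is_background(pixels[y][x], background, tolerance)
                   for y in range(height))]
    return cols[0], rows[0], cols[-1] + 1, rows[-1] + 1
```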

So, I don’t know.  The result is that sleeves that are half a millimeter larger than the standard sizes I have tend to be slightly over-cropped.  It’s annoying, but not annoying enough that I ever bother to re-scan the offending sleeves.  And hand-editing a scan — you know, in Gimp or something — is so ridiculous that I have to laugh.  Just see:  “Ha ha.”

The ignominy of it all.

Editing Sound Files in Emacs

Emacs PCM Editing

I buy quite a lot of vinyl still.  And the hipsterish hipsters have started releasing things on tape, since vinyl is obviously too mainstream.  (I’m wondering when 78s will be making a comeback.)  So to listen to this music I need to sample it and then convert it to flac.

That’s trivial enough since I have a very nice DA, but the main issue is editing the music after sampling it.  I have, perhaps needless to say at this point, written an Emacs mode to do this.

As you can see, the mode is pretty self-explanatory.  It shows the wave forms, and you can zoom and set break points, and split the file up into pieces.  (Then name the files after querying freedb, possibly.)

The mode consists of one Emacs Lisp file and two C programs.  The first, summarize, goes through the PCM file and outputs the “energy level” in each section.  The second, bsplit, is just a fast file splitter.  Oh, and there’s a patch to aplay to allow --seeking to an arbitrary place so that you can skip around in the file and start playing sound at point.  (I see that I’ve forgotten to submit the patch to the ALSA people, so I did that just now.)
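
The real summarize is a C program, and its exact definition of “energy level” isn't spelled out here.  As a rough Python sketch of the idea, this computes one RMS value per block, assuming raw 16-bit signed little-endian PCM; the block size and channel count are assumptions, too.

```python
import array
import math
import sys


def summarize(pcm_bytes, block_frames=4410, channels=2):
    """Return one RMS "energy level" per block of 16-bit PCM samples."""
    samples = array.array("h")
    # Drop a trailing odd byte, if any, so the data is whole samples.
    usable = len(pcm_bytes) - len(pcm_bytes) % 2
    samples.frombytes(pcm_bytes[:usable])
    if sys.byteorder == "big":
        samples.byteswap()  # the data is assumed to be little-endian
    block = block_frames * channels
    levels = []
    for start in range(0, len(samples), block):
        chunk = samples[start:start + block]
        levels.append(math.sqrt(sum(s * s for s in chunk) / len(chunk)))
    return levels
```

With 44.1 kHz stereo data, the default block of 4410 frames gives ten levels per second.  This version reads the whole file into memory, which a C implementation streaming through the file would avoid.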

The interesting bit about wave.el is that it provides auto-splitting capabilities.  Or at least, it tries to.  It originally had a command for trying to put splitting marks at all points where the sound was “silent” for more than four seconds.  This worked somewhat OK, but what’s “silent” varies from sound source to sound source.  Some tapes are quite hissy.  And when I got a new record player, the background noise level dropped to almost nothing, so calibrating “silence” is boring.
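
The silence-based approach can be sketched like this in Python.  It is not the wave.el code: it assumes a list of per-block energy levels, and the threshold is exactly the knob that needs recalibrating for each sound source.

```python
def silent_gaps(levels, threshold, min_blocks):
    """Return (start, end) block index pairs for each long quiet run.

    A run counts when at least min_blocks consecutive levels are below
    threshold.  end is exclusive.
    """
    gaps = []
    run_start = None
    # The infinite sentinel closes a quiet run that reaches the end.
    for i, level in enumerate(list(levels) + [float("inf")]):
        if level < threshold:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_blocks:
                gaps.append((run_start, i))
            run_start = None
    return gaps


def split_marks(levels, threshold, min_blocks):
    """Put one mark in the middle of every sufficiently long quiet run."""
    return [(start + end) // 2
            for start, end in silent_gaps(levels, threshold, min_blocks)]
```

If each block covers a tenth of a second, four seconds of silence corresponds to min_blocks=40.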

Then I thought of a new approach: I know how many tracks there are in each file.  Say it’s a record album side with five songs.  Then I could tell wave.el “this is five tracks, try to find a likely partition”.  It should snip away the initial “stylus hits the album” bump, and trim away the silence at the end, but otherwise put four sectional marks in the sound where it separates the tracks optimally.

However, I just wasn’t able to implement that in a satisfactory way.  The current wave.el isn’t really usable in automatic mode.

When you look at the sound files visually, it’s pretty obvious to a human bean where the tracks are, usually.  But I’m just not able to figure out a nice algorithm.  (I mean, I haven’t really tried a lot.  I think I spent a day on it last summer, if I remember correctly.)
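
One naive starting point, sketched in Python: smooth the energy levels a bit, then greedily pick the quietest spots that are far enough apart until there is one fewer than the number of tracks.  This is just one possible heuristic, not anything wave.el does.  The window and minimum track length are made-up parameters, and it ignores the stylus bump and the trailing silence apart from skipping positions close to either end.

```python
def smooth(levels, window):
    """Centred moving average; the window shrinks at the edges."""
    half = window // 2
    out = []
    for i in range(len(levels)):
        part = levels[max(0, i - half):i + half + 1]
        out.append(sum(part) / len(part))
    return out


def guess_splits(levels, n_tracks, window=5, min_track_blocks=10):
    """Guess n_tracks - 1 split points as the quietest well-separated blocks."""
    if n_tracks < 2:
        return []
    smoothed = smooth(levels, window)
    # Candidates, quietest first, skipping positions too close to the ends.
    candidates = sorted(
        range(min_track_blocks, len(smoothed) - min_track_blocks),
        key=lambda i: smoothed[i],
    )
    chosen = []
    for i in candidates:
        # Only accept a spot that leaves room for a track on both sides.
        if all(abs(i - c) >= min_track_blocks for c in chosen):
            chosen.append(i)
            if len(chosen) == n_tracks - 1:
                break
    return sorted(chosen)
```

The obvious weakness is a quiet passage in the middle of a song that is quieter than the hissy gap between two tracks, which is presumably where a human looking at the wave form still does better.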

If anybody has any ideas, I’m all ears.

Greylisting Considered Annoying

Nobody likes spam.  So to avoid spam they either inflict pain on others, like with challenge/response systems that send endless challenges to me since “I” have sent them spam (From headers are so hard to fake? (I know this guy who automatically responds to all challenge/response systems (evil, but understandable))), or they use “greylisting”, which is harmless, supposedly.

It just means that mail takes a bit longer to deliver, right? The first time you try (on a unique from/to pair), your MTA gets told that it has to wait for a while.
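
In code, the mechanism is roughly the following.  This Python sketch is a toy model, not any particular MTA's implementation: it keys on the from/to pair as described above (real greylisters often include the client's IP address as well), and the class name, reply strings and fifteen-minute default delay are assumptions.

```python
import time


class Greylist:
    """Toy greylist keyed on the (sender, recipient) pair."""

    def __init__(self, delay_seconds=15 * 60, clock=time.time):
        self.delay_seconds = delay_seconds
        self.clock = clock
        self.first_seen = {}

    def check(self, sender, recipient):
        """Return an SMTP-style reply for a delivery attempt."""
        key = (sender.lower(), recipient.lower())
        now = self.clock()
        # Remember when this pair was first seen; later attempts reuse it.
        first = self.first_seen.setdefault(key, now)
        if now - first < self.delay_seconds:
            return "451 try again later"  # temporary failure, sender retries
        return "250 OK"
```

The first attempt for a pair always gets the temporary failure, and mail only goes through once the sending MTA retries after the delay has passed.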

So when I do a Gmane subscription handling session, I first fire off a bunch of subscription requests.  Then, since so many list admins use greylisting now, I have to wait for fifteen minutes to complete the process.  Meanwhile, I’ve gone on to do other things, or I’ve left for a holiday in a different country, so the process stops in the middle, and the person who requested the list gets all sad and stuff.

See what you’re doing, greylisters?  You’re making Gmane users sad!  For shame!

Emacs Can Haz Brainz?

Adam mentioned MusicBrainz in the comments of the last article.  I took that as a challenge, of course.

I only implemented the query bits, though.  I’m selfish.

(Oh, OK, the only reason I didn’t do the submission part, too, is that I can’t make up my mind whether cddb.el and musicbrainz.el should share the same editing mode or not.  I think perhaps.)

[Update: That felt like a cop-out, so I’ve started implementing MusicBrainz submitting.  I needed some way to get a MusicBrainz-compatible CD Table-Of-Contents listing, so I hacked up cd-discid to do that.]
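
For reference, here is a Python sketch of what a MusicBrainz-style TOC listing and disc ID look like.  It follows the published MusicBrainz disc ID calculation as I understand it (first track, last track, lead-out, then track offsets in sectors, hashed with SHA-1 and encoded in a tweaked base64).  It is not the cd-discid hack itself, and the function names are invented.

```python
import base64
import hashlib


def musicbrainz_toc(first_track, last_track, leadout, offsets):
    """Return the TOC as 'first last leadout offset1 ... offsetN'."""
    numbers = [first_track, last_track, leadout] + list(offsets)
    return " ".join(str(n) for n in numbers)


def musicbrainz_discid(first_track, last_track, leadout, offsets):
    """Return the 28-character disc ID for a TOC (offsets in sectors)."""
    # Slot 0 holds the lead-out, slots 1..99 the tracks; unused slots are 0.
    slots = [0] * 100
    slots[0] = leadout
    for number, offset in zip(range(first_track, last_track + 1), offsets):
        slots[number] = offset
    text = "%02X%02X" % (first_track, last_track)
    text += "".join("%08X" % slot for slot in slots)
    digest = hashlib.sha1(text.encode("ascii")).digest()
    encoded = base64.b64encode(digest).decode("ascii")
    # MusicBrainz swaps the characters that are awkward in URLs.
    return encoded.replace("+", ".").replace("/", "_").replace("=", "-")
```

For a six-track disc, musicbrainz_toc(1, 6, 95462, [150, 15363, 32314, 46592, 63414, 80489]) gives the string "1 6 95462 150 15363 32314 46592 63414 80489".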