Editing Sound Files in Emacs

larsmagne23

15 years ago

Emacs PCM Editing

I still buy quite a lot of vinyl. And the hipsterish hipsters have started releasing things on tape, since vinyl is obviously too mainstream. (I’m wondering when 78s will be making a comeback.) So to listen to this music I need to sample it and then convert it to FLAC.

That’s trivial enough since I have a very nice AD converter, but the main issue is editing the music after sampling it. I have, perhaps needless to say at this point, written an Emacs mode to do this.

As you can see, the mode is pretty self-explanatory. It shows the waveforms, and you can zoom, set break points, and split the file up into pieces. (Then name the files after querying freedb, possibly.)

The mode consists of one Emacs Lisp file and two C programs. The first, summarize, goes through the PCM file and outputs the “energy level” in each section. The second, bsplit, is just a fast file splitter. Oh, and there’s a patch to aplay to allow --seeking to an arbitrary place so that you can skip around in the file and start playing sound at point. (I see that I’ve forgotten to submit the patch to the ALSA people, so I did that just now.)
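To make the “energy level” idea concrete, here is a minimal sketch of what a summarize-style pass might compute. It is not the actual summarize source: the function name summarize_energy, the choice of mean absolute amplitude as the energy measure, and the assumption of 16-bit signed mono samples already loaded in memory are all made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: mean absolute amplitude of each block of
   `block_len` samples. Assumes 16-bit signed mono PCM in memory.
   Writes one value per block into `out` (at most `out_cap` values) and
   returns the number of blocks written. A trailing partial block is
   summarized over the samples it actually contains. */
size_t summarize_energy(const int16_t *samples, size_t n_samples,
                        size_t block_len, double *out, size_t out_cap)
{
    assert(samples != NULL && out != NULL && block_len > 0);
    size_t n_blocks = 0;
    for (size_t start = 0; start < n_samples && n_blocks < out_cap;
         start += block_len) {
        size_t end = start + block_len;
        if (end > n_samples)
            end = n_samples;
        double sum = 0.0;
        for (size_t i = start; i < end; i++)
            sum += abs((int)samples[i]);   /* widen first so -32768 is safe */
        out[n_blocks++] = sum / (double)(end - start);
    }
    return n_blocks;
}
```

With 44.1 kHz audio, a block_len of 4410 would give ten energy values per second, which is plenty for drawing a zoomed-out waveform and for looking for gaps between tracks.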

The interesting bit about wave.el is that it provides auto-splitting capabilities. Or at least, it tries to. It originally had a command for trying to put splitting marks at all points where the sound was “silent” for more than four seconds. This worked somewhat OK, but what’s “silent” varies from sound source to sound source. Some tapes are quite hissy. And when I got a new record player, the background noise level dropped to almost nothing, so calibrating “silence” is boring.
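The silence-based approach can be sketched roughly like this. This is an illustration in C rather than the Emacs Lisp in wave.el, and the name find_silence_marks, the per-block energy array, and the threshold parameter are assumptions. The threshold is exactly the value that has to be recalibrated for every hissy tape or quiet record player.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: put one mark in the middle of every run of blocks
   whose energy stays below `threshold` for more than `min_run`
   consecutive blocks. Marks are block indices written to `marks` (at
   most `marks_cap` of them). Returns the number of marks written. */
size_t find_silence_marks(const double *energy, size_t n_blocks,
                          double threshold, size_t min_run,
                          size_t *marks, size_t marks_cap)
{
    assert(energy != NULL && marks != NULL);
    size_t n_marks = 0;
    size_t run_start = 0;
    int in_run = 0;
    /* Iterate one past the end so a run touching the end gets closed. */
    for (size_t i = 0; i <= n_blocks; i++) {
        int quiet = (i < n_blocks) && (energy[i] < threshold);
        if (quiet && !in_run) {
            run_start = i;
            in_run = 1;
        } else if (!quiet && in_run) {
            size_t run_len = i - run_start;
            if (run_len > min_run && n_marks < marks_cap)
                marks[n_marks++] = run_start + run_len / 2;
            in_run = 0;
        }
    }
    return n_marks;
}
```

At ten energy values per second, “more than four seconds” would be a min_run of 40. Note that this sketch also marks long silences at the very start and end of the file, which is usually not what you want for the lead-in and run-out of a record side.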

Then I thought of a new approach: I know how many tracks there are in each file. Say it’s a record album side with five songs. Then I could tell wave.el “this is five tracks, try to find a likely partition”. It should snip away the initial “stylus hits the album” bump, and trim away the silence at the end, but otherwise put four sectional marks in the sound where it separates the tracks optimally.

However, I just wasn’t able to implement that in a satisfactory way. The current wave.el isn’t really usable in automatic mode.
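To show what the problem looks like in code, here is one naive reading of the “I know there are N tracks” idea. It is not what wave.el does. The function guess_track_marks, the greedy strategy, and the min_sep parameter are all assumptions, and it expects the stylus bump and trailing silence to have been trimmed off already.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Order two size_t values for qsort. */
static int cmp_size(const void *a, const void *b)
{
    size_t x = *(const size_t *)a, y = *(const size_t *)b;
    return (x > y) - (x < y);
}

/* Hypothetical greedy guess at n_tracks - 1 split points: repeatedly
   take the quietest block that is at least `min_sep` blocks away from
   both ends and from every mark chosen so far. `marks` needs room for
   n_tracks - 1 entries. Returns the number of marks found (possibly
   fewer than wanted), sorted in ascending order. */
size_t guess_track_marks(const double *energy, size_t n_blocks,
                         size_t n_tracks, size_t min_sep, size_t *marks)
{
    assert(energy != NULL && marks != NULL && n_tracks >= 1);
    size_t wanted = n_tracks - 1, found = 0;
    while (found < wanted) {
        size_t best = n_blocks;            /* n_blocks means "none yet" */
        for (size_t i = min_sep; i + min_sep < n_blocks; i++) {
            int too_close = 0;
            for (size_t m = 0; m < found; m++) {
                size_t d = i > marks[m] ? i - marks[m] : marks[m] - i;
                if (d < min_sep)
                    too_close = 1;
            }
            if (!too_close && (best == n_blocks || energy[i] < energy[best]))
                best = i;
        }
        if (best == n_blocks)
            break;                         /* nowhere left to put a mark */
        marks[found++] = best;
    }
    qsort(marks, found, sizeof marks[0], cmp_size);
    return found;
}
```

This needs no “silence” threshold, since it only compares blocks against each other. But it looks at single blocks rather than the length of a gap, so a quiet passage in the middle of a song can easily beat a noisy gap between two songs, which is presumably one reason a satisfactory version is harder than it looks.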

When you look at the sound files visually, it’s pretty obvious to a human bean where the tracks are, usually. But I’m just not able to figure out a nice algorithm. (I mean, I haven’t really tried a lot. I think I spent a day on it last summer, if I remember correctly.)

If anybody has any ideas, I’m all ears.