Random Comics

I read some comics over the last week or so.

Mafalda used to run in the papers when I was a child, but I’m not quite sure whether I remember liking it or not. I like the artwork, but I think I found the humour to be rather annoying?

But I really like Mafalda now: it’s funny and endearing.

The translation here, though, isn’t very good. I mean, there’s tons of puns in Mafalda and those are just a pain to translate, but frequently the strips (in English form) are more “eh? eh?” than “heh heh”, and I think that’s down to the translation.

It’s also a bit confusing that there’s not a hint of explanation of what we’re reading. I mean, I hate introductions and all of that jazz, so I’m happy that they let the work speak for itself, but there’s not even a date anywhere in the book — and that’s pretty important, since Quino speaks a lot about current affairs. But I’m guessing this book reprints work from about… 1966?

A new batch of minis from Kuš. I didn’t find this batch as strong as usual.

This one was overly didactic.

I liked this one.

And this one was funny.

And I have no idea what was going on in this one.

I read another Corto Maltese album in French (because reading works translated into French is easier than reading works written in French).

I’m not sure the colour palette chosen for this is totally successful — there’s a whole lot of beige and tans and not much else. I mean, I’ve got these comics in black and white, too, so that’s why I bought these French editions in colour.

In any case, these stories (there’s five more 20 page stories in this album) are still a delight to read. They’ve got such a mood going on.

Tegnehanne has done these books for a while now — they’re strongly autobiographical ones, and the worry is, of course, that she’d run out of stuff to write about.

Have no fear! This is as good as anything she’s done before — it’s funny and heartbreaking and uplifting at the same time.

She depicts her neighbours in rather, er, frank terms…

… so if these bits are true (and they certainly feel that way), I’m wondering whether there were strained relationships in the ‘hood after this was published.

I’ve been diligent with my French and read four issues of the Spirou magazine.

How current affairs-ey! Kid Paddle’s family takes in a Ukrainian refugee (from Chernobyl). (It turns out (on subsequent pages) that everything isn’t fun about a nuclear disaster anyway.)

Of the new serials, The School For Bad Parents has promise — very funny, but is he going to run out of ideas?

I love the Seccotine serialisation.

And of course, Les Fabrices are always hilarious.

And there aren’t too many series like the above (which I just find to be pretty dull), so this was a good batch of Spirous.

Galago is a long-running Swedish anthology. Lots of good stuff, but these two stood out:

This reminds me a bit of Lynda Barry’s late-80s artwork? And that’s high praise indeed.

This is very 2026, on the other hand, but also good.

Yes, I read some Marvel comics, too.

Planet She-Hulk is the worst of the bunch.

Venom (written by Al Ewing) is the best.

And some Image/Dark Horse/IDW books.

James Stokoe is insane (complimentary).

And so is Jake Smith (ditto).

Oh, and that’s it? I guess so.

Screenshotting Web Pages Without Cookie Banners

I was blathering on yesterday about how hard it is to screenshot a web page these days. I mean programmatically, because my use case is to make links on a blog have the same life span as the blog itself — taking screenshots in your browser manually is usually pretty easy.

But if you use, say, shot-scraper from a non-US IP address you usually get something like the above. Which sucks.

Today, though, I thought: there are things like the uBlock “annoyances” list. For instance, here we have some nice lists made to remove annoying things like cookie banners (and other modals). Why can’t shot-scraper use those lists, huh?

Why not indeed:

So I forked shot-scraper to add a syntax for loading a JavaScript file:

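import click
from playwright.sync_api import Error


def _evaluate_js(page, javascript):
    # "file <path>" loads a script from disk; anything else is run as inline JS.
    try:
        if javascript.startswith("file "):
            path = javascript[5:].strip()  # everything after "file "
            return page.add_script_tag(path=path)
        else:
            return page.evaluate(javascript)
    except Error as error:
        # Surface Playwright failures as ordinary CLI errors.
        raise click.ClickException(error.message)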

And then I just generated a JS file that uses all those selectors to remove elements, and there you go. Kinda hacky, but…

The code lives in the Emacs WordPress library. But somebody should take this idea and integrate it with shot-scraper proper — a switch like --remove-annoyances that just downloads those block lists and uses them would be ideal.
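For what it’s worth, here’s a rough Python sketch of what the generating step could look like. This isn’t the code from the Emacs library: it assumes the block list is in the usual filter-list text format where generic cosmetic rules start with ##, it ignores site-specific rules and scriptlet rules, and the function names are made up for illustration.

```python
import json


def selectors_from_filter_list(text):
    """Collect generic cosmetic-filter selectors ("##selector" lines)."""
    selectors = []
    for line in text.splitlines():
        line = line.strip()
        # Generic cosmetic filters start with "##". This skips comments,
        # network rules, site-specific rules and "##+js(...)" scriptlets.
        if not line.startswith("##") or line.startswith("##+js("):
            continue
        selectors.append(line[2:])
    return selectors


def build_removal_script(selectors):
    """Return JavaScript that removes every element matching a selector."""
    return (
        "(() => {\n"
        f"  const selectors = {json.dumps(selectors)};\n"
        "  for (const selector of selectors) {\n"
        "    try {\n"
        "      document.querySelectorAll(selector).forEach((el) => el.remove());\n"
        "    } catch (e) {\n"
        "      // Not plain CSS (e.g. a procedural filter), so skip it.\n"
        "    }\n"
        "  }\n"
        "})();\n"
    )


# Hypothetical sample; a real list would be read from a downloaded file.
sample = """! Title: sample list
##.cookie-banner
example.com##.site-specific
||ads.example.com^
##+js(set, foo, true)
###consent-modal
"""

if __name__ == "__main__":
    print(build_removal_script(selectors_from_filter_list(sample)))
```

The resulting string can be written to a file and handed to the forked shot-scraper with the "file" syntax shown above.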

It’s not like this is a panacea, though, because there’s so many other stumbling blocks. Like:

You’d have to work harder to get around those… And:

Some sites have HTML so obfuscated that it’s nigh impossible to just remove the offending elements.

But still! While this doesn’t work on all sites, it works on a whole lot of them, so that’s some progress, at least.

Hack off:

Hack on:

See? Better.

Screenshot All The Links

I’ve talked about this before, but to recap: As someone who does quite a bit of research into somewhat obscure topics on the web, there’s nothing as annoying as when you read an old web page that says something like “and you can read that really interesting interview on this page”, and then you follow that link, and discover that that site disappeared a decade ago.

And the Wayback Machine didn’t archive it.

So, ideally, whenever you link to something, a copy of what you’re linking to should be stored on your own site, so that what you’re writing and what you’re linking to have the same lifespan. That’s kinda difficult to do, though: there are lots of issues with “safely” mirroring a site in a useful manner. But what’s trivial to do is to screenshot what you’re linking to.

It’s a 90% solution: No, it’s not ideal to read a screenshot of a page instead of the page itself, but it’s a lot better than nothing:

But… Actually taking a screenshot of a web page and then manually uploading it to your blog site would be an insane amount of work. But computers are pretty good at automating stuff, so my Emacs-based WordPress interface does this automatically… as well as it can, because even screenshotting things from your own machine is getting to be pretty hard.

Because not only are there cookie banners and various other blockers, but even “nice” sites like the above somehow feel the need to plaster some modal over the page contents. *sigh* And that’s not the worst, really — there’s so many “anti scraper” tools that trigger for even the most innocent of automatic usages like the above that you may end up being permanently banned if you try to use anything other than the newest of the newest actual real browsers to visit a web site.

It’s not that I blame them: it’s an arms race against out-of-control AI scrapers. But the use cases that are most affected by all of this are use cases like this one, while the AI scrapers have infinite resources, use residential VPNs and heavy automation to seem like real people, and don’t care one whit one way or the other. Well, I’m guessing that Playwright (which is what I’m using for this) will come with an LLM extension soon to click through all the modals, right?

[Slight digression: While typing this blog post, it occurred to me that Cloudflare had announced APIs for doing stuff like screenshots, so I wondered whether they’d come up with something fun in this area. So I pointed that API at an IMDb page and voilà:

A big fat nothing, because IMDb uses the Big Amazon Firewall to block everything from data center IPs and browsers that don’t pass a human-like check.]

So I don’t really have a solution here for all of that. I just wanted to mention that I’ve cleaned up the code to actually display the linked screenshots and made it into a WordPress plugin. (Hover over that Microsoft GitHub link to see the plugin in action. And possibly click on that thumbnail you get when hovering, too.)

(Note that this isn’t one of those annoying “preview” things that some web sites put on URLs — I find that to be the most annoying thing ever, and totally useless. What you’re seeing here is a screen capture of the linked site taken the same date I posted this post — so you’re seeing exactly what I linked to when I linked to it.)

Unfortunately, there is no way to do automatic screenshots from the server — Cloudflare blocks/challenges all access from known data center IPs, so that’s just not feasible. So if you want to do something like this, you have to find your own way to get the screenshots of what you’re linking to.
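If you do want to roll your own, a minimal sketch of the first step might look like this in Python. It is not how the Emacs code does it: it just pulls the external links out of a post’s HTML and builds one shot-scraper command per link, to be run from your own machine. The hashed file names and the screenshots directory are assumptions for illustration.

```python
import hashlib
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect unique absolute http(s) links from <a href="..."> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Skip anchors, relative links and anything already seen.
        if href and href.startswith(("http://", "https://")) and href not in self.links:
            self.links.append(href)


def screenshot_commands(post_html, out_dir="screenshots"):
    """Build one shot-scraper command line per external link in the post."""
    parser = LinkCollector()
    parser.feed(post_html)
    commands = []
    for url in parser.links:
        # Hypothetical naming scheme: a short hash of the URL.
        name = hashlib.sha256(url.encode("utf-8")).hexdigest()[:16] + ".png"
        commands.append(["shot-scraper", url, "-o", f"{out_dir}/{name}"])
    return commands


if __name__ == "__main__":
    html = '<p>See <a href="https://example.com/interview">this page</a>.</p>'
    for command in screenshot_commands(html):
        print(" ".join(command))
```

Each command list can then be passed to something like subprocess.run, and the resulting images uploaded alongside the post.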

The Best Linux Configuration Syntax Ever

I’ve been scanning a bunch of magazines for kwakk.info over the last few months, and I’ve got a pretty efficient setup — I’ve got an A3 Epson 50000XL scanner, so I can scan a double page spread in a couple seconds. I use a pedal (USB HID) to trigger the scan with my foot so that I have both hands free to press the magazine down, and with that setup I get a throughput of about 16 pages per minute. (I could use the scanner’s lid instead of my hands, but that would cut the throughput seriously.)

And I can do scanning “on autopilot” while watching TV, so it’s like knitting for me, basically.

But sometimes I make mistakes, and I have to redo a page. And sometimes the mag switches between black and white and colour, and I have to tell the laptop that somehow.

So my solution was to lean over to the laptop and hit the right key there, but… that’s so inefficient! So I decided to buy this little beauty:

It’s a three-key keyboard, which is just perfect: one key for “go back and redo”, one for “now it’s colour” and one for “now it’s black and white”. (OK, I could have used a two-key keyboard, but a B&W toggle wouldn’t be very user friendly, especially since this user’s scanning is usually accompanied by beer usage.)

This is also a USB HID and works out of the box in Linux. The keys output a, b and c. Why not? As good as anything else.

But the problem is now that pedal I use to trigger the scans with my foot. It outputs b. WHAT ARE THE ODDS!!!!

So I wondered how you remap a specific key from a specific device to something else these days. Under X, that would have been easy enough, but my laptop is running Wayland, and… well you know Wayland.

Instead of researching this problem, I just asked an LLM, and gave it the debug output from evtest. It said I should do this:

root@up:~# cat /etc/udev/hwdb.d/99-pedal.hwdb
evdev:input:b0003v1A86pE026e0111*
  KEYBOARD_KEY_70005=d

This should remap my pedal USB device to output d instead of b.

# systemd-hwdb update; udevadm trigger

And what do you know! It works!!!!

But… that configuration syntax? Seriously?

I guess this is a perfect time for systemd-based configuration — impossible for human beans to deal with, but perfect for LLMs? I guess Poettering was some kinda visionary after all.

Looking at the USB subsystem, I think 1A86 is the vendor ID and E026 is the product ID… but I don’t really know what the rest of those numbers are supposed to be, and I don’t have to know, do I?
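For the curious, the match line seems to follow the modalias pattern described in the comments of systemd’s own keyboard hwdb file: b is the bus type, v the vendor, p the product and e the version, all as four-digit hex. The number after KEYBOARD_KEY_ is the scancode, which for USB HID keyboards should be the HID usage (page 0x07, usage 0x05 being the b key). Here’s a small Python sketch that picks the pieces apart; the function names are made up and only a few bus types are listed.

```python
import re

# A few bus numbers from linux/input.h (not the full list).
BUS_NAMES = {0x01: "PCI", 0x03: "USB", 0x05: "Bluetooth", 0x11: "i8042"}

MATCH_RE = re.compile(
    r"evdev:input:b(?P<bus>[0-9A-F]{4})v(?P<vendor>[0-9A-F]{4})"
    r"p(?P<product>[0-9A-F]{4})(?:e(?P<version>[0-9A-F]{4}))?"
)


def parse_hwdb_match(match_line):
    """Split an "evdev:input:b...v...p...e..." match line into its fields."""
    m = MATCH_RE.match(match_line.strip())
    if m is None:
        raise ValueError(f"not an evdev:input match line: {match_line!r}")
    fields = {
        name: int(value, 16)
        for name, value in m.groupdict().items()
        if value is not None
    }
    fields["bus_name"] = BUS_NAMES.get(fields["bus"], "unknown")
    return fields


def parse_key_property(prop):
    """Split "KEYBOARD_KEY_70005=d" into usage page, usage ID and key name."""
    name, _, key = prop.strip().partition("=")
    scancode = int(name[len("KEYBOARD_KEY_"):], 16)
    return {
        "usage_page": scancode >> 16,   # 0x07 is the keyboard/keypad page
        "usage_id": scancode & 0xFFFF,  # 0x05 is the "b" key
        "key": key,
    }


if __name__ == "__main__":
    print(parse_hwdb_match("evdev:input:b0003v1A86pE026e0111*"))
    print(parse_key_property("  KEYBOARD_KEY_70005=d"))
```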

And it’s annoying that this bit of user configuration lives under /etc instead of /home, but…

So now I can just continue scanning. Even faster than before! Vroom vroom!

(“noooo” I can hear my RSI softly yelling.)