Noindex Redux

A month ago, I wondered whether there was any way to make those useless WordPress overview pages (i.e., category, author and “page X” pages) go away from search index results.

To recap, whenever I’m looking for something, Google has a tendency to return a result pointing to “page 35” of somebody’s blog, but when I go to “page 35”, what I’m looking for isn’t there any more, because it’s now on “page 49”.

To illustrate, here’s a search for a term that appears only once on this blog:

Yes, Google returns the blog post (“Small Change”), but as the final result, after two overview pages where you probably won’t find “rs232” when you click on those links.

So I added “noindex” entries for the overview pages on March 5th. It’s now been more than a month, so what do things look like now?

It’s better! The blog post (“Small Change”) is now the first result, and there’s only one overview page included in the result. So perhaps in another month or so, Google will have re-fetched all the pages and removed that, too.

(Note that Google isn’t that good at counting. Ten, two… who cares!)

Now, if only WordPress were to make “noindex, follow” the default on all the overview pages, then the world would be a (very very slightly) better place.
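For the record, the rule I have in mind is simple enough to sketch in a few lines of Python. This is only an illustration, not what WordPress or my setup actually runs: the URL patterns assume the usual “pretty permalink” layout (/category/…, /author/…, /page/N/), and the function names are made up.

```python
import re

# Assumed WordPress "pretty permalink" shapes for overview pages.
# Other permalink settings would need other patterns.
OVERVIEW_PATTERNS = [
    re.compile(r"^/category/[^/]+/?"),   # category listings
    re.compile(r"^/author/[^/]+/?"),     # author listings
    re.compile(r"(^|/)page/\d+/?$"),     # "page X" of any listing
]


def is_overview_page(path):
    """Return True if the URL path looks like an overview page."""
    return any(pattern.search(path) for pattern in OVERVIEW_PATTERNS)


def robots_meta(path):
    """Return the robots meta tag for overview pages, else an empty string."""
    if is_overview_page(path):
        # "noindex" keeps the page out of results; "follow" still lets
        # the crawler reach the actual posts through it.
        return '<meta name="robots" content="noindex, follow">'
    return ""
```

The “follow” part matters: the overview pages are still the way a crawler discovers the posts, so you only want them dropped from the results, not from the crawl.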

OTB#48: L’eclisse

L’eclisse. Michelangelo Antonioni. 1962. ⚅

Oh, L’eclisse… not Réglisse… So this isn’t a French movie about liquorice, but an Italian movie about an eclipse.

Makes more sense.

[time passes]

OH MY EMACS! Everything in this movie is so gorgeous! The performers, the lighting, the costumes, the interiors, the framing, the film stock, the 2K transfer… And there’s no music telling us how to feel! I could watch this forever.

The only scenes I don’t find riveting are (ironically enough, don’t you think?) the stock market scenes. They just seem… forced? You could pretty much tell from the first scene (where Delon (and mom) made money) that there was going to be another scene with a stock market crash, and that’s not like the rest of the film at all: Throughout the rest of the scenes, there’s a thrilling feeling of not knowing where all this is going.

It’s not a perfect movie. There’s about… a quarter? of the movie that’s kinda less gripping. (From the crash and the following… 30? minutes?) It’s the bit where Monica Vitti can’t decide whether to fuck Alain Delon or not. OK, his character is a shallow, horrible human being, but c’mon. He’s Alain Delon.

But despite his Delonness, it’s the scenes where there’s just Monica Vitti and nobody else that are the most striking. They’re really something. It’s hard to stop screenshotting because every shot is just wonderful. She manages to be this blank presence… very different from, say, Liv Ullmann (in Bergman’s movies), but still as fascinating.

Oh, and it’s not a movie completely devoid of a soundtrack: There are two scenes near the end where music is used extremely efficiently.

This blog post is part of the Officially The Best series.

The Campaign Against Link Rot

This blog has been going for a while, and more and more of the very, very useful external links (ahem) now point to sites that have disappeared, or that have rearranged all their internal links.

This is sad.

I wondered whether there was a tool that’d just point all the broken links to archive.org, and there doesn’t seem to be. But the support over at the Broken Link Checker WordPress plugin seems to be on the ball, so perhaps there will be?

However, based on my extensive research (i.e., pointing about fifty broken links at the Wayback Machine and seeing what happened), about one third had results like:

And other things, like being blocked by robots.txt or… whatever.
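If you want to repeat that kind of research without clicking through fifty links by hand, the Wayback Machine has an availability endpoint that returns JSON. Here’s a rough Python sketch of asking it about one link. The helper names are mine, and note that a snapshot being “available” says nothing about whether it holds the content you wanted.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# The Wayback Machine availability endpoint.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_url(link):
    """Build the query URL that asks about snapshots of `link`."""
    return AVAILABILITY_ENDPOINT + "?" + urlencode({"url": link})


def closest_snapshot(response):
    """Pick the snapshot URL out of a decoded JSON response, or None."""
    closest = response.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def lookup(link, timeout=10):
    """Ask the Wayback Machine about `link` (this does a network request)."""
    with urlopen(availability_url(link), timeout=timeout) as reply:
        return closest_snapshot(json.load(reply))
```

A `None` here covers both “never archived” and the robots.txt kind of trouble, which is pretty much the one-third problem above.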

So I was idly wondering… would it be possible to just… cache? Whatever we’re linking to? In WordPress?

And the answer is “no”, of course, because creating a mirror of a web page is Trey Difficult. Not to mention a security nightmare. But then it occurred to me that we’d get almost there by just grabbing a screenshot of the page at the time a blog article is written, and then just stash that in the media library! It’s not as good as having the actual text and stuff, but it’s something. You can at least read it. It’s a low-cost, 85% solution to an annoying problem.

But what UX to use to display these captures? Footnotes? A side bar? And then a smart suggestion from irc: What about a hover thing?

So now, 90 minutes later: Tada! Here’s a link to the FSF that should be cached at the time of posting, and stored in my WordPress library. You should be able to get a hovering popup that you can click on to see the (very long) PNG.

I have not implemented this as a WordPress plugin, but in ewp instead, and with some added JS and CSS on the blog. It should be a plugin instead, of course, but I don’t have the stamina to write PHP any more. I tried googling for such a plugin, but I couldn’t find anything.

So: If anybody thinks this is a good idea, please do go ahead and write a WordPress plugin that does this thing.

It basically calls cutycapt (or any other headless web “screen capture” application), and then adds some JS on the “mouseenter” event of the link.
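The capture half of that could look something like the Python sketch below. It’s not the ewp code (that’s Emacs Lisp), just the same idea under some assumptions: the binary is assumed to be installed as “cutycapt” (upstream calls it CutyCapt), the file naming scheme is made up, and the upload to the media library and the hover JS aren’t covered at all.

```python
import hashlib
import subprocess


def capture_filename(url):
    """Derive a stable PNG file name from the URL (hypothetical scheme)."""
    digest = hashlib.sha1(url.encode("utf-8")).hexdigest()[:12]
    return f"capture-{digest}.png"


def cutycapt_command(url, out_path, min_width=1024):
    """Build the cutycapt command line for a full-page capture."""
    return [
        "cutycapt",                  # assumed binary name; may be "CutyCapt"
        f"--url={url}",
        f"--out={out_path}",
        f"--min-width={min_width}",
    ]


def capture(url, out_path):
    """Run the capture; raises CalledProcessError if cutycapt fails."""
    subprocess.run(cutycapt_command(url, out_path), check=True)
```

Hashing the URL for the file name means linking to the same page twice gives the same name, so it’s up to the caller whether to overwrite or skip.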

Does this seem useful? Annoying? Confusing?

The Google Audit

As I’m sure you remember perfectly, in 2012 (!) I did something silly (no really): I scripted a teensy thing that would check what was playing on the stereo, and then search Youtube for a video that matched that as best it could (based on artist name, track title and the length of the track), and then just play the video (without sound).

To use as the background for the tiny USB monitor in the hallway that displays the weather forecast.
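The matching is nothing clever. Something along the lines of this Python sketch would do it: score each search result on artist, title and length, and take the best one. The weights, the five second tolerance and the shape of the candidate dicts are all made up for illustration; it’s not the actual script.

```python
def normalize(text):
    """Lowercase and collapse whitespace so comparisons are forgiving."""
    return " ".join(text.lower().split())


def score(candidate, artist, title, length):
    """Score one search result; higher is better (weights are arbitrary)."""
    name = normalize(candidate["title"])
    points = 0
    if normalize(artist) in name:
        points += 2
    if normalize(title) in name:
        points += 2
    # A video about as long as the track is probably the track itself.
    if abs(candidate["seconds"] - length) <= 5:
        points += 1
    return points


def best_match(candidates, artist, title, length):
    """Return the highest-scoring candidate, or None if there are none."""
    if not candidates:
        return None
    return max(candidates, key=lambda c: score(c, artist, title, length))
```

Each candidate is assumed to be a dict with a “title” string and a “seconds” number, however you got them out of the search results.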

YES I KNOW.

It’s stupid, it’s frivolous, and it consumes Youtube resources without Google earning any of them sweet, sweet ad dollars, so I was expecting it to be shut down, or I’d just grow tired of it, or…

I mean, it’s… all kinds of stupid? Right? I agree completely. No argument there.

Over the years, there’s been some restrictions: Rate limiting on the API, and rate limiting for the website itself, and I had to fill out some forms about what I’m using it for, and… But basically, it’s been doing its stupid little thing.

Cut to August 2019:

Dear YouTube API Developer,

We are currently conducting a mandatory compliance review of your YouTube Data API Project. The review is to assess your compliance to our YouTube API Services Developer Policies (link) and to learn about how our service is being used.

At your convenience in the next seven (7) business days, please complete and submit the following information :

1 A fully functional demo account, including a username and password with which we may access your API Client. The demo account you provide will be used only for compliance inspection and the credentials will not be shared.

2 A fully completed Youtube API Audit Form

3 Screenshots of how your API Client and its users access and use the YouTube API Services

4 Documents relating to your implementation, access and use of YouTube API Services

I got the lovely email above, and I assumed that this was a very clumsy phishing attack. I mean… a demo account? With a password? Could it be more obvious?

So I ignored it, and then got further emails, and after the third “third and final notice” (I think?) I looked closer at the emails and confirmed that the address was really from @youtube.com, without any Unicode homographs, and it’s DKIM signed, and…

IT’S A REAL EMAIL FROM GOOGLE! I couldn’t believe it.
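The checks I did by squinting at the headers can be roughly sketched in Python. This is a sanity check only, under loud assumptions: it verifies that the sender domain is plain ASCII and matches the expected one, and that a DKIM-Signature header is present. It does not validate the signature itself (that needs a DNS lookup and a proper DKIM library), and the function names are made up.

```python
from email import message_from_string
from email.utils import parseaddr


def sender_domain(message):
    """Return the domain part of the From address ('' if missing)."""
    _, address = parseaddr(message.get("From", ""))
    return address.rpartition("@")[2]


def looks_plausible(raw_email, expected_domain="youtube.com"):
    """Cheap checks: ASCII-only expected domain plus a DKIM header."""
    message = message_from_string(raw_email)
    domain = sender_domain(message)
    return (
        domain.isascii()                        # no Unicode homographs
        and domain.lower() == expected_domain   # really the domain we expect
        and message.get("DKIM-Signature") is not None  # present, NOT verified
    )
```

A forged From line will pass the first two checks without trouble, so the presence of the DKIM header is only a hint until the signature is actually verified.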

But I finally answered, and got a response from:

Which was also real! And not a phishing attack. It asked:

Regarding project key usage:

The given alphanumeric text [1 only] cannot be deciphered. Please provide us with a list of valid project keys associated with your API Client.

In order to check the project key for your API Client please login to Google API Console. After logging in go to IAM & admin -> settings -> project key.

OK, so I did that, and:

So there’s no project key? I wondered why they couldn’t just, like, look up this stuff themselves. And particularly since there’s no “project key” (whatever that is)… They should know already? Is this phishing after all? Are all those characters in @google.com really ASCII? They are.

After a few attempts at making it understood that I’m not running a web site; there’s no login; there’s no users: There’s just a stupid script running on my hallway computer, they asked to see a screencast of how it works.

Meanwhile, in the middle of all this, they stopped my access to the API, so I had to substitute a hard-coded video to play:


So… I guess… I’ll just wait…

Misunderstand me correctly: I’m not complaining or anything. I’m just… bemused. I mean, it’s just a stupid, fun little thing, and if Google says “er, perhaps don’t do that with the API?” then that’s fine. It’s their API. And I don’t envy those poor people working on the “dispute resolution” team. They probably have a script they’re running through to see whether the next Cambridge Analytica is doing something nefarious with the Youtube data (or at least have a way of saying, during the next Senate hearings, that they are doing something about that), and dealing with pissant hobbyists just using their APIs for fun is… probably not that fun?

It’s just… There’s a sort of disconnect. Whoever came up with this audit thing obviously didn’t have an option for “4c) Not using the API for anything that can even be audited because it’s just stupid”, which I think is probably 70% of the use cases. Because people do stupid shit.

So I’m amused. Bemused?

Am I Bemildred? I think I may be Bemildred. (He’s the one on the right.)

Meanwhile, I can’t have the background of the monitor all blank and stuff. So I’ve substituted it with this wonderfully glitched broken torrent download:

Uses less bandwidth, too.