Search Index Cleanliness Is Next To Something

Allegedly, 30% of all web pages are now WordPress. I’m guessing most of these WordPress sites aren’t typical blog sites, but there sure are many of them out there.

Which makes it so puzzling why Google and WordPress don’t really play together very well.

Lemme just use one of my own stupid hobby sites, Totally Epic, as an example:

OK, the first hit is nice, because it’s the front page. The rest of page one in the search results is all “page 14” links, “category” pages and the like, none of which are pages that anybody searching is actually interested in.

The worst of these are the “page 14” links: WordPress, by default, does pagination by starting at the most recent article, and then counts backwards. So if you have a page length of five articles, the five most recent articles will be on the first page, then the next five articles are on “page 2”, and so on.

You know the problem with actually referring to these pages after the fact: What was once the final article on “page 2” will become the first article on “page 3” when the blog bloviator writes a new article: It pushes everything downwards.

So when you’re googling for whatever, and the answer is on a “page 14” link, it usually turns out not to be there, anyway. Instead it’s on “page 16”. Or “page 47”. Who knows?
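The drift is easy to demonstrate. Here’s a minimal sketch in plain JavaScript (hypothetical helper names; WordPress’s real pagination lives in PHP, this is just the shape of it) of newest-first pagination, showing how publishing a single new article shifts the contents of every numbered page:

```javascript
// Newest-first pagination, WordPress-style: page 1 holds the most
// recent articles, and everything counts backwards from there.
function page(articles, pageNo, perPage = 5) {
  // `articles` is newest-first
  const start = (pageNo - 1) * perPage;
  return articles.slice(start, start + perPage);
}

let articles = [];
for (let i = 1; i <= 70; i++) articles.unshift(`article-${i}`); // newest first

const before = page(articles, 14);
articles.unshift("article-71"); // the bloviator writes a new post
const after = page(articles, 14);
// "page 14" no longer holds the same articles: everything shifted by one.
```

One new post, and the article you bookmarked on “page 14” is now the second item, on its way to “page 15”.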

Who can we blame for this sorry state of affairs? WordPress, sure; it’s sad that they don’t use some kind of permanent link structure for “pages”. Instead of https://totally-epic.kwakk.info/page/5/, the link could have been https://totally-epic.kwakk.info/articles/53-49/; i.e., the post numbers, or https://totally-epic.kwakk.info/date/20110424T042353-20110520T030245/ (a publication time range), or whatever. (This would mean that the pages could increase or shrink in size if the bloviator deletes or adds articles with a “fake” time stamp later, but whatevs?)
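For contrast, here’s a sketch of the stable scheme suggested above (hypothetical names, not actual WordPress code): paginate from the oldest article forwards, so a page’s contents never shift when new posts appear, and the page URL can then be keyed to the post IDs it contains:

```javascript
// Stable pagination: count from the OLDEST article forwards.
// New posts only ever append a new page; old pages never change,
// so a URL like /articles/53-49/ stays valid forever.
function stablePage(articles, pageNo, perPage = 5) {
  // `articles` is oldest-first
  const start = (pageNo - 1) * perPage;
  return articles.slice(start, start + perPage);
}

let articles = [];
for (let i = 1; i <= 70; i++) articles.push(`article-${i}`); // oldest first

const p2 = stablePage(articles, 2);
articles.push("article-71"); // a new post appears at the end
const stillP2 = stablePage(articles, 2);
// Page 2 contains exactly the same articles as before.
```

(As noted, deleting articles or backdating timestamps would still make pages shrink or grow, but the page a given article lives on wouldn’t drift every time something new is published.)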

Can we also blame Google? Please? Can we?

Sure. There’s a gazillion blogs out there, and they basically all have this problem, and Google could have special-cased it for WordPress (remember that 30% thing? OK, it’s a dubious number) to rank these overview pages lower, and rank the individual articles higher. Because it’s those individual pages we’re interested in.

This brings us to a related thing we can blame Google for: They’re just not indexing obscure blogs as well as they used to. Many’s the time I’m looking for something I’m sure I’ve seen somewhere, and it doesn’t turn up anywhere on Google (not even on the Dark Web; i.e., page 2 of the search results). Here’s a case study.

But that’s an orthogonal issue: Is there something us blog bleeple can do to help with the situation, when both Google and WordPress are so uniquely useless in the area?

Uneducated as I am, I imagined that putting this in my robots.txt would help keep the useless results out of Google:

User-agent: *
Disallow: /author/
Disallow: /page/
Disallow: /category/

Instead this just made my Google Search Console give me an alert:

Er, OK. I blocked it, but you indexed it anyway, and that’s something you’re asking me to fix?

You go, Google.

Granted, adding the robots.txt does seem to help with the ranking a bit: If you actually search for something now, you do get “real” pages on the first page of results:

The very first link is one of the “denied” pages, though, so… it’s not… very confidence-inducing.

Googling (!) around shows that Google mostly treats the robots.txt as a sort of hand-wavy hint as to what it should do, apparently because the California DMV added a robots.txt file in 2006.

It … makes … some kind of sense? I mean, for Google.

Instead the edict from Google seems to be that we should use a robots.txt file that allows everything to be indexed, but include a

<meta name="robots" content="noindex,follow">

directive in the HTML to tell Google not to index the pages instead.

Fortunately, there’s a plugin for that. But googling for that isn’t easy, because whenever you’re googling for stuff like this you get a gazillion SEO pages about how to get more of your pages on Google, not fewer. Oh, and this plugin seems even better (that is, it gives you finer control over which pages to noindex).

So I added this to that WordPress site on March 5th, and I wonder how long it’ll take for the pages in question to disappear from Google (if ever). I’ll update when/if that happens.

Still, this future is pretty sad. Instead of flying cars we have the “Robots “noindex,follow” meta tag” WordPress plugin.

[Edit one week later: No changes in the Google index so far.]

[Edit four weeks later: All the pagination pages now no longer show up in Google if I search for something (like “site:totally-epic.kwakk.info epic”), so that’s definitely progress. If I just search for “site:totally-epic.kwakk.info” without any query items, then they’ll show up anyway, but I guess that doesn’t really matter much, because nobody does that.]

OTB#67: Badlands

Badlands. Terrence Malick. 1973. ⚄

As usual with American movies depicting teenagers, it’s always confusing: Are these older actors really supposed to be teenagers, or are they developmentally challenged adults? Spacek looks mid-20s, but acts like she’s aiming for twelve, and Sheen looks like he’s late-30s, but acts like he’s aiming for fifteen? Or are they supposed to be their real ages? Or is she supposed to be young and he’s a pedophile? Or the other way around?

IT’S SO CONFUSING!

OH!

Spacek just explained, in a voiceover, that she’s fifteen and Sheen is twenty-five. Well, thanks!

I guess:

Badlands is often cited by film critics as one of the greatest and most influential films of all time.

It is a fun movie, but the ending (where Malick does the oh-so-ironic “look at how famous these killers are” thing) is rather grating: The subtext of the movie becomes the text, which either means that Malick doesn’t trust the audience, or that Malick doesn’t believe that he’s one of those star-struck people.

Which he obviously is.

This blog post is part of the Officially The Best series.

BC&B: Morue à la Provençale le Caméléon w/ Aïoli

Food time!

The salt cod dishes in the Bistro Cooking book have been pretty spiffy… this one looks like it’s in a more bacalaoish direction than the previous ones, what with all the tomatoes and stuff.

There’s all the usual stuff… and then a whole lot of herbs. Even before starting to cook, it smells delicious.

Heeeerbs…

Oh yeah, there’s the salted cod that I’ve been watering for a day or so.

Quite a lot of onions and tomatoes: Half a kilo onions and two kilos of tomatoes (and half a kilo cod).

My favourite kitchen implements (after my new spiffy knives) are these bowls. I’ve got a whole stack of them, and they stack really well, so they take next to no room, but whenever I need something to put something in, these are usually perfect. And very steely.

Chop chop chop chop.

These herb-cutting shears are also really nice. Dishwasher safe, too. Makes chopping herbs so much easier and faster.

OK, so the onions go into a pan to soften up…

And then dump in all the tomatoes.

And then all the herbs. Mmm.

Looks like a sauce.

And there’s aioli to go with the potatoes.

I’ve made aioli before, but it was not a huge success. You see that recipe? Garlic, salt, egg yolks and extra-virgin olive oil? Basically everybody agrees that that’s a totally loopy recipe: It tastes way harsh. Most of the other recipes add lemon juice and Dijon mustard, so I’ll try that this time.

So it’s garlic and salt…

Mashed with a pestle.

Then add egg yolks…

… and then stir in the olive oil slowly. It didn’t break! And then I added lemon juice and mustard… and it was still pretty harsh, dude.

I see that basically everybody else recommends using mostly neutral oil, and just one third extra virginity stuff, and I think that’s a very sound idea, because this was balls-to-the-wall virgin, man.

Meanwhile the sauce has been puttering away…

… so it’s time to start the cod. Bay leaves and thyme…

… and then tear it all up. Looks pretty bad, but it’s a bit on the delicious side.

And then into the onion and tomato sauce for a couple of minutes.

Oh, I need something to read while eating. The next book on the shelf is What If It’s Us by Becky Albertalli & Adam Silvera, and I have no recollection of buying it. Let’s read the first three pages:

Oh, it’s a teenage romance comedy New York thing. Well, that’s fine by me.

And served with boiled potatoes with aioli.

Hm… well… it’s OK? But the tomato sauce definitely needs more… more… It just needs more. The sauce was rather flat: It needs more garlic, more herbs, and more chili, and perhaps some paprika? I mean, it’s not bad, but it needs more.

More more.

The book tries so hard to have the repartee be witty, but mostly lands on “well, that’s an awkward stream of words” instead. I appreciate the effort, but it’s not actually funny.

And I’m not the prime audience for this book (you may be surprised to learn that I’m not a teenager *gasp*), but everything in this book is so deadly earnest. Whenever our two protagonists get together, one of them will say something that’s just So Seriously Inadvertently Wrong, and all this sadness and low-key drama ensues, and it gets a bit unbearable after a while. There’s four hundred pages of this stuff, and it’d be excessive at half that length.

Still, there’s some cute scenes here and there. It’s fine.

But it needs more.

More more.

Which made this a perfect pairing with the dish.

This blog post is part of the Bistro Cooking & Books series.

Parallax Error Beheads You

tl;dr: I made a silly 3D web page thing.

Yadda yadda:

For entirely nostalgic reasons, I’ve been buying a bunch of paperback books published by the largest Norwegian publishing house, Gyldendal, in the 60s and 70s. I guess these are the Norwegian equivalents of what Penguin was at the time: Cheap, but nice and with a nose for quality.

As a teenager (when the series had wound down), I used to walk around the library, looking at these artefacts and thinking “I should really read all of these”. I think one of the triggers for this weird desire is that they’re numbered, so it’s conceivable to read them all. Even in the correct order!

But the library didn’t have the oldest books in the series, so I never got started… But it’s a thought that has reoccurred to me over the years, and…

Look what happened:

I started buying them the other week. They’re still cheap: Inflation-adjusted they’re cheaper now than when they were published. Which is both nice and not-so-nice: It can be harder to find cheap books, because people don’t put them up for sale, as it’s not worth the bother. But I’ve bought 60% of them now (that’s 25% of them in the picture up there).

While doing this, I was also thinking about 3D. Perhaps because many of the covers are kinda pop-artey. And perhaps this nostalgia trip made me think about the demos I made as a teenager. And I’ve never done any 3D programming, ever, so I sat down and started typing some Clojurescript… and:

The stupid source code is here, and the live web site is here.

This is only the second Reagent thing I’ve written, and it’s… not a very Reagent-ey single page app. The main problem is that I have to do some low-level DOM fiddling, and I didn’t find a way to do that with Reagent’s “proper” way of doing things. For instance, when going from an animation to a transition, I have to stop the animation, query the 3D state of the object, copy that over to the object’s style as is, and then start the transition. Try as I might, I couldn’t figure out how to do that in Reagent without glitches, so I just resorted to altering the DOM directly (adding styles and stuff on the fly).
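That handoff can be sketched in plain JavaScript (rather than the ClojureScript the site actually uses; the function name and the injectable getStyle parameter are my own, for illustration): stop the animation, pin the computed mid-animation transform as an inline style, and then the transition can start from exactly that state.

```javascript
// Animation-to-transition handoff: query the element's current computed
// transform mid-animation, stop the animation, and pin that transform as
// an inline style so a CSS transition can take over without a glitch.
// (Hypothetical helper; in the browser you'd call it with the default
// getStyle, which uses the standard window.getComputedStyle API.)
function freezeTransform(el, getStyle = (e) => window.getComputedStyle(e)) {
  const current = getStyle(el).transform; // the mid-animation 3D matrix
  el.style.animation = "none";            // stop the animation
  el.style.transform = current;           // pin the state it was in
  return current;
}
```

After this, setting a `transition` property and a new target `transform` on the element makes it glide from the pinned matrix rather than snapping back to its pre-animation position.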

Working with CSS 3D, as a total novice, was pretty fun. You can play around with the 3D stuff in Emacs and see the changes immediately in the browser. Getting to grips with how to do perspective (or not) also took a few tries. For instance, when a book glides out of the library, its other faces haven’t loaded yet, so it glides straight out towards the viewer, hiding those faces, and only starts to turn once the images have loaded. So that bit uses a way-off perspective, while it’s more fun to have a closer perspective when the books are spinning…

Lots of trial and error. There’s 98 commits.

¯\_(ツ)_/¯

The annoying thing about CSS 3D is, of course, that there’s a number of browsers out there. The site looks somewhat choppy in Firefox, very smooth in Chrome, and there are some glitches in Safari, which seem to stem from Safari not being able to determine (fast enough?) which objects are behind which other objects when there’s a lot of them and the Z-axis difference between the objects is less than a couple of pixels.

Oh, and I got to use a new tool:

To measure the spines for scanning. Fun!

The confusing title of this post is from an album by Max Tundra: