@algernon

algernon@lemmy.ml · 4 days ago

Same here. We have a joint account, I’m the sole earner, apart from mortgage and utilities and whatnot, she’s spending all of it, and that’s great, because she does it much better than I could. I suck at managing money, she does not. I wish I could give her more money.

algernon@lemmy.ml · 7 days ago

Most often, yes. But there are exceptions. A lot of Ubuntu developers were on Debian in the early days for example. I imagine some still are - there’s a bit of overlap between Debian & Ubuntu developers here and there.

I maintained packages in various BSDs’ pkgsrc/ports trees, even though I never daily drove any of them, and had them in a virtual machine at best.

algernon@lemmy.ml · 8 days ago

I… have my doubts. I do not doubt that a wider variety of poisoned data can improve training, by implementing new ways to filter out unusable training data. In itself, this would, indeed, improve the model.

But in many cases, the point of poisoning is not to poison the data, but to deny the crawlers access to the real work (and provide an opportunity to poison their URL queue, which is something I can demonstrate as working). If poison is served instead of the real content, that will hurt the model, because even if it filters out the junk, it will have access to less new data to train on.

algernon@lemmy.ml · 8 days ago

Yup. All of the things listed there are far better than this.

(I’m also in that article, look for “iocaine”, although it evolved into something a whole lot more powerful, and a lot easier to deploy since the article was written).

algernon@lemmy.ml · 8 days ago

Your exception is for http://ha/, but you’re visiting http://ha:8123/. Ports matter. It probably worked yesterday because you either visited without a port, or clicked the “Continue to HTTP site” button, and LibreWolf remembered that for the rest of the session.
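To illustrate why the port matters, here is a minimal sketch in Python. It assumes the exception is keyed on the full origin (scheme, host, port); the helper names `origin` and `same_origin` are made up for this example and are not anything LibreWolf exposes.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit

# Default ports per scheme, used when the URL does not spell one out.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str) -> Tuple[str, str, Optional[int]]:
    """Return the (scheme, host, port) triple for a URL."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, (parts.hostname or "").lower(), port)


def same_origin(a: str, b: str) -> bool:
    """Two URLs share an origin only if scheme, host and port all match."""
    return origin(a) == origin(b)


print(same_origin("http://ha/", "http://ha:8123/"))  # False: port 80 vs 8123
print(same_origin("http://ha/", "http://ha:80/"))    # True: 80 is the default
```

So an exception saved for the first URL would not be expected to cover the second.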

algernon@lemmy.ml · 8 days ago

I had a short tootstorm about this, because oh my god, this is some terribly ineffective, useless piece of nothing.

For one, Poison Fountain tells us to join the war effort and cache responses. Okay…

❯ curl -i https://rnsaffn.com/poison2/ --compressed -s
HTTP/2 200
content-disposition: inline
content-encoding: gzip
content-type: text/plain; charset=utf-8
x-content-type-options: nosniff
content-length: 959
date: Sun, 11 Jan 2026 21:17:36 GMT

Yeaah… how am I supposed to cache this? Do I cache one response and then continue serving that for the 50+ million crawlers that visit my sites every day? And you think a single, repetitive thing will poison anything at all? Really?
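As a rough illustration of what is missing, this Python sketch looks for explicit caching hints in the headers shown above. The list of header names is my own pick of the usual freshness and validator headers, and `caching_hints` is a hypothetical helper, not part of any tool mentioned here.

```python
# Headers that usually tell a cache how long to keep a response,
# or how to revalidate it. This selection is an assumption of the sketch.
CACHE_HEADERS = ("cache-control", "expires", "etag", "last-modified")


def caching_hints(headers: dict) -> dict:
    """Return only the headers that give a cache something to work with."""
    lowered = {name.lower(): value for name, value in headers.items()}
    return {name: lowered[name] for name in CACHE_HEADERS if name in lowered}


# The response headers from the curl output above.
response_headers = {
    "content-disposition": "inline",
    "content-encoding": "gzip",
    "content-type": "text/plain; charset=utf-8",
    "x-content-type-options": "nosniff",
    "content-length": "959",
    "date": "Sun, 11 Jan 2026 21:17:36 GMT",
}

print(caching_hints(response_headers))  # {} : no explicit caching hints
```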

Then the Poison Fountain explanation goes on to claim that garbage served to the crawlers will end up in the training data. I’m fairly sure the person who set this up never worked with model training, because this is not what happens. Not even the AI companies are that clueless: they do not train on anything and everything, they do filter it down.

And what this fountain provides is trivial to filter.

It’s also mighty hard to set up! It’s not just a reverse_proxy https://rnsaffn.com/poison2/, because then you leak all the headers you got. No, you have to make a sanitized request that doesn’t leak data. Good luck!
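A minimal sketch of what "sanitized" could mean, in Python: build a brand new upstream request and copy nothing from the visitor. The function name `build_sanitized_request` and the `poison-fetcher` user agent are placeholders invented for this example; the sketch only constructs the request, it does not send it.

```python
import urllib.request

# Upstream URL taken from the curl example above.
UPSTREAM = "https://rnsaffn.com/poison2/"


def build_sanitized_request(incoming_headers: dict) -> urllib.request.Request:
    """Build a fresh upstream request that forwards nothing from the client.

    incoming_headers is deliberately ignored: cookies, the visitor's
    User-Agent, Referer, X-Forwarded-For and the like must not go upstream.
    """
    fixed_headers = {
        "User-Agent": "poison-fetcher",  # placeholder value, an assumption
        "Accept": "text/plain",
    }
    return urllib.request.Request(UPSTREAM, headers=fixed_headers, method="GET")


crawler_headers = {
    "User-Agent": "SomeCrawler/1.0",
    "Cookie": "session=secret",
    "X-Forwarded-For": "203.0.113.7",
}
request = build_sanitized_request(crawler_headers)
print(sorted(name.lower() for name, _ in request.header_items()))
```

Even with this in place, every crawler hit still turns into an outgoing request, which is the load problem described further down.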

Meanwhile, there are a gazillion self-hostable garbage generators and tarpits that you can literally shove in a docker container and reverse proxy tarpit URLs to them, safely, locally. Much more efficient, far more effective. And, seeing as this is practically uncacheable, if I were to use it, I’d have to send all the shit that hits my servers, their way. As far as I can tell, this is a single Linode server. It probably wouldn’t crumble under my 50 million requests / day, but if ten more people joined the “war effort” without caching, my well-educated guess is that it would fall over and die.
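For a sense of scale, here is the back-of-the-envelope arithmetic in Python. The 50 million requests per day figure is from above; that each of the ten additional people would forward a similar volume is purely an assumption for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def requests_per_second(requests_per_day: float) -> float:
    """Average request rate, assuming traffic is spread evenly over the day."""
    return requests_per_day / SECONDS_PER_DAY


my_daily = 50_000_000

# Just my own traffic, forwarded uncached.
mine = requests_per_second(my_daily)

# Hypothetical: ten more people, each forwarding the same volume.
combined = requests_per_second(my_daily * 11)

print(f"{mine:.0f} req/s")      # 579 req/s
print(f"{combined:.0f} req/s")  # 6366 req/s
```

These are averages; real crawler traffic is bursty, so peaks would be higher.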

Besides, we have no idea whether poisoning works. We can’t measure that. What we can measure is the load on our servers, and this helps fuck all in that regard. The bots will still come, they’ll still hit everything, and I’d have additional load due to the network traffic between my server and theirs (remember: the returned response provides no sane indicators that’d allow caching while keeping the responses useful for poisoning purposes).

Not only is this ineffective in poisoning, it’s not usable at all in its current state. And they call for joining the war effort. C’mon.

algernon@lemmy.ml · 8 days ago

I had a fun little toot storm about this, because… this thing is not very well thought out, to say the least.

algernon@lemmy.ml · edited 16 days ago

My layout is… perhaps a bit strange!

algernon's Programmer Erlang Monster

It’s the bastard child of Engram and Programmer Dvorak, laid out on a Keyboardio Model 100.

Apart from Engram being perhaps a less known layout, the symbols & number arrangement is likely going to raise questions, so I’ll go ahead and answer them: I hate the number row on the top. To enter many numbers, I’d need two hands, and plenty of finger movement. So instead, I put them on a layer in a numpad-like arrangement, so I can type them with one hand (and another to activate the layer), with far less hand movement. That also lets me input most of the symbols without having to reach for a modifier: they’re on the top row!

I also have a dedicated Compose key, and all my modifiers are one-shot (and so are the layer keys).
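For anyone unfamiliar with one-shot keys, here is a toy model in Python: a one-shot key is tapped, then applies to the next regular key only. The key names and the `apply_one_shot` helper are made up for this sketch and are not taken from my actual firmware config; real firmware typically also handles timeouts and held keys.

```python
# Keys treated as one-shot in this toy model (hypothetical names).
ONE_SHOT_KEYS = {"shift", "ctrl", "numpad_layer"}


def apply_one_shot(taps: list) -> list:
    """Turn a sequence of key taps into the chords they produce.

    A one-shot key is armed by tapping it and is released automatically
    after the next regular key, so nothing has to be held down.
    """
    pending = []   # one-shot keys armed for the next regular key
    produced = []
    for key in taps:
        if key in ONE_SHOT_KEYS:
            pending.append(key)
        else:
            produced.append("+".join(pending + [key]))
            pending = []  # one-shots expire after a single key
    return produced


print(apply_one_shot(["shift", "a", "b"]))
# ['shift+a', 'b']
print(apply_one_shot(["numpad_layer", "4", "2"]))
# ['numpad_layer+4', '2']
```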

algernon@lemmy.ml · 26 days ago

It’s complex, because it does a lot of things, and it evolved over the past few years. The basics are very, very simple:

#+begin_src nix :tangle out/flake.nix :noweb no-export :noweb-prefix no :mkdirp t
{
  inputs = {
    <<flake-inputs>>
  };

  outputs = { self, nixpkgs, ... }@inputs: {
    <<flake-outputs>>
  };
}
#+end_src

In some other file, you can then do:

#+begin_src nix :noweb-ref flake-inputs
nixpkgs.url = "github:NixOS/nixpkgs/nixos-25.11";
#+end_src

And voila, you now have:

{
  inputs = {
    nixpkgs.url = "github:NixOS/nixpkgs/nixos-25.11";
  };

  # ...rest of the flake
}

How you organize your org documents is entirely up to you. The trick is that this lets you split it up as you see fit, and you are no longer restricted by what the language or any framework built on top of it can provide.
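If the tangling step feels like magic, this Python sketch mimics the idea: every block registered under a name is concatenated and dropped in where the `<<name>>` placeholder sits. It is a simplified model, not how Org implements it; `expand` and the `blocks` dictionary are names invented for the example.

```python
import re

# A placeholder on a line of its own, e.g. "    <<flake-inputs>>".
PLACEHOLDER = re.compile(
    r"^(?P<indent>[ \t]*)<<(?P<name>[\w-]+)>>[ \t]*$", re.MULTILINE
)


def expand(template: str, blocks: dict) -> str:
    """Replace each <<name>> line with all blocks collected under that name."""

    def substitute(match):
        indent = match.group("indent")
        lines = []
        for block in blocks.get(match.group("name"), []):
            lines.extend(block.splitlines())
        # Keep the placeholder's indentation on every inserted line.
        return "\n".join(indent + line if line else line for line in lines)

    return PLACEHOLDER.sub(substitute, template)


template = """{
  inputs = {
    <<flake-inputs>>
  };
}
"""

# Blocks could come from any number of files; here there is just one.
blocks = {
    "flake-inputs": ['nixpkgs.url = "github:NixOS/nixpkgs/nixos-25.11";'],
}

print(expand(template, blocks))
```

Adding a second string to the `flake-inputs` list would simply append another input line in the same place.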

algernon@lemmy.ml · 26 days ago

If only they moved to something that isn’t VC-backed with a heavily right-leaning founder at its helm…

algernon@lemmy.ml · 26 days ago

Here’s one for my infra, and my first attempt, for my desktop. The latter will eventually merge into the former.

algernon@lemmy.ml · 28 days ago

The way I solved this problem is that I write my flake.nix in a literate-programming style using Org Roam. I have a code block that tangles out into out/flake.nix, which has a <<flake-inputs>> placeholder. I can have any number of code blocks, in any number of Org files that all reference flake-inputs. Thus, my inputs are near other code (& documentation) that uses them, split across dozens of files.

The downside is that you no longer write Nix directly, but have to tangle it out.

algernon@lemmy.ml · 1 month ago

This took a while to drop, but fuuuck, did it hurt when it did. In a good way.

algernon@lemmy.ml · 1 month ago

A new chapter, for sure. A very sad chapter, unfortunately.

algernon@lemmy.ml · 1 month ago

I’m a malicious actor.

algernon@lemmy.ml · 1 month ago

I have a social life because of the internet. That’s where most of my friends are, too.

algernon@lemmy.ml · 1 month ago

GNOME’s Video Player (Showtime) looks somewhat similar, as does Moonplayer.

algernon@lemmy.ml · 1 month ago

The article isn’t just old, it’s also wrong. builder.ai did collapse, but not because it hired 700 people from India to pretend to be AI.

See this debunking of the linked article.

algernon@lemmy.ml · 2 months ago

What are the pros of XMPP?

The pros of XMPP are that I can fully self host it, it can do video & audio calls too, and it has good clients that aren’t just a webpage wrapped in Blink (aka, Electron). Matrix is a pain in the ass to self host, especially if I don’t want to federate. My XMPP server is private, friends & family can use it, and that’s it. That’s what I needed, and it delivered perfectly. It does End-to-End encryption. It is weaker than Signal, for sure, but it’s enough for what I need it for. In short: it’s reasonably simple to self host, has good, usable clients for both platforms I care about (Linux & Android), we can chat, we can have group chats, we can have audio & video calls.

Also could u tell me about self hosting cost and time you spend on it?

Well, I’ve been self-hosting since about 1998, so the time I spend on it nowadays is very little. One of my servers has been running for ~4 years without any significant change. I upgrade it once in a while, tweak my spam filters once a week or so, and go my merry way. I haven’t rebooted it in… checks uptime 983 days. Maybe I should. My other, newer server, is only about a year old - it took a LOT of time to set that up, and the first few months required a lot of time. But that was because I switched from Debian to NixOS, and had to figure out a lot of stuff. Nowadays, I run just sys update && just sys deploy (at home, on my desktop PC), and both my tiny VPS and my homelab are upgraded. I do tweak it from time to time - because I want to, and I enjoy doing so. I don’t have to. Strictly necessary maintenance time is about an hour a week if I try to be a good sysadmin, ~10-15 minutes otherwise. It Just Works™.

As for costs: my setup is… complicated. I have a 2014-era Mac Mini in my home office, which hosts half my self-hosted things (Miniflux, Atuin server, EteBase, Grafana, Prometheus, ntfy, readeck, vaultwarden, victorialogs, and postgres to serve as a database for many of these). Its power consumption is inconsequential, and the network traffic is negligible too - in large part because I’m the primary user of it anyway. It is not connected to the public internet directly, however: I have a €5/month tiny VPS I rented from Hetzner, that fronts for it. The VPS runs WireGuard, and fronts the services on the Mac Mini through Caddy. iocaine takes care of the scrapers and other web-based annoyances (so hardly anything reaches my backend), unbound provides a resolver for my infra, vector ferries select logs from the VPS to VictoriaLogs in my homelab, and I’m running HAProxy to front for stuff Caddy isn’t good for (i.e., anything other than http).

Oh, yeah, I forgot… we have power outages here every once in a while, so I have to turn the Mac Mini back on once a month or so. It happens so rarely that I didn’t set up proper Clevis + Tang-based LUKS unlocking, so I have to plug a monitor and a keyboard in. It hasn’t reached a level of annoying that would make me address it properly.

A bunch of my other services (GoToSocial, Forgejo + Forgejo runner, Minio [to be replaced with SeaweedFS or Garage], and my email) are still on an old server, because the Mac Mini doesn’t have enough juice to run them along with everything else it is already running. I plan to buy a refurbished ThinkCentre or similar, and host these in my homelab too. That’s going to be a notable up-front cost, but as I plan to run the same thing for a decade, it will be a lot cheaper than paying for a similarly sized VPS for 10 years. The expensive part of this is storage (I have a lot of Stuff™), but only comparatively.

By far the most expensive part of my self-hosting is backups. I like to have at least two backups (so three copies total, including the original) of important things, and that’s not cheap - I have a lot of data to back up (granted, that includes my music, photo & media libraries, all of which are large).

algernon@lemmy.ml · 2 months ago

Music -> Navidrome / mpd + various clients
Google maps -> When we’re driving, I have an offline GPS. Otherwise CoMaps.
Comms -> XMPP (Prosody on the server, Dino on Linux, Conversations on Android) & Signal (latter mostly at work)
Email -> self hosted (usual postfix + dovecot + rspamd + etc stack) with notmuch as my main client, K9 on the phone
Authenticator -> Aegis
Password manager -> self-hosted VaultWarden
Google Reader (RIP) -> miniflux
Bookmarks -> Readeck