@cadadr@polyglot.city

go away

eri (@eri@plush.city), to random

:boost_ok: boosts encouraged

fedi girl challenge: recommend me progressive/anti-fascist linux distributions

the only one i currently know of is elementary OS

thank you! :sparkles_trans:

cadadr ,
@eri MX Linux is explicitly anarchist i think, tho how substantially they adhere to any principles is beyond my knowledge

cadadr (@cadadr@polyglot.city), to random

if your fedi stats bot is being thrown off by instances providing inaccurate data, that's a you issue, and it's happening because you don't know what data is and how to collect it, and much more importantly, you're noticing the exaggerated numbers from GtS instances because they are, quite conveniently, illogical numbers

so while you're being thrown off by "baffled" stats, don't you think, you're excessively vulnerable to maliciously tweaked stats that aren't obviously inaccurate?

cadadr OP ,
perhaps you shouldn't collect "data" by merely hitting some endpoint in some web apps, and, if you really care about accuracy, bother for a wee moment to build some basic statistical checks into your software that can easily spot exaggerated and otherwise implausible reports from instance APIs?

you can for example make a basic guess that an instance wouldn't grow >x% per day and code a case in your app so that any growth that exceeds that is flagged for manual review?
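
a minimal sketch of that kind of check, in python. everything named here (`Report`, `flag_implausible`, the 10% default for `max_daily_growth`) is made up for illustration and not taken from any real stats bot; the threshold in particular is an assumption you would have to tune against what you actually observe

```python
from dataclasses import dataclass


@dataclass
class Report:
    instance: str     # hostname of the reporting instance
    day: int          # day index of the observation
    user_count: int   # whatever number the instance API returned


def flag_implausible(reports, max_daily_growth=0.10):
    """Split reports into (accepted, flagged_for_manual_review).

    A report is flagged when it is negative, or when it exceeds the last
    accepted count for that instance compounded at max_daily_growth per day.
    """
    last_accepted = {}        # instance -> last accepted Report
    accepted, flagged = [], []
    for r in sorted(reports, key=lambda r: (r.instance, r.day)):
        if r.user_count < 0:
            flagged.append(r)
            continue
        prev = last_accepted.get(r.instance)
        # With no history (or a previous count of zero) there is no
        # baseline to compare against, so the report is accepted as-is.
        if prev is not None and prev.user_count > 0:
            days = max(r.day - prev.day, 1)
            ceiling = prev.user_count * (1 + max_daily_growth) ** days
            if r.user_count > ceiling:
                flagged.append(r)
                continue
        accepted.append(r)
        last_accepted[r.instance] = r
    return accepted, flagged


reports = [
    Report("a.example", 0, 1000),
    Report("a.example", 1, 1050),
    Report("a.example", 2, 5_000_000),   # illogical jump
    Report("a.example", 3, 1100),
]
accepted, flagged = flag_implausible(reports)
```

flagged reports are not used as the baseline for later days, so one absurd number doesn't make every following normal report look fine by comparison. note this only catches implausible growth; it says nothing about numbers that are wrong but plausible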

cadadr OP ,
an uncurated collection of numbers is an RNG, not a database

just because you store it in a DBMS it doesn't become data

data is a curated record of observations of variables. not any old garbage array is data

cadadr OP ,
stop and think about this for a moment. you wrote a thing that asks websites how popular they are, and takes their answer for Granted, without interrogation...

all this talk about robots.txt and ideas about consent is secondary. your software is Vulnerable

if someone snuck a malicious patch into the mastodon docker image that reports all instances running it as 25% smaller, if they were smart about it, your code would report fedi as maybe 20% smaller, and nobody would notice without fucking Forensics…
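
to make the arithmetic concrete, a toy python example. the instance names, the counts, and the assumption that 80% of users sit on patched instances are all invented for illustration; only the 25% under-report comes from the post above

```python
# Hypothetical "true" user counts per instance (made-up numbers).
true_counts = {
    "masto-a.example": 5000,
    "masto-b.example": 3000,
    "other.example": 2000,
}

# Hypothetical set of instances running the tampered image.
patched = {"masto-a.example", "masto-b.example"}


def reported_counts(counts, patched, underreport=0.25):
    """Return what a naive collector would see: patched instances
    report (1 - underreport) of their true count, others are honest."""
    return {
        name: round(n * (1 - underreport)) if name in patched else n
        for name, n in counts.items()
    }


seen = reported_counts(true_counts, patched)

# Fraction by which the collected total undershoots the true total.
total_shrink = 1 - sum(seen.values()) / sum(true_counts.values())
print(f"collected total is about {total_shrink:.0%} too small")
```

with 8000 of 10000 users on patched instances, a 25% under-report per instance comes out as roughly a 20% smaller total, and every individual number still looks perfectly plausible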

cadadr OP ,
to be clear my position is, 100%, robots.txt is sacrosanct, and i think all this bean counting is bullshit. it's uninteresting and boring to me

but i am posting this thread based on my familiarity with quantitative social science as an MA in linguistics, and from that PoV all i can say is, you don't get to complain about your participants. your data is your responsibility. people you observe don't live to be observed or to comply. resilience of your measurement tools is your responsibility

cadadr (@cadadr@polyglot.city), to histodons group

question: those of you who went abroad on erasmus during your doctoral studies, what was the experience like? would you recommend doing it? was it worth it?

it seems like part of my doctorate will benefit from archives and mayhaps oral history interviews in the northeastern mediterranean and methinks it would make sense to just spend 6-12mo someplace nearby, instead of dealing with many visas and travelling back and forth from home

AcademicChatter group (academicchatter@a.gup.pe) @phdlife histodons group (histodons@a.gup.pe)