That's very 1984 of them
The root of the problem is that Wikipedia's lack of local snapshots leaves its articles vulnerable to eroding sources.
Is it reasonable for them to keep their own local snapshots?
That's not a trivial amount of work and data, particularly if it's multimedia.
I think it's a concerning issue affecting the long-term viability of the platform. It'll only get worse as time goes on and sources go offline.
Okay so, what is the current go-to alternative that bypasses paywalls?
i've had consistently good luck with the archive.org wayback machine
copy the headline and find the same thing free somewhere else. usually it's a news site full of unreadable slop. paywalls used to be almost worth bypassing. no more. just another money grab, pretending to protect valuable information. not
Fair point. Very few if any news sites provide unique articles.
I'm afraid there aren't any. You can use the Bypass Paywalls Clean extension though
Oh well, archive.today it is in the meantime I guess.
As someone who uses Bypass Paywalls Clean, this is so frustrating.
Bypass Paywalls Clean was chased off of the Firefox Add-Ons site, chased off of Gitlab, and chased off of Github via DMCA takedown notices for copyright infringement. It is now hosted on the Russian Gitflic.ru.
We all know Russia sucks in a litany of ways, but one way it doesn't suck is that it is one of the few countries left that has really thrown all caution to the wind and absolutely said "fuck it" in terms of respecting the international Big Copyright norms as promoted by and deeply influenced by the USA copyright cabal (RIAA/MPAA).
We have spent the better part of two decades dealing with the DMCA being used as an outright weapon to silence information that corporations and government find inconvenient, mostly because that information is wildly incriminating for them. It works especially strongly because a large amount of the world's internet has been consolidated to the US and its vast hosting structures like AWS and Cloudflare, putting enormous amounts of the internet under the direct influence of US laws like the DMCA.
Websites like Anna's Archive, Libgen, and Sci-Hub live because they use hosting in countries that allow them to bypass these kinds of restrictions. Russia is one of the most common countries for them to host the data out of due to the lack of enforcement of copyright laws, although it is obviously not the only country that these sites use.
Until we are able to alter international copyright protections to be reasonable instead of their current over-zealous and aggressively abusive nature, we will all suffer having to risk hosting of such sites in countries that are otherwise very unsavory to be associating with.
We live in the kind of world early piracy pioneers such as the original creators of The Pirate Bay were trying to prevent from becoming a reality. The American copyright cabal fought tooth and nail to change Sweden's interpretations of copyright law so they could send these men to prison.
And now Firefox completely bans it from even being sideloaded.
I'm with you on this, but let's be careful here.
We all know Russia sucks in a litany of ways, but one way it doesn't suck is that it is one of the few countries left that has really thrown all caution to the wind and absolutely said "fuck it" in terms of respecting the international Big Copyright norms as promoted by and deeply influenced by the USA copyright cabal (RIAA/MPAA).
I once made a YouTube video which somehow included a clip from some RT Russian TV bullshit show. (The show was in fact a direct ripoff of Gordon Ramsay's Hell's Kitchen, which I'm sure they did not get a license for.)
Some fucking Russian troll bots then DMCA'd my YouTube video for using their clip, even though it was clearly "fair use" in US jurisdiction, and YouTube happily sucked their russian dicks and flagged and removed my video.
And my video had probably 15 views, like it wasn't a big thing.
So they aren't exactly the Robin Hood of free speech.
Of course they aren't; they will happily block information that they dislike because it's embarrassing and incriminating to them. Skepticism should cut both ways: be skeptical of those who use Russian connections to delegitimize valuable tools and the people associated with them, and skeptical of why Russia allows those things to persist provided they impact Western countries but not Russia.
Until the Western copyright situation is amended to something reasonable, we have to be skeptical in all aspects of this situation. I'd rather copyright were a reasonable length with reasonable policies so organizations didn't have to resort to connections with Russia. In the meantime we have to work with the situation we have.
Not sure how this says anything about Russian copyright laws or Russian government.
Ironically, when Russia was joining the World Trade Organization in early 2010s, one requirement was for them to do something about pirate sites, namely torrent-sharing ones. So iirc the domain torrents.ru was taken away from what is now called RuTracker, and they blocked many other sites, which stay blocked to this day.
hey thanks, i had never heard of that bypass paywalls firefox addon
There's also a version for Chrome if you swing that way.
I do not, because I don't like ads on YouTube, but thx.
I don't think the issue is paywalls. I think the issue is the personal actions of the owner. I also really don't think Russia plays into this. Again, the personal actions of the owner of archive[.]today were the reason it was removed. The site was used by the owner to personally attack someone.
Is your comment in the thread about Wikipedia banning archive.today?
edit: I realised by reading other comments that many used archive.today to bypass paywalls, aside from the archival purpose Wikipedia relied on.
Original post title was:
Until further notice: archive.today/archive.is/archive.ph/… is banned from this community for apparently being a Russian DDOS tool
And linked to the /c/ukraine community which posted it.
Also, from the Ars story:
Patokallio wasn't able to determine who runs Archive.today but mentioned apparent aliases such as "Denis Petrov" and "Masha Rabinovich," and described evidence that the site is operated by someone from Russia.
The reason it matters:
It makes people suspicious of anything hosted in Russia, which is frustrating because there's a lot of valuable shit hosted there by people who are not necessarily from there, such as Alexandra Elbakyan, founder of Sci-Hub, who has had many accusations tossed her way due to her website's association with Russia:
In December 2019, The Washington Post reported that Elbakyan was under investigation by the US Justice Department for suspected ties to Russia's military intelligence arm, the GRU, in a scheme to steal U.S. military secrets from defense contractors. Elbakyan has denied this, saying that Sci-Hub "is not in any way directly affiliated with Russian or some other country's intelligence," but noting that "of course, there could be some indirect help. The same as with donations, anyone can send them; they are completely anonymous, so I do not know who exactly is donating to Sci-Hub. There could be some help that I'm simply unaware of. I can only add that I write all of Sci-Hub code and design myself and I'm doing the server's configuration."
We cannot take for granted that one of the reasons we have access to a large amount of archived information on the internet is that unsavory countries refuse to play by the US government's copyright rules.
We also cannot ignore how connections with those countries are used to delegitimize people providing valuable services. Bypass Paywalls Clean in particular has had a litany of people assume it's untrustworthy because of its current hosting situation, because they don't know its history and how it's been kicked off of every other public repository that was stateside.
The archive.today person fucked things up and gave people more ammunition to claim that anything and everything associated with Russian internet is untrustworthy.
I don't see a possible connection of archive.today to someone based in Russia as relevant.
The only facts that should be relevant are that the manager of it is an egomaniac and cannot be trusted.
Good reminder to pay for journalism.
The Guardian, Le Monde, El País, Tageszeitung and many others need subscribers to stay independent of the oligarchs.
Also remember the journalists that need support the most are local papers and news stations. The big ones have plenty of donors, and while it's worth the support, they are less likely to completely collapse than the news that is run in your city.
Go look for that independent source. They will report more news that actually affects you as well.
guardian is surviving by slowly becoming a tabloid. not sure if i would have paid for it anyway, and i'm not sure if this was preventable by paying for it in the first place.
yeah and they're also transphobic af as a policy. don't give them a damn cent
https://www.buzzfeed.com/patrickstrudwick/guardian-staff-trans-rights-letter
can also find more stuff by just looking up "the guardian transphobia"
I appreciate the guardian a lot more than I did before, now that someone gave me a nytimes subscription and I've seen how bad they are now. For the guardian's faults, they do still break some stories and cover the news somewhat comprehensively, perhaps better than the times, which is too busy trying to cover for Israel to even report honestly on epstein and apparently surrendered to the administration besides.
Paying for journalism simply means that those who don't pay for it don't get it, i.e. more paywalls, not fewer.
So what you're saying is, if we refuse to pay for journalism long enough, the journalists will eventually give up and just work for free? Not have to travel for their investigations, eat nothing, and need no private home?
Democracy isn't possible without an independent press.
Epstein was prosecuted because the frigging Miami Herald reported on his abuses in 2018. He would have continued raping and trafficking kids for who knows how long without that. In a world where the media is owned by Epstein, that won't happen.
what democracy? every person in the leadership of america and most of the world were either friends with epstein or on his payroll.
They're already mostly owned by and working for ultra-rich interests. There have been plenty of outlets over the years that had paying users; they're mostly owned at this point. Those that aren't are getting quite click-baity.
Capitalism is hard on news. Fascism is worse.
It's not our fault the media decided to switch to a subscription model while not providing a product worth paying a subscription for, even before they downgrade it every year.
Itโs a problem, but one of their own making.
I haven't said that journalists have to work for free. Just that we don't have to be the ones trickling out money to feed them. It doesn't have to be "poors vs workers", unlike what the media is telling you, ya know? A better system is possible.
Huh, I don't get that argument. To me, it seems that citizens paying journalists is desirable. I'm genuinely curious, who else should pay them in your view?
It could be the citizens but done indirectly, for example via taxes. Even better, not all citizens: just tax the rich and put the money into a journalism pool, so the rich can't choose to benefit any particular newspaper or editorial line.
Paying for journalism is ideal, but unfortunately makes it difficult to cite/link to a source the way Wikipedia needs as a way to ensure the information remains open and accessible.
Admittedly, I'm not familiar enough with these outlets to know if those paywalls are significant, but the problem with direct article links is that those links can change. Archival services (I suppose not archive[.]is) are important for ensuring those articles remain accessible in the format they were presented in.
I've come across a number of older Wikipedia articles about more minor or obscure events where links lead to local news outlet websites that no longer exist or were consumed by larger media outlets and as a result no longer provide an appropriate citation.
For anyone curious, I looked into the DDOSing, and what was done is a simple string of JavaScript was added to archive[.]today that made a background request to the blog with a randomly generated search parameter. Every time someone looked at an archive, they unknowingly sent a request to the blog under attack.
Good reminder to donate to web.archive.org
While archive.org is good and more trustworthy than archive.is, it isn't as useful for bypassing paywalls.
But Wikipedia doesn't need to bypass paywalls, and you can bypass them yourself with a bit of work.
There are websites with paywalls that even Bypass Paywalls Clean can't bypass. In the cases where it can, it sometimes just fetches the article contents from archive.today.
That doesn't mean an alternative shouldn't be found, but we also shouldn't pretend that nothing is being lost by losing access to unpaywalled sources. For practical purposes, a paywalled source means no source for most readers, unless a non-paywalled alternative can be found to replace it.
That's good for you, and it is okay for you to use archive.today personally, as long as you block their DDoSing.
But Wikipedia does not need to bypass paywalls, and they don't require the source to be freely (or easily) viewable to verify the info.
I'm still deciding how much I agree or disagree with this. It's true that they do cite books which you often can't read online, but adding information backed up by a paywalled proof feels a bit "trust me bro". E.g. I could find/create a site with an impossibly large paywall and no one would realistically be able to check my sources.
I do hope this move results in more support for the IA/Wayback Machine and helps them update some of their crawler tech. Thanks to the rise of AI, some sites are effectively (through captchas etc.) or actively (through straight-up greed [coughRedditcough]) blocked from being archived almost entirely, which is frustrating for legit archivists/contributors.
This is understandable, but at the same time, none of the anti-paywall lists are as good as archive.today. They actually have paid accounts at a bunch of paywalled sites, and use them when scraping.
Unfortunately, they've allegedly modified the contents of some archived articles, so even though they may do a better job of archiving, nothing archived is of any value because it cannot be trusted.
What if somebody used archive.today to bypass a paywall and then archived that using Web Archive? (So we're sure the content stays the same)
They're injecting data into the sites during archiving, so that wouldn't work.
So are they removing all other websites that post lies or modify their articles to suit their narrative at times?
Fox news? MSN? CNN? BBC? Reuters? AP?
Why the sudden urge to validate the archives? How many articles have been proven to be modified?
Seems like they've been wanting to remove an entity the empire doesn't control and they're using this as a cover to do it.
that's exactly one of the main reasons they use archive sites for citations. but when an archival site does that it becomes useless.

If this is not an announcement, Lemmy lets you edit your post titles so you can correct that mistake instead of luring in people who think lemmy.world is also banning links using archive.today.
I'm not speculating on your intent, only pointing out that you can correct this situation instead of apologizing after the fact.
https://lemmy.world/c/ukraine was where i saw this. i didn't write it. thought lemmy would have linked to the original, was wrong. FYI
How does the paywall circumvention of archive.today work?
I guess that they genuinely owned subscriptions for popular paywalled sites.
It identifies itself as a Google (or other) crawler, which sites often allow and give the full content to, for better SEO.
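For illustration, here is a minimal sketch of that crawler-impersonation pattern. This is a guess at the general technique described above, not archive.today's actual code; `fetch_as_crawler` is a hypothetical helper, and the Googlebot User-Agent string is the publicly documented one.

```python
import urllib.request

# Publicly documented Googlebot User-Agent string. Many paywalled sites
# serve full article text to search crawlers for SEO purposes, so a
# fetcher presenting this UA may receive the unpaywalled page.
GOOGLEBOT_UA = (
    "Mozilla/5.0 (compatible; Googlebot/2.1; "
    "+http://www.google.com/bot.html)"
)

def fetch_as_crawler(url: str) -> bytes:
    """Fetch a page while identifying as a search-engine crawler."""
    req = urllib.request.Request(url, headers={"User-Agent": GOOGLEBOT_UA})
    with urllib.request.urlopen(req, timeout=30) as resp:
        return resp.read()
```

Worth noting: many sites now verify claimed crawlers by reverse-DNS lookup of the requesting IP, so a spoofed User-Agent alone often isn't enough, which is consistent with the guess elsewhere in this thread that archive.today also holds real paid subscriptions.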
I've switched to .md when the community mentioned something was up with the .today domain. Hopefully that one isn't compromised.
It's the same person running all of them, so yeah, it is.
Damn.
URL
archive[.]today
archive[.]fo
archive[.]is
archive[.]li
archive[.]md
archive[.]ph
archive[.]vn
archiveiya74codqgiixo33q62qlrqtkgmcitqx5u2oeqnmn5bpcbiyd[.]onion

Democracy died in daylight; the darkness hides the rotten body.
It's quite possible it never got out of the planning stages intact.
Or ever made it into planning?
Everyone seems to be ignoring the fact that he only did this in response to a malicious dox attempt.
He only modified archived pages in response to a dox attempt?
And the thing is, the discovery of the modified pages revealed that it wasn't even the first time he'd modified pages. And he used a real person's identity to try and shift blame.
Irrespective of the doxxing allegations, if he's done all this multiple times already, it means the page archives can't be trusted AND there's no guarantee that anything archived with the service will be available tomorrow.
Seems like we need to switch to URLs that contain the SHA256 of the page they're linking to, so we can tell if anything has changed since the link was created.
Actually a pretty good idea.
Only works for archived pages though, because for any regular page, a large portion of the page will be dynamically generated; hashing the HTML will only say the framework hasn't changed.
You would need a way of verifying that the SHA256 is a true copy of the site at the time, though, and not a faked page. You could have a distributed network of archives that coordinate archival at the same time and then, using the SHA256, see which archives fetched exactly the same page at the same time through some search functionality. If addons are already being used for the crawling, we may be mostly there already, since those addons would just need to certify their archive, after which they can discard the actual copy of the page. You'd need a way to validate those workers, though, since a bad actor could run a whole bunch at the same time to legitimise a fake archival.
The idea is to verify the archival copy's URL, not to verify the original content. So yes, a server could push different content to the archiver than to people, or vary by region, or an AitM could modify the content as it goes out to the archiver. But adding the sha256 in the URL query parameter means that if someone publishes a link to an archive copy online, anyone else using the link can know they're looking at the same content the other person was referencing.
If the archive content changes, that URL will be invalid; if someone uses a fake hash, the URL will be invalid (which is why MD5 wouldn't be appropriate).
The beauty of this technique is that query parameters are generally ignored if unsupported by the web server, so any archival service could start using this technique today, and all it would require is a browser extension to validate the parameter.
Link it to something like Web of Trust, and you've solved the separate issue you described.
In fact, this is a feature WoT could add to their extension today, and it would "Just Work". For that matter, Archive.org could add it to their extension today, too.
Remind me to ping Jason about that.
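The hash-in-the-URL scheme described above fits in a few lines. This is a hypothetical illustration of the thread's proposal: the `sha256` query-parameter name and the `link_with_hash`/`verify_link` helpers are invented for the sketch, not an existing standard or any archive's API.

```python
import hashlib
from urllib.parse import parse_qs, urlparse

def link_with_hash(archive_url: str, page_body: bytes) -> str:
    """Append the SHA-256 of the archived page body as a query parameter."""
    digest = hashlib.sha256(page_body).hexdigest()
    sep = "&" if "?" in archive_url else "?"
    return f"{archive_url}{sep}sha256={digest}"

def verify_link(url: str, page_body: bytes) -> bool:
    """Recompute the body's hash and compare it to the one in the URL."""
    params = parse_qs(urlparse(url).query)
    claimed = params.get("sha256", [""])[0]
    return claimed == hashlib.sha256(page_body).hexdigest()
```

Since web servers generally ignore unrecognized query parameters, a link produced this way still resolves normally; only a browser extension that knows the convention would check the hash, which matches the "could start using this today" point above.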
Seems like we need to switch to URLs that contain the SHA256 of the page they're linking to, so we can tell if anything has changed since the link was created.
IPFS says hi
Yes; the problem IPFS has is the same problem IPv6 has.
The hash-in-a-URL solution can function cleanly in the background on top of what people already use.
IPFS has gateways though, so you can link to the latest version of a page, which can be updated by the owner, or alternatively link to a specific revision of the page that is immutable and can't be forged.
It wasn't a dox attempt though. The blog just collected information that was already publicly available on other sites.
Unfortunately, they shot themselves in the foot by responding the way they did. They basically did the job of anyone who wants them taken down and not trusted. It was probably the worst way they could have reacted. Such a tragedy to lose such a valuable website.
As they should since it doesnโt matter.
Yeah, someone being shitty to you doesn't mean you go full-fledged shitty in return; it kind of proves your lack of trustworthiness to begin with. It's like Nazis being like "leftists were mean to me by explaining how my politics made me a Nazi, so I'm gonna show them by Nazi-ing even harder! They forced me to be like this!" It kind of betrays the argument that the reason you got that way was because leftists were mean to you.
Who cares why they did it?
It proves they can and do alter the "archived" website, so its usefulness as a source is completely gone.
Archiving a site inherently requires altering it, to change embed URLs, scripts, etc. The fact they had that capability was never in question.
Yeah, ESH. His response of editing an archive showed the site to be unreliable as an archive. DDOSing from the site as a counter to the dox attempt caused the site serious reputational harm as well.
It sucks because his site was actually more reliable than The Internet Archive.
Bro any archiving/scraping tool can be used for ddos u just tell it to archive the same site over and over and now u have a different IP spamming the endpoint
In this case, their CAPTCHA page intentionally included code to DoS a particular blog, sending a request to search for a random string every 300ms (search is very CPU-intensive). This was regardless of the archived site you were trying to view.
Any good archiver will check for an archived copy before making a request, and batch requests. This was very different from the attack you're imagining: if you opened any archive.today page, it would poll a developer's personal blog, regardless of whether you were interacting with content from that blog.
don't know all the details. fyi basically. i forget where i saw the same site mentioned for the same thing. don't call me bro, Bro