00:04:56Wohlstand (Wohlstand) joins
00:43:20etnguyen03 quits [Client Quit]
01:03:55Shard11151 (Shard) joins
01:04:37Shard1115 quits [Ping timeout: 272 seconds]
01:04:37Shard11151 is now known as Shard1115
01:07:23etnguyen03 (etnguyen03) joins
01:07:25tekulvw (tekulvw) joins
01:12:13tekulvw quits [Ping timeout: 272 seconds]
01:34:04<nicolas17>I think classification.gov.au banned my IP
01:40:05<Flashfire42>Maybe, or it's just that it's an Australian government site and they are awful
01:52:27etnguyen03 quits [Client Quit]
01:57:57Wohlstand quits [Client Quit]
02:09:09Webuser987308 joins
02:12:27Webuser987308 quits [Client Quit]
02:13:01nine quits [Quit: See ya!]
02:13:11nine joins
02:13:11nine quits [Changing host]
02:13:11nine (nine) joins
02:32:25tekulvw (tekulvw) joins
02:37:05tekulvw quits [Ping timeout: 272 seconds]
02:40:15cyanbox joins
02:41:53iPwnedYourIOTSmartdog quits [Quit: Ping timeout (120 seconds)]
02:42:09iPwnedYourIOTSmartdog joins
02:45:25etnguyen03 (etnguyen03) joins
03:03:02<nukke>https://www.tomshardware.com/video-games/390tb-video-game-archive-being-taken-offline-due-to-skyrocketing-ram-ssd-and-hard-drive-prices-ai-driven-supply-squeeze-results-in-closure-of-one-of-the-largest-online-video-game-archives
03:06:10Webuser166457 quits [Quit: Ooops, wrong browser tab.]
03:06:59ericgallager joins
03:09:23PredatorIWD quits [Ping timeout: 272 seconds]
03:40:16<nicolas17>nukke: https://bsky.app/profile/textfiles.com/post/3mfut5w3dd22m
03:51:01<nexussfan>> Can preservation only happen at the internet archive?
03:51:37tekulvw (tekulvw) joins
03:55:33etnguyen03 quits [Remote host closed the connection]
03:56:33tekulvw quits [Ping timeout: 268 seconds]
03:58:57steering wants to start archiving irc
03:59:14<pabs>https://wiki.archiveteam.org/index.php/IRC/Logs :)
03:59:21<steering>by which i mean, an open database of masscan-and-connect-and-list-and-lusers-and-stuff
04:00:48<pabs>oh, like netsplit.de?
04:05:21nexussfan quits [Quit: Konversation terminated!]
04:06:52nexussfan (nexussfan) joins
04:13:13<steering>ye sorta
04:13:45<steering>more like https://www.ircstats.org/data but with more raw data
04:19:04cyanbox quits [Remote host closed the connection]
04:19:26cyanbox joins
05:04:39n9nes quits [Ping timeout: 272 seconds]
05:05:35n9nes joins
05:17:06barry joins
05:21:54tekulvw (tekulvw) joins
05:26:35tekulvw quits [Ping timeout: 268 seconds]
06:09:01CYBERDEV quits [Remote host closed the connection]
06:09:18CYBERDEV joins
06:12:01sg-72 joins
06:14:19sg72 quits [Ping timeout: 272 seconds]
06:16:23nexussfan quits [Client Quit]
06:16:44sg72 joins
06:18:45sg-72 quits [Ping timeout: 272 seconds]
06:37:33tekulvw (tekulvw) joins
06:42:11tekulvw quits [Ping timeout: 272 seconds]
06:47:43barry quits [Client Quit]
06:48:12barry joins
06:56:45skyrocket quits [Quit: ZNC 1.9.0+deb2build3 - https://znc.in]
07:19:17mrfooooo joins
07:26:03Webuser563361 joins
07:37:26tekulvw (tekulvw) joins
07:42:21tekulvw quits [Ping timeout: 272 seconds]
07:57:23Webuser842443 quits [Quit: Ooops, wrong browser tab.]
08:14:01nine quits [Quit: See ya!]
08:14:14nine joins
08:14:15nine quits [Changing host]
08:14:15nine (nine) joins
08:29:59SootBector quits [Remote host closed the connection]
08:31:06SootBector (SootBector) joins
08:43:34egallager joins
08:43:35<eggdrop>[tell] egallager: [2026-02-04T02:26:59Z] <pabs> re that wesnoth forums post, there are a few non-404 .sf2 files on IA, found using the little-things ia-cdx-search tool https://transfer.archivete.am/cqUq9/www.freesf2.com-non-404-sf2-files.txt
08:45:03ericgallager quits [Ping timeout: 272 seconds]
08:47:00n9nes quits [Ping timeout: 268 seconds]
08:49:51n9nes joins
09:29:58Cornelius76 (Cornelius) joins
09:32:33Cornelius7 quits [Ping timeout: 272 seconds]
09:32:33Cornelius76 is now known as Cornelius7
09:35:08Cornelius70 (Cornelius) joins
09:35:35Cornelius7 quits [Read error: Connection reset by peer]
09:35:35Cornelius70 is now known as Cornelius7
10:02:27cyanbox_ joins
10:06:07cyanbox quits [Ping timeout: 272 seconds]
10:11:30n9nes quits [Ping timeout: 268 seconds]
10:11:33n9nes joins
10:23:49cyan_box joins
10:27:31cyanbox_ quits [Ping timeout: 268 seconds]
10:28:39cyanbox joins
10:29:32cyanbox_ joins
10:31:27cyan_box quits [Ping timeout: 272 seconds]
10:33:59cyanbox quits [Ping timeout: 272 seconds]
10:41:59cyan_box joins
10:45:23cyanbox_ quits [Ping timeout: 272 seconds]
11:01:13n9nes quits [Ping timeout: 272 seconds]
11:02:03n9nes joins
11:11:50evergreen56 quits [Quit: Bye]
11:12:32evergreen56 joins
11:33:51tekulvw (tekulvw) joins
11:34:47arch quits [Ping timeout: 272 seconds]
11:38:26tekulvw quits [Ping timeout: 268 seconds]
11:40:29driib97 quits [Ping timeout: 272 seconds]
11:51:18driib97 (driib) joins
12:00:02Bleo1826007227196234552220 quits [Quit: The Lounge - https://thelounge.chat]
12:02:47Bleo1826007227196234552220 joins
12:04:33BornOn420_ quits [Ping timeout: 272 seconds]
12:05:01BornOn420 (BornOn420) joins
12:11:10etnguyen03 (etnguyen03) joins
12:24:59missaustraliana joins
12:35:21etnguyen03 quits [Client Quit]
12:43:12etnguyen03 (etnguyen03) joins
12:47:10SootBector quits [Ping timeout: 240 seconds]
12:49:13SootBector (SootBector) joins
12:57:51etnguyen03 quits [Client Quit]
13:15:16<IDK>so no myrient project? :(
13:17:53<IDK>if it's concerns with DMCA, is it possible to do a project, but make the data inaccessible even through WBM?
13:26:10missaustraliana quits [Client Quit]
13:43:25<nicolas17>IDK: download speeds are like 200KB/s now that everyone and their dog is doing independent and uncoordinated archival for themselves
13:43:35<nicolas17>so a full-site project seems infeasible anyway
13:45:10xkey quits [Quit: WeeChat 4.8.1]
13:46:28Nekroschizofrenetyk joins
13:48:25<Nekroschizofrenetyk>Hi, https://forum.dobreprogramy.pl/ is being shut down today, an old Polish software forum. There's a job running on AB since yesterday but it will capture only a small section of the forum. Sharing here, maybe somebody would come up with some ideas to help save a little more of it. Sorry for spamming.
13:49:41Arcorann__ quits [Ping timeout: 272 seconds]
13:52:53PredatorIWD joins
13:52:53FiTheArchiver joins
13:53:16xkey (xkey) joins
13:53:38FiTheArchiver quits [Remote host closed the connection]
13:54:31<nicolas17>I have achieved the unthinkable and got a consistent list of classification.gov.au
13:54:38arch (arch) joins
14:02:10<nicolas17>no duplicate titles!
14:04:43<nicolas17>if anyone cares, here's the whole 19GiB of HTML https://transfer.archivete.am/14xFqu/classification-gov-au-pages.tar.zst
14:06:39tekulvw (tekulvw) joins
14:11:51tekulvw quits [Ping timeout: 272 seconds]
14:12:07<@arkiver>nicolas17: can we archive this with AB or similar? i did not yet look into it much
14:12:27<@arkiver>IDK: it's huge, 390 TB costs a lot. especially nowadays with higher hardware prices
14:12:52<@arkiver>i would like to do a project, but in that project i do not want to simply duplicate 200 TB (just throwing out a number) that is easily available elsewhere
14:13:04<@arkiver>so we'd need someone to look into what they have, and "how unique" it is
14:21:28nexussfan (nexussfan) joins
14:27:37adamus1red quits [Quit: SigTerm]
14:28:23adamus1red (adamus1red) joins
14:35:27Bleo1826007227196234552220 quits [Client Quit]
14:35:39Bleo1826007227196234552220 joins
14:39:57Wohlstand1 (Wohlstand) joins
14:40:31<nicolas17>hmmmm
14:40:47<nicolas17>I think AB would work actually
14:41:35<nicolas17>there's an akamai cookie and things break once it expires, but it seems a request without the cookie just gives you one?
14:42:10Bleo1826007227196234552220 quits [Client Quit]
14:42:19Wohlstand1 is now known as Wohlstand
14:42:25Bleo1826007227196234552220 joins
14:43:20<nicolas17>do !a< jobs retrieve all URLs in the list *before* recursing into links in them?
14:45:50<@arkiver>we need an AB-qualified person for that question ^
14:46:00<@arkiver>(i am not very experienced with AB)
14:46:54Bleo1826007227196234552220 quits [Client Quit]
14:47:09Bleo1826007227196234552220 joins
14:47:43<nicolas17>it looks like they don't only add new items at the beginning of the list, but also move items from elsewhere in the list to the beginning (I guess they modify items and the list is sorted by last modification)
14:48:00<nicolas17>so if they do such a change when we're halfway through the list, we could miss some
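The concern above (a listing sorted by last modification, where an item can jump to the front mid-crawl) can be sketched with a small simulation. This is only an illustration under assumed names: `paginate`, `bump_to_front` and the `title-N` entries are hypothetical and do not reflect how classification.gov.au actually serves its list.

```python
def paginate(listing, page_size, mutate_after_page=None, mutate=None):
    """Walk `listing` page by page by offset; optionally mutate it after one page."""
    seen = []
    page = 0
    while page * page_size < len(listing):
        start = page * page_size
        seen.extend(listing[start:start + page_size])
        if page == mutate_after_page and mutate is not None:
            mutate(listing)  # the site edits an entry while we are mid-crawl
        page += 1
    return seen


def bump_to_front(item):
    """Return a mutation that moves `item` to the start, as a 'last modified' sort would."""
    def mutate(listing):
        listing.remove(item)
        listing.insert(0, item)
    return mutate


titles = [f"title-{i}" for i in range(10)]  # assumed order: most recently modified first
crawled = paginate(list(titles), page_size=3,
                   mutate_after_page=0, mutate=bump_to_front("title-8"))

missed = sorted(set(titles) - set(crawled))
duplicated = sorted({t for t in crawled if crawled.count(t) > 1})
print("missed:", missed)
print("duplicated:", duplicated)
```

In this toy run the bumped entry is never fetched, and an entry near the page boundary is fetched twice, so an unexpected duplicate is a hint that the list shifted during the crawl.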
14:51:40<Guest>myrient++ ( #archiveteam )
14:51:46<Guest>myrient++
14:51:47<eggdrop>[karma] 'myrient' now has 5 karma!
14:51:49<Guest>there
14:53:31<kiska>arkiver: We seem to have a forum shutting down today? https://forum.dobreprogramy.pl/ From Nekroschizofrenetyk
14:58:23<Nekroschizofrenetyk>I'm sorry, I meant tomorrow
14:58:41<Nekroschizofrenetyk>2nd of March
14:59:44<kiska>Hrm... we have slightly more time I guess...
15:00:07<kiska>This the forum post for the shutdown? https://forum.dobreprogramy.pl/t/koniec-forum-to-forum-zostanie-wylaczone-2-marca-2026-roku-nowe-forum/667855
15:00:12<Nekroschizofrenetyk>yes
15:00:22<Nekroschizofrenetyk>I've found out only yesterday morning
15:01:19n9nes quits [Ping timeout: 268 seconds]
15:02:50n9nes joins
15:03:30<Nekroschizofrenetyk>I've manually archived first ca. 2150 URLs, the coverage by IA is pretty low - like maybe 15-25% of posts from 2004. I believe it's better with the <10 years-old posts. Archivebot has been running since yesterday (9s9anhxembmenhinblp4rl9di) but won't get near to archiving the forum
15:05:26<nicolas17>kiska: AB already running for dobreprogramy, but not fast enough
15:05:35susbaconhairman joins
15:05:50<kiska>Yeah which is why I highlighted arkiver as he and a few other people can start a DPOS
15:06:29<susbaconhairman>I'm sure this has been asked before, but are the Archive Team looking into preserving Myrient?
15:07:20<kiska>If we can get only the unique things that aren't on the IA possibly
15:07:26<kiska>Otherwise no
15:07:31<kiska>Also
15:07:31<kiska>myrient++
15:07:32<eggdrop>[karma] 'myrient' now has 6 karma!
15:08:58susbaconhairman quits [Client Quit]
15:19:14Wohlstand1 (Wohlstand) joins
15:19:35etnguyen03 (etnguyen03) joins
15:19:49Wohlstand quits [Ping timeout: 268 seconds]
15:19:50Wohlstand1 is now known as Wohlstand
15:22:47oxtyped quits [Ping timeout: 272 seconds]
15:38:45etnguyen03 quits [Client Quit]
15:39:31Nekroschizofrenetyk quits [Client Quit]
15:40:11tekulvw (tekulvw) joins
15:41:42<PC>there are some things on there that were removed from the IA, apparently
15:42:45oxtyped joins
15:42:56<PC>just from skimmin the landing page and FAQ, don't know which files exactly but there is an IA folder
15:43:28<PC>*skimming
15:43:32<PC>ah, there we go "Internet Archive: Various content that is at risk of being removed or was removed from the Internet Archive"
15:43:35etnguyen03 (etnguyen03) joins
15:43:42<PC>so https://myrient.erista.me/files/Internet%20Archive/
15:43:54<PC>not sure how that'd play with being archived
15:44:41<PC>seems that there's a romhacking.net archive on there? i remember that going down a bit ago, not sure if it's preserved elsewhere
15:44:57tekulvw quits [Ping timeout: 272 seconds]
15:55:56Webuser563361 quits [Quit: Ooops, wrong browser tab.]
16:04:18cyan_box quits [Read error: Connection reset by peer]
16:14:36hackbug quits [Remote host closed the connection]
16:16:52hackbug (hackbug) joins
16:17:26etnguyen03 quits [Client Quit]
16:17:46Wohlstand quits [Remote host closed the connection]
16:18:27etnguyen03 (etnguyen03) joins
16:26:32<justauser>We probably don't want to anger our founder.
16:26:54Nekroschizofrenetyk joins
16:27:03<justauser>https://minerva-archive.org/ looks serious at least - maybe you should join. Requires Discord account, though.
16:27:50<klea>I think by removed they mean darkened, which means they weren't deleted.
16:29:36etnguyen03 quits [Client Quit]
16:35:56nexussfan quits [Client Quit]
16:43:01<nicolas17>arkiver: it seems classification.gov.au might require HTTP2
16:46:37<klea>WARC--
16:46:38<eggdrop>[karma] 'WARC' now has 0 karma!
16:48:28<justauser>WARC++
16:48:30<eggdrop>[karma] 'WARC' now has 1 karma!
16:48:37<justauser>As if be have anything better.
16:48:43<justauser>s/be/we/
17:02:26<justauser>Going to restart the learn.redhat.com job, since it seemingly worked fine despite 202s.
17:02:29Nekroschizofrenetyk quits [Client Quit]
17:06:52hackbug quits [Remote host closed the connection]
17:07:42<justauser>Weird. Started going through correct pages, then finished too fast.
17:08:04<justauser>arkiver: So it didn't work.
17:09:40hackbug (hackbug) joins
17:12:39Webuser166442 joins
17:16:10nexussfan (nexussfan) joins
17:19:46tekulvw (tekulvw) joins
17:22:21MrMcNuggets (MrMcNuggets) joins
17:24:23tekulvw quits [Ping timeout: 272 seconds]
17:33:53ducky quits [Ping timeout: 272 seconds]
17:41:44<klea>Yeah that's true, we don't have something better.
17:48:52<egallager>so apparently the Minecraft PS3 source code leaked
17:48:58<nexussfan>I've heard
17:50:42Nekroschizofrenetyk joins
17:53:36ducky (ducky) joins
18:00:04etnguyen03 (etnguyen03) joins
18:03:39BlankEclair quits [Ping timeout: 272 seconds]
18:05:08BlankEclair (BlankEclair) joins
18:06:00tekulvw (tekulvw) joins
18:10:38tekulvw quits [Ping timeout: 268 seconds]
18:15:02etnguyen03 quits [Client Quit]
18:22:44<justauser>Opendiary seems to be fetching images now, not actual posts. They seem to be on S3.
18:23:13<justauser>Should we bump the rate limit? One day past the announced shutdown, but not yet dead.
18:23:52<justauser>Some people also posted *today*. If we had a hardcoded largest post, we missed that. /cc arkiver
18:24:01<nicolas17>if there's posts in the queue, we should probably prioritize them
18:24:27<nicolas17>if there's only images in the queue, we should ramp up rate limit a lot, S3 has endless capacity
18:24:31<justauser>I don't know if there are any, but seems like the transition was sharp.
18:24:58<justauser>So I guess it stopped doing todo and switched to :backfeed.
18:26:14<nicolas17>JAA imer: can you increase rate limit on opendiary? it seems there's only images left in queue, which are on S3/cloudfront
18:27:24<@imer>increased to 1k/min from 200
18:29:13<nicolas17>hm seems to be staying around 100/min
18:29:23<nicolas17>imer: hope you didn't typo that >.>
18:29:54<@imer>mh, might be a pattern limit. let's see
18:30:52<@imer>ah, ^user and ^tag are limited to 0
18:30:55<@imer>let me do some moving
18:31:13<nicolas17>huh
18:32:04tekulvw (tekulvw) joins
18:32:29<nicolas17>there may also be a ton of claims that need retrying btw
18:32:42<nicolas17>but I'm out of the loop on what item types we have etc
18:33:53<@imer>doing that^
18:36:28etnguyen03 (etnguyen03) joins
18:36:32<justauser>Looks good.
18:36:44<nicolas17>my docker container can't get images, my browser can, huh
18:37:13tekulvw quits [Ping timeout: 272 seconds]
18:39:16<nicolas17>success rate now 4%, what happened :|
18:40:11<justauser>Ouch. Something went wrong.
18:40:36<justauser>Did we have tons of normal items hiding in claims?
18:41:17<nicolas17>justauser: do you have a grabber running? is it getting anything?
18:41:28<justauser>Not on this project.
18:41:28<nicolas17>I'm getting asset/image items... which time out
18:43:05<justauser>Looks like we are trying to hammer opendiary.com now :/
18:43:13<justauser>But why?
18:43:30tekulvw (tekulvw) joins
18:44:01<@imer>mmh, it may have grabbed some from backfeed since i took the limit off briefly
18:44:20<@imer>!remindme 1h opendiary alive again?
18:44:20<eggdrop>[remind] ok, i'll remind you at 2026-03-01T19:44:20Z
18:45:07<nicolas17>why are image items not working for me :/
18:45:38<justauser>Looks like *some* images are not on S3.
18:45:51<justauser>E.g. https://www.opendiary.com/wp-content/uploads/avatars/556019/68297b7c08b56-bpthumb.png
18:46:03<nicolas17>2=0 https://files.opendiary.com/wp-content/uploads/2023/03/18092249/IMG_20230317_1026422-300x225.jpg
18:46:04<nicolas17>Server returned bad response. Sleeping 4 seconds.
18:46:06<nicolas17>3=0 https://files.opendiary.com/wp-content/uploads/2023/03/18092249/IMG_20230317_1026422-300x225.jpg
18:46:07<nicolas17>Server returned bad response. Sleeping 7 seconds.
18:46:09<nicolas17>yet they work in the browser
18:46:25<justauser>Wow. Some DoS defence on Amazon side?
18:50:36<nicolas17>for a second I thought "or my docker networking is fucked"
18:50:45<nicolas17>but of course that's not it, since it can connect to the tracker -_-
18:52:30<justauser>cURL + Tor = response, so nothing obvious.
18:56:51tekulvw quits [Ping timeout: 272 seconds]
19:08:14etnguyen03 quits [Client Quit]
19:09:08tekulvw (tekulvw) joins
19:10:11<pokechu22>nicolas17: Yes, !a < list jobs grab all URLs in the list before grabbing links from those URLs (and it recurses into links in the order that the original URLs were retrieved - roughly the order of the list, barring concurrency stuff)
19:10:45<pokechu22>(for !a < list with custom sitemap, which is probably what's needed there, URLs from the sitemap are grabbed in a random order, but it's still everything from the sitemap before any further links from URLs in the sitemap)
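The ordering described above (every listed URL first, then links discovered from them, in the order their parents were fetched) behaves like a plain FIFO queue. The sketch below is not ArchiveBot's code; `crawl_order`, the `links` map and the example.org URLs are hypothetical, and concurrency is ignored.

```python
from collections import deque


def crawl_order(seed_urls, links):
    """Return the fetch order: all seeds first, then discovered links, first in first out."""
    queue = deque(seed_urls)
    queued = set(seed_urls)
    order = []
    while queue:
        url = queue.popleft()
        order.append(url)
        # Links found on this page go to the back of the queue,
        # behind any seed URLs that have not been fetched yet.
        for child in links.get(url, []):
            if child not in queued:
                queued.add(child)
                queue.append(child)
    return order


# Hypothetical link graph standing in for real pages.
links = {
    "https://example.org/a": ["https://example.org/a/1", "https://example.org/b"],
    "https://example.org/b": ["https://example.org/b/1"],
}
order = crawl_order(["https://example.org/a", "https://example.org/b"], links)
print(order)
```

Both seeds come out before either discovered page, which is the property that matters when the seed list itself can change underneath the job.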
19:11:31Island joins
19:14:09tekulvw quits [Ping timeout: 268 seconds]
19:37:14etnguyen03 (etnguyen03) joins
19:39:10<justauser>Oh. Here's a theory. We reclaimed everything, including some 12+ KURLs of former todo, which is opendiary.com. Warriors rushed to claim it again, with no limit, and we killed the server.
19:40:22<justauser>So I guess we'll need the limit for a bit longer, unless we give up on the main domain and rush to get the images instead.
19:44:21<eggdrop>[remind] imer: opendiary alive again?
20:03:37<nicolas17>justauser: opendiary images are going great now... still not working for *me* :/
20:17:48<nicolas17>aaaaa I forgot to check at the right time
20:18:33<nicolas17>!remindme "18:50 UTC" check when classification.gov.au changes
20:18:33<eggdrop>[remind] error: "18:50 UTC" (parsed as 1772391000 → 2026-03-01T18:50:00Z) is in the past
20:18:39<nicolas17>!remindme "tomorrow 18:50 UTC" check when classification.gov.au changes
20:18:40<eggdrop>[remind] ok, i'll remind you at 2026-03-02T18:50:00Z
20:18:56<nicolas17>I think it's at 19:00 UTC
20:19:36<nicolas17>!remindme "tomorrow 18:50 UTC" current c.gov.au result count 3847828
20:19:37<eggdrop>[remind] ok, i'll remind you at 2026-03-02T18:50:00Z
20:35:19fionera quits [Read error: Connection reset by peer]
20:35:19<nicolas17>pokechu22: curl --http1.1 'https://www.classification.gov.au/' --compressed -H 'User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:140.0) Gecko/20100101 Firefox/140.0' -H 'Accept-Language: en'
20:35:24<nicolas17>Accept-Language makes it work \o/
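A rough Python equivalent of the curl command above, using only the standard library. `build_request` and `fetch` are hypothetical helper names; `urllib` speaks HTTP/1.1, and unlike `--compressed` this sketch does not ask for a compressed response, so no decompression step is needed.

```python
import urllib.request

USER_AGENT = ("Mozilla/5.0 (X11; Linux x86_64; rv:140.0) "
              "Gecko/20100101 Firefox/140.0")


def build_request(url):
    """Build a GET request carrying the two headers from the curl command."""
    return urllib.request.Request(url, headers={
        "User-Agent": USER_AGENT,
        "Accept-Language": "en",  # the header reported to make the site respond
    })


def fetch(url, timeout=30):
    """Send the request and return (status, body). Performs network I/O."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        return resp.status, resp.read()


req = build_request("https://www.classification.gov.au/")
# urllib stores header names in capitalized form, e.g. "Accept-language".
print(req.header_items())
# To actually fetch: status, body = fetch("https://www.classification.gov.au/")
```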
20:36:12<pokechu22>qwarc or grab-site should definitely be able to do that (qwarc might make more sense if the cookie needs to be repeatedly invalidated, but I could totally see akamai being jank enough that the cookie only needs to be invalidated in real browsers...)
20:49:58fionera joins
20:49:58fionera quits [Changing host]
20:49:58fionera (Fionera) joins
20:51:37_null quits [Quit: Connection closed]
20:52:22_null (_null) joins
21:02:02Sk1d joins
21:02:15szczot3k quits [Ping timeout: 272 seconds]
21:04:15<Dango360>what about archiving myrient's *content listings* html? (not the actual files themselves, the folder structure)
21:04:29szczot3k (szczot3k) joins
21:10:05lennier2_ quits [Ping timeout: 268 seconds]
21:21:15nine quits [Ping timeout: 272 seconds]
21:21:32lunik1 quits [Quit: :x]
21:21:55nine joins
21:21:56nine quits [Changing host]
21:21:56nine (nine) joins
21:22:12lunik1 joins
21:28:37Nekroschizofrenetyk quits [Quit: Ooops, wrong browser tab.]
21:31:23Sk1d quits [Ping timeout: 272 seconds]
21:39:38Sk1d joins
21:45:14kdy quits [Ping timeout: 268 seconds]
21:46:44kdy (kdy) joins
21:58:46Webuser581568 joins
21:58:54Webuser581568 quits [Client Quit]
22:12:38<nicolas17>Dango360: I fear for my bodily integrity if I do anything with Myrient without Jason Scott agreeing
22:12:50<nicolas17>but I agree
22:13:02<nicolas17>and afaik myrient.erista.me doesn't host any files
22:13:15<nicolas17>all files redirect to another erista hostname
22:13:40<nicolas17>so we just need to not follow redirects
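One way to fetch only the listing pages while not following the file redirects is sketched below with the Python standard library. `NoRedirect` and `fetch_listing` are hypothetical names, and the sketch assumes file links answer with an ordinary 3xx status and a `Location` header, which the log does not confirm.

```python
import urllib.error
import urllib.request


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Redirect handler that declines every redirect."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None tells urllib not to follow; the 3xx response
        # is then raised as an HTTPError by the default error handler.
        return None


# Passing the subclass replaces the default redirect handler in the opener.
opener = urllib.request.build_opener(NoRedirect)


def fetch_listing(url, timeout=30):
    """Return (status, redirect_target, body); redirects are reported, not followed."""
    try:
        with opener.open(url, timeout=timeout) as resp:
            return resp.status, None, resp.read()
    except urllib.error.HTTPError as err:
        if err.code in (301, 302, 303, 307, 308):
            return err.code, err.headers.get("Location"), b""
        raise
```

Directory listings would come back with a body, while file URLs would come back as a status plus the target they point at, without any file data being downloaded.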
22:15:02lennier2_ joins
22:21:28Sk1d quits [Remote host closed the connection]
22:21:36Sk1d joins
22:35:40Webuser166442 quits [Quit: Ooops, wrong browser tab.]
22:36:03LddPotato quits [Read error: Connection reset by peer]
22:36:25Shard1115 quits [Ping timeout: 268 seconds]
22:37:11LddPotato (LddPotato) joins
22:38:37Shard1115 (Shard) joins
22:46:07Sk1d quits [Ping timeout: 272 seconds]
22:52:46Webuser799687 joins
22:52:46LddPotato quits [Read error: Connection reset by peer]
22:53:51LddPotato (LddPotato) joins
22:53:59Webuser799687 quits [Client Quit]
23:06:56LddPotato quits [Read error: Connection reset by peer]
23:08:29LddPotato (LddPotato) joins
23:09:55Arcorann__ (Arcorann) joins
23:17:43Goofybally quits [Killed (NickServ (GHOST command used by Goofybally9!~Goofyball@141.179.119.196))]
23:17:47Goofybally joins
23:19:24LddPotato quits [Read error: Connection reset by peer]
23:20:02LddPotato (LddPotato) joins
23:31:26DopefishJustin quits [Remote host closed the connection]
23:36:09LddPotato quits [Read error: Connection reset by peer]
23:36:44DopefishJustin joins
23:37:42LddPotato (LddPotato) joins
23:38:58Sk1d joins
23:53:15sg72 quits [Ping timeout: 272 seconds]