00:00:28etnguyen03 (etnguyen03) joins
00:06:04<pabs>sounded like no replacement
00:06:14<pabs>"The kernel bugzilla server, Ryabitsev said, is "semi-dead", and has been for several years. He suggested that the time has come to simply get rid of it. That server is running bugzilla 5.2; upstream is up to 5.9, but there is no upgrade path to get there. If the bugzilla server is removed, he said, he would find a way to keep the existing history around, but it would not be possible to create new entries. There did not seem to be any opposition
00:06:15<pabs>to removing the bugzilla server (which has never been all that extensively used in the kernel community), but it will not happen immediately."
00:06:29<pabs>that's the only mention of it in the article
00:06:57<pabs>ah also a summary from the kernel.org infra guy https://lwn.net/ml/all/20251209-roaring-hidden-alligator-068eea@lemur
00:07:21<pabs>from there:
00:07:37<pabs>"question remains with what to replace bugzilla, but it's a longer discussion topic that I don't want to raise here; it may be a job for the bugspray bot that can extend the two-way bridge functionality to multiple bug tracker frameworks"
00:07:51<that_lurker>ok. Then it would be nice if they would allow an AB job with high concurrency to run through before the deletion :-)
00:09:10<pabs>yeah, asked for that. JAA saved bugzilla.kernel.org already in 2023, but there will be some newer comments/bugs/attachments I guess
00:11:46icedice (icedice) joins
00:35:14<nicolas17>yes this seems worth contacting
00:36:38nexussfan quits [Client Quit]
00:40:18nexussfan (nexussfan) joins
00:55:13<h2ibot>PaulWise edited Twitter (+516, add more details): https://wiki.archiveteam.org/?diff=58625&oldid=58559
00:58:55Dango360 (Dango360) joins
01:15:55<@arkiver>hexagonwin: bufftoon will be archived a bit close to the deadline
01:16:02<@arkiver>those expiring images are annoying
01:17:08matoro quits [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
01:17:22matoro joins
01:20:59SootBector quits [Remote host closed the connection]
01:21:07matoro quits [Client Quit]
01:21:32matoro joins
01:22:08SootBector (SootBector) joins
01:36:29azalea_sh_ quits [Ping timeout: 272 seconds]
01:46:39<pabs>TIL a BitTorrent based archiving group: https://sciop.net/
02:22:57azalea_sh__ (azalea_sh_) joins
02:24:12azalea_sh__ quits [Remote host closed the connection]
02:38:08<hexagonwin>arkiver: thanks. my attempt with browsertrix gave me 150GB in 13hrs and we have 26hrs left, so i guess it should be ok. please let me know if there's anything i can help with
02:38:56<@JAA>arkiver: Yeah, I think we can run that Amino stuff through AB, probably in a few parts in parallel. I'll take a closer look in a bit.
02:43:21<@JAA>pabs: Sounds good, thanks!
02:49:50azalea_sh_ (azalea_sh_) joins
03:00:32icedice quits [Client Quit]
03:12:45<azalea_sh_>https://transfer.archivete.am/10ilVi/amino_semi.md
03:12:46<eggdrop>inline (for browser viewing): https://transfer.archivete.am/inline/10ilVi/amino_semi.md
03:14:00<azalea_sh_>2 Parts, decompressed ~1.9G containing 22,569,396 URLs deduped against the previous URLs file (hopefully)
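A minimal sketch of the "deduped against the previous URLs file" step mentioned above, assuming both lists are plain text with one URL per line. The function name `dedupe_against` is hypothetical, and this is not necessarily how the list was actually produced. It keeps everything in memory, which may be heavy for tens of millions of URLs.

```python
def dedupe_against(new_urls, previous_urls):
    """Return new_urls without duplicates and without anything in previous_urls.

    Order of first appearance in new_urls is preserved.
    """
    seen = set(previous_urls)
    result = []
    for url in new_urls:
        if url not in seen:
            seen.add(url)
            result.append(url)
    return result
```

For lists that do not fit in memory, sorting both files and comparing them on disk would be the usual alternative.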
03:16:10<azalea_sh_>4am moment, I'll go sleep now but that should be all from the public and semi public subsets at least
03:17:04<azalea_sh_>also sciop really does look interesting, i thought of putting the DB dump on there but registering gave me a 5XX so i guess I'll just dump it somewhere else some time
03:19:50HP_Archivist quits [Quit: Leaving]
03:25:15<nicolas17>I was going to suggest shuffling the list before sending it to archivebot so that the requests are more scattered across the subdomains
03:25:30<nicolas17>but 85% of URLs are in pm1, so probably no point...
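A rough sketch of the check described above: measure how the URLs are spread across hostnames, then shuffle the list so consecutive requests hit different hosts. The function names and the example hostnames are hypothetical. As noted, shuffling helps little when one host dominates the list.

```python
import random
from collections import Counter
from urllib.parse import urlsplit


def host_share(urls):
    """Return {hostname: fraction of all URLs} for a list of URLs."""
    counts = Counter(urlsplit(u).hostname for u in urls)
    total = sum(counts.values())
    return {host: n / total for host, n in counts.items()}


def shuffled(urls, seed=None):
    """Return a shuffled copy of urls; pass a seed for a reproducible order."""
    rng = random.Random(seed)
    out = list(urls)
    rng.shuffle(out)
    return out
```

If `host_share` reports one hostname near 0.85, most neighbouring requests will still land on that host after shuffling.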
03:27:04<nicolas17>arkiver: should I AB?
03:27:14<nicolas17>being 22M URLs maybe I need to split it?
03:43:15PredatorIWD251 joins
03:45:03PredatorIWD25 quits [Ping timeout: 272 seconds]
03:45:03PredatorIWD251 is now known as PredatorIWD25
03:48:46<nicolas17>JAA: should I feed this aminoapps list into AB?
03:49:59<@JAA>nicolas17: Not as is, I think. See above, I'll take a closer look.
03:50:43<nicolas17>well it's gzipped, would need to merge and decompress (maybe recompress with zstd), but other than that...
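A minimal sketch of the "merge, decompress, and split" idea above, assuming the parts are gzip files with one URL per line. The function names, chunk size, and output naming are hypothetical. The output is written as gzip because zstd is not in the Python standard library on most versions; recompressing with zstd would need a third-party package or the `zstd` command-line tool.

```python
import gzip
from itertools import islice


def iter_lines(paths):
    """Yield non-empty lines from several gzip files, in the order given."""
    for path in paths:
        with gzip.open(path, "rt", encoding="utf-8") as fh:
            for line in fh:
                line = line.strip()
                if line:
                    yield line


def split_into_chunks(paths, chunk_size, out_prefix):
    """Merge the gzip parts and write chunks of at most chunk_size lines.

    Returns the list of output file paths that were written.
    """
    lines = iter_lines(paths)
    written = []
    index = 0
    while True:
        chunk = list(islice(lines, chunk_size))
        if not chunk:
            break
        out_path = f"{out_prefix}-{index:04d}.txt.gz"
        with gzip.open(out_path, "wt", encoding="utf-8") as out:
            out.write("\n".join(chunk) + "\n")
        written.append(out_path)
        index += 1
    return written
```

Only one chunk is held in memory at a time, so the chunk size rather than the total list size bounds memory use.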
04:20:31abirkill (abirkill) joins
04:21:33etnguyen03 quits [Remote host closed the connection]
04:37:11<TheTechRobo>Maybe transfer should have some form of abuse@ email address listed on the site so the phishing links can be reported by people who aren't part of AT?
04:38:02v01d quits [Ping timeout: 256 seconds]
04:38:58<that_lurker>There is a very nice and totally working contact us section :-P
05:03:43sg-72 quits [Remote host closed the connection]
05:04:21sg-72 joins
05:10:55<h2ibot>PaulWise edited Obstacles (+132, BasedFlare): https://wiki.archiveteam.org/?diff=58626&oldid=58584
05:15:01DogsRNice quits [Read error: Connection reset by peer]
05:54:44nexussfan quits [Quit: Konversation terminated!]
05:55:59nexussfan (nexussfan) joins
06:02:06nexussfan quits [Read error: Connection reset by peer]
06:02:09nexussfan (nexussfan) joins
06:02:26nexussfan quits [Client Quit]
06:08:03<h2ibot>Calmevening edited Android Applications (-1): https://wiki.archiveteam.org/?diff=58627&oldid=58492
06:08:04<h2ibot>Calmevening edited Android Applications (+1): https://wiki.archiveteam.org/?diff=58628&oldid=58627
06:16:04<h2ibot>Cooljeanius edited Social network (+10, /* List of social networks */ sometimes people…): https://wiki.archiveteam.org/?diff=58629&oldid=44189
06:21:04<h2ibot>Cooljeanius edited Social network (+129, /* List of social networks */ add some more): https://wiki.archiveteam.org/?diff=58630&oldid=58629
06:27:05<h2ibot>Cooljeanius edited Dealing with Cloudflare (+200, link to…): https://wiki.archiveteam.org/?diff=58631&oldid=58182
06:29:43sg-72 quits [Ping timeout: 272 seconds]
06:36:16khaoohs_ quits [Read error: Connection reset by peer]
06:36:50lennier2_ quits [Read error: Connection reset by peer]
06:36:56khaoohs_ joins
06:37:05lennier2_ joins
06:37:28beardicus quits [Quit: Ping timeout (120 seconds)]
06:37:42Snivy quits [Quit: Ping timeout (120 seconds)]
06:37:46beardicus (beardicus) joins
06:37:52kiska52 quits [Quit: Ping timeout (120 seconds)]
06:37:59Snivy (Snivy) joins
06:38:10kiska52 joins
06:38:28@dxrt quits [Remote host closed the connection]
06:38:52dxrt joins
06:38:54dxrt quits [Changing host]
06:38:54dxrt (dxrt) joins
06:38:54@ChanServ sets mode: +o dxrt
06:59:33Snivy quits [Client Quit]
06:59:49Snivy (Snivy) joins
07:03:06croissant_ quits [Ping timeout: 256 seconds]
07:07:43Dango360 quits [Ping timeout: 272 seconds]
07:09:33Dango360 (Dango360) joins
07:13:00Dango3600 (Dango360) joins
07:15:19Dango360 quits [Ping timeout: 272 seconds]
07:15:19Dango3600 is now known as Dango360
07:29:49mannie (nannie) joins
07:30:04<mannie>https://www.reddit.com/r/Archiveteam/comments/1pmw0fv/urgent_chinese_catholic_archive_facing_imminent/
07:30:31<mannie>Has any action already been taken to preserve this?
07:30:45<mannie>If needed, I can run it with archivebot.
07:32:11<pokechu22>It looks like nothing has been started for that yet. I can't tell quite how big it is though - hopefully archivebot would be enough
07:32:20mannie quits [Remote host closed the connection]
07:33:11<pokechu22>hmm, but https://www.wanyouzhenyuan.cn/index.php?m=music&c=album&id=97 uses e.g. https://www.chinacath.cn/api/v2/track/783 too
07:33:42mannie (nannie) joins
07:34:47<pokechu22>That's probably going to need some extra work because https://www.wanyouzhenyuan.cn/index.php?m=music&c=album&id=97 uses e.g. https://www.chinacath.cn/api/v2/track/783 - but an archivebot job should at least get things started.
07:36:33mannie quits [Remote host closed the connection]
07:39:18mannie (nannie) joins
07:41:17Myself quits [Ping timeout: 272 seconds]
07:49:00Shard7959 quits [Ping timeout: 256 seconds]
07:49:14mannie quits [Remote host closed the connection]
07:53:29Myself joins
08:01:36Shard7959 (Shard) joins
08:02:02SootBector quits [Remote host closed the connection]
08:03:08SootBector (SootBector) joins
08:26:16Wohlstand (Wohlstand) joins
08:28:37atphoenix__ (atphoenix) joins
08:30:03atphoenix_ quits [Ping timeout: 272 seconds]
08:38:17stepney141 quits [Ping timeout: 272 seconds]
08:41:01sg72 joins
08:41:24stepney141 (stepney141) joins
09:20:47<hexagonwin>could someone run this url on archivebot? the company went bankrupt: https://shop.buyzle.co.kr/int/communication/CompanyInfo.do?_method=initial
09:21:36<hexagonwin>it's an online shopping mall website, i don't think the whole website <https://www.buyzle.co.kr/malls/index.html#> is really worth grabbing
09:25:23<hexagonwin>maybe this announcement section <https://shop.buyzle.co.kr/communication/NewsListMgt.do?_method=form> is also worth grabbing for context, but it's not static :(
09:32:57twiswist quits [Quit: twiswist]
10:00:46rohvani quits [Quit: The Lounge - https://thelounge.chat]
10:02:04rohvani joins
10:13:20twiswist (twiswist) joins
10:48:09BearFortress quits []
10:56:01pedantic-darwin quits [Read error: Connection reset by peer]
10:56:17pedantic-darwin joins
10:58:13<cruller>Slashdot Japan, which shut down, has been drop-caught and fake sites have been created by abusing its archives :(
11:04:55<cruller>I can find only Japanese articles about it. https://internet.watch.impress.co.jp/docs/yajiuma/2071555.html
11:10:09VerifiedJ quits [Quit: The Lounge - https://thelounge.chat]
11:10:41VerifiedJ (VerifiedJ) joins
11:11:47<hexagonwin>i saved that buyzle.co.kr with browsertrix and here's its wacz, does someone know how to extract all the urls in it so it can be run in archivebot?
11:11:49<hexagonwin>https://transfer.archivete.am/vUl7L/interpark-manual-20251216105204-f64351e2-a48.wacz
11:11:49<eggdrop>inline (for browser viewing): https://transfer.archivete.am/inline/vUl7L/interpark-manual-20251216105204-f64351e2-a48.wacz
11:13:39<hexagonwin>cruller: both srad.jp and sourceforge.jp seem to show a generic domain parking page for me
11:13:47croissant joins
11:15:30<cruller>Slashdot Japan is the predecessor of srad.jp
11:15:59<cruller>Btw, I can extract the urls now.
11:17:47BearFortress joins
11:18:00TunaLobster quits [Quit: So long and thanks for all the fish]
11:19:07<cruller>Hmm, this is a bit different from the one made by Archiveweb.page. But it doesn't seem too difficult.
11:21:51TunaLobster joins
11:24:24atphoenix__ quits [Read error: Connection reset by peer]
11:25:01atphoenix__ (atphoenix) joins
11:25:43Ryz quits [Quit: Ping timeout (120 seconds)]
11:25:56kiska52 quits [Quit: Ping timeout (120 seconds)]
11:26:14kiska52 joins
11:26:27@dxrt quits [Remote host closed the connection]
11:26:27Ryz (Ryz) joins
11:26:51dxrt joins
11:26:53dxrt quits [Changing host]
11:26:53dxrt (dxrt) joins
11:26:53@ChanServ sets mode: +o dxrt
11:27:43<cruller>I just extracted the wacz, extracted the "index.cdx" files, and combined them into https://transfer.archivete.am/inline/e4ZWg/merged-index.cdx
11:43:34<cruller>Then I extracted the URLs starting with http and deduped them. https://transfer.archivete.am/inline/15LXRG/urls.txt
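The steps above can be sketched roughly as follows. This is not the exact procedure used here. It assumes the WACZ is a ZIP file whose index files live under `indexes/` and contain CDXJ lines (key, timestamp, then a JSON object with a `url` field), optionally gzip-compressed. The function name `urls_from_wacz` is hypothetical, and skipping URLs that contain `__wb_method=` is an assumption about how POST captures appear in the index.

```python
import gzip
import json
import zipfile


def urls_from_wacz(wacz_path):
    """Return the sorted, deduplicated http(s) URLs found in a WACZ's index files."""
    urls = set()
    with zipfile.ZipFile(wacz_path) as zf:
        for name in zf.namelist():
            if not name.startswith("indexes/"):
                continue
            raw = zf.read(name)
            if name.endswith(".gz"):
                raw = gzip.decompress(raw)
            for line in raw.decode("utf-8", errors="replace").splitlines():
                # The JSON block starts at the first "{" of a CDXJ line.
                brace = line.find("{")
                if brace == -1:
                    continue
                try:
                    record = json.loads(line[brace:])
                except json.JSONDecodeError:
                    continue
                url = record.get("url", "")
                # Keep only web URLs and skip rewritten POST captures.
                if url.startswith("http") and "__wb_method=" not in url:
                    urls.add(url)
    return sorted(urls)
```

Writing the result one URL per line gives a list in the same shape as urls.txt.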
11:52:15azalea_sh__ (azalea_sh_) joins
11:53:14<azalea_sh__>Hey there! Just noticed that the URLs were put to AB, thanks a bunch!
11:53:25<cruller>It's interesting that ArchiveWeb.page can't load interpark-manual-20251216105204-f64351e2-a48.wacz, but ReplayWeb.page can.
11:54:11azalea_sh__ quits [Remote host closed the connection]
12:03:09<cruller>For a small wacz file, you can load it into replayweb.page, go to the Resources tab, scroll all the way down, and copy everything. However, be careful with “__wb_method=POST”.
12:08:46<cruller>By the way, I once tried crawling a site with 1,000-10,000 pages using Brozzler, but it didn't work out and I gave up. I haven't even tried since then.
12:25:41<hexagonwin>sad :( a reliable browser based crawler would be very useful for quick grabs
12:28:34APOLLO03a quits [Read error: Connection reset by peer]
12:29:34APOLLO03 joins
12:30:22egallager quits [Quit: This computer has gone to sleep]
12:34:06<h2ibot>Cruller edited Alive... OR ARE THEY (+114, Add Kernel.org Bugzilla and 万有真原, Remove OSDN): https://wiki.archiveteam.org/?diff=58632&oldid=58623
12:34:36pabs quits [Ping timeout: 256 seconds]
12:35:56pabs (pabs) joins
12:45:21mls quits [Quit: Lost terminal]
12:51:51<cruller>JAA: Is the edit by 'Brad' still pending? May I help you?
12:52:56etnguyen03 (etnguyen03) joins
12:56:12<cruller>I've partially undone one of Brad's edits (https://wiki.archiveteam.org/index.php?title=Deathwatch&oldid=57769), and tomorrow is my day off.
12:57:06uuid leaves [Part]
12:57:47<cruller>Should sites listed in https://wiki.archiveteam.org/index.php?title=Deathwatch#Pining_for_the_Fjords_ (Dying) be moved to the appropriate section after the deadline? Of course it would be better to do so, but the effort seems out of proportion to the benefit.
12:59:39<cruller>Ugh, the machine translator sometimes injects (extra) spaces.
13:05:12<cruller>But manual translation usually produces mistakes.
13:05:13Webuser614683 joins
13:05:31<Webuser614683>Is anyone trying to archive Guilded before it shuts down on the 19th?
13:07:48<cruller>hexagonwin: You can also test https://github.com/iipc/warcaroo
13:10:27azalea_sh__ (azalea_sh_) joins
13:11:40<azalea_sh__>Webuser614683: what would it take to archive guilded? Is there an accessible index of servers there? And do you know what the ratelimits look like? If so, I could write a scraper probably
13:13:11<Webuser614683>The best way to find servers would most likely be to manually (or automatically, idk the Roblox ratelimits) check communities which display a Guilded link, e.g. https://www.roblox.com/communities/9250955/Plehlowlas-Studio
13:13:11<Webuser614683>I also have ZERO clue what the ratelimits are
13:13:41<Webuser614683>The search bar on https://www.guilded.gg/explore/servers/overview also seems to show most servers; you just have to search keywords for them
13:14:46azalea_sh__ quits [Read error: Connection reset by peer]
13:15:35azalea_sh__ (azalea_sh_) joins
13:15:51<azalea_sh__>I can definitely check later when I'm home
13:16:57<azalea_sh__>I'll log off IRC on this client because it sucks and I'm on a train (thank you Deutsche Bahn), I'll check the channel logs later though
13:17:00azalea_sh__ quits [Remote host closed the connection]
13:17:22@imer quits [Quit: Ping timeout (120 seconds)]
13:17:51imer (imer) joins
13:17:51@ChanServ sets mode: +o imer
13:19:01Webuser302865 joins
13:23:09<Webuser614683>azalea_sh_ https://www.guilded.gg/docs/api/http_rate_limits
13:28:46etnguyen03 quits [Client Quit]
13:29:49<cruller>Webuser614683 azalea_sh__ : Please note that discussions about guilded may also be taking place in #robloxd. (idk though)
13:49:59sec^nd quits [Remote host closed the connection]
13:50:18sec^nd (second) joins
13:53:10Dada joins