| 00:00:28 | | etnguyen03 (etnguyen03) joins |
| 00:06:04 | <pabs> | sounded like no replacement |
| 00:06:14 | <pabs> | "The kernel bugzilla server, Ryabitsev said, is ""semi-dead"", and has been for several years. He suggested that the time has come to simply get rid of it. That server is running bugzilla 5.2; upstream is up to 5.9, but there is no upgrade path to get there. If the bugzilla server is removed, he said, he would find a way to keep the existing history around, but it would not be possible to create new entries. There did not seem to be any opposition |
| 00:06:15 | <pabs> | to removing the bugzilla server (which has never been all that extensively used in the kernel community), but it will not happen immediately." |
| 00:06:29 | <pabs> | that's the only mention of it in the article |
| 00:06:57 | <pabs> | ah also a summary from the kernel.org infra guy https://lwn.net/ml/all/20251209-roaring-hidden-alligator-068eea@lemur |
| 00:07:21 | <pabs> | from there: |
| 00:07:37 | <pabs> | "question remains with what to replace bugzilla, but it's a longer discussion topic that I don't want to raise here; it may be a job for the bugspray bot that can extend the two-way bridge functionality to multiple bug tracker frameworks" |
| 00:07:51 | <that_lurker> | ok. Then it would be nice if they would allow an AB job with high concurrency to run through before the deletion :-) |
| 00:09:10 | <pabs> | yeah, asked for that. JAA saved bugzilla.kernel.org already in 2023, but there will be some newer comments/bugs/attachments I guess |
| 00:11:46 | | icedice (icedice) joins |
| 00:35:14 | <nicolas17> | yes this seems worth contacting |
| 00:36:38 | | nexussfan quits [Client Quit] |
| 00:40:18 | | nexussfan (nexussfan) joins |
| 00:55:13 | <h2ibot> | PaulWise edited Twitter (+516, add more details): https://wiki.archiveteam.org/?diff=58625&oldid=58559 |
| 00:58:55 | | Dango360 (Dango360) joins |
| 01:15:55 | <@arkiver> | hexagonwin: bufftoon will be archived a bit close to the deadline |
| 01:16:02 | <@arkiver> | those expiring images are annoying |
| 01:17:08 | | matoro quits [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.] |
| 01:17:22 | | matoro joins |
| 01:20:59 | | SootBector quits [Remote host closed the connection] |
| 01:21:07 | | matoro quits [Client Quit] |
| 01:21:32 | | matoro joins |
| 01:22:08 | | SootBector (SootBector) joins |
| 01:36:29 | | azalea_sh_ quits [Ping timeout: 272 seconds] |
| 01:46:39 | <pabs> | TIL a BitTorrent based archiving group: https://sciop.net/ |
| 02:22:57 | | azalea_sh__ (azalea_sh_) joins |
| 02:24:12 | | azalea_sh__ quits [Remote host closed the connection] |
| 02:38:08 | <hexagonwin> | arkiver: thanks. my attempt with browsertrix gave me 150GB in 13hrs and we've got 26hrs so i guess it should be ok. please let me know if there's anything i can help with |
| 02:38:56 | <@JAA> | arkiver: Yeah, I think we can run that Amino stuff through AB, probably in a few parts in parallel. I'll take a closer look in a bit. |
| 02:43:21 | <@JAA> | pabs: Sounds good, thanks! |
| 02:49:50 | | azalea_sh_ (azalea_sh_) joins |
| 03:00:32 | | icedice quits [Client Quit] |
| 03:12:45 | <azalea_sh_> | https://transfer.archivete.am/10ilVi/amino_semi.md |
| 03:12:46 | <eggdrop> | inline (for browser viewing): https://transfer.archivete.am/inline/10ilVi/amino_semi.md |
| 03:14:00 | <azalea_sh_> | 2 Parts, decompressed ~1.9G containing 22,569,396 URLs deduped against the previous URLs file (hopefully) |
| 03:16:10 | <azalea_sh_> | 4am moment, ill go sleep now but that should be all from the public and semi public subsets at least |
| 03:17:04 | <azalea_sh_> | also sciop really does look interesting, i thought of putting the DB dump on there but registering gave me a 5XX so i guess ill just dump it somewhere else some time |
| 03:19:50 | | HP_Archivist quits [Quit: Leaving] |
| 03:25:15 | <nicolas17> | I was going to suggest shuffling the list before sending it to archivebot so that the requests are more scattered across the subdomains |
| 03:25:30 | <nicolas17> | but 85% of URLs are in pm1, so probably no point... |
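A minimal sketch of the shuffle idea above, with a per-host count to check how skewed a list is before deciding whether shuffling helps. The function names and the `example.com` hosts are hypothetical and not part of any ArchiveBot tooling.

```python
import random
from collections import Counter
from urllib.parse import urlsplit


def host_shares(urls):
    """Return {hostname: fraction of the URLs that live on that host}."""
    counts = Counter(urlsplit(u).hostname for u in urls)
    total = sum(counts.values())
    return {host: n / total for host, n in counts.items()}


def shuffled(urls, seed=None):
    """Return a shuffled copy so consecutive requests are spread across hosts."""
    out = list(urls)
    random.Random(seed).shuffle(out)
    return out
```

If one host holds most of the list (85% in this case), the shuffled order still hits that host most of the time, which is why shuffling gains little here.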
| 03:27:04 | <nicolas17> | arkiver: should I AB? |
| 03:27:14 | <nicolas17> | being 22M URLs maybe I need to split it? |
| 03:43:15 | | PredatorIWD251 joins |
| 03:45:03 | | PredatorIWD25 quits [Ping timeout: 272 seconds] |
| 03:45:03 | | PredatorIWD251 is now known as PredatorIWD25 |
| 03:48:46 | <nicolas17> | JAA: should I feed this aminoapps list into AB? |
| 03:49:59 | <@JAA> | nicolas17: Not as is, I think. See above, I'll take a closer look. |
| 03:50:43 | <nicolas17> | well it's gzipped, would need to merge and decompress (maybe recompress with zstd), but other than that... |
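The merge-and-decompress step could be sketched roughly as follows, assuming each part is a plain gzip file with one URL per line. The function name and paths are hypothetical. Recompressing with zstd is left to an external tool, since the standard library used here does not cover it.

```python
import gzip


def merge_gzip_parts(part_paths, out_path):
    """Decompress each gzipped part in order into one plain-text file.

    Returns the number of lines written. A missing trailing newline at the
    end of a part is added so lines from adjacent parts never run together.
    """
    count = 0
    with open(out_path, "wb") as out:
        for path in part_paths:
            with gzip.open(path, "rb") as part:
                for line in part:
                    if not line.endswith(b"\n"):
                        line += b"\n"
                    out.write(line)
                    count += 1
    return count
```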
| 04:20:31 | | abirkill (abirkill) joins |
| 04:21:33 | | etnguyen03 quits [Remote host closed the connection] |
| 04:37:11 | <TheTechRobo> | Maybe transfer should have some form of abuse@ email address listed on the site so the phishing links can be reported by people who aren't part of AT? |
| 04:38:02 | | v01d quits [Ping timeout: 256 seconds] |
| 04:38:58 | <that_lurker> | There is a very nice and totally working contact us section :-P |
| 05:03:43 | | sg-72 quits [Remote host closed the connection] |
| 05:04:21 | | sg-72 joins |
| 05:10:55 | <h2ibot> | PaulWise edited Obstacles (+132, BasedFlare): https://wiki.archiveteam.org/?diff=58626&oldid=58584 |
| 05:15:01 | | DogsRNice quits [Read error: Connection reset by peer] |
| 05:54:44 | | nexussfan quits [Quit: Konversation terminated!] |
| 05:55:59 | | nexussfan (nexussfan) joins |
| 06:02:06 | | nexussfan quits [Read error: Connection reset by peer] |
| 06:02:09 | | nexussfan (nexussfan) joins |
| 06:02:26 | | nexussfan quits [Client Quit] |
| 06:08:03 | <h2ibot> | Calmevening edited Android Applications (-1): https://wiki.archiveteam.org/?diff=58627&oldid=58492 |
| 06:08:04 | <h2ibot> | Calmevening edited Android Applications (+1): https://wiki.archiveteam.org/?diff=58628&oldid=58627 |
| 06:16:04 | <h2ibot> | Cooljeanius edited Social network (+10, /* List of social networks */ sometimes people…): https://wiki.archiveteam.org/?diff=58629&oldid=44189 |
| 06:21:04 | <h2ibot> | Cooljeanius edited Social network (+129, /* List of social networks */ add some more): https://wiki.archiveteam.org/?diff=58630&oldid=58629 |
| 06:27:05 | <h2ibot> | Cooljeanius edited Dealing with Cloudflare (+200, link to…): https://wiki.archiveteam.org/?diff=58631&oldid=58182 |
| 06:29:43 | | sg-72 quits [Ping timeout: 272 seconds] |
| 06:36:16 | | khaoohs_ quits [Read error: Connection reset by peer] |
| 06:36:50 | | lennier2_ quits [Read error: Connection reset by peer] |
| 06:36:56 | | khaoohs_ joins |
| 06:37:05 | | lennier2_ joins |
| 06:37:28 | | beardicus quits [Quit: Ping timeout (120 seconds)] |
| 06:37:42 | | Snivy quits [Quit: Ping timeout (120 seconds)] |
| 06:37:46 | | beardicus (beardicus) joins |
| 06:37:52 | | kiska52 quits [Quit: Ping timeout (120 seconds)] |
| 06:37:59 | | Snivy (Snivy) joins |
| 06:38:10 | | kiska52 joins |
| 06:38:28 | | @dxrt quits [Remote host closed the connection] |
| 06:38:52 | | dxrt joins |
| 06:38:54 | | dxrt is now authenticated as dxrt |
| 06:38:54 | | dxrt quits [Changing host] |
| 06:38:54 | | dxrt (dxrt) joins |
| 06:38:54 | | @ChanServ sets mode: +o dxrt |
| 06:59:33 | | Snivy quits [Client Quit] |
| 06:59:49 | | Snivy (Snivy) joins |
| 07:03:06 | | croissant_ quits [Ping timeout: 256 seconds] |
| 07:07:43 | | Dango360 quits [Ping timeout: 272 seconds] |
| 07:09:33 | | Dango360 (Dango360) joins |
| 07:13:00 | | Dango3600 (Dango360) joins |
| 07:15:19 | | Dango360 quits [Ping timeout: 272 seconds] |
| 07:15:19 | | Dango3600 is now known as Dango360 |
| 07:29:49 | | mannie (nannie) joins |
| 07:30:04 | <mannie> | https://www.reddit.com/r/Archiveteam/comments/1pmw0fv/urgent_chinese_catholic_archive_facing_imminent/ |
| 07:30:31 | <mannie> | Has any action already been taken to preserve this? |
| 07:30:45 | <mannie> | If needed, I can run it with archivebot. |
| 07:32:11 | <pokechu22> | It looks like nothing has been started for that yet. I can't tell quite how big it is though - hopefully archivebot would be enough |
| 07:32:20 | | mannie quits [Remote host closed the connection] |
| 07:33:11 | <pokechu22> | hmm, but https://www.wanyouzhenyuan.cn/index.php?m=music&c=album&id=97 uses e.g. https://www.chinacath.cn/api/v2/track/783 too |
| 07:33:42 | | mannie (nannie) joins |
| 07:34:47 | <pokechu22> | That's probably going to need some extra work because https://www.wanyouzhenyuan.cn/index.php?m=music&c=album&id=97 uses e.g. https://www.chinacath.cn/api/v2/track/783 - but an archivebot job should at least get things started. |
| 07:36:33 | | mannie quits [Remote host closed the connection] |
| 07:39:18 | | mannie (nannie) joins |
| 07:41:17 | | Myself quits [Ping timeout: 272 seconds] |
| 07:49:00 | | Shard7959 quits [Ping timeout: 256 seconds] |
| 07:49:14 | | mannie quits [Remote host closed the connection] |
| 07:53:29 | | Myself joins |
| 08:01:36 | | Shard7959 (Shard) joins |
| 08:02:02 | | SootBector quits [Remote host closed the connection] |
| 08:03:08 | | SootBector (SootBector) joins |
| 08:26:16 | | Wohlstand (Wohlstand) joins |
| 08:28:37 | | atphoenix__ (atphoenix) joins |
| 08:30:03 | | atphoenix_ quits [Ping timeout: 272 seconds] |
| 08:38:17 | | stepney141 quits [Ping timeout: 272 seconds] |
| 08:41:01 | | sg72 joins |
| 08:41:24 | | stepney141 (stepney141) joins |
| 09:20:47 | <hexagonwin> | could someone run this url on archivebot? the company went bankrupt: https://shop.buyzle.co.kr/int/communication/CompanyInfo.do?_method=initial |
| 09:21:36 | <hexagonwin> | it's an online shopping mall website, i don't think the whole website <https://www.buyzle.co.kr/malls/index.html#> is really worth grabbing |
| 09:25:23 | <hexagonwin> | maybe this announcement section <https://shop.buyzle.co.kr/communication/NewsListMgt.do?_method=form> is also worth grabbing for context, but it's not static :( |
| 09:32:57 | | twiswist quits [Quit: twiswist] |
| 10:00:46 | | rohvani quits [Quit: The Lounge - https://thelounge.chat] |
| 10:02:04 | | rohvani joins |
| 10:13:20 | | twiswist (twiswist) joins |
| 10:48:09 | | BearFortress quits [] |
| 10:56:01 | | pedantic-darwin quits [Read error: Connection reset by peer] |
| 10:56:17 | | pedantic-darwin joins |
| 10:58:13 | <cruller> | Slashdot Japan, which shut down, has been drop-caught and fake sites have been created by abusing its archives :( |
| 11:04:55 | <cruller> | I can find only Japanese articles about it. https://internet.watch.impress.co.jp/docs/yajiuma/2071555.html |
| 11:10:09 | | VerifiedJ quits [Quit: The Lounge - https://thelounge.chat] |
| 11:10:41 | | VerifiedJ (VerifiedJ) joins |
| 11:11:47 | <hexagonwin> | i saved that buyzle.co.kr with browsertrix and here's its wacz, does someone know how to extract all the urls in it so it can be run in archivebot? |
| 11:11:49 | <hexagonwin> | https://transfer.archivete.am/vUl7L/interpark-manual-20251216105204-f64351e2-a48.wacz |
| 11:11:49 | <eggdrop> | inline (for browser viewing): https://transfer.archivete.am/inline/vUl7L/interpark-manual-20251216105204-f64351e2-a48.wacz |
| 11:13:39 | <hexagonwin> | cruller: both srad.jp and sourceforge.jp seem to show a generic domain parking page for me |
| 11:13:47 | | croissant joins |
| 11:15:30 | <cruller> | Slashdot Japan is the predecessor of srad.jp |
| 11:15:59 | <cruller> | Btw, I can extract the urls now. |
| 11:17:47 | | BearFortress joins |
| 11:18:00 | | TunaLobster quits [Quit: So long and thanks for all the fish] |
| 11:19:07 | <cruller> | Hmm, this is a bit different from the one made by ArchiveWeb.page. But it doesn't seem too difficult. |
| 11:21:51 | | TunaLobster joins |
| 11:24:24 | | atphoenix__ quits [Read error: Connection reset by peer] |
| 11:25:01 | | atphoenix__ (atphoenix) joins |
| 11:25:43 | | Ryz quits [Quit: Ping timeout (120 seconds)] |
| 11:25:56 | | kiska52 quits [Quit: Ping timeout (120 seconds)] |
| 11:26:14 | | kiska52 joins |
| 11:26:27 | | @dxrt quits [Remote host closed the connection] |
| 11:26:27 | | Ryz (Ryz) joins |
| 11:26:51 | | dxrt joins |
| 11:26:53 | | dxrt is now authenticated as dxrt |
| 11:26:53 | | dxrt quits [Changing host] |
| 11:26:53 | | dxrt (dxrt) joins |
| 11:26:53 | | @ChanServ sets mode: +o dxrt |
| 11:27:43 | <cruller> | I just unzipped it, extracted the "index.cdx" files, and combined them into https://transfer.archivete.am/inline/e4ZWg/merged-index.cdx |
| 11:43:34 | <cruller> | Then I extracted the URLs starting with http and deduped them: https://transfer.archivete.am/inline/15LXRG/urls.txt |
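The steps described above (unzip the WACZ, read its index files, keep URLs starting with http, dedupe) could be sketched like this. It is not the exact procedure used in the channel. It assumes the indexes are CDXJ lines whose JSON part carries a `url` field and that index files end in `.cdx`, `.cdxj`, or a gzipped variant. The function name is hypothetical, and skipping URLs containing `__wb_method=` is an extra precaution rather than something the format requires.

```python
import gzip
import json
import zipfile

INDEX_SUFFIXES = (".cdx", ".cdxj", ".cdx.gz", ".cdxj.gz")


def wacz_urls(wacz_path):
    """Return sorted, deduplicated http(s) URLs found in a WACZ file's indexes."""
    urls = set()
    with zipfile.ZipFile(wacz_path) as zf:
        for name in zf.namelist():
            if not name.endswith(INDEX_SUFFIXES):
                continue
            data = zf.read(name)
            if name.endswith(".gz"):
                data = gzip.decompress(data)
            for line in data.decode("utf-8", errors="replace").splitlines():
                brace = line.find("{")  # JSON block follows the key and timestamp
                if brace == -1:
                    continue
                try:
                    url = json.loads(line[brace:]).get("url", "")
                except json.JSONDecodeError:
                    continue
                if not isinstance(url, str):
                    continue
                # Keep plain http(s) URLs; skip replay-encoded POST entries.
                if url.startswith("http") and "__wb_method=" not in url:
                    urls.add(url)
    return sorted(urls)
```

Writing the result one URL per line gives a list that can be uploaded and handed to ArchiveBot.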
| 11:52:15 | | azalea_sh__ (azalea_sh_) joins |
| 11:53:14 | <azalea_sh__> | Hey there! Just noticed that the URLs were put to AB, thanks a bunch! |
| 11:53:25 | <cruller> | It's interesting that ArchiveWeb.page can't load interpark-manual-20251216105204-f64351e2-a48.wacz, but ReplayWeb.page can. |
| 11:54:11 | | azalea_sh__ quits [Remote host closed the connection] |
| 12:03:09 | <cruller> | For a small wacz file, you can load it into replayweb.page, go to the Resources tab, scroll all the way down, and copy everything. However, be careful with “__wb_method=POST”. |
| 12:08:46 | <cruller> | By the way, I once tried crawling a site with 1,000-10,000 pages using Brozzler, but it didn't work out and I gave up. I haven't even tried since then. |
| 12:25:41 | <hexagonwin> | sad :( a reliable browser based crawler would be very useful for quick grabs |
| 12:28:34 | | APOLLO03a quits [Read error: Connection reset by peer] |
| 12:29:34 | | APOLLO03 joins |
| 12:30:22 | | egallager quits [Quit: This computer has gone to sleep] |
| 12:34:06 | <h2ibot> | Cruller edited Alive... OR ARE THEY (+114, Add Kernel.org Bugzilla and 万有真原, Remove OSDN): https://wiki.archiveteam.org/?diff=58632&oldid=58623 |
| 12:34:36 | | pabs quits [Ping timeout: 256 seconds] |
| 12:35:56 | | pabs (pabs) joins |
| 12:45:21 | | mls quits [Quit: Lost terminal] |
| 12:51:51 | <cruller> | JAA: Is the edit by 'Brad' still pending? May I help you? |
| 12:52:56 | | etnguyen03 (etnguyen03) joins |
| 12:56:12 | <cruller> | I've partially undone one of Brad's edits (https://wiki.archiveteam.org/index.php?title=Deathwatch&oldid=57769), and tomorrow is my day off. |
| 12:57:06 | | uuid leaves [Part] |
| 12:57:47 | <cruller> | Should sites listed in https://wiki.archiveteam.org/index.php?title=Deathwatch#Pining_for_the_Fjords_ (Dying) be moved to the appropriate section after their deadlines pass? Of course it would be better to do so, but the effort seems out of proportion to the benefit. |
| 12:59:39 | <cruller> | Ugh, machine translators sometimes inject (extra) spaces. |
| 13:05:12 | <cruller> | But the manual translator usually makes mistakes. |
| 13:05:13 | | Webuser614683 joins |
| 13:05:31 | <Webuser614683> | Is anyone trying to archive Guilded before it shuts down on the 19th? |
| 13:07:48 | <cruller> | hexagonwin: You can also test https://github.com/iipc/warcaroo |
| 13:10:27 | | azalea_sh__ (azalea_sh_) joins |
| 13:11:40 | <azalea_sh__> | Webuser614683: what would it take to archive guilded? Is there an accessible index of servers there? And do you know what the ratelimits look like? If so, I could write a scraper probably |
| 13:13:11 | <Webuser614683> | Best way to find servers would most likely be manually (or automatically, idk the Roblox ratelimits) going through Roblox communities which display their Guilded, e.g. https://www.roblox.com/communities/9250955/Plehlowlas-Studio |
| 13:13:11 | <Webuser614683> | I also have ZERO clue what the ratelimits are |
| 13:13:41 | <Webuser614683> | The search bar on https://www.guilded.gg/explore/servers/overview seems to show most servers too, you just have to search keywords for them |
| 13:14:46 | | azalea_sh__ quits [Read error: Connection reset by peer] |
| 13:15:35 | | azalea_sh__ (azalea_sh_) joins |
| 13:15:51 | <azalea_sh__> | I can definitely check later when I'm home |
| 13:16:57 | <azalea_sh__> | I'll log off IRC on this client because it sucks and I'm on a train (thank you Deutsche Bahn), I'll check the channel logs later though |
| 13:17:00 | | azalea_sh__ quits [Remote host closed the connection] |
| 13:17:22 | | @imer quits [Quit: Ping timeout (120 seconds)] |
| 13:17:51 | | imer (imer) joins |
| 13:17:51 | | @ChanServ sets mode: +o imer |
| 13:19:01 | | Webuser302865 joins |
| 13:23:09 | <Webuser614683> | azalea_sh_ https://www.guilded.gg/docs/api/http_rate_limits |
| 13:28:46 | | etnguyen03 quits [Client Quit] |
| 13:29:49 | <cruller> | Webuser614683 azalea_sh__ : Please note that discussions about guilded may also be taking place in #robloxd. (idk though) |
| 13:49:59 | | sec^nd quits [Remote host closed the connection] |
| 13:50:18 | | sec^nd (second) joins |
| 13:53:10 | | Dada joins |