| 00:00:44 | <pokechu22> | The meta one is the job log and should be uploaded. The .cdx file is normally derived from the WARC by IA itself, though I don't know if that always happens or only happens for items that get indexed by web.archive.org. |
| 00:00:47 | | tekulvw quits [Ping timeout: 272 seconds] |
| 00:04:14 | <klea> | I think it only happens for items that get indexed by web.archive.org? https://archive.org/download/limewire.com_d_7xNKB_NfXjrIqBWo |
| 00:04:32 | <klea> | Tho, maybe it was me not running the derive thing after every file |
| 00:04:35 | <klea> | lemme make it derive. |
| 00:04:43 | <klea> | (if I remember how to) |
| 00:06:24 | <klea> | It seems if you have IA derive (which is the default I believe on the web uploader?), it will make a cdx. <https://archive.org/log/5191197716> claims it will do a CDXIndex. |
| 00:06:54 | <klea> | Huh |
| 00:06:58 | <klea> | [ PST: 2026-02-16 16:05:08 ] Executing: ulimit -v 1048576 && PYTHONPATH=/petabox/sw/lib/python timeout 600 /petabox/sw/bin/cdx_writer.pex 'WARCPROX-20260216205304743-00000-y1i40ow9.warc.gz' --file-prefix='limewire.com_d_7xNKB_NfXjrIqBWo' --exclude-list='/petabox/sw/wayback/web_excludes.txt' --stats-file='/f/_limewire.com_d_7xNKB_NfXjrIqBWo/cdxstats.json'> |
| 00:06:58 | <klea> | '/t/_limewire.com_d_7xNKB_NfXjrIqBWo/cdx.txt' |
| 00:07:09 | <klea> | Wait a second. |
| 00:07:35 | <klea> | Couldn't that be a way to bulk check lots of URLs by making a WARC with records of lots of data, and then getting the CDX and seeing what apparently is missing? |
| 00:07:50 | <klea> | Then you'd request deletion of all that crap, because nobody wants it. |
| 00:19:07 | | nine quits [Quit: See ya!] |
| 00:19:20 | | nine joins |
| 00:19:20 | | nine is now authenticated as nine |
| 00:19:20 | | nine quits [Changing host] |
| 00:19:20 | | nine (nine) joins |
| 00:23:56 | <cruller> | TheoH7: I uploaded the entire output directory. https://archive.org/details/community.jisc.ac.uk-2026-02-16-35e53623-00000 |
| 00:32:50 | | etnguyen03 quits [Client Quit] |
| 00:40:14 | | SootBector quits [Remote host closed the connection] |
| 00:41:22 | | SootBector (SootBector) joins |
| 00:42:30 | <TheoH7> | cruller: Thanks, have downloaded it. |
| 00:43:16 | <TheoH7> | It looks like I also managed to do one where hard-coded links to https://community.ja.net (old address from years ago) are clickable in the WARC. I will upload that to IA likely in a few hours. |
| 00:44:16 | <TheoH7> | To upload the whole directory, is the best way to zip and upload, or can you select a whole folder for upload? |
| 00:48:36 | <pokechu22> | You can upload multiple files at once within a directory (uploading directories/subdirectories might also be possible but I think is more complicated?) |
| 00:50:57 | <TheoH7> | pokechu22: Great, will do that. |
| 00:51:48 | <TheoH7> | Seems one of my crawls has somehow managed to start crawling old versions of this site stored on the Wayback Machine, which is odd. I've added the pattern to ignores but just curious how grab-site would've found and started crawling such URLs. |
| 00:52:11 | <TheoH7> | I do already have 1 crawl without that done though, and will only upload the 2nd one if it contains materially more content |
| 01:02:20 | | tekulvw (tekulvw) joins |
| 01:03:29 | | ducky quits [Ping timeout: 272 seconds] |
| 01:04:19 | | etnguyen03 (etnguyen03) joins |
| 01:07:21 | | tekulvw quits [Ping timeout: 268 seconds] |
| 01:21:59 | | tekulvw (tekulvw) joins |
| 01:25:36 | | Webuser614729 joins |
| 01:26:29 | | Webuser614729 quits [Client Quit] |
| 01:27:43 | | wotd joins |
| 01:41:28 | | pokechu22 quits [Quit: System maintenance] |
| 02:22:10 | | sec^nd quits [Remote host closed the connection] |
| 02:22:35 | | sec^nd (second) joins |
| 02:36:40 | <nexussfan> | There's a site dedicated to archiving Iranian series and films <https://nostalgik-tv.com/> which says they have 4 terabytes of videos. Would it be a good idea to archive it, or not now? |
| 02:44:47 | | APOLLO03 quits [Ping timeout: 268 seconds] |
| 02:47:44 | | ducky (ducky) joins |
| 02:56:13 | | nine quits [Ping timeout: 272 seconds] |
| 02:58:38 | | nine joins |
| 02:58:40 | | nine is now authenticated as nine |
| 02:58:40 | | nine quits [Changing host] |
| 02:58:40 | | nine (nine) joins |
| 03:13:09 | | iPwnedYourIOTSmartdog quits [Ping timeout: 268 seconds] |
| 03:13:46 | | iPwnedYourIOTSmartdog joins |
| 04:01:44 | | etnguyen03 quits [Remote host closed the connection] |
| 04:11:07 | | tekulvw quits [Ping timeout: 268 seconds] |
| 04:14:04 | | tekulvw (tekulvw) joins |
| 04:23:37 | | tekulvw quits [Ping timeout: 272 seconds] |
| 04:26:20 | | Island quits [Read error: Connection reset by peer] |
| 04:28:43 | | tekulvw (tekulvw) joins |
| 04:33:45 | | tekulvw quits [Ping timeout: 272 seconds] |
| 05:04:47 | | n9nes quits [Ping timeout: 272 seconds] |
| 05:08:15 | | n9nes joins |
| 05:14:14 | | tekulvw (tekulvw) joins |
| 05:24:25 | | tekulvw quits [Ping timeout: 272 seconds] |
| 05:32:43 | | sec^nd quits [Remote host closed the connection] |
| 05:33:05 | | sec^nd (second) joins |
| 05:42:42 | <ericgallager> | I forget, did this make it here? https://www.theregister.com/2026/02/12/polyglot_notebooks_deprecation/ |
| 05:51:53 | | tekulvw (tekulvw) joins |
| 05:56:34 | | tekulvw quits [Ping timeout: 268 seconds] |
| 05:56:53 | | tekulvw (tekulvw) joins |
| 06:01:30 | | tekulvw quits [Ping timeout: 268 seconds] |
| 06:16:19 | | nexussfan quits [Quit: Konversation terminated!] |
| 06:17:13 | | ArchivalEfforts quits [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.] |
| 06:17:22 | | ArchivalEfforts joins |
| 06:22:50 | | tekulvw (tekulvw) joins |
| 06:27:45 | | tekulvw quits [Ping timeout: 272 seconds] |
| 06:57:16 | | tekulvw (tekulvw) joins |
| 07:05:37 | | pokechu22 (pokechu22) joins |
| 08:52:56 | | ducky quits [Ping timeout: 268 seconds] |
| 08:53:09 | | ducky (ducky) joins |
| 08:54:25 | | Dango360 quits [Quit: The Lounge - https://thelounge.chat] |
| 09:29:34 | | TheEnbyperor_ quits [Read error: Connection reset by peer] |
| 09:30:09 | | cipherrot quits [Ping timeout: 272 seconds] |
| 09:30:09 | | TheEnbyperor quits [Ping timeout: 272 seconds] |
| 09:37:48 | | Snivy quits [Quit: The Lounge - https://thelounge.chat] |
| 09:38:00 | | TheEnbyperor joins |
| 09:38:11 | | petrichor (petrichor) joins |
| 09:38:17 | | Snivy (Snivy) joins |
| 09:38:23 | | Snivy quits [Remote host closed the connection] |
| 09:38:36 | | TheEnbyperor_ (TheEnbyperor) joins |
| 09:39:42 | | Snivy (Snivy) joins |
| 09:42:53 | | tekulvw quits [Ping timeout: 268 seconds] |
| 10:03:54 | | rohvani quits [Quit: The Lounge - https://thelounge.chat] |
| 10:09:54 | | @arkiver quits [Remote host closed the connection] |
| 10:10:21 | | arkiver (arkiver) joins |
| 10:10:21 | | @ChanServ sets mode: +o arkiver |
| 10:14:12 | | fireatseaparks quits [Remote host closed the connection] |
| 10:14:48 | | fireatseaparks (fireatseaparks) joins |
| 10:26:37 | | APOLLO03 joins |
| 10:47:11 | | Webuser505408 joins |
| 10:47:32 | | Webuser505408 quits [Client Quit] |
| 11:37:03 | | tekulvw (tekulvw) joins |
| 11:39:37 | | Cornelius7 (Cornelius) joins |
| 11:41:15 | | Cornelius quits [Ping timeout: 272 seconds] |
| 11:41:15 | | Cornelius7 is now known as Cornelius |
| 11:41:53 | | tekulvw quits [Ping timeout: 272 seconds] |
| 11:47:35 | | irisfreckles13 joins |
| 11:58:52 | | APOLLO03 quits [Read error: Connection reset by peer] |
| 11:59:51 | | APOLLO03 joins |
| 12:00:03 | | Bleo1826007227196234552220 quits [Quit: The Lounge - https://thelounge.chat] |
| 12:02:44 | | Bleo1826007227196234552220 joins |
| 12:37:37 | | petrichor quits [Client Quit] |
| 12:51:47 | <irisfreckles13> | How do I request a YouTube video to be archived? |
| 12:51:47 | <irisfreckles13> | Is that possible? |
| 12:53:37 | <klea> | irisfreckles13: Depends if it's in scope, see https://wiki.archiveteam.org/index.php/YouTube#Scope and if it's in scope you can request it in #down-the-tube |
| 13:02:28 | <h2ibot> | Bear created Philips (+1114, Philips - more like Sorryps): https://wiki.archiveteam.org/?oldid=60489 |
| 13:17:48 | | Shard111 quits [Quit: Im doing something rq. Il brb] |
| 13:19:14 | | Shard1115 (Shard) joins |
| 13:23:15 | | petrichor (petrichor) joins |
| 13:37:50 | | Arcorann quits [Ping timeout: 268 seconds] |
| 13:38:29 | <justauser> | ericgallager: Doesn't look too actionable... |
| 13:41:34 | | Webuser660697 joins |
| 14:03:07 | | irisfreckles13 quits [Ping timeout: 272 seconds] |
| 14:12:44 | | Dada joins |
| 14:18:20 | | Dango360 (Dango360) joins |
| 14:25:45 | <h2ibot> | Justauser edited Discourse/active (+148, Added https://forums.kicksecure.com/…): https://wiki.archiveteam.org/?diff=60490&oldid=60465 |
| 14:31:32 | | tekulvw (tekulvw) joins |
| 14:36:25 | | tekulvw quits [Ping timeout: 268 seconds] |
| 14:52:01 | <@arkiver> | imer: are you able to see something in your logs that is queuing the googleapis.com URLs? |
| 14:54:54 | <@imer> | arkiver: (assuming #//) no, don't think it's related to the other spam though |
| 14:56:09 | <@arkiver> | right, sorry, this was for #// |
| 15:09:01 | | irisfreckles13 joins |
| 15:11:35 | | Webuser116786 joins |
| 15:11:58 | | Webuser116786 quits [Client Quit] |
| 16:02:33 | | tekulvw (tekulvw) joins |
| 16:07:15 | | tekulvw quits [Ping timeout: 272 seconds] |
| 16:13:57 | | Island joins |
| 16:28:01 | <h2ibot> | Bear edited Mortis (+17, Provided by [[User:BouleBoule]] but not…): https://wiki.archiveteam.org/?diff=60491&oldid=58254 |
| 16:30:01 | <h2ibot> | Bear edited Mortis (-3, misplaced pipes): https://wiki.archiveteam.org/?diff=60492&oldid=60491 |
| 16:37:02 | <h2ibot> | Bear edited List of websites excluded from the Wayback Machine (+356, More details on Philips.com ([[Philips]])): https://wiki.archiveteam.org/?diff=60493&oldid=60371 |
| 16:40:37 | | Goofybally9 quits [Quit: The Lounge - https://thelounge.chat] |
| 16:41:23 | | Goofybally joins |
| 16:42:12 | | Goofybally quits [Client Quit] |
| 16:42:43 | | Goofybally (Goofybally) joins |
| 16:46:03 | <h2ibot> | Bear edited List of websites excluded from the Wayback Machine (+181, steampunkal.com excluded between 2013-11-12 and…): https://wiki.archiveteam.org/?diff=60494&oldid=60493 |
| 16:52:03 | | DogsRNice joins |