00:00:44<pokechu22>The meta one is the job log and should be uploaded. The .cdx file is normally derived from the WARC by IA itself, though I don't know if that always happens or only happens for items that get indexed by web.archive.org.
00:00:47tekulvw quits [Ping timeout: 272 seconds]
00:04:14<klea>I think it only happens for items that get indexed by web.archive.org? https://archive.org/download/limewire.com_d_7xNKB_NfXjrIqBWo
00:04:32<klea>Tho, maybe it was me not running the derive thing after every file
00:04:35<klea>lemme make it derive.
00:04:43<klea>(if i remember howto)
00:06:24<klea>It seems if you have IA derive (which is the default I believe on the web uploader?), it will make a cdx. <https://archive.org/log/5191197716> claims it will do a CDXIndex.
00:06:54<klea>Huh
00:06:58<klea>[ PST: 2026-02-16 16:05:08 ] Executing: ulimit -v 1048576 && PYTHONPATH=/petabox/sw/lib/python timeout 600 /petabox/sw/bin/cdx_writer.pex 'WARCPROX-20260216205304743-00000-y1i40ow9.warc.gz' --file-prefix='limewire.com_d_7xNKB_NfXjrIqBWo' --exclude-list='/petabox/sw/wayback/web_excludes.txt' --stats-file='/f/_limewire.com_d_7xNKB_NfXjrIqBWo/cdxstats.json'>
00:06:58<klea>'/t/_limewire.com_d_7xNKB_NfXjrIqBWo/cdx.txt'
00:07:09<klea>Wait a second.
00:07:35<klea>Couldn't that be a way to bulk check lots of urls by making a warc with records of lots of data, and then getting the cdx and seeing what apparently is missing?
00:07:50<klea>Then you'd request deletion of all that crap, because nobody wants it.
00:19:07nine quits [Quit: See ya!]
00:19:20nine joins
00:19:20nine quits [Changing host]
00:19:20nine (nine) joins
00:23:56<cruller>TheoH7: I uploaded the entire output directory. https://archive.org/details/community.jisc.ac.uk-2026-02-16-35e53623-00000
00:32:50etnguyen03 quits [Client Quit]
00:40:14SootBector quits [Remote host closed the connection]
00:41:22SootBector (SootBector) joins
00:42:30<TheoH7>cruller: Thanks, have downloaded it.
00:43:16<TheoH7>It looks like I also managed to do one where hard-coded links to https://community.ja.net (old address from years ago) are clickable in the WARC. I will upload that to IA likely in a few hours.
00:44:16<TheoH7>To upload the whole directory, is the best way to zip and upload, or can you select a whole folder for upload?
00:48:36<pokechu22>You can upload multiple files at once within a directory (uploading directories/subdirectories might also be possible but I think is more complicated?)
00:50:57<TheoH7>pokechu22: Great, will do that.
00:51:48<TheoH7>Seems one of my crawls has somehow managed to start crawling old versions of this site stored on the Wayback Machine, which is odd. I've added the pattern to ignores but I'm just curious how grab-site would've found and started crawling such URLs.
00:52:11<TheoH7>I do already have 1 crawl without that done though, and will only upload the 2nd one if it contains materially more content
01:02:20tekulvw (tekulvw) joins
01:03:29ducky quits [Ping timeout: 272 seconds]
01:04:19etnguyen03 (etnguyen03) joins
01:07:21tekulvw quits [Ping timeout: 268 seconds]
01:21:59tekulvw (tekulvw) joins
01:25:36Webuser614729 joins
01:26:29Webuser614729 quits [Client Quit]
01:27:43wotd joins
01:41:28pokechu22 quits [Quit: System maintenance]
02:22:10sec^nd quits [Remote host closed the connection]
02:22:35sec^nd (second) joins
02:36:40<nexussfan>There's a site dedicated to archiving Iranian series and films <https://nostalgik-tv.com/> which says they have 4 terabytes of videos. Would it be a good idea to archive it, or not now?
02:44:47APOLLO03 quits [Ping timeout: 268 seconds]
02:47:44ducky (ducky) joins
02:56:13nine quits [Ping timeout: 272 seconds]
02:58:38nine joins
02:58:40nine quits [Changing host]
02:58:40nine (nine) joins
03:13:09iPwnedYourIOTSmartdog quits [Ping timeout: 268 seconds]
03:13:46iPwnedYourIOTSmartdog joins
04:01:44etnguyen03 quits [Remote host closed the connection]
04:11:07tekulvw quits [Ping timeout: 268 seconds]
04:14:04tekulvw (tekulvw) joins
04:23:37tekulvw quits [Ping timeout: 272 seconds]
04:26:20Island quits [Read error: Connection reset by peer]
04:28:43tekulvw (tekulvw) joins
04:33:45tekulvw quits [Ping timeout: 272 seconds]
05:04:47n9nes quits [Ping timeout: 272 seconds]
05:08:15n9nes joins
05:14:14tekulvw (tekulvw) joins
05:24:25tekulvw quits [Ping timeout: 272 seconds]
05:32:43sec^nd quits [Remote host closed the connection]
05:33:05sec^nd (second) joins
05:42:42<ericgallager>I forget, did this make it here? https://www.theregister.com/2026/02/12/polyglot_notebooks_deprecation/
05:51:53tekulvw (tekulvw) joins
05:56:34tekulvw quits [Ping timeout: 268 seconds]
05:56:53tekulvw (tekulvw) joins
06:01:30tekulvw quits [Ping timeout: 268 seconds]
06:16:19nexussfan quits [Quit: Konversation terminated!]
06:17:13ArchivalEfforts quits [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
06:17:22ArchivalEfforts joins
06:22:50tekulvw (tekulvw) joins
06:27:45tekulvw quits [Ping timeout: 272 seconds]
06:57:16tekulvw (tekulvw) joins
07:05:37pokechu22 (pokechu22) joins
08:52:56ducky quits [Ping timeout: 268 seconds]
08:53:09ducky (ducky) joins
08:54:25Dango360 quits [Quit: The Lounge - https://thelounge.chat]
09:29:34TheEnbyperor_ quits [Read error: Connection reset by peer]
09:30:09cipherrot quits [Ping timeout: 272 seconds]
09:30:09TheEnbyperor quits [Ping timeout: 272 seconds]
09:37:48Snivy quits [Quit: The Lounge - https://thelounge.chat]
09:38:00TheEnbyperor joins
09:38:11petrichor (petrichor) joins
09:38:17Snivy (Snivy) joins
09:38:23Snivy quits [Remote host closed the connection]
09:38:36TheEnbyperor_ (TheEnbyperor) joins
09:39:42Snivy (Snivy) joins
09:42:53tekulvw quits [Ping timeout: 268 seconds]
10:03:54rohvani quits [Quit: The Lounge - https://thelounge.chat]
10:09:54@arkiver quits [Remote host closed the connection]
10:10:21arkiver (arkiver) joins
10:10:21@ChanServ sets mode: +o arkiver
10:14:12fireatseaparks quits [Remote host closed the connection]
10:14:48fireatseaparks (fireatseaparks) joins
10:26:37APOLLO03 joins
10:47:11Webuser505408 joins
10:47:32Webuser505408 quits [Client Quit]
11:37:03tekulvw (tekulvw) joins
11:39:37Cornelius7 (Cornelius) joins
11:41:15Cornelius quits [Ping timeout: 272 seconds]
11:41:15Cornelius7 is now known as Cornelius
11:41:53tekulvw quits [Ping timeout: 272 seconds]
11:47:35irisfreckles13 joins
11:58:52APOLLO03 quits [Read error: Connection reset by peer]
11:59:51APOLLO03 joins
12:00:03Bleo1826007227196234552220 quits [Quit: The Lounge - https://thelounge.chat]
12:02:44Bleo1826007227196234552220 joins
12:37:37petrichor quits [Client Quit]
12:51:47<irisfreckles13>how do i request yt video to be archived?
12:51:47<irisfreckles13>possible?
12:53:37<klea>irisfreckles13: Depends if it's in scope, see https://wiki.archiveteam.org/index.php/YouTube#Scope and if it's in scope you can request it in #down-the-tube
13:02:28<h2ibot>Bear created Philips (+1114, Philips - more like Sorryps): https://wiki.archiveteam.org/?oldid=60489
13:17:48Shard111 quits [Quit: Im doing something rq. Il brb]
13:19:14Shard1115 (Shard) joins
13:23:15petrichor (petrichor) joins
13:37:50Arcorann quits [Ping timeout: 268 seconds]
13:38:29<justauser>ericgallager: Doesn't look too actionable...
13:41:34Webuser660697 joins
14:03:07irisfreckles13 quits [Ping timeout: 272 seconds]
14:12:44Dada joins
14:18:20Dango360 (Dango360) joins
14:25:45<h2ibot>Justauser edited Discourse/active (+148, Added https://forums.kicksecure.com/…): https://wiki.archiveteam.org/?diff=60490&oldid=60465
14:31:32tekulvw (tekulvw) joins
14:36:25tekulvw quits [Ping timeout: 268 seconds]
14:52:01<@arkiver>imer: are you able to see something in your logs that is queuing the googleapis.com URLs?
14:54:54<@imer>arkiver: (assuming #//) no, don't think it's related to the other spam though
14:56:09<@arkiver>right, sorry, this was for #//
15:09:01irisfreckles13 joins
15:11:35Webuser116786 joins
15:11:58Webuser116786 quits [Client Quit]
16:02:33tekulvw (tekulvw) joins
16:07:15tekulvw quits [Ping timeout: 272 seconds]
16:13:57Island joins
16:28:01<h2ibot>Bear edited Mortis (+17, Provided by [[User:BouleBoule]] but not…): https://wiki.archiveteam.org/?diff=60491&oldid=58254
16:30:01<h2ibot>Bear edited Mortis (-3, misplaced pipes): https://wiki.archiveteam.org/?diff=60492&oldid=60491
16:37:02<h2ibot>Bear edited List of websites excluded from the Wayback Machine (+356, More details on Philips.com ([[Philips]])): https://wiki.archiveteam.org/?diff=60493&oldid=60371
16:40:37Goofybally9 quits [Quit: The Lounge - https://thelounge.chat]
16:41:23Goofybally joins
16:42:12Goofybally quits [Client Quit]
16:42:43Goofybally (Goofybally) joins
16:46:03<h2ibot>Bear edited List of websites excluded from the Wayback Machine (+181, steampunkal.com excluded between 2013-11-12 and…): https://wiki.archiveteam.org/?diff=60494&oldid=60493
16:52:03DogsRNice joins