00:04:33mr_sarge quits [Quit: mr_sarge]
00:08:03aninternettroll quits [Ping timeout: 268 seconds]
00:23:31aninternettroll (aninternettroll) joins
00:30:37n9nes quits [Remote host closed the connection]
00:34:55n9nes joins
00:43:07aninternettroll quits [Ping timeout: 268 seconds]
00:50:36ice quits [Ping timeout: 268 seconds]
00:52:26aninternettroll (aninternettroll) joins
00:54:13ice joins
00:59:47cruller|irc joins
01:18:16ice quits [Ping timeout: 268 seconds]
01:24:31bl791 quits [Ping timeout: 268 seconds]
01:31:03bl791 joins
01:32:11etnguyen03 (etnguyen03) joins
01:33:36ice joins
01:36:09bl791 quits [Ping timeout: 268 seconds]
01:41:19bl791 joins
01:46:38bl791 quits [Ping timeout: 268 seconds]
02:05:49bl791 joins
02:10:46bl791 quits [Ping timeout: 268 seconds]
02:14:28bl791 joins
02:19:19bl791 quits [Ping timeout: 268 seconds]
02:22:16bl791 joins
02:53:31MARA joins
02:54:28<MARA>hi, does anyone know where i can find info about when a website was excluded from the wayback machine? xoom specifically
02:55:03<MARA>i know that they might not have available snapshots from BEFORE when it was excluded, but i was thinking i might be able to grab the big crawl snapshot WARCs and then open them up to find snapshots
02:55:07<MARA>is that a ridiculous idea just asking
02:57:14<h2ibot>PaulWise edited ArchiveBot (+116, add best practice for redirect domains): https://wiki.archiveteam.org/?diff=60597&oldid=59571
03:00:14<h2ibot>PaulWise edited ArchiveBot (+276, add best practice for dead domains): https://wiki.archiveteam.org/?diff=60598&oldid=60597
03:01:50nablumon quits [Quit: WeeChat 4.6.3]
03:04:15<h2ibot>PaulWise edited ArchiveBot (+127, mention globalping): https://wiki.archiveteam.org/?diff=60599&oldid=60598
03:13:51TheTechRobo2 joins
03:15:16<h2ibot>PaulWise edited Category:Software archiving (+42, APTlantis): https://wiki.archiveteam.org/?diff=60600&oldid=60110
03:19:16Slimm quits [Quit: Going offline, see ya! (www.adiirc.com)]
03:28:17<pabs>MARA: best guess of when would be to find the date of the revision that added xoom.com to https://wiki.archiveteam.org/index.php/List_of_websites_excluded_from_the_Wayback_Machine
03:28:51<MARA>okay, thanks. i was trying to do that but it occurred to me maybe someone would know here. worried it was added much earlier though
03:28:56<MARA>but thank u very much
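Editor's sketch of pabs's suggestion above: scan a page's revision history, oldest first, and report the first revision whose text mentions the site. This is a minimal illustration, not a tool anyone in the channel used. The function name and the sample revisions are hypothetical; in practice the (timestamp, wikitext) pairs would have to be fetched from the wiki's revision history first. The result is only a best guess, since the site may have been excluded well before anyone listed it.

```python
from typing import Iterable, Optional, Tuple


def first_revision_mentioning(
    revisions: Iterable[Tuple[str, str]], needle: str
) -> Optional[str]:
    """Return the timestamp of the earliest revision containing `needle`.

    `revisions` must be (timestamp, wikitext) pairs sorted oldest first.
    Matching is case-insensitive. Returns None if no revision matches.
    """
    wanted = needle.lower()
    for timestamp, text in revisions:
        if wanted in text.lower():
            return timestamp
    return None


# Hypothetical revision data, made up purely for illustration.
sample_revisions = [
    ("2015-01-01T00:00:00Z", "* example.org"),
    ("2016-06-01T00:00:00Z", "* example.org\n* xoom.com"),
    ("2017-03-01T00:00:00Z", "* example.org\n* xoom.com\n* example.net"),
]
found = first_revision_mentioning(sample_revisions, "xoom.com")
```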
03:29:37Arcorann__ quits [Ping timeout: 268 seconds]
03:29:39<pabs>and also, most WARCs are restricted-download these days, so your plan might not have worked
03:30:24<MARA>: (
03:30:27<MARA>they hate me
03:30:35<MARA>im just trying to recover some midis
03:30:43<pabs>OTOH ArchiveBot WARCs are not restricted yet, and we did save some of xoom.com via AB https://archive.fart.website/archivebot/viewer/?q=xoom.com
03:31:05<MARA>poggers thats so great to hear
03:31:14<MARA>holy shit thanks
03:31:22<pabs>oh, no, those are single-page only saves :/
03:31:25<MARA>cant believe this url but hey if it works
03:31:38<MARA>hehe
03:31:40<MARA>well thanks anyways
03:31:56<pabs>sorry!
03:31:56<MARA>got a few xoom websites to look for so just keeping an eye out for any archive of it
03:31:59<MARA>no dont be!
03:32:14<MARA>cant believe i wouldnt even be able to download the warc though. come on
03:32:25<MARA>if im willing to buy a massive hard drive and do that to myself why cant i do it
03:33:07<pabs>IA might give exceptions in some circumstances, but
03:33:18<pabs>best mail info@archive.org maybe they can help
03:34:26<MARA>is it for copyright purposes or what?
03:35:43<nulldata>Mainly due to AI bots scraping
03:35:47bl791 quits [Ping timeout: 268 seconds]
03:36:06<MARA>goddamn dude
03:36:19<MARA>people ruin everything
03:37:26bl791 joins
03:40:51PredatorIWD48 joins
03:42:47etnguyen03 quits [Client Quit]
03:43:16PredatorIWD4 quits [Ping timeout: 268 seconds]
03:43:16PredatorIWD48 is now known as PredatorIWD4
03:46:55etnguyen03 (etnguyen03) joins
03:51:48etnguyen03 quits [Remote host closed the connection]
04:08:36DogsRNice quits [Read error: Connection reset by peer]
04:15:29Island quits [Read error: Connection reset by peer]
04:21:19bl791 quits [Remote host closed the connection]
04:36:16^ quits [Read error: Connection reset by peer]
04:36:22^ (^) joins
05:04:28TheTechRobo2 quits [Quit: Bye :3]
05:04:40n9nes quits [Ping timeout: 268 seconds]
05:08:11n9nes joins
05:23:45Wohlstand (Wohlstand) joins
05:24:38nexussfan quits [Quit: Konversation terminated!]
05:27:37LddPotato quits [Read error: Connection reset by peer]
05:28:26LddPotato (LddPotato) joins
05:40:35<h2ibot>PaulWise edited Obstacles (+51, botcheck for klea): https://wiki.archiveteam.org/?diff=60601&oldid=60554
05:42:34LddPotato quits [Read error: Connection reset by peer]
05:43:12LddPotato (LddPotato) joins
05:49:37<h2ibot>PaulWise edited FTP (+65, ftpstatus.com /cc justauser): https://wiki.archiveteam.org/?diff=60602&oldid=59308
05:56:19LddPotato quits [Read error: Connection reset by peer]
05:57:51LddPotato (LddPotato) joins
06:02:44Webuser484198 quits [Quit: Ooops, wrong browser tab.]
06:08:48LddPotato quits [Read error: Connection reset by peer]
06:09:57LddPotato (LddPotato) joins
06:22:09<nicolas17>wth is this
06:22:18<nicolas17>https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.20.5390 redirects to wayback machine
06:23:01<nicolas17>which then doesn't work because the page is actually scripty and the XHR response was not saved
06:23:37<pokechu22>no search results for 10.1.1.20.5390 on https://scholar.archive.org/
06:24:37LddPotato quits [Read error: Connection reset by peer]
06:25:08<nicolas17>is CiteSeerX just dead, since it redirects to WBM?
06:25:43LddPotato (LddPotato) joins
06:28:03<nicolas17>I got the link from https://en.wikipedia.org/wiki/Force-directed_graph_drawing#cite_note-7
06:32:22<pokechu22>Nothing listed on https://en.wikipedia.org/wiki/CiteSeerX but maybe it is :/
06:33:28<nicolas17>WBM has even archived the redirect to WBM now
06:33:41<nicolas17>so if you go to https://citeseerx.ist.psu.edu/ you get a redirect loop kinda
06:34:38<nicolas17>https://web.archive.org/web/20260302172837/https://web.archive.org/web/20260302172836/https://citeseerx.ist.psu.edu/ I thought this wasn't even possible
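Editor's sketch: the nested URL above can be peeled apart layer by layer to see which capture wraps which. This assumes the common `web.archive.org/web/<14-digit timestamp>/<target>` shape, optionally with a two-letter modifier such as `id_` after the timestamp; the helper name `unwrap_wayback` is hypothetical.

```python
import re

# One Wayback layer: 14-digit timestamp, optional two-letter modifier, then the target.
WAYBACK_LAYER = re.compile(
    r"^https?://web\.archive\.org/web/(\d{14})(?:[a-z]{2}_)?/(.+)$"
)


def unwrap_wayback(url):
    """Strip nested Wayback prefixes from `url`.

    Returns (timestamps, original_url), where timestamps are listed
    from the outermost capture to the innermost one.
    """
    timestamps = []
    while True:
        match = WAYBACK_LAYER.match(url)
        if match is None:
            break
        timestamps.append(match.group(1))
        url = match.group(2)
    return timestamps, url


nested = (
    "https://web.archive.org/web/20260302172837/"
    "https://web.archive.org/web/20260302172836/"
    "https://citeseerx.ist.psu.edu/"
)
layers, original = unwrap_wayback(nested)
```

For the URL nicolas17 posted, this yields two timestamps one second apart and the CiteSeerX front page as the innermost target, which matches the redirect loop described.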
06:38:22BornOn420 (BornOn420) joins
06:52:35emphie quits [Ping timeout: 268 seconds]
06:53:42tzt quits [Quit: tzt]
06:53:52tzt (tzt) joins
06:56:57<tzt>nicolas17: https://github.com/SeerLabs/CiteSeerX/issues/83
07:07:35emphie joins
07:15:24anarcat quits [Ping timeout: 268 seconds]
07:21:37anarcat (anarcat) joins
07:31:46rohvani joins
08:26:07retrograde (retrograde) joins
08:51:31Arcorann__ (Arcorann) joins
10:14:15Babsalom joins
11:04:11evergreen563 joins
11:07:16evergreen56 quits [Ping timeout: 268 seconds]
11:07:17evergreen563 is now known as evergreen56
11:50:11MARA quits [Quit: Ooops, wrong browser tab.]
12:00:02Bleo1826007227196234552220 quits [Quit: The Lounge - https://thelounge.chat]
12:02:49Bleo1826007227196234552220 joins
12:16:42<h2ibot>Manu edited Distributed recursive crawls (+104, Candidates: Add multinationales.org): https://wiki.archiveteam.org/?diff=60603&oldid=60552
13:08:12Babsalom1 joins
13:11:08Babsalom quits [Ping timeout: 268 seconds]
13:19:20pedantic-darwin joins
13:21:02FiTheArchiver joins
13:22:21Mist8kenGAS (Mist8kenGAS) joins
13:24:00Mist8kenGAS quits [Client Quit]
13:24:15Mist8kenGAS (Mist8kenGAS) joins
13:38:58driib97 quits [Ping timeout: 268 seconds]
13:44:28FiTheArchiver quits [Client Quit]
13:45:50<h2ibot>Imer edited Deathwatch (+497, /* 2026-03 */ add manga kakeru/yomeru): https://wiki.archiveteam.org/?diff=60604&oldid=60596
13:46:59Arcorann__ quits [Ping timeout: 268 seconds]
13:52:06<@imer>^ looks slightly scripty at a glance, so not sure if AB would work. It does have the image links in a json blob embedded in the page - does AB pick those up? e.g. https://manga-yomeru.com/manga/rhent -> "https:\/\/works.manga-yomeru.com\/rhent\/images\/frame-0.jpeg"
14:08:20<masterx244|m>in the worst case thats a 2-stage crawl with manual WARC parsing to catch those outlinks
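Editor's sketch of the second stage masterx244|m describes: once the page bodies from a first crawl are available as text, the JSON-escaped image links can be pulled out and queued for a follow-up crawl. This is a minimal illustration and says nothing about what ArchiveBot itself extracts. The function name and the surrounding `<script>` markup in the sample are hypothetical; only the escaped URL is taken from imer's message.

```python
import json
import re

# A JSON string literal whose content starts with http:// or https://,
# where each slash may be written either as "/" or as the escape "\/".
ESCAPED_URL = re.compile(r'"https?:(?:\\/|/){2}(?:[^"\\]|\\.)*"')


def extract_json_urls(page_text):
    """Return unique URLs found as JSON string literals, in order of appearance."""
    urls = []
    for match in ESCAPED_URL.finditer(page_text):
        try:
            # json.loads undoes "\/" and any other JSON escapes in the literal.
            url = json.loads(match.group(0))
        except json.JSONDecodeError:
            continue  # not a valid JSON string after all; skip it
        if url not in urls:
            urls.append(url)
    return urls


# Hypothetical page fragment; the real page layout may differ.
sample = (
    r'<script>var data = {"frames": '
    r'["https:\/\/works.manga-yomeru.com\/rhent\/images\/frame-0.jpeg"]};</script>'
)
outlinks = extract_json_urls(sample)
```

The resulting list could then be fed to a second crawl job as a plain URL list.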
14:44:57jonte quits [Ping timeout: 268 seconds]
14:45:07jonte (jonte4) joins
15:00:13beardicus1 (beardicus) joins
15:12:36Island joins
15:18:24hamouda joins