| 00:04:33 | | mr_sarge quits [Quit: mr_sarge] |
| 00:08:03 | | aninternettroll quits [Ping timeout: 268 seconds] |
| 00:23:31 | | aninternettroll (aninternettroll) joins |
| 00:30:37 | | n9nes quits [Remote host closed the connection] |
| 00:34:55 | | n9nes joins |
| 00:43:07 | | aninternettroll quits [Ping timeout: 268 seconds] |
| 00:50:36 | | ice quits [Ping timeout: 268 seconds] |
| 00:52:26 | | aninternettroll (aninternettroll) joins |
| 00:54:13 | | ice joins |
| 00:59:47 | | cruller|irc joins |
| 01:18:16 | | ice quits [Ping timeout: 268 seconds] |
| 01:24:31 | | bl791 quits [Ping timeout: 268 seconds] |
| 01:31:03 | | bl791 joins |
| 01:32:11 | | etnguyen03 (etnguyen03) joins |
| 01:33:36 | | ice joins |
| 01:36:09 | | bl791 quits [Ping timeout: 268 seconds] |
| 01:41:19 | | bl791 joins |
| 01:46:38 | | bl791 quits [Ping timeout: 268 seconds] |
| 02:05:49 | | bl791 joins |
| 02:10:46 | | bl791 quits [Ping timeout: 268 seconds] |
| 02:14:28 | | bl791 joins |
| 02:19:19 | | bl791 quits [Ping timeout: 268 seconds] |
| 02:22:16 | | bl791 joins |
| 02:53:31 | | MARA joins |
| 02:54:28 | <MARA> | hi, does anyone know where i can find info about when a website was excluded from the wayback machine? xoom specifically |
| 02:55:03 | <MARA> | i know that they might not have available snapshots from BEFORE when it was excluded, but i was thinking i might be able to grab the big crawl snapshot WARCs and then open them up to find snapshots |
| 02:55:07 | <MARA> | is that a ridiculous idea just asking |
| 02:57:14 | <h2ibot> | PaulWise edited ArchiveBot (+116, add best practice for redirect domains): https://wiki.archiveteam.org/?diff=60597&oldid=59571 |
| 03:00:14 | <h2ibot> | PaulWise edited ArchiveBot (+276, add best practice for dead domains): https://wiki.archiveteam.org/?diff=60598&oldid=60597 |
| 03:01:50 | | nablumon quits [Quit: WeeChat 4.6.3] |
| 03:04:15 | <h2ibot> | PaulWise edited ArchiveBot (+127, mention globalping): https://wiki.archiveteam.org/?diff=60599&oldid=60598 |
| 03:13:51 | | TheTechRobo2 joins |
| 03:15:16 | <h2ibot> | PaulWise edited Category:Software archiving (+42, APTlantis): https://wiki.archiveteam.org/?diff=60600&oldid=60110 |
| 03:19:16 | | Slimm quits [Quit: Going offline, see ya! (www.adiirc.com)] |
| 03:28:17 | <pabs> | MARA: best guess of when would be to find the date of the revision that added xoom.com to https://wiki.archiveteam.org/index.php/List_of_websites_excluded_from_the_Wayback_Machine |
| 03:28:51 | <MARA> | okay, thanks. i was trying to do that but it occurred to me maybe someone would know here. worried it was added much earlier though |
| 03:28:56 | <MARA> | but thank u very much |
| 03:29:37 | | Arcorann__ quits [Ping timeout: 268 seconds] |
| 03:29:39 | <pabs> | and also, most WARCs are restricted-download these days, so your plan might not have worked |
| 03:30:24 | <MARA> | : ( |
| 03:30:27 | <MARA> | they hate me |
| 03:30:35 | <MARA> | im just trying to recover some midis |
| 03:30:43 | <pabs> | OTOH ArchiveBot WARCs are not restricted yet, and we did save some of xoom.com via AB https://archive.fart.website/archivebot/viewer/?q=xoom.com |
| 03:31:05 | <MARA> | poggers thats so great to hear |
| 03:31:14 | <MARA> | holy shit thanks |
| 03:31:22 | <pabs> | oh, no, those are single-page only saves :/ |
| 03:31:25 | <MARA> | cant believe this url but hey if it works |
| 03:31:38 | <MARA> | hehe |
| 03:31:40 | <MARA> | well thanks anyways |
| 03:31:56 | <pabs> | sorry! |
| 03:31:56 | <MARA> | got a few xoom websites to look for so just keeping an eye out for any archive of it |
| 03:31:59 | <MARA> | no dont be! |
| 03:32:14 | <MARA> | cant believe i wouldnt even be able to download the warc though. come on |
| 03:32:25 | <MARA> | if im willing to buy a massive hard drive and do that to myself why cant i do it |
| 03:33:07 | <pabs> | IA might give exceptions in some circumstances, but |
| 03:33:18 | <pabs> | best mail info@archive.org maybe they can help |
| 03:34:26 | <MARA> | is it for copyright purposes or what? |
| 03:35:43 | <nulldata> | Mainly due to AI bots scraping |
| 03:35:47 | | bl791 quits [Ping timeout: 268 seconds] |
| 03:36:06 | <MARA> | goddamn dude |
| 03:36:19 | <MARA> | people ruin everything |
| 03:37:26 | | bl791 joins |
| 03:40:51 | | PredatorIWD48 joins |
| 03:42:47 | | etnguyen03 quits [Client Quit] |
| 03:43:16 | | PredatorIWD4 quits [Ping timeout: 268 seconds] |
| 03:43:16 | | PredatorIWD48 is now known as PredatorIWD4 |
| 03:46:55 | | etnguyen03 (etnguyen03) joins |
| 03:51:48 | | etnguyen03 quits [Remote host closed the connection] |
| 04:08:36 | | DogsRNice quits [Read error: Connection reset by peer] |
| 04:15:29 | | Island quits [Read error: Connection reset by peer] |
| 04:21:19 | | bl791 quits [Remote host closed the connection] |
| 04:36:16 | | ^ quits [Read error: Connection reset by peer] |
| 04:36:22 | | ^ (^) joins |
| 05:04:28 | | TheTechRobo2 quits [Quit: Bye :3] |
| 05:04:40 | | n9nes quits [Ping timeout: 268 seconds] |
| 05:08:11 | | n9nes joins |
| 05:23:45 | | Wohlstand (Wohlstand) joins |
| 05:24:38 | | nexussfan quits [Quit: Konversation terminated!] |
| 05:27:37 | | LddPotato quits [Read error: Connection reset by peer] |
| 05:28:26 | | LddPotato (LddPotato) joins |
| 05:40:35 | <h2ibot> | PaulWise edited Obstacles (+51, botcheck for klea): https://wiki.archiveteam.org/?diff=60601&oldid=60554 |
| 05:42:34 | | LddPotato quits [Read error: Connection reset by peer] |
| 05:43:12 | | LddPotato (LddPotato) joins |
| 05:49:37 | <h2ibot> | PaulWise edited FTP (+65, ftpstatus.com /cc justauser): https://wiki.archiveteam.org/?diff=60602&oldid=59308 |
| 05:56:19 | | LddPotato quits [Read error: Connection reset by peer] |
| 05:57:51 | | LddPotato (LddPotato) joins |
| 06:02:44 | | Webuser484198 quits [Quit: Ooops, wrong browser tab.] |
| 06:08:48 | | LddPotato quits [Read error: Connection reset by peer] |
| 06:09:57 | | LddPotato (LddPotato) joins |
| 06:22:09 | <nicolas17> | wth is this |
| 06:22:18 | <nicolas17> | https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.20.5390 redirects to wayback machine |
| 06:23:01 | <nicolas17> | which then doesn't work because the page is actually scripty and the XHR response was not saved |
| 06:23:37 | <pokechu22> | no search results for 10.1.1.20.5390 on https://scholar.archive.org/ |
| 06:24:37 | | LddPotato quits [Read error: Connection reset by peer] |
| 06:25:08 | <nicolas17> | is CiteSeerX just dead, since it redirects to WBM? |
| 06:25:43 | | LddPotato (LddPotato) joins |
| 06:28:03 | <nicolas17> | I got the link from https://en.wikipedia.org/wiki/Force-directed_graph_drawing#cite_note-7 |
| 06:32:22 | <pokechu22> | Nothing listed on https://en.wikipedia.org/wiki/CiteSeerX but maybe it is :/ |
| 06:33:28 | <nicolas17> | WBM has even archived the redirect to WBM now |
| 06:33:41 | <nicolas17> | so if you go to https://citeseerx.ist.psu.edu/ you get a redirect loop kinda |
| 06:34:38 | <nicolas17> | https://web.archive.org/web/20260302172837/https://web.archive.org/web/20260302172836/https://citeseerx.ist.psu.edu/ I thought this wasn't even possible |
| 06:38:22 | | BornOn420 (BornOn420) joins |
| 06:52:35 | | emphie quits [Ping timeout: 268 seconds] |
| 06:53:42 | | tzt quits [Quit: tzt] |
| 06:53:52 | | tzt (tzt) joins |
| 06:56:57 | <tzt> | nicolas17: https://github.com/SeerLabs/CiteSeerX/issues/83 |
| 07:07:35 | | emphie joins |
| 07:15:24 | | anarcat quits [Ping timeout: 268 seconds] |
| 07:21:37 | | anarcat (anarcat) joins |
| 07:31:46 | | rohvani joins |
| 08:26:07 | | retrograde (retrograde) joins |
| 08:51:31 | | Arcorann__ (Arcorann) joins |
| 10:14:15 | | Babsalom joins |
| 11:04:11 | | evergreen563 joins |
| 11:07:16 | | evergreen56 quits [Ping timeout: 268 seconds] |
| 11:07:17 | | evergreen563 is now known as evergreen56 |
| 11:50:11 | | MARA quits [Quit: Ooops, wrong browser tab.] |
| 12:00:02 | | Bleo1826007227196234552220 quits [Quit: The Lounge - https://thelounge.chat] |
| 12:02:49 | | Bleo1826007227196234552220 joins |
| 12:16:42 | <h2ibot> | Manu edited Distributed recursive crawls (+104, Candidates: Add multinationales.org): https://wiki.archiveteam.org/?diff=60603&oldid=60552 |
| 13:08:12 | | Babsalom1 joins |
| 13:11:08 | | Babsalom quits [Ping timeout: 268 seconds] |
| 13:19:20 | | pedantic-darwin joins |
| 13:21:02 | | FiTheArchiver joins |
| 13:22:21 | | Mist8kenGAS (Mist8kenGAS) joins |
| 13:24:00 | | Mist8kenGAS quits [Client Quit] |
| 13:24:15 | | Mist8kenGAS (Mist8kenGAS) joins |
| 13:38:58 | | driib97 quits [Ping timeout: 268 seconds] |
| 13:44:28 | | FiTheArchiver quits [Client Quit] |
| 13:45:50 | <h2ibot> | Imer edited Deathwatch (+497, /* 2026-03 */ add manga kakeru/yomeru): https://wiki.archiveteam.org/?diff=60604&oldid=60596 |
| 13:46:59 | | Arcorann__ quits [Ping timeout: 268 seconds] |
| 13:52:06 | <@imer> | ^ looks slightly scripty at a glance, so not sure if AB would work. It does have the image links in a json blob embedded in the page - does AB pick those up? e.g. https://manga-yomeru.com/manga/rhent -> "https:\/\/works.manga-yomeru.com\/rhent\/images\/frame-0.jpeg" |
| 14:08:20 | <masterx244|m> | in the worst case thats a 2-stage crawl with manual WARC parsing to catch those outlinks |
| 14:44:57 | | jonte quits [Ping timeout: 268 seconds] |
| 14:45:07 | | jonte (jonte4) joins |
| 15:00:13 | | beardicus1 (beardicus) joins |
| 15:12:36 | | Island joins |
| 15:18:24 | | hamouda joins |
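
Regarding imer's 13:52 question about image links that only appear inside a JSON blob embedded in the page (with slashes escaped as `\/`), and masterx244|m's suggestion of a two-stage crawl: the sketch below shows one possible way the first stage's HTML could be scanned for such links so they can be queued for a second stage. It is only an illustration, not how ArchiveBot works. The function name `extract_json_urls`, the `frames` key, and the sample markup are hypothetical; only the escaped URL itself comes from the log. It assumes the page bodies have already been pulled out of the WARC by some other tool.

```python
import json
import re

# Matches a double-quoted string literal whose content starts with
# http:// or https://, where each slash may be written as "\/" (JSON-escaped).
URL_LITERAL = re.compile(r'"https?:(?:\\/|/){2}(?:[^"\\]|\\.)*"')


def extract_json_urls(html: str) -> list[str]:
    """Return unique URLs found in quoted string literals, in document order."""
    found = []
    for match in URL_LITERAL.finditer(html):
        try:
            # json.loads turns the literal into a plain string, so "\/" becomes "/".
            url = json.loads(match.group(0))
        except json.JSONDecodeError:
            # Not a valid JSON string (e.g. an unsupported escape); skip it.
            continue
        if url not in found:
            found.append(url)
    return found


# Hypothetical page fragment; only the URL is taken from the log above.
sample = (
    r'<script>var data = {"frames": '
    r'["https:\/\/works.manga-yomeru.com\/rhent\/images\/frame-0.jpeg"]};</script>'
)

for url in extract_json_urls(sample):
    print(url)
```

Note that this pattern also picks up ordinary quoted URLs such as `href="https://..."` attributes, which is harmless for building a second-stage URL list but means the output is not limited to JSON content.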