02:38:01nicolas17_ joins
02:41:59nicolas17 quits [Ping timeout: 260 seconds]
03:03:02nicolas17_ quits [Client Quit]
03:03:29nicolas17 joins
04:58:00nicolas17 quits [Client Quit]
05:32:10nicolas17 joins
05:41:30nicolas17_ joins
05:45:09nicolas17 quits [Ping timeout: 260 seconds]
06:09:11nicolas17 joins
06:12:34nicolas17_ quits [Ping timeout: 260 seconds]
06:46:55nicolas17_ joins
06:50:29nicolas17 quits [Ping timeout: 260 seconds]
06:52:12nicolas17 joins
06:55:44nicolas17_ quits [Ping timeout: 260 seconds]
07:03:30nicolas17_ joins
07:06:49nicolas17 quits [Ping timeout: 260 seconds]
07:20:48nicolas17_ quits [Read error: Connection reset by peer]
07:21:00nicolas17_ joins
07:28:13nicolas17 joins
07:31:54nicolas17_ quits [Ping timeout: 260 seconds]
07:33:30nicolas17_ joins
07:37:09nicolas17 quits [Ping timeout: 260 seconds]
07:38:42nicolas17 joins
07:39:14nicolas17_ quits [Read error: Connection reset by peer]
07:43:59nicolas17_ joins
07:47:04nicolas17 quits [Ping timeout: 260 seconds]
07:47:04nicolas17_ quits [Read error: Connection reset by peer]
07:47:33nicolas17_ joins
08:31:35BornOn420 quits [Remote host closed the connection]
08:32:06BornOn420 (BornOn420) joins
08:39:02pabs quits [Read error: Connection reset by peer]
08:39:30pabs (pabs) joins
10:11:09PredatorIWD25 quits [Ping timeout: 260 seconds]
10:54:17nicolas17 joins
10:57:49nicolas17_ quits [Ping timeout: 260 seconds]
10:59:16nicolas17 quits [Read error: Connection reset by peer]
10:59:33nicolas17 joins
11:08:37PredatorIWD25 joins
11:33:52nicolas17_ joins
11:38:02simon816 quits [Quit: ZNC 1.9.1 - https://znc.in]
11:38:04nicolas17 quits [Ping timeout: 260 seconds]
11:40:52simon816 (simon816) joins
11:43:12nicolas17 joins
11:46:49nicolas17_ quits [Ping timeout: 260 seconds]
11:54:41yano quits [Quit: WeeChat, https://weechat.org/]
11:55:05yano (yano) joins
12:04:02nicolas17_ joins
12:05:53KoalaBear joins
12:07:14nicolas17 quits [Ping timeout: 260 seconds]
12:08:24KoalaBear84 quits [Ping timeout: 260 seconds]
12:24:24pabs quits [Read error: Connection reset by peer]
12:25:14pabs (pabs) joins
12:32:06nicolas17_ quits [Client Quit]
12:33:04nicolas17_ joins
12:40:19nicolas17 joins
12:41:21nicolas17_ quits [Read error: Connection reset by peer]
12:57:45nicolas17_ joins
13:01:29nicolas17 quits [Ping timeout: 260 seconds]
13:04:21pabs quits [Client Quit]
13:04:48pabs (pabs) joins
13:07:02nicolas17 joins
13:10:14nicolas17_ quits [Ping timeout: 260 seconds]
14:21:46PredatorIWD256 joins
14:22:34PredatorIWD25 quits [Ping timeout: 260 seconds]
14:22:34PredatorIWD256 is now known as PredatorIWD25
14:35:10PredatorIWD250 joins
14:37:41PredatorIWD25 quits [Ping timeout: 276 seconds]
14:37:42PredatorIWD25 joins
14:40:04PredatorIWD250 quits [Ping timeout: 260 seconds]
14:45:43nicolas17_ joins
14:47:47nicolas17 quits [Read error: Connection reset by peer]
15:41:23corentin quits [Ping timeout: 276 seconds]
15:44:49Dango360 quits [Ping timeout: 260 seconds]
16:00:43Dango360 (Dango360) joins
16:20:21nicolas17 joins
16:23:54nicolas17_ quits [Ping timeout: 260 seconds]
16:33:41nicolas17_ joins
16:37:19nicolas17 quits [Ping timeout: 260 seconds]
16:49:04nicolas17 joins
17:38:14atirclog (atirclog) joins
17:38:14Topic: This channel is 100% not affiliated with archive.org. We will help if we can. | https://wiki.archiveteam.org/index.php/Internet_Archive
17:38:14Topic set by JAA at 2025-06-02 22:04:15Z
17:38:19Current users: atirclog (atirclog), cm, DogsRNice, magmaus3 (magmaus3), Fusl (Fusl), pokechu22 (pokechu22), Lord_Nightmare (Lord_Nightmare), nicolas17 (nicolas17), erenrich, JTL (JTL), ThreeHM (ThreeHeadedMonkey), tzt (tzt), Stagnant_ (Stagnant), rewby (rewby), yano (yano), danwellby, night (night), balrog (balrog), legoktm, lexikiq, Doranwen (Doranwen), mgrandi (mgrandi), Medowar (Medowar), DopefishJustin (DopefishJustin), ScenarioPlanet (ScenarioPlanet), @kaz (Kaz), sepro (sepro), nyakase (nyakase), fuzzy8021 (fuzzy80211), atphoenix__ (atphoenix), PredatorIWD25, AlsoHP_Archivist, Dango360 (Dango360), pabs (pabs), KoalaBear, simon816 (simon816), BornOn420 (BornOn420), zhongfu (zhongfu), riking, @hook54321 (hook54321), BearFortress, Matthww, DigitalDragons (DigitalDragons), Exorcism0666 (exorcism), qwertyasdfuiopghjkl2 (qwertyasdfuiopghjkl2), dxrt (dxrt), Flashfire42 (flashfire42), Sanqui (Sanqui), threedeeitguy69 (threedeeitguy), Chris5010 (Chris5010), nothere, Jake (Jake), s-crypt (s-crypt), datechnoman (datechnoman), nulldata-alt (nulldata), imer (imer), that_lurker (that_lurker), @arkiver (arkiver), TheTechRobo (TheTechRobo), Pedrosso, xkey (xkey), Exorcism (exorcism), nukke (nukke), Barto (Barto), wessel1512 (wessel1512), DLoader (DLoader), rewby|backup (rewby), masterX244 (masterX244), angenieux2 (angenieux), steering (steering), Sluggs (Sluggs), Yakov (Yakov), ell7 (ell), codecrafters (codecrafters), luckcolors (luckcolors), Ryz (Ryz), HCross (HCross), kiska (kiska), nulldata (nulldata), murmur, justcool393 (justcool393), Swryl, mattwright324|m, igneousx (igneousx), theblazehen|m, audrooku|m, DigitalDragon (DigitalDragon), Vokun (Vokun), Sanqui|m (Sanqui), phaeton (phaeton), x9fff00 (x9fff00), nyuuzyou (nyuuzyou), q3k|m, nstrom|m, anon00001|m, britmob|m, schwarzkatz|m, Thibaultmol, tech234a (tech234a), Explo, l0rd_enki|m, nano412510 (nano412510), tomodachi94 (tomodachi94), Video, thermospheric (Thermospheric), hlgs|m, s-crypt|m|m, qyxojzh|m, yzqzss (yzqzss), Nulo|m, mikolaj|m, CrispyAlice2 (CrispyAlice2), th3z0l4|m, qw3rty, IRC2DC, Terbium, eggdrop (eggdrop), pie_ (pie_), andrew (andrew), fionera (Fionera), ats (ats), BlankEclair (BlankEclair), AK (AK), kokos, alexlehm (alexlehm), c3manu (c3manu), @JAA (JAA), @AlsoJAA (JAA), plcp, PAARCLiCKS (s4n1ty), Craigle (Craigle), jodizzle (jodizzle), kdy (kdy), kokos|, monika (boom), lumidify (lumidify), knecht (knecht), katia_ (katia), OrIdow6 (OrIdow6), Nemo_bis (Nemo_bis), katia (katia), @ChanServ, kpcyrd (kpcyrd), [42] (N4Y), lea (lea_), sknebel (sknebel), Dj-Wawa (Dj-Wawa), nyany (nyany)
17:40:28AlsoHP_Archivist quits [Client Quit]
17:40:43HP_Archivist (HP_Archivist) joins
17:52:07<katia>how would i go about finding all links for a domain? can cdx do that? i might or might not want to match for a string in the search
17:55:04<pokechu22>The approach I use is https://web.archive.org/cdx/search/cdx?url=example.com&collapse=urlkey&matchType=domain&fl=original&limit=100000&showResumeKey=true&resumeKey=
17:55:06<pokechu22>https://web.archive.org/cdx/search/cdx?url=example.com&collapse=urlkey&matchType=domain&fl=original&limit=100000&showResumeKey=true&resumeKey=eJxLzs_VSa1IzC3ISdXUTywqyUzOSdU3MjAyNDQ0NjA0MQZS-gYGCkARUwNzQ1MjAxNDAwMAtIwN5g etc. I believe JAA has said the resumekey approach sometimes misses URLs but it's been good enough for my purposes
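The resumeKey loop pokechu22 describes could be scripted roughly as follows. This is a minimal sketch, not anyone's actual tooling: the helper names (`build_cdx_url`, `split_resume_key`, `iter_domain_urls`, `fetch_text`) are made up for illustration, and it assumes that with `showResumeKey=true` the plain-text response ends with a blank line followed by the key, and that a response without a key is the last page. The key is appended verbatim rather than re-encoded, on the assumption that the server returns it in a URL-safe form.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_cdx_url(domain, resume_key="", limit=100000):
    """Build the query from the chat; resume_key is empty for the first page."""
    params = {
        "url": domain,
        "collapse": "urlkey",
        "matchType": "domain",
        "fl": "original",
        "limit": limit,
        "showResumeKey": "true",
    }
    # Assumption: the key is already URL-safe, so it is appended as-is.
    return CDX_ENDPOINT + "?" + urlencode(params) + "&resumeKey=" + resume_key


def split_resume_key(body):
    """Split a plain-text response into (urls, resume_key or None).

    Assumption: the key, if present, is the last line and is preceded
    by a blank line.
    """
    lines = body.splitlines()
    while lines and not lines[-1].strip():
        lines.pop()  # drop trailing blank lines
    if len(lines) >= 2 and not lines[-2].strip():
        return [l for l in lines[:-2] if l.strip()], lines[-1].strip()
    return [l for l in lines if l.strip()], None


def fetch_text(url):
    """Download one response body as text."""
    with urlopen(url, timeout=60) as resp:
        return resp.read().decode("utf-8", "replace")


def iter_domain_urls(domain, fetch=fetch_text):
    """Yield original URLs page by page until no resume key is returned."""
    key = ""
    while True:
        urls, key = split_resume_key(fetch(build_cdx_url(domain, key)))
        yield from urls
        if not key:
            break
```

Something like `for url in iter_domain_urls("example.com"): print(url)` would then walk the whole result set sequentially. A page that carries a key but no entries is still followed, though this does not by itself address the missed-URL reports mentioned in the channel.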
18:30:50Fusl quits [Client Quit]
18:30:59Fusl (Fusl) joins
18:31:41linuxgemini (linuxgemini) joins
19:26:49corentin joins
19:36:36<@JAA>Yeah, IIRC, the resumeKey approach doesn't work when the first page returns zero entries despite there being results, or something like that. It's been a few years though since I looked at that.
19:38:05<pokechu22>Ah, that sounds like an issue when a filter is present; I generally do it without any filter at all (meaning 4XX/5XX results are present in the output)
19:40:55<@JAA>I believe it also happened without a filter sometimes. But I don't remember the details, and it's also possible it's been fixed since. There were definitely some (undocumented?) changes to the CDX server.
19:41:31<@JAA>The other advantage of the pagination based on page numbers is that you can parallelise it.
19:41:33<pokechu22>Yeah, I think they changed what the resumekey looked like at some point in the past year or so (I think it was more human readable in the past)
19:42:14<@JAA>I usually run little-things/ia-cdx-search with 4 connections, which seems to be enough to avoid rate limits.
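The page-number pagination JAA mentions can be parallelised along these lines. This is only a sketch under stated assumptions and is not how little-things/ia-cdx-search is implemented: the function names are hypothetical, it relies on the CDX server's `showNumPages=true` and zero-based `page=` parameters, and it uses 4 worker threads to mirror the 4 connections mentioned above. No `collapse` is applied, so repeated URLs are removed locally instead.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlencode
from urllib.request import urlopen

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def num_pages_url(domain):
    """URL that asks the server how many pages the query spans."""
    params = {"url": domain, "matchType": "domain", "showNumPages": "true"}
    return CDX_ENDPOINT + "?" + urlencode(params)


def page_url(domain, page):
    """URL for a single zero-based page of results."""
    params = {"url": domain, "matchType": "domain", "fl": "original", "page": page}
    return CDX_ENDPOINT + "?" + urlencode(params)


def fetch_text(url):
    """Download one response body as text."""
    with urlopen(url, timeout=60) as resp:
        return resp.read().decode("utf-8", "replace")


def fetch_all_pages(domain, connections=4, fetch=fetch_text):
    """Fetch every page with a small thread pool and return unique URLs."""
    total = int(fetch(num_pages_url(domain)).strip())
    with ThreadPoolExecutor(max_workers=connections) as pool:
        # pool.map preserves page order even though requests overlap.
        bodies = list(pool.map(lambda p: fetch(page_url(domain, p)), range(total)))
    lines = [line for body in bodies for line in body.splitlines() if line.strip()]
    # dict.fromkeys drops duplicates while keeping first-seen order.
    return list(dict.fromkeys(lines))
```

Because each page is addressed by number, the requests are independent of one another, which is what makes this parallelisable in a way the resumeKey chain is not. The `connections` value is a guess at a polite default, not a documented rate limit.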
20:07:28BornOn420 quits [Remote host closed the connection]
20:07:59BornOn420 (BornOn420) joins
21:02:44qwertyasdfuiopghjkl2 quits [Ping timeout: 260 seconds]
21:03:27qwertyasdfuiopghjkl2 joins
21:03:51qwertyasdfuiopghjkl2 quits [Max SendQ exceeded]
21:04:47qwertyasdfuiopghjkl2 (qwertyasdfuiopghjkl2) joins
21:05:11qwertyasdfuiopghjkl2 quits [Max SendQ exceeded]
21:06:08qwertyasdfuiopghjkl2 joins
21:06:32qwertyasdfuiopghjkl2 quits [Max SendQ exceeded]
21:07:07qwertyasdfuiopghjkl2 joins
21:07:31qwertyasdfuiopghjkl2 quits [Max SendQ exceeded]
21:08:06qwertyasdfuiopghjkl2 joins
21:08:30qwertyasdfuiopghjkl2 quits [Max SendQ exceeded]
21:08:54qwertyasdfuiopghjkl2 joins
21:09:18qwertyasdfuiopghjkl2 quits [Max SendQ exceeded]
21:10:08qwertyasdfuiopghjkl2 joins
21:10:32qwertyasdfuiopghjkl2 quits [Max SendQ exceeded]
21:11:35qwertyasdfuiopghjkl2 (qwertyasdfuiopghjkl2) joins
21:11:59qwertyasdfuiopghjkl2 quits [Max SendQ exceeded]
21:12:18qwertyasdfuiopghjkl2 (qwertyasdfuiopghjkl2) joins
21:12:41qwertyasdfuiopghjkl2 quits [Max SendQ exceeded]
21:13:12qwertyasdfuiopghjkl2 (qwertyasdfuiopghjkl2) joins
21:13:36qwertyasdfuiopghjkl2 quits [Max SendQ exceeded]
21:14:29qwertyasdfuiopghjkl2 joins
21:14:53qwertyasdfuiopghjkl2 quits [Max SendQ exceeded]
21:15:09qwertyasdfuiopghjkl2 (qwertyasdfuiopghjkl2) joins
21:15:33qwertyasdfuiopghjkl2 quits [Max SendQ exceeded]
21:16:16qwertyasdfuiopghjkl2 (qwertyasdfuiopghjkl2) joins
21:16:40qwertyasdfuiopghjkl2 quits [Max SendQ exceeded]
21:17:43qwertyasdfuiopghjkl2 joins
21:18:07qwertyasdfuiopghjkl2 quits [Max SendQ exceeded]
21:19:02qwertyasdfuiopghjkl2 joins
23:19:48Lord_Nightmare quits [Quit: ZNC - http://znc.in]
23:21:38Lord_Nightmare (Lord_Nightmare) joins
23:37:47Lord_Nightmare quits [Client Quit]
23:41:23Lord_Nightmare (Lord_Nightmare) joins