02:38:01 | | nicolas17_ joins |
02:41:59 | | nicolas17 quits [Ping timeout: 260 seconds] |
03:03:02 | | nicolas17_ quits [Client Quit] |
03:03:29 | | nicolas17 joins |
04:58:00 | | nicolas17 quits [Client Quit] |
05:32:10 | | nicolas17 joins |
05:41:30 | | nicolas17_ joins |
05:45:09 | | nicolas17 quits [Ping timeout: 260 seconds] |
06:09:11 | | nicolas17 joins |
06:12:34 | | nicolas17_ quits [Ping timeout: 260 seconds] |
06:46:55 | | nicolas17_ joins |
06:50:29 | | nicolas17 quits [Ping timeout: 260 seconds] |
06:52:12 | | nicolas17 joins |
06:55:44 | | nicolas17_ quits [Ping timeout: 260 seconds] |
07:03:30 | | nicolas17_ joins |
07:06:49 | | nicolas17 quits [Ping timeout: 260 seconds] |
07:20:48 | | nicolas17_ quits [Read error: Connection reset by peer] |
07:21:00 | | nicolas17_ joins |
07:28:13 | | nicolas17 joins |
07:31:54 | | nicolas17_ quits [Ping timeout: 260 seconds] |
07:33:30 | | nicolas17_ joins |
07:37:09 | | nicolas17 quits [Ping timeout: 260 seconds] |
07:38:42 | | nicolas17 joins |
07:39:14 | | nicolas17_ quits [Read error: Connection reset by peer] |
07:43:59 | | nicolas17_ joins |
07:47:04 | | nicolas17 quits [Ping timeout: 260 seconds] |
07:47:04 | | nicolas17_ quits [Read error: Connection reset by peer] |
07:47:33 | | nicolas17_ joins |
08:31:35 | | BornOn420 quits [Remote host closed the connection] |
08:32:06 | | BornOn420 (BornOn420) joins |
08:39:02 | | pabs quits [Read error: Connection reset by peer] |
08:39:30 | | pabs (pabs) joins |
10:11:09 | | PredatorIWD25 quits [Ping timeout: 260 seconds] |
10:54:17 | | nicolas17 joins |
10:57:49 | | nicolas17_ quits [Ping timeout: 260 seconds] |
10:59:16 | | nicolas17 quits [Read error: Connection reset by peer] |
10:59:33 | | nicolas17 joins |
11:08:37 | | PredatorIWD25 joins |
11:33:52 | | nicolas17_ joins |
11:38:02 | | simon816 quits [Quit: ZNC 1.9.1 - https://znc.in] |
11:38:04 | | nicolas17 quits [Ping timeout: 260 seconds] |
11:40:52 | | simon816 (simon816) joins |
11:43:12 | | nicolas17 joins |
11:46:49 | | nicolas17_ quits [Ping timeout: 260 seconds] |
11:54:41 | | yano quits [Quit: WeeChat, https://weechat.org/] |
11:55:05 | | yano (yano) joins |
12:04:02 | | nicolas17_ joins |
12:05:53 | | KoalaBear joins |
12:07:14 | | nicolas17 quits [Ping timeout: 260 seconds] |
12:08:24 | | KoalaBear84 quits [Ping timeout: 260 seconds] |
12:24:24 | | pabs quits [Read error: Connection reset by peer] |
12:25:14 | | pabs (pabs) joins |
12:32:06 | | nicolas17_ quits [Client Quit] |
12:33:04 | | nicolas17_ joins |
12:40:19 | | nicolas17 joins |
12:41:21 | | nicolas17_ quits [Read error: Connection reset by peer] |
12:57:45 | | nicolas17_ joins |
13:01:29 | | nicolas17 quits [Ping timeout: 260 seconds] |
13:04:21 | | pabs quits [Client Quit] |
13:04:48 | | pabs (pabs) joins |
13:07:02 | | nicolas17 joins |
13:10:14 | | nicolas17_ quits [Ping timeout: 260 seconds] |
14:21:46 | | PredatorIWD256 joins |
14:22:34 | | PredatorIWD25 quits [Ping timeout: 260 seconds] |
14:22:34 | | PredatorIWD256 is now known as PredatorIWD25 |
14:35:10 | | PredatorIWD250 joins |
14:37:41 | | PredatorIWD25 quits [Ping timeout: 276 seconds] |
14:37:42 | | PredatorIWD25 joins |
14:40:04 | | PredatorIWD250 quits [Ping timeout: 260 seconds] |
14:45:43 | | nicolas17_ joins |
14:47:47 | | nicolas17 quits [Read error: Connection reset by peer] |
15:41:23 | | corentin quits [Ping timeout: 276 seconds] |
15:44:49 | | Dango360 quits [Ping timeout: 260 seconds] |
16:00:43 | | Dango360 (Dango360) joins |
16:20:21 | | nicolas17 joins |
16:23:54 | | nicolas17_ quits [Ping timeout: 260 seconds] |
16:33:41 | | nicolas17_ joins |
16:37:19 | | nicolas17 quits [Ping timeout: 260 seconds] |
16:49:04 | | nicolas17 joins |
17:38:14 | | atirclog (atirclog) joins |
17:38:14 | | Topic: This channel is 100% not affiliated with archive.org. We will help if we can. | https://wiki.archiveteam.org/index.php/Internet_Archive |
17:38:14 | | Topic set by JAA at 2025-06-02 22:04:15Z |
17:38:19 | | Current users: atirclog (atirclog), cm, DogsRNice, magmaus3 (magmaus3), Fusl (Fusl), pokechu22 (pokechu22), Lord_Nightmare (Lord_Nightmare), nicolas17 (nicolas17), erenrich, JTL (JTL), ThreeHM (ThreeHeadedMonkey), tzt (tzt), Stagnant_ (Stagnant), rewby (rewby), yano (yano), danwellby, night (night), balrog (balrog), legoktm, lexikiq, Doranwen (Doranwen), mgrandi (mgrandi), Medowar (Medowar), DopefishJustin (DopefishJustin), ScenarioPlanet (ScenarioPlanet), @kaz (Kaz), sepro (sepro), nyakase (nyakase), fuzzy8021 (fuzzy80211), atphoenix__ (atphoenix), PredatorIWD25, AlsoHP_Archivist, Dango360 (Dango360), pabs (pabs), KoalaBear, simon816 (simon816), BornOn420 (BornOn420), zhongfu (zhongfu), riking, @hook54321 (hook54321), BearFortress, Matthww, DigitalDragons (DigitalDragons), Exorcism0666 (exorcism), qwertyasdfuiopghjkl2 (qwertyasdfuiopghjkl2), dxrt (dxrt), Flashfire42 (flashfire42), Sanqui (Sanqui), threedeeitguy69 (threedeeitguy), Chris5010 (Chris5010), nothere, Jake (Jake), s-crypt (s-crypt), datechnoman (datechnoman), nulldata-alt (nulldata), imer (imer), that_lurker (that_lurker), @arkiver (arkiver), TheTechRobo (TheTechRobo), Pedrosso, xkey (xkey), Exorcism (exorcism), nukke (nukke), Barto (Barto), wessel1512 (wessel1512), DLoader (DLoader), rewby|backup (rewby), masterX244 (masterX244), angenieux2 (angenieux), steering (steering), Sluggs (Sluggs), Yakov (Yakov), ell7 (ell), codecrafters (codecrafters), luckcolors (luckcolors), Ryz (Ryz), HCross (HCross), kiska (kiska), nulldata (nulldata), murmur, justcool393 (justcool393), Swryl, mattwright324|m, igneousx (igneousx), theblazehen|m, audrooku|m, DigitalDragon (DigitalDragon), Vokun (Vokun), Sanqui|m (Sanqui), phaeton (phaeton), x9fff00 (x9fff00), nyuuzyou (nyuuzyou), q3k|m, nstrom|m, anon00001|m, britmob|m, schwarzkatz|m, Thibaultmol, tech234a (tech234a), Explo, l0rd_enki|m, nano412510 (nano412510), tomodachi94 (tomodachi94), Video, thermospheric (Thermospheric), hlgs|m, s-crypt|m|m, qyxojzh|m, 
yzqzss (yzqzss), Nulo|m, mikolaj|m, CrispyAlice2 (CrispyAlice2), th3z0l4|m, qw3rty, IRC2DC, Terbium, eggdrop (eggdrop), pie_ (pie_), andrew (andrew), fionera (Fionera), ats (ats), BlankEclair (BlankEclair), AK (AK), kokos, alexlehm (alexlehm), c3manu (c3manu), @JAA (JAA), @AlsoJAA (JAA), plcp, PAARCLiCKS (s4n1ty), Craigle (Craigle), jodizzle (jodizzle), kdy (kdy), kokos|, monika (boom), lumidify (lumidify), knecht (knecht), katia_ (katia), OrIdow6 (OrIdow6), Nemo_bis (Nemo_bis), katia (katia), @ChanServ, kpcyrd (kpcyrd), [42] (N4Y), lea (lea_), sknebel (sknebel), Dj-Wawa (Dj-Wawa), nyany (nyany) |
17:40:28 | | AlsoHP_Archivist quits [Client Quit] |
17:40:43 | | HP_Archivist (HP_Archivist) joins |
17:52:07 | <katia> | how would i go about finding all links for a domain? can cdx do that? i might or might not want to match for a string in the search |
17:55:04 | <pokechu22> | The approach I use is https://web.archive.org/cdx/search/cdx?url=example.com&collapse=urlkey&matchType=domain&fl=original&limit=100000&showResumeKey=true&resumeKey= |
17:55:06 | <pokechu22> | https://web.archive.org/cdx/search/cdx?url=example.com&collapse=urlkey&matchType=domain&fl=original&limit=100000&showResumeKey=true&resumeKey=eJxLzs_VSa1IzC3ISdXUTywqyUzOSdU3MjAyNDQ0NjA0MQZS-gYGCkARUwNzQ1MjAxNDAwMAtIwN5g etc. I believe JAA has said the resumekey approach sometimes misses URLs but it's been good enough for my purposes |
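A rough sketch of the resumeKey loop pokechu22 describes, using only the query parameters that appear in the URLs above. The helper names (`build_query`, `split_page`, `iter_urls`) are made up for illustration. It assumes the server prints the resume key on the last line of the response, after an empty line, and that the key can be appended to the next request verbatim, as in the second URL above. As noted in the message above, this approach may miss some URLs.

```python
import urllib.parse
import urllib.request

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_query(domain, resume_key="", limit=100000):
    """Build the CDX URL from the chat, optionally continuing from a resume key."""
    params = {
        "url": domain,
        "collapse": "urlkey",
        "matchType": "domain",
        "fl": "original",
        "limit": str(limit),
        "showResumeKey": "true",
    }
    # Assumption: the key is pasted verbatim, exactly as the server printed it.
    return CDX_ENDPOINT + "?" + urllib.parse.urlencode(params) + "&resumeKey=" + resume_key


def split_page(body):
    """Split a response body into (urls, resume_key or None).

    Assumption: when more results exist, the body ends with an empty line
    followed by one line holding the resume key.
    """
    lines = body.splitlines()
    while lines and not lines[-1].strip():
        lines.pop()  # drop trailing blank lines
    if len(lines) >= 2 and not lines[-2].strip():
        return [l for l in lines[:-2] if l.strip()], lines[-1].strip()
    return [l for l in lines if l.strip()], None


def iter_urls(domain):
    """Yield original URLs page by page until no resume key is returned."""
    key = ""
    while key is not None:
        with urllib.request.urlopen(build_query(domain, key)) as resp:
            body = resp.read().decode("utf-8", errors="replace")
        urls, key = split_page(body)
        # Keep going on an empty page as long as a key came back.
        yield from urls
```

Something like `for url in iter_urls("example.com"): print(url)` would then walk the whole result set; matching for a string can be done locally on each yielded URL.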
18:30:50 | | Fusl quits [Client Quit] |
18:30:59 | | Fusl (Fusl) joins |
18:31:41 | | linuxgemini (linuxgemini) joins |
19:26:49 | | corentin joins |
19:36:36 | <@JAA> | Yeah, IIRC, the resumeKey approach doesn't work when the first page returns zero entries despite there being results, or something like that. It's been a few years though since I looked at that. |
19:38:05 | <pokechu22> | Ah, that sounds like an issue when a filter is present; I generally do it without any filter at all (meaning 4XX/5XX results are present in the output) |
19:40:55 | <@JAA> | I believe it also happened without a filter sometimes. But I don't remember the details, and it's also possible it's been fixed since. There were definitely some (undocumented?) changes to the CDX server. |
19:41:31 | <@JAA> | The other advantage of the pagination based on page numbers is that you can parallelise it. |
19:41:33 | <pokechu22> | Yeah, I think they changed what the resumekey looked like at some point in the past year or so (I think it was more human readable in the past) |
19:42:14 | <@JAA> | I usually run little-things/ia-cdx-search with 4 connections, which seems to be enough to avoid rate limits. |
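A minimal sketch of the page-number approach JAA mentions, fetched over 4 parallel connections. This is not the little-things/ia-cdx-search script, just an illustration. It assumes the CDX server's pagination parameters `showNumPages=true` (returns the page count) and `page=N` (zero-based page index); the helper names and the injectable `fetch` argument are hypothetical.

```python
import urllib.parse
import urllib.request
from concurrent.futures import ThreadPoolExecutor

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def page_url(domain, page=None):
    """URL for the page count (page=None) or for one numbered page."""
    params = {"url": domain, "matchType": "domain", "fl": "original"}
    if page is None:
        params["showNumPages"] = "true"  # assumption: response body is just the count
    else:
        params["page"] = str(page)
    return CDX_ENDPOINT + "?" + urllib.parse.urlencode(params)


def http_get(url):
    """Fetch a URL and return its body as text."""
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode("utf-8", errors="replace")


def fetch_all(domain, fetch=http_get, connections=4):
    """Fetch every page with a small thread pool and return all URLs in page order."""
    num_pages = int(fetch(page_url(domain)).strip())
    with ThreadPoolExecutor(max_workers=connections) as pool:
        # pool.map preserves page order even though requests run concurrently.
        bodies = pool.map(lambda p: fetch(page_url(domain, p)), range(num_pages))
        return [line for body in bodies for line in body.splitlines() if line.strip()]
```

Because each page is addressed by number rather than by the previous response, the requests are independent, which is what makes the parallel fetch possible. The pool size of 4 follows the figure in the message above; duplicates across captures would need to be removed locally.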
20:07:28 | | BornOn420 quits [Remote host closed the connection] |
20:07:59 | | BornOn420 (BornOn420) joins |
21:02:44 | | qwertyasdfuiopghjkl2 quits [Ping timeout: 260 seconds] |
21:03:27 | | qwertyasdfuiopghjkl2 joins |
21:03:27 | | qwertyasdfuiopghjkl2 is now authenticated as qwertyasdfuiopghjkl2 |
21:03:51 | | qwertyasdfuiopghjkl2 quits [Max SendQ exceeded] |
21:04:47 | | qwertyasdfuiopghjkl2 (qwertyasdfuiopghjkl2) joins |
21:05:11 | | qwertyasdfuiopghjkl2 quits [Max SendQ exceeded] |
21:06:08 | | qwertyasdfuiopghjkl2 joins |
21:06:08 | | qwertyasdfuiopghjkl2 is now authenticated as qwertyasdfuiopghjkl2 |
21:06:32 | | qwertyasdfuiopghjkl2 quits [Max SendQ exceeded] |
21:07:07 | | qwertyasdfuiopghjkl2 joins |
21:07:07 | | qwertyasdfuiopghjkl2 is now authenticated as qwertyasdfuiopghjkl2 |
21:07:31 | | qwertyasdfuiopghjkl2 quits [Max SendQ exceeded] |
21:08:06 | | qwertyasdfuiopghjkl2 joins |
21:08:06 | | qwertyasdfuiopghjkl2 is now authenticated as qwertyasdfuiopghjkl2 |
21:08:30 | | qwertyasdfuiopghjkl2 quits [Max SendQ exceeded] |
21:08:54 | | qwertyasdfuiopghjkl2 joins |
21:08:54 | | qwertyasdfuiopghjkl2 is now authenticated as qwertyasdfuiopghjkl2 |
21:09:18 | | qwertyasdfuiopghjkl2 quits [Max SendQ exceeded] |
21:10:08 | | qwertyasdfuiopghjkl2 joins |
21:10:08 | | qwertyasdfuiopghjkl2 is now authenticated as qwertyasdfuiopghjkl2 |
21:10:32 | | qwertyasdfuiopghjkl2 quits [Max SendQ exceeded] |
21:11:35 | | qwertyasdfuiopghjkl2 (qwertyasdfuiopghjkl2) joins |
21:11:59 | | qwertyasdfuiopghjkl2 quits [Max SendQ exceeded] |
21:12:18 | | qwertyasdfuiopghjkl2 (qwertyasdfuiopghjkl2) joins |
21:12:41 | | qwertyasdfuiopghjkl2 quits [Max SendQ exceeded] |
21:13:12 | | qwertyasdfuiopghjkl2 (qwertyasdfuiopghjkl2) joins |
21:13:36 | | qwertyasdfuiopghjkl2 quits [Max SendQ exceeded] |
21:14:29 | | qwertyasdfuiopghjkl2 joins |
21:14:29 | | qwertyasdfuiopghjkl2 is now authenticated as qwertyasdfuiopghjkl2 |
21:14:53 | | qwertyasdfuiopghjkl2 quits [Max SendQ exceeded] |
21:15:09 | | qwertyasdfuiopghjkl2 (qwertyasdfuiopghjkl2) joins |
21:15:33 | | qwertyasdfuiopghjkl2 quits [Max SendQ exceeded] |
21:16:16 | | qwertyasdfuiopghjkl2 (qwertyasdfuiopghjkl2) joins |
21:16:40 | | qwertyasdfuiopghjkl2 quits [Max SendQ exceeded] |
21:17:43 | | qwertyasdfuiopghjkl2 joins |
21:17:43 | | qwertyasdfuiopghjkl2 is now authenticated as qwertyasdfuiopghjkl2 |
21:18:07 | | qwertyasdfuiopghjkl2 quits [Max SendQ exceeded] |
21:19:02 | | qwertyasdfuiopghjkl2 joins |
21:19:02 | | qwertyasdfuiopghjkl2 is now authenticated as qwertyasdfuiopghjkl2 |
23:19:48 | | Lord_Nightmare quits [Quit: ZNC - http://znc.in] |
23:21:38 | | Lord_Nightmare (Lord_Nightmare) joins |
23:37:47 | | Lord_Nightmare quits [Client Quit] |
23:41:23 | | Lord_Nightmare (Lord_Nightmare) joins |