07:47:23Maturion joins
08:36:59Maturion quits [Remote host closed the connection]
12:24:26shreyasminocha quits [Remote host closed the connection]
12:24:32shreyasminocha (shreyasminocha) joins
14:10:33kiryu joins
14:10:33kiryu quits [Changing host]
14:10:33kiryu (kiryu) joins
17:28:30Maturion joins
18:06:58pabs quits [Ping timeout: 255 seconds]
18:08:10pabs (pabs) joins
18:31:43systwi quits [Ping timeout: 255 seconds]
18:44:59systwi (systwi) joins
19:23:28tzt quits [Ping timeout: 255 seconds]
21:10:22Maturion quits [Remote host closed the connection]
21:45:17<imer>JAA: https://transfer.archivete.am/KEQpz/www.people.vcu.edu.txt this is what I got from common crawl cdx, might be dupes of IA cdx data
21:45:17<eggdrop>inline (for browser viewing): https://transfer.archivete.am/inline/KEQpz/www.people.vcu.edu.txt
21:58:22<@JAA>imer: Thanks, looks like that yielded one extra URL which is a 404.
21:58:57<imer>woo lol
21:59:15<imer>my guess was correct at least then
21:59:48<@JAA>I wonder why it didn't show up in my lists. I didn't filter the CDX results to 200s or similar.
22:00:17<@JAA>And I thought all CC data is in the WBM. Maybe not...
22:12:27<imer>maybe the "newest" crawl isn't just yet or something
22:13:16<@JAA>Yeah, could be. The CDX API also isn't a full reflection of the WBM IIRC. Something something layers of indices.