00:30:21<@OrIdow6>Is the AB job for ShoutEngine slated to finish in time?
00:34:32<@OrIdow6>On curiouscat.qa, https://www.whois.com/whois/curiouscat.qa claims "Status: pendingDelete (Client requested delete)"
00:38:10<jodizzle>Yeah, saw something about that.
00:38:17<jodizzle>Strange that there's no announcement.
00:38:44<jodizzle>I AB'd the twitter and a couple app store pages, but couldn't find anything else.
00:42:52HackMii quits [Remote host closed the connection]
00:43:32HackMii (hacktheplanet) joins
00:43:36<@JAA>Status: serverHold (Expired)
00:43:51<@JAA>I guess they forgot to renew their domain?
00:49:47<@OrIdow6>Is "Client requested delete" something else then? Not familiar with DNS stuff
00:49:59<@OrIdow6>The DNS override trick doesn't seem to work, by the way
01:00:01dm4v quits [Client Quit]
01:02:13dm4v joins
01:02:15dm4v quits [Changing host]
01:02:15dm4v (dm4v) joins
01:19:57Hackerpcs quits [Quit: Hackerpcs]
01:20:40Hackerpcs (Hackerpcs) joins
01:25:00qwertyasdfuiopghjkl quits [Remote host closed the connection]
01:28:42Myself quits [Read error: Connection reset by peer]
01:28:43Myself7 (myself) joins
01:46:36Myself7 quits [Read error: Connection reset by peer]
01:47:18Myself (myself) joins
02:02:47dm4v quits [Ping timeout: 265 seconds]
02:06:04dm4v joins
02:06:06dm4v quits [Changing host]
02:06:06dm4v (dm4v) joins
02:25:23gazorpazorp quits [Client Quit]
02:26:25Myself quits [Read error: Connection reset by peer]
02:31:10Myself (myself) joins
02:32:36Arcorann_ joins
02:36:09Arcorann quits [Ping timeout: 258 seconds]
02:37:36gazorpazorp (gazorpazorp) joins
02:44:17march_happy quits [Remote host closed the connection]
02:45:49BlueMaxima quits [Client Quit]
02:54:23march_happy (march_happy) joins
03:11:42<@JAA>OrIdow6: DNS override seems to work fine for me. curl --resolve curiouscat.qa:443:104.26.8.190 https://curiouscat.qa/ -sv
03:12:57<@JAA>The IP comes from curiouscat.me, another older domain of the site which redirects to qa. Also resolves to 104.26.9.190 and 172.67.75.111.
03:13:13<@OrIdow6>Hm, I thought I tried that
03:16:15<@OrIdow6>I think I mistook their homepage for a Cloudflare error
03:18:28<@OrIdow6>Looks like it
03:18:52<@OrIdow6>Do we want to set something up for this?
03:19:06<@OrIdow6>In a browser, an /etc/hosts entry makes it work
03:20:07wyatt8750 quits [Remote host closed the connection]
03:24:37wyatt8740 joins
03:29:03wyatt8740 quits [Ping timeout: 258 seconds]
03:29:07wyatt8750 joins
03:33:39wyatt8750 quits [Ping timeout: 265 seconds]
03:42:05wyatt8740 joins
03:43:36katocala quits [Remote host closed the connection]
04:04:06ThreeHM quits [Ping timeout: 265 seconds]
04:04:27ThreeHM (ThreeHeadedMonkey) joins
04:19:46katocala joins
04:24:17qw3rty_ joins
04:28:05qw3rty quits [Ping timeout: 258 seconds]
04:31:10tzt quits [Ping timeout: 265 seconds]
04:43:30tzt (tzt) joins
05:21:03eroc1990 quits [Client Quit]
05:21:29eroc1990 (eroc1990) joins
05:43:06HackMii quits [Remote host closed the connection]
05:43:33HackMii (hacktheplanet) joins
07:06:18spirit joins
07:07:33Lord_Nightmare quits [Ping timeout: 258 seconds]
07:08:34Lord_Nightmare (Lord_Nightmare) joins
07:09:43<jodizzle>From some brief poking at curiouscat.qa, it seems like the site is javascript heavy. There's an API, rooted at https://curiouscat.qa/api/v2.1/
07:10:57<jodizzle>You can then use a `username` parameter to get data for a particular user's page: https://curiouscat.qa/api/v2.1/profile?username=
07:11:43<jodizzle>Pagination is handled with a `max_timestamp` parameter
07:13:01<jodizzle>Random example of both in use: https://curiouscat.qa/api/v2.1/profile?username=bbilarchive&max_timestamp=1631976526
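The API usage jodizzle describes above can be sketched as follows. The endpoint, `username`, and `max_timestamp` parameters come straight from the messages; the JSON field names (`posts`, `timestamp`) and whether the cursor is inclusive or exclusive are assumptions.

```python
# Sketch of paging the curiouscat profile API described above.
# API root and parameter names are from the chat; JSON field names
# ("posts", "timestamp") are assumptions, not verified.
API_ROOT = "https://curiouscat.qa/api/v2.1"

def profile_url(username, max_timestamp=None):
    """Build a profile API URL, optionally with the pagination cursor."""
    url = f"{API_ROOT}/profile?username={username}"
    if max_timestamp is not None:
        url += f"&max_timestamp={max_timestamp}"
    return url

def next_cursor(posts):
    """Next page presumably starts at the oldest timestamp seen so far.
    (Whether the API treats max_timestamp as inclusive is unknown.)"""
    if not posts:
        return None  # no more pages
    return min(p["timestamp"] for p in posts)
```

A loop would then fetch `profile_url(user)`, feed the returned posts to `next_cursor`, and repeat until it returns `None` or the API stops yielding new posts.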
07:15:06<jodizzle>For finding profiles, one possibility is to do a twitter search for curiouscat.qa/curiouscat.me URLs
07:15:46<@OrIdow6>I looked at it
07:15:51<@OrIdow6>There's an endpoint for comment likes that can presumably be enumerated by ID
07:15:55<@OrIdow6>Not necessarily covering all users though
07:16:06<@OrIdow6>Only users who have ever liked a comment
07:22:12<@OrIdow6>*looked at CDX
07:29:09<jodizzle>What's the endpoint?
07:42:47monoxane4 quits [Client Quit]
07:47:34monoxane4 (monoxane) joins
07:56:05flashfire42 quits [Quit: The Lounge - https://thelounge.chat]
07:56:06s-crypt quits [Read error: Connection reset by peer]
07:56:06kiska quits [Write error: Connection reset by peer]
07:56:06Ryz2 quits [Read error: Connection reset by peer]
07:59:13fangfufu quits [Quit: ZNC 1.8.2+deb2+b1 - https://znc.in]
08:03:56Megame (Megame) joins
08:05:17fangfufu joins
08:16:05<@OrIdow6>https://curiouscat.qa/api/v2/post/likes?postid=[ID]&_ob=none

08:16:49<@OrIdow6>Fairly sparse, though, so I'm not actually sure it's one per post, especially because it seems to go up to over a billion
08:17:48<@OrIdow6>So maybe something like your approach + discovery with backfeed would be best
08:20:38<@OrIdow6>User IDs around 10 million
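Enumerating the likes endpoint OrIdow6 posted would look something like the sketch below. The URL shape (including `_ob=none`) is from the chat; given that the ID space apparently runs past a billion and is sparse, anything built on this would likely sample or chunk the range rather than walk it linearly, as suggested by the "your approach + discovery with backfeed" comment.

```python
# Hedged sketch: generate post-likes URLs over a post-ID range.
# Endpoint shape is from the chat log; how to partition the ~1e9
# sparse ID space (sampling, chunking, backfeed) is left open.
def likes_urls(start_id, end_id):
    """Yield one likes-endpoint URL per post ID in [start_id, end_id)."""
    for postid in range(start_id, end_id):
        yield f"https://curiouscat.qa/api/v2/post/likes?postid={postid}&_ob=none"
```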
08:28:59<@OrIdow6>arkiver: Can we get preliminary approval for this? "Curiouscat", unofficial Twitter sidecar site. Sometime within the last week, domain was marked for expiry/deletion, and DNS record removed (or maybe that just happened as part of the domain thing, not sure), no word whatsoever from operators. Archival would require overriding DNS for the entire site.
08:31:41<@OrIdow6>The hardcoded-IP thing does seem questionable to me in cases like this; what's the threshold of "authorized" access?
08:32:51<@OrIdow6>*exceeding authorized access
08:40:34march_happy quits [Ping timeout: 265 seconds]
08:43:56march_happy (march_happy) joins
08:57:42<jodizzle>Looks like the domain came back: https://curiouscat.qa/
09:03:04<@OrIdow6>Looks like a domain name squatter or similar
09:06:24<jodizzle>Testing against some profiles from twitter, I put together a script that I think works to iterate the API. Here it is if it's useful to any efforts: https://transfer.archivete.am/RTuZ5/process_curiouscat_api_urls.py
09:06:55sonick quits [Client Quit]
09:07:16<jodizzle>Pretty typical API I think.
09:09:19<jodizzle>I'll leave the snscrape twitter-search I have for 'curiouscat.qa OR curiouscat.me' running, to keep collecting profiles, though I imagine someone might come through with a faster solution.
09:33:50<AK>I can do an snscrape too if needed, but seems like jodizzle is on it
09:47:32qwertyasdfuiopghjkl joins
09:48:50r000t joins
10:28:42Ryz2 (Ryz) joins
10:28:43s-crypt (s-crypt) joins
10:29:32kiska (kiska) joins
11:03:18march_happy quits [Ping timeout: 258 seconds]
11:17:41<Arcorann_>https://embracer.com/release/embracer-group-enters-into-agreement-to-acquire-perfect-world-entertainment/ <-- don't know if this is worth looking into or not, but posting here anyway
11:21:08ragu_ quits [Client Quit]
11:32:35monoxane4 quits [Client Quit]
11:37:30monoxane4 (monoxane) joins
12:47:48x9fff00 (x9fff00) joins
12:54:46x9fff00 quits [Client Quit]
12:56:16sonick (sonick) joins
12:59:11x9fff00 (x9fff00) joins
12:59:22x9fff00 quits [Client Quit]
13:28:35Arcorann_ quits [Ping timeout: 258 seconds]
14:05:18<@arkiver>OrIdow6: not sure about faking DNS resolution
14:05:20<@arkiver>probably not
14:15:20<@OrIdow6>With this squatter having taken over the domain I suppose I have to agree
14:18:56<@arkiver>feel free to get the data in other ways though (just no faking of anything, including DNS)
14:22:43<@OrIdow6>Don't worry
14:27:43<@OrIdow6>Could have the WARC-Target-URI be the IP address; wget doesn't seem to work with this, though, and I suspect having an ignored Host: header wouldn't do well in playback
14:30:40<@OrIdow6>From a quick look through the WARC spec, that doesn't technically violate anything, but again
14:35:14<@OrIdow6>Any thoughts on just putting the responses (or HARs or something) into WARC resource records? Could use the existing infrastructure (I think) but doesn't risk tripping up anything
14:36:32Megame quits [Client Quit]
14:43:22<@OrIdow6>Might use the wget-lua with just the DNS override Lua hook and no warc output; then take wget's vanilla mode output files and in the pipeline turn those into a WARC of resource records
14:44:00<@OrIdow6>If anyone has an easy way to ignore SNI problems when using Requests, that could simplify things
14:45:51<@OrIdow6>I assume that's what it is, anyhow
14:51:19<@OrIdow6>s/ignore SNI problems/override SNI/ but I may have found an easier way, anyhow: https://stackoverflow.com/questions/22609385/python-requests-library-define-specific-dns#22614367
14:51:40<@OrIdow6>May work on this later, tell me if anyone has any comments
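The "resolve a host to a fixed IP" trick being discussed above (in the spirit of the linked Stack Overflow answer) can be sketched as a pure URL rewrite: connect to the IP, carry the real name in the Host header. The example IP is the one JAA found for curiouscat.qa; note that over HTTPS this rewrite alone produces exactly the SNI/certificate mismatch OrIdow6 mentions, and urllib3's `server_hostname`/`assert_hostname` pool options are one possible way around that.

```python
# Minimal sketch of a DNS-override-style rewrite for use with Requests.
# Caveat (matches the SNI problem discussed above): for HTTPS, pointing
# the URL at a bare IP breaks SNI and certificate verification unless
# the TLS layer is told the original hostname separately.
from urllib.parse import urlsplit, urlunsplit

def override_host(url, hostname, ip):
    """Return (rewritten_url, host_header); host_header is None if the
    URL's host didn't match and nothing was rewritten."""
    parts = urlsplit(url)
    if parts.hostname != hostname:
        return url, None
    netloc = parts.netloc.replace(hostname, ip)
    return urlunsplit(parts._replace(netloc=netloc)), parts.netloc
```

Usage would be along the lines of `url, host = override_host(u, "curiouscat.qa", "104.26.8.190")` followed by `requests.get(url, headers={"Host": host})`, with the HTTPS/SNI caveat above.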
15:50:28lukash7 joins
15:54:04lukash7 quits [Client Quit]
16:11:41lukash7 joins
16:34:33lukash7 quits [Client Quit]
16:37:01lukash7 joins
16:39:25lukash7 quits [Client Quit]
16:42:53lukash7 joins
16:51:37Random quits [Remote host closed the connection]
16:57:25lukash7 quits [Client Quit]
16:59:48lukash7 joins
17:09:20lukash7 quits [Client Quit]
17:20:37spirit quits [Client Quit]
17:26:24lukash7 joins
17:35:43lukash7 quits [Client Quit]
17:45:17<r000t>Does warrior use DNS from DHCP, and if so, how do I manually specify a known-clean DNS server?
17:46:51<jodizzle>OrIdow6: Why does the SO post you linked not count as faking DNS?
17:52:05lukash7 joins
17:56:32lukash7 quits [Client Quit]
17:59:59yawkat quits [Ping timeout: 258 seconds]
18:00:21yawkat (yawkat) joins
18:06:33appledash quits [Ping timeout: 265 seconds]
18:13:10appledash (appledash) joins
18:23:11<IDK>i'm kinda wondering if I got banned on IA
18:23:36<IDK>I've recently started seeing the site take too long to respond
18:27:37lukash7 joins
18:40:45lukash7 quits [Client Quit]
18:43:49lukash7 joins
19:19:03qwertyasdfuiopghjkl quits [Ping timeout: 244 seconds]
19:20:08qwertyasdfuiopghjkl joins
20:09:37<@JAA>OrIdow6: Re ShoutEngine, probably not given it's been running for 9 days, there are about 9 days left, and it's only run about a third through the current queue. Unfortunately, their server seems to break easily.
20:54:45tzt quits [Ping timeout: 265 seconds]
20:55:09tzt (tzt) joins
21:13:03BlueMaxima joins
21:25:41aleph quits [Ping timeout: 265 seconds]
21:26:09aleph joins
21:30:01spirit joins
21:36:50sonick quits [Client Quit]
21:46:01Retroity joins
21:48:11sonick (sonick) joins
21:50:41<@JAA>Rerun of the Channel 9 video errors finished, now rerunning some 222 URLs again that seem like they should be alive. The other 6589 appear to be well and truly dead, mostly NXDOMAIN.
21:51:04<@JAA>Also, TIL there used to be a channel8.msdn.com.
22:09:34DogsRNice (Webuser299) joins
22:49:29Arcorann (Arcorann) joins
23:16:33<h2ibot>Markp93 edited ArchiveTeam Warrior (+0, W to w): https://wiki.archiveteam.org/?diff=48041&oldid=48017
23:41:44Retroity quits [Remote host closed the connection]
23:49:44mutantmnky quits [Remote host closed the connection]
23:50:00mutantmnky (mutantmonkey) joins
23:54:45Jake (Jake) joins