| 00:30:21 | <@OrIdow6> | Is the AB job for ShoutEngine slated to finish in time? |
| 00:34:32 | <@OrIdow6> | On curiouscat.qa, https://www.whois.com/whois/curiouscat.qa claims "Status: pendingDelete (Client requested delete)" |
| 00:38:10 | <jodizzle> | Yeah, saw something about that. |
| 00:38:17 | <jodizzle> | Strange that there's no announcement. |
| 00:38:44 | <jodizzle> | I AB'd the twitter and a couple app store pages, but couldn't find anything else. |
| 00:42:52 | | HackMii quits [Remote host closed the connection] |
| 00:43:32 | | HackMii (hacktheplanet) joins |
| 00:43:36 | <@JAA> | Status: serverHold (Expired) |
| 00:43:51 | <@JAA> | I guess they forgot to renew their domain? |
| 00:49:47 | <@OrIdow6> | Is "Client requested delete" something else then? Not familiar with DNS stuff |
| 00:49:59 | <@OrIdow6> | The DNS override trick doesn't seem to work, by the way |
| 01:00:01 | | dm4v quits [Client Quit] |
| 01:02:13 | | dm4v joins |
| 01:02:15 | | dm4v is now authenticated as dm4v |
| 01:02:15 | | dm4v quits [Changing host] |
| 01:02:15 | | dm4v (dm4v) joins |
| 01:19:57 | | Hackerpcs quits [Quit: Hackerpcs] |
| 01:20:40 | | Hackerpcs (Hackerpcs) joins |
| 01:25:00 | | qwertyasdfuiopghjkl quits [Remote host closed the connection] |
| 01:28:42 | | Myself quits [Read error: Connection reset by peer] |
| 01:28:43 | | Myself7 (myself) joins |
| 01:46:36 | | Myself7 quits [Read error: Connection reset by peer] |
| 01:47:18 | | Myself (myself) joins |
| 02:02:47 | | dm4v quits [Ping timeout: 265 seconds] |
| 02:06:04 | | dm4v joins |
| 02:06:06 | | dm4v is now authenticated as dm4v |
| 02:06:06 | | dm4v quits [Changing host] |
| 02:06:06 | | dm4v (dm4v) joins |
| 02:25:23 | | gazorpazorp quits [Client Quit] |
| 02:26:25 | | Myself quits [Read error: Connection reset by peer] |
| 02:31:10 | | Myself (myself) joins |
| 02:32:36 | | Arcorann_ joins |
| 02:36:09 | | Arcorann quits [Ping timeout: 258 seconds] |
| 02:37:36 | | gazorpazorp (gazorpazorp) joins |
| 02:44:17 | | march_happy quits [Remote host closed the connection] |
| 02:45:49 | | BlueMaxima quits [Client Quit] |
| 02:54:23 | | march_happy (march_happy) joins |
| 03:11:42 | <@JAA> | OrIdow6: DNS override seems to work fine for me. curl --resolve curiouscat.qa:443:104.26.8.190 https://curiouscat.qa/ -sv |
| 03:12:57 | <@JAA> | The IP comes from curiouscat.me, another older domain of the site which redirects to qa. Also resolves to 104.26.9.190 and 172.67.75.111. |
| 03:13:13 | <@OrIdow6> | Hm, I thought I tried that |
| 03:16:15 | <@OrIdow6> | I think I mistook their homepage for a Cloudflare error |
| 03:18:28 | <@OrIdow6> | Looks like it |
| 03:18:52 | <@OrIdow6> | Do we want to set something up for this? |
| 03:19:06 | <@OrIdow6> | In browser an /etc/hosts entry makes it work |
| 03:20:07 | | wyatt8750 quits [Remote host closed the connection] |
| 03:24:37 | | wyatt8740 joins |
| 03:29:03 | | wyatt8740 quits [Ping timeout: 258 seconds] |
| 03:29:07 | | wyatt8750 joins |
| 03:33:39 | | wyatt8750 quits [Ping timeout: 265 seconds] |
| 03:42:05 | | wyatt8740 joins |
| 03:43:36 | | katocala quits [Remote host closed the connection] |
| 04:04:06 | | ThreeHM quits [Ping timeout: 265 seconds] |
| 04:04:27 | | ThreeHM (ThreeHeadedMonkey) joins |
| 04:19:46 | | katocala joins |
| 04:20:06 | | katocala is now authenticated as katocala |
| 04:24:17 | | qw3rty_ joins |
| 04:28:05 | | qw3rty quits [Ping timeout: 258 seconds] |
| 04:31:10 | | tzt quits [Ping timeout: 265 seconds] |
| 04:43:30 | | tzt (tzt) joins |
| 05:21:03 | | eroc1990 quits [Client Quit] |
| 05:21:29 | | eroc1990 (eroc1990) joins |
| 05:43:06 | | HackMii quits [Remote host closed the connection] |
| 05:43:33 | | HackMii (hacktheplanet) joins |
| 07:06:18 | | spirit joins |
| 07:07:33 | | Lord_Nightmare quits [Ping timeout: 258 seconds] |
| 07:08:34 | | Lord_Nightmare (Lord_Nightmare) joins |
| 07:09:43 | <jodizzle> | From some brief poking at curiouscat.qa, it seems like the site is javascript heavy. There's an API, rooted at https://curiouscat.qa/api/v2.1/ |
| 07:10:57 | <jodizzle> | You can then use a `username` parameter to get data for a particular user's page: https://curiouscat.qa/api/v2.1/profile?username= |
| 07:11:43 | <jodizzle> | Pagination is handled with a `max_timestamp` parameter |
| 07:13:01 | <jodizzle> | Random example of both in use: https://curiouscat.qa/api/v2.1/profile?username=bbilarchive&max_timestamp=1631976526 |
| 07:15:06 | <jodizzle> | For finding profiles, one possibility is to do a twitter search for curiouscat.qa/curiouscat.me URLs |
| 07:15:46 | <@OrIdow6> | I looked at it |
| 07:15:51 | <@OrIdow6> | There's an endpoint for comment likes that can presumably be enumerated by ID |
| 07:15:55 | <@OrIdow6> | Not necessarily covering all users though |
| 07:16:06 | <@OrIdow6> | Only users who have ever liked a comment |
| 07:22:12 | <@OrIdow6> | *looked at CDX |
| 07:29:09 | <jodizzle> | What's the endpoint? |
| 07:42:47 | | monoxane4 quits [Client Quit] |
| 07:47:34 | | monoxane4 (monoxane) joins |
| 07:56:05 | | flashfire42 quits [Quit: The Lounge - https://thelounge.chat] |
| 07:56:06 | | s-crypt quits [Read error: Connection reset by peer] |
| 07:56:06 | | kiska quits [Write error: Connection reset by peer] |
| 07:56:06 | | Ryz2 quits [Read error: Connection reset by peer] |
| 07:59:13 | | fangfufu quits [Quit: ZNC 1.8.2+deb2+b1 - https://znc.in] |
| 08:03:56 | | Megame (Megame) joins |
| 08:05:17 | | fangfufu joins |
| 08:05:19 | | fangfufu is now authenticated as fangfufu |
| 08:16:05 | <@OrIdow6> | https://curiouscat.qa/api/v2/post/likes?postid=[ID]&_ob=none |
| 08:16:49 | <@OrIdow6> | Fairly sparse, though, so I'm not actually sure it's one per post, especially because it seems to go up to over a billion |
| 08:17:48 | <@OrIdow6> | So maybe something like your approach + discovery with backfeed would be best |
| 08:20:38 | <@OrIdow6> | User IDs around 10 million |
| 08:28:59 | <@OrIdow6> | arkiver: Can we get preliminary approval for this? "Curiouscat", unofficial Twitter sidecar site. Sometime within the last week, domain was marked for expiry/deletion, and DNS record removed (or maybe that just happened as part of the domain thing, not sure), no word whatsoever from operators. Archival would require overriding DNS for the entire site. |
| 08:31:41 | <@OrIdow6> | The hardcoded-IP thing does seem questionable to me in cases like this; what's the threshold of "authorized" access? |
| 08:32:51 | <@OrIdow6> | *exceeding authorized access |
| 08:40:34 | | march_happy quits [Ping timeout: 265 seconds] |
| 08:43:56 | | march_happy (march_happy) joins |
| 08:57:42 | <jodizzle> | Looks like the domain came back: https://curiouscat.qa/ |
| 09:03:04 | <@OrIdow6> | Looks like a domain name squatter or similar |
| 09:06:24 | <jodizzle> | Testing against some profiles from twitter, I put together a script that I think works to iterate the API. Here it is if it's useful to any efforts: https://transfer.archivete.am/RTuZ5/process_curiouscat_api_urls.py |
| 09:06:55 | | sonick quits [Client Quit] |
| 09:07:16 | <jodizzle> | Pretty typical API I think. |
| 09:09:19 | <jodizzle> | I'll leave the snscrape twitter-search I have for 'curiouscat.qa OR curiouscat.me' running, to keep collecting profiles, though I imagine someone might come through with a faster solution. |
| 09:33:50 | <AK> | I can do an snscrape too if needed, but seems like jodizzle is on it |
| 09:47:32 | | qwertyasdfuiopghjkl joins |
| 09:48:50 | | r000t joins |
| 10:28:42 | | Ryz2 (Ryz) joins |
| 10:28:43 | | s-crypt (s-crypt) joins |
| 10:29:32 | | kiska (kiska) joins |
| 11:03:18 | | march_happy quits [Ping timeout: 258 seconds] |
| 11:17:41 | <Arcorann_> | https://embracer.com/release/embracer-group-enters-into-agreement-to-acquire-perfect-world-entertainment/ <-- don't know if this is worth looking into or not, but posting here anyway |
| 11:21:08 | | ragu_ quits [Client Quit] |
| 11:32:35 | | monoxane4 quits [Client Quit] |
| 11:37:30 | | monoxane4 (monoxane) joins |
| 12:47:48 | | x9fff00 (x9fff00) joins |
| 12:54:46 | | x9fff00 quits [Client Quit] |
| 12:56:16 | | sonick (sonick) joins |
| 12:59:11 | | x9fff00 (x9fff00) joins |
| 12:59:22 | | x9fff00 quits [Client Quit] |
| 13:28:35 | | Arcorann_ quits [Ping timeout: 258 seconds] |
| 14:05:18 | <@arkiver> | OrIdow6: not sure about faking DNS resolution |
| 14:05:20 | <@arkiver> | probably not |
| 14:15:20 | <@OrIdow6> | With this squatter having taken over the domain I suppose I have to agree |
| 14:18:56 | <@arkiver> | feel free to get the data in other ways though (just no faking of anything, including DNS) |
| 14:22:43 | <@OrIdow6> | Don't worry |
| 14:27:43 | <@OrIdow6> | Could have the WARC-Target-URI be the IP address; wget doesn't seem to work with this, though, and I suspect having an ignored Host: header wouldn't do well in playback |
| 14:30:40 | <@OrIdow6> | A quick look through the WARC spec doesn't look like that technically violates anything, but again |
| 14:35:14 | <@OrIdow6> | Any thoughts on just putting the responses (or HARs or something) into WARC resource records? Could use the existing infrastructure (I think) but doesn't risk tripping up anything |
| 14:36:32 | | Megame quits [Client Quit] |
| 14:43:22 | <@OrIdow6> | Might use the wget-lua with just the DNS override Lua hook and no warc output; then take wget's vanilla mode output files and in the pipeline turn those into a WARC of resource records |
| 14:44:00 | <@OrIdow6> | If anyone has an easy way to ignore SNI problems when using Requests, that could simplify things |
| 14:45:51 | <@OrIdow6> | I assume that's what it is, anyhow |
| 14:51:19 | <@OrIdow6> | s/Ignore SNI problems/override SNI// but I may have found an easier way, anyhow: https://stackoverflow.com/questions/22609385/python-requests-library-define-specific-dns#22614367 |
| 14:51:40 | <@OrIdow6> | May work on this later, tell me if anyone has any comments |
| 15:50:28 | | lukash7 joins |
| 15:54:04 | | lukash7 quits [Client Quit] |
| 16:11:41 | | lukash7 joins |
| 16:34:33 | | lukash7 quits [Client Quit] |
| 16:37:01 | | lukash7 joins |
| 16:39:25 | | lukash7 quits [Client Quit] |
| 16:42:53 | | lukash7 joins |
| 16:51:37 | | Random quits [Remote host closed the connection] |
| 16:57:25 | | lukash7 quits [Client Quit] |
| 16:59:48 | | lukash7 joins |
| 17:09:20 | | lukash7 quits [Client Quit] |
| 17:20:37 | | spirit quits [Client Quit] |
| 17:26:24 | | lukash7 joins |
| 17:35:43 | | lukash7 quits [Client Quit] |
| 17:45:17 | <r000t> | Does warrior use DNS from DHCP, and if so, how do I manually specify a known-clean DNS server? |
| 17:46:51 | <jodizzle> | OrIdow6: Why does the SO post you linked not count as faking DNS? |
| 17:52:05 | | lukash7 joins |
| 17:56:32 | | lukash7 quits [Client Quit] |
| 17:59:59 | | yawkat quits [Ping timeout: 258 seconds] |
| 18:00:21 | | yawkat (yawkat) joins |
| 18:06:33 | | appledash quits [Ping timeout: 265 seconds] |
| 18:13:10 | | appledash (appledash) joins |
| 18:23:11 | <IDK> | i'm kinda wondering if I got banned on IA |
| 18:23:36 | <IDK> | I recently started having the site take too long to respond |
| 18:27:37 | | lukash7 joins |
| 18:40:45 | | lukash7 quits [Client Quit] |
| 18:43:49 | | lukash7 joins |
| 19:19:03 | | qwertyasdfuiopghjkl quits [Ping timeout: 244 seconds] |
| 19:20:08 | | qwertyasdfuiopghjkl joins |
| 20:09:37 | <@JAA> | OrIdow6: Re ShoutEngine, probably not given it's been running for 9 days, there are about 9 days left, and it's only run about a third through the current queue. Unfortunately, their server seems to break easily. |
| 20:54:45 | | tzt quits [Ping timeout: 265 seconds] |
| 20:55:09 | | tzt (tzt) joins |
| 21:13:03 | | BlueMaxima joins |
| 21:25:41 | | aleph quits [Ping timeout: 265 seconds] |
| 21:26:09 | | aleph joins |
| 21:30:01 | | spirit joins |
| 21:36:50 | | sonick quits [Client Quit] |
| 21:46:01 | | Retroity joins |
| 21:48:11 | | sonick (sonick) joins |
| 21:50:41 | <@JAA> | Rerun of the Channel 9 video errors finished, now rerunning some 222 URLs again that seem like they should be alive. The other 6589 appear to be well and truly dead, mostly NXDOMAIN. |
| 21:51:04 | <@JAA> | Also, TIL there used to be a channel8.msdn.com. |
| 22:09:34 | | DogsRNice (Webuser299) joins |
| 22:49:29 | | Arcorann (Arcorann) joins |
| 23:16:33 | <h2ibot> | Markp93 edited ArchiveTeam Warrior (+0, W to w): https://wiki.archiveteam.org/?diff=48041&oldid=48017 |
| 23:41:44 | | Retroity quits [Remote host closed the connection] |
| 23:49:44 | | mutantmnky quits [Remote host closed the connection] |
| 23:50:00 | | mutantmnky (mutantmonkey) joins |
| 23:54:45 | | Jake (Jake) joins |
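
For reference, the two curiouscat.qa API URL shapes mentioned in the log (the `profile` endpoint with `username` and `max_timestamp`, and the `post/likes` endpoint with `postid` and `_ob=none`) could be built roughly as below. This is only a sketch of URL construction: the function names are hypothetical, the response format is not described in the log and is not assumed here, and this is not the script jodizzle linked.

```python
from urllib.parse import urlencode

# API root as quoted in the log; the site may no longer answer at this address.
API_ROOT = "https://curiouscat.qa/api"


def profile_url(username, max_timestamp=None):
    """Build a profile URL; max_timestamp is the pagination parameter from the log."""
    params = {"username": username}
    if max_timestamp is not None:
        params["max_timestamp"] = max_timestamp
    return f"{API_ROOT}/v2.1/profile?{urlencode(params)}"


def post_likes_url(post_id):
    """Build a post-likes URL for a numeric post ID (hypothetical helper name)."""
    params = {"postid": post_id, "_ob": "none"}
    return f"{API_ROOT}/v2/post/likes?{urlencode(params)}"


if __name__ == "__main__":
    # Reproduces the example URL jodizzle posted at 07:13:01.
    print(profile_url("bbilarchive", 1631976526))
    print(post_likes_url(12345))
```

How a client would pick the next `max_timestamp` value from a response is not stated in the log, so that step is left out.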