| 00:00:46 | <endrift> | it does to me too but a lot of them are marked as software |
| 00:01:06 | <endrift> | but I'll wait on a second opinion |
| 00:10:27 | <pabs> | "Codecademy to be acquired" https://www.skillsoft.com/press-releases/skillsoft-to-acquire-codecademy-a-leading-platform-for-learning-high-demand-technical-skills-creating-a-worldwide-community-of-more-than-85-million-learners https://news.ycombinator.com/item?id=29649755 |
| 00:12:13 | <pabs> | Codecademy site looks a bit JS-y but still some non-JS things |
| 00:12:41 | <pabs> | and lots of stuff behind login, but still some non-login bits |
| 00:42:42 | | HP_Archivist quits [Read error: Connection reset by peer] |
| 00:42:59 | | HP_Archivist (HP_Archivist) joins |
| 01:00:02 | | dm4v quits [Client Quit] |
| 01:04:42 | | dm4v joins |
| 01:04:44 | | dm4v is now authenticated as dm4v |
| 01:04:44 | | dm4v quits [Changing host] |
| 01:04:44 | | dm4v (dm4v) joins |
| 01:36:23 | | tzt quits [Ping timeout: 265 seconds] |
| 01:40:39 | | tzt (tzt) joins |
| 02:01:40 | <@JAA> | Everything from Channel 9 that's still available should be covered now. 24.5 TiB in total, last data uploading now. |
| 02:03:10 | | dm4v quits [Ping timeout: 258 seconds] |
| 02:05:01 | | dm4v joins |
| 02:05:04 | | dm4v is now authenticated as dm4v |
| 02:05:04 | | dm4v quits [Changing host] |
| 02:05:04 | | dm4v (dm4v) joins |
| 03:21:06 | | lennier1 quits [Client Quit] |
| 03:22:12 | | pabs quits [Quit: Don't rest until all the world is paved in moss and greenery.] |
| 03:24:41 | | pabs (pabs) joins |
| 04:21:48 | | qw3rty_ joins |
| 04:25:33 | | qw3rty__ quits [Ping timeout: 265 seconds] |
| 04:26:22 | | march_happy (march_happy) joins |
| 04:28:33 | <h2ibot> | OrIdow6 created Galerie.cz (+21, Redirected page to [[Blog.cz]]): https://wiki.archiveteam.org/?title=Galerie.cz |
| 04:39:35 | <h2ibot> | OrIdow6 created CuriousCat (+1371, Created page with "{{Infobox project | URL =…): https://wiki.archiveteam.org/?title=CuriousCat |
| 04:44:16 | | qwertyasdfuiopghjkl quits [Remote host closed the connection] |
| 04:47:08 | | qwertyasdfuiopghjkl joins |
| 05:39:43 | <@OrIdow6> | Anyone have an easy way to get the IP address of m.curiouscat.qa prior to expiry? |
| 05:40:44 | | NotWebuser joins |
| 05:43:25 | <@OrIdow6> | jodizzle: Have any good Curiouscat test pages? |
| 05:43:58 | | Jonboy345 quits [Ping timeout: 258 seconds] |
| 05:46:29 | <jodizzle> | I tested using 104.26.8.190 again for m.curiouscat.qa and I think I got valid content on a URL |
| 05:49:11 | <jodizzle> | OrIdow6: Here's a list of API URLs for different profiles, if that's what you mean by test pages: https://transfer.archivete.am/g9V4z/curiouscat-search-3-min-faves-100-api-urls.txt |
| 05:49:57 | <@OrIdow6> | Profiles with unusual or edge cases that you found writing your iterator, I mean |
| 05:50:48 | <jodizzle> | Hmm |
| 05:51:45 | <@OrIdow6> | Thanks for that IP, by the way |
| 05:54:34 | <eroc1990> | Off topic but slightly on topic, looks like curiouscat was bought and their new owner let the domain lapse |
| 05:54:35 | <eroc1990> | https://twitter.com/Balbonator/status/1473248444643004417?s=20 |
| 05:54:51 | <eroc1990> | straight from their technical cofounder's mouth |
| 05:55:34 | <@OrIdow6> | That's what we guessed eroc1990 |
| 05:58:38 | <eroc1990> | Figured as much. Good to see some sort of confirmation of that though. |
| 05:58:43 | <jodizzle> | OrIdow6: I don't have any unusual profiles on hand, and the box I'm using is currently banned so it's not easy to re-check anything. But here's the most updated version of the iterator I'm using, if it helps: https://transfer.archivete.am/aZhye/process_curiouscat_api_urls.py |
| 05:59:16 | <jodizzle> | That's been stable for a while. Turns out the ban isn't necessarily a permaban, just a very long one (like 1+ day) |
| 06:00:02 | <@OrIdow6> | Some of the response-reading stuff looks useful |
| 06:00:13 | <@OrIdow6> | You got banned from Curiouscat? Thought it was from Twitter |
| 06:00:16 | <jodizzle> | The main thing I remember being an annoying detail is that sometimes the struct in the JSON containing the timestamp is called 'post', and sometimes it's called 'status' |
| 06:00:51 | <jodizzle> | No, it was curiouscat. It seems to be pegged to total number of requests or something. Currently I'm banned again, but that one may expire as well. |
| 06:01:37 | <jodizzle> | (Total number of requests over a long period as opposed to requesting very quickly, I mean) |
| 06:01:44 | <@OrIdow6> | What does a ban look like? |
| 06:02:49 | <jodizzle> | You get this response when requesting a profile: {'error': 'Wait a bit', 'error_code': 'ratelimited'} |
| 06:03:12 | <@OrIdow6> | Is that on the human-friendly pages or just API requests? |
| 06:03:24 | <@OrIdow6> | Human-friendly URLs, I should say |
| 06:04:05 | <@OrIdow6> | And what's the response code? |
| 06:04:17 | <jodizzle> | I've only iterated on the API requests, not human-friendly URLs |
| 06:04:26 | <@OrIdow6> | Also, how many requests over a long time? I hope this isn't another site where I'll be banned while testing the script |
| 06:04:47 | <@OrIdow6> | Not exactly, just, like, 100 or 100000 |
| 06:05:25 | <jodizzle> | Looks like that error JSON comes with a response code of 200 |
| 06:05:44 | <@OrIdow6> | Naturally |
| 06:06:11 | <@OrIdow6> | Thank you |
| 06:08:33 | <jodizzle> | Looks like the iterator once went through 7896 requests before getting hit with the ban |
| 06:09:33 | <jodizzle> | This was with a download delay in place |
| 06:19:44 | | lennier1 (lennier1) joins |
| 06:22:02 | | Jonboy345 joins |
| 07:12:00 | | qwertyasdfuiopghjkl quits [Ping timeout: 244 seconds] |
| 08:05:26 | <@OrIdow6> | jodizzle: Would you mind doing a curl or similar against a human-friendly page from a banned machine? |
| 08:12:58 | | march_happy quits [Remote host closed the connection] |
| 08:13:38 | <@OrIdow6> | I suppose it shouldn't be critical as long as it doesn't happen independently of the API |
| 08:15:37 | | march_happy (march_happy) joins |
| 08:18:45 | <@OrIdow6> | jodizzle: Never mind, actually |
| 08:30:27 | | march_happy quits [Remote host closed the connection] |
| 08:32:34 | | march_happy (march_happy) joins |
| 08:38:36 | | BlueMaxima quits [Read error: Connection reset by peer] |
| 08:41:50 | | march_happy quits [Remote host closed the connection] |
| 09:10:35 | | HackMii_ quits [Remote host closed the connection] |
| 09:11:11 | | HackMii_ (hacktheplanet) joins |
| 09:22:27 | <@OrIdow6> | jodizzle: Have any examples of accounts with status instead of post? |
| 09:24:50 | <jodizzle> | OrIdow6: Try https://curiouscat.qa/api/v2.1/profile?username=tetekoobsf and look at the last element of the `posts` array. |
| 09:31:26 | <@OrIdow6> | Thank you |
| 10:05:32 | | Jonboy3451 joins |
| 10:08:26 | | Jonboy345 quits [Ping timeout: 240 seconds] |
| 10:26:25 | <@OrIdow6> | Near done |
| 10:52:55 | | driib798943 (driib) joins |
| 10:54:09 | | driib79894 quits [Ping timeout: 265 seconds] |
| 10:54:09 | | driib798943 is now known as driib79894 |
| 11:04:25 | <@OrIdow6> | Will finish it up in the morning, just "branding" and CDX lists now |
| 13:23:58 | | Arcorann quits [Ping timeout: 258 seconds] |
| 13:49:13 | | sec^nd quits [Remote host closed the connection] |
| 13:49:50 | | sec^nd (second) joins |
| 13:54:38 | | Sluggs quits [Ping timeout: 258 seconds] |
| 14:28:25 | | qwertyasdfuiopghjkl joins |
| 14:29:20 | | Sluggs joins |
| 14:53:16 | | march_happy (march_happy) joins |
| 14:55:33 | | sec^nd quits [Remote host closed the connection] |
| 14:55:58 | | sec^nd (second) joins |
| 15:10:36 | | thelounge316 joins |
| 15:13:13 | | thelounge31 quits [Ping timeout: 265 seconds] |
| 15:13:13 | | thelounge316 is now known as thelounge31 |
| 15:14:10 | | Megame (Megame) joins |
| 15:17:37 | | sec^nd quits [Remote host closed the connection] |
| 15:17:57 | | sec^nd (second) joins |
| 15:31:40 | <@arkiver> | CDX lists? |
| 15:39:39 | | Ruthalas quits [Quit: Ping timeout (120 seconds)] |
| 15:39:45 | | superkuh quits [Remote host closed the connection] |
| 15:40:08 | | Ruthalas (Ruthalas) joins |
| 15:40:38 | | superkuh joins |
| 16:34:29 | | march_happy quits [Ping timeout: 258 seconds] |
| 17:15:59 | | spirit quits [Client Quit] |
| 18:00:26 | | beluga joins |
| 18:00:34 | <beluga> | How do I archive YouTube comments and replays as well as all comments metadata like likes and who made it. |
| 18:00:48 | <beluga> | Like what is the easiest way |
| 18:00:59 | <beluga> | Archive today can't load the new |
| 18:05:32 | <beluga> | Bye . |
| 18:05:35 | | beluga leaves |
| 18:38:11 | | Megame quits [Client Quit] |
| 18:44:02 | | spirit joins |
| 19:50:46 | | leo60228- quits [Ping timeout: 240 seconds] |
| 19:55:10 | | leo60228 (leo60228) joins |
| 20:37:11 | <@OrIdow6> | arkiver: Item lists from what's in the CDX server, I mean |
| 20:37:57 | | NotWebuser quits [Remote host closed the connection] |
| 20:39:56 | | wessel1512 is now authenticated as wessel1512 |
| 21:06:30 | | DogsRNice (Webuser299) joins |
| 21:07:57 | | sec^nd quits [Remote host closed the connection] |
| 21:08:24 | | sec^nd (second) joins |
| 21:14:19 | | lennier1 quits [Client Quit] |
| 21:23:57 | | Matthww8 joins |
| 21:24:35 | <@arkiver> | let's create a channel for curiouscat |
| 21:25:04 | <@arkiver> | just a warning in case this was not clear for someone - the data archived with the curiouscat project will not be in the wayback machine (at least not in the form that i understand it's going to be archived in) |
| 21:25:23 | | Matthww quits [Ping timeout: 265 seconds] |
| 21:25:23 | | Matthww8 is now known as Matthww |
| 21:25:52 | <@JAA> | Surely the channel name has to be a pun on Schrödinger's cat. Can't immediately think of a good one though. |
| 21:27:27 | <@arkiver> | how about we use schrodingers dog huh |
| 21:27:36 | <@arkiver> | could be #curiousdog |
| 21:27:45 | <AK> | #woof |
| 21:27:47 | <@JAA> | Meh |
| 21:27:53 | <@OrIdow6> | "Curiosity killed the cat" reference? |
| 21:28:06 | <@arkiver> | catkiller |
| 21:28:18 | <AK> | curious incident of the confused cat in the night time? |
| 21:28:35 | <@arkiver> | #curiousincidentoftheconfusedcatinthenighttime |
| 21:28:47 | <@JAA> | Nice and compact :-) |
| 21:29:03 | <@arkiver> | just how we like it |
| 21:29:06 | <thuban> | #nominativedeterminismstrikesagain |
| 21:30:14 | <@arkiver> | it was a curious incident in any case so :) |
| 21:32:37 | <@OrIdow6> | I do like #curiousincidentoftheconfusedcatinthenighttime |
| 21:32:58 | <@arkiver> | well OrIdow6 has spoken |
| 21:33:01 | <@arkiver> | #curiousincidentoftheconfusedcatinthenighttime it is |
| 21:46:48 | <@arkiver> | also thanks AK for the channel name :) |
| 21:46:58 | <AK> | I plead the fifth |
| 21:47:08 | <h2ibot> | OrIdow6 uploaded File:CuriousCat logo.png: https://wiki.archiveteam.org/?title=File%3ACuriousCat%20logo.png |
| 22:12:59 | | BlueMaxima joins |
| 22:22:52 | | lennier1 (lennier1) joins |
| 22:53:06 | | Hackerpcs quits [Client Quit] |
| 22:57:03 | | Myself quits [Ping timeout: 258 seconds] |
| 22:58:19 | | Hackerpcs (Hackerpcs) joins |
| 23:36:03 | | Myself (myself) joins |
| 23:42:34 | | Arcorann (Arcorann) joins |
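
The two API quirks jodizzle describes between 06:00 and 06:05 (a ban payload that arrives with HTTP 200, and the nested struct being keyed as either 'post' or 'status') could be handled roughly as sketched below. This is not jodizzle's iterator or the project script. The function names are hypothetical, and the `timestamp` field in the sample data is an assumed name used only for illustration. Only the `error_code` value, the `posts` array, and the 'post'/'status' keys come from the log.

```python
import json


def is_ratelimited(body):
    """True if a decoded body is the ban payload quoted in the log.

    The log says this payload comes back with HTTP 200, so the status
    code alone cannot be used to detect a ban.
    """
    return isinstance(body, dict) and body.get("error_code") == "ratelimited"


def nested_struct(entry):
    """Return the struct stored under 'post' or, failing that, 'status'."""
    for key in ("post", "status"):
        value = entry.get(key)
        if isinstance(value, dict):
            return value
    return None


def parse_profile(raw_text):
    """Decode a profile response and return the nested struct of each entry."""
    body = json.loads(raw_text)
    if is_ratelimited(body):
        raise RuntimeError("rate limited (body says so even though HTTP status is 200)")
    return [nested_struct(entry) for entry in body.get("posts", [])]


# Sample data; the 'timestamp' field name is an assumption, not from the log.
sample = json.dumps({"posts": [{"post": {"timestamp": 1}},
                               {"status": {"timestamp": 2}}]})
print(parse_profile(sample))
```

Checking the body before anything else means a banned machine fails loudly instead of silently recording the error JSON as if it were profile data.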