| 01:00:47 | | dm4v quits [Read error: Connection reset by peer] |
| 01:03:15 | | dm4v joins |
| 01:03:15 | | dm4v is now authenticated as dm4v |
| 01:03:15 | | dm4v quits [Changing host] |
| 01:03:15 | | dm4v (dm4v) joins |
| 01:12:10 | <h2ibot> | OrIdow6 edited CuriousCat (+384, Some updates): https://wiki.archiveteam.org/?diff=48081&oldid=48058 |
| 01:31:13 | <h2ibot> | OrIdow6 edited Framasoft (+952, Ones closing in a week): https://wiki.archiveteam.org/?diff=48082&oldid=47022 |
| 01:32:30 | <@OrIdow6> | So for Framabooklin I think that in addition to saving them to the WBM I will look into putting them up as IA items |
| 01:50:12 | <@JAA> | Ok, that's pretty hilarious. Disqus is tracking WBM snapshot accesses of Picosong: https://disqus.com/home/forum/picosong/ |
| 02:03:28 | <@OrIdow6> | Wonder if it's the WBM trying to SPN resources that's doing it |
| 02:04:04 | | Lord_Nightmare quits [Ping timeout: 265 seconds] |
| 02:04:09 | <@JAA> | Yeah, that'd be my guess as well. |
| 02:09:29 | | Jake (Jake) joins |
| 02:32:25 | | Jake quits [Ping timeout: 252 seconds] |
| 02:43:13 | | lennier1 quits [Ping timeout: 265 seconds] |
| 02:44:37 | | lennier1 (lennier1) joins |
| 02:59:48 | | qw3rty joins |
| 03:15:00 | | tbc1887 (tbc1887) joins |
| 03:27:12 | | ThreeHM quits [Ping timeout: 265 seconds] |
| 03:29:40 | | ThreeHM (ThreeHeadedMonkey) joins |
| 03:53:35 | | G4te_Keep3r quits [Read error: Connection reset by peer] |
| 03:54:38 | | G4te_Keep3r joins |
| 04:02:26 | | mutantmnky quits [Remote host closed the connection] |
| 04:03:37 | | mutantmnky (mutantmonkey) joins |
| 04:04:05 | | ThreeHM quits [Ping timeout: 252 seconds] |
| 04:06:04 | | ThreeHM (ThreeHeadedMonkey) joins |
| 04:12:09 | | tzt quits [Ping timeout: 265 seconds] |
| 04:13:59 | | mutantmnky quits [Remote host closed the connection] |
| 04:15:07 | | Jake (Jake) joins |
| 04:15:07 | | mutantmnky (mutantmonkey) joins |
| 04:41:22 | | achivarin quits [Remote host closed the connection] |
| 04:44:50 | <h2ibot> | OrIdow6 edited Framasoft (+0, Word): https://wiki.archiveteam.org/?diff=48083&oldid=48082 |
| 04:45:15 | | tzt (tzt) joins |
| 04:46:31 | | Jake quits [Client Quit] |
| 04:57:45 | | achivarin (achivarin) joins |
| 05:16:09 | | eroc1990 quits [Client Quit] |
| 05:27:16 | | eroc1990 (eroc1990) joins |
| 05:49:14 | | BlueMaxima quits [Client Quit] |
| 06:19:29 | | DogsRNice quits [Read error: Connection reset by peer] |
| 06:23:29 | | tbc1887 quits [Read error: Connection reset by peer] |
| 07:18:50 | | tbc1887 (tbc1887) joins |
| 07:28:35 | | Megame (Megame) joins |
| 07:58:04 | <IDK_> | The capture is estimated to start in 1664 minutes. You may close your browser window and the page will still be saved. |
| 07:58:38 | <IDK_> | is IA getting lots of requests/ddosed or is it a bug? |
| 07:59:39 | <IDK_> | when it actually saves it just went to an error |
| 08:06:27 | | HP_Archivist quits [Ping timeout: 252 seconds] |
| 09:11:10 | | tbc1887 quits [Client Quit] |
| 09:49:59 | <Jon> | one that's been on my watchlist for a while; ninlive.com has removed direct access to their downloads due to ISP issues, they're hosting a torrent of all the material (~1T). I've talked to their admin about uploading to the live music archive before and he wasn't keen at the time so I held off. securing a copy of this torrent before I think of next steps. |
| 09:50:11 | <Jon> | I've got about half of it atm |
| 09:53:24 | | march_happy (march_happy) joins |
| 11:31:11 | | qwertyasdfuiopghjkl quits [Remote host closed the connection] |
| 11:42:32 | | sec^nd quits [Remote host closed the connection] |
| 11:50:55 | | sec^nd (second) joins |
| 12:33:12 | <achivarin> | Has anyone tried using cloudscraper to get past Cloudflare captchas? |
| 13:08:36 | | qwertyasdfuiopghjkl joins |
| 13:13:27 | | pabs wonders if the cloudflare warp vpn thing would get around cloudflare captchas |
| 13:18:48 | | Arcorann quits [Ping timeout: 265 seconds] |
| 13:39:11 | | Megame quits [Client Quit] |
| 13:43:25 | | onetruth quits [Ping timeout: 252 seconds] |
| 13:44:28 | <AK> | iirc I think you can still get them, but it makes you look less suspicious to their systems, don't think it removes it completely though |
| 15:00:07 | <achivarin> | pabs: I'll try it. Thanks for the tip |
| 15:00:56 | <achivarin> | AK: What other ways are there to avoid 403 and captchas? |
| 15:02:08 | <AK> | https://support.cloudflare.com/hc/en-us/articles/200170136-Understanding-Cloudflare-Captchas-Managed-Challenge-and-Challenge-Passage Overview talks a little bit about what triggers is |
| 15:02:10 | <AK> | *it |
| 15:05:08 | <@JAA> | achivarin: cloudscraper may execute arbitrary JS on your system. :-| |
| 15:05:57 | <achivarin> | JAA: Bloody hell. Nevermind then |
| 15:06:11 | <@JAA> | Oh actually, I see they have a 'native' Python-only solver now. |
| 15:08:17 | <achivarin> | JAA: Still seems a bit sketchy to me, the repo doesn't even enable issues |
| 15:10:39 | <@JAA> | Could be that they just received too many stupid questions. It's a pretty popular package. |
| 15:13:58 | | HP_Archivist (HP_Archivist) joins |
| 15:14:16 | | HP_Archivist quits [Remote host closed the connection] |
| 15:15:28 | | HP_Archivist (HP_Archivist) joins |
| 15:15:46 | | HP_Archivist quits [Remote host closed the connection] |
| 15:16:58 | | HP_Archivist (HP_Archivist) joins |
| 15:17:16 | | HP_Archivist quits [Remote host closed the connection] |
| 15:18:28 | | HP_Archivist (HP_Archivist) joins |
| 15:18:46 | | HP_Archivist quits [Remote host closed the connection] |
| 15:19:58 | | HP_Archivist (HP_Archivist) joins |
| 15:20:16 | | HP_Archivist quits [Remote host closed the connection] |
| 15:21:28 | | HP_Archivist (HP_Archivist) joins |
| 15:21:46 | | HP_Archivist quits [Remote host closed the connection] |
| 15:22:58 | | HP_Archivist (HP_Archivist) joins |
| 15:23:16 | | HP_Archivist quits [Remote host closed the connection] |
| 15:24:28 | | HP_Archivist (HP_Archivist) joins |
| 15:24:46 | | HP_Archivist quits [Remote host closed the connection] |
| 15:25:30 | | HP_Archivist (HP_Archivist) joins |
| 15:27:22 | | HackMii_ quits [Remote host closed the connection] |
| 15:28:43 | | HackMii_ (hacktheplanet) joins |
| 15:44:11 | | Lord_Nightmare (Lord_Nightmare) joins |
| 16:02:01 | | Lord_Nightmare quits [Ping timeout: 252 seconds] |
| 16:03:29 | | bonga quits [Ping timeout: 252 seconds] |
| 16:05:14 | | HP_Archivist quits [Client Quit] |
| 16:40:31 | <achivarin> | JAA: Thanks for the advice. I'm gonna try it in a VM. |
| 17:03:25 | <h2ibot> | JustAnotherArchivist edited Your Shot (+23): https://wiki.archiveteam.org/?diff=48084&oldid=47571 |
| 17:03:26 | <h2ibot> | JustAnotherArchivist edited Plays.tv (+23): https://wiki.archiveteam.org/?diff=48085&oldid=47747 |
| 17:04:25 | <h2ibot> | JustAnotherArchivist edited Google Fusion Tables (+23): https://wiki.archiveteam.org/?diff=48086&oldid=47576 |
| 17:04:26 | <h2ibot> | JustAnotherArchivist edited SingStar (+23): https://wiki.archiveteam.org/?diff=48087&oldid=47583 |
| 17:04:27 | <h2ibot> | JustAnotherArchivist edited VampireFreaks (+24): https://wiki.archiveteam.org/?diff=48088&oldid=47581 |
| 17:05:25 | <h2ibot> | JustAnotherArchivist edited BBC Mixital (+23): https://wiki.archiveteam.org/?diff=48089&oldid=47582 |
| 17:05:26 | <h2ibot> | JustAnotherArchivist edited Mixer (+23): https://wiki.archiveteam.org/?diff=48090&oldid=47595 |
| 17:05:27 | <h2ibot> | JustAnotherArchivist edited Soup.io (+23): https://wiki.archiveteam.org/?diff=48091&oldid=47607 |
| 17:05:28 | <h2ibot> | JustAnotherArchivist edited Clutch (+23): https://wiki.archiveteam.org/?diff=48092&oldid=47817 |
| 17:06:25 | <h2ibot> | JustAnotherArchivist edited 腾讯微博 (+23): https://wiki.archiveteam.org/?diff=48093&oldid=47608 |
| 17:06:26 | <h2ibot> | JustAnotherArchivist edited NAVERまとめ (+23): https://wiki.archiveteam.org/?diff=48094&oldid=47743 |
| 17:06:27 | <h2ibot> | JustAnotherArchivist edited Nagi (+23): https://wiki.archiveteam.org/?diff=48095&oldid=47611 |
| 17:14:27 | <h2ibot> | JustAnotherArchivist created Samsung XR (+541, Barebones page): https://wiki.archiveteam.org/?title=Samsung%20XR |
| 17:57:57 | | test joins |
| 17:58:56 | | test quits [Remote host closed the connection] |
| 18:06:30 | | HP_Archivist (HP_Archivist) joins |
| 18:32:28 | | nicolas17 joins |
| 18:53:52 | <h2ibot> | JustAnotherArchivist created Twitch Sings (+611, Created page with "{{Infobox project |…): https://wiki.archiveteam.org/?title=Twitch%20Sings |
| 18:53:53 | <h2ibot> | JustAnotherArchivist edited Deathwatch (-27, /* 2020 */ Link to Samsung XR page): https://wiki.archiveteam.org/?diff=48098&oldid=48080 |
| 19:16:59 | | DogsRNice (Webuser299) joins |
| 19:17:28 | | DogsRNice quits [Remote host closed the connection] |
| 19:18:35 | | DogsRNice (Webuser299) joins |
| 19:18:58 | | DogsRNice quits [Remote host closed the connection] |
| 19:20:05 | | DogsRNice (Webuser299) joins |
| 19:20:28 | | DogsRNice quits [Remote host closed the connection] |
| 19:21:36 | | DogsRNice (Webuser299) joins |
| 19:21:58 | | DogsRNice quits [Remote host closed the connection] |
| 19:23:05 | | DogsRNice (Webuser299) joins |
| 19:23:28 | | DogsRNice quits [Remote host closed the connection] |
| 19:24:35 | | DogsRNice (Webuser299) joins |
| 19:24:58 | | DogsRNice quits [Remote host closed the connection] |
| 19:25:17 | <britmob|m> | What are the general opinions on proxies for web archival? I am looking to create a selenium-based archival program with warcprox and was wondering |
| 19:25:32 | <britmob|m> | if proxies are a no-go for abuse/posterity reasons. |
| 19:26:05 | | DogsRNice (Webuser299) joins |
| 19:26:28 | | DogsRNice quits [Remote host closed the connection] |
| 19:27:36 | | DogsRNice (Webuser299) joins |
| 19:27:58 | | DogsRNice quits [Remote host closed the connection] |
| 19:28:29 | <@OrIdow6> | Proxies are a specific instance of the general problem of things between the WARCing element and the site |
| 19:29:01 | <@OrIdow6> | That modify the data on the level WARC saves |
| 19:29:05 | | DogsRNice (Webuser299) joins |
| 19:29:28 | | DogsRNice quits [Remote host closed the connection] |
| 19:30:35 | | DogsRNice (Webuser299) joins |
| 19:30:58 | | DogsRNice quits [Remote host closed the connection] |
| 19:32:05 | | DogsRNice (Webuser299) joins |
| 19:32:24 | <britmob|m> | Yes, I definitely see the problem. Especially for things that will be ingested by Wayback. |
| 19:32:28 | | DogsRNice quits [Remote host closed the connection] |
| 19:33:25 | <@JAA> | Nothing wrong with warcprox (to my knowledge) as long as there's no further weird stuff between warcprox and the site you're archiving. |
| 19:33:59 | <britmob|m> | But as far as I know, SOCKS does not do that. It only proxies TCP/UDP unlike HTTP proxies. Is that still an issue just due to the fact that it's a proxy at all? |
| 19:34:30 | <britmob|m> | JAA: I was referring to an upstream proxy from warcprox to assist in IP ratelimiting. |
| 19:34:42 | <britmob|m> | Which is much more problematic than just warcprox sadly |
| 19:34:49 | <@JAA> | Right, yeah, I'd definitely avoid that. |
| 19:37:11 | <britmob|m> | Even SOCKS, which is just proxying raw packets? As far as I'm aware there are no technical reasons the data would be any different there (of course besides the IP difference). Is it just more of an "it's a bad idea to have things in front" kind of thing at that point? |
| 19:37:21 | <@JAA> | Longer version: It's not outright bad if you stick to non-HTTP-aware proxies like SOCKS or WireGuard, but it adds a potential source of problems, so I'd avoid it anyway. |
| 19:37:33 | <britmob|m> | Understandable. Thanks. |
| 20:12:54 | | march_happy quits [Remote host closed the connection] |
| 20:20:18 | | Lord_Nightmare (Lord_Nightmare) joins |
| 20:40:18 | | march_happy (march_happy) joins |
| 20:52:37 | | BlueMaxima joins |
| 21:10:57 | | Jake (Jake) joins |
| 21:25:24 | | pabs quits [Read error: Connection reset by peer] |
| 21:29:27 | | pabs (pabs) joins |
| 22:26:39 | | Lord_Nightmare quits [Read error: Connection reset by peer] |
| 22:32:15 | | Lord_Nightmare (Lord_Nightmare) joins |
| 22:34:56 | | Arcorann (Arcorann) joins |
| 22:35:17 | | Arcorann quits [Remote host closed the connection] |
| 22:36:24 | | Arcorann (Arcorann) joins |
| 22:36:47 | | Arcorann quits [Remote host closed the connection] |
| 22:37:54 | | Arcorann (Arcorann) joins |
| 22:38:17 | | Arcorann quits [Remote host closed the connection] |
| 22:39:24 | | Arcorann (Arcorann) joins |
| 22:39:47 | | Arcorann quits [Remote host closed the connection] |
| 22:40:10 | | Arcorann (Arcorann) joins |
| 22:42:37 | | lennier1 quits [Client Quit] |
| 22:44:06 | | lennier1 (lennier1) joins |
| 23:25:19 | | march_happy quits [Ping timeout: 252 seconds] |
| 23:35:24 | | bonga joins |
| 23:41:16 | <pabs> | achivarin: would be interested to hear about the results of the cloudflare warp vpn vs cloudflare captcha thing |