00:03:28Exorcism quits [Remote host closed the connection]
00:04:31Exorcism (exorcism) joins
00:10:59kiryu joins
00:10:59kiryu quits [Changing host]
00:10:59kiryu (kiryu) joins
00:18:13<imer>no difference
00:21:09<imer>https://transfer.archivete.am/ablqi/2023-09-27_00-18-28.txt these are my findings
00:21:09<imer>arkiver: pokechu22: in summary i'd say: over 1s/request seems to be safe if the response is 200 (probably do 2)
00:21:09<imer>on an error redirect the wait should probably be >4s (it still bans at 4s on just errors; depending on how common those are, I guess 10s might be better?). following redirects doesn't seem to make a difference
00:22:20<pokechu22>OK, and presumably things will be fine if we scale it up to a 6s (or 12s) wait at concurrency=6? (Assuming things are reasonably staggered)
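The scaling discussed above can be sketched roughly as follows. This is only an illustration of the arithmetic, not the project's code: the function names are hypothetical, the 2s and 10s defaults are taken from imer's tentative suggestions, and it assumes the ban is driven by the aggregate request rate across all workers.

```python
def base_wait_s(status_ok: bool, ok_wait_s: float = 2.0, error_wait_s: float = 10.0) -> float:
    """Aggregate spacing between requests, chosen by the last response.

    Defaults are assumptions drawn from the chat: about 2s after a 200,
    and about 10s after an error redirect (4s alone still led to bans).
    """
    return ok_wait_s if status_ok else error_wait_s


def per_worker_wait_s(base_s: float, concurrency: int) -> float:
    """Wait each worker needs so that N workers together send one request per base_s."""
    if concurrency < 1:
        raise ValueError("concurrency must be at least 1")
    return base_s * concurrency


def stagger_offsets_s(base_s: float, concurrency: int) -> list[float]:
    """Start offsets that spread the workers evenly instead of firing in bursts."""
    return [i * base_s for i in range(concurrency)]


if __name__ == "__main__":
    for base in (1.0, 2.0):
        print(base, "s aggregate ->", per_worker_wait_s(base, 6), "s per worker")
    print("offsets:", stagger_offsets_s(1.0, 6))
```

With concurrency=6, a 1s aggregate interval gives the 6s per-worker wait and a 2s interval gives the 12s one. The staggering matters because six workers that all wake at the same moment would still hit the server in a burst.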
00:57:39Peronikola joins
01:00:10Peronikola quits [Remote host closed the connection]
01:00:32Peroniko quits [Ping timeout: 252 seconds]
02:19:13kiryu quits [Client Quit]
02:41:19@ChanServ sets mode: +o flashfire42
03:44:02<@arkiver>so we would currently be queuing each url back as every version pokechu22 listed earlier
03:45:21<@arkiver>but i see many of them just redirect to the 404
03:50:58<@arkiver>code is online
03:51:35<@arkiver>pokechu22: should we actually queue all types?
03:51:49<@arkiver>most of them are really just redirects to the 404 page
04:13:56@flashfire42 quits [Client Quit]
04:13:56kiska quits [Client Quit]
04:16:18flashfire42 joins
04:17:33kiska (kiska) joins
04:28:11<pokechu22>arkiver: I don't think there's a benefit to queueing monsite, pagesperso, and pagespro versions of the same URL; it'll generally only exist in one of those
04:28:28<pokechu22>If you think we have time, then maybe, but otherwise it's probably not worth it
04:28:31<thuban>i don't believe it does that
04:29:14<thuban>just each variant within the appropriate group https://github.com/ArchiveTeam/pagespersoorange-grab/blob/master/pagespersoorange.lua#L116
04:29:42nulldata (nulldata) joins
04:30:41<pokechu22>Yeah, that looks like a more reasonable approach
04:31:02<pokechu22>though perso.wanadoo is listed twice?
04:32:00<pokechu22>perso.wanadoo.fr is dead (not worth saving at all) and perso.orange.fr is only a link to another domain (low value, especially since the link is only to the front page, but maybe worth doing still)
04:32:36<pokechu22>monsite.wanadoo.fr is a redirect still (but it's the same kind of redirect to the front page)
04:32:53<thuban>we could kick the can down the road a bit by queueing them all, but putting everything other than the working -orange domains into secondary
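A rough sketch of the idea thuban describes: queue every variant within a URL's own group, but send everything other than the working -orange domains to a secondary queue. The group table here is a hypothetical stand-in, not the list from the linked Lua script, and `pagesperso-orange.fr` in particular is an assumed domain name that does not appear in the chat.

```python
# Hypothetical grouping for illustration only; the real list lives in the
# project's Lua script and may differ.
GROUPS = {
    "monsite": ["monsite-orange.fr", "monsite.wanadoo.fr"],
    "perso": ["pagesperso-orange.fr", "perso.orange.fr", "perso.wanadoo.fr"],
}


def variants_for(domain: str) -> list[str]:
    """Return every domain in the same group as `domain` (or just itself if unknown)."""
    for members in GROUPS.values():
        if domain in members:
            return list(members)
    return [domain]


def split_queues(domain: str) -> tuple[list[str], list[str]]:
    """Split a domain's variants into (primary, secondary) queues.

    Assumption: only the "-orange.fr" domains still serve content, so they go
    to primary; dead or redirect-only variants are deferred to secondary.
    """
    primary, secondary = [], []
    for variant in variants_for(domain):
        (primary if variant.endswith("-orange.fr") else secondary).append(variant)
    return primary, secondary


if __name__ == "__main__":
    print(split_queues("perso.wanadoo.fr"))
```

Variants are never generated across groups, which matches pokechu22's point that a given site generally exists in only one of them.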
04:33:44<thuban>(did we ever establish whether the woopic.com cdn is subject to rate limits?)
04:34:24<pokechu22>Yes, imer confirmed that woopic.com has the same rate limits
04:38:02<thuban>i don't see that in those findings--'image' is only listed as tested at 1 req/s (and not being banned)
04:39:58<pokechu22>err, I guess the only finding was that the ban applies to both
04:46:52kiryu joins
04:46:52kiryu quits [Changing host]
04:46:52kiryu (kiryu) joins
05:03:58iCaotix77 joins
05:04:22<thuban>also, aiui current code doesn't do outlinks at all?
05:04:50iCaotix77 quits [Remote host closed the connection]
05:06:05<thuban>imho (a) general outlinks (or at least page requisites) should be sent to urls, even if they can't be processed right now due to the ingest situation, and (b) monsite.woopic.com will probably disappear when monsite-orange.fr does, so we need to get that at least
05:06:31iCaotix joins
05:06:34iCaotix53 joins
05:08:01<thuban>pokechu22, are there any other domains/cdns at risk?
05:08:38<pokechu22>I'm not aware of any
05:08:44<thuban>(there was definitely some 'my site builder'-type stuff in the ab logs, but i don't recall what domain it was on)
05:09:10<pokechu22>But we do know that monsite.woopic.com (and the other woopic subdomain that's some long string of gibberish) give 415s when the site is dead
05:09:13<pokechu22>so they are at risk
05:09:21<pokechu22>I think the site builder stuff was onsite but I'll double-check that
05:09:36<thuban>i think so too, but ty
05:11:56iCaotix53 quits [Remote host closed the connection]
05:12:26<pokechu22>Right, the other one was https://0ace2c45a96c481cb5eae36816f50806.cdn.woopic.com/pperso/sitexpress/themes/images/produit/user_bank/bib/icone/2/b31.gif which I think is the site builder thing you're thinking of
05:13:41<thuban>ah, good
06:06:25@ChanServ sets mode: +o flashfire42
06:22:51qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins
06:32:24Exorcism quits [Remote host closed the connection]
06:34:53Exorcism (exorcism) joins
06:37:11Exorcism quits [Remote host closed the connection]
06:38:05Exorcism (exorcism) joins
07:25:13<@flashfire42>Fuck it, I am restarting my warrior and moving it to orange
07:46:35Exorcism9 (exorcism) joins
07:48:31Exorcism quits [Read error: Connection reset by peer]
07:48:31Exorcism9 is now known as Exorcism
08:04:39<imer>Will check woopic ban behavior in a bit
09:14:43<imer>going strong at 2req/s 15min in on both 404 and 200
09:16:22<imer>going to up this to 5 req/s to see if anything happens
09:31:06<imer>doesn't look like it
09:41:25<imer>monsite.woopic 404's seem slow, getting a bunch of random timeouts, but no outright block
09:54:15<imer>yeah, the rate limits (if there are any) are higher
12:13:44imer quits [Ping timeout: 252 seconds]
12:27:14Peroniko joins
12:35:19Peroniko quits [Killed (NickServ (GHOST command used by Peroniko_))]
12:35:25Peroniko_ joins
12:35:42Peroniko_ quits [Client Quit]
12:36:04Peroniko (Peroniko) joins
13:36:55imer (imer) joins
13:58:47Exorcism quits [Remote host closed the connection]
14:01:15Exorcism (exorcism) joins
14:31:31Peroniko quits [Ping timeout: 265 seconds]
14:32:27Peroniko (Peroniko) joins
14:46:01Peroniko quits [Ping timeout: 265 seconds]
14:46:08Peroniko (Peroniko) joins
15:12:02<@arkiver>thanks for the feedback
15:13:51decky_e joins
15:15:59decky quits [Ping timeout: 265 seconds]
15:17:18<imer>what's missing to get started "for real"? I see some people snagged some items earlier :)
15:46:55Exorcism quits [Remote host closed the connection]
15:47:44Exorcism (exorcism) joins
15:55:05Exorcism quits [Remote host closed the connection]
15:55:51Exorcism (exorcism) joins
16:28:00Peroniko quits [Ping timeout: 265 seconds]
16:28:56Peroniko (Peroniko) joins
16:33:09Peroniko quits [Client Quit]
16:33:29Peroniko (Peroniko) joins
16:47:07decky joins
16:49:17decky_e quits [Ping timeout: 252 seconds]
18:15:47Exorcism quits [Remote host closed the connection]
18:16:55Exorcism (exorcism) joins
19:34:05Peroniko quits [Ping timeout: 265 seconds]
19:35:01Peroniko (Peroniko) joins
19:44:03iCaotix quits [Read error: Connection reset by peer]
19:45:51iCaotix joins
19:58:05Exorcism1 (exorcism) joins
19:59:55Exorcism quits [Read error: Connection reset by peer]
19:59:56Exorcism1 is now known as Exorcism
20:06:11systwi quits [Ping timeout: 252 seconds]
20:24:28systwi (systwi) joins
20:46:09Exorcism quits [Remote host closed the connection]
20:47:21Exorcism (exorcism) joins
21:19:44Exorcism quits [Remote host closed the connection]
21:20:52Exorcism (exorcism) joins
22:39:28@flashfire42 quits [Client Quit]
22:39:28kiska quits [Client Quit]
22:42:13flashfire42 joins
22:43:28kiska (kiska) joins
22:46:33@ChanServ sets mode: +o flashfire42
22:51:17iCaotix quits [Read error: Connection reset by peer]
22:52:06iCaotix joins
23:21:26Matthww11 quits [Quit: Ping timeout (120 seconds)]
23:22:31Matthww11 joins