00:09:40 | | yawkat` quits [Ping timeout: 255 seconds] |
00:10:56 | | yawkat (yawkat) joins |
00:25:45 | | HP_Archivist quits [Client Quit] |
00:29:15 | | Unholy23619246453 quits [Ping timeout: 265 seconds] |
00:35:07 | | leo60228 quits [Quit: ZNC 1.8.2 - https://znc.in] |
00:35:57 | | leo60228 (leo60228) joins |
00:42:01 | | dxrt_ is now known as dxrt |
00:42:01 | | dxrt is now authenticated as dxrt |
00:42:01 | | dxrt quits [Changing host] |
00:42:01 | | dxrt (dxrt) joins |
00:42:01 | | @ChanServ sets mode: +o dxrt |
00:58:21 | <pabs> | c3manu: also https://wiki.archiveteam.org/index.php/Mastodon |
01:29:52 | <@JAA> | arkiver: TL;DR on that is: there are three forums for different regions (forum.worldoftanks.{com,eu,asia}), all closing 2024-05-20 09:00, JS challenge can be passed by finding the smallest integer i (from 0) for which `md5(tpc + '::' + i)` starts with `chk` and setting two cookies to tpc and i (should be done in pipeline.py probably, but also failing the item if a response has the challenge later), |
01:29:58 | <@JAA> | strict throttling to 0.5 req/IP/s. URLs without slugs redirect to the canonical URL iff there are no special characters in the slug, else that needs to be found from the HTML. I can share my qwarc script if that helps. |
01:30:34 | <nicolas17> | what's tpc? |
01:30:58 | <@JAA> | No idea what it stands for. It's a variable in the JS code. |
01:31:12 | <nicolas17> | I mean does it come from the server? |
01:31:16 | <@JAA> | Yes |
01:31:19 | <@JAA> | And chk, too |
01:31:31 | <@JAA> | I've only seen chk = '0001', but maybe that varies. |
01:31:40 | <nicolas17> | yeah I was going to ask about chk next :p |
01:32:16 | <nicolas17> | if they used sha256(sha256(x)) we could have asked crypto miners for help /s |
01:32:58 | <nicolas17> | I thought that said 'chk' rather than `chk` and I was like "huh, you mean the first 3 bytes of md5 being ascii 'c' 'h' 'k'?" |
01:33:10 | <@JAA> | I mean, solving the challenge takes 25 ms with my non-optimised Python code. |
01:33:33 | <@JAA> | Largest i I've seen was under 300k. |
01:35:50 | <@JAA> | Specifically the hex digest starts with the value of the `chk` variable, yeah. |
01:36:16 | <@JAA> | E.g. 95c8dc44a288bde976f4a952d73e5b4c::8965 |
01:36:23 | <@JAA> | → 000128a595486a61820c22e5b7b11328 |
01:36:39 | <nicolas17> | do we have md5 readily available in wget-lua? |
01:37:09 | <@JAA> | It's almost certainly easier to do this in pipeline.py. |
01:37:47 | <nicolas17> | I'm not awake enough for this |
01:37:53 | <nicolas17> | yes, of course, pipeline.py |
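A minimal sketch of the challenge solver described above, assuming the scheme is exactly as JAA states it: find the smallest integer i (counting from 0) for which the MD5 hex digest of `tpc + '::' + i` starts with `chk`. The function name `solve_challenge` and the `limit` parameter are illustrative and not taken from JAA's script; the 10 million cap mirrors the limit JAA mentions later in the log.

```python
import hashlib


def solve_challenge(tpc, chk, limit=10_000_000):
    """Return the smallest i in [0, limit) whose digest starts with chk.

    Assumption: the string hashed is tpc + '::' + decimal i, and the
    comparison is against the lowercase hex digest, as described in the
    chat. Returns None if no solution is found below the limit.
    """
    for i in range(limit):
        digest = hashlib.md5(f"{tpc}::{i}".encode("ascii")).hexdigest()
        if digest.startswith(chk):
            return i
    return None


if __name__ == "__main__":
    # tpc value quoted in the chat; chk = '0001' is the only value JAA reports seeing.
    tpc = "95c8dc44a288bde976f4a952d73e5b4c"
    i = solve_challenge(tpc, "0001")
    print(i, hashlib.md5(f"{tpc}::{i}".encode("ascii")).hexdigest())
```

With a four-hex-digit prefix, roughly one in 65536 candidates matches, which is consistent with the solver finishing in milliseconds.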
02:01:40 | <fireonlive> | some nice piping 🙂↕️ |
02:31:20 | | Island_ joins |
02:35:53 | | Island quits [Ping timeout: 265 seconds] |
02:58:24 | <pabs> | does AT have any sort of webring archiving going on? |
02:58:27 | <pabs> | for eg I see "The sausage champs webring!" on http://meatspace.debian.net/planet/ |
02:58:42 | <pabs> | but neither the previous/next URLs link to any webring stuff |
03:02:48 | <fireonlive> | not that i know of |
04:20:42 | | Campbellh joins |
04:41:33 | <@arkiver> | JAA: how long is the solved challenge usable? |
04:45:16 | <@arkiver> | JAA: do they allow for attachments or embedded images? |
04:51:23 | <@JAA> | arkiver: Hours, but I don't know exactly. |
04:51:30 | <@JAA> | Yes, there are image uploads. |
04:51:58 | <@JAA> | This has one: https://forum.worldoftanks.com/index.php?/topic/671288-misinformation/ |
04:52:36 | <@JAA> | Also embedded Imgur images, e.g. https://forum.worldoftanks.com/index.php?/topic/671359-theyre-all-over-the-place/ |
04:54:30 | <fireonlive> | i'd guess we'd have to enforce concurrency as well |
04:54:54 | <@JAA> | The challenge value seems to depend on the IP, so probably can't just hardcode it. |
04:55:00 | <nicolas17> | aww |
04:55:04 | <nicolas17> | was about to ask that |
04:55:32 | <nicolas17> | put the challenge response on a downloaded file like the dictionary and let everyone use the same and update it hourly or so :P |
04:55:33 | <@JAA> | Enforcing the concurrency might not be needed. The throttling happens server-side. You'll still average one request per 2 seconds. |
04:55:50 | <nicolas17> | but if it's per IP then nope |
04:56:11 | <fireonlive> | ah they don't ban/429 you or such? |
04:56:23 | <nicolas17> | they delay the response? |
04:57:15 | <@JAA> | Yeah, response gets delayed. The ban I saw was a connection timeout. |
04:57:37 | <@JAA> | That ban was when I tried 50 connections, I think. |
04:57:51 | <@JAA> | 10 was fine but 20 seconds average response time. |
04:57:52 | <fireonlive> | ahh |
04:57:58 | | Campbellh quits [Remote host closed the connection] |
04:58:13 | <nicolas17> | maybe not a ban |
04:58:22 | <fireonlive> | so those with warriors and 6 shouldn't be too hard pressed |
04:58:24 | <nicolas17> | maybe they just delayed you so much that it exceeded a timeout elsewhere :P |
04:58:54 | <@JAA> | Well, it lasted for 10-15 minutes after I stopped it, so... |
04:59:17 | <nicolas17> | oh ok yeah |
05:00:03 | <@JAA> | Yeah, challenge solution isn't reusable from another IP. |
05:01:24 | <@JAA> | Doesn't appear to be UA-dependent though. |
05:01:52 | <@JAA> | It *is* reusable between the three sites. |
05:03:41 | <@JAA> | (Or more specifically, the .com challenge solution works on .eu for me.) |
05:04:18 | <nicolas17> | if it takes milliseconds, and it can be done from the python script, I guess it doesn't actually matter much |
05:04:59 | <@JAA> | Yeah, I restricted my solver loop to 10 million hashes, which takes a few seconds. Realistically, it should never hit that. |
05:05:58 | <@JAA> | But we'd probably have separate items per site, and having to only solve the challenge once and then use the cookies for the entire multiitem is definitely useful. |
05:06:40 | <@JAA> | Or I guess keep the cookies in memory in pipeline.py, check whether they're still valid by making one request, solve the challenge if expired, or something along those lines. |
05:06:57 | <nicolas17> | a global var that spans more than one multiitem yeah |
05:07:29 | <nicolas17> | if it lasts hours, just renew it every 15/30 mins or so? |
05:08:37 | <@JAA> | Looks like the solution I got at 18:45 is still valid. |
05:08:50 | <@JAA> | So over 10 hours |
05:09:10 | <nicolas17> | ...ok what is this even supposed to protect? :D |
05:09:17 | <@JAA> | I suspect they might rotate it at a fixed time though. |
05:09:27 | <@JAA> | E.g. once per day or whatever. |
05:09:34 | <@JAA> | Low-effort spammers, I imagine. |
05:10:05 | <@JAA> | The kind you instantly get when operating a public forum with open registration. |
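The idea floated above of keeping the solved cookies in memory, rechecking them now and then, and solving again only when they have expired could look roughly like the sketch below. Everything here is an assumption for illustration: the class name, the two injected callables (`fetch_challenge`, which would request the site and return the `tpc` and `chk` values, and `still_valid`, which would make one test request with the cookies), the 15-minute recheck interval, and the cookie names `tpc` and `i`, which the chat does not actually specify.

```python
import hashlib
import time


def solve_challenge(tpc, chk, limit=10_000_000):
    # Smallest i such that md5(tpc + '::' + i) starts with chk, per the chat.
    for i in range(limit):
        if hashlib.md5(f"{tpc}::{i}".encode("ascii")).hexdigest().startswith(chk):
            return i
    return None


class ChallengeCookieCache:
    """Keep one solved challenge in memory and reuse it across items."""

    def __init__(self, fetch_challenge, still_valid,
                 recheck_after=900.0, clock=time.monotonic):
        self._fetch_challenge = fetch_challenge  # hypothetical: returns (tpc, chk)
        self._still_valid = still_valid          # hypothetical: cookies -> bool
        self._recheck_after = recheck_after      # seconds between validity checks
        self._clock = clock
        self._cookies = None
        self._checked_at = 0.0

    def get(self):
        now = self._clock()
        if self._cookies is not None:
            # Recently confirmed: reuse without touching the network.
            if now - self._checked_at < self._recheck_after:
                return self._cookies
            # Otherwise spend one request to see whether they still work.
            if self._still_valid(self._cookies):
                self._checked_at = now
                return self._cookies
        # No cookies yet, or they expired: solve a fresh challenge.
        tpc, chk = self._fetch_challenge()
        i = solve_challenge(tpc, chk)
        if i is None:
            raise RuntimeError("no challenge solution found below the limit")
        # Cookie names are placeholders; the real names come from the site's JS.
        self._cookies = {"tpc": tpc, "i": str(i)}
        self._checked_at = now
        return self._cookies
```

Since the solution appears to be tied to the requesting IP, a cache like this would have to live per worker rather than be shared between machines.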
05:10:43 | | etnguyen03 quits [Remote host closed the connection] |
05:25:17 | | JaffaCakes118 quits [Remote host closed the connection] |
05:28:58 | | JaffaCakes118 (JaffaCakes118) joins |
06:05:05 | | BlueMaxima quits [Read error: Connection reset by peer] |
06:16:11 | <@arkiver> | is pad.riseup.net suddenly gone? |
06:17:40 | <fireonlive> | hmm. 404 |
06:18:31 | <@JAA> | I'd think that's unintentional. |
06:18:46 | <@arkiver> | pad.riseup.net is still listed at https://riseup.net/accounts |
06:18:55 | <@arkiver> | anyone with a contact there that could ask? |
06:24:28 | <@arkiver> | i created a ticket |
06:31:36 | | DopefishJustin quits [Remote host closed the connection] |
06:41:07 | | Island_ quits [Read error: Connection reset by peer] |
06:41:08 | | DopefishJustin joins |
06:41:08 | | DopefishJustin is now authenticated as DopefishJustin |
06:48:16 | | Island joins |
07:06:10 | | Unholy23619246453 (Unholy2361) joins |
07:41:34 | | Island quits [Read error: Connection reset by peer] |
08:19:26 | | Island joins |
08:43:56 | <h2ibot> | Manu edited Mailman/2 (+39, /* http://linuxbox.org/pipermail lost */): https://wiki.archiveteam.org/?diff=52261&oldid=52234 |
08:48:30 | | Island quits [Read error: Connection reset by peer] |
08:49:57 | <h2ibot> | Manu edited Mailman/2 (+98, /* https://lists.linuxfromscratch.org/ saved */): https://wiki.archiveteam.org/?diff=52262&oldid=52261 |
09:00:05 | | Bleo182600722719 quits [Client Quit] |
09:01:24 | | Bleo182600722719 joins |
09:04:59 | <h2ibot> | Manu edited Mailman/2 (+15, /* correction:…): https://wiki.archiveteam.org/?diff=52263&oldid=52262 |
09:10:29 | | loug joins |
09:34:05 | | nic8693 quits [Read error: Connection reset by peer] |
09:38:09 | | techdude3000 joins |
09:38:26 | | techdude3000 quits [Client Quit] |
09:44:07 | | nic8693 (nic) joins |
10:04:45 | | JaffaCakes118 quits [Remote host closed the connection] |
10:22:59 | | JaffaCakes118 (JaffaCakes118) joins |
10:26:08 | | Wohlstand (Wohlstand) joins |
10:30:04 | | eroc1990 quits [Client Quit] |
10:34:48 | | eroc1990 (eroc1990) joins |
11:01:43 | | Gereon quits [Ping timeout: 272 seconds] |
11:37:01 | | Notrealname1234 (Notrealname1234) joins |
11:50:07 | | JaffaCakes118_2 (JaffaCakes118) joins |
11:53:55 | | JaffaCakes118 quits [Ping timeout: 255 seconds] |
12:05:37 | | Gereon0 (Gereon) joins |
12:06:04 | | Notrealname1234 quits [Read error: Connection reset by peer] |
12:06:45 | | driib quits [Client Quit] |
12:07:31 | | driib (driib) joins |
12:16:59 | | Notrealname1234 (Notrealname1234) joins |
12:17:18 | | Notrealname1234 quits [Client Quit] |
12:19:31 | | driib quits [Client Quit] |
12:26:17 | | driib (driib) joins |
12:34:19 | | JaffaCakes118_2 quits [Read error: Connection reset by peer] |
12:35:08 | | JaffaCakes118 (JaffaCakes118) joins |
12:43:12 | | driib quits [Client Quit] |
12:45:00 | | driib (driib) joins |
12:47:37 | | lunik11 quits [Quit: :x] |
12:47:54 | | Wohlstand quits [Remote host closed the connection] |
12:48:15 | | lunik11 joins |
12:52:23 | | Wohlstand (Wohlstand) joins |
13:20:32 | | Wohlstand quits [Client Quit] |
13:35:33 | | MrMcNuggets (MrMcNuggets) joins |
14:25:34 | | Arcorann quits [Ping timeout: 255 seconds] |
14:38:00 | <@arkiver> | JAA: i'm not completely sure i'll be able to get a project running before the deadline |
14:38:13 | <@arkiver> | JAA: how does one trigger the js challenge page? |
14:39:29 | <@arkiver> | rewby: we have a short time deadline unfortunately. i'm not sure i'll be able to get a project running in time, but perhaps a target can be created just in case |
14:39:33 | <@arkiver> | for |
14:39:43 | <@arkiver> | archiveteam_worldoftanksforum_ |
14:39:54 | <@arkiver> | worldoftanksforum_ |
14:40:01 | <@arkiver> | Archive Team World of Tanks forum: |
14:43:47 | | MrMcNuggets quits [Ping timeout: 265 seconds] |
14:45:23 | | etnguyen03 (etnguyen03) joins |
14:48:13 | | etnguyen03 quits [Remote host closed the connection] |
14:50:36 | | etnguyen03 (etnguyen03) joins |
14:53:22 | | MrMcNuggets (MrMcNuggets) joins |
14:53:56 | | MrMcNuggets quits [Client Quit] |
15:07:25 | | parfait quits [Ping timeout: 255 seconds] |
15:17:56 | <fuzzy8021> | JAA how many ips are you thinking you would need on a box? |
15:18:17 | <katia> | how many you got? 👀 |
15:19:32 | <fuzzy8021> | borrowing a few atm |
15:19:55 | <fuzzy8021> | nothing like what some of the others here have access to |
15:22:03 | <h2ibot> | Manu edited Mailman/2 (+76, /* http://linuxmafia.com/pipermail/ archived */): https://wiki.archiveteam.org/?diff=52264&oldid=52263 |
15:39:03 | <@JAA> | arkiver: Accessing the site with cleared cookies returns the JS challenge. |
15:40:21 | <@JAA> | fuzzy8021: A /27 probably. Maybe a /28 would do. |
15:43:55 | <@JAA> | I should have a complete-ish copy of .asia's important pages now. Homepage, forums, topics, no images etc. |
15:44:07 | <@JAA> | (.asia is far smaller than the other two.) |
15:44:33 | <@JAA> | There's a chance .com will finish in time with my current setup. .eu seems very unlikely. |
16:01:57 | <fuzzy8021> | JAA i can spin up a box with a /27 if you want it. give me an hour |
16:02:24 | | nicolas17 quits [Remote host closed the connection] |
16:02:45 | | nicolas17 joins |
16:05:18 | <@JAA> | fuzzy8021: Sounds great! |
16:06:58 | <fuzzy8021> | ubuntu ok? |
16:09:19 | <@JAA> | I'd prefer Debian, but I can make Ubuntu work. |
16:24:59 | | pokechu22 quits [Read error: Connection reset by peer] |
16:25:03 | | pokechu22 (pokechu22) joins |
17:02:51 | | nulldata quits [Changing host] |
17:02:51 | | nulldata (nulldata) joins |
17:27:41 | | evanim_ joins |
17:29:37 | | evanim quits [Ping timeout: 255 seconds] |
17:29:37 | | evanim_ is now known as evanim |
17:42:04 | <nicolas17> | Matter 1.3 specifications are out, not sure if AB or #// is the best way to archive the PDFs https://transfer.archivete.am/inline/cUcYO/matter1.3.txt |
17:45:45 | <fireonlive> | ab |
17:47:15 | | Notrealname1234 (Notrealname1234) joins |
18:16:15 | | Notrealname1234 quits [Client Quit] |
18:48:22 | | tzt quits [Ping timeout: 255 seconds] |
18:49:19 | | nicolas17 quits [Ping timeout: 265 seconds] |