03:33:41 | <pabs> | https://www.wikihow.com/ is (presumably) huge and hasn't been archived since 2015, wonder if it is worth trying it |
03:40:06 | <@JAA> | We did run it through AB not long ago. |
03:41:31 | <pokechu22> | Worth noting that there are some related domains, e.g. https://www.wikihow.tech/, which sort of redirect back to the main domain? |
03:41:52 | <pokechu22> | For some reason I was under the impression that they had the API disabled, but that doesn't seem to be the case |
03:49:46 | <pabs> | https://kids.kiddle.co/ (encyclopedia for kids) has a broken api.php, Special:Export is 404, other Special: pages too |
05:56:30 | | Webuser792458 joins |
06:01:06 | <Webuser792458> | Does anyone have the official link for WikiForge? The MediaWiki wiki (via https://www.mediawiki.org/wiki/Module:Used_by/data.json ) links to https://wikiforge.net/, which is a parked domain |
06:02:55 | <pokechu22> | Hmm, it might be dead :| |
06:03:38 | <pokechu22> | https://static-help.wikiforge.net/ still exists, matching https://web.archive.org/web/20230603151437/https://wikiforge.net/, but https://avid.wikiforge.net/ is broken and the main domain redirects to wikiforge.xyz (which is parked) |
06:22:02 | | DogsRNice quits [Read error: Connection reset by peer] |
09:36:53 | | that_lurker quits [Remote host closed the connection] |
09:36:59 | | that_lurker (that_lurker) joins |
09:55:23 | | MrMcNuggets quits [Quit: WeeChat 4.3.2] |
11:07:11 | | balrog quits [Ping timeout: 260 seconds] |
11:10:24 | | balrog (balrog) joins |
11:48:24 | | Webuser792458 quits [Quit: Ooops, wrong browser tab.] |
12:41:41 | | @imer quits [Quit: Oh no] |
13:05:52 | | imer (imer) joins |
13:05:52 | | @ChanServ sets mode: +o imer |
13:11:12 | | Webuser044246 joins |
13:11:57 | <Webuser044246> | Hi |
13:13:51 | <Webuser044246> | I need urgent help preserving The Cutting Room Floor |
13:14:08 | <katia> | do you have a link? |
13:14:28 | <Webuser044246> | https://tcrf.net, the site requires a modified wikiteam script to bypass their anti-scraping measures |
13:15:36 | <Webuser044246> | The site (ironically being about gaming preservation) is under control of a hostile web owner who has been blocking users left and right for clicking on their links wrong |
13:15:41 | <Webuser044246> | The site blocks VPNs and Tor |
13:17:05 | <Webuser044246> | I got one of their personal blocks while trying to scrape the site. Be warned, as when you get one of those, the site will serve up gigabytes of endless content meant to fill up your RAM |
13:19:12 | <katia> | sounds harsh yes. i cannot help, but maybe someone else can get to this |
13:19:45 | <Webuser044246> | I modified my copy of wikiteam to try to scrape their wiki. Maybe I could give patches |
13:20:16 | <katia> | maybe |
13:48:38 | <Webuser044246> | The problem is trying to get more IPs behind it. It's a game of whack-a-mole with the site owner: they actively look at the logs and ban anyone trying to scrape wiki data. |
14:19:44 | | Webuser044246 quits [Client Quit] |
14:33:02 | | Webuser072791 joins |
14:39:27 | | Webuser072791 quits [Client Quit] |
15:16:43 | | Webuser933227 joins |
15:35:18 | | Webuser933227 quits [Client Quit] |
18:28:03 | | MrMcNuggets (MrMcNuggets) joins |
18:50:29 | | qw3rty__ quits [Ping timeout: 276 seconds] |
19:04:23 | <pokechu22> | We did save TCRF.net somewhat recently (and it was large). I don't think we saved Hidden Palace (which is multiple terabytes) though |
22:53:24 | | Webuser603733 joins |
22:53:44 | <Webuser603733> | TCRF now blocks WikiTeam's user agent |
22:53:54 | <Webuser603733> | For some reason, WikiTeam's user agent parameter is hidden from the help list |
22:54:08 | <Webuser603733> | So I had to look through the source code to find it |
22:58:41 | | Webuser603733 quits [Client Quit] |
22:58:55 | <pokechu22> | https://archive.org/details/wiki-tcrf.net-20230823 - hmm, that's actually almost 2 years ago :| |
23:32:30 | | Webuser899739 joins |
23:32:33 | | Webuser899739 quits [Client Quit] |