01:02:43<pabs>cloudflare turnstile on https://akvopedia.org/
01:07:27<pabs>403 in wikibot and with curl locally https://www.appropedia.org/
01:08:37<pabs>hmm, not TLS fingerprinting though
01:09:35<BlankEclair>trying to dump, but also trying to figure out the http cookie file format
01:10:27<BlankEclair>oh i forgot expiry
01:13:32<pabs>BlankEclair: these curl params seem to be needed: -H 'User-Agent: Mozilla/5.0 (Windows NT 10.0; rv:128.0) Gecko/20100101 Firefox/128.0' -H 'Pragma: no-cache'
01:13:55<BlankEclair>for cf_clearance, you need to send the cookie (duh), same ip, and same user agent
01:14:14<BlankEclair>now dumping
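(Editor's sketch of the cookie file discussed above, assuming the Netscape cookie-file layout that curl reads: seven tab-separated fields, with expiry as the fifth. The domain, token, and expiry values are placeholders, not values from the log, and `netscape_cookie_line` is a hypothetical helper name. As noted above, the cookie is only honoured when sent from the same IP with the same User-Agent that obtained it.)

```python
import time


def netscape_cookie_line(domain, name, value, expires, path="/", secure=True):
    """Format one cookie as a tab-separated Netscape cookie-file line.

    Field order: domain, include-subdomains flag, path, secure flag,
    expiry (Unix time), name, value.
    """
    include_subdomains = "TRUE" if domain.startswith(".") else "FALSE"
    fields = [
        domain,
        include_subdomains,
        path,
        "TRUE" if secure else "FALSE",
        str(int(expires)),  # the expiry field that is easy to forget
        name,
        value,
    ]
    return "\t".join(fields)


if __name__ == "__main__":
    # Placeholder values: substitute the real cf_clearance token and domain.
    expiry = time.time() + 24 * 60 * 60  # assumed lifetime of one day
    line = netscape_cookie_line(".example.org", "cf_clearance", "TOKEN", expiry)
    print("# Netscape HTTP Cookie File")
    print(line)
```

The resulting file can be passed to curl with `-b cookies.txt`, together with a `-H 'User-Agent: ...'` header matching the browser that solved the challenge.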
01:25:29pabs quits [Ping timeout: 260 seconds]
01:36:05pabs (pabs) joins
01:40:39davispuh quits [Ping timeout: 260 seconds]
01:43:44<pabs>the params were for https://www.appropedia.org/
01:48:18<BlankEclair>https://archive.org/details/wiki-akvopedia.org_s_wiki-20250806
01:48:21<BlankEclair>ah okay
01:51:43pabs needs to get curlmin setup...
01:52:19nepeat quits [Ping timeout: 260 seconds]
01:57:54nepeat (nepeat) joins
02:25:46<pabs>cool, curlmin works great
02:25:50<pabs>curlmin++
02:25:51<eggdrop>[karma] 'curlmin' now has 1 karma!
02:25:57<pabs>dh-make-golang++
02:25:58<eggdrop>[karma] 'dh-make-golang' now has 1 karma!
02:26:27<pabs>hmm, curlmin does need a mode for using args instead of stdin or a string...
02:28:14<pabs>time to learn Golang :/
04:46:22katia leaves
04:56:24katia (katia) joins
06:10:39Matthww quits [Quit: The Lounge - https://thelounge.chat]
08:26:04<@arkiver>i remember talking about this before, but can't find it in my logs
08:26:42<@arkiver>is there a way to list the domains used for hosting images for a wiki?
08:26:48<@arkiver>i don't see it in the siteinfo API data
08:28:31<@arkiver>i do see there is an "externalimages" field in the siteinfo data
08:39:55<@arkiver>DigitalDragon: i'm just going to dump some more questions in here
08:40:33<@arkiver>are there cases of pages that have an API page like https://howtotrainyourdragon.fandom.com/api.php?action=query&prop=info&inprop=url&titles=Toothless_(Franchise) but no HTML rendered version of that page?
08:42:27<@arkiver>i also hope to have a look at the API responses for every _type_ of wiki supported by the wikiteam tools (and perhaps some that are not supported?)
08:44:51<@arkiver>are there cases of wikis that do not have their API publicly accessible?
09:08:52davispuh joins
09:15:04davispuh quits [Ping timeout: 260 seconds]
11:22:18Matthww joins
13:59:18<DigitalDragons>for images: yes, see https://mediawiki.org/wiki/API:Filerepoinfo ex. https://howtotrainyourdragon.fandom.com/api.php?action=query&meta=filerepoinfo
14:00:54<@arkiver>DigitalDragons: tank you
14:00:56<@arkiver>thank*
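(Editor's sketch of how the filerepoinfo query above could be used to list image-hosting domains. It assumes the JSON response keeps its repositories under `query.repos` with `url`, `thumbUrl`, and `rootUrl` fields; which fields appear can vary by wiki and MediaWiki version. The helper names and the sample response are made up for illustration and no network request is made.)

```python
from urllib.parse import urlencode, urlparse


def filerepoinfo_url(api_url):
    """Build the filerepoinfo query URL for a wiki's api.php endpoint."""
    query = urlencode({"action": "query", "meta": "filerepoinfo", "format": "json"})
    return f"{api_url}?{query}"


def image_hosts(response):
    """Collect the distinct hostnames named in a filerepoinfo response."""
    hosts = set()
    for repo in response.get("query", {}).get("repos", []):
        for key in ("url", "thumbUrl", "rootUrl"):
            value = repo.get(key)
            if not value:
                continue
            # Relative paths such as "/images" have no hostname and are skipped.
            host = urlparse(value).netloc
            if host:
                hosts.add(host)
    return sorted(hosts)


if __name__ == "__main__":
    print(filerepoinfo_url("https://howtotrainyourdragon.fandom.com/api.php"))
    # Illustrative response shape only, not real data from any wiki.
    sample = {
        "query": {
            "repos": [
                {"name": "local", "url": "https://images.example.org/wiki/images"},
                {"name": "shared", "rootUrl": "//upload.example.net/commons"},
            ]
        }
    }
    print(image_hosts(sample))
```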
14:30:52<DigitalDragons>I've seen examples of "pages" on Fandom that redirect to non-mediawiki discussion posts, and also some wikis in other places with "private" pages that still appear in the API
14:31:23<DigitalDragons>Some wikis do have apis disabled yes
14:34:59<@arkiver>DigitalDragons: do you think it would be problematic for sites if we load the filerepoinfo page for each wiki page?
14:35:26<@arkiver>we could also not do it, but it'll be a bit less clean
14:51:47<@arkiver>for discovery, this will only discover in-wiki links
14:58:54leo60228 quits [Quit: ZNC 1.8.2 - https://znc.in]
14:59:44leo60228 (leo60228) joins
15:10:57<DigitalDragons>would it be substantially less clean? i'm not sure how expensive that call specifically is, but it's probably good to keep requests lower
15:13:25<DigitalDragons>we've had Fandom pop into #wikibot and ask us to chill for scraping more than one or two wikis at the same time
18:58:36DogsRNice joins
19:11:40DogsRNice quits [Client Quit]
20:41:06taavi quits [Remote host closed the connection]
20:42:53taavi (taavi) joins
21:08:43nulldata-alt (nulldata) joins
21:41:27useretail joins
22:39:28TheTechRobo quits [Quit: Ping timeout (120 seconds)]
22:42:58TheTechRobo (TheTechRobo) joins
23:09:49nulldata-alt quits [Ping timeout: 260 seconds]