| 01:02:43 | <pabs> | cloudflare turnstile on https://akvopedia.org/ |
| 01:07:27 | <pabs> | 403 in wikibot and with curl locally https://www.appropedia.org/ |
| 01:08:37 | <pabs> | hmm, not TLS fingerprinting though |
| 01:09:35 | <BlankEclair> | trying to dump, but also trying to figure out the http cookie file format |
| 01:10:27 | <BlankEclair> | oh i forgot expiry |
| 01:13:32 | <pabs> | BlankEclair: these curl params seem to be needed: -H 'User-Agent: Mozilla/5.0 (Windows NT 10.0; rv:128.0) Gecko/20100101 Firefox/128.0' -H 'Pragma: no-cache' |
| 01:13:55 | <BlankEclair> | for cf_clearance, you need to send the cookie (duh), same ip, and same user agent |
| 01:14:14 | <BlankEclair> | now dumping |
| 01:25:29 | | pabs quits [Ping timeout: 260 seconds] |
| 01:36:05 | | pabs (pabs) joins |
| 01:40:39 | | davispuh quits [Ping timeout: 260 seconds] |
| 01:43:44 | <pabs> | the params were for https://www.appropedia.org/ |
| 01:48:18 | <BlankEclair> | https://archive.org/details/wiki-akvopedia.org_s_wiki-20250806 |
| 01:48:21 | <BlankEclair> | ah okay |
| 01:51:43 | | pabs needs to get curlmin setup... |
| 01:52:19 | | nepeat quits [Ping timeout: 260 seconds] |
| 01:57:54 | | nepeat (nepeat) joins |
| 02:25:46 | <pabs> | cool, curlmin works great |
| 02:25:50 | <pabs> | curlmin++ |
| 02:25:51 | <eggdrop> | [karma] 'curlmin' now has 1 karma! |
| 02:25:57 | <pabs> | dh-make-golang++ |
| 02:25:58 | <eggdrop> | [karma] 'dh-make-golang' now has 1 karma! |
| 02:26:27 | <pabs> | hmm, curlmin does need a mode for using args instead of stdin or a string... |
| 02:28:14 | <pabs> | time to learn Golang :/ |
| 04:46:22 | | katia leaves |
| 04:56:24 | | katia (katia) joins |
| 06:10:39 | | Matthww quits [Quit: The Lounge - https://thelounge.chat] |
| 08:26:04 | <@arkiver> | i remember talking about this before, but can't find it in my logs |
| 08:26:42 | <@arkiver> | is there a way to list the domains used for hosting images for a wiki? |
| 08:26:48 | <@arkiver> | i don't see it in the siteinfo API data |
| 08:28:31 | <@arkiver> | i do see there is an "externalimages" field in the siteinfo data |
| 08:39:55 | <@arkiver> | DigitalDragon: i'm just going to dump some more questions in here |
| 08:40:33 | <@arkiver> | are there cases of pages that have an API page like https://howtotrainyourdragon.fandom.com/api.php?action=query&prop=info&inprop=url&titles=Toothless_(Franchise) but no HTML rendered version of that page? |
| 08:42:27 | <@arkiver> | i also hope to have a look at the API responses for every _type_ of wiki supported by the wikiteam tools (and perhaps some that are not supported?) |
| 08:44:51 | <@arkiver> | are there cases of wikis that do not have their API publicly accessible? |
| 09:08:52 | | davispuh joins |
| 09:15:04 | | davispuh quits [Ping timeout: 260 seconds] |
| 11:22:18 | | Matthww joins |
| 13:59:18 | <DigitalDragons> | for images: yes, see https://mediawiki.org/wiki/API:Filerepoinfo ex. https://howtotrainyourdragon.fandom.com/api.php?action=query&meta=filerepoinfo |
| 14:00:54 | <@arkiver> | DigitalDragons: tank you |
| 14:00:56 | <@arkiver> | thank* |
| 14:30:52 | <DigitalDragons> | I've seen examples of "pages" on Fandom that redirect to non-mediawiki discussion posts, and also some wikis in other places with "private" pages that still appear in the API |
| 14:31:23 | <DigitalDragons> | Some wikis do have apis disabled yes |
| 14:34:59 | <@arkiver> | DigitalDragons: do you think it would be problematic for sites if we load the filerepoinfo page for each wiki page? |
| 14:35:26 | <@arkiver> | we could also not do it, but it'll be a bit less clean |
| 14:51:47 | <@arkiver> | for discovery, this will only discover in-wiki links |
| 14:58:54 | | leo60228 quits [Quit: ZNC 1.8.2 - https://znc.in] |
| 14:59:44 | | leo60228 (leo60228) joins |
| 15:10:57 | <DigitalDragons> | would it be substantially less clean? i'm not sure how expensive that call specifically is, but it's probably good to keep requests lower |
| 15:13:25 | <DigitalDragons> | we've had Fandom pop into #wikibot and ask us to chill for scraping more than one or two wikis at the same time |
| 18:58:36 | | DogsRNice joins |
| 19:11:40 | | DogsRNice quits [Client Quit] |
| 20:41:06 | | taavi quits [Remote host closed the connection] |
| 20:42:53 | | taavi (taavi) joins |
| 21:08:43 | | nulldata-alt (nulldata) joins |
| 21:41:27 | | useretail joins |
| 22:39:28 | | TheTechRobo quits [Quit: Ping timeout (120 seconds)] |
| 22:42:58 | | TheTechRobo (TheTechRobo) joins |
| 23:09:49 | | nulldata-alt quits [Ping timeout: 260 seconds] |
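
Note on the filerepoinfo discussion above: a minimal Python sketch of how the image-hosting domains for a wiki might be listed from the `action=query&meta=filerepoinfo` call DigitalDragons linked. The helper names (`filerepoinfo_url`, `image_hosts`) and the sample response are hypothetical. The sketch assumes the JSON response lists repos under `query.repos` with `url` and `thumbUrl` fields, as described on the API:Filerepoinfo page; check a real response before relying on it.

```python
from urllib.parse import urlencode, urlparse


def filerepoinfo_url(api_url):
    """Build the filerepoinfo query URL for a wiki's api.php endpoint."""
    params = {"action": "query", "meta": "filerepoinfo", "format": "json"}
    return api_url + "?" + urlencode(params)


def image_hosts(response):
    """Collect the distinct hostnames found in each repo's url/thumbUrl.

    Assumes the decoded JSON looks like {"query": {"repos": [...]}}.
    Relative paths (no hostname) are skipped.
    """
    hosts = set()
    for repo in response.get("query", {}).get("repos", []):
        for key in ("url", "thumbUrl"):
            value = repo.get(key)
            if not value:
                continue
            host = urlparse(value).hostname
            if host:
                hosts.add(host)
    return sorted(hosts)


# Hypothetical response, for illustration only (not real data from any wiki).
sample = {
    "query": {
        "repos": [
            {
                "name": "local",
                "url": "https://static.example.org/wiki/images",
                "thumbUrl": "https://static.example.org/wiki/images/thumb",
            },
            {
                "name": "shared",
                "url": "//upload.example.net/commons",
                "thumbUrl": "/images/thumb",
            },
        ]
    }
}

print(filerepoinfo_url("https://howtotrainyourdragon.fandom.com/api.php"))
print(image_hosts(sample))
```

Since the repo list is a per-wiki setting rather than a per-page one, fetching it once per wiki and caching the result would probably be enough, which also keeps the request count down as suggested above.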