00:00:25 | <pokechu22> | https://github.com/mwclient/mwclient/blob/master/mwclient/client.py#L267-L271 |
00:03:33 | <pokechu22> | Deleting that makes the first request pass, but now https://en.rodovid.org/api.php?gapfilterredir=all&inprop=protection&gapfilterlanglinks=all&gaplimit=500&generator=allpages&gapdir=ascending&continue=&gapnamespace=0&action=query&iiprop=timestamp|user|comment|url|size|sha1|metadata|archivename&prop=info|imageinfo fails (n.b. I've removed &format=json) |
00:04:14 | <pokechu22> | https://www.mediawiki.org/wiki/API:Imageinfo is 1.11, ok... |
00:19:57 | <pokechu22> | Ah, wikiteam tools require special:export, and although I was able to at least get it to the point where it gets a page list, I'm not sure if it can do anything beyond that... |
00:24:53 | <pokechu22> | Yeah, I don't think there are any good alternatives, as Special:Export isn't enabled and the alternative (which --xmlrevisions uses) only exists in newer mediawiki |
00:25:12 | <pokechu22> | I can still generate new page lists though, so that's something |
01:13:19 | <@JAA> | I'll run one AB job each starting from Allpages but without offsite links. That should at least grab a decent copy of the unique current data. |
01:14:43 | <@JAA> | Some of the wikis are pretty broken, by the way. E.g. https://be.rodovid.org/wk/%D0%A1%D0%BF%D1%8D%D1%86%D1%8B%D1%8F%D0%BB%D1%8C%D0%BD%D1%8B%D1%8F:Statistics |
01:14:49 | <@JAA> | > There are ' total pages in the database. |
01:15:07 | <pokechu22> | I'm partway through grabbing page lists (currently on he.rodovid.org) if that's helpful (though I don't think it'll give information that's different from Allpages) |
01:15:43 | <pokechu22> | Interestingly it doesn't seem like I'm getting rate-limited from repeatedly requesting page lists from the API |
01:18:05 | <@JAA> | https://fr.rodovid.org/wk/Special:Statistics claims to have stats for all wikis, but the numbers are totally off. |
01:18:30 | <@JAA> | At least I highly doubt that over 200k entries in the Russian version were created since Jan 1. |
01:21:07 | <@JAA> | Oh, images are served from https://rodovid.org/, that's unfortunate. |
01:26:04 | <@JAA> | Should be fine, the full-size image URLs can be constructed from the thumbs, so can still run that without offsite links. |
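Note on the thumb-to-full-size mapping mentioned above: a minimal sketch, assuming the wiki uses MediaWiki's default hashed upload layout, where a thumbnail lives at `.../thumb/<h>/<hh>/<name>/<width>px-<name>` and the original at `.../<h>/<hh>/<name>`. The function name and the example path are hypothetical; the actual rodovid.org image paths were not given in the log and may differ.

```python
from urllib.parse import urlsplit, urlunsplit


def full_size_from_thumb(thumb_url: str) -> str:
    """Derive the original image URL from a thumbnail URL.

    Assumes the default MediaWiki layout (an assumption, not confirmed
    for rodovid.org): the path contains a "thumb" segment and ends with
    a sized file name such as "120px-Example.jpg".
    """
    parts = urlsplit(thumb_url)
    segments = parts.path.split("/")
    if "thumb" not in segments:
        # Not a thumbnail path; return the URL unchanged.
        return thumb_url
    i = segments.index("thumb")
    # Drop the "thumb" segment and the trailing "<width>px-<name>" segment.
    kept = segments[:i] + segments[i + 1:-1]
    return urlunsplit(
        (parts.scheme, parts.netloc, "/".join(kept), parts.query, parts.fragment)
    )
```

Thumbnails of archived file versions use a different path shape and are not handled by this sketch.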
02:12:23 | | Iki1 quits [Client Quit] |
02:12:33 | | Iki1 joins |
03:26:42 | <pokechu22> | https://transfer.archivete.am/a98ip/rodovid.org_page_lists.7z is a list of pages for all of them |
05:00:07 | <@JAA> | It gets messier with the page history: https://fr.rodovid.org/history/Personne:689911 vs https://en.rodovid.org/history/Person:689911 |
05:00:20 | <@JAA> | There are translated person pages, too. |
05:02:07 | <@JAA> | And the AB jobs of course blew up via the family trees. :-| |
05:04:03 | <@JAA> | Yeah, this isn't going to work without grabbing lots and lots of duplicates. |
05:04:22 | <@JAA> | List of pages it is, I guess. |
09:26:36 | | kdqep quits [Read error: Connection reset by peer] |
14:59:28 | <Nemo_bis> | https://es.rodovid.org/wk/Especial:Export The action you have requested is restricted to users in one of these groups: "developer". |
15:03:37 | <Nemo_bis> | pokechu22: you can use prop=revisions so it might be possible to adapt the --xmlrevisions option https://es.rodovid.org/api.php?action=query&prop=revisions&titles=Persona:24256&rvprop=timestamp|user|comment|content |
15:04:13 | <Nemo_bis> | if you get blocked, as a last resort you can try action=raw https://es.rodovid.org/index.php?title=Persona:24256&action=raw |
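Note on the two fallbacks suggested above: a minimal sketch of how the `prop=revisions` query and the `action=raw` URL could be built per title, using only the standard library. The helper names are hypothetical, and this only constructs URLs; fetching, throttling, and converting the responses into an XML dump are left out.

```python
from urllib.parse import urlencode


def revisions_url(api_url: str, title: str) -> str:
    """Build a prop=revisions query like the one quoted in the chat."""
    params = {
        "action": "query",
        "prop": "revisions",
        "titles": title,
        "rvprop": "timestamp|user|comment|content",
    }
    # Keep ":" and "|" literal so the URL stays readable; other
    # characters in the title are percent-encoded as usual.
    return api_url + "?" + urlencode(params, safe=":|")


def raw_url(index_url: str, title: str) -> str:
    """Build the last-resort action=raw URL (current wikitext only)."""
    return index_url + "?" + urlencode({"title": title, "action": "raw"}, safe=":")
```

For `Persona:24256` on es.rodovid.org these helpers reproduce the two URLs quoted above. Note that `action=raw` returns only the current wikitext, with no revision metadata.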
15:06:20 | | vitzli (vitzli) joins |
15:08:36 | | vitzli quits [Client Quit] |
15:09:50 | <Nemo_bis> | mwclient is only needed for the convenience of generators (not available in early MediaWiki releases) and page continuation (varies a lot across MediaWiki revisions), you could remove it as dependency |
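Note on handling continuation without mwclient: a minimal sketch, assuming the two response shapes the MediaWiki API has used. Newer releases return a flat `continue` object, while older ones return `query-continue` keyed by module name. The function name is hypothetical, and merging all legacy modules is a simplification that suits a plain list query; a legacy query that combines a generator with several props needs more care.

```python
def next_continue_params(response: dict) -> dict:
    """Return the parameters to add to the next request, or {} when done.

    `response` is the parsed JSON body of an api.php query.
    """
    if "continue" in response:
        # Newer style: a flat dict, e.g. {"apcontinue": "B", "continue": "-||"}.
        return dict(response["continue"])
    merged = {}
    # Legacy style: {"query-continue": {"allpages": {"apfrom": "B"}}}.
    for module_params in response.get("query-continue", {}).values():
        merged.update(module_params)
    return merged
```

A caller would loop, merging the returned dict into the request parameters, until the function returns an empty dict.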
16:54:58 | | Matthww1 quits [Client Quit] |
16:54:58 | | eroc1990 quits [Quit: Ping timeout (120 seconds)] |
16:55:33 | | eroc1990 (eroc1990) joins |
16:55:40 | | Matthww16 joins |
18:12:03 | | tech_exorcist (tech_exorcist) joins |
18:15:02 | | tech_exorcist quits [Client Quit] |
18:20:26 | | tech_exorcist (tech_exorcist) joins |
20:03:16 | | Iki1 quits [Ping timeout: 240 seconds] |
20:24:29 | | tech_exorcist quits [Write error: Broken pipe] |
20:25:31 | | tech_exorcist (tech_exorcist) joins |
21:26:39 | | Iki1 joins |
22:08:54 | | tech_exorcist quits [Client Quit] |
23:39:27 | | Zerote joins |
23:39:44 | | Zerote quits [Remote host closed the connection] |