00:25:40 | | Craigle quits [Client Quit] |
00:27:16 | | Craigle (Craigle) joins |
01:27:59 | <pokechu22> | I assume it's fine for me to create a new set of items - I don't think I can edit the existing ones on https://archive.org/search.php?query=rodovid%20wikiteam |
01:28:36 | <pokechu22> | (the existing ones are https://archive.org/details/wiki-rorodovidorg; I'm thinking of doing https://archive.org/details/wiki-ro.rodovid.org instead) |
01:29:03 | <@JAA> | No advice on the naming, but yes, you need to create new items for this one. |
01:59:45 | <DiscantX> | I would just add the date (change line 77 of uploader.py to read: identifier = 'wiki-' + wikiname + "_" + wikidate) |
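Editor's sketch of the suggestion above, assuming `wikiname` and `wikidate` are plain strings shaped like the parts of the dump filename seen later in this log (`derodovidorg`, `20220907`). The helper name `build_identifier` is hypothetical and is not taken from uploader.py; only the concatenation itself comes from the message above.

```python
def build_identifier(wikiname, wikidate):
    """Return an item identifier with the dump date appended.

    Hypothetical helper: mirrors the one-line change suggested in the log,
    identifier = 'wiki-' + wikiname + "_" + wikidate
    """
    return "wiki-" + wikiname + "_" + wikidate


if __name__ == "__main__":
    # Example values borrowed from the dump filename mentioned in this log.
    print(build_identifier("derodovidorg", "20220907"))
```

Appending the date keeps each dump of the same wiki in its own item, so a new upload does not collide with an existing item that the uploader cannot edit.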
02:17:01 | <pokechu22> | I've uploaded the first 5 of them (https://archive.org/search.php?query=subject%3A%22rodovid%22) but it seems like I'm getting rate-limited by IA now |
02:17:11 | <pokechu22> | (503 Slow Down) |
02:17:54 | <pokechu22> | I'm guessing that means I need to email info@archive.org to get rate limiting removed (or just take multiple days to do it) |
02:18:05 | <DiscantX> | That happened to me a few months ago. You have to email them. |
02:18:26 | <DiscantX> | It took a couple days for them to get back to me but they did it. |
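Editor's sketch of the "just take multiple days" option: retry an upload with exponentially growing waits whenever the server answers 503 Slow Down. Everything named here is an assumption for illustration: `SlowDown` is a hypothetical exception standing in for that response, `upload` is any callable you supply, and the delay values are arbitrary. This does not lift the rate limit; per the messages above, that still requires emailing info@archive.org.

```python
import time


class SlowDown(Exception):
    """Hypothetical stand-in for an HTTP 503 Slow Down response."""


def upload_with_backoff(upload, max_attempts=5, base_delay=60.0, sleep=time.sleep):
    """Call upload() and retry with exponential backoff on SlowDown.

    upload       -- zero-argument callable that raises SlowDown when rate-limited
    max_attempts -- total number of tries before giving up
    base_delay   -- seconds to wait after the first failure; doubles each retry
    sleep        -- injectable for testing; defaults to time.sleep
    """
    for attempt in range(max_attempts):
        try:
            return upload()
        except SlowDown:
            # Out of attempts: let the caller see the rate-limit error.
            if attempt == max_attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

Passing `sleep` as a parameter makes it possible to exercise the retry logic without actually waiting.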
02:19:54 | <pokechu22> | Side note, is there a way to properly specify the license? The picker seemed to only allow choosing CC BY and then it automatically picks 4.0, but the actual one is 2.5 |
02:20:42 | <pokechu22> | (I'm also not using uploader.py, but instead doing them manually - perhaps I should look into that since I do have 20 more of these to do) |
02:23:54 | <DiscantX> | Looking at the code for uploader.py, it looks like it pulls the license straight from the site, so if you use that it should(?) be accurate. |
05:55:16 | | Kuatrero quits [Ping timeout: 240 seconds] |
06:01:02 | | michaelblob_ (michaelblob) joins |
06:01:16 | | michaelblob quits [Ping timeout: 240 seconds] |
16:18:28 | | Connection closed. |
16:18:43 | | atirclog (atirclog) joins |
16:18:43 | | Topic: https://archiveteam.org/index.php?title=WikiTeam |
16:18:43 | | Topic set by JAA at 2020-10-15 00:06:28Z |
16:18:47 | | Current users: atirclog (atirclog), Jake (Jake), igloo22225 (igloo22225), Iki1, qwertyasdfuiopghjkl, michaelblob (michaelblob), HackMii (hacktheplanet), @ChanServ, @chfoo (chfoo), @AlsoJAA (JAA), @arkiver (arkiver), Soul_, jodizzle (jodizzle), luckcolors (luckcolors), @JAA (JAA), @OrIdow6 (OrIdow6), @Sanqui (Sanqui), @hook54321 (hook54321), fo0bar_, phuzion (phuzion), qxtal (qxtal), Mayk78, Nemo_bis (Nemo_bis), pokechu22 (pokechu22), masterX244 (masterX244), DiscantX, user_ (gazorpazorp), tech234a (tech234a), Ryz (Ryz), @rewby (rewby), qw3rty, ThreeHM (ThreeHeadedMonkey), TheTechRobo (TheTechRobo), Craigle (Craigle), systwi (systwi), monika (boom), nepeat_ (nepeat), Terbium_, mind_combatant, @Sanqui|m (Sanqui), britmob|m, DigitalDragon, duce1337 (duce1337), sepro (sepro), eroc19902 (eroc1990), Matthww1, mrfooooo, atphoenix_ (atphoenix) |
16:59:59 | | Kuatrero joins |
17:26:46 | <pokechu22> | DiscantX: Most of the wikis I already did were fairly small; the history being that small is expected. Later ones are larger though, and the history is bigger. But I'll double-check the images |
17:28:51 | <pokechu22> | That's odd, I don't have a 9295.jpg locally; that does look broken |
17:30:16 | <pokechu22> | Actually, wait, those should be in an images folder, too |
17:32:25 | <pokechu22> | DiscantX: It looks correct when I download https://archive.org/download/wiki-de.rodovid.org/derodovidorg-20220907-wikidump.7z but the view contents thing is broken. Any idea why that might have happened? |
17:33:19 | <pokechu22> | (my guess is that it's related to there being 7089 images, but that's just a guess) |
17:43:14 | <pokechu22> | Hmm, no, this also applies to https://archive.org/download/wiki-ar.rodovid.org. Maybe I'm using a weird 7-zip config? |
17:43:52 | <pokechu22> | Oh, I have "solid block size" set to "solid" from something else; that's probably the issue |
17:51:18 | <pokechu22> | Hmm, for de.rodovid.org, using "ultra" compression with the default solid block of 4GB gives a 280 MB file, while using "solid" gives 180 MB |
17:56:16 | <pokechu22> | The default "ultra" settings also seem to cause issues... so maybe it's the weird SpecialīēVersion.html filename WSL generated instead? |
17:58:11 | <pokechu22> | (this is with the 7-zip UI, in Windows, instead of the command-line) |
21:43:51 | | HackMii quits [Remote host closed the connection] |
21:44:49 | | HackMii (hacktheplanet) joins |
23:28:17 | <pokechu22> | Ugh, it seems like the resume code doesn't handle unicode properly, giving dumpgenerator.py:1120: UnicodeWarning: Unicode unequal comparison failed to convert both arguments to Unicode - interpreting them as being unequal - making resuming a problem |
23:32:21 | <pokechu22> | I think I've worked around it but I'm not 100% sure |
23:38:08 | <michaelblob> | yeah that one has been an issue for a while, i ran into that a lot back when i was dumping a couple wikifarms |
23:39:26 | <pokechu22> | It looks like the resume code also resulted in 2 page entries for the same page in the xml, but that's better than the alternative, so whatever |
23:41:40 | <pokechu22> | https://github.com/Pokechu22/wikiteam/commit/8fb39152e27b6b647fbf7be35fd1f32ee6fe87aa |
23:55:02 | <pokechu22> | (would that one be worth upstreaming? or is it too jank?) |
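Editor's sketch of the general technique behind the UnicodeWarning quoted above: decode both sides to text before comparing, so that a byte string and a text string holding the same title compare equal. This is written for Python 3 and is not the change in the linked commit; `to_text` and `same_title` are hypothetical names, and UTF-8 is an assumed encoding.

```python
def to_text(value, encoding="utf-8"):
    """Return value as text, decoding bytes with the given encoding.

    Undecodable bytes are replaced rather than raising, so a damaged
    title yields a mismatch instead of an exception.
    """
    if isinstance(value, bytes):
        return value.decode(encoding, errors="replace")
    return value


def same_title(a, b):
    """Compare two page titles after normalizing both to text."""
    return to_text(a) == to_text(b)


if __name__ == "__main__":
    title = "Спеціальна:Version"
    print(same_title(title.encode("utf-8"), title))
```

Normalizing at the comparison point is the smallest change; decoding titles once, when they are first read, would avoid repeating the conversion at every check.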