00:33:57 | | pabs quits [Client Quit] |
00:35:58 | | pabs (pabs) joins |
00:42:24 | | mako quits [Client Quit] |
00:42:27 | | mako (mako) joins |
01:18:01 | | Bedivere joins |
02:01:16 | | BigBrain quits [Ping timeout: 245 seconds] |
02:03:52 | | BigBrain (bigbrain) joins |
02:30:49 | | PredatorIWD joins |
03:53:21 | | luckcolors quits [Quit: No Ping reply in 180 seconds.] |
03:54:25 | | luckcolors (luckcolors) joins |
04:53:33 | | Megame quits [Client Quit] |
04:57:40 | | Jake quits [Ping timeout: 265 seconds] |
05:56:21 | | hitgrr8 joins |
06:58:01 | <pabs> | http://wiki.tuhs.org/doku.php |
07:03:38 | <yzqzss|m> | pabs: downloading... |
07:09:10 | <yzqzss|m> | <https://archive.org/details/wiki-wiki.tuhs.org-20230620> done. |
07:20:26 | | Jake (Jake) joins |
07:32:49 | | pabs quits [Ping timeout: 265 seconds] |
07:51:35 | | jtagcat quits [Quit: Ping timeout (120 seconds)] |
07:51:50 | | jtagcat (jtagcat) joins |
08:52:43 | | Exorcism|m uploaded an image: (182KiB) < https://matrix.hackint.org/_matrix/media/v3/download/matrix.org/uGPXOgIELmwppWMKJUwnAiYC/image.png > |
08:52:44 | <Exorcism|m> | Currently uploading... |
09:07:06 | | BigBrain quits [Ping timeout: 245 seconds] |
09:31:32 | | BigBrain (bigbrain) joins |
09:49:43 | | pabs (pabs) joins |
10:34:52 | <Exorcism|m> | https://archive.org/details/wiki-wiki.linuxfoundation.org-20230618 done |
12:21:18 | | qwertyasdfuiopghjkl is now known as qwertyasdfuiopghjkl_ |
12:22:17 | | qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins |
12:26:39 | | qwertyasdfuiopghjkl_ quits [Client Quit] |
13:03:28 | | Sir_Bedivere joins |
13:07:38 | | Bedivere quits [Ping timeout: 252 seconds] |
13:36:03 | <Nemo_bis> | masterX244: thanks, indeed that's what I expected :( though I'm getting fewer timeouts today |
13:40:53 | | musteringswanky joins |
13:41:08 | | musteringswanky leaves |
13:41:09 | <masterX244> | 60k images pulled so far |
13:48:17 | <Nemo_bis> | Not bad |
14:20:53 | <pabs> | https://betawiki.net/wiki/ |
14:21:38 | <pabs> | https://tcrf.net/ |
14:45:51 | | Kuatrero joins |
14:49:56 | | Sir_Bedivere quits [Ping timeout: 252 seconds] |
15:20:37 | | Kuatrero quits [Read error: Connection reset by peer] |
15:20:53 | | Bedivere joins |
15:32:36 | | Sir_Bedivere joins |
15:36:41 | | Bedivere quits [Ping timeout: 252 seconds] |
16:19:07 | <@arkiver> | masterX244: into WARCs? |
16:19:33 | <masterX244> | wikiteam script. I also have to massage the file it created so I can run it through grab-site for a WARC |
16:19:56 | <@arkiver> | ah got it! |
16:20:10 | <@arkiver> | so these will be uploaded to the wikiteam collection on IA? |
16:20:33 | <masterX244> | can't upload directly to that, but it's going into a dedicated item so it can be shuffled |
16:21:28 | <masterX244> | sidenote: https://archive.org/details/lostmediawiki-20230530 didn't get moved over yet either |
17:22:27 | <pokechu22> | pabs: we did tcrf a bit ago and it's also like ridiculously big. |
17:23:04 | <pokechu22> | and, looks like I did betawiki too: https://archive.org/details/wiki-betawikinet |
17:23:21 | <pokechu22> | hmm, tcrf wasn't *as* ridiculously big as I thought, but https://tcrf.net/Special:MediaStatistics still says 88GB |
17:24:22 | <masterX244> | noticed I didn't pull the images on lostmediawiki, pulling that one now |
17:50:51 | | BigBrain quits [Ping timeout: 245 seconds] |
17:52:46 | | BigBrain (bigbrain) joins |
17:55:04 | | Kuatrero joins |
17:59:08 | | Sir_Bedivere quits [Ping timeout: 252 seconds] |
18:16:02 | <rktk> | is there a dump of TCRF available |
18:16:08 | <rktk> | and of hidden palace? |
18:16:36 | <rktk> | or could I grab those through an API request using the wikiteam tool? |
18:21:49 | <pokechu22> | tcrf yes, hidden palace no to my understanding |
18:22:20 | <pokechu22> | https://archive.org/details/wiki-tcrfnet-20230322 |
18:22:44 | <pokechu22> | https://hiddenpalace.org/Special:MediaStatistics says 4.92 TB |
18:23:13 | <pokechu22> | you *could* download it using the tools, but (unless you're downloading wikitext only and skipping files) you'd need a LOT of space |
18:25:01 | <rktk> | ah hidden palace stores a lot of their dumps on their site too |
18:25:09 | <rktk> | I know they shifted to archive.org recently though |
18:25:49 | <rktk> | pokechu22, I am trying to work towards a 300TB storage build |
18:25:56 | <rktk> | so... at some point space is not an issue :) |
18:26:10 | | Sir_Bedivere joins |
18:26:14 | <pokechu22> | More an issue for uploading it to archive.org than anything |
18:26:25 | <rktk> | yeah that's tricky |
18:26:30 | <rktk> | Noted though |
18:26:33 | <rktk> | Thank you :) |
18:30:29 | | Kuatrero quits [Ping timeout: 252 seconds] |
18:32:06 | | Kuatrero joins |
18:35:59 | | Sir_Bedivere quits [Ping timeout: 252 seconds] |
18:40:37 | <masterX244> | IA is a pain to upload to when the pipes are clogged |
18:40:44 | <masterX244> | I've had a 1TB upload running for 2 days already |
19:17:31 | <Nemo_bis> | masterX244: be grateful it wasn't terminated forcing you to start from scratch! what are you using to upload? |
19:18:31 | <masterX244> | IA commandline tool, it's a big bunch of 5GB warcs, that way I get a checkpoint every 5GB in case of a failure |
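The checkpoint idea masterX244 describes here can be sketched roughly as below. This is a minimal illustration and not how the IA command-line tool works internally: `upload_with_checkpoints`, the `done_log` file, and the `upload_one` callable are all hypothetical names, and `upload_one` stands in for whatever uploader is actually used.

```python
from pathlib import Path
from typing import Callable, Iterable, List


def upload_with_checkpoints(
    files: Iterable[str],
    done_log: Path,
    upload_one: Callable[[str], None],
) -> List[str]:
    """Upload files one by one, recording each success in done_log.

    On a re-run, files already listed in done_log are skipped, so a failure
    costs at most the one file that was in flight.
    """
    done = set(done_log.read_text().splitlines()) if done_log.exists() else set()
    uploaded = []
    for name in files:
        if name in done:
            continue  # checkpoint hit: finished in an earlier run
        upload_one(name)  # hypothetical uploader; assumed to raise on failure
        with done_log.open("a") as log:
            log.write(name + "\n")  # record success only after the upload returns
        uploaded.append(name)
    return uploaded
```

With many 5GB WARCs, an interrupted run is simply started again with the same `done_log`, and only the unfinished files are sent.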
19:18:41 | <Nemo_bis> | Ah ok. |
19:19:14 | <Nemo_bis> | 5 TB is well above the recommended item size limit... |
19:23:22 | <masterX244> | split up into parts across multiple items |
19:23:50 | <masterX244> | did the same with my 1TB upload. if you've got enough parts (50+) you can get a collection for those items |
19:23:55 | <Nemo_bis> | Possible of course, but I wonder what's the point. Many of these could just be their own items https://hiddenpalace.org/Special:MIMESearch/application/x-7z-compressed |
19:24:21 | <masterX244> | that's one method of splitting, using a natural grouping in the data |
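Splitting a large dump across multiple items, as discussed above, could look something like the sketch below. It packs files greedily in listing order under a size cap. `split_into_items` and `max_item_bytes` are hypothetical names, and the cap is left as a parameter because the chat does not state the recommended item size.

```python
from typing import Iterable, List, Tuple


def split_into_items(
    files: Iterable[Tuple[str, int]], max_item_bytes: int
) -> List[List[str]]:
    """Group (name, size_in_bytes) pairs into items no larger than the cap.

    Files are packed in the order given. A single file larger than the cap
    still gets an item of its own rather than being dropped.
    """
    items: List[List[str]] = []
    current: List[str] = []
    current_size = 0
    for name, size in files:
        # Start a new item when adding this file would exceed the cap.
        if current and current_size + size > max_item_bytes:
            items.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        items.append(current)
    return items
```

Sorting the input by a natural key first (for example by game or platform) gives the "natural grouping" variant mentioned in the chat, with the cap acting only as a safety limit.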
19:25:54 | <Nemo_bis> | heh "This file/article is a candidate for deletion Reason: Archive.org saying it's a virus" https://hiddenpalace.org/SimCopter_(prototype) |
19:35:00 | <pokechu22> | https://files.hiddenpalace.org/b/b3/SimCopter_%28Protoype%29.zip gives https://www.virustotal.com/gui/file/e190c6df171dcc25c6271a66aa247826def322bbf8368eb5647498e42bffa4ae |
19:38:01 | <masterX244> | not sure if the IA has a bypass flag for files like that which are false positives (only 3 firing, and it's a heuristic signature) |
19:47:19 | <pokechu22> | I had one case of that and was able to get it fixed by emailing info@archive.org |
20:44:16 | | Megame (Megame) joins |
20:46:56 | <Nemo_bis> | Correction, I found a bigger image archive at https://www.avid.wiki/wiki/Special:MediaStatistics (50 GB) |
20:47:18 | | BigBrain quits [Remote host closed the connection] |
20:47:45 | | BigBrain (bigbrain) joins |
20:49:55 | <masterX244> | hiddenpalace is terabytes while yours is gigabytes only |
20:50:56 | <Nemo_bis> | bigger than the 20 GB I previously said was the biggest miraheze I found |
20:51:14 | <masterX244> | ahh. then it makes sense |
21:01:27 | <pokechu22> | I'm pretty sure I downloaded avid a while back |
21:02:02 | <pokechu22> | Yeah: https://archive.org/details/wiki-avidmirahezeorg_w |
21:13:49 | <Nemo_bis> | Nice. Seems to have grown a bit since then but not so urgent to have the images then, I'll remove it from my list. |
21:40:08 | | hitgrr8 quits [Client Quit] |
23:05:29 | | Gereon quits [Ping timeout: 252 seconds] |
23:18:07 | | Gereon (Gereon) joins |