00:33:57pabs quits [Client Quit]
00:35:58pabs (pabs) joins
00:42:24mako quits [Client Quit]
00:42:27mako (mako) joins
01:18:01Bedivere joins
02:01:16BigBrain quits [Ping timeout: 245 seconds]
02:03:52BigBrain (bigbrain) joins
02:30:49PredatorIWD joins
03:53:21luckcolors quits [Quit: No Ping reply in 180 seconds.]
03:54:25luckcolors (luckcolors) joins
04:53:33Megame quits [Client Quit]
04:57:40Jake quits [Ping timeout: 265 seconds]
05:56:21hitgrr8 joins
06:58:01<pabs>http://wiki.tuhs.org/doku.php
07:03:38<yzqzss|m>pabs: downloading...
07:09:10<yzqzss|m><https://archive.org/details/wiki-wiki.tuhs.org-20230620> done.
07:20:26Jake (Jake) joins
07:32:49pabs quits [Ping timeout: 265 seconds]
07:51:35jtagcat quits [Quit: Ping timeout (120 seconds)]
07:51:50jtagcat (jtagcat) joins
08:52:43Exorcism|m uploaded an image: (182KiB) < https://matrix.hackint.org/_matrix/media/v3/download/matrix.org/uGPXOgIELmwppWMKJUwnAiYC/image.png >
08:52:44<Exorcism|m>Currently uploading...
09:07:06BigBrain quits [Ping timeout: 245 seconds]
09:31:32BigBrain (bigbrain) joins
09:49:43pabs (pabs) joins
10:34:52<Exorcism|m>https://archive.org/details/wiki-wiki.linuxfoundation.org-20230618 done
12:21:18qwertyasdfuiopghjkl is now known as qwertyasdfuiopghjkl_
12:22:17qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins
12:26:39qwertyasdfuiopghjkl_ quits [Client Quit]
13:03:28Sir_Bedivere joins
13:07:38Bedivere quits [Ping timeout: 252 seconds]
13:36:03<Nemo_bis>masterX244: thanks, indeed that's what I expected :( though I'm getting fewer timeouts today
13:40:53musteringswanky joins
13:41:08musteringswanky leaves
13:41:09<masterX244>60k images pulled so far
13:48:17<Nemo_bis>Not bad
14:20:53<pabs>https://betawiki.net/wiki/
14:21:38<pabs>https://tcrf.net/
14:45:51Kuatrero joins
14:49:56Sir_Bedivere quits [Ping timeout: 252 seconds]
15:20:37Kuatrero quits [Read error: Connection reset by peer]
15:20:53Bedivere joins
15:32:36Sir_Bedivere joins
15:36:41Bedivere quits [Ping timeout: 252 seconds]
16:19:07<@arkiver>masterX244: into WARCs?
16:19:33<masterX244>wikiteam script. got to massage the file that it created to run thru grab-site for a WARC, too
16:19:56<@arkiver>ah got it!
16:20:10<@arkiver>so these will be uploaded to the wikiteam collection on IA?
16:20:33<masterX244>can't upload directly to that, but it's going into a dedicated item so it can be shuffled
16:21:28<masterX244>sidenote: https://archive.org/details/lostmediawiki-20230530 didn't get moved over yet, too
17:22:27<pokechu22>pabs: we did tcrf a bit ago and it's also like ridiculously big.
17:23:04<pokechu22>and, looks like I did betawiki too: https://archive.org/details/wiki-betawikinet
17:23:21<pokechu22>hmm, tcrf wasn't *as* ridiculously big as I thought, but https://tcrf.net/Special:MediaStatistics still says 88GB
17:24:22<masterX244>noticed i didn't pull the images on lostmediawiki, pulling that one now
17:50:51BigBrain quits [Ping timeout: 245 seconds]
17:52:46BigBrain (bigbrain) joins
17:55:04Kuatrero joins
17:59:08Sir_Bedivere quits [Ping timeout: 252 seconds]
18:16:02<rktk>is there a dump of TCRF available
18:16:08<rktk>and of hidden palace?
18:16:36<rktk>or could I grab those through an API request using the wikiteam tool?
18:21:49<pokechu22>tcrf yes, hidden palace no to my understanding
18:22:20<pokechu22>https://archive.org/details/wiki-tcrfnet-20230322
18:22:44<pokechu22>https://hiddenpalace.org/Special:MediaStatistics says 4.92 TB
18:23:13<pokechu22>you *could* download it using the tools, but (unless you're downloading wikitext only and skipping files) you'd need a LOT of space
18:25:01<rktk>ah hidden palace stores a lot of their dumps on their site too
18:25:09<rktk>I know they shifted to archive.org recently though
18:25:49<rktk>pokechu22, i am trying to work towards a 300TB storage build
18:25:56<rktk>so... at some point space is not an issue :)
18:26:10Sir_Bedivere joins
18:26:14<pokechu22>More an issue for uploading it to archive.org than anything
18:26:25<rktk>yeah that's tricky
18:26:30<rktk>Noted though
18:26:33<rktk>Thank you :)
18:30:29Kuatrero quits [Ping timeout: 252 seconds]
18:32:06Kuatrero joins
18:35:59Sir_Bedivere quits [Ping timeout: 252 seconds]
18:40:37<masterX244>IA is a pain to upload when the pipes are clogged
18:40:44<masterX244>got a 1TB upload running since 2 days already
19:17:31<Nemo_bis>masterX244: be grateful it wasn't terminated forcing you to start from scratch! what are you using to upload?
19:18:31<masterX244>IA commandline tool, it's a big bunch of 5GB warcs, that way i got a checkpoint every 5GB in case a fail happens
19:18:41<Nemo_bis>Ah ok.
19:19:14<Nemo_bis>5 TB is well above the recommended item size limit...
19:23:22<masterX244>split up in parts across multiple items
19:23:50<masterX244>did the same with my 1TB upload. if you got enough parts (50+) you can get a collection for those items
19:23:55<Nemo_bis>Possible of course, but I wonder what's the point. Many of these could just be their own items https://hiddenpalace.org/Special:MIMESearch/application/x-7z-compressed
19:24:21<masterX244>that's one method of splitting, using a natural grouping in the data
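
[Editor's sketch] The splitting approach discussed above (many fixed-size WARC chunks spread across several items) could be planned roughly as follows. This is only an illustration under stated assumptions: `plan_items` and `max_item_bytes` are hypothetical names, the size cap is whatever per-item limit the uploader chooses, and nothing here calls the actual IA command-line tool.

```python
from typing import Iterable, List, Tuple


def plan_items(files: Iterable[Tuple[str, int]], max_item_bytes: int) -> List[List[str]]:
    """Greedily pack (name, size_in_bytes) pairs, in order, into items.

    Hypothetical helper: each returned list is the set of files for one item.
    A new item is started whenever adding the next file would exceed the cap.
    A single file larger than the cap still gets an item of its own.
    """
    items: List[List[str]] = []
    current: List[str] = []
    current_size = 0
    for name, size in files:
        # Close the current item if this file would push it over the cap.
        if current and current_size + size > max_item_bytes:
            items.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        items.append(current)
    return items


GB = 10**9
# Five 5 GB chunks with an assumed 12 GB cap per item (illustrative numbers only).
chunks = [(f"part-{i:05d}.warc.gz", 5 * GB) for i in range(5)]
plan = plan_items(chunks, max_item_bytes=12 * GB)
```

Because each chunk is uploaded as a separate file, a failed upload only costs the chunk in flight, which matches the "checkpoint every 5GB" idea mentioned earlier in the log.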
19:25:54<Nemo_bis>heh "This file/article is a candidate for deletion Reason: Archive.org saying it's a virus" https://hiddenpalace.org/SimCopter_(prototype)
19:35:00<pokechu22>https://files.hiddenpalace.org/b/b3/SimCopter_%28Protoype%29.zip gives https://www.virustotal.com/gui/file/e190c6df171dcc25c6271a66aa247826def322bbf8368eb5647498e42bffa4ae
19:38:01<masterX244>not sure if the IA has a bypass flag for files like that that are a false positive (only 3 firing up and it's a heuristics signature)
19:47:19<pokechu22>I had one case of that and was able to get it fixed by emailing info@archive.org
20:44:16Megame (Megame) joins
20:46:56<Nemo_bis>Correction, I found a bigger image archive at https://www.avid.wiki/wiki/Special:MediaStatistics (50 GB)
20:47:18BigBrain quits [Remote host closed the connection]
20:47:45BigBrain (bigbrain) joins
20:49:55<masterX244>hiddenpalace is terabytes while yours is gigabytes only
20:50:56<Nemo_bis>bigger than the 20 GB I previously said was the biggest miraheze I found
20:51:14<masterX244>ahh. then it makes sense
21:01:27<pokechu22>I'm pretty sure I downloaded avid a while back
21:02:02<pokechu22>Yeah: https://archive.org/details/wiki-avidmirahezeorg_w
21:13:49<Nemo_bis>Nice. Seems to have grown a bit since then but not so urgent to have the images then, I'll remove it from my list.
21:40:08hitgrr8 quits [Client Quit]
23:05:29Gereon quits [Ping timeout: 252 seconds]
23:18:07Gereon (Gereon) joins