02:23:04<pabs>did these get added? https://moddingfridays.bleu255.com/ https://things.bleu255.com/runyourown/ http://wiki.artserver.org/index.php/Main_Page http://www.gc-forever.com/wiki/index.php?title=Main_Page https://wiki.hyperbola.info/ https://nick-black.com/dankwiki/index.php/Hack_on
02:23:33<pokechu22>I don't think so, I've been a bit behind
02:23:41<pokechu22>I did do gc-forever probably half a year ago though
02:23:58<pabs>ok
02:24:23<pokechu22>I'll run the rest now though
02:24:49<pabs>thanks
04:04:23<pokechu22>https://wiki.hyperbola.info is dokuwiki, Exorcism|m or someone else will need to dump it as I don't have the tools for that set up. But it does link to https://wiki.parabola.nu/Main_Page in the footer and I can save that
04:33:14<DigitalDragon>^^ done: https://archive.org/details/wiki-wiki.hyperbola.info-20230530
07:48:19Sir_Bedivere joins
07:48:23atphoenix quits [Remote host closed the connection]
07:48:23AnotherIki quits [Remote host closed the connection]
07:48:23Bedivere quits [Remote host closed the connection]
07:48:30AnotherIki joins
07:48:38atphoenix (atphoenix) joins
10:09:02TastyWiener95 quits [Quit: So long, farewell, auf wiedersehen, good night]
10:09:56TastyWiener95 (TastyWiener95) joins
10:18:51TastyWiener95 quits [Client Quit]
11:31:47TastyWiener95 (TastyWiener95) joins
14:38:17parfait (kdqep) joins
14:42:07hitgrr8 joins
15:23:07eroc19909 is now known as eroc1990
16:35:51parfait quits [Ping timeout: 265 seconds]
19:03:12icedice (icedice) joins
19:03:46<icedice>Can someone scrape The Lost Media Wiki for Imgur links to archive?
19:04:06<icedice>The forums have been scraped already, but apparently most Imgur links are on the wiki itself
19:04:56<icedice>https://lostmediawiki.com/
19:05:32<pokechu22>Looks like there's a 2018 dump at https://archive.org/download/wiki-lostmediawikicom
19:06:59<pokechu22>https://lostmediawiki.com/Special:MediaStatistics says 11GB of files (which, granted, isn't required for this purpose, but would still be good to save)
19:09:57<icedice>Worth archiving in full, I'd say
19:11:21<icedice>There's also The Cutting Room Floor which has some Imgur links and would be worth saving
19:11:22<icedice>https://tcrf.net/The_Cutting_Room_Floor
19:11:37<pokechu22>Yeah, that one was saved about a month ago I think?
19:11:54<icedice>Nice
19:12:07<pokechu22>and for it, https://tcrf.net/Special:MediaStatistics is 87.58 GB
19:12:15<icedice>Then it can just be scraped
19:12:21<icedice>from the WARC
19:13:56<pokechu22>wikiteam tools don't generate a WARC, but it should be possible to just download the history.xml for it and work with that (and fortunately the tools generate two 7z files, one with images and one without, specifically so that you don't need to download 87GB of images if you only want page text)
19:15:10<pokechu22>https://archive.org/details/wiki-tcrfnet-20230322 - I'll download and extract from this (wow, the compressed XML is still 500 MB, that's pretty big)
19:16:02<masterX244>running wikiteam tool atm on LMW
19:18:58<icedice>Nice
19:19:00<icedice>Thanks!
19:28:57<pokechu22>For reference, the command I use to extract links is `7z x -so tcrfnet-20230322-history.xml.7z '*.xml' | grep -Fhai -e 'mediafire' -e 'imgur' > tcrfnet-20230322-history.xml_temp.txt` followed by `grep -Pahoi '\S*imgur\S*' tcrfnet-20230322-history.xml_temp.txt | sort -u > tcrfnet-20230322-history.xml_imgur.txt`
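The two grep passes above can be folded into one reusable step. This is only a minimal sketch, not the command actually used: the function name `extract_imgur_links` is hypothetical, it drops the `mediafire` prefilter, and it swaps the `-P` pattern `\S*imgur\S*` for an equivalent `-E` pattern so it also runs where grep lacks PCRE support. It still assumes a grep that supports `-a` and `-o` (GNU and BSD grep do).

```shell
#!/bin/sh
# Hypothetical helper (not from the log): reads text on stdin and prints
# each unique whitespace-delimited token that contains "imgur".
#
# Stage 1: -F fixed string, -a treat input as text, -i ignore case.
#          This is a cheap prefilter that keeps only lines mentioning imgur.
# Stage 2: -o prints only the matched tokens, one per line; the POSIX
#          class [^[:space:]] stands in for \S from the -P version.
# Stage 3: a fixed locale keeps the sort order reproducible across machines.
extract_imgur_links() {
    grep -Fai 'imgur' |
        grep -Eaoi '[^[:space:]]*imgur[^[:space:]]*' |
        LC_ALL=C sort -u
}
```

Assuming the same dump file, usage would look like `7z x -so tcrfnet-20230322-history.xml.7z '*.xml' | extract_imgur_links > tcrfnet-20230322-history.xml_imgur.txt`. Tokens are split on whitespace only, so surrounding wiki markup such as brackets or pipes stays attached to a link and may need trimming afterwards.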
20:16:15icedice quits [Ping timeout: 265 seconds]
20:16:48icedice (icedice) joins
21:10:55hitgrr8 quits [Client Quit]
22:06:41luckcolors_ quits [Quit: No Ping reply in 180 seconds.]
22:08:15luckcolors (luckcolors) joins
22:09:20Nemo_bis quits [Remote host closed the connection]
22:09:27Nemo_bis (Nemo_bis) joins
22:15:44TastyWiener95 quits [Client Quit]
22:30:37icedice quits [Ping timeout: 265 seconds]
23:18:28Bedivere joins
23:20:14Sir_Bedivere quits [Ping timeout: 252 seconds]
23:29:13TastyWiener95 (TastyWiener95) joins