00:04:21etnguyen03 quits [Ping timeout: 272 seconds]
00:19:38etnguyen03 (etnguyen03) joins
00:22:35pupnik quits [Remote host closed the connection]
00:42:54benjinsmi joins
00:44:33benjins joins
00:46:47benjinsm quits [Ping timeout: 272 seconds]
00:47:34benjinsmi quits [Ping timeout: 265 seconds]
00:55:47etnguyen03 quits [Ping timeout: 265 seconds]
01:08:19HP_Archivist (HP_Archivist) joins
01:10:01hitchhitchhitch joins
01:10:22Edel69 joins
01:10:48<hitchhitchhitch>Hello all, it seems like The China Project (https://thechinaproject.com/2023/11/06/some-sad-news/) is shutting down. Can someone please send this through archivebot?
01:14:13<@JAA>hitchhitchhitch: Thanks, I've started a job for it.
01:18:01hitchhitchhitch quits [Ping timeout: 265 seconds]
01:21:27<Edel69>Hello. Quick question about the Imgur Archive Project. Imgur abruptly deleted my account without warning and I lost all of my uploads. Will it be possible to identify a specific Imgur user's deleted files and albums once the archived data is ready to be publicly released?
01:24:18unvariedexcuse (unvariedexcuse) joins
01:24:27<nicolas17>Edel69: hmm not sure if public albums were archived
01:24:53icedice quits [Client Quit]
01:26:25<pokechu22>We also never archived the user profile pages (though we collected a list of them I think)
01:26:49<Edel69>None of my uploads were public. I guess that makes this even worse then. I just can't believe they nuked my decade old account with no warning, and I'm apparently not the only one.
01:28:40<nicolas17>Edel69: no, other way around, public galleries weren't archived because they were at less risk of deletion
01:28:42<pokechu22>You might be able to recover a list from browser history?
01:30:05<nicolas17>so random uploads that *weren't* on public galleries (the publicly-seen side of imgur.com where mostly memes are shared) are more likely to be archived
01:30:12<nicolas17>problem is I'm not sure how to get them by username
01:30:19<@JAA>It might in theory be possible to find albums, but since there is no index, you'd have to go through the full 650 TiB of data to find them, so it's infeasible.
01:30:31<@JAA>Individual image pages don't contain the username.
01:30:31<@arkiver>yeah no easy way
01:30:39<nicolas17>JAA: are the warcs even public?
01:31:06<@arkiver>i think so
01:31:23<@arkiver>yeah they are
01:31:57<pokechu22>note that galleries and albums are different
01:32:14<Edel69>nicolas17 I guess this means there's no hope then. Thanks for the clarification. pokechu22 Are you referring to a list of URLs?
01:32:43<nicolas17>URLs or image IDs
01:33:33<pokechu22>Pulling up an example from my browser history... which was never saved, I guess I need to mine that for more URLs... https://imgur.com/a/hZmgsE8 exists but https://imgur.com/gallery/hZmgsE8 doesn't, but on the other hand https://imgur.com/a/MSeaL6C is the same as
01:34:20<@JAA>Yeah, albums and galleries and their relations are weird. I think we discussed that in detail in #imgone early in the project.
01:34:37<@arkiver>yeah
01:34:58<nicolas17>are you EricBowman86?
01:35:15<nicolas17>or was that just a random one from browser history, not your upload?
01:35:18<pokechu22>ah, but https://i.imgur.com/6WN7pub.png was saved, probably extracted from my IRC logs, so it's *only* the album that wasn't saved
01:35:33<pokechu22>https://imgur.com/a/MSeaL6C is a random one from the "most viral" section
01:35:50<pokechu22>my username on imgur was pokechu22 though I also uploaded a lot of stuff when not signed in
01:35:57<Edel69>pokechu22 Probably wouldn't work out well because I had a lot of private albums and images. I doubt they're all in the browser history.
01:37:12etnguyen03 (etnguyen03) joins
01:49:01<@arkiver>i lost some channels previously on the 9th of July apparently
01:49:07<@arkiver>just reconnected to them
01:49:47shinji257 quits [Client Quit]
01:49:56shinji257 (shinji257) joins
01:53:37hitchhitchhitch joins
01:54:14hitchhitchhitch quits [Remote host closed the connection]
01:58:12<unvariedexcuse>hi all
01:58:28<unvariedexcuse>do you know of any effort towards preserving twitter spaces? (audio rooms on the website now known as X)
02:00:11<@arkiver>nowadays those are completely behind a login wall right?
02:00:28<@arkiver>i remember at some point one could listen to them without login, but last time i checked one it was behind a login
02:00:32<unvariedexcuse>arkiver: some metadata yes
02:00:58<unvariedexcuse>arkiver: but not actual audio chunks IIRC
02:01:10<nicolas17>how do we get the URLs to the audio chunks though?
02:01:33<@arkiver>yeah
02:01:46<@arkiver>same question here
02:01:57<unvariedexcuse>nicolas17: the live_video_stream API should be usable without login
02:01:58<nicolas17>also metadata is kind of important, we don't want a giant pile of unlabeled mp3s :)
02:02:21<unvariedexcuse>as they seem to be based off periscope infra, a fair amount of code could be shared with the periscope grab
02:03:56<unvariedexcuse>at first https://github.com/HoloArchivists/twspace-dl and now https://github.com/HitomaruKonpaku/twspace-crawler appear to be the state of the art
02:04:04Edel69 quits [Remote host closed the connection]
02:04:21<@arkiver>i'll check them out, did not have a very good look at twitter spaces yet
02:05:54<unvariedexcuse>some may be officially unrecorded but if you get the m3u URL while they're live you can download them in full within 30 days
02:09:01benjinsm joins
02:09:42<unvariedexcuse>watching for live spaces via the avatar_content API would likely not be suited for warriors as it requires login AFAIK
02:11:45<unvariedexcuse>searching for spaces via other means (either recorded or not) is otherwise notoriously difficult
02:12:27<unvariedexcuse>would they even be in scope for AT? your call
02:13:07benjins quits [Ping timeout: 265 seconds]
02:31:58jarfeh joins
02:38:42<jarfeh>Hello there! I'm trying to recover some missing videos off youtube that were titled "lounge edit". I recently found a website called YouTube Video Finder made by a "TheTechRobo". I have the URL for some of the deleted videos, but this website mentioned that there is a "#youtubearchive" here that has the video?
02:39:11<@JAA>/join #youtubearchive
02:39:28<pabs>archive.org has some youtube saved too btw, join #down-the-tube for that
02:39:54<jarfeh>I did check archive first, however it just only has the page saved of some of the videos without the actual video saved
02:40:16<TheTechRobo>yeah, you'll want #youtubearchive
02:40:44<TheTechRobo>the command that J.AA sent should work
02:42:29<jarfeh>It did! Thank you for your website by the way, I came across it in a comment thread in the DataHoarder reddit and it's helped with recovering
02:49:26<TheTechRobo>jarfeh: awesome, glad to hear it! :-)
03:06:11BlueMaxima quits [Read error: Connection reset by peer]
03:11:23<nulldata>https://twitter.com/YahtzeeCroshaw/status/1721687212541280425
03:11:23<eggdrop>nitter: https://nitter.net/YahtzeeCroshaw/status/1721687212541280425
03:11:47<nulldata>Might be good to backup the Zero Punctuation videos
03:12:50<@JAA>nulldata: The entire channel is already running through #down-the-tube. :-)
03:13:07<nulldata>Aw sweet- thanks :)
03:13:13<nulldata>Ah*
03:14:31<@JAA>Oh, there's a separate channel from the general Escapist one.
03:14:52<@JAA>Or maybe that's unofficial.
03:15:16<nulldata>I wonder if there's any Escapist videos exclusive to the site and not on the YT channel? A reply to the post says the entire video team left
03:34:34Mateon1 quits [Remote host closed the connection]
03:34:34Arcorann quits [Remote host closed the connection]
03:34:36Mateon1 joins
03:40:38Arcorann (Arcorann) joins
03:40:48jwn joins
04:21:27jwn quits [Remote host closed the connection]
04:32:26killshot joins
04:36:03dumbgoy quits [Ping timeout: 272 seconds]
04:42:56unvariedexcuse quits [Ping timeout: 245 seconds]
04:54:57unvariedexcuse (unvariedexcuse) joins
05:11:27unvariedexcuse leaves
05:22:55etnguyen03 quits [Ping timeout: 272 seconds]
05:29:49etnguyen03 (etnguyen03) joins
05:31:13DogsRNice quits [Read error: Connection reset by peer]
05:48:12etnguyen03 quits [Client Quit]
05:55:27killshot quits [Ping timeout: 265 seconds]
06:06:11killshot joins
06:08:13killshot quits [Remote host closed the connection]
06:08:36killshot joins
06:13:20killshot quits [Ping timeout: 265 seconds]
06:28:09nicolas17 quits [Ping timeout: 272 seconds]
07:12:38Island_ joins
07:12:48Dango360_ joins
07:14:02jarfeh85 joins
07:15:42Dango360 quits [Ping timeout: 265 seconds]
07:16:10jarfeh quits [Ping timeout: 265 seconds]
07:17:08Island quits [Ping timeout: 265 seconds]
07:18:44Island_ quits [Read error: Connection reset by peer]
07:21:12_Dango360 joins
07:21:30AK quits [Client Quit]
07:21:30imer quits [Client Quit]
07:21:31Mateon1 quits [Remote host closed the connection]
07:21:34TheTechRobo quits [Client Quit]
07:21:35killshot joins
07:21:36Mateon1 joins
07:21:46imer (imer) joins
07:21:58AK (AK) joins
07:22:04TheTechRobo (TheTechRobo) joins
07:23:02killshot1337 joins
07:24:48TheTechRobo quits [Excess Flood]
07:25:20TheTechRobo (TheTechRobo) joins
07:26:19Dango360_ quits [Ping timeout: 265 seconds]
07:26:48killshot quits [Ping timeout: 242 seconds]
09:06:51treora quits [Ping timeout: 265 seconds]
09:09:19treora joins
09:11:49icedice (icedice) joins
09:12:53icedice quits [Client Quit]
09:20:21qwertyasdfuiopghjkl quits [Remote host closed the connection]
09:34:39icedice (icedice) joins
09:35:24negge_ is now known as negge
09:52:25Wohlstand (Wohlstand) joins
10:00:02Bleo1 quits [Client Quit]
10:01:14Bleo1 joins
10:09:59qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins
10:31:40killshot1337x joins
10:35:19jarfeh85 quits [Ping timeout: 265 seconds]
10:36:17killshot1337 quits [Ping timeout: 265 seconds]
10:46:27killshot1337 joins
10:46:29Wohlstand quits [Remote host closed the connection]
10:46:33eroc1990 quits [Read error: Connection reset by peer]
10:46:40eroc1990 (eroc1990) joins
10:46:44Wohlstand (Wohlstand) joins
10:51:15killshot1337x quits [Ping timeout: 265 seconds]
11:06:11pabs quits [Ping timeout: 272 seconds]
11:08:12pabs (pabs) joins
11:16:52threedeeitguy39 quits [Read error: Connection reset by peer]
11:17:21threedeeitguy390 (threedeeitguy) joins
12:37:16dumbgoy joins
12:38:39Arcorann quits [Ping timeout: 272 seconds]
12:40:23<magmaus3>Sharing here just in case: https://lemmy.sdf.org/post/7179616
12:43:01nyany quits [Read error: Connection reset by peer]
12:52:40etnguyen03 (etnguyen03) joins
12:59:44<thuban>(site is https://hikarinoakari.com/)
13:07:26<thuban>site itself looks ok for archivebot except for disqus (images are lazy-loaded but have in-source srcs)
13:08:38betamax_ is now known as betamax
13:09:12Megame (Megame) joins
13:12:09<thuban>music is behind an onsite landing page which base64s a link to an offsite landing page (either a login-walled forum or a link shortener) which links to a third-party host (mostly google drive/mega), so no chance of abing that
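A minimal sketch of the decoding step thuban describes, assuming the onsite landing page carries the offsite link as a base64 value in a query parameter. The parameter name `url` and the example addresses are hypothetical, not taken from the site; the real pages may embed the value differently.

```python
import base64
from urllib.parse import urlparse, parse_qs


def decode_landing_target(landing_url: str, param: str = "url") -> str:
    """Return the base64-decoded link carried in a landing-page URL.

    `param` is a hypothetical query parameter name (an assumption).
    """
    query = parse_qs(urlparse(landing_url).query)
    encoded = query[param][0]
    # Restore any padding that was stripped from the encoded value.
    encoded += "=" * (-len(encoded) % 4)
    # urlsafe_b64decode also accepts the standard alphabet's "+" and "/".
    return base64.urlsafe_b64decode(encoded).decode("utf-8")


if __name__ == "__main__":
    # Build a made-up landing URL to show the round trip.
    target = "https://example.com/offsite-landing"
    token = base64.urlsafe_b64encode(target.encode("utf-8")).decode("ascii")
    print(decode_landing_target("https://example.org/landing?url=" + token))
```

Decoding only recovers the offsite landing page; the click-through and the third-party file hosts mentioned above would still be out of reach.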
13:42:21pabs quits [Ping timeout: 265 seconds]
13:44:47pabs (pabs) joins
13:45:47etnguyen03 quits [Ping timeout: 272 seconds]
14:09:22nicolas17 joins
14:09:28kiryu quits [Remote host closed the connection]
14:09:45pabs quits [Client Quit]
14:10:17pabs (pabs) joins
14:10:32kiryu (kiryu) joins
14:38:34keksie joins
14:39:25keksie quits [Remote host closed the connection]
14:49:32<masterX244>Link shorteners could probably be extracted by some warc-digesting and then crunched out
15:00:49<thuban>in theory, yeah
15:01:58<thuban>(would have to go through another round of ab since it requires you to click through rather than being a redirect--oddly enough, the site claims to have a captcha but just works with js disabled)
15:03:00<thuban>in practice, idk how valuable it would be given that we don't have tooling for those file hosts
15:15:50rktk quits [Client Quit]
15:17:03rktk (rktk) joins
15:17:55rktk quits [Client Quit]
15:19:12rktk (rktk) joins
15:25:26<@arkiver>the problems at IA one to two months ago have now been fully fixed
15:28:20<thuban>excellent! :D
15:53:38Wohlstand quits [Client Quit]
15:58:12Wohlstand (Wohlstand) joins
15:58:44<nicolas17>arkiver: cool, let's resume bruteforcing imgur
15:58:48<nicolas17>(let's not :D)
16:01:03<@arkiver>:P
16:04:46Wohlstand quits [Client Quit]
16:11:40Island joins
16:12:02BearFortress quits [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
16:28:53parfait (kdqep) joins
16:40:36BearFortress joins
16:50:28<mgrandi>https://www-forbes-com.cdn.ampproject.org/v/s/www.forbes.com/sites/paultassi/2023/11/07/zero-punctuation-ends-as-the-escapist-faces-mass-resignations-after-eic-firing/amp/?amp_gsa=1&amp_js_v=a9&usqp=mq331AQGsAEggAID#amp_tf=From%20%251%24s&aoh=16993753722447&csi=0&referrer=https%3A%2F%2Fwww.google.com&ampshare=https%3A%2F%2Fwww.forbes.com%2Fsites%2Fpaultassi%2F2023%2F11%2F07%2Fzero-punctuation-ends-as-the-escapist-faces-mas
16:50:28<mgrandi>s-resignations-after-eic-firing%2F
16:50:43<mgrandi>Oh my gosh I got bamboozled by the url length I'm sorry
16:51:47<@JAA>https://www.forbes.com/sites/paultassi/2023/11/07/zero-punctuation-ends-as-the-escapist-faces-mass-resignations-after-eic-firing/amp/
16:54:33<mgrandi>Yes that
17:06:00<@JAA>The download links on https://hikarinoakari.com/ are a mess. Some go to a link shortener with a captcha, some go to a Twitter account, etc.
17:06:12<@JAA>The site itself is running through AB though.
17:09:13pabs quits [Ping timeout: 265 seconds]
17:11:43<vokunal|m>Now that IA is ok, can we get mediaonfire unclogged? If nothing changed, it's still going into temp storage
17:12:15<@arkiver>vokunal|m: mediafire still going to temporary storage?
17:12:31<@arkiver>i see WARCs appearing on IA
17:12:34<@arkiver>seems like it is going to IA
17:12:44<@arkiver>and it doesn't seem to be clogged
17:14:05<vokunal|m>ah cool
17:14:50<vokunal|m>it must be flowing smooth then. I was thinking the out was clogged, but it must just be some funky items. They've been flowing nonstop, but slow
17:15:35<@arkiver>we also queue mediafire items discovered in #//
17:17:30<vokunal|m>That makes sense why the claims wouldn't be going down that fast. I was confused when we had around 40k todo, and it drained into claims over a week or so, but didn't seem to be leaving claims
17:43:50parfait_ joins
17:44:52<fireonlive>🥳
17:47:53parfait quits [Ping timeout: 265 seconds]
17:52:08Megame quits [Client Quit]
18:02:55killshot1337 quits [Ping timeout: 272 seconds]
18:33:12etnguyen03 (etnguyen03) joins
19:08:56lennier2 quits [Remote host closed the connection]
19:14:19DogsRNice joins
19:15:10lennier1 (lennier1) joins
19:18:09jarfeh joins
19:55:46Island quits [Remote host closed the connection]
19:55:46parfait_ quits [Remote host closed the connection]
19:55:46DogsRNice quits [Remote host closed the connection]
19:55:53Island joins
19:55:58DogsRNice joins
19:56:00parfait_ joins
20:04:31HP_Archivist quits [Ping timeout: 272 seconds]
20:05:09_Dango360 quits [Ping timeout: 272 seconds]
20:57:05dumbgoy quits [Ping timeout: 272 seconds]
20:59:41dumbgoy joins
22:02:28dumbgoy_ joins
22:05:30dumbgoy quits [Ping timeout: 265 seconds]
22:07:47unsimply joins
22:08:54unsimply quits [Remote host closed the connection]
22:30:12dumbgoy_ quits [Read error: Connection reset by peer]
23:08:57BlueMaxima joins
23:18:12dumbgoy joins
23:20:38Pedrosso joins
23:41:12Barto quits [Ping timeout: 265 seconds]
23:42:26<Pedrosso>I'm new to hackint.org as well as to this chat. The wiki says this channel is supposed to be the right one to ask/inform about dying websites, is that accurate?
23:42:37<pokechu22>Pedrosso: Yes
23:43:37etnguyen03 quits [Ping timeout: 265 seconds]
23:49:46<Pedrosso>Spore is a game that's been out since September 4th, 2008, and support (by EA) has been declining. There's no official shutdown date afaik, but I am anxious considering the company that's hosting it (EA), and how the company has already almost broken the game itself with its own launcher. What I'm worried about saving is the
23:49:46<Pedrosso>Sporepedia (spore.com), a large and very old website with millions of users and creations (>10 million enumerated files with an average filesize of approximately 20kB). I've been using my own (bad) code to save only the creations, but it's inefficient and also leaves out all the forums, creators, comments, etc.
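As a rough back-of-the-envelope check on the figures quoted above, the sketch below multiplies the file count by the average size. It assumes "20kB" means 20,000 bytes and treats ">10 million" as exactly 10 million, so the result is a lower bound for the creation files only (forums, profiles, and comments are not counted).

```python
# Figures quoted in the message above; ">10 million" makes this a lower bound.
FILE_COUNT = 10_000_000
AVG_FILE_SIZE_BYTES = 20 * 1000  # assumption: "20kB" means 20,000 bytes


def estimate_total_bytes(file_count: int, avg_size_bytes: int) -> int:
    """Estimate total size as file count times average file size."""
    return file_count * avg_size_bytes


total_bytes = estimate_total_bytes(FILE_COUNT, AVG_FILE_SIZE_BYTES)
print(f"about {total_bytes / 1e9:.0f} GB")
```

Under those assumptions the creations alone come to roughly 200 GB.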
23:50:24<Pedrosso>I don't know how much this community cares for archiving such stuff, as it's a niche thing. Mind enlightening me?
23:53:33<@arkiver>can ArchiveBot handle that spore.com ?
23:53:37<@arkiver>would be nice to have a copy yes
23:53:50<pokechu22>EA doesn't like us unfortunately :|
23:54:11<@arkiver>does that matter? :P
23:54:48<@JAA>If they block or rate-limit us, it does.
23:54:49<pokechu22>Most of their websites time out with archivebot, though that's mostly newer stuff (ea.com and like battlefront I think). Not sure if spore.com is also affected
23:54:53<Pedrosso>I have been downloading these files for a long while and have had 0 apparent problems with rate-limiting, etc
23:55:08<pokechu22>or, not even timeout, instead it acts more like a tarpit if I recall correctly
23:55:10<Flashfire42>Holy shit that site shared in the main channel is cancerous so many popups
23:55:41<pokechu22>(for reference the site shared there was https://hikarinoakari.com / https://imgur.com/jeJSEu6)
23:55:49<@JAA>I'm getting an expired cert on spore.com. Nice.
23:56:09<Pedrosso>the site is very much in a state of disrepair, hence why I'm concerned
23:56:17<pokechu22>I believe ScenarioPlanet sent me some spore-related stuff and that worked fine in the past, but it was a fairly small subset
23:56:53<Pedrosso>http://www.spore.com/sporepedia is what I'm referring to specifically
23:57:22<@JAA>The site does not work well without JS, so there's that.
23:57:50<Flashfire42>Spore just timed out for me on the main domain there. spore.com
23:58:03<@JAA>Yeah, took me a few tries as well to get there.
23:58:04<Flashfire42>sporepedia loads fine, maybe it needs the www
23:58:30<@arkiver>needs the www indeed
23:58:37<@JAA>We can certainly try to run it through ArchiveBot.