00:04:21 | | etnguyen03 quits [Ping timeout: 272 seconds] |
00:19:38 | | etnguyen03 (etnguyen03) joins |
00:22:35 | | pupnik quits [Remote host closed the connection] |
00:42:54 | | benjinsmi joins |
00:44:33 | | benjins joins |
00:46:47 | | benjinsm quits [Ping timeout: 272 seconds] |
00:47:34 | | benjinsmi quits [Ping timeout: 265 seconds] |
00:55:47 | | etnguyen03 quits [Ping timeout: 265 seconds] |
01:08:19 | | HP_Archivist (HP_Archivist) joins |
01:10:01 | | hitchhitchhitch joins |
01:10:22 | | Edel69 joins |
01:10:48 | <hitchhitchhitch> | Hello all, it seems like The China Project (https://thechinaproject.com/2023/11/06/some-sad-news/) is shutting down. Can someone please send this through archivebot? |
01:14:13 | <@JAA> | hitchhitchhitch: Thanks, I've started a job for it. |
01:18:01 | | hitchhitchhitch quits [Ping timeout: 265 seconds] |
01:21:27 | <Edel69> | Hello. Quick question about the Imgur Archive Project. Imgur abruptly deleted my account without warning and I lost all of my uploads. Will it be possible to identify a specific Imgur user's deleted files and albums once the archived data is ready to be publicly released? |
01:24:18 | | unvariedexcuse (unvariedexcuse) joins |
01:24:27 | <nicolas17> | Edel69: hmm not sure if public albums were archived |
01:24:53 | | icedice quits [Client Quit] |
01:26:25 | <pokechu22> | We also never archived the user profile pages (though we collected a list of them I think) |
01:26:49 | <Edel69> | None of my uploads were public. I guess that makes this even worse then. I just can't believe they nuked my decade old account with no warning, and I'm apparently not the only one. |
01:28:40 | <nicolas17> | Edel69: no, other way around, public galleries weren't archived because they were at less risk of deletion |
01:28:42 | <pokechu22> | You might be able to recover a list from browser history? |
01:30:05 | <nicolas17> | so random uploads that *weren't* on public galleries (the publicly-seen side of imgur.com where mostly memes are shared) are more likely to be archived |
01:30:12 | <nicolas17> | problem is I'm not sure how to get them by username |
01:30:19 | <@JAA> | It might in theory be possible to find albums, but since there is no index, you'd have to go through the full 650 TiB of data to find them, so it's infeasible. |
01:30:31 | <@JAA> | Individual image pages don't contain the username. |
01:30:31 | <@arkiver> | yeah no easy way |
01:30:39 | <nicolas17> | JAA: are the warcs even public? |
01:31:06 | <@arkiver> | i think so |
01:31:23 | <@arkiver> | yeah they are |
01:31:57 | <pokechu22> | note that galleries and albums are different |
01:32:14 | <Edel69> | nicolas17 I guess this means there's no hope then. Thanks for the clarification. pokechu22 Are you referring to a list of URLs? |
01:32:43 | <nicolas17> | URLs or image IDs |
01:33:33 | <pokechu22> | Pulling up an example from my browser history... which was never saved, I guess I need to mine that for more URLs... https://imgur.com/a/hZmgsE8 exists but https://imgur.com/gallery/hZmgsE8 doesn't, but on the other hand https://imgur.com/a/MSeaL6C is the same as |
01:34:20 | <@JAA> | Yeah, albums and galleries and their relations are weird. I think we discussed that in detail in #imgone early in the project. |
01:34:37 | <@arkiver> | yeah |
01:34:58 | <nicolas17> | are you EricBowman86? |
01:35:15 | <nicolas17> | or was that just a random one from browser history, not your upload? |
01:35:18 | <pokechu22> | ah, but https://i.imgur.com/6WN7pub.png was saved, probably extracted from my IRC logs, so it's *only* the album that wasn't saved |
01:35:33 | <pokechu22> | https://imgur.com/a/MSeaL6C is a random one from the "most viral" section |
01:35:50 | <pokechu22> | my username on imgur was pokechu22 though I also uploaded a lot of stuff when not signed in |
01:35:57 | <Edel69> | pokechu22 Probably wouldn't work out well because I had a lot of private albums and images. I doubt they're all in the browser history. |
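A minimal sketch of the browser-history idea discussed above, assuming the history has already been exported as a plain list of URLs. The export step and the helper name `extract_imgur_ids` are hypothetical and not something from the conversation. It sorts imgur URLs into the shapes mentioned in the log: albums (`/a/<id>`), galleries (`/gallery/<id>`), and direct images (`i.imgur.com/<id>.<ext>`).

```python
import re
from urllib.parse import urlparse

# Assumption: imgur IDs are plain alphanumeric strings.
_ID = r"[A-Za-z0-9]+"


def extract_imgur_ids(urls):
    """Group imgur IDs found in an iterable of URLs by kind (hypothetical helper)."""
    found = {"album": set(), "gallery": set(), "image": set()}
    for url in urls:
        parsed = urlparse(url.strip())
        host = (parsed.hostname or "").lower()
        path = parsed.path
        if host in ("imgur.com", "www.imgur.com"):
            # https://imgur.com/a/<id> or https://imgur.com/gallery/<id>
            m = re.fullmatch(rf"/(a|gallery)/({_ID})/?", path)
            if m:
                kind = "album" if m.group(1) == "a" else "gallery"
                found[kind].add(m.group(2))
                continue
            # https://imgur.com/<id>: treated as an image page candidate;
            # this may also catch non-image pages, so review the results.
            m = re.fullmatch(rf"/({_ID})/?", path)
            if m:
                found["image"].add(m.group(1))
        elif host == "i.imgur.com":
            # https://i.imgur.com/<id>.<ext>
            m = re.fullmatch(rf"/({_ID})\.\w+", path)
            if m:
                found["image"].add(m.group(1))
    return found
```

The resulting ID sets could then be checked one by one against whatever was archived. As noted in the discussion, this only recovers items that actually appear in the history, so private uploads that were never revisited would be missed.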
01:37:12 | | etnguyen03 (etnguyen03) joins |
01:49:01 | <@arkiver> | i lost some channels previously on the 9th of July apparently |
01:49:07 | <@arkiver> | just reconnected to them |
01:49:47 | | shinji257 quits [Client Quit] |
01:49:56 | | shinji257 (shinji257) joins |
01:53:37 | | hitchhitchhitch joins |
01:54:14 | | hitchhitchhitch quits [Remote host closed the connection] |
01:58:12 | <unvariedexcuse> | hi all |
01:58:28 | <unvariedexcuse> | do you know of any effort towards preserving twitter spaces? (audio rooms on the website now known as X) |
02:00:11 | <@arkiver> | nowadays those are completely behind a login wall right? |
02:00:28 | <@arkiver> | i remember at some point one could listen to them without login, but last time i checked one it was behind a login |
02:00:32 | <unvariedexcuse> | arkiver: some metadata yes |
02:00:58 | <unvariedexcuse> | arkiver: but not actual audio chunks IIRC |
02:01:10 | <nicolas17> | how do we get the URLs to the audio chunks though? |
02:01:33 | <@arkiver> | yeah |
02:01:46 | <@arkiver> | same question here |
02:01:57 | <unvariedexcuse> | nicolas17: the live_video_stream API should be usable without login |
02:01:58 | <nicolas17> | also metadata is kind of important, we don't want a giant pile of unlabeled mp3s :) |
02:02:21 | <unvariedexcuse> | as they seem to be based off periscope infra, a fair amount of code could be shared with the periscope grab |
02:03:56 | <unvariedexcuse> | at first https://github.com/HoloArchivists/twspace-dl and now https://github.com/HitomaruKonpaku/twspace-crawler appear to be the state of the art |
02:04:04 | | Edel69 quits [Remote host closed the connection] |
02:04:21 | <@arkiver> | i'll check them out, did not have a very good look at twitter spaces yet |
02:05:54 | <unvariedexcuse> | some may be officially unrecorded but if you get the m3u URL while they're live you can download them in full within 30 days |
02:09:01 | | benjinsm joins |
02:09:42 | <unvariedexcuse> | watching for live spaces via the avatar_content API would likely not be suited for warriors as it requires login AFAIK |
02:11:45 | <unvariedexcuse> | searching for spaces via other means (either recorded or not) is otherwise notoriously difficult |
02:12:27 | <unvariedexcuse> | would they even be in scope for AT? your call |
02:13:07 | | benjins quits [Ping timeout: 265 seconds] |
02:31:58 | | jarfeh joins |
02:38:42 | <jarfeh> | Hello there! I'm trying to recover some missing videos off youtube that were titled "lounge edit". I recently found a website called YouTube Video Finder made by a "TheTechRobo". I have the URL for some of the deleted videos, but this website mentioned that there is a "#youtubearchive" here that has the video? |
02:39:11 | <@JAA> | /join #youtubearchive |
02:39:28 | <pabs> | archive.org has some youtube saved too btw, join #down-the-tube for that |
02:39:54 | <jarfeh> | I did check archive first, however it just only has the page saved of some of the videos without the actual video saved |
02:40:16 | <TheTechRobo> | yeah, you'll want #youtubearchive |
02:40:44 | <TheTechRobo> | the command that J.AA sent should work |
02:42:29 | <jarfeh> | It did! Thank you for your website by the way, I came across it in a comment thread in the DataHoarder reddit and it's helped with recovering |
02:49:26 | <TheTechRobo> | jarfeh: awesome, glad to hear it! :-) |
03:06:11 | | BlueMaxima quits [Read error: Connection reset by peer] |
03:11:23 | <nulldata> | https://twitter.com/YahtzeeCroshaw/status/1721687212541280425 |
03:11:23 | <eggdrop> | nitter: https://nitter.net/YahtzeeCroshaw/status/1721687212541280425 |
03:11:47 | <nulldata> | Might be good to backup the Zero Punctuation videos |
03:12:50 | <@JAA> | nulldata: The entire channel is already running through #down-the-tube. :-) |
03:13:07 | <nulldata> | Aw sweet- thanks :) |
03:13:13 | <nulldata> | Ah* |
03:14:31 | <@JAA> | Oh, there's a separate channel from the general Escapist one. |
03:14:52 | <@JAA> | Or maybe that's unofficial. |
03:15:16 | <nulldata> | I wonder if there's any Escapist videos exclusive to the site and not on the YT channel? A reply to the post says the entire video team left |
03:34:34 | | Mateon1 quits [Remote host closed the connection] |
03:34:34 | | Arcorann quits [Remote host closed the connection] |
03:34:36 | | Mateon1 joins |
03:40:38 | | Arcorann (Arcorann) joins |
03:40:48 | | jwn joins |
04:21:27 | | jwn quits [Remote host closed the connection] |
04:32:26 | | killshot joins |
04:36:03 | | dumbgoy quits [Ping timeout: 272 seconds] |
04:42:56 | | unvariedexcuse quits [Ping timeout: 245 seconds] |
04:54:57 | | unvariedexcuse (unvariedexcuse) joins |
05:11:27 | | unvariedexcuse leaves |
05:22:55 | | etnguyen03 quits [Ping timeout: 272 seconds] |
05:29:49 | | etnguyen03 (etnguyen03) joins |
05:31:13 | | DogsRNice quits [Read error: Connection reset by peer] |
05:48:12 | | etnguyen03 quits [Client Quit] |
05:55:27 | | killshot quits [Ping timeout: 265 seconds] |
06:06:11 | | killshot joins |
06:08:13 | | killshot quits [Remote host closed the connection] |
06:08:36 | | killshot joins |
06:13:20 | | killshot quits [Ping timeout: 265 seconds] |
06:28:09 | | nicolas17 quits [Ping timeout: 272 seconds] |
07:12:38 | | Island_ joins |
07:12:48 | | Dango360_ joins |
07:14:02 | | jarfeh85 joins |
07:15:42 | | Dango360 quits [Ping timeout: 265 seconds] |
07:16:10 | | jarfeh quits [Ping timeout: 265 seconds] |
07:17:08 | | Island quits [Ping timeout: 265 seconds] |
07:18:44 | | Island_ quits [Read error: Connection reset by peer] |
07:21:12 | | _Dango360 joins |
07:21:30 | | AK quits [Client Quit] |
07:21:30 | | imer quits [Client Quit] |
07:21:31 | | Mateon1 quits [Remote host closed the connection] |
07:21:34 | | TheTechRobo quits [Client Quit] |
07:21:35 | | killshot joins |
07:21:36 | | Mateon1 joins |
07:21:46 | | imer (imer) joins |
07:21:58 | | AK (AK) joins |
07:22:04 | | TheTechRobo (TheTechRobo) joins |
07:23:02 | | killshot1337 joins |
07:24:48 | | TheTechRobo quits [Excess Flood] |
07:25:20 | | TheTechRobo (TheTechRobo) joins |
07:26:19 | | Dango360_ quits [Ping timeout: 265 seconds] |
07:26:48 | | killshot quits [Ping timeout: 242 seconds] |
09:06:51 | | treora quits [Ping timeout: 265 seconds] |
09:09:19 | | treora joins |
09:11:49 | | icedice (icedice) joins |
09:12:53 | | icedice quits [Client Quit] |
09:20:21 | | qwertyasdfuiopghjkl quits [Remote host closed the connection] |
09:34:39 | | icedice (icedice) joins |
09:35:24 | | negge_ is now known as negge |
09:52:25 | | Wohlstand (Wohlstand) joins |
10:00:02 | | Bleo1 quits [Client Quit] |
10:01:14 | | Bleo1 joins |
10:09:59 | | qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins |
10:31:40 | | killshot1337x joins |
10:35:19 | | jarfeh85 quits [Ping timeout: 265 seconds] |
10:36:17 | | killshot1337 quits [Ping timeout: 265 seconds] |
10:46:27 | | killshot1337 joins |
10:46:29 | | Wohlstand quits [Remote host closed the connection] |
10:46:33 | | eroc1990 quits [Read error: Connection reset by peer] |
10:46:40 | | eroc1990 (eroc1990) joins |
10:46:44 | | Wohlstand (Wohlstand) joins |
10:51:15 | | killshot1337x quits [Ping timeout: 265 seconds] |
11:06:11 | | pabs quits [Ping timeout: 272 seconds] |
11:08:12 | | pabs (pabs) joins |
11:16:52 | | threedeeitguy39 quits [Read error: Connection reset by peer] |
11:17:21 | | threedeeitguy390 (threedeeitguy) joins |
12:37:16 | | dumbgoy joins |
12:38:39 | | Arcorann quits [Ping timeout: 272 seconds] |
12:40:23 | <magmaus3> | Sharing here just in case: https://lemmy.sdf.org/post/7179616 |
12:43:01 | | nyany quits [Read error: Connection reset by peer] |
12:52:40 | | etnguyen03 (etnguyen03) joins |
12:59:44 | <thuban> | (site is https://hikarinoakari.com/) |
13:07:26 | <thuban> | site itself looks ok for archivebot except for disqus (images are lazy-loaded but have in-source srcs) |
13:08:38 | | betamax_ is now known as betamax |
13:09:12 | | Megame (Megame) joins |
13:12:09 | <thuban> | music is behind an onsite landing page which base64s a link to an offsite landing page (either a login-walled forum or a link shortener) which links to a third-party host (mostly google drive/mega), so no chance of abing that |
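A rough sketch of the base64 step described above, assuming the onsite landing page carries the encoded offsite link in a query parameter. The parameter name `url` and the helper name `decode_base64_link` are hypothetical; the log does not say where the encoded value actually lives, so the extraction would need adapting to the real page.

```python
import base64
from urllib.parse import urlparse, parse_qs


def decode_base64_link(landing_url, param="url"):
    """Return the link hidden in a base64 query parameter, or None (hypothetical helper)."""
    values = parse_qs(urlparse(landing_url).query).get(param)
    if not values:
        return None
    # parse_qs turns '+' into a space; undo that for standard base64 input.
    token = values[0].replace(" ", "+")
    # Restore any padding that was stripped from the token.
    token += "=" * (-len(token) % 4)
    try:
        # urlsafe_b64decode accepts both the URL-safe and the standard alphabet.
        decoded = base64.urlsafe_b64decode(token).decode("utf-8")
    except ValueError:
        # Covers bad base64 (binascii.Error) and non-UTF-8 payloads.
        return None
    # Only accept results that look like a link.
    return decoded if decoded.startswith(("http://", "https://")) else None
```

This only gets as far as the offsite landing page; the click-through shortener, login-walled forum, and third-party file hosts mentioned above are separate problems.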
13:42:21 | | pabs quits [Ping timeout: 265 seconds] |
13:44:47 | | pabs (pabs) joins |
13:45:47 | | etnguyen03 quits [Ping timeout: 272 seconds] |
14:09:22 | | nicolas17 joins |
14:09:28 | | kiryu quits [Remote host closed the connection] |
14:09:45 | | pabs quits [Client Quit] |
14:10:17 | | pabs (pabs) joins |
14:10:32 | | kiryu (kiryu) joins |
14:38:34 | | keksie joins |
14:39:25 | | keksie quits [Remote host closed the connection] |
14:49:32 | <masterX244> | Link shorteners could probably be extracted by some warc-digesting and then crunched out |
15:00:49 | <thuban> | in theory, yeah |
15:01:58 | <thuban> | (would have to go through another round of ab since it requires you to click through rather than being a redirect--oddly enough, the site claims to have a captcha but just works with js disabled) |
15:03:00 | <thuban> | in practice, idk how valuable it would be given that we don't have tooling for those file hosts |
15:15:50 | | rktk quits [Client Quit] |
15:17:03 | | rktk (rktk) joins |
15:17:55 | | rktk quits [Client Quit] |
15:19:12 | | rktk (rktk) joins |
15:25:26 | <@arkiver> | the problems at IA one to two months ago have now been fully fixed |
15:28:20 | <thuban> | excellent! :D |
15:53:38 | | Wohlstand quits [Client Quit] |
15:58:12 | | Wohlstand (Wohlstand) joins |
15:58:44 | <nicolas17> | arkiver: cool, let's resume bruteforcing imgur |
15:58:48 | <nicolas17> | (let's not :D) |
16:01:03 | <@arkiver> | :P |
16:04:46 | | Wohlstand quits [Client Quit] |
16:11:40 | | Island joins |
16:12:02 | | BearFortress quits [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.] |
16:28:53 | | parfait (kdqep) joins |
16:40:36 | | BearFortress joins |
16:50:28 | <mgrandi> | https://www-forbes-com.cdn.ampproject.org/v/s/www.forbes.com/sites/paultassi/2023/11/07/zero-punctuation-ends-as-the-escapist-faces-mass-resignations-after-eic-firing/amp/?amp_gsa=1&_js_v=a9&usqp=mq331AQGsAEggAID#amp_tf=From%20%251%24s&aoh=16993753722447&csi=0&referrer=https%3A%2F%2Fwww.google.com&share=https%3A%2F%2Fwww.forbes.com%2Fsites%2Fpaultassi%2F2023%2F11%2F07%2Fzero-punctuation-ends-as-the-escapist-faces-mass-resignations-after-eic-firing%2F |
16:50:43 | <mgrandi> | Oh my gosh I got bamboozled by the url length I'm sorry |
16:51:47 | <@JAA> | https://www.forbes.com/sites/paultassi/2023/11/07/zero-punctuation-ends-as-the-escapist-faces-mass-resignations-after-eic-firing/amp/ |
16:54:33 | <mgrandi> | Yes that |
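For reference, the cleanup from the long AMP cache link to the plain Forbes link can be sketched as below. This assumes the usual AMP cache layout of `<subdomain>.cdn.ampproject.org/<c|i|v>/[s/]<origin host>/<origin path>`, where the optional `s/` marks an HTTPS origin. The helper name `unwrap_amp_cache_url` is made up for illustration, and the sketch drops the whole query string and fragment on the assumption that they are viewer parameters rather than part of the origin URL.

```python
from urllib.parse import urlsplit, urlunsplit


def unwrap_amp_cache_url(url):
    """Return the origin URL behind an AMP cache URL, or the input unchanged."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if not host.endswith(".cdn.ampproject.org"):
        return url
    # e.g. ['', 'v', 's', 'www.forbes.com', 'sites', ...]
    segments = parts.path.split("/")
    if len(segments) < 3 or segments[1] not in ("c", "i", "v"):
        return url
    rest = segments[2:]
    scheme = "http"
    if rest and rest[0] == "s":
        # The 's' segment marks an HTTPS origin.
        scheme = "https"
        rest = rest[1:]
    if not rest or not rest[0]:
        return url
    origin_host = rest[0]
    origin_path = "/" + "/".join(rest[1:])
    # Assumption: query and fragment belong to the AMP viewer, so drop them.
    return urlunsplit((scheme, origin_host, origin_path, "", ""))
```

URLs that are not on an AMP cache host are returned untouched, so the function is safe to run over a mixed list of links.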
17:06:00 | <@JAA> | The download links on https://hikarinoakari.com/ are a mess. Some go to a link shortener with a captcha, some go to a Twitter account, etc. |
17:06:12 | <@JAA> | The site itself is running through AB though. |
17:09:13 | | pabs quits [Ping timeout: 265 seconds] |
17:11:43 | <vokunal|m> | Now that IA is ok, can we get mediafire unclogged? If nothing changed, it's still going into temp storage |
17:12:15 | <@arkiver> | vokunal|m: mediafire still going to temporary storage? |
17:12:31 | <@arkiver> | i see WARCs appearing on IA |
17:12:34 | <@arkiver> | seems like it is going to IA |
17:12:44 | <@arkiver> | and it doesn't seem to be clogged |
17:14:05 | <vokunal|m> | ah cool |
17:14:50 | <vokunal|m> | it must be flowing smooth then. I was thinking the out was clogged, but it must just be some funky items. They've been flowing nonstop, but slow |
17:15:35 | <@arkiver> | we also queue mediafire items discovered in #// |
17:17:30 | <vokunal|m> | That makes sense why the claims wouldn't be going down that fast. I was confused when we had around 40k todo, and it drained into claims over a week or so, but didn't seem to be leaving claims |
17:43:50 | | parfait_ joins |
17:44:52 | <fireonlive> | 🥳 |
17:47:53 | | parfait quits [Ping timeout: 265 seconds] |
17:52:08 | | Megame quits [Client Quit] |
18:02:55 | | killshot1337 quits [Ping timeout: 272 seconds] |
18:33:12 | | etnguyen03 (etnguyen03) joins |
19:08:56 | | lennier2 quits [Remote host closed the connection] |
19:14:19 | | DogsRNice joins |
19:15:10 | | lennier1 (lennier1) joins |
19:18:09 | | jarfeh joins |
19:55:46 | | Island quits [Remote host closed the connection] |
19:55:46 | | parfait_ quits [Remote host closed the connection] |
19:55:46 | | DogsRNice quits [Remote host closed the connection] |
19:55:53 | | Island joins |
19:55:58 | | DogsRNice joins |
19:56:00 | | parfait_ joins |
20:04:31 | | HP_Archivist quits [Ping timeout: 272 seconds] |
20:05:09 | | _Dango360 quits [Ping timeout: 272 seconds] |
20:57:05 | | dumbgoy quits [Ping timeout: 272 seconds] |
20:59:41 | | dumbgoy joins |
22:02:28 | | dumbgoy_ joins |
22:05:30 | | dumbgoy quits [Ping timeout: 265 seconds] |
22:07:47 | | unsimply joins |
22:08:54 | | unsimply quits [Remote host closed the connection] |
22:30:12 | | dumbgoy_ quits [Read error: Connection reset by peer] |
23:08:57 | | BlueMaxima joins |
23:18:12 | | dumbgoy joins |
23:20:38 | | Pedrosso joins |
23:41:12 | | Barto quits [Ping timeout: 265 seconds] |
23:42:26 | <Pedrosso> | I'm new to hackint.org as well as to this chat. The wiki says this channel is supposed to be the right one to ask/inform about dying websites, is that accurate? |
23:42:37 | <pokechu22> | Pedrosso: Yes |
23:43:37 | | etnguyen03 quits [Ping timeout: 265 seconds] |
23:49:46 | <Pedrosso> | Spore is a game that's been out since September 4th, 2008, and support (by EA) has been declining. There's no official shutdown date afaik, however I am anxious considering the company that's hosting it (EA), and how the company has already almost broken the game itself with its own launcher. What I'm worried about saving is the Sporepedia (spore.com), a large and very old website with millions of users and creations (>10 million enumerated files with an average file size of approximately 20 kB). I've been using my own (bad) code to save only the creations, but it's inefficient and also leaves out all the forums, creators, comments, etc. |
23:50:24 | <Pedrosso> | I don't know how much this community cares for archiving such stuff, as it's a niche thing. Mind enlightening me? |
23:53:33 | <@arkiver> | can ArchiveBot handle that spore.com ? |
23:53:37 | <@arkiver> | would be nice to have a copy yes |
23:53:50 | <pokechu22> | EA doesn't like us unfortunately :| |
23:54:11 | <@arkiver> | does that matter? :P |
23:54:48 | <@JAA> | If they block or rate-limit us, it does. |
23:54:49 | <pokechu22> | Most of their websites timeout with archivebot, though that's mostly newer stuff (ea.com and like battlefront I think). Not sure if spore.com is also affected |
23:54:53 | <Pedrosso> | I have been downloading these files for a long while and have had 0 apparent problems with rate-limiting, etc |
23:55:08 | <pokechu22> | or, not even timeout, instead it acts more like a tarpit if I recall correctly |
23:55:10 | <Flashfire42> | Holy shit that site shared in the main channel is cancerous so many popups |
23:55:41 | <pokechu22> | (for reference the site shared there was https://hikarinoakari.com / https://imgur.com/jeJSEu6) |
23:55:49 | <@JAA> | I'm getting an expired cert on spore.com. Nice. |
23:56:09 | <Pedrosso> | the site is very much in a state of disrepair, hence why I'm concerned |
23:56:17 | <pokechu22> | I believe ScenarioPlanet sent me some spore-related stuff and that worked fine in the past, but it was a fairly small subset |
23:56:53 | <Pedrosso> | http://www.spore.com/sporepedia is what I'm referring to specifically |
23:57:22 | <@JAA> | The site does not work well without JS, so there's that. |
23:57:50 | <Flashfire42> | Spore just timed out for me on the main domain there. spore.com |
23:58:03 | <@JAA> | Yeah, took me a few tries as well to get there. |
23:58:04 | <Flashfire42> | sporepedia loads fine, maybe it needs the www |
23:58:30 | <@arkiver> | needs the www indeed |
23:58:37 | <@JAA> | We can certainly try to run it through ArchiveBot. |