00:05:56<schwarzkatz>any progress on uploadir yet? :D
00:47:52<anarcat>JAA: thanks
01:01:35katocala quits [Remote host closed the connection]
01:02:57<fishingforsoup_>Can all of this be archived?
01:03:02<fishingforsoup_>https://www.reddit.com/r/kpop/comments/zndxgy/traineeas_social_media_channels_including_youtube/
01:03:16<fishingforsoup_>I never followed the group, but it might be hard to find years down the line.
01:12:03katocala joins
01:13:49lennier1 quits [Client Quit]
01:14:12lennier1 (lennier1) joins
02:56:59sonick quits [Client Quit]
03:35:10Island quits [Read error: Connection reset by peer]
03:54:24<Ryz>So, out of pure curiosity (and boredom as I archive a bunch of Pastebin URLs via WBM/SPN), I checked out Fiverr and searched the word 'archive'; apparently there are people who offer services to restore websites from the Wayback Machine back into Wordpress Oo;
03:55:21<Ryz>Or just another website
04:08:33treora joins
04:09:52treora quits [Remote host closed the connection]
04:09:54treora joins
04:32:51<@OrIdow6>I did look into that a while ago and as far as I could tell that was used a lot (though not exclusively) by spammers who would buy expiring domains, restore the content from the WBM, throw some ads onto them, and put them online
04:45:47wyatt8750 quits [Ping timeout: 250 seconds]
05:08:42wyatt8740 joins
05:12:04wyatt8750 joins
05:13:14wyatt8740 quits [Ping timeout: 264 seconds]
05:26:27sonick (sonick) joins
05:30:47Hackerpcs (Hackerpcs) joins
05:50:28lukash799 joins
07:18:56programmerq quits [Remote host closed the connection]
07:28:56hitgrr8 joins
07:55:45<monika>do they even bother stripping out the timings+copyright info in the html? 😂
08:56:42michaelblob_ quits [Read error: Connection reset by peer]
09:19:32michaelblob (michaelblob) joins
10:38:12<mgrandi>Furaffinity forums going read only, closing eventually, uses xenforo: https://forums.furaffinity.net/threads/forum-closure-fa-discord-coming-soon.1682702/
11:23:22immibis (immibis) joins
11:33:59jacksonchen666 quits [Remote host closed the connection]
11:43:52jacksonchen666 (jacksonchen666) joins
13:31:11mut4ntm0nkey quits [Remote host closed the connection]
13:31:29mut4ntm0nkey (mutantmonkey) joins
13:31:39Arcorann_ quits [Ping timeout: 265 seconds]
14:29:13IDK quits [Client Quit]
14:30:00IDK (IDK) joins
14:45:17mut4ntm0nkey quits [Remote host closed the connection]
14:46:17mut4ntm0nkey (mutantmonkey) joins
15:04:44Wingy quits [Client Quit]
15:05:22Wingy (Wingy) joins
15:15:31<qwertyasdfuiopghjkl>Some relevant-seeming info in that thread: The forums will go read-only on 2023-01-01 and be fully deleted at some time "in the first quarter [of 2023]". ( https://forums.furaffinity.net/threads/forum-closure-fa-discord-coming-soon.1682702/page-5#post-7378226 ,
15:15:32<qwertyasdfuiopghjkl>https://forums.furaffinity.net/threads/forum-closure-fa-discord-coming-soon.1682702/page-9#post-7378589 )
15:15:32<qwertyasdfuiopghjkl>An administrator of the site is looking for a way to "crawl the entirety of the forum" to make a public archive ( https://forums.furaffinity.net/threads/forum-closure-fa-discord-coming-soon.1682702/page-10#post-7378609 , https://forums.furaffinity.net/threads/forum-closure-fa-discord-coming-soon.1682702/page-10#post-7378623 ), so contacting them
15:15:33<qwertyasdfuiopghjkl>might be an option.
15:17:01mut4ntm0nkey quits [Remote host closed the connection]
15:17:07mut4ntm0nkey (mutantmonkey) joins
15:35:38dhrrr joins
15:44:12<dhrrr>Are there any plans for Twitter? While an exhaustive archive would be infeasible, can we at least archive popular tweets (for example, those with at least a certain number of likes)? snscrape allows filtering by like count
15:44:50<ivan>Twitter users can be submitted in #archivebot
15:45:30<ivan>there's a lot of Twitter in WBM
15:46:42<dhrrr>Yes, but Twitter also has many "main characters of the day" and isolated popular tweets, from people who are not notable and who won't be sent to #archivebot.
15:48:29<ivan>if you have methods to make a good list of users, there's a way to submit a lot of users
15:50:36HP_Archivist (HP_Archivist) joins
15:50:37<ivan>you can, for example, scrape follows and analyze graphs of follow relationships
15:50:57Jonimus quits [Ping timeout: 250 seconds]
15:52:51<dhrrr>I'm aware of that. I'm just surprised that there isn't an ArchiveTeam project for dealing with popular tweets regardless of who posted them
16:15:42HP_Archivist quits [Client Quit]
16:20:44dhrrr quits [Remote host closed the connection]
16:38:12atphoenix_ is now known as atphoenix
16:51:59JensRex quits [Client Quit]
16:52:27JensRex (JensRex) joins
16:53:21Jonimus joins
17:19:42Island joins
17:29:44Island_ joins
17:31:55Island quits [Ping timeout: 250 seconds]
17:38:45lukash799 quits [Client Quit]
17:44:41lukash799 joins
18:06:46<@OrIdow6>monika: I suspect but am not sure that this is partially responsible for the proliferation of "welcome to the US petabox" (Google it in quotes) pages that A oede noticed a while ago
18:10:25<@OrIdow6>For a more modern example, here's a site that seems to be doing this https://germanyweek.org/ - notice the links about online casinos at the bottom - and here https://germanyweek.org/program is a page where they've messed up on the crawling and copied a WBM error page
18:10:31jacksonchen666 quits [Ping timeout: 245 seconds]
18:10:39mut4ntm0nkey quits [Remote host closed the connection]
18:11:41<@OrIdow6>And I am being deliberate with putting this in bs instead of ot, I do think this is relevant to resource allocation on smaller sites and thought about doing some kind of writeup about it eventually
18:12:18Megame (Megame) joins
18:13:02mut4ntm0nkey (mutantmonkey) joins
18:16:39<@OrIdow6>Sounds like a good idea qwertyasdfuiopghjkl
18:23:03<@OrIdow6>Anyone want to do it?
18:23:33<@OrIdow6>This group seems to be full of furries so I don't think we should have difficulty establishing rapport at that end
18:34:57Island joins
18:36:02Island_ quits [Ping timeout: 264 seconds]
18:47:07mut4ntm0nkey quits [Remote host closed the connection]
18:47:37mut4ntm0nkey (mutantmonkey) joins
18:49:51spirit joins
19:06:23Island quits [Ping timeout: 250 seconds]
19:13:37Island joins
19:17:43Megame quits [Client Quit]
19:20:26Island quits [Ping timeout: 264 seconds]
19:35:36<mgrandi>I'll pm the mod, see if I can start the conversation
19:36:37<@OrIdow6>Ok
19:39:56Island joins
19:41:12<mgrandi>I also have experimented in archiving the main FA content, they don't have any rate limiting or anything that I can see, it is behind cloudflare though
19:41:59Island_ joins
19:44:47Island quits [Ping timeout: 265 seconds]
19:51:47<@OrIdow6>We ran a project for that half a decade ago or so I believe
19:52:55Island joins
19:53:55<@JAA>Oh wow, that project used wpull, interesting.
19:54:27Island_ quits [Ping timeout: 265 seconds]
20:01:35Island_ joins
20:02:30<@OrIdow6>Sounds like Python version misery to me
20:04:50Island quits [Ping timeout: 264 seconds]
20:07:02hitgrr8 quits [Client Quit]
20:07:24lukash799 quits [Client Quit]
20:13:25lukash799 joins
20:36:13<mgrandi>Yeah, I used wpull for that as well, for my own infrastructure set up, it's pretty easy, once you have the cloudflare key, I need to set up some ignore rules and then rerun it to get the media
20:50:04godane (godane) joins
21:09:12<@OrIdow6>Depending on how the forum thing goes, it may raise the possibility of doing that without risking antagonizing them
21:39:45<schwarzkatz>always cool to see site owners wanting to help archive their sites. I have a good feeling about this FA project.
21:41:30jacksonchen666 (jacksonchen666) joins
22:50:31sec^nd quits [Ping timeout: 245 seconds]
22:55:10sec^nd (second) joins
23:09:29<@arkiver>qwertyasdfuiopghjkl: please add it to deathwatch
23:28:05wyatt8750 quits [Ping timeout: 265 seconds]
23:28:32wyatt8740 joins
23:30:26jacksonchen666 quits [Remote host closed the connection]
23:50:36jacksonchen666 (jacksonchen666) joins