00:33:02 | | sonick (sonick) joins |
00:37:38 | | qwertyasdfuiopghjkl quits [Remote host closed the connection] |
00:49:44 | <eightthree> | (sorry for repost my element client glitches and only offers repost or delete) |
01:08:54 | <nulldata> | Can someone please throw https://www.a3artistsagency.com/ into AB? Agency shutting down. https://www.hollywoodreporter.com/business/business-news/a3-artists-agency-shuts-down-1235821430/ |
01:15:18 | | pabs quits [Client Quit] |
01:16:00 | <fireonlive> | nulldata: done :) |
01:18:33 | | pabs (pabs) joins |
01:26:48 | | Wohlstand quits [Client Quit] |
01:49:44 | | kiryu quits [Remote host closed the connection] |
01:55:10 | | kiryu joins |
01:55:10 | | kiryu is now authenticated as kiryu |
01:55:10 | | kiryu quits [Changing host] |
01:55:10 | | kiryu (kiryu) joins |
02:07:34 | <nulldata> | Thanks! |
02:34:16 | <pabs> | pokechu22: a jira https://ftp.esteban.polymtl.ca/jira |
02:34:52 | <pokechu22> | No public issues it seems |
02:44:07 | | aninternettroll quits [Ping timeout: 272 seconds] |
02:47:35 | | aninternettroll (aninternettroll) joins |
02:52:37 | <fireonlive> | on .af: https://abyssdomain.expert/@filippo/111923622943979038 |
04:11:23 | | line quits [Remote host closed the connection] |
04:12:58 | <nicolas17> | cool mastodon domain that |
04:30:06 | <fireonlive> | yee |
04:48:38 | | line joins |
05:41:01 | | Island quits [Read error: Connection reset by peer] |
06:05:50 | | BlueMaxima quits [Read error: Connection reset by peer] |
06:11:37 | | MetaNova quits [Remote host closed the connection] |
06:13:09 | | MetaNova (MetaNova) joins |
06:28:42 | | Barto quits [Quit: WeeChat 4.2.1] |
06:30:52 | | Barto (Barto) joins |
06:45:18 | <h2ibot> | JustAnotherArchivist edited Deathwatch (+207, /* 2024 */ Add Commaful): https://wiki.archiveteam.org/?diff=51725&oldid=51715 |
07:04:19 | <@JAA> | Commaful is JS hell. There seem to be decent sitemaps, and AB should be getting the main content but not the comments I think. |
07:07:51 | <@JAA> | The announcement is also JS hell and doesn't even work in the WBM. |
07:27:48 | | Arcorann (Arcorann) joins |
07:34:14 | <mgrandi> | Is there a channel for steam stuff? Or maybe we can create one, I've been working on a steam workshop downloader and have a few different versions working (since steam workshop is kinda awful and you need different strategies per game) |
07:55:32 | <h2ibot> | JacksonChen666 edited Deathwatch (+128, queer.af with definitive shutdown date): https://wiki.archiveteam.org/?diff=51726&oldid=51725 |
08:07:19 | | qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins |
08:30:38 | <Vokun> | mgrandi: Technically #outofsteam I think, but I've never seen it used. All the links on the forum point to archiveteam-bs though |
09:40:54 | <magmaus3> | does anyone have any expired discord cdn links? |
10:00:02 | | Bleo18260 quits [Client Quit] |
10:01:19 | | Bleo18260 joins |
10:47:47 | | JustThatNerdyNerd joins |
10:48:40 | <JustThatNerdyNerd> | I have a site that detected a web scraper when archive.org tried to archive it, so I created a WARC with a Chrome extension, wondering if I could provide it here to be pushed to IA |
10:57:02 | | JustThatNerdyNerd quits [Remote host closed the connection] |
11:04:51 | <thuban> | ...and gone |
11:14:56 | <mgrandi> | I don't but any copied discord links should expire eventually right? |
11:18:08 | <@OrIdow6> | I think we should consider trying to publicize the Discord thing more, I haven't heard that discussed outside of AT channels |
11:18:56 | <@OrIdow6> | (Says the person who hasn |
11:19:05 | <@OrIdow6> | 't bothered going through their own browsing history for them) |
11:19:11 | <FireFly> | discord thing? |
11:21:39 | <@OrIdow6> | FireFly: see https://wiki.archiveteam.org/index.php?title=Deathwatch#cite_note-36 , Discord doesn't like that people were using them to host files |
11:21:51 | <@OrIdow6> | And is going to break a bunch of file links |
11:22:45 | <FireFly> | ahh, I see |
11:30:33 | | eroc1990 quits [Read error: Connection reset by peer] |
11:32:24 | | eroc1990 (eroc1990) joins |
11:34:23 | <@OrIdow6> | !tell JustThatNerdyNerd you can upload it to the Internet Archive yourself, but as it comes from a third party it will not be put into the Wayback Machine (also - Chrome extensions, last I checked, do not have access to the raw TCP/TLS stream that requests happen over, i.e. they must fake some information - exactly why this is in place) |
11:34:24 | <eggdrop> | [tell] ok, I'll tell JustThatNerdyNerd when they join next |
11:39:25 | <h2ibot> | OrIdow6 edited Frequently Asked Questions (+96, /* halp pls halp */ This was a bit obtuse): https://wiki.archiveteam.org/?diff=51727&oldid=51149 |
11:48:24 | | sdomi (sdomi) joins |
11:51:27 | <h2ibot> | OrIdow6 edited Frequently Asked Questions (+75, A bit of general editing to this page): https://wiki.archiveteam.org/?diff=51728&oldid=51727 |
11:52:28 | <h2ibot> | OrIdow6 edited Frequently Asked Questions (+0, Syntax): https://wiki.archiveteam.org/?diff=51729&oldid=51728 |
12:35:07 | | decky_e_ quits [Read error: Connection reset by peer] |
12:35:39 | | Arcorann quits [Ping timeout: 272 seconds] |
12:42:56 | | icedice (icedice) joins |
12:45:39 | | monoxane (monoxane) joins |
12:45:55 | | monoxane quits [Remote host closed the connection] |
13:17:54 | | Wohlstand (Wohlstand) joins |
13:24:16 | | lea quits [Quit: quit.] |
13:28:03 | | PredatorIWD quits [Quit: Leaving] |
13:28:34 | | sknebel quits [Remote host closed the connection] |
13:28:58 | | ThetaDev quits [Read error: Connection reset by peer] |
13:29:29 | | @AlsoJAA quits [Ping timeout: 272 seconds] |
13:29:37 | | ThetaDev joins |
13:30:07 | | lflare quits [Ping timeout: 272 seconds] |
13:30:23 | | lflare (lflare) joins |
13:30:28 | | sknebel (sknebel) joins |
13:31:51 | | lea (lea_) joins |
13:35:50 | | AlsoJAA (JAA) joins |
13:35:50 | | @ChanServ sets mode: +o AlsoJAA |
13:44:56 | | lea quits [Client Quit] |
13:47:03 | | lea (lea_) joins |
13:48:20 | | sknebel quits [Client Quit] |
13:49:20 | | @AlsoJAA quits [Ping timeout: 240 seconds] |
13:50:45 | | sknebel (sknebel) joins |
13:52:50 | | ThetaDev quits [Ping timeout: 240 seconds] |
13:55:01 | | ThetaDev joins |
13:59:20 | | ThetaDev quits [Ping timeout: 240 seconds] |
14:04:21 | | lea quits [Client Quit] |
14:05:21 | | ThetaDev joins |
14:05:21 | | AlsoJAA (JAA) joins |
14:05:21 | | @ChanServ sets mode: +o AlsoJAA |
14:05:26 | | lea (lea_) joins |
14:12:20 | | lea quits [Client Quit] |
14:13:15 | | lea (lea_) joins |
14:13:57 | | etnguyen03 (etnguyen03) joins |
14:40:54 | | lea quits [Client Quit] |
14:41:08 | | lea (lea_) joins |
15:21:13 | | icedice quits [Client Quit] |
15:22:51 | | etnguyen03 quits [Ping timeout: 272 seconds] |
15:29:55 | | etnguyen03 (etnguyen03) joins |
15:37:50 | | riku quits [Ping timeout: 240 seconds] |
15:43:45 | | etnguyen03 quits [Ping timeout: 272 seconds] |
15:47:16 | | manalog joins |
16:04:17 | | aaaaa joins |
16:04:47 | | aaaaa quits [Remote host closed the connection] |
16:21:07 | <@arkiver> | aninternettroll: this is not just about starting a project for any random nice site we might want |
16:21:15 | <@arkiver> | it's usually aimed at stuff that is going to be deleted or in serious danger of being deleted |
16:21:51 | <@arkiver> | is speedrun.com shutting down? if not, likely not a project |
16:22:02 | <aninternettroll> | fair enough |
16:22:19 | <@arkiver> | eightthree: nsfw is always included if it's being deleted |
16:22:56 | <aninternettroll> | but a mastodon would be more relevant, since instances go down all the time? |
16:23:23 | <@arkiver> | perhaps, yeah |
16:23:41 | <@arkiver> | on nsfw stuff - for as long as i've been at Archive Team (10+ years) we have always treated it as regular content |
16:24:12 | <@arkiver> | aninternettroll: we'll have a new try at Reddit again soon, so that might be a nice one to run when it starts again |
16:25:43 | <joepie91|m> | mass-archiving mastodon as a whole is not welcome; users generally deliberately let their posts get deleted over time, and they are meant to be transient. this is not a case of "negligent company yeets stuff that people expected to stay online" |
16:26:36 | <joepie91|m> | (there are some exceptions, specific instances, where it can make sense to archive things) |
16:29:25 | <h2ibot> | Switchnode edited Telegram (+1, /* Web data only */ queue bot does not actually…): https://wiki.archiveteam.org/?diff=51730&oldid=51476 |
16:35:26 | <h2ibot> | JustAnotherArchivist edited Frequently Asked Questions (+1, Fix unit): https://wiki.archiveteam.org/?diff=51731&oldid=51729 |
16:36:10 | <fireonlive> | the 'i's strike again |
16:40:27 | <h2ibot> | JustAnotherArchivist edited Frequently Asked Questions (+21, Fix inaccuracy about IA whitelisting): https://wiki.archiveteam.org/?diff=51732&oldid=51731 |
16:43:12 | <@JAA> | ^ Cc arkiver, OrIdow6 |
17:05:08 | | pedantic-darwin quits [Client Quit] |
17:06:57 | | manalog quits [Ping timeout: 265 seconds] |
18:41:17 | | etnguyen03 (etnguyen03) joins |
18:53:17 | <nicolas17> | I asked this a million times but I keep forgetting what the unsatisfactory answer is: I have two URLs for the same giant file, is there any way I can archive them such that 1. the content is deduplicated, 2. they work in WBM? |
18:59:52 | <@JAA> | Yes, wget-at with the appropriate options or qwarc will write a revisit record for the second and later retrieval. |
19:03:35 | <thuban> | what about wpull/ab? |
19:03:52 | <@JAA> | AB no |
19:03:57 | <nicolas17> | afaik archivebot doesn't do deduplication |
19:04:11 | <@JAA> | wpull has some support for it, I think, but I don't recall how (well) it works. |
19:04:19 | <nicolas17> | I could do wget-at and upload the warc to IA myself, but then I need to arrange for WBM-approval |
19:05:09 | | etnguyen03 quits [Ping timeout: 272 seconds] |
19:06:08 | | etnguyen03 (etnguyen03) joins |
19:08:34 | | icedice (icedice) joins |
19:09:58 | <nicolas17> | for other files that *don't* have duplicates, I was planning to automate sending the URL to #archivebot :p |
19:33:14 | | Island joins |
19:35:58 | | Island quits [Read error: Connection reset by peer] |
19:37:48 | | Island joins |
19:41:20 | | eightthree quits [Ping timeout: 240 seconds] |
19:43:19 | | eightthree joins |
19:54:43 | <pokechu22> | How big is "giant"? It might be easier to just let it be duplicated |
19:55:49 | <nicolas17> | don't let facts get in the way of my anti-wastefulness obsession |
19:57:17 | <fireonlive> | inb4 20MiB |
19:57:19 | <fireonlive> | :p |
19:57:47 | <@JAA> | Yeah, numbers for average daily size + duplicated amount please. :-) |
20:10:20 | | etnguyen03 quits [Ping timeout: 240 seconds] |
20:38:07 | | SootBector quits [Remote host closed the connection] |
20:38:35 | | SootBector (SootBector) joins |
20:40:47 | | eightthree quits [Ping timeout: 272 seconds] |
20:42:45 | | eightthree joins |
20:53:45 | | that_lurker quits [Quit: I am most likely running a system update] |
20:56:02 | | that_lurker (that_lurker) joins |
21:07:23 | | JustThatNerdyNerd joins |
21:07:24 | <eggdrop> | [tell] JustThatNerdyNerd: [2024-02-14T11:34:23Z] <OrIdow6> you can upload it to the Internet Archive yourself, but as it comes from a third party it will not be put into the Wayback Machine (also - Chrome extensions, last I checked, do not have access to the raw TCP/TLS stream that requests happen over, i.e. they must fake some information - exactly why this is in place) |
21:07:48 | | JustThatNerdyNerd quits [Remote host closed the connection] |
21:09:58 | <fireonlive> | hi/bye |
21:27:31 | | Aes joins |
21:32:01 | | BlueMaxima joins |
21:32:13 | <Aes> | Hi, my name is Aesara https://github.com/aesarab. I help develop MyPaint, a painting software that started development in 2004. Over the past month I've been working on a new website, which exists in a new and separate git repo from the old one. The old website has been up since 2015, that is, 9 years. |
21:32:59 | <Aes> | My question is, is it worth the time to archive that website, and if so, how? |
21:33:36 | <nicolas17> | if it's just a "static website" we can get it archived quite easily |
21:33:58 | <Aes> | It's a static website made with hugo |
21:34:04 | <Aes> | sorry, jekyll |
21:35:31 | <Aes> | The git repo for the old website is public, but will be archived. I'm not sure what will happen if e.g. we move from GitHub to something else |
21:36:33 | <Aes> | Oh yeah, additionally we use the GitHub wiki for a lot of documentation, including user manuals. We're moving this to the new website also. |
21:38:41 | <nicolas17> | post website link :) JAA: can you throw it into archivebot? |
21:38:54 | <Aes> | https://mypaint.app |
21:39:05 | <Aes> | repo @ https://github.com/mypaint/website |
21:39:38 | <joepie91|m> | +1 for actively ensuring that it gets archived :) |
21:41:20 | <Aes> | thanks, the repo for all assets in the wiki is at git@github.com:mypaint/mypaint.wiki.git |
21:41:47 | <Aes> | *all assets and md files |
21:42:25 | <Aes> | wait, I realise that github wikis don't have a way of uploading files, so if a user doesn't clone the repo we had this really dumb way of attaching images. Let me find it |
21:43:28 | <Aes> | https://github.com/mypaint/mypaint/wiki/Writing-Documentation#pages |
21:48:36 | <Aes> | grep -r '!\[.*\](http\(s\)\?://' . |
21:48:57 | <Aes> | that command should find all instances of externally linked resources |
21:57:23 | <@JAA> | Aes: https://mypaint.app/ is running through ArchiveBot now. Thanks for reaching out! |
21:58:05 | <Aes> | Okay, so I was worried about backlinks for the wiki (I'm sure we have a couple orphaned pages), so I used the list of md files to create a list of links comprising the github wiki |
21:58:06 | <Aes> | https://pastebin.com/exjaarRn |
21:59:48 | <@JAA> | Also running through AB now. |
22:00:02 | <@JAA> | I'll also queue all your repos to our GitHub project because why not. :-) |
22:01:08 | <Aes> | Thanks, I really appreciate all the work you guys are doing. The MyPaint project has had the rug pulled out from it several times with our hosting providers. Between GNA, our old website, and our old wiki. Archived resources for these have been seriously helpful in building back up |
22:04:29 | <Aes> | wait, old as in the websites before the ones we're replacing |
22:04:45 | <Aes> | Anyway, thanks again. Catch u around |
22:04:52 | | Aes quits [Client Quit] |
22:05:44 | <eightthree> | im trying to get a feel for how things are included or excluded from the archive created vs the original i.e. mapping out the differences (for my personal understanding) |
22:05:46 | <eightthree> | censorship is also very subjective |
22:13:58 | | riku (riku) joins |
22:25:33 | | Darken (Darken) joins |
22:27:41 | <nicolas17> | pokechu22 JAA: 3x12GB (x2 duplicated URLs) at most once a week? |
22:29:14 | <@JAA> | Cute |
22:29:56 | <@JAA> | I mean, deduping wouldn't be bad, but duplicating it wouldn't be that terrible either. |
22:32:09 | <nicolas17> | IIRC someone uploaded a few of them as archive.org items (I think using wget so they may have the known warc-format issues of wget?) and they had 4x redundancy, because of the two URLs × {http,https} |
22:32:11 | <nicolas17> | /o\ |
22:32:51 | <@JAA> | :-/ |
22:33:10 | <@JAA> | Grabbing over both HTTP and HTTPS seems completely unnecessary. |
22:33:17 | <fireonlive> | https only! |
22:33:24 | <fireonlive> | 🔒 |
23:01:50 | | etnguyen03 (etnguyen03) joins |
23:38:45 | | etnguyen03 quits [Ping timeout: 272 seconds] |
23:58:23 | | decky_e joins |