00:00:48sepro1 (sepro) joins
00:00:51sepro quits [Read error: Connection reset by peer]
00:00:51sepro1 is now known as sepro
00:15:02sepro quits [Read error: Connection reset by peer]
00:15:19sepro (sepro) joins
00:17:06HackMii_ quits [Remote host closed the connection]
00:18:13HackMii_ (hacktheplanet) joins
00:56:10Arcorann (Arcorann) joins
01:28:35Hackerpcs quits [Quit: Hackerpcs]
01:30:33Hackerpcs (Hackerpcs) joins
01:56:37Stilett0 joins
01:59:16Stiletto quits [Ping timeout: 240 seconds]
02:30:57HackMii_ quits [Remote host closed the connection]
02:31:23HackMii_ (hacktheplanet) joins
03:34:09tbc1887 (tbc1887) joins
03:35:36tbc1887 quits [Client Quit]
06:08:16march_happy quits [Ping timeout: 240 seconds]
06:08:32march_happy (march_happy) joins
06:12:46march_happy quits [Ping timeout: 240 seconds]
06:13:43march_happy (march_happy) joins
06:49:16Arcorann quits [Ping timeout: 240 seconds]
07:13:07<pabs>an acquisition https://ftw.usatoday.com/2022/08/amazon-buy-electronic-arts
07:22:46march_happy quits [Ping timeout: 240 seconds]
07:36:00systwi (systwi) joins
07:43:11Stilett0 quits [Read error: Connection reset by peer]
07:46:35march_happy (march_happy) joins
07:59:28lennier1 quits [Quit: Going offline, see ya! (www.adiirc.com)]
08:00:02lennier1 (lennier1) joins
08:27:04Arcorann (Arcorann) joins
08:51:22BlueMaxima quits [Client Quit]
09:00:46march_happy quits [Ping timeout: 240 seconds]
09:01:48march_happy (march_happy) joins
09:06:40march_happy quits [Ping timeout: 265 seconds]
09:11:00march_happy (march_happy) joins
09:53:30le0n quits [Client Quit]
09:53:38HackMii_ quits [Remote host closed the connection]
09:54:14le0n (le0n) joins
09:54:27HackMii_ (hacktheplanet) joins
09:55:24le0n quits [Client Quit]
10:01:43le0n (le0n) joins
10:40:16march_happy quits [Ping timeout: 240 seconds]
10:43:46Arcorann quits [Ping timeout: 240 seconds]
11:00:44<h2ibot>JAABot edited CurrentWarriorProject (+4): https://wiki.archiveteam.org/?diff=48833&oldid=48832
11:04:21<joepie91|m>pabs: I guess I tempted fate when I said "surely EA can't get any worse" a while ago...
11:08:39<pabs>:)
11:37:49qwertyasdfuiopghjkl joins
11:38:24march_happy (march_happy) joins
11:57:34HackMii_ quits [Write error: Broken pipe]
11:58:11HackMii_ (hacktheplanet) joins
11:58:19applepiesavannah|m leaves
12:35:29march_happy quits [Read error: Connection reset by peer]
12:35:50march_happy (march_happy) joins
13:14:39Hackerpcs quits [Client Quit]
13:15:17Hackerpcs (Hackerpcs) joins
13:51:24VerifiedJ quits [Quit: The Lounge - https://thelounge.chat]
13:52:11VerifiedJ (VerifiedJ) joins
14:00:23eroc19909 (eroc1990) joins
14:01:16eroc1990 quits [Ping timeout: 240 seconds]
14:01:25michaelblob_ (michaelblob) joins
14:04:46michaelblob quits [Ping timeout: 240 seconds]
14:04:52michaelblob (michaelblob) joins
14:05:06<Frogging101>pabs: The article says it's not happening
14:05:11<Frogging101>You scared me
14:08:16michaelblob_ quits [Ping timeout: 265 seconds]
14:35:50thetechrobo_ (TheTechRobo) joins
14:39:16TheTechRobo quits [Ping timeout: 240 seconds]
14:45:56tech_exorcist (tech_exorcist) joins
15:46:23thetechrobo_ is now known as TheTechRobo
15:54:17eroc19909 is now known as eroc1990
16:44:52march_happy quits [Ping timeout: 265 seconds]
17:00:49<Ryz>Welp, that's saddening, http://www67.tcup.com/ was shut down on 2022 August 01, and it doesn't seem like any of us heard about it until I found it at random... s:
17:01:14<Ryz>Reminder on being proactive on finding websites and services shutting down, especially those in a non-English environment
17:10:34HackMii_ quits [Remote host closed the connection]
17:11:19HackMii_ (hacktheplanet) joins
17:20:16HackMii_ quits [Ping timeout: 240 seconds]
17:23:53HackMii_ (hacktheplanet) joins
17:30:02Hackerpcs quits [Client Quit]
17:32:41Hackerpcs (Hackerpcs) joins
17:43:14Hackerpcs quits [Client Quit]
17:43:45Hackerpcs (Hackerpcs) joins
17:50:44<Maakuth|m>arkiver: I sent you a private message
17:50:44<madpro|m>I've considered the idea of a twitter bot that retweets mentions of the word "closing" or "shut down"
17:50:45<madpro|m>Something like that, also filtering for non-English terms, might be the best option in terms of a long-term solution
17:50:45<madpro|m>We couldn't call it `Archivebot` though, so I'm all ears for any creative names
18:01:32<Maakuth|m>Reverse Grim Reaper
18:01:44<madpro|m>The Grim Planter?
18:02:00<@JAA>I was doing that manually for a while, and there is a *lot* of noise, duplication, etc.
18:02:00<madpro|m>* The Grim Sower
18:02:28<@OrIdow6>Yeah like 98% of it is noise
18:02:35<@OrIdow6>Judging from my experiments with Reddit
18:02:46<@OrIdow6>Plus not all of us use Twitter
18:03:35<madpro|m>JAA: You're not alone, I know at least two other people who query Twitter (or Google) randomly to catch unheard shutdowns
18:03:46<madpro|m>Point is, none of them cover anything outside of English
18:04:06<madpro|m>A bot could, at the very least, extend the scope of manual queries
18:04:18<@JAA>I did it in a couple European languages as well at the time, but yeah.
18:05:13<madpro|m>OrIdow6: That is true, but also consider that if something is shutting down, Twitter and TikTok are the best contenders for people to complain
18:05:25<@OrIdow6>madpro|m: What is true?
18:05:40<madpro|m>Sorry?
18:05:49<madpro|m>Oh wait, replies must be broken
18:05:54<madpro|m>> Plus not all of us use Twitter
18:06:23<madpro|m>* I meant for someone (one person) trying to get the word out there Twitter is not very efficient.
18:06:48<madpro|m>But, hypothetically, if a crowd of people complain together, they will likely be doing it on Twitter.
18:07:02<madpro|m>Hence why it's good to query Twitter, even if you (like me) prefer to boycott the site otherwise
18:09:16<@OrIdow6>That information comes from Twitter doesn't mean it has to go back on Twitter
18:09:22<@OrIdow6>Unless I'm misreading this
18:10:25<madpro|m>But it helps, network effects be wicked (つ▀¯▀)つ
18:12:00<@JAA>I've previously thought (and talked here) about a potential issue tracker-ish system where issues are automatically created from web, Twitter, Reddit, etc. searches. That would probably be more productive in terms of actually getting stuff archived. But the SNR is tiny, and designing such a system to automatically merge 'similar enough' hits would be tricky.
18:12:32<@OrIdow6>Almost no vaguely interested member of the public is going to subscribe to a bot that reposts a huge amount of irrelevant garbage, occasional mentions of big sites with well-known shutdowns, and a tiny portion of unique finds
18:13:40<madpro|m>I have my ideas :} you have yours.
18:13:59<madpro|m>If we have a spammy tool today, maybe someone will figure out how to filter it down the line. But if we never try, there won't be anything to build on
18:14:17<@JAA>Also, we're in the business of archiving stuff, not telling people that stuff shuts down. :-)
18:14:34<@JAA>I mean, feel free to build the bot and see if it's of any use. But probably not in the name of AT.
18:14:52<madpro|m>Can't hit what you can't see though, as Ryz started this whole tangent
18:15:20<@OrIdow6>JAA: On that I have thought that you could have a bot that closes issues containing some addable-to set of strings or regexes or whatever
18:17:35<@JAA>Well, for that particular case, it would've been of no use, as there were no mentions of the tcup.com shutdown on Twitter as far as I can see. :-P
18:19:23<madpro|m>Did you search in Japanese tho https://twitter.com/cogito1961/status/1562405584271458305?s=20&t=hnUnktbSouIUQVJCj5yzvA
18:19:54<madpro|m>Also, I believe (Nosamu?) may have pinged me about it some time ago as well
18:20:39<madpro|m>Then again, I have barely been IRC or the AT Wiki throughout this summer. I just figured someone would have heard by now
18:20:47<@JAA>I tried a few terms, but I don't actually know Japanese, so that's kind of difficult. I also only searched before the deadline.
18:20:49<madpro|m>*on IRC
18:21:28<@JAA>(Also, Twitter's search sucks.)
18:22:18<@JAA>Ah yeah, one of the terms I tried yielded no results earlier but now does. Standard Twitter.
18:22:29<Ryz>I do tend to wander and roam a lot of places on the internet out of insane curiosity and for looting opportunities with archival potential; the one I found that was shut down, I found randomly via an NSFW website <#>;
18:22:38<madpro|m>Fair enough :}
18:23:07<@JAA>Here's one random tweet about it from 10 days prior: https://twitter.com/uzurainfo_test/status/1549634285681799169
18:25:43<madpro|m>* Correction: It was the `cakes` shutdown Nosamu pinged me about https://twitter.com/cakes_PR/status/152934177425444864
18:29:16<Ryz>I tend to archivey a lot ever since my appearance in ArchiveTeam, based on my various feelings and experiences seeing the stuff I like suddenly vanish or get taken down; it got to the point of extending my thoughts, or brainstormy, to what stuff is likely to be changed or taken down; for instance, bands being disbanded after many many years being together
18:29:16<Ryz>as a music group? Seriously never thought of that, and that's one of the main reasons for voicing MoeLarryShemp in the first place
18:29:56<Ryz>Not only had I never thought of it, during my time, I don't think anyone else, or even I, has tackled something like that at all
18:29:58<madpro|m>🤔
18:36:42<Ryz>For searching in different languages, I just use a machine translator (if Google Translate is iffy to directly use, https://translate.projectsegfau.lt/ will do the job without any nastiness and all the goodness); and if possible, try to learn if there's slang or anything that's related to shutdowns or changes
18:40:03<@OrIdow6>So what would it take to set up the issue tracker thing?
18:41:30<madpro|m>> On that I have thought that you could have a bot that closes issues containing some addable-to set of strings or regexes or whatever
18:41:49<madpro|m>on https://github.com/ArchiveTeam/ArchiveBot ?
18:42:34<@OrIdow6>No, on what J A A had said
18:42:48<@OrIdow6>The issue tracker
18:44:24<@JAA>Well, an issue tracker with an API. We have the Gitea instance, but I don't know how well that scales when you get to a very large number of issues. Otherwise, we'd need something else. Bugzilla is one that comes to mind which can certainly handle at least a couple million without issues (heh).
18:45:13<@JAA>Other than that, software that aggregates the data sources and tries to match the hits together to reduce load on the tracker and the meatbags reviewing the issues, I guess.
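The "match the hits together" step described above could start from something very naive. The following is a minimal sketch, not anything the channel actually built: it greedily groups hit titles whose word sets overlap, using Jaccard similarity. The function names and the 0.5 threshold are assumptions chosen for illustration, and as noted in the discussion, real-world merging of "similar enough" hits would be considerably trickier.

```python
import re


def tokens(text):
    """Lowercased alphanumeric word set of a hit title."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two sets (1.0 when both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def group_hits(hits, threshold=0.5):
    """Greedily attach each hit to the first group whose first member
    is similar enough; otherwise start a new group.

    The threshold is a hypothetical starting value, not a tuned one.
    """
    groups = []  # list of (token set of the group's first hit, member list)
    for hit in hits:
        hit_tokens = tokens(hit)
        for representative, members in groups:
            if jaccard(hit_tokens, representative) >= threshold:
                members.append(hit)
                break
        else:
            groups.append((hit_tokens, [hit]))
    return [members for _, members in groups]


hits = [
    "Amazon is shutting down Amazon Care",
    "Amazon Care is shutting down",
    "Tavern on Rush closing",
]
grouped = group_hits(hits)
```

With these three sample titles, the two Amazon Care hits land in one group and the restaurant hit in another, so a reviewer would see two candidate issues instead of three.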
18:52:16tzt quits [Ping timeout: 240 seconds]
18:54:10tzt (tzt) joins
18:58:22Iki joins
19:05:34tech_exorcist quits [Remote host closed the connection]
19:05:56tech_exorcist (tech_exorcist) joins
19:22:02<@OrIdow6>In the interest of not requiring too much work/people it would be best to to a first attempt with the Gitea instance, I think
19:23:23<@OrIdow6>*do
19:27:08<thuban>i would settle for improving our accessibility to manual reporting a bit. there are one or two communities i've considered trying to put out a general psa in (very international and sympathetic to internet preservation), but can i direct them to the archiveteam@archiveteam.org email or the @archiveteam twitter? does anyone check those?
19:27:17<thuban>of course asking people to come to irc is convenient for _us_ (we're all right here, quick turnaround on any followup questions, open and highly accessible protocol with web clients), but i worry that people unfamiliar with it will pass even given a direct webchat link
19:28:00<@OrIdow6>Shoot, skip the deduplicator and ideally I could have my little Python Pushshift scanner posting to Gitea within a few hours
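For reference, posting a scanner hit to Gitea amounts to one authenticated POST against its issues endpoint. This is a hedged sketch of how such a request could be built, not OrIdow6's actual scanner: the host, owner, repository, and token below are placeholders, and the request is only constructed, never sent.

```python
import json
import urllib.request


def build_issue_request(base_url, owner, repo, token, title, body):
    """Build (but do not send) a request that would create a Gitea issue.

    All argument values used below are placeholders for illustration.
    """
    url = f"{base_url.rstrip('/')}/api/v1/repos/{owner}/{repo}/issues"
    payload = json.dumps({"title": title, "body": body}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"token {token}",
        },
    )


req = build_issue_request(
    "https://gitea.example/",   # hypothetical instance
    "someowner",                # hypothetical owner
    "shutdown-hits",            # hypothetical repository
    "PLACEHOLDER_TOKEN",
    "After 16 years the Singles Jukebox is shutting down",
    "Found by a Reddit search for 'shutting down'.",
)
# Sending would be: urllib.request.urlopen(req)
```

Keeping request construction separate from sending makes the scanner easy to dry-run before pointing it at a real tracker.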
19:28:25<@OrIdow6>Yeah that could be improved as well
19:28:37<@OrIdow6>Could have a web form as well
19:28:37<@JAA>Emails usually work unless it's urgent. So if the deadline is on the order of weeks, it might be fine. Although we'll hear about it later than we could because Jason only checks them periodically.
19:28:50<@JAA>Re Twitter, soon™
19:29:30<@JAA>Reddit /r/Archiveteam also works.
19:31:42<thuban>mm, forgot about that one. (a lot of general archiving talk, though. how about a flair for shutdown alert posts?)
19:32:18<@OrIdow6>I and I suspect several other people read basically everything on there fairly frequently
19:33:21<thuban>as to email... i feel like a turnaround time of "weeks" is something that could be improved very easily
19:33:42<@OrIdow6>Which is fine with the subReddit since it's only a few posts a day at most
19:34:16<@JAA>Yeah, I have it in my email client as an RSS feed.
19:36:33<@JAA>I agree on the emails. I'll ask Jason about it.
19:36:58<thuban>cool, thanks!
20:02:33Iki quits [Ping timeout: 265 seconds]
21:08:39<@OrIdow6>Comments on something like this as the result of a Reddit comments checker? https://try.gitea.io/adbfb0fa90548ae6fda9/sdfvdsf/issues
21:10:32<@OrIdow6>"something like this" = "basically this"
21:20:40<@JAA>I'd distinguish between posts and comments (and am not sure it's worth creating an issue for every comment). Cutting on word boundaries and adding an ellipsis if something was removed would be nice-to-have and make the titles a bit more readable.
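The title suggestion above (cut on word boundaries, add an ellipsis when something was removed) can be sketched in a few lines. This is an illustrative version only; the function name and the default 80-character limit are assumptions, not what the checker actually uses.

```python
def truncate_title(text, limit=80, ellipsis="…"):
    """Shorten text to at most `limit` characters, cutting at a word
    boundary where possible and appending an ellipsis if anything was cut.
    """
    text = " ".join(text.split())  # collapse runs of whitespace
    if len(text) <= limit:
        return text
    budget = limit - len(ellipsis)
    cut = text[:budget]
    # If the cut landed in the middle of a word, back up to the last space.
    if text[budget] != " " and " " in cut:
        cut = cut[:cut.rfind(" ")]
    return cut.rstrip() + ellipsis


short = truncate_title("After 16 years the Singles Jukebox is shutting down", limit=30)
```

A single word longer than the limit has no boundary to cut on, so it is hard-cut at the limit instead.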
21:30:17<@OrIdow6>Thank you
21:40:28HP_Archivist quits [Quit: Leaving]
21:42:27<@OrIdow6>In the process of testing this I have already discovered a few shutdowns
21:45:28<Ryz>LoooooOOOOOoooooot!
21:47:31<thuban>does gitea support issue tags like github's?
21:51:44<thuban>(in addition to 'not a website shutdown' and 'a website shutdown', it might be nice to have 'a shutdown of something other than a website, we should check whether it has a web presence')
21:56:24Stiletto joins
21:58:21<Ryz>Tell us the looooooooooot fading away OrIdow6 <#>;
21:58:54<@OrIdow6>Not too exciting Ryz
21:59:13<@OrIdow6>https://old.reddit.com/r/popheads/comments/wyu6xr/after_16_years_the_singles_jukebox_is_shutting/ - http://www.thesinglesjukebox.com/ (no announcement on the site itself but a lot of discussion of it)
21:59:57<@OrIdow6>https://www.cnbc.com/2022/08/24/amazon-is-shutting-down-amazon-care-telehealth-service.html - amazon.care
22:00:59tech_exorcist quits [Client Quit]
22:01:15<@OrIdow6>A restaurant in Chicago - https://chicago.eater.com/2022/8/26/23323169/tavern-on-rush-closing-rush-street-gold-coast-phil-stefani - https://www.tavernonrush.com/
22:01:27march_happy (march_happy) joins
22:02:07<@OrIdow6>A DnD-related thing I can't find now
22:03:28<@OrIdow6>One of those has no date and may just be a content freeze, and the two others are at the end of the year
22:03:32<@OrIdow6>So we may have heard of them anyway
22:06:08<@OrIdow6>thuban: Not sure how well it would work to integrate this with the idea of an issue tracker for ArchiveTeam efforts to save sites - seems like you might want a separation of the machine uploads and the human stuff
22:06:52<@OrIdow6>E.g. one thing I was talking about was a bot that closes issues as duplicates if they have certain keywords in them - would you have a tag "bot" that it only pays attention to? Etc.
22:09:18<thuban>i don't follow; this needn't be integrated with an issue tracker for efforts to save sites. surely you would want a human to tag hits and misses here regardless of what you subsequently did with the hits?
22:13:40fangfufu quits [Quit: ZNC 1.8.2+deb2+b1 - https://znc.in]
22:13:49IDK quits [Quit: Connection closed for inactivity]
22:14:15<@JAA>thuban: Yes, Gitea has issue labels.
22:17:08<@OrIdow6>thuban: Oh, I misread
22:17:46fangfufu joins
22:22:11michaelblob_ (michaelblob) joins
22:25:40michaelblob quits [Ping timeout: 240 seconds]
22:26:28michaelblob (michaelblob) joins
22:27:16michaelblob_ quits [Ping timeout: 240 seconds]
22:35:29<@OrIdow6>Thoughts on it now? Implemented J A A's suggestions and tried to add some filtering, not that that did much https://try.gitea.io/adbfb0fa90548ae6fda9/sdfvdsf/issues
22:38:42<Jake>(presumably you'd want to close and tag)
22:39:16<thuban>https://try.gitea.io/adbfb0fa90548ae6fda9/sdfvdsf/issues/164 has a missing title
22:39:53<@OrIdow6>Jake: close and tag what?
22:40:12<Jake>Sorry, I was a bit behind. When a post wouldn't be relevant to a shutdown
22:42:02<@OrIdow6>thuban: long story short, it's because the search API takes "is 'shutting down'" to match "is shutting down" but the part that matches for titles/highlights doesn't as it gets tripped up on the quote
22:42:24<thuban>((good bad idea: what if we did this for a while and then trained a text classifier on the results))
22:43:32<@OrIdow6>Though I do have a fix for that
22:55:31michaelblob quits [Read error: Connection reset by peer]
22:55:55michaelblob (michaelblob) joins
22:56:26Chris5010 quits [Quit: Ping timeout (120 seconds)]
22:56:33flashfire42 quits [Quit: Ping timeout (120 seconds)]
22:56:33Ryz2 quits [Quit: Ping timeout (120 seconds)]
22:56:41fangfufu quits [Client Quit]
22:56:43@kaz quits [Quit: Ping timeout (120 seconds)]
22:56:48nico_32 quits [Read error: Connection reset by peer]
22:56:48CraftByte quits [Quit: Ping timeout (120 seconds)]
22:56:49fangfufu joins
22:56:53flashfire42 (flashfire42) joins
22:56:55nico_32 (nico) joins
22:56:57CraftByte (DragonSec|CraftByte) joins
22:56:58kaz (Kaz) joins
22:56:58@ChanServ sets mode: +o kaz
22:56:59Ryz quits [Quit: Ping timeout (120 seconds)]
22:57:01HotSwap quits [Quit: ZNC - http://znc.in]
22:57:25Ryz (Ryz) joins
22:57:36nothere quits [Quit: Leaving]
22:57:38nyany quits [Remote host closed the connection]
22:57:45endrift quits [Quit: +++CARRIER LOST+++]
22:57:48Ryz2 (Ryz) joins
22:57:48BPCZ quits [Remote host closed the connection]
22:57:49nyany (nyany) joins
22:57:49Chris5010 (Chris5010) joins
22:57:53shogchips joins
22:57:53endrift joins
22:58:09nothere joins
22:58:52adia quits [Ping timeout: 240 seconds]
22:59:02adia (adia) joins
22:59:06shoghicp quits [Read error: Connection reset by peer]
22:59:12HotSwap joins
22:59:13BPCZ (BPCZ) joins
23:26:20Stiletto quits [Remote host closed the connection]
23:32:09Stiletto joins
23:38:07jacobk quits [Ping timeout: 265 seconds]
23:43:12<@OrIdow6>So what I'm thinking for exclusions is that there's a bot you ping with a list of keywords, and it scans through all open issues, and closes the ones where all the keywords appear within some distance from a bolded (query-matching) segment of text
23:43:25<@OrIdow6>And does this in the future as well
23:43:31<@OrIdow6>Maybe they expire after some interval
23:43:39<@OrIdow6>Ignores
23:45:17<@OrIdow6>Perhaps excluding issues which have seen human activity
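The exclusion rule sketched in the messages above could look roughly like this. It is a minimal illustration under stated assumptions: the query-matching segment is assumed to be marked with Markdown `**bold**` in the issue body, "some distance" is taken as a character window, and the names `should_close`, `window`, and `human_activity` are made up for the example. Expiry of ignore rules is not modeled.

```python
import re

# Assumption: query-matching segments are wrapped in Markdown bold markers.
BOLD = re.compile(r"\*\*(.+?)\*\*")


def should_close(body, keywords, window=60, human_activity=False):
    """Return True if every keyword appears within `window` characters
    of some bolded segment, and the issue has seen no human activity.
    """
    if human_activity or not keywords:
        return False
    lowered = body.lower()
    wanted = [k.lower() for k in keywords]
    for match in BOLD.finditer(lowered):
        start = max(0, match.start() - window)
        context = lowered[start:match.end() + window]
        if all(k in context for k in wanted):
            return True
    return False


near = "The restaurant on Main St **is shutting down** next month."
far = "restaurant" + " x" * 100 + " **is shutting down**"
```

An empty keyword list never closes anything, which avoids a misfired ping wiping out every open issue.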
23:48:21BlueMaxima joins
23:48:55<thuban>it's not obvious to me how effective that would be as a multiplier for human tagging effort (https://xkcd.com/1205/)
23:50:11<@OrIdow6>I know
23:50:54<thuban>why don't we start with some exploratory analysis? i have an insanely high tolerance for boredom; put something up, i'll tag a thousand, and we can look at the results
23:51:02<@OrIdow6>But largely what worries me is turning it into Deathwatch, where no one wants to do the boring work each day so it fills up
23:51:14<@OrIdow6>Tag in what way?
23:51:26<thuban>signal/noise
23:51:30<@OrIdow6>Oh I see
23:52:05<@OrIdow6>Going to do some other things first... but can have that set up within a few hours
23:52:22<@OrIdow6>Sounds like a good idea
23:53:50<@OrIdow6>(I can't just run the script now due to a post that seems to be tripping it up for some reason)
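Once a batch has been tagged signal/noise as proposed above, a first look at the results could be as simple as the share of signal per search query. This is a hypothetical sketch: the `(query, label)` input shape and the sample data are invented for illustration and are not real tagging results.

```python
from collections import Counter


def precision_by_query(tagged):
    """tagged: iterable of (query, label) pairs, label 'signal' or 'noise'.
    Returns the fraction of hits tagged 'signal' for each query.
    """
    totals, signals = Counter(), Counter()
    for query, label in tagged:
        totals[query] += 1
        if label == "signal":
            signals[query] += 1
    return {query: signals[query] / totals[query] for query in totals}


# Made-up sample, not actual tagging output.
sample = [
    ("shutting down", "signal"),
    ("shutting down", "noise"),
    ("shutting down", "noise"),
    ("shutting down", "noise"),
    ("closing", "noise"),
]
rates = precision_by_query(sample)
```

Queries whose rate stays near zero after a large tagged batch would be natural candidates to drop or to narrow with exclusions.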