00:02:10loug8318142 quits [Quit: The Lounge - https://thelounge.chat]
00:13:24SootBector quits [Remote host closed the connection]
00:13:45SootBector (SootBector) joins
00:23:49<pabs>xkey: re AB ignores, see also https://wiki.archiveteam.org/index.php/ArchiveBot/Ignore
00:24:58<pabs>tzt: sounds like something to add to https://wiki.archiveteam.org/index.php/Deathwatch
01:16:19<@JAA>Re Alf's question, I've been thinking about this recently: it would be good to have a simple prominent thing on the wiki homepage as a very low entry barrier for how to tell us about a shuttering site (operators or users), contact us in case we cause issues, etc.
01:17:37szczot3k3 (szczot3k) joins
01:17:56<@JAA>There's a lot of stuff linked from the homepage that probably explains it more or less, but it's a lot of stuff to go through for a first-time visitor.
01:19:30<@JAA>The first question in the FAQ is kind of that, but the link to the FAQ is not exactly prominent.
01:21:13szczot3k quits [Ping timeout: 260 seconds]
01:21:13szczot3k3 is now known as szczot3k
01:23:54<@JAA>!tell Alf There's no FAQ entry, but if you tell us what the site is, we'll make sure it gets archived. Include quirks would be helpful if applicable, e.g. if there's a rate limit we should obey or not easily discoverable parts of the site.
01:23:54<eggdrop>[tell] ok, I'll tell Alf when they join next
01:24:19<@JAA>Including* meh
01:47:28<h2ibot>Thezt edited Deathwatch (+255, Add Star.ne.jp shutdown): https://wiki.archiveteam.org/?diff=54145&oldid=54127
01:53:26graham9 joins
02:09:48nicolas17 quits [Quit: Konversation terminated!]
02:10:01nicolas17 joins
02:13:48<pabs>yeah, I encountered the need for such a low-entry-barrier FAQ, when I posted a HN thread about AT: https://news.ycombinator.com/item?id=42447579
02:16:30<pabs>TheTechRobo++
02:16:31<eggdrop>[karma] 'TheTechRobo' now has 9 karma!
02:16:41<pabs>(for the #jseater topic change)
02:16:55<TheTechRobo>narc :P
02:17:00<pabs>:)
02:24:02nicolas17 quits [Client Quit]
02:24:15nicolas17 joins
03:44:27Wohlstand quits [Ping timeout: 252 seconds]
03:48:35Webuser851025 joins
03:50:06Webuser851025 quits [Changing host]
03:50:06Webuser851025 joins
03:52:33Webuser851025 quits [Client Quit]
03:58:07pixel leaves [Error from remote client]
04:01:12<@OrIdow6>JAA: Agreed, we need a giant green button saying "report a site shutting down"
04:01:17BlueMaxima quits [Read error: Connection reset by peer]
04:34:41Hans5958 leaves
04:40:33cm quits [Ping timeout: 252 seconds]
04:41:27ljcool2006 joins
04:44:16cm joins
04:44:25<ljcool2006>>Livestream is shutting down in January 2025. Keep streaming on Vimeo
04:44:46<ljcool2006>livestream doesn't seem to have an article on the wiki yet
04:51:41<@JAA>Acquired in 2017, lots of stuff already redirects to Vimeo, including the announcement of the acquisition, which didn't set a timeline: https://web.archive.org/web/20230701084346/https://livestream.com/blog/livestream-vimeo-acquisition
04:51:57<@JAA>The announcement's just a banner on the page.
04:52:41<@JAA>Considering it's live streaming, there's probably not very much to archive.
04:56:03<h2ibot>JustAnotherArchivist edited Deathwatch (+300, /* 2025 */ Add Livestream): https://wiki.archiveteam.org/?diff=54146&oldid=54145
04:58:04<h2ibot>JustAnotherArchivist edited Deathwatch (+2, /* 2025 */ Fix ref): https://wiki.archiveteam.org/?diff=54147&oldid=54146
05:02:05<h2ibot>Wireball edited List of websites excluded from the Wayback Machine (+23, The School Bus Conversion Network (e.g. forums)…): https://wiki.archiveteam.org/?diff=54148&oldid=54081
05:05:07katocala quits [Ping timeout: 252 seconds]
05:05:16BornOn420 quits [Remote host closed the connection]
05:05:25katocala joins
05:05:54BornOn420 (BornOn420) joins
05:07:41DogsRNice quits [Read error: Connection reset by peer]
05:16:07katocala quits [Ping timeout: 252 seconds]
05:17:05katocala joins
05:21:04<@arkiver>i'll look into livestream.com !
05:31:54<@arkiver>JAA: there's quite some data still on livestream.com , they host some old streams
05:39:05<@arkiver>does anyone have an idea for a channel for "Vimeo Livestream" or "livestream.com"?
05:39:08th3z0l4 joins
05:39:38th3z0l4_ quits [Ping timeout: 260 seconds]
05:41:19<tzt>deadtrickle? because it's no longer a stream
06:00:15<h2ibot>JAABot edited List of websites excluded from the Wayback Machine (+0): https://wiki.archiveteam.org/?diff=54149&oldid=54148
06:45:38<Stagnant_>https://forum.bradleysmoker.com - SMF forum with over 22 years worth of posts will "discontinue in beginning of 2025". Can someone add it to archivebot? Doesn't seem to be rate limited.
07:02:55<@OrIdow6>^ is there a writeup anywhere of AB tricks to get around session IDs?
07:07:13<@arkiver>JAA: ^
07:07:18<@arkiver>tzt: fine with me :)
07:07:24<@arkiver>#deadtrickle for livestream.com
07:07:51<pokechu22>OrIdow6: Run it as !a https://forum.bradleysmoker.com/?archiveteam so that the first page load gets session IDs, and then it can later load https://forum.bradleysmoker.com/ and https://forum.bradleysmoker.com/index.php with the cookie set to avoid session IDs on those. And then just hope that the session IDs don't expire in the middle of the job and cause issues that way
07:10:11<@OrIdow6>pokechu22: Huh, do you then ignore the URLs with the session IDs so it doesn't do a parallel crawl of them?
07:10:18<@OrIdow6>Testing it in a browser
07:11:12<pokechu22>IIRC wpull does some stuff to strip session IDs from URLs (which also means that the URLs saved in the WARC for the first page won't directly work, as those ones will have session IDs but they'll be requested without). But even if the URL has a session ID in it, as long as the cookie's been set later page loads won't have session IDs
07:12:08<@OrIdow6>ah
07:12:43<pokechu22>huh, that's weird; I'm also not seeing session IDs when using curl
07:13:16<@OrIdow6>Thx pokechu, started the job for it
07:14:10<pokechu22>Looks like they don't do that session ID thing when using curl's UA but do with (current) firefox UAs. Interesting, though probably not relevant in this case since we have the ?archiveteam workaround
07:15:37<@OrIdow6>Huh
07:15:58<@OrIdow6>Stagnant_: Running
07:16:08<Stagnant_>i'm not seeing any session ids on my firefox browser. i do see them with chrome though
07:16:14<Stagnant_>Great, thanks ;)
07:17:05<pokechu22>They show up only on the very first time you load the site (easiest thing to try is to load the page in private browsing); afterwards they don't show up
07:17:27<Stagnant_>Oh I see
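The session-ID stripping pokechu22 mentions above can be illustrated with a small Python sketch. This is not wpull's actual implementation; the helper name `strip_session_id` and the parameter names in `SESSION_PARAMS` (including `PHPSESSID`) are assumptions made for illustration, since the log does not say which parameter the forum uses. It only handles `&`-separated query strings.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical list of session parameter names (an assumption, not from the log).
SESSION_PARAMS = {"phpsessid", "sid", "jsessionid"}


def strip_session_id(url: str) -> str:
    """Return url with any query parameter named in SESSION_PARAMS removed."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    kept = [(k, v) for k, v in pairs if k.lower() not in SESSION_PARAMS]
    if len(kept) == len(pairs):
        # Nothing to strip: return the URL untouched so it is not re-encoded.
        return url
    return urlunsplit(parts._replace(query=urlencode(kept)))


if __name__ == "__main__":
    # forum.example is a placeholder host, not the forum discussed above.
    print(strip_session_id("https://forum.example/index.php?PHPSESSID=abc123&topic=42.0"))
```

A URL without a session parameter, such as one ending in `?archiveteam`, passes through unchanged, which matches the idea that only the first, cookie-less page load carries session IDs in its links.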
08:32:32i_have_n0_idea9 quits [Quit: The Lounge - https://thelounge.chat]
08:38:51i_have_n0_idea9 (i_have_n0_idea) joins
08:49:41<eggdrop>[remind] OrIdow6: my little pony
08:50:57opl joins
08:58:50loug8318142 joins
09:15:13nulldata quits [Quit: So long and thanks for all the fish!]
09:15:17<xkey>pabs: thanks for the link!
09:16:08nulldata (nulldata) joins
09:25:20<opl>hello! forwarding this from someone more involved in the fighting game community: "there's a big japanese arcade called a-cho that is shutting down soon and they have a very long running youtube channel with thousands and thousands of videos of matches and tournaments and whatnot. Anyway it turns out they are going to be deleting their youtube
09:25:20<opl>channel <https://x.com/Chickzama/status/1875045024507019668>
09:25:20<opl><https://x.com/chibax7jp/status/1875084159624114489>"
09:25:21<eggdrop>nitter: https://xcancel.com/Chickzama/status/1875045024507019668
09:25:24<opl>according to the tweets, the arcade shuts down on 2025-01-31, and the youtube channels are to be deleted on 2025-02-28
09:25:29<opl>the youtube channels in question are <https://www.youtube.com/channel/UCCfnriDcUslGMUMX4Ctkyjg> (at GAMEacho/a-cho GAME) and <https://www.youtube.com/channel/UCkXtcsyQ6g8coNrclPvt29w> (at zero3japan/a-cho battle movie). i believe they're both eligible to get queued up for archival, but i'm not familiar enough with the process to attempt that myself
09:29:24<@OrIdow6>!remindme 1d my little pony
09:29:26<eggdrop>[remind] ok, i'll remind you at 2025-01-04T09:29:24Z
09:41:42Island quits [Read error: Connection reset by peer]
09:44:29<opl>i'm also noticing some of the tournament results pages aren't archived by IA. roots: http://www.a-cho.com/ac/res_2018.html (contains links to previous years at the bottom) http://www.a-cho.com/ac/res_2019.html http://www.a-cho.com/ac/res_2020.html (pages for just 2019 and 2020?)
10:00:59<h2ibot>JAABot edited CurrentWarriorProject (-2): https://wiki.archiveteam.org/?diff=54150&oldid=53879
10:04:38<Flashfire42>Hey um why is archiveteams choice youtube atm? It saves the IPs of warriors in the url and some people may not want that without opting in?
10:05:33<Flashfire42>cc JAA arkiver ?
10:06:56<@arkiver>youtube has introduced new blocking, i wanted to get the queue down a bit
10:27:43<joepie91|m>arkiver: note that this is likely getting the IPs blocked of people running the warrior
10:43:59sec^nd quits [Remote host closed the connection]
10:47:39pixel (pixel) joins
11:04:27Gadelhas562873 quits [Ping timeout: 252 seconds]
11:09:54Gadelhas562873 joins
11:11:25pseudorizer quits [Ping timeout: 252 seconds]
11:12:57pseudorizer (pseudorizer) joins
11:18:31<@arkiver>joepie91|m: Flashfire42: i've moved it back to telegram
11:19:13<@arkiver>on blocking - the number of items being handed out was pretty stable, no obvious sign of blocking (otherwise we would see the number of requested items going down)
11:23:38<@arkiver>and thanks joepie91|m - i did not consider the point enough in switching over. will keep it like it is now (so no youtube default)
11:37:20sec^nd (second) joins
11:38:19Webuser716927 joins
11:38:43Webuser716927 quits [Client Quit]
11:51:50<joepie91|m>👍️
11:52:07<joepie91|m>as for the blocking, it seems to work in waves, it doesn't seem to be a fully automated/rolling process
11:52:23<joepie91|m>people seem to have gotten blocked for using yt-dlp in the past even though they've stopped using it for example
11:52:36<joepie91|m>so presumably there's some kind of batch data crunching going on that spits out IPs to block every once in a while
11:59:21<h2ibot>Bzc6p edited Internet Archive (+106, /* Wayback Machine Save Page Now */ mention…): https://wiki.archiveteam.org/?diff=54151&oldid=53670
12:00:03Bleo182600722719623 quits [Quit: The Lounge - https://thelounge.chat]
12:00:21<h2ibot>JAABot edited CurrentWarriorProject (+2): https://wiki.archiveteam.org/?diff=54152&oldid=54150
12:02:54Bleo182600722719623 joins
12:04:43<@OrIdow6>https://x.com/acho_kyoto/status/1871138196291256535 a-cho good shutdown notice
12:04:43<eggdrop>nitter: https://xcancel.com/acho_kyoto/status/1871138196291256535
12:09:23<h2ibot>Bzc6p edited Cafeblog.hu (+189, /* Site reconnaissance */ more info): https://wiki.archiveteam.org/?diff=54153&oldid=54132
12:12:13<@OrIdow6>opl: Not sure about the Youtube channel but I'm running at least one more ArchiveBot crawl for a-cho, because it seems their site is half-broken and the last one didn't discover those pages
12:13:51lflare quits [Quit: Bye]
12:14:12<@OrIdow6>!remindme 2w see if the a-cho job got http://www.a-cho.com/ac/res_2019.html and http://www.a-cho.com/ac/res_2020.html
12:14:13<eggdrop>[remind] ok, i'll remind you at 2025-01-17T12:14:12Z
12:14:14lflare (lflare) joins
12:18:17<opl>thanks, OrIdow6. any idea what should be done with the youtube content? i know downthetube exists, but it seems the queue is rather long so i'd be worried about the videos never making it to the front before deletion
12:19:24<h2ibot>Bzc6p edited Blogger.hu (+301, add discovery numbers, update status): https://wiki.archiveteam.org/?diff=54154&oldid=54131
12:21:19<@OrIdow6>opl: So with ArchiveTeam we do have video archival systems, most notably #down-the-tube. However, videos are HUGE, especially when people want to archive them in bulk, so there are criteria for what gets included that I'm not familiar with. If someone who is wants to comment, they can
12:21:25<h2ibot>Bzc6p edited Blogger.hu (+0, /* Archiving */ fix crash date): https://wiki.archiveteam.org/?diff=54155&oldid=54154
12:21:45<opl>i'm sure some people will attempt their own archive jobs, but i imagine the results of those would become nearly impossible to find afterwards. there's also the problem of there being days of recordings there, so i imagine most people making an attempt won't have enough storage space for it all
12:21:46<@OrIdow6>For your own purposes, a yt-dlp wrapper would work; or yt-dlp itself, but it requires using the command line
12:22:43<@OrIdow6>If the fighting game community is organized enough best thing to do would be to have it host them itself I think
12:23:09<@OrIdow6>But yeah $$$
12:23:11Webuser981789 joins
12:23:28Webuser981789 quits [Client Quit]
12:24:25<h2ibot>Bzc6p edited Kepfeltoltes.eu (+24, /* Archiving */ add 2024 numbers): https://wiki.archiveteam.org/?diff=54156&oldid=53345
12:24:26<opl>yeah, i'm familiar with yt-dlp. i think i'm mostly trying to figure out here if anyone has figured out how to solve the problem of link rot in situations like this. after all, it doesn't matter if you have a copy of everything if no one can discover it
12:25:22<opl>i'm not actually part of that scene, but i do hate to see history gone
12:25:38<@OrIdow6>Pretty common sentiment here :D
12:26:03<@OrIdow6>But sadly I don't know of any such universal thing, tho I'm not familiar with Youtube archival specifically
12:26:46<@OrIdow6>If something like IPFS (and funnily enough we had a brief conversation about that here a few days ago) wasn't hugely inefficient and impossible to use that might've been a good solution
12:27:29<@OrIdow6>It's too bad web.archive.org can't let anyone say "hey, I'm hosting this!" and have it redirect to their site or whatever, but there are a bunch of obvious issues with that
12:28:19<@OrIdow6>But uh, I do encourage you to stick in the channel and see if you get any other comments about Youtube specifically
12:28:55<opl>yeah, indexing dead content is ultimately a massive pain that i've spent many nights pondering
12:29:41<opl>ipfs... i tried using it once, and it seemed like a solid idea. never used it seriously enough to discover the issues. might need to read through that conversation out of curiosity
12:29:57<@OrIdow6>It uses a gigantic amount of RAM mostly
12:31:53<opl>and i'll definitely be sticking around. would asking in down-the-tube specifically have been better, or would it just be more likely to get lost there? not super familiar with the conventions here
12:32:50<opl>plus i'm noticing the project channels aren't saved in the current log archives, so i'm hesitant to even ask there in case someone else comes with the same problem later
12:36:27<h2ibot>Bzc6p edited EOldal (+36, add company navbox): https://wiki.archiveteam.org/?diff=54157&oldid=49818
12:37:04<@OrIdow6>opl: Worth a shot asking there, I'd say mention that they're being deleted; it's all the same people, but some channels fill up with bots talking and some fill up with irrelevant-to-them conversations and no one reads everything
12:37:56<@OrIdow6>You've asked here already for logging purposes
12:39:16<opl>ok, will do. i didn't want to ask in two different channels at the same time since i wasn't sure if it's frowned upon >.>
12:40:09SkilledAlpaca418962 quits [Quit: SkilledAlpaca418962]
12:41:30SkilledAlpaca418962 joins
13:03:10IDK (IDK) joins
13:22:40NF885 (NF885) joins
13:24:20<NF885>looks like the Vine Archive site at https://vine.co has died
13:26:37<NF885>e.g. https://vine.co/twitter shows a broken loading gif
13:29:15NF885 quits [Client Quit]
13:46:21Shjosan quits [Read error: Connection reset by peer]
13:47:10Shjosan (Shjosan) joins
13:47:43Wohlstand (Wohlstand) joins
13:50:15Shjosan quits [Client Quit]
13:52:00Shjosan (Shjosan) joins
14:59:07graham9 quits [Quit: The Lounge - https://thelounge.chat]
15:14:48Wohlstand quits [Ping timeout: 260 seconds]
15:55:12Webuser171702 joins
15:55:54Webuser171702 quits [Client Quit]
16:53:58qwertyasdfuiopghjkl2 quits [Ping timeout: 260 seconds]
16:55:32loug8318142 quits [Quit: The Lounge - https://thelounge.chat]
16:55:50loug8318142 joins
17:07:56qwertyasdfuiopghjkl2 joins
17:08:30qwertyasdfuiopghjkl2 quits [Max SendQ exceeded]
17:10:09qwertyasdfuiopghjkl2 joins
17:10:43qwertyasdfuiopghjkl2 quits [Max SendQ exceeded]
17:11:18qwertyasdfuiopghjkl2 joins
17:11:52qwertyasdfuiopghjkl2 quits [Max SendQ exceeded]
17:13:14qwertyasdfuiopghjkl2 joins
17:13:48qwertyasdfuiopghjkl2 quits [Max SendQ exceeded]
17:15:20qwertyasdfuiopghjkl2 joins
17:15:54qwertyasdfuiopghjkl2 quits [Max SendQ exceeded]
17:16:11qwertyasdfuiopghjkl2 joins
17:16:45qwertyasdfuiopghjkl2 quits [Max SendQ exceeded]
17:20:45i_have_n0_idea9 quits [Quit: The Lounge - https://thelounge.chat]
17:25:00i_have_n0_idea9 (i_have_n0_idea) joins
17:26:17i_have_n0_idea9 quits [Client Quit]
17:26:45i_have_n0_idea9 (i_have_n0_idea) joins
17:29:34graham9 joins
17:30:22qwertyasdfuiopghjkl2 joins
17:30:57qwertyasdfuiopghjkl2 quits [Max SendQ exceeded]
18:23:50Webuser330602 joins
18:23:56Webuser330602 quits [Client Quit]
19:08:55Wohlstand (Wohlstand) joins
19:41:59hackbug quits [Remote host closed the connection]
19:43:07graham9 quits [Client Quit]
20:01:54abirkill quits [Quit: Let us prepare to grapple with the ineffable itself, and see if we may not eff it after all.]
21:01:54hackbug (hackbug) joins
21:25:28loug8318142 quits [Quit: The Lounge - https://thelounge.chat]
22:18:49graham9 joins