00:01:02etnguyen03 (etnguyen03) joins
00:18:23nine quits [Quit: See ya!]
00:18:34nine joins
00:18:35nine quits [Changing host]
00:18:35nine (nine) joins
00:36:53CuppyMan joins
00:51:14etnguyen03 quits [Client Quit]
01:02:09etnguyen03 (etnguyen03) joins
01:56:35HP_Archivist quits [Quit: Leaving]
02:09:49Karlett quits [Quit: Leaving]
02:38:27<@JAA>It's now redirecting to the new forums.
02:45:59CuppyMan quits [Client Quit]
02:53:19<@JAA>DNS changed from CNAME forum-lb-2082650296.us-west-2.elb.amazonaws.com to CNAME www.cyberlink.com → CNAME d3it87pvl2tmgl.cloudfront.net, but the former is the one that's timing out, so they didn't only switch the DNS.
02:54:23<@JAA>Sad that we didn't hear about it sooner.
02:55:44<@JAA>The Corsair, DDO, and LOTRO forums are still online, by the way.
02:59:08etnguyen03 quits [Remote host closed the connection]
03:03:28hexagonwin quits [Read error: Connection reset by peer]
03:04:00hexagonwin joins
03:07:44Karlett2 quits [Quit: Leaving]
03:09:57<steering>re cyberlink: ok but who's actually buying DVD player software in 2025 :P
03:18:14LunarianBunny1147 quits [Ping timeout: 258 seconds]
03:18:14hexagonwin quits [Read error: Connection reset by peer]
03:19:40hexagonwin joins
03:20:05LunarianBunny1147 (LunarianBunny1147) joins
03:21:10hexagonwin quits [Read error: Connection reset by peer]
03:23:11hexagonwin joins
03:23:56evergreen2 joins
03:26:34evergreen quits [Ping timeout: 260 seconds]
03:26:34evergreen2 is now known as evergreen
03:26:45Island quits [Read error: Connection reset by peer]
03:31:47Karlett2 (Karlett2) joins
03:41:37Karlett2 quits [Ping timeout: 258 seconds]
04:01:58fangfufu quits [Quit: ZNC 1.9.1+deb2+b3 - https://znc.in]
04:05:46lemuria_ (lemuria) joins
04:06:21tmg1|michelson quits [Remote host closed the connection]
04:06:45fangfufu (fangfufu) joins
04:08:50lemuria quits [Ping timeout: 258 seconds]
04:17:09SootBector quits [Ping timeout: 255 seconds]
04:19:42SootBector (SootBector) joins
04:25:29cmlow quits [Ping timeout: 260 seconds]
04:26:54Karlett2 (Karlett2) joins
04:30:34Karlett2 quits [Read error: Connection reset by peer]
04:31:39SootBector quits [Remote host closed the connection]
04:32:05Karlett joins
04:32:46SootBector (SootBector) joins
04:33:08Karlett quits [Read error: Connection reset by peer]
04:35:24pabs quits [Ping timeout: 260 seconds]
04:51:06pabs (pabs) joins
05:37:09Karlett joins
05:37:11Karlett quits [Read error: Connection reset by peer]
05:37:58Karlett joins
05:37:59Karlett quits [Read error: Connection reset by peer]
05:39:49Karlett joins
05:39:49Karlett quits [Read error: Connection reset by peer]
05:40:52Karlett joins
05:40:58Karlett quits [Read error: Connection reset by peer]
05:41:50Karlett joins
05:43:07Karlett quits [Read error: Connection reset by peer]
05:43:43Karlett joins
05:43:43Karlett quits [Read error: Connection reset by peer]
05:44:40Karlett joins
05:45:21Karlett quits [Read error: Connection reset by peer]
05:46:09Karlett joins
05:46:11Karlett quits [Read error: Connection reset by peer]
05:47:12Karlett joins
05:47:22Karlett quits [Read error: Connection reset by peer]
05:48:06Karlett joins
05:48:08Karlett quits [Read error: Connection reset by peer]
05:50:38Karlett joins
05:50:38Karlett quits [Read error: Connection reset by peer]
05:51:37Karlett joins
05:59:04Karlett quits [Remote host closed the connection]
05:59:42Karlett joins
05:59:43Karlett quits [Read error: Connection reset by peer]
06:00:46Karlett joins
06:02:34Karlett quits [Remote host closed the connection]
06:02:51cyanbox joins
06:24:41Karlett2 (Karlett2) joins
06:48:31nine quits [Quit: See ya!]
06:48:42nine joins
06:48:43nine quits [Changing host]
06:48:43nine (nine) joins
07:17:31<pabs>kiska: mannie noticed that this pad got emptied, is there any way to restore it to an earlier revision? https://pad.notkiska.pw/p/archivebot-twitter
07:18:34<pabs>version 34182 in the timeline was the last good one
07:19:17<pabs>actually make that 34175
07:21:09mannie (nannie) joins
07:21:34nine quits [Client Quit]
07:21:47nine joins
07:21:48nine quits [Changing host]
07:21:48nine (nine) joins
07:22:16<mannie>kiska: the twitter etherpad is empty; pabs looked at it and the last good one is version 34175. Can you take a look at it?
07:22:42pabs already mentioned it :)
07:27:49mannie quits [Client Quit]
07:30:12SootBector quits [Remote host closed the connection]
07:31:30SootBector (SootBector) joins
07:39:01SootBector quits [Remote host closed the connection]
07:40:09SootBector (SootBector) joins
08:00:49lennier2 joins
08:03:39lennier2_ quits [Ping timeout: 260 seconds]
08:03:55Karlett joins
08:10:15SootBector quits [Remote host closed the connection]
08:11:22SootBector (SootBector) joins
08:24:31Karlett2 quits [Ping timeout: 258 seconds]
08:25:34Karlett quits [Read error: Connection reset by peer]
08:26:59JTL quits [Ping timeout: 260 seconds]
08:31:01JTL (JTL) joins
08:36:46Karlett2 (Karlett2) joins
08:40:46Karlett joins
08:41:24Karlett quits [Client Quit]
08:41:41Karlett joins
08:46:33Dada joins
09:07:33APOLLO03 quits [Quit: .]
09:08:37APOLLO03 joins
09:28:13Karlett2 quits [Remote host closed the connection]
09:28:37Karlett2 (Karlett2) joins
09:32:50<h2ibot>Hans5958 edited Main Page/Current Projects (+21, Move Peing to medium): https://wiki.archiveteam.org/?diff=57196&oldid=57184
09:37:04Webuser094669 joins
09:37:42Webuser094669 quits [Client Quit]
09:39:06hexagonwin quits [Read error: Connection reset by peer]
09:41:03hexagonwin joins
09:47:24PredatorIWD25 quits [Read error: Connection reset by peer]
09:56:35a-dude joins
09:57:50a-dude quits [Remote host closed the connection]
09:59:29monoxane (monoxane) joins
09:59:45hexagonwin quits [Read error: Connection reset by peer]
10:01:45hexagonwin joins
10:01:49nulldata-alt3 (nulldata) joins
10:03:49nulldata-alt quits [Ping timeout: 260 seconds]
10:03:49nulldata-alt3 is now known as nulldata-alt
10:09:33TheEnbyperor quits [Ping timeout: 258 seconds]
10:09:39TheEnbyperor_ quits [Ping timeout: 260 seconds]
10:37:51PredatorIWD25 joins
10:46:21ericgallager quits [Quit: This computer has gone to sleep]
10:46:31Webuser923772 joins
10:54:36TheEnbyperor joins
10:54:40TheEnbyperor_ (TheEnbyperor) joins
10:55:48<Webuser923772>Hi guys, how much were you able to archive of the cyberlink forum? I saw that it went offline this morning
11:00:02Bleo182600722719623455222 quits [Quit: The Lounge - https://thelounge.chat]
11:02:43Bleo182600722719623455222 joins
11:08:02<@imer>Webuser923772: "JAA: I got about half of all threads with qwarc, without attachments.", the archivebot job also saved some pages, likely nowhere near complete though :(
11:13:02<Webuser923772>I have all the PowerDVD (older versions) threads, both english and german, with all attachments. Unfortunately i didn't know about this community till it was too late for full archival
11:17:19hexagonwin quits [Read error: Connection reset by peer]
11:17:52<Webuser923772>Forum-Index » PowerDVD (ältere Versionen) -> 1976 german threads with attachments
11:17:52<Webuser923772>Forum Index » PowerDVD (previous versions) -> 7490 english threads with attachments
11:18:18<Webuser923772>If needed i can send this batch that i archived, there are probably some other pages that the spider got too
11:19:39hexagonwin joins
11:20:32<@imer>Webuser923772: can you upload it to archive.org?
11:29:26Webuser923772 quits [Client Quit]
12:02:40SootBector quits [Remote host closed the connection]
12:03:52SootBector (SootBector) joins
12:04:28SootBector quits [Remote host closed the connection]
12:06:45SootBector (SootBector) joins
12:15:17SootBector quits [Remote host closed the connection]
12:16:46SootBector (SootBector) joins
12:22:50ericgallager joins
12:29:46Commander001 quits [Remote host closed the connection]
12:40:10Commander001 joins
12:45:21gosc joins
13:01:28redbees quits [Quit: ZNC 1.7.5+deb4 - https://znc.in]
13:38:13Shard (Shard) joins
14:25:14zhongfu quits [Ping timeout: 258 seconds]
14:46:11dabs joins
14:51:16<kiska>pabs: Rev has been restored
14:51:53<pabs>kiska++
14:51:53<eggdrop>[karma] 'kiska' now has 13 karma!
14:52:46<kiska>Go through it and actually make sure it restored to that version, cause the api for that is a little... unstable :D
15:11:44<pabs>looks like it
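For reference, a restore like the one kiska mentions can be requested through Etherpad's HTTP API. The log doesn't say how it was actually done, so this is only a sketch under assumptions: it assumes the `restoreRevision` call (API version 1.2.11 or later) with `padID` and `rev` parameters and API-key authentication, and the key below is a placeholder. It only builds the request URLs and sends nothing. The `getText` URL with `rev` is one possible way to do the check kiska asks for.

```python
from urllib.parse import urlencode

API_VERSION = "1.2.11"  # assumption: restoreRevision needs at least this API version


def etherpad_api_url(base, method, api_key, **params):
    """Build an Etherpad HTTP API URL; nothing is sent over the network."""
    query = urlencode({"apikey": api_key, **params})
    return f"{base.rstrip('/')}/api/{API_VERSION}/{method}?{query}"


if __name__ == "__main__":
    base = "https://pad.notkiska.pw"
    key = "PLACEHOLDER_KEY"  # hypothetical; the real key is private to the pad admin
    pad, rev = "archivebot-twitter", 34175
    # Request that would restore the pad to the last good revision.
    print(etherpad_api_url(base, "restoreRevision", key, padID=pad, rev=rev))
    # Request for the text at that revision, to compare against the restored pad.
    print(etherpad_api_url(base, "getText", key, padID=pad, rev=rev))
```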
15:12:00SootBector quits [Remote host closed the connection]
15:13:17SootBector (SootBector) joins
15:22:50dabs quits [Client Quit]
15:28:24cyanbox quits [Read error: Connection reset by peer]
15:35:53<h2ibot>Manu edited Mailing Lists (+29, Mailman 3: Add lists.das-labor.org): https://wiki.archiveteam.org/?diff=57197&oldid=57008
15:36:30Island joins
15:39:54Wohlstand (Wohlstand) joins
15:44:28fionera quits [Remote host closed the connection]
16:24:27ducky quits [Ping timeout: 258 seconds]
16:29:09skyrock3t joins
16:30:34skyrocket quits [Ping timeout: 260 seconds]
16:36:20Commander001 quits [Ping timeout: 258 seconds]
16:37:10Commander001 joins
16:45:09tzt quits [Ping timeout: 260 seconds]
16:54:48<justauser|m>https://fastcode.io/2025/08/30/the-69-billion-domino-effect-how-vmwares-debt-fueled-acquisition-is-killing-open-source-one-repository-at-a-time/ unsure about the potential for data loss. Docker images will no longer be supported, but they promise not to kill them.
16:57:10tzt (tzt) joins
17:01:17Shard quits [Quit: Im doing something rq. Il brb]
17:32:13SootBector quits [Read error: Connection reset by peer]
17:33:23SootBector (SootBector) joins
17:40:07b3nzo joins
17:42:33kansei quits [Quit: ZNC 1.10.1 - https://znc.in]
17:47:37<@OrIdow6>b3nzo: It's fairly informal, work on what you want mostly (with the caveat that 'what you want' is often difficult to pull off for technical/other practical reasons)
17:47:51<@OrIdow6>Lots of people just run machines that do grabbing
17:50:24<b3nzo>i emailed jason regarding a project im working on, so after a bit of discussion he suggested to join archiveteam and to have a look at archiveteam.org for the info, but i couldnt find it
17:50:47<justauser|m>https://wiki.archiveteam.org/ should work.
17:51:12<justauser|m>What kind of info?
17:53:39<b3nzo>im working on a personal web archival project to archive webpages and crawl and then to upload on wayback
17:54:10<b3nzo>but IA denied to index my warc files on wayback
17:54:40<justauser|m>Are there any special requirements for crawling? Would ArchiveBot work?
17:56:06<b3nzo>so i reached out to jason on how they can be indexed on the wayback, and he said that only authorized "chain of custody" sources are indexed on the wayback, and he said to join archiveteam so i could get warc files from my project indexed
17:58:23<b3nzo>no, just regular crawling and prioritizing the sites that blocked IA's crawler like reddit
17:58:51<b3nzo>i just built a pipeline using grab-site
17:59:42<b3nzo>so ig its quite similar to how ArchiveBot works(dont know much abt ArchiveBot)
18:12:06gosc quits [Quit: Leaving]
18:18:48<@JAA>If you support the cause, you're part of AT, more or less. Like others have mentioned, there's no formal membership. However, getting WARCs into the WBM isn't as open (and can't be).
18:25:01Shard (Shard) joins
18:28:16<@OrIdow6>b3nzo: Reddit and the like might be problematic, I don't know what our status on that is; but if you have a short list of URLs we might be able to run it on what exists now
18:32:04<@arkiver>yeah limited numbers of URLs could probably be archived
18:32:23<@arkiver>maybe via ArchiveBot - i believe there is not a very strong/strict scope at the moment for ArchiveBot
18:34:49<b3nzo>JAA: yea that was the reason i reached out to Jason, but not sure what he meant by joining AT, ig i have to wait for his reply
18:35:42ducky (ducky) joins
18:37:09<b3nzo>OrIdow6: yea, they can be problematic for displaying on the WBM, but it's up to the IA team what to display and what not to; they'll just store my WARCs of all domains
18:38:04<@JAA>Yeah, uploading WARCs is always fine and still useful even if they're not added to the WBM index.
18:40:40<@OrIdow6>b3nzo: I mean, I don't know how much capacity we have right now to capture Reddit even if you do give us a list of Reddit URLs you want captured
18:42:36<b3nzo>OrIdow6: i dont have a list of URLs yet, im still working on the pipeline (should be done with it by this week), and then the plan is to launch a chrome extension to capture users' URLs (without IP logging or any cookies), blocking domains like gmail, discord, outlook, etc. and then implement specific scraping rules for specific sites for better url collection (especially for dynamically loaded pages)
18:44:57<@OrIdow6>b3nzo: How are you generating WARCs from a Chrome extension? We've looked at that before but it's seemed like its API doesn't give a good way to completely accurately capture what goes over the wire (by aggressively normalizing headers, off the top of my head, among other things)
18:45:34<b3nzo>OrIdow6: what do you mean by "I don't know how much capacity we have", do you mean the warrior?
18:46:42<@OrIdow6>b3nzo: Warrior or any other system. Mostly availability of clean IP addresses but also whatever resources may be needed to capture Reddit these days
18:46:47<b3nzo>OrIdow6: nah, im not generating WARCs through extension, the extension just collects URLs from the users activity
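A minimal sketch of the URL filtering step b3nzo describes (collect URLs from user activity, but skip private domains such as Gmail, Discord, and Outlook). The hostnames in `BLOCKED_DOMAINS` and the function name `should_collect` are assumptions for illustration; the log doesn't say how the extension implements this. It's written in Python for brevity, whereas an actual Chrome extension would be JavaScript.

```python
from urllib.parse import urlsplit

# Hypothetical blocklist: the log only names gmail, discord and outlook,
# so these exact hostnames are assumptions.
BLOCKED_DOMAINS = {"mail.google.com", "discord.com", "outlook.com"}


def should_collect(url, blocked=BLOCKED_DOMAINS):
    """Return True if the URL is http(s) and its host is not blocked.

    A host is blocked if it equals a blocked domain or is a subdomain of one.
    """
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return False
    host = (parts.hostname or "").lower().rstrip(".")
    if not host:
        return False
    return not any(host == d or host.endswith("." + d) for d in blocked)


if __name__ == "__main__":
    for u in ["https://discord.com/channels/1",
              "https://old.reddit.com/r/Archiveteam/"]:
        print(u, should_collect(u))
```

Matching on the dot-prefixed suffix rather than a plain substring keeps unrelated hosts that merely contain a blocked name from being dropped.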
18:47:51xkey quits [Quit: WeeChat 4.7.0]
18:49:42lemuria_ is now known as lemuria
18:51:37xkey (xkey) joins
18:52:29<b3nzo>would there be any legal issues if i publish the warc files on the project site?
18:53:14<b3nzo>or are corporations more aggressive towards sites which display the archives?
18:54:28Webuser098077 joins
18:59:19ducky quits [Ping timeout: 258 seconds]
18:59:19<@arkiver>do you mean archive.org with "project site"?
18:59:33<@arkiver>you can upload them to there, if there's a problem with them they may be taken down though
18:59:51<@arkiver>and they will not land in the Wayback Machine but in a collection with WARCs uploaded by various accounts on IA
19:03:14<b3nzo>i meant uploading on IA and on my project's site, so basically 2 locations
19:06:56<@arkiver>regarding legal issues on your own site, you would have to talk with a lawyer about that
19:08:18<b3nzo>yea, if i upload warcs they end up at archive.org/details/warczone
19:13:30<@arkiver>yep
19:13:43<@arkiver>so, feel free to do that!
19:14:02<@arkiver>maybe not tens of TBs, but sounds like the plan is not too broad
19:20:44Wohlstand quits [Quit: Wohlstand]
19:20:59Wohlstand (Wohlstand) joins
19:31:14<b3nzo>yea, will just upload archives for the meantime until i can get them indexed
19:32:48<b3nzo>why does IA lock all the indexed archives? even if some of them arent equipped with antibot/yt-dlp
19:34:40<@arkiver>i can't speak for the crawls by IA themselves, but we have had to restrict direct access to our (Archive Team) WARCs due to scraping for LLM training and due to containing data blocked for access through the Wayback Machine
19:35:02<@arkiver>many are somewhat positive towards web archiving, but not towards mass scale AI training data collection
19:35:47<@arkiver>making the WARCs fully available would effectively mean getting archived by us is the same as giving all their public data to AI companies for training
19:35:53<@arkiver>(yay for LLMs :/ )
19:36:37<@JAA>We should put this in an FAQ entry.
19:36:42<@arkiver>yeah
19:36:52<@arkiver>(see also recent news articles about IA and Reddit problems)
19:40:11@arkiver is afk for a while
19:42:48ducky (ducky) joins
19:50:49<b3nzo>the AI data problem is just getting bigger, and the IA is getting a lot of hate even though they dont support it
19:51:50<Guest>i thought ai companies already got all the human generated data 😂
19:52:40<b3nzo>they certainly scraped almost all the data
19:52:58<b3nzo>but they would want more and updated data
19:56:06<Guest>anthropic got a slap on the wrist for scanning millions of books they purchased to train ai and the courts sided with meta after they torrented over 80tb of media for ai training
19:56:56<Guest>insane times we live in. most people have probably never seen 80tb worth of content.
20:02:18Webuser098077 quits [Client Quit]
20:02:35<h2ibot>Anonymoususer852 edited Frequently Asked Questions (+821, /* We Are Not The Internet Archive */ Added…): https://wiki.archiveteam.org/?diff=57198&oldid=56265
20:07:01<masterx244|m>arkiver: sucks for future cases like the imgone warc-eating though. Not possible for most members here anymore to crunch the data to extract stuff
20:07:35<h2ibot>Anonymoususer852 edited Frequently Asked Questions (+180, "Why can't I download the WARCs for some…): https://wiki.archiveteam.org/?diff=57199&oldid=57198
20:10:41<b3nzo>at the end of the day, none of the ai companies will pay a penny towards the charges
20:11:44<masterx244|m>yeah, fuckturds at the finest level
21:03:09Wohlstand quits [Client Quit]
21:13:13SootBector quits [Remote host closed the connection]
21:14:28SootBector (SootBector) joins
21:32:05etnguyen03 (etnguyen03) joins
21:42:14SootBector quits [Remote host closed the connection]
21:43:22SootBector (SootBector) joins
21:45:26etnguyen03 quits [Client Quit]
21:46:50b3nzo quits [Ping timeout: 258 seconds]
22:05:10dabs joins
22:08:19beardicus9 quits [Ping timeout: 260 seconds]
22:08:21SootBect1 (SootBector) joins
22:08:36SootBector quits [Ping timeout: 255 seconds]
22:20:11Dada quits [Remote host closed the connection]
22:39:32etnguyen03 (etnguyen03) joins
22:40:24Doomaholic quits [Ping timeout: 260 seconds]
22:46:25Doomaholic (Doomaholic) joins
23:11:13Wohlstand (Wohlstand) joins
23:13:25etnguyen03 quits [Client Quit]
23:18:43CuppyMan joins
23:19:33nstrom joins
23:54:31etnguyen03 (etnguyen03) joins