00:03:48<h2ibot>Flashfire42 edited List of websites excluded from the Wayback Machine (+31): https://wiki.archiveteam.org/?diff=50285&oldid=50272
00:16:44katocala joins
00:28:53Mateon2 joins
00:30:44Mateon1 quits [Ping timeout: 252 seconds]
00:30:44Mateon2 is now known as Mateon1
00:33:53<h2ibot>PaulWise edited Mailman2 (+70, more lists, deduplicate lists): https://wiki.archiveteam.org/?diff=50286&oldid=50214
00:37:01qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins
00:44:21BlueMaxima joins
00:48:53etnguyen03 quits [Ping timeout: 252 seconds]
00:58:58etnguyen03 (etnguyen03) joins
01:00:58<h2ibot>JAABot edited List of websites excluded from the Wayback Machine (+0): https://wiki.archiveteam.org/?diff=50287&oldid=50285
01:15:06<fireonlive>"VirtualBox 7.0.10 download links have disappeared" https://news.ycombinator.com/item?id=36841272
01:15:27<fireonlive>still on the download mirror though
01:18:00<imer>https://download.virtualbox.org/virtualbox/ that'd be quite a chunk to grab
01:18:06<imer>probably
01:18:12<fireonlive>hmm i guess some stuff is broken; page hasn't been modified according to the wiki history in 10 months
01:18:19<fireonlive>but a lot of stuff doesn't make sense on the site lol
01:18:29<fireonlive>e.g. changelog is blank too https://www.virtualbox.org/wiki/Changelog-7.0
01:19:26<fireonlive>"[[Include(wikitestbuildsfile:changelog-7.0.wiki, text/x-trac-wiki)]]"
01:19:45<fireonlive>i guess their trac just isn't happy
01:20:49<fireonlive>via https://www.virtualbox.org/wiki/Changelog-7.0?action=diff&version=2, not sure how to view a page source otherwise
01:20:52<fireonlive>false alarm i guess :)
01:23:18W7RFa6AbNFz quits [Read error: Connection reset by peer]
01:23:32W7RFa6AbNFz joins
01:38:44TheTechRobo (TheTechRobo) joins
01:50:34superkuh__ joins
01:51:09chrismeller36 (chrismeller) joins
01:51:10fireonlive6 (fireonlive) joins
01:51:12TastyWiener958 (TastyWiener95) joins
01:51:37fireonlive quits [Killed (NickServ (GHOST command used by fireonlive6))]
01:51:37fireonlive6 is now known as fireonlive
01:51:59Ryz6 (Ryz) joins
01:52:15emily (pseudorizer) joins
01:53:25summerisle (summerisle) joins
01:53:32fionera_ (Fionera) joins
01:53:33fionera_ quits [Max SendQ exceeded]
01:53:35yawkat` (yawkat) joins
01:53:38fionera_ (Fionera) joins
01:53:57JensRex quits [Client Quit]
01:54:40bleb joins
01:54:43omni_ joins
01:54:47automato1 joins
01:54:49rewby1 (rewby) joins
01:54:49@ChanServ sets mode: +o rewby1
01:54:51ats_ (ats) joins
01:54:55@rewby quits [Killed (NickServ (GHOST command used by rewby1))]
01:54:55dave1 (dave) joins
01:54:57@rewby1 is now known as @rewby
01:55:04SketchCo1 joins
01:55:05whoami_ (whoami) joins
01:55:06JTL1 (jtl) joins
01:55:14TastyWiener95 quits [Client Quit]
01:55:14Perk quits [Client Quit]
01:55:14yawkat quits [Quit: No Ping reply in 180 seconds.]
01:55:14chrismeller3 quits [Client Quit]
01:55:14IDK_ quits [Quit: Ping timeout (120 seconds)]
01:55:14Ryz quits [Quit: Ping timeout (120 seconds)]
01:55:14qwertyasdfuiopghjkl quits [Client Quit]
01:55:14pseudorizer quits [Client Quit]
01:55:14fuzzy8021 quits [Remote host closed the connection]
01:55:14Billy549 quits [Client Quit]
01:55:14superkuh_ quits [Remote host closed the connection]
01:55:14aismallard quits [Remote host closed the connection]
01:55:14summerisle_ quits [Remote host closed the connection]
01:55:14phuzion quits [Quit: No Ping reply in 180 seconds.]
01:55:14fionera quits [Quit: No Ping reply in 180 seconds.]
01:55:14cm quits [Remote host closed the connection]
01:55:15MrRadar quits [Remote host closed the connection]
01:55:15lumidify quits [Remote host closed the connection]
01:55:15@JAA quits [Remote host closed the connection]
01:55:15xkey quits [Remote host closed the connection]
01:55:15ats quits [Remote host closed the connection]
01:55:15dave quits [Remote host closed the connection]
01:55:15Elizabeth quits [Remote host closed the connection]
01:55:15JTL quits [Remote host closed the connection]
01:55:15automato83 quits [Remote host closed the connection]
01:55:15whoami quits [Remote host closed the connection]
01:55:15omni quits [Remote host closed the connection]
01:55:15SketchCow quits [Remote host closed the connection]
01:55:15chrismeller36 is now known as chrismeller3
01:55:15Ryz6 is now known as Ryz
01:55:15TastyWiener958 is now known as TastyWiener95
01:55:17whoami_ is now known as whoami
01:55:21Elizabeth (Elizabeth) joins
01:55:21JAA (JAA) joins
01:55:21@ChanServ sets mode: +o JAA
01:55:24Billy549 (Billy549) joins
01:55:50xkey (xkey) joins
01:56:06JensRex (JensRex) joins
01:56:11phuzion (phuzion) joins
01:56:18aismallard joins
01:56:58fuzzy8021 (fuzzy8021) joins
02:00:02lumidify (lumidify) joins
02:00:21MrRadar (MrRadar) joins
02:18:20qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins
02:28:59etnguyen03 quits [Ping timeout: 252 seconds]
02:35:45Ruthalas5 quits [Read error: Connection reset by peer]
02:36:07Ruthalas5 (Ruthalas) joins
02:49:03etnguyen03 (etnguyen03) joins
02:59:04killsushi joins
03:39:53DogsRNice_ joins
03:41:34DogsRNice quits [Ping timeout: 258 seconds]
03:45:31railen63 joins
03:49:34railen63 quits [Remote host closed the connection]
03:49:48railen63 joins
04:03:30DogsRNice_ quits [Read error: Connection reset by peer]
04:07:50railen69 joins
04:08:49<vokunal|m>On the youtube 144p idea: for a while, yt-dlp has had a worstvideo setting and a bestaudio setting, which I used to use to make sure I at least had the video in some quality while the audio was still perfectly usable. Might be interesting if this idea gets tossed around a bit more
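A minimal sketch of the format selection mentioned above, assuming yt-dlp's `-f` format selector syntax, where `worstvideo+bestaudio` pairs the lowest-quality video stream with the best audio stream (merging the two normally needs ffmpeg). The helper name `build_ytdlp_command`, the output template, and the example URL are hypothetical and only build the command line; nothing is downloaded here.

```python
import shlex


def build_ytdlp_command(url: str, output_template: str = "%(id)s.%(ext)s") -> list[str]:
    """Build an argv list asking yt-dlp for worst video plus best audio.

    Hypothetical helper for illustration; it does not run yt-dlp itself.
    """
    return [
        "yt-dlp",
        "-f", "worstvideo+bestaudio",  # lowest-quality video, best available audio
        "-o", output_template,         # output filename template (assumed default)
        url,
    ]


if __name__ == "__main__":
    # Print a shell-quoted command that could be pasted into a terminal.
    print(shlex.join(build_ytdlp_command("https://www.youtube.com/watch?v=EXAMPLE")))
```

The list could be handed to `subprocess.run` on a machine with yt-dlp installed; whether the resulting files are small enough to matter for a 144p-style archive is a separate question.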
04:09:55leo60228 quits [Client Quit]
04:10:39leo60228 (leo60228) joins
04:11:05railen63 quits [Ping timeout: 258 seconds]
04:23:46<flashfire42|m>Donate a bunch to IA and suggest good channels to the current YouTube archival stuff
04:32:21<fireonlive>anyone have a few million burning a hole in pockets
04:32:31BlueMaxima quits [Read error: Connection reset by peer]
04:50:17etnguyen03 quits [Client Quit]
04:57:17cobertos joins
05:13:37Island quits [Read error: Connection reset by peer]
05:57:57nepeat quits [Client Quit]
05:59:09nepeat (nepeat) joins
06:12:53<h2ibot>Flashfire42 edited List of websites excluded from the Wayback Machine (+24): https://wiki.archiveteam.org/?diff=50288&oldid=50287
06:42:54spirit joins
06:44:57trainingdata joins
06:45:57tertu quits [Ping timeout: 258 seconds]
06:47:27tertu (tertu) joins
06:51:33W7RFa6AbNFz quits [Client Quit]
07:00:07nfriedly quits [Remote host closed the connection]
07:00:08<h2ibot>JAABot edited List of websites excluded from the Wayback Machine (+0): https://wiki.archiveteam.org/?diff=50289&oldid=50288
07:05:02Unholy236131 quits [Remote host closed the connection]
07:06:35Unholy236131 (Unholy2361) joins
07:20:01<that_lurker>twitter rebranding as x will break a metric fuckton of links and embeds everywhere unless they still keep twitter.com and just redirect.
07:20:02railen69 quits [Remote host closed the connection]
07:21:20railen63 joins
07:21:26<DigitalDragons>twitter project time?
07:21:37<DigitalDragons>or is it still locked down to accounts only
07:23:45<that_lurker>still locked down to accounts only
07:24:15<DigitalDragons>ugh
07:24:35Arcorann (Arcorann) joins
07:25:47<that_lurker>good thing is that they seem to only be doing a domain swap to x.com, so everything should maybe hopefully, if the stars are aligned, stay somewhat the same
07:31:14<DigitalDragons>I assume that they'll probably only use x.com for the frontend and keep twitter.com in the backend
07:31:35<DigitalDragons>(like how discord's cdn is still on discordapp.com and such)
07:31:44<DigitalDragons>i doubt they have the dev bandwidth to do a full domain swap
07:32:48<that_lurker>we shall hope
07:48:41Sennaton joins
07:50:55<Sennaton>New here, hello.
07:53:14<that_lurker>Sennaton: Hello. You should also join #archiveteam-ot for somewhat off topic conversations
07:53:34<Sennaton>K, thanks.
07:54:20Ruthalas5 quits [Client Quit]
07:54:20nepeat quits [Client Quit]
07:54:20railen63 quits [Remote host closed the connection]
07:54:20cobertos quits [Remote host closed the connection]
07:54:22nepeat_ (nepeat) joins
07:54:23railen63 joins
07:54:27cobertos joins
07:54:31Ruthalas5 (Ruthalas) joins
07:59:05Sennaton quits [Remote host closed the connection]
07:59:23Sennaton joins
08:00:22Sennaton quits [Remote host closed the connection]
08:00:39Sennaton joins
08:02:59trainingdata quits [Remote host closed the connection]
08:03:33Sennaton quits [Remote host closed the connection]
08:03:49Sennaton joins
08:04:10Sennaton quits [Remote host closed the connection]
08:04:31Sennaton joins
08:08:16railen63 quits [Remote host closed the connection]
08:08:16cobertos quits [Remote host closed the connection]
08:08:16Sennaton quits [Remote host closed the connection]
08:08:16qwertyasdfuiopghjkl quits [Remote host closed the connection]
08:08:18railen63 joins
08:08:25Sennaton joins
08:08:27cobertos joins
08:21:52Alozuer0900 joins
08:22:19trainingdata joins
08:23:00Alozuer0900 quits [Remote host closed the connection]
08:56:03<trainingdata>I run an academic open large language model project https://hplt-project.org/ and am looking for more training data. We have 10 petabytes of spinning disks attached to high-performance compute and a deal with the Internet Archive for 7 petabytes of WARC, mainly WIDE*. While I appreciate that archivebot_go has publicly downloadable WARC, is it
08:56:04<trainingdata>possible to get access to Archive Team: URLs WARCs? For example https://archive.org/download/archiveteam_urls_20230720203029_3f55fb2a is not downloadable.
08:58:24<thuban>arkiver: i think this is your area ^
09:20:26Ketchup901 quits [Ping timeout: 245 seconds]
09:20:56Ketchup901 (Ketchup901) joins
09:23:46sec^nd quits [Ping timeout: 245 seconds]
09:26:45sec^nd (second) joins
09:43:16nfriedly joins
10:00:01railen63 quits [Remote host closed the connection]
10:00:19railen63 joins
10:29:52qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins
10:45:05Sennaton quits [Remote host closed the connection]
11:02:25railen63 quits [Remote host closed the connection]
11:02:33railen63 joins
11:58:42W7RFa6AbNFz joins
11:59:19W7RFa6AbNFz quits [Remote host closed the connection]
11:59:29W7RFa6AbNFz joins
12:33:21rageear joins
12:34:53rageear_ joins
12:38:01rageear quits [Ping timeout: 265 seconds]
12:40:11<pabs>I sometimes need to parse WARCs to check what was missed by AB jobs, have been resorting to hacky shell so far but want to do something better
12:40:42<pabs>what libraries are recommended for WARC parsing? preferably with Python bindings
12:42:02<Maakuth|m>pabs, https://wiki.archiveteam.org/index.php/The_WARC_Ecosystem are you aware of this page?
12:43:08<pabs>nope, thanks
12:46:02<pabs>hmm, lots of unmaintained stuff
12:48:28<Maakuth|m>warcat seems promising, even though only one author
12:49:31<@JAA>warcio is acceptable for WARC parsing/reading, just don't ever write WARCs with it.
12:49:46etnguyen03 (etnguyen03) joins
12:50:47<@JAA>I've been working (on and off) on a new Python package with a more solid core, but it's not usable yet.
12:56:15AmAnd0A quits [Ping timeout: 258 seconds]
12:56:52AmAnd0A joins
13:03:16thenes quits [Remote host closed the connection]
13:03:17BigBrain quits [Remote host closed the connection]
13:03:41BigBrain (bigbrain) joins
13:03:44<pabs>the reason I wanted this is to better automate what I did today: discovering open directory indexes/trees that were missed and/or whose contents were partially missed
13:03:52<pabs>anything like that exist yet?
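One way the check described above could be sketched, using only the Python standard library: take the HTML of a captured directory index, collect the entries it links to, and report those that are absent from the set of archived URLs. This is an illustration under stated assumptions, not an existing tool. The names `LinkCollector` and `missed_links` are hypothetical, the set of archived URLs is assumed to have been extracted from the WARC beforehand (for example with a reader such as warcio), and the filtering rules assume a typical Apache/nginx-style listing.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def missed_links(index_url, index_html, archived_urls):
    """Return listing entries under index_url that are not in archived_urls.

    index_url is assumed to end with "/" as directory indexes usually do.
    """
    parser = LinkCollector()
    parser.feed(index_html)
    expected = set()
    for href in parser.hrefs:
        # Resolve relative links and drop any #fragment.
        absolute = urldefrag(urljoin(index_url, href)).url
        # Keep only entries below the index itself; this skips the
        # parent-directory link and column-sorting links such as "?C=N;O=D".
        if (absolute.startswith(index_url)
                and absolute != index_url
                and "?" not in absolute):
            expected.add(absolute)
    return sorted(expected - set(archived_urls))


if __name__ == "__main__":
    html = ('<a href="?C=N;O=D">Name</a><a href="/">Parent Directory</a>'
            '<a href="a.txt">a.txt</a><a href="sub/">sub/</a>')
    archived = {"http://example.org/pub/a.txt"}
    for url in missed_links("http://example.org/pub/", html, archived):
        print(url)
```

Running it over every captured page whose URL ends in `/` would give a rough list of subdirectories and files to re-queue; pages that only look like directory listings would produce false positives.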
13:03:59thenes (thenes) joins
13:19:27jacksonchen666 (jacksonchen666) joins
13:22:23omg joins
13:22:39omg leaves
13:23:41jacksonchen666 quits [Client Quit]
13:25:53jacksonchen666 (jacksonchen666) joins
13:53:45Arcorann quits [Ping timeout: 258 seconds]
14:05:57rageear_ quits [Remote host closed the connection]
14:05:57railen63 quits [Remote host closed the connection]
14:05:57W7RFa6AbNFz quits [Remote host closed the connection]
14:05:57trainingdata quits [Remote host closed the connection]
14:05:57qwertyasdfuiopghjkl quits [Remote host closed the connection]
14:05:57W7RFa6AbNFz joins
14:06:03railen63 joins
14:06:06rageear_ joins
14:11:21Perk joins
14:13:54W7RFa6AbNFz_ joins
14:13:55miabbott_ joins
14:14:07railen69 joins
14:14:16railen63 quits [Remote host closed the connection]
14:14:18rageear_ quits [Remote host closed the connection]
14:14:18W7RFa6AbNFz quits [Remote host closed the connection]
14:14:37qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins
14:17:09Megame (Megame) joins
14:19:41Ruthalas54 (Ruthalas) joins
14:19:53Ruthalas5 quits [Client Quit]
14:19:53Ruthalas54 is now known as Ruthalas5
14:22:36bf_ joins
14:24:44trainingdata joins
14:31:37trainingdata quits [Remote host closed the connection]
14:31:38qwertyasdfuiopghjkl quits [Remote host closed the connection]
14:32:02qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins
14:39:56datechnoman quits [Ping timeout: 252 seconds]
14:41:41jacksonchen666 quits [Ping timeout: 245 seconds]
14:42:19Island joins
14:42:49jacksonchen666 (jacksonchen666) joins
14:57:38datechnoman (datechnoman) joins
15:28:38killsushi quits [Ping timeout: 265 seconds]
15:34:11imer quits [Quit: Oh no]
15:34:58imer (imer) joins
15:37:13imer quits [Client Quit]
15:40:07imer (imer) joins
15:51:10random joins
15:51:16thenes quits [Ping timeout: 245 seconds]
15:52:50<random>Hello, anyone have the Forward DNS (FDNS) of Project Sonar saved?
15:53:05thenes (thenes) joins
15:55:23katocala quits [Remote host closed the connection]
16:00:06random quits [Remote host closed the connection]
16:29:11BigBrain quits [Ping timeout: 245 seconds]
16:29:11Ketchup901 quits [Ping timeout: 245 seconds]
16:29:43Ketchup901 (Ketchup901) joins
16:29:51BigBrain (bigbrain) joins
16:43:26katocala joins
17:21:16BigBrain quits [Ping timeout: 245 seconds]
17:53:37spirit quits [Client Quit]
17:59:17trainingdata joins
18:33:28IDK quits [Client Quit]
18:33:28yts98 leaves
18:33:36yts98 joins
18:41:16bf_ quits [Remote host closed the connection]
18:45:07bf_ joins
18:47:34spirit joins
18:58:33<h2ibot>Exorcism edited DokuWiki (+137): https://wiki.archiveteam.org/?diff=50290&oldid=49786
19:19:15mrclon joins
19:29:47driib quits [Quit: The Lounge - https://thelounge.chat]
19:30:50driib (driib) joins
19:56:19adamus1red quits [Quit: SigTerm]
19:58:34adamus1red (adamus1red) joins
20:01:15railen63 joins
20:04:03railen69 quits [Ping timeout: 258 seconds]
20:10:29Megame quits [Ping timeout: 252 seconds]
20:37:08qwertyasdfuiopghjkl quits [Remote host closed the connection]
20:37:38fuzzy8021 quits [Read error: Connection reset by peer]
20:39:58fuzzy8021 (fuzzy8021) joins
20:44:02superkuh__ quits [Ping timeout: 252 seconds]
20:58:40DogsRNice joins
21:00:23ThetaDev quits [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
21:00:31ThetaDev joins
21:00:58DogsRNice_ joins
21:01:14Megame (Megame) joins
21:02:23Megame1_ (Megame) joins
21:04:34DogsRNice quits [Ping timeout: 265 seconds]
21:06:09Megame quits [Ping timeout: 258 seconds]
21:17:06nicolas17 joins
21:38:07michaelblob quits [Read error: Connection reset by peer]
22:05:55AmAnd0A quits [Read error: Connection reset by peer]
22:06:11AmAnd0A joins
22:12:21JTL1 is now known as JTL
22:16:59etnguyen03 quits [Ping timeout: 252 seconds]
22:34:13Megame1_ quits [Client Quit]
22:38:39msrn_ quits [Quit: ZNC - http://znc.in]
22:45:08superkuh joins
22:47:25mikael joins
22:50:01etnguyen03 (etnguyen03) joins
23:18:24tzt quits [Remote host closed the connection]
23:18:45tzt (tzt) joins
23:19:30W7RFa6AbNFz_ quits [Read error: Connection reset by peer]
23:19:53W7RFa6AbNFz_ joins
23:36:29Hackerpcs quits [Quit: Hackerpcs]
23:38:22Hackerpcs (Hackerpcs) joins
23:41:38DogsRNice_ quits [Ping timeout: 265 seconds]
23:52:34fangfufu_ quits [Remote host closed the connection]
23:52:42fangfufu joins
23:53:49IDK (IDK) joins
23:58:01qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins