00:02:25monoxane8 (monoxane) joins
00:03:09monoxane quits [Ping timeout: 272 seconds]
00:03:09monoxane8 is now known as monoxane
00:03:47etnguyen03 quits [Ping timeout: 272 seconds]
00:08:30ealia joins
00:09:47ealia quits [Remote host closed the connection]
00:11:41etnguyen03 (etnguyen03) joins
00:15:55monoxane quits [Client Quit]
00:16:18Megame (Megame) joins
00:23:07kitonthenet joins
00:24:03tzt quits [Ping timeout: 272 seconds]
00:25:41HP_Archivist quits [Read error: Connection reset by peer]
00:27:44kitonthenet quits [Ping timeout: 265 seconds]
00:27:49tzt (tzt) joins
00:28:29systwi quits [Ping timeout: 272 seconds]
00:34:00<h2ibot>JustAnotherArchivist edited Blogger (+182, Fix source and tracker links, update status): https://wiki.archiveteam.org/?diff=51174&oldid=51148
00:37:09monoxane (monoxane) joins
00:45:42systwi (systwi) joins
01:00:37atphoenix_ quits [Remote host closed the connection]
01:01:20atphoenix_ (atphoenix) joins
01:18:31etnguyen03 quits [Ping timeout: 272 seconds]
01:19:18kitonthenet joins
01:36:33etnguyen03 (etnguyen03) joins
01:49:25kitonthenet quits [Ping timeout: 265 seconds]
01:56:04Megame quits [Client Quit]
01:58:44DogsRNice_ quits [Read error: Connection reset by peer]
01:58:58DogsRNice joins
02:09:06<phuz-test>Anyone wanna archive the Questionable Content forums? (https://forums.questionablecontent.net) It's a webforum with approximately 900k posts, about 20 years old. Mostly webcomic related content.
02:09:09phuz-test is now known as phuzion
02:09:59<nicolas17>why though? is it at risk of dying?
02:10:40<phuzion>Yeah, the comic's server is showing 503s a lot, and it's likely that they're going to move to a new host. The forum was locked some time ago, and no new posts or registrations are allowed
02:10:49<phuzion>Locked as of Jan 1 this year.
02:11:12<nicolas17>oh :| I didn't know of that
02:11:40<phuzion>The current speculation is that Jeph (comic author) and his tech team might opt not to migrate the forums because of the additional complexity in doing so
02:11:53nicolas17 hasn't even read the comics in a few years
02:14:15etnguyen03 quits [Ping timeout: 272 seconds]
02:14:48dumbgoy joins
02:15:24Pedrosso didn't know of its existence
02:15:41Pedrosso wants it saved anyway, of course
02:18:03dumbgoy__ quits [Ping timeout: 272 seconds]
02:21:36<pokechu22>phuzion: I've queued an archivebot job for them - 900k posts is large but should be doable
02:22:36<phuzion>pokechu22: the forums are behind cloudflare, so I'd check to make sure that it's working properly at some point.
02:23:52etnguyen03 (etnguyen03) joins
02:25:55CandidSparrow2 joins
02:26:53<pokechu22>Yeah - there's other stuff running in archivebot right now so it's not started yet, but if it finishes abnormally quickly then I'll know it's cloudflare at least (not sure what we could do about it for something that large though)
02:28:11CandidSparrow quits [Ping timeout: 272 seconds]
02:28:11CandidSparrow2 is now known as CandidSparrow
02:37:18<@JAA>ohai
02:37:39<@JAA>Looks like their Buttflare config isn't very aggressive, so should be possible even if it doesn't work with AB.
02:40:14BlueMaxima joins
02:42:11<Pedrosso>should DeviantArt's sitemap be grabbed proactively? I'm surprised it hasn't been hidden from public yet and it's very big
02:43:11<@JAA>Seems to be running fine, albeit with timeouts and general sluggishness.
02:54:29dumbgoy_ joins
02:57:57dumbgoy quits [Ping timeout: 272 seconds]
03:03:57kitonthe1et joins
03:19:17<phuzion>JAA: Yeah that server is creaky right now. The main comic page doesn't load about 1/3 of the time, it seems.
03:24:24systwi_ joins
03:28:21Barto quits [Ping timeout: 272 seconds]
03:34:03kitonthe1et quits [Ping timeout: 272 seconds]
03:35:44<@JAA>Barto pls
03:35:47<@JAA>Worst SLA ever
03:35:52<fireonlive>xD
03:38:56BlueMaxima_ joins
03:39:25CraftByte quits [Client Quit]
03:39:25BlueMaxima quits [Remote host closed the connection]
03:39:25atphoenix_ quits [Remote host closed the connection]
03:39:25DigitalDragons quits [Client Quit]
03:39:25AK quits [Client Quit]
03:39:25CandidSparrow quits [Client Quit]
03:39:25DogsRNice quits [Remote host closed the connection]
03:39:26DogsRNice joins
03:39:30CraftByte (DragonSec|CraftByte) joins
03:39:36CandidSparrow joins
03:39:37DigitalDragons (DigitalDragons) joins
03:39:51atphoenix_ (atphoenix) joins
03:39:51AK (AK) joins
04:04:12kitonthenet joins
04:07:08dumbgoy__ joins
04:10:04dumbgoy_ quits [Ping timeout: 265 seconds]
04:12:58kitonthenet quits [Ping timeout: 265 seconds]
04:27:15dumbgoy__ quits [Ping timeout: 272 seconds]
04:55:17BlueMaxima_ quits [Read error: Connection reset by peer]
05:01:39imer quits [Remote host closed the connection]
05:02:49imer (imer) joins
05:03:21Ruthalas59 quits [Ping timeout: 272 seconds]
05:05:46imer quits [Remote host closed the connection]
05:07:07imer (imer) joins
05:10:10imer quits [Remote host closed the connection]
05:11:24imer (imer) joins
05:12:38imer quits [Remote host closed the connection]
05:41:05<project10>JAA: are the logs from an AB job saved/accessible anywhere?
05:45:31imer (imer) joins
05:46:04<@JAA>project10: Yes, they're in the *-meta.warc.gz file. For aborted or crashed jobs, there's a -wpull.log.gz file instead, though that isn't indexed by the viewer; it should normally be in the same item as the *.json file.
05:46:43<project10>cool, I kinda had an inkling it might be saved in the data uploaded to IA itself. Throw nothing away and all that
05:47:20<fireonlive>=]
05:47:34<@JAA>Yeah, we do currently throw away the DB file though, which has some data that's hard to extract otherwise and is much more suitable for many analysis things.
05:48:17<fireonlive>ah like a quick sweep for failed urls or certain outlinks i suppose
05:48:19<@JAA>There's a... uh... three years old issue about it: https://github.com/ArchiveTeam/ArchiveBot/issues/465
05:49:19<@JAA>Yeah. And some links get indexed by wpull but silently ignored. They only appear in the raw responses and in the DB.
05:50:19Barto (Barto) joins
05:50:20<fireonlive>ahh
05:55:08c3manu (c3manu) joins
06:08:08Wohlstand quits [Client Quit]
06:14:17etnguyen03 quits [Client Quit]
06:14:44Island quits [Read error: Connection reset by peer]
06:23:06c3manu quits [Remote host closed the connection]
06:28:02Ruthalas59 (Ruthalas) joins
06:29:02DogsRNice quits [Read error: Connection reset by peer]
06:33:00hitgrr8 joins
06:45:19Earendil7 quits [Ping timeout: 272 seconds]
06:45:52Earendil7 (Earendil7) joins
07:09:06fireonlive quits [Killed (NickServ (GHOST command used by fireonlive5))]
07:09:49fireonlive (fireonlive) joins
07:14:50sec^nd quits [*.net *.split]
07:31:39Arcorann (Arcorann) joins
07:35:20sonick (sonick) joins
07:42:12<sonick>Has anyone already mentioned dotup.org and its light version, light.dotup.org, the website that will be shut down on November 30?
07:43:00<sonick>These sites are relatively simple and could be done by AB.
07:45:02<sonick>The light version and the normal version seem to have different size limits for uploading and different uploaded content.
08:00:21nfriedly quits [Remote host closed the connection]
08:18:04Dango360 quits [Read error: Connection reset by peer]
08:35:54nicolas17 quits [Ping timeout: 265 seconds]
08:39:34nicolas17 joins
08:41:57<pabs>sonick: JAA did the non-lite version on 20231103
08:42:37<pabs>https://archive.fart.website/archivebot/viewer/?q=dotup.org
08:44:59<pabs>stuck light one in AB now
08:45:32<pabs>hmm, file uploads are still enabled
08:46:03<pabs>it's in Deathwatch so I guess someone will do another save near the deadline
08:59:34<sonick>ok, thanks.
09:24:40Vokun quits [*.net *.split]
09:24:41that_lurker|m quits [*.net *.split]
09:24:41M--mlv|m quits [*.net *.split]
09:24:41hillow596|m quits [*.net *.split]
09:24:41sonst-was|m quits [*.net *.split]
09:24:41qq44|m quits [*.net *.split]
09:24:41Misty|m quits [*.net *.split]
09:24:41username675f|m quits [*.net *.split]
09:24:41AntoninDelFabbro|m quits [*.net *.split]
09:24:41Peetz0r|m quits [*.net *.split]
09:24:41marius851000 quits [*.net *.split]
09:24:41EmeraldSnorlax|m quits [*.net *.split]
09:24:41ram|m quits [*.net *.split]
09:24:41kaz__|m quits [*.net *.split]
09:24:41Passiing|m quits [*.net *.split]
09:24:41noxious quits [*.net *.split]
09:24:41EvanBoehs|m quits [*.net *.split]
09:24:41Maakuth|m quits [*.net *.split]
09:24:41trumad|m quits [*.net *.split]
09:24:41NickS|m quits [*.net *.split]
09:24:41haha-whered-it-go|m quits [*.net *.split]
09:24:41joepie91|m quits [*.net *.split]
09:24:41yetanotherarchiver|m quits [*.net *.split]
09:24:41gwetchen|m quits [*.net *.split]
09:24:42superusercode quits [*.net *.split]
09:24:42noobirc|m quits [*.net *.split]
09:24:42lasdkfj|m quits [*.net *.split]
09:24:42GRBaset quits [*.net *.split]
09:24:42nyuuzyou quits [*.net *.split]
09:24:42Cydog|m quits [*.net *.split]
09:24:42will|m quits [*.net *.split]
09:24:42JC|m quits [*.net *.split]
09:24:42pannekoek11|m quits [*.net *.split]
09:24:42jwoglom|m quits [*.net *.split]
09:24:42gungagungagunga|m quits [*.net *.split]
09:24:42jevinskie quits [*.net *.split]
09:24:42coro quits [*.net *.split]
09:24:43t3chler|m quits [*.net *.split]
09:24:43qyxojzh|m quits [*.net *.split]
09:24:43hlgs|m quits [*.net *.split]
09:24:43thermospheric quits [*.net *.split]
09:24:43akaibu|m quits [*.net *.split]
09:24:43cmostracker|m quits [*.net *.split]
09:24:43Max|m12 quits [*.net *.split]
09:24:43Video quits [*.net *.split]
09:24:43nosamu|m quits [*.net *.split]
09:24:43masterx244|m quits [*.net *.split]
09:24:43voltagex|m quits [*.net *.split]
09:24:43mikolaj|m quits [*.net *.split]
09:24:43iCesenberk|m quits [*.net *.split]
09:24:43octylFractal|m quits [*.net *.split]
09:24:43wrangle|m quits [*.net *.split]
09:24:43tech234a|m quits [*.net *.split]
09:24:44Roki_100|m quits [*.net *.split]
09:24:44Ruk8 quits [*.net *.split]
09:24:44finalti|m quits [*.net *.split]
09:24:44jackt1365|m quits [*.net *.split]
09:24:44saouroun|m quits [*.net *.split]
09:24:44moe-a-m|m quits [*.net *.split]
09:24:44schwarzkatz|m quits [*.net *.split]
09:24:44alexshpilkin quits [*.net *.split]
09:24:44madpro|m quits [*.net *.split]
09:24:45Minkafighter|m quits [*.net *.split]
09:24:45yzqzss quits [*.net *.split]
09:24:45x9fff00 quits [*.net *.split]
09:24:45Exorcism quits [*.net *.split]
09:24:45phaeton quits [*.net *.split]
09:24:45Tom|m1 quits [*.net *.split]
09:24:45vexr quits [*.net *.split]
09:24:45Froxcey|m quits [*.net *.split]
09:24:45manu|m quits [*.net *.split]
09:24:45CrispyAlice2 quits [*.net *.split]
09:24:45s-crypt|m quits [*.net *.split]
09:24:45flashfire42|m quits [*.net *.split]
09:24:45Fletcher quits [*.net *.split]
09:24:45ragu|m quits [*.net *.split]
09:24:45MinePlayersPEMyNey|m quits [*.net *.split]
09:24:45Thibaultmol quits [*.net *.split]
09:24:45nstrom|m quits [*.net *.split]
09:24:45Hans5958 quits [*.net *.split]
09:24:45theblazehen|m quits [*.net *.split]
09:24:45rewby|m quits [*.net *.split]
09:24:45mpeter|m quits [*.net *.split]
09:24:45tomodachi94 quits [*.net *.split]
09:24:45audrooku|m quits [*.net *.split]
09:24:45xxia|m quits [*.net *.split]
09:24:45britmob|m quits [*.net *.split]
09:24:45andrewvieyra|m quits [*.net *.split]
09:24:45DigitalDragon quits [*.net *.split]
09:24:45mind_combatant quits [*.net *.split]
09:24:45@Sanqui|m quits [*.net *.split]
09:24:45igneousx quits [*.net *.split]
09:24:45Ajay quits [*.net *.split]
09:28:19marius851000 joins
09:28:19sonst-was|m joins
09:28:19that_lurker|m joins
09:28:19alexshpilkin joins
09:28:19qq44|m joins
09:28:19hillow596|m joins
09:28:19Misty|m joins
09:28:19username675f|m joins
09:28:19AntoninDelFabbro|m joins
09:28:19EmeraldSnorlax|m joins
09:28:19Peetz0r|m joins
09:28:19ram|m joins
09:28:19Passiing|m joins
09:28:19kaz__|m joins
09:28:19yzqzss joins
09:28:19Vokun joins
09:28:19tomodachi94 joins
09:28:19moe-a-m|m joins
09:28:19Sanqui|m joins
09:28:19Thibaultmol joins
09:28:19Tom|m1 joins
09:28:19Exorcism joins
09:28:19nstrom|m joins
09:28:19CrispyAlice2 joins
09:28:19Max|m12 joins
09:28:19thermospheric joins
09:28:19DigitalDragon joins
09:28:19GRBaset joins
09:28:19Froxcey|m joins
09:28:19JC|m joins
09:28:19x9fff00 joins
09:28:19iCesenberk|m joins
09:28:19phaeton joins
09:28:19octylFractal|m joins
09:28:19Roki_100|m joins
09:28:19schwarzkatz|m joins
09:28:19Minkafighter|m joins
09:28:19britmob|m joins
09:28:19mind_combatant joins
09:28:19Hans5958 joins
09:28:19xxia|m joins
09:28:19flashfire42|m joins
09:28:19qyxojzh|m joins
09:28:19noxious joins
09:28:19EvanBoehs|m joins
09:28:19coro joins
09:28:19Video joins
09:28:19yetanotherarchiver|m joins
09:28:19noobirc|m joins
09:28:19gungagungagunga|m joins
09:28:19cmostracker|m joins
09:28:19trumad|m joins
09:28:19Cydog|m joins
09:28:19nosamu|m joins
09:28:19s-crypt|m joins
09:28:19jwoglom|m joins
09:28:19superusercode joins
09:28:19nyuuzyou joins
09:28:19vexr joins
09:28:19wrangle|m joins
09:28:19will|m joins
09:28:19jevinskie joins
09:28:19gwetchen|m joins
09:28:19voltagex|m joins
09:28:19Fletcher joins
09:28:19manu|m joins
09:28:19finalti|m joins
09:28:19lasdkfj|m joins
09:28:19masterx244|m joins
09:28:19NickS|m joins
09:28:20akaibu|m joins
09:28:20igneousx joins
09:28:20audrooku|m joins
09:28:20Ajay joins
09:28:20ragu|m joins
09:28:20Maakuth|m joins
09:28:20madpro|m joins
09:28:20jackt1365|m joins
09:28:20t3chler|m joins
09:28:20haha-whered-it-go|m joins
09:28:20saouroun|m joins
09:28:20andrewvieyra|m joins
09:28:20theblazehen|m joins
09:28:20hlgs|m joins
09:28:20mpeter|m joins
09:28:20mikolaj|m joins
09:28:20Ruk8 joins
09:28:20MinePlayersPEMyNey|m joins
09:28:20tech234a|m joins
09:28:20pannekoek11|m joins
09:28:20joepie91|m joins
09:28:20rewby|m joins
10:00:03Bleo182 quits [Client Quit]
10:01:23Bleo182 joins
10:47:21s-crypt2 is now known as s-crypt
10:52:35s-crypt|m is now known as s-crypt|m|m
10:56:45MetaNova quits [Ping timeout: 272 seconds]
11:01:29MetaNova (MetaNova) joins
11:25:55atphoenix_ quits [Remote host closed the connection]
11:26:18atphoenix_ (atphoenix) joins
11:34:24DigitalDragons quits [Client Quit]
11:34:38DigitalDragons (DigitalDragons) joins
11:48:25nfriedly joins
11:51:51qwertyasdfuiopghjkl quits [Remote host closed the connection]
11:51:52DigitalDragons quits [Client Quit]
11:52:06DigitalDragons (DigitalDragons) joins
12:23:24yzqzss quits [Client Quit]
12:23:40yzqzss (yzqzss) joins
12:28:02razul quits [Quit: Bye -]
12:29:14razul joins
12:34:55Arcorann quits [Ping timeout: 272 seconds]
12:53:15kitonthenet joins
12:54:03Megame (Megame) joins
12:58:21kitonthenet quits [Ping timeout: 272 seconds]
12:59:04kitonthenet joins
13:03:40kitonthenet quits [Ping timeout: 265 seconds]
13:37:20kitonthenet joins
13:42:03kitonthenet quits [Ping timeout: 272 seconds]
13:55:06Chris5010 (Chris5010) joins
14:26:22sec^nd (second) joins
14:30:12qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins
14:38:29kitonthenet joins
14:38:46etnguyen03 (etnguyen03) joins
14:47:35kitonthenet quits [Ping timeout: 265 seconds]
15:01:02kitonthe1et joins
15:05:35Exorcism quits [Changing host]
15:05:35Exorcism (exorcism) joins
15:08:49kitonthe1et quits [Ping timeout: 272 seconds]
15:46:01<@JAA>sonick, pabs: That job only grabbed the most recent files because the pagination's limited, but I intend to do another run bruteforcing the older files (extensions have to be guessed).
15:46:04M--mlv|m joins
15:57:23<@JAA>If anyone is able to access https://javiermilei.com/ , a grab-site crawl would be great. I tried from machines in 9 countries and got blocked everywhere. It might need to be run in Argentina.
15:57:39HP_Archivist (HP_Archivist) joins
15:58:16<anarchat>i'm blocked by CF there as well
16:00:17ehmry joins
16:03:28qwertyasdfuiopghjkl quits [Ping timeout: 265 seconds]
16:06:04Island joins
16:19:14qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins
16:24:13bf_ joins
16:32:33bf_ quits [Remote host closed the connection]
16:41:10<nicolas17>JAA: yeah works for me, I think he had some kind of raffle a while ago so they had to stop people using bots from foreign IPs?
16:41:40<nicolas17>give me a tl;dr for grab-site
16:51:53<@JAA>nicolas17: Nice. The way I run grab-site is without the web interface and stuff, one container per target site: https://gitea.arpa.li/JustAnotherArchivist/grab-site-docker
16:52:59bf_ joins
16:54:28<@JAA>Since I can't look at the site, no idea what options are required here. :-|
17:03:27dumbgoy__ joins
17:07:52<that_lurker>Do you use docker exec for ignores and such or just yolo it without
17:10:48<@JAA>nano as root
17:11:01<@JAA>No risk, no fun.
17:11:06bf_ quits [Remote host closed the connection]
17:11:34<@JAA>But since it's a mount, it should be fine.
17:12:11<fireonlive>no vim for JAA :o?
17:12:18sonick quits [Client Quit]
17:12:41<project10>no emacs for JAA?
17:13:50<@JAA>I'd normally use a magnetised needle, but that's kind of hard to do remotely.
17:14:01<fireonlive>true true
17:14:09<fireonlive>we need to get you one of those surgeon robots
17:14:11<@JAA>Butterflies would work though, I suppose.
17:14:58bf_ joins
17:15:02<kpcyrd>I wish we could archive the editor jokes eventually
17:16:30@JAA archives kpcyrd.
17:16:39<that_lurker>I like to rearrange the 1s and 0s with an electromagnet. Might edit the text or might kill the disk, but at least I'm not running as root so I'm safe
17:16:44kpcyrd .zip
17:17:55<nicolas17>JAA: do I need to build the container or is it in some repo already?
17:18:17<that_lurker>the build command is in the readme
17:18:43<@JAA>^
17:18:55<@JAA>Haven't pushed it anywhere, no.
17:18:55<nicolas17>yes
17:19:02<nicolas17>that_lurker: I'm asking if I have to :P
17:19:49kitonthenet joins
17:25:16<nicolas17>ugh
17:25:30<nicolas17>JAA: does grab-site run as an unprivileged user inside the container?
17:27:05<nicolas17>[Errno 13] Permission denied: '/data/javiermilei.com-2023-11-20-e125cf6c'
17:27:31kitonthenet quits [Ping timeout: 272 seconds]
17:27:38<@JAA>nicolas17: Yes: https://gitea.arpa.li/JustAnotherArchivist/grab-site-docker/src/commit/398726f73e84233a584fe096d916799fa3c90006/Dockerfile#L48
17:28:05<nicolas17>so I guess I need to make the data dir world writable
17:28:31<nicolas17>RuntimeError: html5-parser and lxml are using different versions of libxml2. This happens commonly when using pip installed versions of lxml. Use pip install --no-binary lxml lxml instead. libxml2 versions: html5-parser: (2, 9, 14) != lxml: (2, 10, 3)
17:28:32<@JAA>Hmm, yeah, there might be room for improvement.
17:28:49<@JAA>Ugh
17:29:14<nicolas17>non-reproducible build tsk tsk
17:29:58<fireonlive>you developers and your chmod 777 😾
17:32:44<@JAA>I'll take a look at the libxml2 issue in a sec.
17:33:58<project10>.
17:34:02Island quits [Remote host closed the connection]
17:34:02<AK>my builds are reproducible, they will fail every time 🤷
17:34:12Island joins
17:36:40<fireonlive>xD
17:37:31<h2ibot>Megame edited Deathwatch (+219, /* 2023 */ Okada Books - Nov 30): https://wiki.archiveteam.org/?diff=51175&oldid=51170
17:38:42bf_ quits [Remote host closed the connection]
17:40:41c3manu (c3manu) joins
17:42:42icedice (icedice) joins
17:43:28bf_ joins
18:00:21scurvy__dog__ joins
18:00:25<@JAA>Is there an equivalent to https://snapshot.debian.org/ for Alpine, such that you can 'install packages as they were at a specific datetime'?
18:02:10<nicolas17>forcing it to alpine 3.13 isn't enough? there's breaking changes to packages within the same alpine version? bleh
18:05:31<kpcyrd>JAA: no, but please let me know if you find one
18:06:21<@JAA>nicolas17: I mean, it might be, but my point is rather about how to do reproducible builds with Alpine.
18:06:35<kpcyrd>sad news: you can't
18:06:38<@JAA>Welp
18:06:55<kpcyrd>the error is likely related to python dependencies and unrelated to alpine tho?
18:07:00<@JAA>Yeah
18:07:10<@JAA>I was thinking more broadly about reproducibility.
18:07:12<nicolas17>https://gitlab.alpinelinux.org/alpine/abuild/-/issues/9996
18:07:56<@JAA>I was not aware they outright delete old packages. Oof.
18:08:17<nicolas17>so does debian
18:08:23<nicolas17>hence having a separate snapshot service :P
18:08:44<kpcyrd>the other problem with alpine is the build environments are not really documented, even if you have all old packages it's difficult to tell which ones you need to pick to re-create the original build environment
18:10:04<kpcyrd>other distros solve this with buildinfo files (the OG sbom basically), but Alpine is also stuck in this apk2-apk3 migration thing
18:10:14<kpcyrd>so they decided against adding buildinfo files to apk2
18:10:25<@JAA>Well, yeah, but snapshot is part of the Debian project. So bit different, I think. (Although I believe snapshot.d.o might sometimes miss things if there are rapid uploads? I've heard something like that at least.)
18:10:54<kpcyrd>the only "proper" archive I'm aware of is https://archive.archlinux.org/
18:11:05<nicolas17>snapshot.debian.org becoming an Official Part of the Project is relatively recent, it used to be snapshot.debian.net
18:11:07<@JAA>Yeah, Arch seems to do a good job at this.
18:11:16<@JAA>Ah, interesting.
18:17:14<@JAA>> The official recommendation is to keep your own mirror / repository with all the specific package and their versions that you may want to use.
18:17:27<@JAA>For Alpine. Ok then...
18:18:03<kpcyrd>🤷
18:19:59<nicolas17>but yeah I bet your problem is not pinning Python dep versions
18:20:49phaeton quits [Changing host]
18:20:49phaeton (phaeton) joins
18:21:18<@JAA>Yeah, but indirectly. grab-site doesn't directly depend on html5-parser or lxml.
18:22:16<nicolas17>or you could push your working image somewhere :p
18:22:40scurvy__dog__ quits [Ping timeout: 265 seconds]
18:25:47etnguyen03 quits [Ping timeout: 272 seconds]
18:26:29bf_ quits [Remote host closed the connection]
18:30:08etnguyen03 (etnguyen03) joins
18:37:24<kpcyrd>JAA: the python ecosystem is very silly compared to other languages. Ideally you would have something like package-lock.json, Cargo.lock or composer.lock that records your dependency graph.
18:38:36<fireonlive>hmm you can pin requirements in the .txt can’t you
18:38:44<fireonlive>but i guess that’s also annoying
18:39:24icedice quits [Client Quit]
18:40:07<kpcyrd>tl;dr "yeah idk lol" https://stackoverflow.com/questions/52665596/equivalent-of-package-json-and-package-lock-json-for-pip
18:42:26<kpcyrd>"python is supposed to be easy, can we have easy dependency management too?" - "we have easy dependency management at home"
18:42:34<kpcyrd>dependency management at home: https://stackoverflow.com/questions/58218592/feature-comparison-between-npm-pip-pipenv-and-poetry-package-managers
18:47:19etnguyen03 quits [Ping timeout: 272 seconds]
18:47:25<fireonlive>😅
18:49:22etnguyen03 (etnguyen03) joins
18:52:28Dango360 (Dango360) joins
19:00:18<nicolas17>kpcyrd: yet pipenv and poetry seem to do exactly what you say?
19:01:00<fireonlive>but who uses those
19:03:47kiska5 quits [Ping timeout: 272 seconds]
19:03:47Ryz quits [Ping timeout: 272 seconds]
19:08:35Megame quits [Client Quit]
19:08:47<@JAA>There's also pip-tools. But I don't disagree.
19:09:32<@JAA>On the other hand, those packages need to be *constantly* updated for bug or security fixes anywhere in the dependency tree, which is also very silly.
19:09:40<@JAA>those package lists*
19:14:11kitonthenet joins
19:14:33etnguyen03 quits [Ping timeout: 272 seconds]
19:20:11kitonthenet quits [Ping timeout: 265 seconds]
19:38:18me joins
19:38:41me quits [Remote host closed the connection]
19:40:32IDK_ joins
19:40:59Ryz (Ryz) joins
19:48:12Ryz quits [Excess Flood]
19:52:30kiska5 joins
19:53:04Ryz (Ryz) joins
19:55:02<fireonlive>hmm yeah
19:55:43nicolas17 quits [Ping timeout: 272 seconds]
19:58:07ScenarioPlanet (ScenarioPlanet) joins
19:58:11nicolas17 joins
20:02:29Gooshka joins
20:02:54<Gooshka>https://www.forbes.ru/biznes/494353-andeks-zadumalsa-o-prodaze-svoego-biznesa-v-izraile - Yandex thinks about selling its business located in Israel.
20:03:19<Gooshka>https://www.golosameriki.com/a/yandex-can-sell-its-entire-business-in-russia/7355003.html - Yandex can sell its entire business in Russia
20:03:20<fireonlive>oh interesting, thanks Gooshka. what sites/fronts does it have there?
20:03:47<Gooshka>I sent some links in AB channel.
20:03:51<fireonlive>oh wow; all of yandex
20:04:00<fireonlive>Gooshka: ah! i missed that. thanks as always :)
20:05:44<Gooshka>https://github.com/yandex/ , https://github.com/yandex-cloud/ , https://huggingface.co/yandex , https://yandex.ru/company/ , https://yandex.ru/legal/ , https://yandex.ru/support/
20:05:49<Gooshka>etc.
20:06:49lumidify_ quits [Quit: leaving]
20:09:37<Gooshka>https://toloka.ai/ , https://toloka.ai/tolokers/ru/ (formerly https://toloka.yandex.ru/ ), has page on WKP: https://en.wikipedia.org/wiki/Toloka .
20:11:27<Gooshka>https://yandex.ru/dev/ - technologies of Yandex.
20:15:15<Gooshka>https://yatalks2023.com/ , https://yatalks.yandex.ru/ , I can't find YaTalks before 2023 on sites like yatalks2023.com, only pages like this: https://yatalks2023.com/2022/ru .
20:15:37<nicolas17>JAA: got a working grab-site yet? :P
20:17:46lumidify (lumidify) joins
20:36:00<Gooshka>https://habr.com/ru/companies/yandex/ - blog of Yandex team.
20:36:50<Gooshka>https://shedevrum.ai/ - AI by Yandex creates beautiful pictures of animals and people.
20:41:35<Gooshka>https://yandex.ru/lab/countries - game in which you guess what country is on photo. It follows Russian laws, so Abkhazia is not part of Georgia according to this. Player 2 is Alisa, AI by Yandex. Some other goods under /lab/ directory.
20:50:50Megame (Megame) joins
21:02:09dumbgoy joins
21:02:16dumbgoy quits [Read error: Connection reset by peer]
21:03:18Gooshka quits [Remote host closed the connection]
21:05:23dumbgoy__ quits [Ping timeout: 272 seconds]
21:05:46DigitalDragons quits [Client Quit]
21:05:46Megame quits [Remote host closed the connection]
21:05:46ScenarioPlanet quits [Remote host closed the connection]
21:05:46Island quits [Remote host closed the connection]
21:05:48Island joins
21:05:50Megame (Megame) joins
21:05:52ScenarioPlanet (ScenarioPlanet) joins
21:05:59DigitalDragons (DigitalDragons) joins
21:15:40dumbgoy joins
21:21:01BlueMaxima joins
21:54:50<fireonlive>hii; so i'm very crudely™ monitoring urls that archivebot hits until something more betterer is in place - so far i'm looking for blogger/blogspot and imgur.. any others I should look for?
21:55:16<fireonlive>imgur because most pipelines just get a 429 from imgur right away
21:55:19<fireonlive>(it seems)
21:57:41bf_ joins
22:00:30ThetaDev_ quits [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
22:00:38ThetaDev joins
22:01:45wickedplayer494 quits [Ping timeout: 272 seconds]
22:02:31wickedplayer494 joins
22:04:32kitonthenet joins
22:06:48<thuban>fireonlive: https://wiki.archiveteam.org/index.php/Category:Projects_requiring_URL_lists mediafire
22:07:07<fireonlive>ah yes :)
22:07:10<fireonlive>thanks
22:08:39bf_ quits [Remote host closed the connection]
22:09:05bf_ joins
22:09:21kitonthenet quits [Ping timeout: 272 seconds]
22:16:20abirkill- (abirkill) joins
22:18:51abirkill quits [Ping timeout: 272 seconds]
22:18:51abirkill- is now known as abirkill
22:25:56icedice (icedice) joins
22:28:39c3manu quits [Client Quit]
22:35:53jacksonchen666 (jacksonchen666) joins
22:37:04<@JAA>fireonlive: Telegram, perhaps?
22:37:34<@JAA>You'll want to filter out the share links though.
22:38:26<fireonlive>ah right!
22:43:56<thuban>ah, someone should add the 'do you have a list' template to the telegram wiki page
22:44:49<thuban>idk exactly what the regex would be
22:45:30benjins2_ quits [Read error: Connection reset by peer]
22:48:21<@JAA>The reason it isn't there is that we don't currently have a bot that takes arbitrary URLs and extracts items for the tracker from it, like we do for Imgur and MediaFire.
22:49:22Megame1_ (Megame) joins
22:53:49Megame quits [Ping timeout: 265 seconds]
22:55:56ScenarioPlanet quits [Client Quit]
22:57:55<@JAA>Added it, but we won't be able to make full use of the lists easily yet.
22:58:35<h2ibot>JustAnotherArchivist edited Telegram (+113, Add URL list CTA): https://wiki.archiveteam.org/?diff=51176&oldid=50298
23:03:37ScenarioPlanet (ScenarioPlanet) joins
23:04:34<fireonlive>JAA++
23:04:34<eggdrop>[karma] 'JAA' now has 4 karma!
23:05:11bf_ quits [Remote host closed the connection]
23:05:12bf_ joins
23:06:21HP_Archivist quits [Ping timeout: 272 seconds]
23:16:06ScenarioPlanet quits [Client Quit]
23:19:03benjins2 joins
23:21:06<thuban>ah, fair
23:31:33jacksonchen666 quits [Client Quit]
23:33:59kitonthenet joins
23:45:37ymgve quits [Ping timeout: 272 seconds]