00:05:06 | | BlueMaxima quits [Read error: Connection reset by peer] |
00:08:58 | | wickedplayer494 quits [Remote host closed the connection] |
00:12:29 | | wickedplayer494 joins |
00:13:34 | | wickedplayer494 is now authenticated as wickedplayer494 |
00:17:09 | | sralracer quits [Client Quit] |
00:20:10 | | tzt quits [Ping timeout: 260 seconds] |
00:21:37 | | tzt (tzt) joins |
00:28:49 | | HP_Archivist (HP_Archivist) joins |
00:30:15 | | M60_ quits [Quit: Going offline, see ya! (www.adiirc.com)] |
00:49:09 | <immibis> | matt is still being a moron? |
00:50:19 | <immibis> | that_lurker: you assume people want to do that enough to bother to do it. These platforms aren't point-to-multipoint communication platforms, but more like multipoint-to-point propaganda feeds - you log in, you see some stuff, and following is a way to tailor the kind of stuff you see, to some extent. |
00:50:33 | <immibis> | it's not like email where you send a message and the receivers receive the message |
00:51:03 | <immibis> | and a lot of the time it doesn't even matter who precisely you follow, as long as the algorithm gives you content pieces that are near certain embeddings |
00:51:09 | <pabs> | nicolas17: no idea, but the HN thread has some jfrog mentions https://news.ycombinator.com/item?id=42136375 |
00:51:45 | <pabs> | kpcyrd: Debian is switching more towards the git repos, ignoring PyPI these days. so useless there |
00:52:00 | <pabs> | kpcyrd: for the same reasons as Arch really |
00:52:21 | <pabs> | no idea about how it all works |
00:53:40 | <pabs> | The Debian Rust team hasn't switched away from Rust crates.io to git though, but there is one breakaway Debian dev who packages Rust stuff from git instead |
01:06:38 | <pabs> | kpcyrd: personally I think they should have gone with reproducible builds instead. require the maintainer and multiple auto-build providers to get the same binary |
03:00:50 | <catbottom> | anyone know any best practices for scraping the wayback machine? They seem to allow it, I mean there's even the wayback_machine_downloader tool, but at the same time I get throttled pretty fast. I'm trying to get every binary, zip file, etc hosted on ascii.co.jp. It ends up being 9872 files. I just tried grabbing them with aria2c and got throttled pretty fast. Is there a best way to do this without getting throttled and being respectful to the wayback machine, but that also isn't gonna end up taking me 5 years to download all of them? |
03:37:51 | | Dango360 quits [Ping timeout: 252 seconds] |
04:04:30 | | lukash98 quits [Quit: Ping timeout (120 seconds)] |
04:09:18 | | lukash98 joins |
04:33:08 | <nicolas17> | catbottom: link me to one of those files |
04:36:00 | <nicolas17> | hm looks like this was crawled by Alexa |
04:36:31 | <nicolas17> | and the raw WARC files are not publicly accessible, the WBM is the only way to access it |
04:37:14 | <nicolas17> | catbottom: did aria2 use multiple parallel connections? as a first step, use -j1 to do one at a time and go slower |
04:56:32 | <catbottom> | nicolas17 this is everything I'm trying to download. Hopefully I actually filtered it all right to get all the binaries I'm looking for https://cz0.au/fxyhyu.txt |
04:58:37 | <catbottom> | and yeah, aria2 did use multiple parallel connections. I started out using wget one at a time before realizing, with that many files, it would take forever. I guess I'm trying to find the sweet spot that won't take a year but will not get me blocked. I guess though, assuming 2 seconds to download, 2 seconds to wait, that's only around 11 hours for that many files now that I do the math. |
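The estimate above can be checked with a short sketch. It assumes the figures from the message (9872 files, about 2 seconds to download and 2 seconds of waiting per file) and a strictly one-at-a-time loop. The names `estimate_hours` and `fetch_sequentially` are hypothetical helpers for illustration, and the fixed delay is only an assumed way of going slower, not a documented Wayback Machine limit.

```python
import time
import urllib.request
from pathlib import Path


def estimate_hours(n_files: int, download_s: float, wait_s: float) -> float:
    """Total wall-clock hours for a strictly sequential download."""
    return n_files * (download_s + wait_s) / 3600


def fetch_sequentially(urls, out_dir, wait_s=2.0):
    """Download one URL at a time, sleeping between requests.

    Hypothetical helper: the file naming and the lack of retry or
    backoff handling are simplifications for illustration only.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for i, url in enumerate(urls):
        # Prefix with an index so identical basenames do not collide.
        target = out / f"{i:05d}_{url.rsplit('/', 1)[-1] or 'index'}"
        if not target.exists():  # skip files fetched on an earlier run
            urllib.request.urlretrieve(url, target)
            time.sleep(wait_s)


# 9872 files at 2 s download + 2 s wait each
print(f"{estimate_hours(9872, 2, 2):.1f} hours")
```

With those numbers the total is 9872 × 4 s = 39488 s, just under 11 hours, which matches the figure in the message.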
04:59:34 | <catbottom> | 9rrrrr' |
04:59:47 | <catbottom> | r78rvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv |
05:00:01 | <catbottom> | ahh sorry, cat on the keyboard |
05:19:28 | | etnguyen03 quits [Remote host closed the connection] |
05:36:38 | | BlueMaxima joins |
06:04:31 | | ducky_ (ducky) joins |
06:04:48 | | ducky quits [Ping timeout: 260 seconds] |
06:05:06 | | ducky_ is now known as ducky |
06:07:59 | | BlueMaxima quits [Read error: Connection reset by peer] |
07:27:22 | <pabs> | screenlockers++ |
07:27:22 | <eggdrop> | [karma] 'screenlockers' now has 1 karma! |
07:56:33 | | Ryz2 (Ryz) joins |
08:12:50 | | pixel leaves [Error from remote client] |
08:26:58 | | AlsoHP_Archivist joins |
08:29:00 | | HP_Archivist quits [Ping timeout: 260 seconds] |
09:00:38 | | pixel (pixel) joins |
09:55:40 | | sec^nd quits [Remote host closed the connection] |
09:56:01 | | sec^nd (second) joins |
10:47:54 | | sralracer joins |
10:48:50 | | sralracer is now authenticated as sralracer |
11:10:20 | | Dango360 (Dango360) joins |
11:18:44 | | MrMcNuggets (MrMcNuggets) joins |
11:24:47 | | pseudorizer quits [Quit: ZNC 1.9.1 - https://znc.in] |
11:28:42 | | pseudorizer (pseudorizer) joins |
11:31:01 | <that_lurker> | immibis: In bluesky there is no global feed for you so you actually need to follow the people you want to see. Of course there are these feeds people have made you can follow (they are mostly themed so for example people that release art) |
11:58:25 | | nulldata quits [Ping timeout: 255 seconds] |
12:00:01 | | Bleo182600722719623 quits [Quit: The Lounge - https://thelounge.chat] |
12:02:46 | | Bleo182600722719623 joins |
12:16:05 | | nulldata (nulldata) joins |
13:09:22 | | @OrIdow6 quits [Quit: ZNC 1.8.2+deb3.1 - https://znc.in] |
13:12:44 | | OrIdow6 (OrIdow6) joins |
13:12:44 | | @ChanServ sets mode: +o OrIdow6 |
13:14:15 | | SootBect1 (SootBector) joins |
13:14:43 | | SootBector quits [Remote host closed the connection] |
13:47:50 | | M60_ joins |
14:07:59 | | MrMcNuggets quits [Read error: Connection reset by peer] |
14:16:35 | | etnguyen03 (etnguyen03) joins |
14:17:03 | | MrMcNuggets (MrMcNuggets) joins |
15:05:18 | | BennyOtt (BennyOtt) joins |
15:36:35 | | Dango360 quits [Ping timeout: 260 seconds] |
15:38:58 | | Dango360 (Dango360) joins |
15:40:15 | | Dango360_ (Dango360) joins |
15:44:24 | | Dango360 quits [Ping timeout: 252 seconds] |
16:22:55 | <immibis> | sounds like more work |
16:23:07 | <immibis> | people are lazy and do whatever is easy. |
16:23:39 | | Gadelhas562 quits [Quit: auf Wiedersehen] |
16:24:08 | | Gadelhas5628 joins |
16:36:17 | <immibis> | the model we've settled on as a society is that you click on "more of this" or "less of this" and don't really focus on specific people (even if you want to, the system ignores that and people are still happy) |
17:39:19 | | qwertyasdfuiopghjkl quits [Client Quit] |
17:45:26 | | HP_Archivist (HP_Archivist) joins |
17:47:36 | | AlsoHP_Archivist quits [Ping timeout: 252 seconds] |
17:54:43 | | qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins |
18:17:00 | | nicolas17 quits [Ping timeout: 260 seconds] |
18:31:18 | | MrMcNuggets quits [Quit: WeeChat 4.3.2] |
18:32:42 | | Dango360_ quits [Ping timeout: 252 seconds] |
18:53:47 | | Dango360 (Dango360) joins |
18:54:20 | | Dango360_ (Dango360) joins |
18:58:33 | | Dango360 quits [Ping timeout: 252 seconds] |
19:04:43 | | ducky quits [Ping timeout: 260 seconds] |
19:05:44 | | ducky (ducky) joins |
19:32:26 | <alexlehm> | it's kind of like twitter before they made a feed |
20:48:26 | | Gadelhas56280 joins |
20:52:24 | | Gadelhas5628 quits [Ping timeout: 252 seconds] |
20:52:24 | | Gadelhas56280 is now known as Gadelhas5628 |
20:54:49 | | Gadelhas56286 joins |
20:58:27 | | Gadelhas5628 quits [Ping timeout: 252 seconds] |
20:58:27 | | Gadelhas56286 is now known as Gadelhas5628 |
21:40:17 | | BlueMaxima joins |
21:47:00 | | Dango360_ quits [Ping timeout: 260 seconds] |
22:19:58 | | Dango360 (Dango360) joins |
22:30:37 | | useretail joins |
22:57:32 | | BennyOtt quits [Quit: Leaving] |
23:09:21 | | Dango360 quits [Ping timeout: 252 seconds] |
23:17:31 | | pixel leaves [Error from remote client] |
23:19:40 | | pixel (pixel) joins |
23:48:30 | | VerifiedJ9 quits [Remote host closed the connection] |
23:49:23 | <immibis> | so is fediverse |
23:52:33 | | VerifiedJ9 (VerifiedJ) joins |
23:59:38 | | etnguyen03 quits [Quit: Konversation terminated!] |