00:00:52nexussfan (nexussfan) joins
00:11:14mr_sarge quits [Read error: Connection reset by peer]
00:11:47StarletCharlotte joins
00:11:58<StarletCharlotte>What's the best way to upload large files to the Internet Archive?
00:12:22<StarletCharlotte>Because I'm trying to upload an archive of ftp://ftp.funcom.com and it's stuck at 4.9 GB. It's been several hours.
00:12:39<StarletCharlotte>My internet isn't the best but I don't think it's that bad.
00:14:01<Yakov>reading some of AB's source, I think it supports ftp..?
00:14:08<@imer>StarletCharlotte: there's some tips here (if you've not seen it yet) https://wiki.archiveteam.org/index.php/Internet_Archive#Upload_speed
00:17:20<pokechu22>ab doesn't interact nicely with ftp - there's some code for it but it crashes most of the time and as such is mostly disabled at this point
00:18:11<StarletCharlotte>imer I'll take a look
00:26:25<@OrIdow6>What's the current tracker architecture? I found old logs talking about it being an Nginx(Lua) proxy that talks to the original tracker, but doesn't directly talk to Redis - is that still the case?
00:28:36<nicolas17>StarletCharlotte: are you on Linux?
00:28:46<StarletCharlotte>yeah
00:29:47<nicolas17>in my experience "sudo sysctl net.ipv4.tcp_congestion_control=bbr" makes uploads to archive.org significantly faster
00:29:50<nicolas17>won't help with ongoing connections/uploads though, you'd have to start over
00:30:21<StarletCharlotte>Got it. Should I turn it off after though?
00:30:46<nicolas17>I didn't notice any negative effects on the rest of my internet use tbh
00:30:57<StarletCharlotte>got it
00:31:01<nicolas17>but you could run "sudo sysctl net.ipv4.tcp_congestion_control" to see what your current value is
00:31:05<nicolas17>and restore it afterwards
00:31:07<StarletCharlotte>whoops
00:31:12<StarletCharlotte>i uh... already did it
00:31:17<StarletCharlotte>oh well
00:32:59<BlankEclair>out of curiosity, does that only make IA uploads fast, or does it make all tcp connections go faster
00:34:26<nicolas17>there's *something* in archive.org's networking that doesn't interact well with the default congestion control algorithm, I don't understand the details
00:35:34<klea>https://www.kernel.org/doc/html/latest/networking/ip-sysctl.html#:~:text=tcp%5Fcongestion%5Fcontrol%20%2D%20STRING is not very clear about what that does.
00:40:24<nicolas17>https://en.wikipedia.org/wiki/TCP_congestion_control
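[Note] The save-and-restore approach nicolas17 describes above might look roughly like this sketch. The function names `current_cc` and `set_cc` are made up for illustration. It assumes a Linux system where `bbr` is available to the kernel, which the conversation does not confirm for every setup.

```shell
#!/bin/sh
# Sketch only: remember the current TCP congestion control algorithm,
# switch to BBR for an upload, then put the old value back.
# current_cc and set_cc are hypothetical helper names.

CC_KEY=net.ipv4.tcp_congestion_control

# Print just the value; `sysctl -n` omits the "key = " prefix.
current_cc() {
  sysctl -n "$CC_KEY"
}

# Change the algorithm; writing the setting requires root.
set_cc() {
  sudo sysctl "$CC_KEY=$1"
}

# Typical use (left commented out so nothing changes by accident):
#   sysctl net.ipv4.tcp_available_congestion_control   # is bbr listed?
#   previous=$(current_cc)
#   set_cc bbr
#   ... restart the upload so new connections pick up the setting ...
#   set_cc "$previous"
```

Existing connections keep the algorithm they started with, so the upload has to be restarted after the change.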
00:40:44<StarletCharlotte>> /usr/bin/python: Error while finding module specification for 'ia-upload-stream.py' (ModuleNotFoundError: __path__ attribute not found on 'ia-upload-stream' while trying to find 'ia-upload-stream.py'). Try using 'ia-upload-stream' instead of 'ia-upload-stream.py' as the module name.
00:40:49<StarletCharlotte>not sure what's going on here
00:41:20<StarletCharlotte>Removing .py also fails
00:41:42<StarletCharlotte>same with removing -m
00:47:47<klea>nicolas17: what version does setting the variable to bbr set it to, BBRv1, BBRv2 or BBRv3?
00:49:12<StarletCharlotte>yeah I can't figure out how to run this. The example on the wiki just doesn't work for some reason
00:50:00<@JAA>Why is that command trying to run it as a module? (I either never knew or forgot that my uploader is even listed there.)
00:51:16<StarletCharlotte>Good question, but not running it as a module also fails.
00:51:18<@JAA>And what's that bit about installing the ia package? ia-upload-stream only depends on requests.
00:51:53<StarletCharlotte>https://pastebin.com/s88c8eJr
00:53:03<@JAA>Hmm yeah, I suppose.
00:53:22<@JAA>That does run the script correctly though.
00:53:46<@JAA>You can specify the S3 credentials via IA_S3_ACCESS and IA_S3_SECRET environment variables as well.
00:54:14<@JAA>And ia-s3-auth can get you those values without `ia configure`.
00:55:52etnguyen03 quits [Client Quit]
00:56:42<StarletCharlotte>S3?
00:57:16<StarletCharlotte>Okay I guess
00:57:24<klea>They're available on the web at https://archive.org/account/s3.php too
00:57:32<klea>It's an S3-like API
00:57:34<StarletCharlotte>oh okay thanks
00:59:27<StarletCharlotte>Tried again, same error. It's asking about a config file or something?
00:59:43<@JAA>To explain that error referencing `ia configure`: `ia-upload-stream` reads ia's config file if it's available (and not overridden by the environment variable). There's no actual dependency on `ia`.
01:00:29<StarletCharlotte>I assume ia is from python-internetarchive?
01:00:55<StarletCharlotte>I set the environment variables for the S3 credentials so it's not that.
01:01:27<TheTechRobo>StarletCharlotte: the sysctl option should go back to what it was before after a reboot, FWIW, so don't worry about losing it
01:01:39<StarletCharlotte>got it
01:01:41<@JAA>Sounds like you didn't set them correctly then. It won't even reach that code when they're set.
01:02:03<TheTechRobo>(ia comes from https://pypi.org/project/internetarchive BTW)
01:02:40<StarletCharlotte>Huh, I guess set just sets the shell variables and not environment variables? I think?
01:02:48<@JAA>Yes
01:02:50<klea>try to export.
01:02:53<TheTechRobo>export IA_S3_ACCESS=...
01:03:07<@JAA>Either run it as `IA_S3_ACCESS=... IA_S3_SECRET=... ./ia-upload-stream ...` or `export` them.
01:03:49<StarletCharlotte>There it goes. thank you
01:03:50<@JAA>And `set` sets the shell's positional arguments, not variables.
01:03:54<StarletCharlotte>that explains a lot
01:04:34StarletCharlotte quits [Client Quit]
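[Note] The difference between shell variables, exported variables, and `set` that tripped up the upload above can be reproduced with a short sketch. The names `DEMO_PLAIN`, `DEMO_EXPORTED`, `DEMO_VIA_SET`, and the helper `child_view` are made up for illustration; the real variables in the conversation are `IA_S3_ACCESS` and `IA_S3_SECRET`.

```shell
#!/bin/sh
# A plain assignment creates a shell variable; child processes do not inherit it.
DEMO_PLAIN=shell-only

# `export` places the variable in the environment of every child process.
export DEMO_EXPORTED=exported

# Hypothetical helper: start a child shell and report what it can see.
child_view() {
  sh -c 'printf "%s|%s\n" "${DEMO_PLAIN:-unset}" "${DEMO_EXPORTED:-unset}"'
}

child_view   # prints: unset|exported

# A VAR=value prefix passes the variable to that one command only.
DEMO_PLAIN=one-shot sh -c 'printf "%s\n" "$DEMO_PLAIN"'   # prints: one-shot

# `set` replaces the positional parameters; it does not create a variable.
set -- DEMO_VIA_SET=value
printf '%s\n' "$1"                       # prints: DEMO_VIA_SET=value
printf '%s\n' "${DEMO_VIA_SET:-unset}"   # prints: unset
```

This matches the two options JAA gives: prefix the command with the assignments, or `export` them first.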
01:11:50pabs (pabs) joins
01:13:49LddPotato quits [Read error: Connection reset by peer]
01:14:30LddPotato (LddPotato) joins
01:15:12roverinexile joins
01:17:41rover quits [Ping timeout: 272 seconds]
01:18:31etnguyen03 (etnguyen03) joins
01:24:27LddPotato quits [Read error: Connection reset by peer]
01:25:09LddPotato (LddPotato) joins
01:34:57LddPotato quits [Read error: Connection reset by peer]
01:35:51LddPotato (LddPotato) joins
01:36:03petrichor quits [Ping timeout: 272 seconds]
01:44:13fangfufu quits [Client Quit]
01:45:53LddPotato quits [Read error: Connection reset by peer]
01:46:34LddPotato (LddPotato) joins
01:50:08fangfufu joins
01:50:28kansei- (kansei) joins
01:51:52kansei quits [Ping timeout: 256 seconds]
02:03:57LddPotato quits [Read error: Connection reset by peer]
02:05:31LddPotato (LddPotato) joins
02:29:50pokechu22 quits [Ping timeout: 256 seconds]
02:40:35pokechu22 (pokechu22) joins
02:52:14ducky_ (ducky) joins
02:53:04ducky quits [Ping timeout: 256 seconds]
02:53:04ducky_ is now known as ducky
02:53:29thalia quits [Quit: Connection closed for inactivity]
03:06:40ducky quits [Ping timeout: 256 seconds]
03:08:16ducky (ducky) joins
03:30:58nexussfan quits [Quit: Konversation terminated!]
03:36:42Godzfire quits [Quit: Ooops, wrong browser tab.]
03:47:30nexussfan (nexussfan) joins
04:08:05etnguyen03 quits [Remote host closed the connection]
04:08:17fireatseaparks quits [Quit: Textual IRC Client: www.textualapp.com]
04:16:13fireatseaparks (fireatseaparks) joins
04:39:57Island quits [Read error: Connection reset by peer]
04:46:18cyanbox joins
04:55:14DogsRNice quits [Read error: Connection reset by peer]
05:04:32n9nes quits [Ping timeout: 256 seconds]
05:05:03khaoohs quits [Ping timeout: 272 seconds]
05:06:01n9nes joins
05:06:36khaoohs joins
05:08:58nexussfan quits [Client Quit]
05:15:33steering wonders how thoroughly wikipedia links have been archived
05:23:34<steering>i know there's bots that try and point links to archives when they're dead but is there stuff going through and SPN'ing links for example
05:24:26<BlankEclair>wikipedia-eventstream or something
05:24:55<BlankEclair>https://archive.org/details/wikipedia-eventstream?tab=about
05:27:59<pokechu22>Yeah, my understanding is that there's a project that does that (that isn't by archiveteam). Looking at https://archive.org/details/wikipedia-eventstream?tab=collection&sort=-publicdate it seems like stuff is run weeklyish?
05:35:01<steering>ah good :)
06:08:23Snivy quits [Ping timeout: 272 seconds]
06:15:57petrichor (petrichor) joins
06:25:00fionera quits [Ping timeout: 256 seconds]
06:29:23BennyOtt (BennyOtt) joins
06:40:59Wohlstand1 (Wohlstand) joins
06:43:24Wohlstand1 is now known as Wohlstand
06:51:24Wohlstand quits [Client Quit]
07:12:09Snivy (Snivy) joins
08:30:53rohvani quits [Ping timeout: 272 seconds]
08:55:44ducky quits [Ping timeout: 256 seconds]
08:57:25<ericgallager>https://en.wikipedia.org/wiki/User:GreenC_bot does archiving of Wikipedia links
08:57:40<ericgallager>https://en.wikipedia.org/wiki/User:GreenC/WaybackMedic
08:59:51<ericgallager>oh and this one too: https://en.wikipedia.org/wiki/User:InternetArchiveBot