00:01:06Wohlstand (Wohlstand) joins
00:03:43Dango360_ quits [Ping timeout: 272 seconds]
00:06:27<fireonlive>lol wow
00:06:52<fireonlive>'how about you try opening the fuckin' thing?'
01:13:06bladem quits [Read error: Connection reset by peer]
01:51:27Wohlstand quits [Client Quit]
02:14:35<h2ibot>Nicolas17v2 edited Taringa! (+112, Add links to the DPoS project): https://wiki.archiveteam.org/?diff=51902&oldid=51871
02:17:28<nicolas17>imer: did you independently confirm the zip is broken?
02:40:15nic8 quits [Client Quit]
02:44:11nic8 (nic) joins
02:57:42<h2ibot>Nulldata uploaded File:Taringa-FrontPage.png: https://wiki.archiveteam.org/?title=File%3ATaringa-FrontPage.png
02:58:02<nulldata>Hmm svgs uploads aren't allowed on the wiki
02:58:13<nulldata>svg*
03:02:43<h2ibot>Nulldata uploaded File:Taringa-logo-500px.png: https://wiki.archiveteam.org/?title=File%3ATaringa-logo-500px.png
03:03:16lennier2_ quits [Read error: Connection reset by peer]
03:03:34lennier2_ joins
03:04:43<h2ibot>Nulldata edited Taringa! (+64, Added logo and screenshot): https://wiki.archiveteam.org/?diff=51905&oldid=51902
03:14:37AlexisJ quits [Ping timeout: 255 seconds]
03:21:44<fireonlive>yeah sadly not :(
03:21:47nic8 quits [Client Quit]
03:22:04<fireonlive>i think JAA checked and it's not something they can change w/o editing the .php config
03:22:38nic8 (nic) joins
03:22:51<fireonlive>for mediawiki that is
03:35:53benjins2 quits [Ping timeout: 272 seconds]
03:35:53benjins quits [Ping timeout: 272 seconds]
04:06:34Island quits [Read error: Connection reset by peer]
04:09:21<HP_Archivist>I see that https://vetusware.com/ was grabbed in 2022. Is it worth doing a recursive crawl?
04:09:45<HP_Archivist>fireonlive: Wiki - https://wiki.panotools.org/Main_Page
04:12:16<HP_Archivist>And another, not sure if you've seen these previously https://wiki.videolan.org/
04:31:52BlueMaxima quits [Read error: Connection reset by peer]
04:41:53Craigle quits [Quit: The Lounge - https://thelounge.chat]
04:42:23Craigle (Craigle) joins
05:00:49JayEmbee quits [Ping timeout: 255 seconds]
05:17:20Rotietip joins
05:26:28JayEmbee (JayEmbee) joins
06:06:16Rotietip quits [Client Quit]
06:54:06<BornOn420>https://tracker.archiveteam.org/ is sleeping
07:05:04pigeon joins
07:05:22Ketchup901 quits [Quit: No Ping reply in 180 seconds.]
07:05:36pigeon quits [Client Quit]
07:06:43Ketchup901 (Ketchup901) joins
07:07:06Cronfox quits [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
07:07:13Cronfox (Cronfox) joins
07:23:39Arcorann (Arcorann) joins
07:28:40Cronfox quits [Client Quit]
08:00:24Cronfox (Cronfox) joins
08:35:06Gereon0 (Gereon) joins
08:37:21Gereon quits [Ping timeout: 272 seconds]
08:37:21Gereon0 is now known as Gereon
08:46:13<imer>nicolas17: yes, sent them the error message this time around
08:49:26AK quits [Quit: AK]
09:00:05Bleo182600 quits [Client Quit]
09:01:20Bleo182600 joins
09:11:09Dango360_ joins
09:12:02AK (AK) joins
09:15:21_Dango360 quits [Ping timeout: 272 seconds]
09:21:44Chris5010 quits [Remote host closed the connection]
09:38:58Hallfiry joins
09:42:16<Hallfiry>I just remembered Tennessee Bill's Old Time Radio (a website I used to visit around 2010) and was wondering if you guys know it and maybe have gotten in touch with the author to back up his stuff. The site is long gone, but maybe he's still around and has the data. It had hundreds of gigabytes of radio recordings, wartime posters etc.
09:42:16<Hallfiry>https://web.archive.org/web/20101129153101/http://tennesseebillsotr.com
09:58:17<pabs>looks like we didn't grab it https://archive.fart.website/archivebot/viewer/?q=tennesseebillsotr.com
10:01:56<pabs>2010 era is a long time ago...
10:03:16Hallfiry quits [Ping timeout: 265 seconds]
10:16:02benjins joins
10:16:03knecht4 quits [Quit: knecht420]
10:17:19knecht4 joins
10:23:07Hallfiry joins
10:25:39Hallfiry quits [Client Quit]
11:25:15DLoader_ (DLoader) joins
11:27:43DLoader quits [Ping timeout: 272 seconds]
11:27:45DLoader_ is now known as DLoader
11:34:25DLoader quits [Read error: Connection reset by peer]
11:36:00DLoader (DLoader) joins
11:38:50DLoader_ (DLoader) joins
11:41:01DLoader quits [Ping timeout: 272 seconds]
11:41:09DLoader_ is now known as DLoader
11:49:03benjins2 joins
11:55:54grid quits [Quit: Connection closed for inactivity]
12:05:41grid joins
12:51:57Arcorann quits [Ping timeout: 272 seconds]
13:02:46Guest54 joins
13:11:50Hackerpcs (Hackerpcs) joins
13:18:15monoxane quits [Read error: Connection reset by peer]
13:18:42monoxane (monoxane) joins
13:32:31Guest92 joins
13:34:49<h2ibot>Arkiver uploaded File:Taringa icon.webp: https://wiki.archiveteam.org/?title=File%3ATaringa%20icon.webp
13:37:49<h2ibot>Arkiver uploaded File:Taringa icon.png: https://wiki.archiveteam.org/?title=File%3ATaringa%20icon.png
13:46:25hackbug quits [Ping timeout: 255 seconds]
13:48:58pratheekrebala joins
14:42:02eroc1990 quits [Quit: The Lounge - https://thelounge.chat]
14:42:35eroc1990 (eroc1990) joins
14:46:01Hackerpcs quits [Client Quit]
14:55:42Hackerpcs (Hackerpcs) joins
15:44:58emberquill080 quits [Quit: The Lounge - https://thelounge.chat]
15:45:49emberquill080 (emberquill) joins
15:49:54ell5 (ell) joins
15:50:25ell quits [Read error: Connection reset by peer]
15:50:26ell5 is now known as ell
15:55:54grid quits [Client Quit]
15:59:00pratheekrebala quits [Client Quit]
16:32:21Guest92 quits [Ping timeout: 265 seconds]
16:48:44BearFortress quits [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
17:00:51Dango360_ quits [Ping timeout: 272 seconds]
17:02:46Dango360 (Dango360) joins
17:15:08BearFortress joins
17:26:56Chris5010 (Chris5010) joins
18:38:40VerifiedJ9 quits [Remote host closed the connection]
18:39:13VerifiedJ9 (VerifiedJ) joins
19:27:31Island joins
19:40:50qwertyasdfuiopghjkl quits [Client Quit]
20:15:28qwertyasdfuiopghjkl joins
20:16:21qwertyasdfuiopghjkl quits [Client Quit]
20:17:57<nicolas17>5 more files up in samsung-grab; I tried downloading one of them myself and it failed after 2 hours :/
20:19:06<nicolas17>https://data.nicolas17.xyz/samsung-grab/
20:19:21<nicolas17>(sheesh that's a lot of thelounge users)
20:19:30qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins
20:20:55<k>lol
20:28:52<fireonlive>there's also like one request from hackint's matrix bridge too
20:28:53<fireonlive><_<
20:33:00<myself>I don't _think_ my thelounge instance does that; I'm not seeing previews.
20:33:28<myself>But, this time the search result returned _two_ files: one 12-byte .txt, and one 1.5GB .zip, I presume we're to ignore the .txt?
20:37:48<TheTechRobo>I thought TL pings the servers even if link previews are disabled
20:37:50<TheTechRobo>Could be wrong
20:56:38BlueMaxima joins
20:58:20<nicolas17>myself: yes
20:58:41<eightthree>is python (and maybe lua) fast enough for fetching? Would some projects download faster if a rust -grab or -items or -discovery tool were made? Or are most projects rate-limited anyway, so extra speed wouldn't be of much benefit?
20:58:55<nicolas17>eightthree: I have never seen CPU be the bottleneck
21:00:48bladem (bladem) joins
21:00:48<nicolas17>your bandwidth, website's bandwidth, website's limit of requests per IP, target capacity (when IA can't ingest the data fast enough), etc etc before you hit CPU
21:03:11<nicolas17>myself: I even had to put that item in the queue manually, because my enqueue.py script skips both "item has >1 files" and "item is not in Mobile Phone category" :P
21:42:22<TheTechRobo>nicolas17: CPU is often the bottleneck, but it's usually fixable
21:42:52<TheTechRobo>e.g. Telegram used to eat up a ton of CPU while requesting discussion data
21:42:58<nicolas17>yeah, CPU has been the bottleneck in some specific cases, but it was usually problems that can be solved without rewriting the entire stack in another language :P
21:43:55<nicolas17>such as parsing JSON in pure Lua, when more efficient Lua extensions already exist for that
21:44:19<@JAA>The Python part is just for orchestration and not very relevant for performance.
21:46:50<@JAA>Even if we wanted to replace Lua, it wouldn't be an easy thing to do since it'd likely mean replacing wget-at as well. But yeah, we have yet to run into a situation where Lua is the bottleneck, I believe.
21:47:32k quits [Remote host closed the connection]
21:48:28<nicolas17>another possible thing to try before rewriting the world would be luajit
21:48:35katia (katia) joins
21:52:17<@JAA>I believe we've been using luajit for some time.
21:53:22<@JAA>Yes: https://github.com/ArchiveTeam/wget-lua/commit/da43582bfda92c9f5848f7b1fc15edf78d9e1b41
21:55:03<nicolas17>oh cool
22:04:25<h2ibot>Nicolas17v2 edited Taringa! (-1, Update status fields in project infobox): https://wiki.archiveteam.org/?diff=51908&oldid=51905
22:19:54b joins
22:20:16b quits [Client Quit]
22:20:26<TheTechRobo>oh, URLs is another project that's sometimes CPU-bound but that can be fixed without rewriting in another language
22:20:44<TheTechRobo>Even if the actual code can't be optimised further, some of the code can probably be made as a Lua extension
22:20:58<TheTechRobo>Or, worst-case scenario, some of the parsing could be done by a subprocess
22:27:53Wohlstand (Wohlstand) joins
22:28:27<imer>yeah, i wouldn't mind urls being less cpu hungry :D
22:28:40<imer>> load average: 419.08, 402.21, 371.56
22:29:08<nicolas17>myself: is that download still going or did it fail?
22:30:21<fireonlive>imer: those PDFs lol
22:30:30<h2ibot>JustAnotherArchivist edited Current Projects (+0, Move Taringa! to running): https://wiki.archiveteam.org/?diff=51909&oldid=51869
22:30:31<h2ibot>Slukiceng edited Discourse (+68, /* Active Discourses */): https://wiki.archiveteam.org/?diff=51910&oldid=51900
22:30:46<imer>fireonlive: no pdfs currently, well, at least not the big list
22:30:57<fireonlive>ahh
22:31:38<fireonlive>just ambient load :c
22:31:41<imer>wouldn't even know where to begin profiling though, nvm having it running in docker
22:32:01<imer>we talked about pattern count the other day in #//, not sure if that's a significant contributor though
22:37:17<fireonlive>oh hmm we did just offload a bunch of stuff from the tracker eh
22:37:38<imer>load's been high before that though
22:37:43<imer>so not a sudden change
22:38:05<fireonlive>ah ok
22:38:14<fireonlive>i haven’t run urls for a little bit sadly
22:39:45pixel leaves [Error from remote client]
22:39:46pixel (pixel) joins
22:47:18BearFortress quits [Client Quit]
22:58:59BearFortress joins
23:03:23<myself>nicolas17: oh it finished, I'm just bad without reminders :) comin' at ya!
23:09:38<@JAA>!remind myself 5s We have reminders at home!
23:09:39<eggdrop>[remind] ok, i'll remind myself at 2024-03-19T23:09:43Z
23:09:43<eggdrop>[remind] myself: We have reminders at home!
23:19:42BornOn420 quits [Client Quit]
23:31:30<fireonlive>!remindme 0s :3
23:31:31<eggdrop>[remind] ok, i'll remind you at 2024-03-19T23:31:30Z
23:31:32<eggdrop>[remind] fireonlive: :3
23:32:06<fireonlive>eggdrop: help
23:37:23BornOn420 (BornOn420) joins