00:01:06 | | Wohlstand (Wohlstand) joins |
00:03:43 | | Dango360_ quits [Ping timeout: 272 seconds] |
00:06:27 | <fireonlive> | lol wow |
00:06:52 | <fireonlive> | 'how about you try opening the fuckin' thing?' |
01:13:06 | | bladem quits [Read error: Connection reset by peer] |
01:51:27 | | Wohlstand quits [Client Quit] |
02:14:35 | <h2ibot> | Nicolas17v2 edited Taringa! (+112, Add links to the DPoS project): https://wiki.archiveteam.org/?diff=51902&oldid=51871 |
02:17:28 | <nicolas17> | imer: did you independently confirm the zip is broken? |
02:40:15 | | nic8 quits [Client Quit] |
02:44:11 | | nic8 (nic) joins |
02:57:42 | <h2ibot> | Nulldata uploaded File:Taringa-FrontPage.png: https://wiki.archiveteam.org/?title=File%3ATaringa-FrontPage.png |
02:58:02 | <nulldata> | Hmm svgs uploads aren't allowed on the wiki |
02:58:13 | <nulldata> | svg* |
03:02:43 | <h2ibot> | Nulldata uploaded File:Taringa-logo-500px.png: https://wiki.archiveteam.org/?title=File%3ATaringa-logo-500px.png |
03:03:16 | | lennier2_ quits [Read error: Connection reset by peer] |
03:03:34 | | lennier2_ joins |
03:04:43 | <h2ibot> | Nulldata edited Taringa! (+64, Added logo and screenshot): https://wiki.archiveteam.org/?diff=51905&oldid=51902 |
03:14:37 | | AlexisJ quits [Ping timeout: 255 seconds] |
03:21:44 | <fireonlive> | yeah sadly not :( |
03:21:47 | | nic8 quits [Client Quit] |
03:22:04 | <fireonlive> | i think JAA checked and it's not something they can change w/o editing the .php config |
03:22:38 | | nic8 (nic) joins |
03:22:51 | <fireonlive> | for mediawiki that is |
03:35:53 | | benjins2 quits [Ping timeout: 272 seconds] |
03:35:53 | | benjins quits [Ping timeout: 272 seconds] |
04:06:34 | | Island quits [Read error: Connection reset by peer] |
04:09:21 | <HP_Archivist> | I see that https://vetusware.com/ was grabbed in 2022. Is it worth doing a recursive crawl? |
04:09:45 | <HP_Archivist> | fireonlive: Wiki - https://wiki.panotools.org/Main_Page |
04:12:16 | <HP_Archivist> | And another, not sure if you've seen these previously https://wiki.videolan.org/ |
04:31:52 | | BlueMaxima quits [Read error: Connection reset by peer] |
04:41:53 | | Craigle quits [Quit: The Lounge - https://thelounge.chat] |
04:42:23 | | Craigle (Craigle) joins |
05:00:49 | | JayEmbee quits [Ping timeout: 255 seconds] |
05:17:20 | | Rotietip joins |
05:26:28 | | JayEmbee (JayEmbee) joins |
06:06:16 | | Rotietip quits [Client Quit] |
06:54:06 | <BornOn420> | https://tracker.archiveteam.org/ is sleeping |
07:05:04 | | pigeon joins |
07:05:22 | | Ketchup901 quits [Quit: No Ping reply in 180 seconds.] |
07:05:36 | | pigeon quits [Client Quit] |
07:06:43 | | Ketchup901 (Ketchup901) joins |
07:07:06 | | Cronfox quits [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.] |
07:07:13 | | Cronfox (Cronfox) joins |
07:23:39 | | Arcorann (Arcorann) joins |
07:28:40 | | Cronfox quits [Client Quit] |
08:00:24 | | Cronfox (Cronfox) joins |
08:35:06 | | Gereon0 (Gereon) joins |
08:37:21 | | Gereon quits [Ping timeout: 272 seconds] |
08:37:21 | | Gereon0 is now known as Gereon |
08:46:13 | <imer> | nicolas17: yes, sent them the error message this time around |
08:49:26 | | AK quits [Quit: AK] |
09:00:05 | | Bleo182600 quits [Client Quit] |
09:01:20 | | Bleo182600 joins |
09:11:09 | | Dango360_ joins |
09:12:02 | | AK (AK) joins |
09:15:21 | | _Dango360 quits [Ping timeout: 272 seconds] |
09:21:44 | | Chris5010 quits [Remote host closed the connection] |
09:38:58 | | Hallfiry joins |
09:42:16 | <Hallfiry> | I just remembered Tennessee Bill's Old Time Radio (a website I used to visit around 2010) and was wondering if you guys know it and maybe have gotten in touch with the author to back up his stuff. The site is long gone, but maybe he's still around and has the data. It had hundreds of gigabytes of radio recordings, wartime posters etc. |
09:42:16 | <Hallfiry> | https://web.archive.org/web/20101129153101/http://tennesseebillsotr.com |
09:58:17 | <pabs> | looks like we didn't grab it https://archive.fart.website/archivebot/viewer/?q=tennesseebillsotr.com |
10:01:56 | <pabs> | 2010 era is a long time ago... |
10:03:16 | | Hallfiry quits [Ping timeout: 265 seconds] |
10:16:02 | | benjins joins |
10:16:03 | | knecht4 quits [Quit: knecht420] |
10:17:19 | | knecht4 joins |
10:23:07 | | Hallfiry joins |
10:25:39 | | Hallfiry quits [Client Quit] |
11:25:15 | | DLoader_ (DLoader) joins |
11:27:43 | | DLoader quits [Ping timeout: 272 seconds] |
11:27:45 | | DLoader_ is now known as DLoader |
11:34:25 | | DLoader quits [Read error: Connection reset by peer] |
11:36:00 | | DLoader (DLoader) joins |
11:38:50 | | DLoader_ (DLoader) joins |
11:41:01 | | DLoader quits [Ping timeout: 272 seconds] |
11:41:09 | | DLoader_ is now known as DLoader |
11:49:03 | | benjins2 joins |
11:55:54 | | grid quits [Quit: Connection closed for inactivity] |
12:05:41 | | grid joins |
12:51:57 | | Arcorann quits [Ping timeout: 272 seconds] |
13:02:46 | | Guest54 joins |
13:11:50 | | Hackerpcs (Hackerpcs) joins |
13:18:15 | | monoxane quits [Read error: Connection reset by peer] |
13:18:42 | | monoxane (monoxane) joins |
13:32:31 | | Guest92 joins |
13:34:49 | <h2ibot> | Arkiver uploaded File:Taringa icon.webp: https://wiki.archiveteam.org/?title=File%3ATaringa%20icon.webp |
13:37:49 | <h2ibot> | Arkiver uploaded File:Taringa icon.png: https://wiki.archiveteam.org/?title=File%3ATaringa%20icon.png |
13:46:25 | | hackbug quits [Ping timeout: 255 seconds] |
13:48:58 | | pratheekrebala joins |
14:42:02 | | eroc1990 quits [Quit: The Lounge - https://thelounge.chat] |
14:42:35 | | eroc1990 (eroc1990) joins |
14:46:01 | | Hackerpcs quits [Client Quit] |
14:55:42 | | Hackerpcs (Hackerpcs) joins |
15:44:58 | | emberquill080 quits [Quit: The Lounge - https://thelounge.chat] |
15:45:49 | | emberquill080 (emberquill) joins |
15:49:54 | | ell5 (ell) joins |
15:50:25 | | ell quits [Read error: Connection reset by peer] |
15:50:26 | | ell5 is now known as ell |
15:55:54 | | grid quits [Client Quit] |
15:59:00 | | pratheekrebala quits [Client Quit] |
16:32:21 | | Guest92 quits [Ping timeout: 265 seconds] |
16:48:44 | | BearFortress quits [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.] |
17:00:51 | | Dango360_ quits [Ping timeout: 272 seconds] |
17:02:46 | | Dango360 (Dango360) joins |
17:15:08 | | BearFortress joins |
17:26:56 | | Chris5010 (Chris5010) joins |
18:38:40 | | VerifiedJ9 quits [Remote host closed the connection] |
18:39:13 | | VerifiedJ9 (VerifiedJ) joins |
19:27:31 | | Island joins |
19:40:50 | | qwertyasdfuiopghjkl quits [Client Quit] |
20:15:28 | | qwertyasdfuiopghjkl joins |
20:16:21 | | qwertyasdfuiopghjkl quits [Client Quit] |
20:17:57 | <nicolas17> | 5 more files up in samsung-grab; I tried downloading one of them myself and it failed after 2 hours :/ |
20:19:06 | <nicolas17> | https://data.nicolas17.xyz/samsung-grab/ |
20:19:21 | <nicolas17> | (sheesh that's a lot of thelounge users) |
20:19:30 | | qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins |
20:20:55 | <k> | lol |
20:28:52 | <fireonlive> | there's also like one request from hackint's matrix bridge too |
20:28:53 | <fireonlive> | <_< |
20:33:00 | <myself> | I don't _think_ my thelounge instance does that; I'm not seeing previews. |
20:33:28 | <myself> | But, this time the search result returned _two_ files: one 12-byte .txt, and one 1.5GB .zip, I presume we're to ignore the .txt? |
20:37:48 | <TheTechRobo> | I thought TL pings the servers even if link previews are disabled |
20:37:50 | <TheTechRobo> | Could be wrong |
20:56:38 | | BlueMaxima joins |
20:58:20 | <nicolas17> | myself: yes |
20:58:41 | <eightthree> | is python (and maybe lua) fast enough for fetching? Would some projects download faster if a rust -grab or -items or -discovery tool were made? Or are most projects rate-limited anyway, so extra speed wouldn't be of much benefit? |
20:58:55 | <nicolas17> | eightthree: I have never seen CPU be the bottleneck |
21:00:48 | | bladem (bladem) joins |
21:00:48 | <nicolas17> | your bandwidth, website's bandwidth, website's limit of requests per IP, target capacity (when IA can't ingest the data fast enough), etc etc before you hit CPU |
21:03:11 | <nicolas17> | myself: I even had to put that item in the queue manually, because my enqueue.py script skips both "item has >1 files" and "item is not in Mobile Phone category" :P |
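The two skip rules nicolas17 describes for enqueue.py can be sketched roughly as below. This is only an illustration of the filtering logic, not the actual script: the item shape (`id`, `category`, `files`), the function names, and the `force_ids` override standing in for "put that item in the queue manually" are all assumptions made for the example.

```python
from typing import Iterable, Iterator

# Hypothetical item shape (not taken from the real enqueue.py):
#   {"id": str, "category": str, "files": [str, ...]}


def should_enqueue(item: dict) -> bool:
    """Apply the two skip rules mentioned in the chat."""
    # Rule 1: skip items that have more than one file.
    if len(item.get("files", [])) > 1:
        return False
    # Rule 2: skip items outside the "Mobile Phone" category.
    if item.get("category") != "Mobile Phone":
        return False
    return True


def select_items(items: Iterable[dict],
                 force_ids: frozenset = frozenset()) -> Iterator[dict]:
    """Yield items that pass the filter, plus any manually forced ids."""
    for item in items:
        if item["id"] in force_ids or should_enqueue(item):
            yield item


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    items = [
        {"id": "a", "category": "Mobile Phone", "files": ["fw.zip"]},
        {"id": "b", "category": "Mobile Phone", "files": ["note.txt", "fw.zip"]},
        {"id": "c", "category": "Tablet", "files": ["fw.zip"]},
    ]
    print([i["id"] for i in select_items(items)])
    print([i["id"] for i in select_items(items, force_ids=frozenset({"b"}))])
```

An item like the one discussed above (a small .txt next to a large .zip) fails the first rule, which is why it would need the manual override in this sketch.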
21:42:22 | <TheTechRobo> | nicolas17: CPU is often the bottleneck, but it's usually fixable |
21:42:52 | <TheTechRobo> | e.g. Telegram used to eat up a ton of CPU while requesting discussion data |
21:42:58 | <nicolas17> | yeah, CPU has been the bottleneck in some specific cases, but it was usually problems that can be solved without rewriting the entire stack in another language :P |
21:43:55 | <nicolas17> | such as parsing JSON in pure Lua, when more efficient Lua extensions already exist for that |
21:44:19 | <@JAA> | The Python part is just for orchestration and not very relevant for performance. |
21:46:50 | <@JAA> | Even if we wanted to replace Lua, it wouldn't be an easy thing to do since it'd likely mean replacing wget-at as well. But yeah, we have yet to run into a situation where Lua is the bottleneck, I believe. |
21:47:32 | | k quits [Remote host closed the connection] |
21:48:28 | <nicolas17> | another possible thing to try before rewriting the world would be luajit |
21:48:35 | | katia (katia) joins |
21:52:17 | <@JAA> | I believe we've been using luajit for some time. |
21:53:22 | <@JAA> | Yes: https://github.com/ArchiveTeam/wget-lua/commit/da43582bfda92c9f5848f7b1fc15edf78d9e1b41 |
21:55:03 | <nicolas17> | oh cool |
22:04:25 | <h2ibot> | Nicolas17v2 edited Taringa! (-1, Update status fields in project infobox): https://wiki.archiveteam.org/?diff=51908&oldid=51905 |
22:19:54 | | b joins |
22:20:16 | | b quits [Client Quit] |
22:20:26 | <TheTechRobo> | oh, URLs is another project that's sometimes CPU-bound but that can be fixed without rewriting in another language |
22:20:44 | <TheTechRobo> | Even if the actual code can't be optimised further, some of the code can probably be made as a Lua extension |
22:20:58 | <TheTechRobo> | Or, worst-case scenario, some of the parsing could be done by a subprocess |
22:27:53 | | Wohlstand (Wohlstand) joins |
22:28:27 | <imer> | yeah, i wouldn't mind urls being less cpu hungry :D |
22:28:40 | <imer> | > load average: 419.08, 402.21, 371.56 |
22:29:08 | <nicolas17> | myself: is that download still going or did it fail? |
22:30:21 | <fireonlive> | imer: those PDFs lol |
22:30:30 | <h2ibot> | JustAnotherArchivist edited Current Projects (+0, Move Taringa! to running): https://wiki.archiveteam.org/?diff=51909&oldid=51869 |
22:30:31 | <h2ibot> | Slukiceng edited Discourse (+68, /* Active Discourses */): https://wiki.archiveteam.org/?diff=51910&oldid=51900 |
22:30:46 | <imer> | fireonlive: no pdfs currently, well, at least not the big list |
22:30:57 | <fireonlive> | ahh |
22:31:38 | <fireonlive> | just ambient load :c |
22:31:41 | <imer> | wouldn't even know where to begin profiling though, nvm having it running in docker |
22:32:01 | <imer> | we talked about pattern count the other day in #//, not sure if that's a significant contributor though |
22:37:17 | <fireonlive> | oh hmm we did just offload a bunch of stuff from the tracker eh |
22:37:38 | <imer> | load's been high before that though |
22:37:43 | <imer> | so not a sudden change |
22:38:05 | <fireonlive> | ah ok |
22:38:14 | <fireonlive> | i haven’t run urls for a little bit sadly |
22:39:45 | | pixel leaves [Error from remote client] |
22:39:46 | | pixel (pixel) joins |
22:47:18 | | BearFortress quits [Client Quit] |
22:58:59 | | BearFortress joins |
23:03:23 | <myself> | nicolas17: oh it finished, I'm just bad without reminders :) comin' at ya! |
23:09:38 | <@JAA> | !remind myself 5s We have reminders at home! |
23:09:39 | <eggdrop> | [remind] ok, i'll remind myself at 2024-03-19T23:09:43Z |
23:09:43 | <eggdrop> | [remind] myself: We have reminders at home! |
23:19:42 | | BornOn420 quits [Client Quit] |
23:31:30 | <fireonlive> | !remindme 0s :3 |
23:31:31 | <eggdrop> | [remind] ok, i'll remind you at 2024-03-19T23:31:30Z |
23:31:32 | <eggdrop> | [remind] fireonlive: :3 |
23:32:06 | <fireonlive> | eggdrop: help |
23:37:23 | | BornOn420 (BornOn420) joins |