00:11:43 | | scurvy_duck quits [Ping timeout: 260 seconds] |
00:29:00 | | etnguyen03 (etnguyen03) joins |
00:33:36 | | elliewebz joins |
00:34:13 | | elliewebz quits [Client Quit] |
00:45:39 | | @imer quits [Quit: Oh no] |
00:46:02 | | sec^nd quits [Ping timeout: 276 seconds] |
00:46:13 | | imer (imer) joins |
00:46:14 | | @ChanServ sets mode: +o imer |
00:49:52 | | sec^nd (second) joins |
00:52:50 | | cascode quits [Ping timeout: 250 seconds] |
00:53:32 | | cascode joins |
00:54:23 | | etnguyen03 quits [Client Quit] |
01:00:31 | | cascode quits [Read error: Connection reset by peer] |
01:01:02 | | cascode joins |
01:37:00 | | scurvy_duck joins |
01:40:17 | <@arkiver> | we're going to slowly start other projects again, which may be at the expense of #UncleSamsArchive a bit, will try to prioritize better there |
01:45:26 | | Webuser760773 joins |
01:46:29 | | etnguyen03 (etnguyen03) joins |
01:54:13 | <pabs> | c3manu: re web search scraping, I have a few browser console scripts for Google/Bing/Yandex. the little-things stuff wasn't working for me, probably because of captchas IIRC |
01:54:37 | <pabs> | (also Yandex gives captchas often, even with the browser console scripts) |
02:08:45 | <pabs> | also https://wiki.archiveteam.org/index.php/Site_exploration#Search_Engines |
02:09:06 | | cascode quits [Ping timeout: 250 seconds] |
02:13:20 | | cascode joins |
02:17:20 | | pabs did a scrape of nfdc.faa.gov and sent it to #UncleSamsArchive |
02:19:17 | | cascode quits [Read error: Connection reset by peer] |
02:19:30 | | cascode joins |
02:19:47 | <@arkiver> | pabs: is that something you can also !ao < in AB? |
02:20:21 | <pabs> | not sure what c3manu had planned for it, but I guess so yes. some of it might be better to !a like the airport codes section |
02:21:32 | <@arkiver> | !a < can still be used as well i think |
02:24:57 | | Hackerpcs quits [Quit: Hackerpcs] |
02:26:08 | <h2ibot> | PaulWise edited Site exploration (+656, add browser console scripts): https://wiki.archiveteam.org/?diff=54386&oldid=50868 |
02:28:40 | | Hackerpcs (Hackerpcs) joins |
02:36:49 | <pabs> | I put it into !ao < for now |
02:47:49 | | Webuser760773 quits [Client Quit] |
02:55:37 | <Hans5958> | Someone should tell r/datahoarder that donating to AT is also an option. At this point there are so many warriors that the better option would be helping to scale the targets instead |
02:56:58 | <Hans5958> | Or, yeah, make the "Donate" link more prominent |
02:57:03 | <nicolas17> | ++ |
02:57:05 | <Hans5958> | on the wiki |
02:58:45 | <that_lurker> | Donating to IA would be ideal as well. They are the ones that take the end bulk of data |
03:05:19 | | Webuser179214 joins |
03:05:43 | | Webuser179214 quits [Client Quit] |
03:07:07 | <Hans5958> | That one is true but I think speed is also important, since sites are going down, which means the targets should also be the priority |
03:07:27 | <Hans5958> | (by the way, I apologize for being a busybody. Donated 1$ since my currency sucks) |
03:19:19 | | BlueMaxima quits [Read error: Connection reset by peer] |
03:33:03 | | etnguyen03 quits [Client Quit] |
03:35:18 | | scurvy_duck quits [Ping timeout: 260 seconds] |
03:45:43 | | etnguyen03 (etnguyen03) joins |
03:48:02 | | Webuser489155 joins |
03:48:49 | | Webuser489155 quits [Client Quit] |
04:26:54 | | etnguyen03 quits [Remote host closed the connection] |
04:27:48 | | nicolas17 quits [Ping timeout: 260 seconds] |
04:31:23 | | nicolas17 joins |
04:31:25 | | nicolas17 is now authenticated as nicolas17 |
04:35:28 | | nicolas17 quits [Client Quit] |
04:35:44 | | nicolas17 joins |
04:45:46 | | i_have_n0_idea quits [Quit: The Lounge - https://thelounge.chat] |
04:46:15 | | i_have_n0_idea (i_have_n0_idea) joins |
04:48:01 | | lennier2_ joins |
05:09:41 | | scurvy_duck joins |
05:18:57 | | Roboguy joins |
05:22:22 | <Roboguy> | Hi! I was wondering if anyone has an archive of nifty before it went down last year. I'm looking for a specific blogpost made by Naoyuki Katoh on the development of a model kit (http://homepage2.nifty.com/NaoKatoh/making/starship_troopers/episode_8/index.html). |
05:22:28 | | Webuser637326 quits [Quit: Ooops, wrong browser tab.] |
05:26:43 | | Roboguy quits [Client Quit] |
05:29:56 | | Webuser482381 joins |
05:35:22 | | Roboguy joins |
05:39:12 | | Webuser482381 quits [Client Quit] |
05:39:12 | | Roboguy quits [Client Quit] |
05:47:34 | | earl joins |
05:54:54 | | Webuser457588 joins |
05:56:26 | | SootBector quits [Remote host closed the connection] |
05:56:42 | | SootBector (SootBector) joins |
06:07:38 | <h2ibot> | PaulWise edited Flickr (+379, document current status with AB /cc arkiver): https://wiki.archiveteam.org/?diff=54387&oldid=53134 |
06:13:32 | | some_body quits [Quit: Leaving.] |
06:14:28 | | some_body joins |
06:15:31 | | SootBector quits [Remote host closed the connection] |
06:15:48 | | SootBector (SootBector) joins |
06:17:50 | | katocala quits [Ping timeout: 250 seconds] |
06:18:16 | | katocala joins |
06:22:04 | | qinplus_phone joins |
06:26:56 | | katocala quits [Ping timeout: 250 seconds] |
06:27:09 | | katocala joins |
06:33:22 | | DogsRNice quits [Read error: Connection reset by peer] |
06:44:42 | | some_body6 joins |
06:45:08 | | some_body quits [Ping timeout: 250 seconds] |
06:45:08 | | some_body6 is now known as some_body |
06:56:39 | <pabs> | Roboguy's URL has "This URL has been excluded from the Wayback Machine." in the WBM |
06:58:18 | | scurvy_duck quits [Ping timeout: 260 seconds] |
07:39:14 | | Island quits [Read error: Connection reset by peer] |
07:43:31 | | Naruyoko5 joins |
07:47:53 | | Naruyoko quits [Ping timeout: 260 seconds] |
08:39:39 | <h2ibot> | Bzc6p edited Kephost.com (+65, /* Archiving */ add total size of archives): https://wiki.archiveteam.org/?diff=54388&oldid=51294 |
08:46:40 | <h2ibot> | Bzc6p edited Network.hu (-26, fix archives links): https://wiki.archiveteam.org/?diff=54389&oldid=51085 |
08:52:38 | <pabs> | hmm, this URL is excluded from the WBM https://web.archive.org/web/20250209084927/https://www.dropbox.com/s/zqwuy8bfrw65156/Shoup%20CV%20March%202023.pdf?dl=1 |
08:52:48 | <pabs> | is Dropbox more generally excluded? |
08:53:26 | <tzt> | pabs: yes, see https://wiki.archiveteam.org/index.php/List_of_websites_excluded_from_the_Wayback_Machine/Partial_exclusions |
08:53:54 | | SkilledAlpaca418962 quits [Quit: SkilledAlpaca418962] |
08:54:01 | <pabs> | ugh... |
08:54:37 | | pabs wonders why the search doesn't find it https://wiki.archiveteam.org/index.php?search=dropbox&title=Special%3ASearch&fulltext=Search |
09:03:44 | <h2ibot> | Thezt edited Deathwatch (+105, Add WATMM): https://wiki.archiveteam.org/?diff=54390&oldid=54367 |
09:05:24 | | SkilledAlpaca418962 joins |
09:06:45 | <h2ibot> | Thezt edited Deathwatch (+91, Add reference): https://wiki.archiveteam.org/?diff=54391&oldid=54390 |
09:12:45 | | earl quits [] |
10:21:53 | | qinplus_phone quits [Quit: Connection closed for inactivity] |
11:41:41 | | Gadelhas5628737 quits [Quit: auf Wiedersehen] |
12:00:06 | | Bleo18260072271962345 quits [Quit: The Lounge - https://thelounge.chat] |
12:02:52 | | Bleo18260072271962345 joins |
12:04:31 | <@OrIdow6> | !remind 8h me look into foro3djuegos |
12:04:32 | <eggdrop> | [remind] unable to parse time: unable to convert date-time string "me": syntax error (characters 0-1) |
12:08:14 | | fangfufu quits [Quit: ZNC 1.8.2+deb3.1+deb12u1 - https://znc.in] |
12:12:59 | | fangfufu joins |
12:13:32 | | fangfufu is now authenticated as fangfufu |
12:18:40 | <@OrIdow6> | !remindme 8h look into foro3djuegos |
12:18:41 | <eggdrop> | [remind] ok, i'll remind you at 2025-02-09T20:18:40Z |
12:34:15 | | SkilledAlpaca418962 quits [Quit: SkilledAlpaca418962] |
12:34:46 | | SkilledAlpaca418962 joins |
12:40:03 | | sec^nd quits [Remote host closed the connection] |
12:46:55 | | Gadelhas5628737 joins |
13:01:36 | | carleski joins |
13:02:05 | | carleski leaves |
13:02:43 | | sec^nd (second) joins |
13:09:52 | | carleski joins |
13:10:46 | | carleski quits [Client Quit] |
13:19:57 | | CapyLord joins |
13:22:27 | | CapyLord quits [Client Quit] |
13:28:30 | | monoxane4 (monoxane) joins |
13:30:18 | | monoxane quits [Ping timeout: 250 seconds] |
13:30:19 | | monoxane4 is now known as monoxane |
13:35:37 | | etnguyen03 (etnguyen03) joins |
13:43:22 | | catbottom quits [Quit: ZNC 1.8.2+deb2+deb11u1 - https://znc.in] |
13:43:36 | | catbottom joins |
13:45:21 | | scurvy_duck joins |
14:07:05 | | catbottom quits [Client Quit] |
14:07:49 | | catbottom joins |
14:14:00 | | catbottom quits [Client Quit] |
14:15:14 | | catbottom joins |
14:35:38 | | scurvy_duck quits [Ping timeout: 260 seconds] |
14:52:24 | | scurvy_duck joins |
14:52:57 | | Gadelhas5628737 quits [Client Quit] |
14:54:56 | | Gadelhas5628737 joins |
15:21:52 | | caylin quits [Quit: Ping timeout (120 seconds)] |
15:22:11 | | caylin (caylin) joins |
15:29:59 | | Deewiant quits [Remote host closed the connection] |
15:31:10 | | Deewiant (Deewiant) joins |
15:55:01 | | etnguyen03 quits [Client Quit] |
15:55:22 | <Blueacid> | There was a brief conversation yesterday in the UncleSam Archive channel about the Targets & how much they punish their disks. I was wondering: how many targets (typically) are there, and how much disk space do they typically consume when busy? My reasoning for asking being: Is there any worth in using the Hetzner S3-alike service for uploads? (e.g. get presigned URL for each warrior 'put' and |
15:55:28 | <Blueacid> | use that s3 for staging). Free ingress, free API calls for puts (it seems?) and free bandwidth out to servers hosted in the same Hetzner DC. So for $less-than-a-big-ssd-server it might be worth a thought? Aware this may well have been considered & dismissed, but wanted to ask in case it hadn't been thought of? |
15:55:52 | <Blueacid> | (and if ~50TB needs to hole up somewhere for a day or two, it might not be that expensive) |
15:56:53 | <Blueacid> | I was pricing up buying a server to offer, but wondered about just donating cash, given many of the target operators (seem to be?) using Hetzner and DO |
16:01:18 | | etnguyen03 (etnguyen03) joins |
16:03:39 | | Gadelhas5628737 quits [Client Quit] |
16:03:56 | | Gadelhas5628737 joins |
16:07:01 | <@imer> | Blueacid: https://wiki.archiveteam.org/index.php/Dev/Targets |
16:07:02 | <@imer> | S3 isn't a great fit since we need to pack individual uploads into larger files for uploading to archive.org |
16:08:06 | <@imer> | you can throw us money for general operational costs here https://opencollective.com/archiveteam (or donate to archive.org - helps us indirectly too since they host the data) |
16:08:49 | <Blueacid> | Yeah, the unsteered thought I'd had was: warriors --> s3-esque service --> targets pull from here to repack --> Upload to archive.org |
16:10:09 | <Blueacid> | Thinking: the connections from warriors to s3 are then the 'problem' of Hetzner to deal with (so the cpu / firewalling / whatever is on them), then when a target is ready it immediately grabs, say, 100GB of warcs quickly over the Hetzner LAN, repacks, and uploads |
16:10:11 | <@imer> | ingest from warriors & packing usually isn't the problem - so that wouldn't do much for us aside from add complexity |
16:10:18 | <Blueacid> | Aha very fair :) |
16:10:27 | <Blueacid> | I appreciate you taking the time to listen and to explain <3 |
16:12:17 | <Blueacid> | Oh, one further thought was: If targets are not ready to push to IA, or if they're already flat out, then S3 is what sits holding the data until it's drawn down. Don't know if that solves any problems or merely makes new & different ones :D |
16:12:34 | | etnguyen03 quits [Client Quit] |
16:13:23 | <@imer> | np, main bottleneck is usually uploading data to IA (for a variety of reasons, currently things are going pretty well and it's basically going as fast as IA can handle) |
16:14:20 | <@imer> | Blueacid: we do something like that for our temporary storage, yeah. do have to get that unloaded at some point though, can't keep piling things on top :D |
16:14:22 | <Blueacid> | Yeah I read they're limited to circa 12gbit ingest, and a finite number of connections |
16:15:49 | <Blueacid> | And yep, if the 12gbit limit at the IA stays as-is for the foreseeable, that might start to become a challenge as projects get bigger (I mean, Livestream + US Gov both seem pretty large) |
16:17:15 | <@imer> | if only these video services would stop shutting down left and right. lol |
16:17:15 | | @imer looks at #dailydemotion |
16:19:29 | <Blueacid> | Haha yep, #dailydemotion is going to be tricky.. |
16:20:18 | <Blueacid> | I work adjacent to video streaming for $dayjob, it's scary how large some of the SANs we have at work are.. multiple petabytes is seen as a 'meh, oh yeah' sort of quantity |
16:24:08 | | scurvy_duck quits [Ping timeout: 260 seconds] |
16:31:33 | | scurvy_duck joins |
16:40:58 | | scurvy_duck quits [Ping timeout: 250 seconds] |
16:41:50 | | devkev joins |
16:43:19 | <Hans5958> | Wanted to drop this so these can be backed up. Some Indonesian ministries were split by the elected president (PS. no mention of it on the "Elections" tab). This is the compilation of the priority ones to back up: https://transfer.archivete.am/inline/LDRMP/ministryid.md |
16:44:43 | | etnguyen03 (etnguyen03) joins |
16:48:53 | | dreamstream joins |
16:51:29 | | dreamstream quits [Client Quit] |
17:07:45 | | Webuser949870 joins |
17:08:04 | | Webuser949870 quits [Client Quit] |
17:38:15 | | DogsRNice joins |
17:50:40 | | ^ quits [Remote host closed the connection] |
17:50:49 | | ^ (^) joins |
17:57:09 | | Sidpatchy quits [Quit: The Lounge - https://thelounge.chat] |
18:18:52 | | devkev quits [Remote host closed the connection] |
18:19:25 | | devkev joins |
18:23:06 | | devkev quits [Remote host closed the connection] |
18:24:01 | | devkev joins |
18:28:58 | | devkev quits [Ping timeout: 260 seconds] |
18:29:26 | | devkev joins |
18:34:04 | | devkev quits [Ping timeout: 250 seconds] |
18:42:20 | | devkev joins |
18:43:13 | | datechnoman quits [Quit: Ping timeout (120 seconds)] |
18:43:35 | | datechnoman (datechnoman) joins |
18:55:13 | | devkev quits [Ping timeout: 260 seconds] |
19:04:59 | | Hackerpcs quits [Quit: Hackerpcs] |
19:07:58 | | devkev joins |
19:09:39 | | Hackerpcs (Hackerpcs) joins |
19:11:16 | | etnguyen03 quits [Client Quit] |
19:12:38 | | devkev quits [Ping timeout: 250 seconds] |
19:13:06 | | devkev joins |
19:15:37 | | devkev quits [Remote host closed the connection] |
19:15:44 | | devkev joins |
19:17:23 | | s-crypt quits [Ping timeout: 260 seconds] |
19:17:23 | | Ryz2 quits [Ping timeout: 260 seconds] |
19:17:23 | | Flashfire42 quits [Ping timeout: 260 seconds] |
19:17:23 | | kiska quits [Ping timeout: 260 seconds] |
19:26:16 | | Ryz2 (Ryz) joins |
19:26:27 | | s-crypt (s-crypt) joins |
19:27:01 | | Flashfire42 joins |
19:27:12 | | kiska (kiska) joins |
19:33:13 | | moth_ quits [Quit: Quit] |
19:43:39 | | sec^nd quits [Remote host closed the connection] |
19:44:04 | | sec^nd (second) joins |
20:06:07 | <devkev> | hello, is there any benefit to running a project-specific docker container over warrior-dockerfile with a project selected? |
20:07:42 | <devkev> | I was following the docker instructions after running a specific project and just stumbled upon the warrior-dockerfile which only appears to be mentioned in the podman section not the docker one |
20:10:52 | | devkev quits [Remote host closed the connection] |
20:11:25 | | devkev joins |
20:14:04 | <TheTechRobo> | Project-specific ones are much lower overhead, can have their concurrency set up to 20, and output logs to stdout |
20:14:21 | <TheTechRobo> | They are much better for running projects in bulk |
20:14:28 | <TheTechRobo> | The Warrior is a nice 'set it and forget it' thing |
20:16:18 | | devkev quits [Ping timeout: 260 seconds] |
20:18:42 | <eggdrop> | [remind] OrIdow6: look into foro3djuegos |
20:20:09 | | devkev joins |
20:25:49 | | etnguyen03 (etnguyen03) joins |
20:32:05 | | devkev quits [Remote host closed the connection] |
20:32:33 | | devkev joins |
20:37:18 | | devkev quits [Ping timeout: 260 seconds] |
20:40:38 | | scurvy_duck joins |
20:46:05 | | BornOn420 quits [Remote host closed the connection] |
20:46:55 | | BornOn420 (BornOn420) joins |
20:47:02 | | westonal joins |
20:50:12 | | devkev joins |
20:53:41 | <westonal> | Hi all, I made myself a command line leaderboard watcher, I found the web-based one too noisy and hard to find myself or see my rank. https://github.com/westonal/archive-warrior-leaderboard-cli hope someone else enjoys it. If anyone wants to contribute, main thing that's not ideal about it is it polls a legacy API rather than using the web socket. |
20:54:28 | | devkev quits [Ping timeout: 250 seconds] |
20:57:22 | | lunik11 quits [Quit: :x] |
21:01:06 | | devkev joins |
21:04:08 | | ats quits [Ping timeout: 260 seconds] |
21:06:01 | | ats (ats) joins |
21:07:00 | | Island joins |
21:31:12 | | etnguyen03 quits [Client Quit] |
21:31:32 | | lunik11 joins |
21:32:32 | | dvb (dvb) joins |
21:32:33 | | lunik11 quits [Client Quit] |
21:32:49 | | lunik11 joins |
21:39:32 | | devkev quits [Ping timeout: 250 seconds] |
21:39:51 | | lennier2 joins |
21:42:38 | | lennier2_ quits [Ping timeout: 260 seconds] |
21:43:46 | | devkev joins |
21:44:11 | | devkev quits [Remote host closed the connection] |
21:44:18 | | devkev joins |
21:58:54 | | BlueMaxima joins |
21:59:18 | | Flashfire42 is now authenticated as flashfire42 |
22:06:15 | | devkev_ joins |
22:07:43 | | etnguyen03 (etnguyen03) joins |
22:07:46 | | devkev is now authenticated as devkev |
22:08:13 | | devkev_ quits [Client Quit] |
22:08:23 | | devkev_ joins |
22:09:03 | | devkev_ quits [Client Quit] |
22:10:16 | | devkev_mobile (devkev) joins |
22:13:39 | | devkev quits [Remote host closed the connection] |
22:14:16 | | devkev_mobile quits [Remote host closed the connection] |
22:15:11 | | devkev (devkev) joins |
22:19:35 | | westonal quits [Client Quit] |
22:19:47 | | loug8318142 quits [Quit: The Lounge - https://thelounge.chat] |
22:19:50 | | devkev quits [Ping timeout: 250 seconds] |
22:21:21 | | devkev (devkev) joins |
22:22:24 | | devkev_mobile (devkev) joins |
22:22:57 | | scurvy_duck quits [Remote host closed the connection] |
22:23:16 | | scurvy_duck joins |
22:23:29 | | hexa- quits [Quit: WeeChat 4.4.3] |
22:24:20 | | devkev_mobile quits [Remote host closed the connection] |
22:24:57 | | hexa- (hexa-) joins |
22:26:45 | | devkev_mobile (devkev) joins |
22:26:58 | | devkev quits [Ping timeout: 260 seconds] |
22:28:26 | | westonal joins |
22:29:56 | | devkev (devkev) joins |
22:30:34 | | devkev_mobile quits [Remote host closed the connection] |
22:40:28 | | etnguyen03 quits [Client Quit] |
22:43:18 | | scurvy_duck quits [Ping timeout: 260 seconds] |
22:45:04 | | balrog quits [Quit: Bye] |
22:45:24 | | devkev quits [Ping timeout: 250 seconds] |
22:50:18 | | cascode quits [Ping timeout: 260 seconds] |
22:50:54 | | cascode joins |
22:53:34 | | balrog (balrog) joins |
22:58:09 | | devkev (devkev) joins |
23:02:44 | | devkev quits [Ping timeout: 250 seconds] |
23:03:12 | | devkev (devkev) joins |
23:07:48 | | devkev quits [Ping timeout: 260 seconds] |
23:10:46 | | devkev (devkev) joins |
23:10:57 | | devkev quits [Client Quit] |
23:12:58 | | devkev (devkev) joins |
23:16:00 | | cascode quits [Read error: Connection reset by peer] |
23:16:13 | | cascode joins |
23:16:53 | | dontwashyourhands (dontwashyourhands) joins |
23:20:39 | | scurvy_duck joins |
23:32:12 | | devkev quits [Ping timeout: 250 seconds] |
23:37:10 | | devkev (devkev) joins |
23:41:38 | | devkev quits [Ping timeout: 260 seconds] |
23:41:38 | | scurvy_duck quits [Ping timeout: 260 seconds] |
23:45:31 | | devkev (devkev) joins |
23:46:40 | | etnguyen03 (etnguyen03) joins |
23:52:14 | <pabs> | AB is currently archiving over a million pets. is this maybe excessive? for eg https://www.savethislife.com/900164000215773 |
23:52:49 | <that_lurker> | every pet is important :-) |
23:53:34 | <pabs> | (their sitemap contains every single microchip ID) |
23:53:40 | | lunik11 quits [Client Quit] |
23:53:52 | | devkev quits [Ping timeout: 250 seconds] |
23:53:56 | | lunik11 joins |
23:54:03 | | pabs always walks into giant AB jobs... |
23:54:37 | <nicolas17> | how long has brickshelf been running? D: |
23:55:24 | <pabs> | https://po.savethislife.com/image-pets/0A12610808.jpg |
23:55:28 | <nulldata> | Gotta climb the Shameboard! |
23:56:14 | <nicolas17> | ugh I forgot the shameboard was showing bytes |
23:56:15 | | etnguyen03 quits [Client Quit] |
23:56:24 | <Flashfire42> | it's fine, it's fine. |
23:56:24 | <nicolas17> | instead of a more readable unit |
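Editor's note on the "more readable unit" remark above: the conversion nicolas17 is asking for can be sketched roughly as below. This is only an illustration, not how the Shameboard is implemented; the function name `human_readable` and the choice of binary (1024-based) units are assumptions made for the example.

```python
def human_readable(num_bytes: int) -> str:
    """Format a raw byte count using binary units (hypothetical helper)."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB"]
    value = float(num_bytes)
    for unit in units:
        # Stop at the first unit where the value drops below 1024,
        # or at the largest unit we know about.
        if value < 1024 or unit == units[-1]:
            return f"{value:.2f} {unit}"
        value /= 1024


if __name__ == "__main__":
    # Example: a job that has pulled exactly 1.5 KiB.
    print(human_readable(1536))
```

Decimal (1000-based) units would work the same way with a different divisor; which one a dashboard should show is a presentation choice.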