00:00:19<@JAA>If their servers stop sucking, yeah. #noanswers for that project.
00:01:37<user>Thanks. (Is it considered bad IRC etiquette to reply with just "Thanks" and similar phrases?)
00:01:56<@JAA>No, that's fine. :-)
00:02:12BlueMaxima joins
00:02:19<@JAA>I'd suggest picking a more discernable nickname though.
00:04:40<@OrIdow6>So it doesn't look like Docket has announced the new deletion date yet?
00:04:44<@OrIdow6>*Docker
00:05:53<user>JAA, I tried registering with Nickserv, but for some reason I'm still seen as 'user' here. I'll fix it later.
00:06:09<@JAA>user: /nick NewNickname
00:06:19user is now known as gazorpazorp
00:06:31<gazorpazorp>is that too long?
00:07:43<@JAA>OrIdow6: Nope, doesn't look like it.
00:07:58<@JAA>gazorpazorp: No, that's fine. :-)
00:32:56<@OrIdow6>Has anyone looked into JFrog or Aimix-Z?
00:41:45tzt_ joins
00:43:02tzt quits [Ping timeout: 250 seconds]
00:43:08webdownload joins
00:45:09webdownload quits [Remote host closed the connection]
00:45:26<@OrIdow6>Both look like they may need warrior projects (though I haven't yet determined how many forums Aimix-Z hosts)
00:46:04<@OrIdow6>JFrog is basically a big repository for software builds and is likely many TB at least
01:02:52dm4v quits [Read error: Connection reset by peer]
01:04:31dm4v joins
01:04:33dm4v quits [Changing host]
01:04:33dm4v (dm4v) joins
01:08:49gazorpazorp quits [Client Quit]
01:13:30gazorpazorp (gazorpazorp) joins
01:17:14<lennier1>A trend I have mixed feelings about, but relevant for archiving--news sites considering removal of old stories that may have unfair consequences for the subjects covered, such as someone arrested for a crime but never charged: https://www.kentucky.com/opinion/editorials/article250720789.html
01:22:15Iki quits [Remote host closed the connection]
01:22:17<gazorpazorp>What about "the right to be forgotten" and laws like that? I'm in favor of such stories being archived and publicly available, but am genuinely wondering if that's a real problem
01:22:32<gazorpazorp>legally speaking
01:23:10Mineroboter_ joins
01:25:04Mineroboter quits [Ping timeout: 250 seconds]
01:25:41Sylirana quits [Remote host closed the connection]
01:27:46<lennier1>In this case, it's a US site, so there's no legal requirement for them to do anything. But I see the point that someone can be arrested for a minor crime, and even if they're not convicted, it's still keeping them from getting jobs a decade later because it shows up in search engines.
01:29:21<gazorpazorp>IANAL, but even a US site has to conform to GDPR, so maybe it has to conform to other EU laws if it wants to operate in the EU? Like how some sites just block EU IPs.
01:29:30Sylirana (Sylirana) joins
01:31:57<gazorpazorp>Are any of the current archives accessible via a distributed system? Or are they only on centralized sites like the Wayback Machine? Because if it's centralized, it could be subject to requests for removal from the respective government
01:32:45<gazorpazorp>So perhaps if we want to archive such news articles, it should be done in a censorship-resistant way
01:33:01<gazorpazorp>(I'm new to the whole scene, so forgive anything ignorant I write)
01:33:27<G4te_Keep3r>an evolving news collection torrent?
01:36:34<gazorpazorp>Are torrents flexible enough to support different overlapping subsets of information? For example, you have news categorized by country and by topic (two dimensions). If someone wants to download news for country X on topic Y, will they be able to easily download only those news, or will they have to add many torrents or select specific files from a torrent with lots of files?
01:43:36<G4te_Keep3r>touche...great for distributed files not so much for common everyday person to grab specific thing/location...and would probably be in warc format so would need whatever is used to read those
01:46:19<gazorpazorp>I don't know much about IPFS, but that seems like an alternative for file sharing/storage worth investigating
02:29:12G4te_Keep3r quits [Ping timeout: 258 seconds]
02:30:08G4te_Keep3r joins
02:50:29<ave>gazorpazorp, I believe most archives go to IA, usually in wayback machine. Requests for removals are understandable and are probably for the best, though I can't see US govt sending one. Definitely can see news conglomerates sending DMCA strikes tho.
02:50:53<ave>Most torrent clients let you see a file/folder list before you download and let you pick which ones you want.
02:54:11<ave>re IPFS, while it's distributed, the data is only guaranteed to stay up as long as one node is pinning it, or at least that's my understanding. Similar to torrents in that regard, I'd say. *sigh* As much as I hate cryptocurrencies, there's filecoin which tries to address that afaik.
02:54:23<ave>IA was granted a ton of filecoins by filecoin foundation this month. https://blog.archive.org/2021/04/01/filecoin-foundation-grants-50000-fil-to-the-internet-archive/
02:56:54<@OrIdow6>A US news site removing material under order of a foreign country enforcing a law that violates the First Amendment would not be looked well upon
02:58:51<ave>I assume they could do what google does when you use the right to be forgotten thing in EU: Hide the content for certain regions. (https://www.google.com/webmasters/tools/legal-removal-request?complaint_type=rtbf&hl=en&rd=1)
03:06:01<gazorpazorp>I'm glad I was wrong about compliance with EU laws for foreign sites
03:06:53<gazorpazorp>But the problem with censorship still stands - as long as data is kept in a central location, it's vulnerable to any number of bad (wrt freedom of speech/archiving) actors
03:08:17<gazorpazorp>Filecoin seems interesting. I also suspected a cryptocurrency would have to get mixed up in a distributed archiving/file sharing network, to incentivize people to keep the data
03:26:00<@JAA>Poking a bit at the JFrog platforms...
03:28:08<@OrIdow6>Aimix-Z is my own first priority among the two, both because of expected capacity and because of potential value, so may see about writing a script for that soon
03:28:29<@OrIdow6>Though need to do much more exploration first
03:28:40<@OrIdow6>Thanks for looking at JFrog
03:29:01<@JAA>Bintray has no sitemap or obvious option of enumeration. The API has ridiculous rate limits and is useless. The search seems to return partial matches and requires at least two letters. Bruteforcing aa through zz should work for discovery. There are four tabs for each search term, and they often have many thousands of pages. So even that enumeration will be millions of requests.
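A minimal sketch of the discovery enumeration described above, assuming only what the message states: two-letter search terms from aa through zz and four tabs per term. The figure of 1000 pages per tab is a hypothetical placeholder for "many thousands of pages", not a measured value, and the function names are illustrative.

```python
from itertools import product
from string import ascii_lowercase


def two_letter_terms():
    """Yield every two-letter lowercase search term, 'aa' through 'zz'."""
    for first, second in product(ascii_lowercase, repeat=2):
        yield first + second


def estimate_requests(num_terms, tabs_per_term, pages_per_tab):
    """Rough request count if every page of every tab is fetched once."""
    return num_terms * tabs_per_term * pages_per_tab


terms = list(two_letter_terms())
print(len(terms), terms[0], terms[-1])  # 676 aa zz

# ASSUMPTION: 1000 pages per tab is a made-up average for illustration.
print(estimate_requests(len(terms), tabs_per_term=4, pages_per_tab=1000))  # 2704000
```

Even with that placeholder average, 676 terms times four tabs lands in the millions of requests, which matches the order of magnitude quoted in the message.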
03:29:34<@JAA>I haven't fully understood the relation between 'packages', 'files', and 'repositories' yet.
03:30:29<@JAA>GoCenter seems to have vanished already. Both gocenter.io and search.gocenter.io redirect to the shutdown announcement.
03:32:27<@JAA>As I understand it though, GoCenter was essentially a caching platform similar to the main Go proxy. I'm not sure there was anything original on it, but perhaps it had some things that are no longer on GitHub et al. Alas, it appears to be too late.
03:34:17<@JAA>ChartCenter apparently similar but for Helm.
03:35:05<@JAA>There is still something at https://repo.chartcenter.io/, will try to get a size estimate.
03:35:57<@JAA>Actually yeah, the shutdown notice says that the GoCenter and ChartCenter websites were taken down end of February, but 'client requests will still work' until 1 May.
03:37:36<@JAA>JCenter will stay online until February 2022 for package downloads. However, discovery might no longer be possible after 1 May.
03:44:38<@JAA>ChartCenter size according to a 1 % sample HEAD of the URLs returned in https://repo.chartcenter.io/: 1.6 GB...? I have no idea whether that's correct.
03:46:11<@JAA>I'll throw that into AB anyway.
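A rough sketch of how a sampled size estimate like the ChartCenter one above could be computed, assuming the full URL list is already in hand. The helper names, the seed, and the example sizes are hypothetical; the actual sampling method used is not stated in the log. The estimate only holds if the sample is representative and the server reports Content-Length on HEAD.

```python
import random
import urllib.request


def sample_urls(urls, fraction, seed=0):
    """Pick a random subset of URLs (at least one when the list is non-empty)."""
    rng = random.Random(seed)
    k = min(len(urls), max(1, round(len(urls) * fraction)))
    return rng.sample(urls, k)


def head_size(url, timeout=30):
    """Return Content-Length from a HEAD request, or None if the header is absent."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        length = response.headers.get("Content-Length")
        return int(length) if length is not None else None


def extrapolate_total(sampled_sizes, population_count):
    """Scale the mean of the known sampled sizes up to the whole population."""
    known = [size for size in sampled_sizes if size is not None]
    if not known:
        raise ValueError("no usable sizes in the sample")
    return sum(known) / len(known) * population_count


# Offline illustration with made-up sizes (bytes); no network access needed.
fake_sizes = [100, 200, None, 300]
print(extrapolate_total(fake_sizes, population_count=1000))  # 200000.0
```

In real use the sizes would come from `[head_size(u) for u in sample_urls(all_urls, 0.01)]`, with `all_urls` being whatever index the repository exposes.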
03:51:23etnguyen03 quits [Client Quit]
03:53:19DogsRNice quits [Read error: Connection reset by peer]
03:55:47qw3rty__ joins
03:57:53<@OrIdow6>Looks like there are something like 16k "rooms" (i.e. forums, as far as I can tell) in Aimix-Z in the IA CDX
03:59:17qw3rty_ quits [Ping timeout: 258 seconds]
04:03:36<@OrIdow6>Roughly accords with the "sites" count on this page http://www.aimix-z.com/view.html, if that's what it's referring to
04:03:42<atphoenix>gazorpazorp: you can generally run 1 instance of URLteam in parallel with anything else. It has minimal system and network requirements, and is IP-limited.
04:20:41<@JAA>I've sent an email to JFrog about GoCenter as there is no public index of the modules.
04:25:11<@JAA>JCenter has an index at https://bintray.com/bintray/jcenter but the pagination seems to max out at an offset of 10k, so page 1250 of allegedly 54k. It stops at Schema* though (sorted alphabetically), so that doesn't seem quite right.
04:47:13Stiletto joins
05:12:22tzt_ is now known as tzt
05:19:54<@JAA>No idea what this is exactly and how it relates to the shutdown: http://oss.jfrog.org/artifactory/jcenter-cache/
05:35:49Arcorann (Arcorann) joins
05:42:38<tech234a>“Founder of Adobe and developer of PDFs dies at age 81” https://apnews.com/article/business-john-warnock-san-francisco-b77f216f52d736a6b5a383a429208f51
07:12:50rsn quits [Remote host closed the connection]
07:13:12rsn joins
07:21:16G4te_Keep3r quits [Ping timeout: 250 seconds]
07:21:55G4te_Keep3r joins
07:55:05LeighR (LeighR) joins
07:57:12wickedplayer494 quits [Remote host closed the connection]
08:04:49wickedplayer494 joins
08:33:50BlueMaxima quits [Client Quit]
08:48:17hooway joins
09:55:02mutantmonkey quits [Remote host closed the connection]
09:55:17mutantmonkey (mutantmonkey) joins
10:02:29Dj-Wawa quits [Quit: Dj-Wawa]
10:02:53Dj-Wawa joins
10:30:12Terbium quits [Quit: http://quassel-irc.org - Chat comfortably. Anywhere.]
10:30:35Terbium joins
10:47:55Matthww quits [Ping timeout: 258 seconds]
10:57:08Matthww joins
10:58:42spirit joins
11:03:20chriscoffee quits [Quit: Connection closed for inactivity]
12:47:41Matthww quits [Client Quit]
12:51:40spirit2 joins
12:51:52Matthww joins
12:55:11spirit quits [Ping timeout: 258 seconds]
14:37:51emernic joins
14:43:59<emernic>Basic question from a newbie: would it be useful for me to spin up several workers on Google Cloud K8s and just leave the project set to "auto" all the time? or is that kind of "raw firepower" rarely the bottleneck?
14:44:31<emernic>(or maybe there's some other reason that wouldn't work, e.g. I don't know how GKE manages outgoing IPs and whether they could just be banned quickly)
14:47:09<emernic>(I mean the warrior images, sorry if that was ambiguous)
15:03:01<LeighR>Unless you have a lot of varied outbound IP addresses, not really.
15:03:24<LeighR>and you would need to figure out how to make your pods use them
15:04:42<LeighR>perhaps run warriors as a StatefulSet, with one per node, on a cluster that you already have up doing other stuff
15:05:12<LeighR>Diverse IPs are the main bottleneck for most projects, from what I've seen
15:05:32<LeighR>most of us have more "firepower" than the projects can use
15:05:59<emernic>That makes sense, thanks for the info!
15:11:52<LeighR>based on how Azure Kubernetes Service worked before I went on maternity leave, GKE might also route all cluster outbound traffic over a single load balancer public IP
15:12:03<LeighR>which would be very counterproductive for AT
15:12:58<LeighR>do some testing with plain Ubuntu/your favorite distro and "curl http://ip4.me/api/" as well as "curl http://ip6.me/api/"
15:30:17<emernic>Yeah, that makes sense. I know it's not just 1 stable, predictable IP for all outbound GKE traffic by default (since I wanted this a while ago and it was a huge pain to set up), but I think it's one per node by default (which, like you said, wouldn't make sense unless those nodes were also being used for something else). I'll keep digging for ways
15:30:18<emernic>to get a bunch of "good" outgoing IPs on Google Cloud.
15:31:55<LeighR>if there's a way you could get a random one each time, that would be amazing
15:58:14Arcorann quits [Ping timeout: 250 seconds]
16:22:33emernic quits [Remote host closed the connection]
16:51:24DogsRNice (Webuser299) joins
17:41:12etnguyen03 (etnguyen03) joins
19:47:30Jonboy3451 quits [Read error: Connection reset by peer]
19:53:28Jonboy345 joins
22:09:43LeighR quits [Client Quit]
22:11:11@EggplantN quits [Client Quit]
22:11:55EggplantN joins
22:41:07EggplantN quits [Changing host]
22:41:07EggplantN (EggplantN) joins
22:41:07@ChanServ sets mode: +o EggplantN
23:02:33BlueMaxima joins
23:06:31hooway quits [Client Quit]
23:07:52Viniter7 (Viniter) joins
23:08:46Sanqui quits [Remote host closed the connection]
23:08:59Sanqui joins
23:09:50Viniter quits [Ping timeout: 250 seconds]
23:09:50Viniter7 is now known as Viniter
23:34:10Arcorann (Arcorann) joins
23:44:16Stilett0 joins
23:48:00Stiletto quits [Ping timeout: 258 seconds]
23:53:23aphitex22 joins