00:34:39AlsoHP_Archivist quits [Client Quit]
01:28:23<@JAA>4.5 hours ago, I started a derive. It has so far copied roughly 30 GiB of the item's data to the iw server. :-|
01:37:53HP_Archivist quits [Read error: Connection reset by peer]
01:38:16HP_Archivist (HP_Archivist) joins
01:38:46<TheTechRobo>JAA: pop into IA and copy the bits manually from each hard drive using a magnet and a steady hand
01:39:36<@JAA>Yeah, lol
01:50:11<@arkiver>very steady hand :P
01:58:52<nicolas17>the bits are like all right next to each other
02:03:17<fireonlive>IA should move to this tbh: https://github.com/yarrick/pingfs
02:42:02threedeeitguy39 quits [Ping timeout: 252 seconds]
02:53:46<DigitalDragons>anyone know if it's possible (or how) to make the CDX api spit out all the pages when using the domain search?
02:54:17<DigitalDragons>eg, domain searching example.com will return example.com/foo but not example.com/foo/bar
02:56:54<@JAA>You need to paginate yourself. Add &showNumPages=true to get the number of pages, then do &page=0, &page=1, etc. (resumeKey pagination was pretty broken in some cases as of last time I tried it.)
02:57:43<@JAA>(Also, shameless plug for ia-cdx-search in my little-things repo.)
02:59:34<fireonlive>https://gitea.arpa.li/JustAnotherArchivist/little-things
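A minimal sketch of the manual pagination JAA describes: ask for the page count with showNumPages=true, then request page=0, page=1, and so on. The endpoint URL and the matchType=domain parameter are assumptions about the public Wayback CDX server, and the helper names (cdx_url, iter_cdx_lines) are made up for illustration; this is not how ia-cdx-search works internally.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed public CDX endpoint; not stated in the conversation.
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def cdx_url(domain, **extra):
    """Build a CDX query URL for a domain search (matchType=domain is an assumption)."""
    params = {"url": domain, "matchType": "domain"}
    params.update(extra)
    return CDX_ENDPOINT + "?" + urlencode(params)


def iter_cdx_lines(domain, fetch=None):
    """Yield every CDX line for a domain by walking the pages one at a time."""
    if fetch is None:
        # Default fetcher does a plain HTTP GET; pass your own for retries or testing.
        fetch = lambda url: urlopen(url).read().decode("utf-8")
    # Step 1: showNumPages=true returns just the number of pages.
    num_pages = int(fetch(cdx_url(domain, showNumPages="true")).strip())
    # Step 2: request page=0 .. page=num_pages-1 in order.
    for page in range(num_pages):
        for line in fetch(cdx_url(domain, page=page)).splitlines():
            if line:
                yield line
```

Usage would look like `for line in iter_cdx_lines("example.com"): print(line)`; for a large domain it is probably worth adding a delay and retries inside a custom fetch function.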
03:00:40<nicolas17>I'm uploading at 5MiB/s pretty smoothly, hope I'm not killing their capacity /s
03:01:40<@JAA>The uploads go pretty fast, yeah. The processing after the uploads is what's slow.
03:01:45<nicolas17>ah good
03:02:42<nicolas17>I'm uploading this like 2 weeks later than I should have, and it's for preservation just in case apple deletes it (which they have done in the past), so idc if it takes long to process, heck I'd mark it low priority if I could
03:04:32<@JAA>Since everything is smashing into the global task limit wall regularly currently, I don't think marking tasks as low priority would help.
03:05:10<@JAA>There is a way to specify a priority, but I have no idea if/how that works.
03:05:58<nicolas17>low-priority is only useful if enough people know about it and use it
03:06:24<fireonlive>can i make everything of mine urgent? :3
03:06:42<@JAA>But even then, those tasks would just sit in the queue and effectively lower the global task limit.
03:07:01<nicolas17>ah yeah probably not useful when things are *this* full
03:07:11<nicolas17>also what happens when the global limit is reached? :|
03:07:52<fireonlive>JAA’s upload takes 3-4 business weeks to process
03:07:57<@JAA>You get a 503 on upload.
03:08:03<nicolas17>ah... yikes
03:08:14<fireonlive>oh, i read raised not reached
03:08:28<@JAA>And naturally, that happens when it tries to submit the task, i.e. upon completion of the upload rather than at the start.
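Since the 503 only shows up once the upload has finished and the task is submitted, one way to cope is to retry the whole upload with a growing delay. This is a rough sketch under that assumption: do_upload is a hypothetical callable that performs one full upload attempt and returns the HTTP status code, and the delay values are arbitrary.

```python
import time


def upload_with_retry(do_upload, max_attempts=5, base_delay=60.0, sleep=time.sleep):
    """Call do_upload() until it returns something other than 503.

    do_upload is a hypothetical callable returning an HTTP status code.
    The wait doubles after each 503 (base_delay, 2*base_delay, ...).
    """
    for attempt in range(max_attempts):
        status = do_upload()
        if status != 503:
            return status
        # Only wait if another attempt is still coming.
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)
    return 503
```

Note that every retry re-sends the whole file, so for large uploads a long base delay is likely cheaper than many quick attempts.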
03:08:30<nicolas17>is there a graph of global tasks over time?
03:08:58<@JAA>Not sure. There is one for derives, but I haven't found one for all tasks.
03:09:43<nicolas17>like something that means "you'll get 503 on upload" when the graph hits the roof at 100%
03:11:39<@JAA>You can do a check_limit=1 request against the S3 API, but I haven't tried that. https://archive.org/developers/ias3.html#use-limits
03:12:22<nicolas17>oh but that's just boolean
03:12:25<@JAA>And of course, it might just happen that this says it's fine but then you get a 503 anyway by the time the upload completes.
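A sketch of what a check_limit=1 pre-flight check could look like. The s3.us.archive.org endpoint, the accesskey and bucket parameters, and the over_limit field in the JSON response are assumptions based on the linked ias3 documentation, so verify them against it before relying on this. As noted above, a "fine" answer is only a snapshot and the upload can still hit a 503 later.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed IAS3 endpoint; see the ias3 docs linked above.
S3_ENDPOINT = "https://s3.us.archive.org/"


def check_limit_url(bucket=None, accesskey=None):
    """Build the check_limit=1 URL; bucket and accesskey are optional (assumed parameter names)."""
    params = {"check_limit": 1}
    if accesskey:
        params["accesskey"] = accesskey
    if bucket:
        params["bucket"] = bucket
    return S3_ENDPOINT + "?" + urlencode(params)


def is_over_limit(response_text):
    """Treat anything other than an explicit over_limit == 0 as 'over the limit'."""
    return json.loads(response_text).get("over_limit") != 0


def upload_allowed_now(bucket=None, accesskey=None):
    """Do the request and report whether an upload would currently be accepted."""
    with urlopen(check_limit_url(bucket, accesskey)) as resp:
        return not is_over_limit(resp.read().decode("utf-8"))
```

Treating a missing over_limit field as "over" is a deliberately cautious choice; flip it if false alarms are more annoying than a failed upload.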
03:12:49<nicolas17>I wanted a graph because then I'd know not only whether we're at the limit but also how far away we are and how fast we're approaching it :P
03:13:03<@JAA>Yeah
03:13:14<@JAA>If you find one, let me know. :-)
03:13:30<nicolas17>same reason I want a "free disk space on AT targets" graph, how far are we from hitting "max connections -1"
03:19:44<DigitalDragons>I haven't seen a graph but found that you can check https://catalogd.archive.org/catalog.php?summary=1 for the current count
03:20:26threedeeitguy39 (threedeeitguy) joins
04:30:08Iki quits [Read error: Connection reset by peer]
04:48:02<nicolas17>huh
04:49:36<nicolas17>DigitalDragons: that page sometimes shows only the counts, and sometimes shows a task list with 4000 rows
05:09:59whoami quits [Ping timeout: 252 seconds]
05:17:23whoami (whoami) joins
05:20:40<fireonlive>i only see "My Tasks Not Yet Completed" :)
05:21:08<fireonlive>there is "where am I in line?" though https://catalogd.archive.org/catalog.php?whereami=1
05:22:21<fireonlive>(linked from the summary page)
05:55:03<nicolas17>JAA: https://archive.org/~tracey/stats/derivesg.html this may be it?
05:55:40<nicolas17>or at least a reasonable approximation
05:57:38<nicolas17>hmm no
05:57:51<nicolas17>that says 600 items waiting for derive
05:58:04<nicolas17>catalog says there are 94285 tasks waiting to run
06:12:31BigBrain_ quits [Ping timeout: 245 seconds]
06:14:22<nicolas17>JAA: so I can't simultaneously upload multiple files on the same non-existent item, but after I finish uploading the first file, can I already upload more files, even if the processing tasks didn't run yet and the item doesn't show up on the web yet?
06:14:41BigBrain_ (bigbrain) joins
06:18:24<nicolas17>oh huh
06:18:28<nicolas17>I just got "Please reduce your request rate. - total_tasks_queued exceeds global_limit"
06:18:32<nicolas17>*mid-upload*
06:21:20nicolas17 quits [Client Quit]
06:42:30jtagcat quits [Quit: Bye!]
06:45:54jtagcat (jtagcat) joins
07:23:43Arcorann (Arcorann) joins
07:24:44nulldata quits [Ping timeout: 252 seconds]
07:27:48nulldata (nulldata) joins
07:27:58<masterX244>Sucks when you get that in the middle of a CSV upload. Had to manually redo it to resume the upload (luckily I've got backup monitoring on my automation, so I know what to check if the failure doesn't get noticed immediately)
08:13:46BigBrain_ quits [Ping timeout: 245 seconds]
08:16:08BigBrain_ (bigbrain) joins
09:38:56pabs quits [Ping timeout: 252 seconds]
09:41:33pabs (pabs) joins
11:18:46PredatorIWD_ quits [Read error: Connection reset by peer]
11:25:08PredatorIWD joins
11:25:50Exorcism quits [Quit: issued !quit command]
11:36:07Exorcism (exorcism) joins
11:39:57PredatorIWD_ joins
11:42:08PredatorIWD quits [Ping timeout: 252 seconds]
13:30:37IDK quits [Quit: Connection closed for inactivity]
13:39:52BearFortress quits [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
13:41:54Arcorann quits [Ping timeout: 265 seconds]
14:05:55BearFortress joins
14:10:29<DigitalDragons>nicolas17: 600 waiting derives might be correct
14:11:29<DigitalDragons>that 95k is everything (archive.php, derive, modify_xml, virus scan...)
14:15:26BigBrain_ quits [Ping timeout: 245 seconds]
14:18:40BigBrain_ (bigbrain) joins
15:11:41BigBrain_ quits [Ping timeout: 245 seconds]
15:13:52BigBrain_ (bigbrain) joins
16:30:08<@JAA>nicolas17: Derives are only one type of task, yeah, and that's the only graph I know that shows task number not task rate.
16:30:59<@JAA>nicolas17: Correct, you can keep uploading even if the processing didn't run. It seems to check for existence on archive.php task submission or similar.
17:15:22Iki joins
17:27:30Exorcism quits [Client Quit]
17:32:31Exorcism (exorcism) joins
18:25:47<@JAA>> FATAL ERROR: remote dsync failed, NEWLY RSYNCed FILE(S) MAY HAVE BEEN CORRUPTED
18:26:34<@JAA>Nothing to see here, move along.
18:33:33Exorcism quits [Client Quit]
18:35:29<fireonlive>https://mkx9delh5a.execute-api.ca-central-1.amazonaws.com/uploads/3f3bb12cc1fa5ebd/image.jpeg
18:36:24<that_lurker>https://lounge.kuhaon.fun/folder/988351bde3a3d9cf/homer-simpson-i-see.gif
18:38:07<@JAA>https://i.imgur.com/MBlS7Wr.gif
18:38:47<that_lurker>https://lounge.kuhaon.fun/folder/97178a847a24d723/disappearing.gif
19:04:26Exorcism (exorcism) joins
19:24:41DLoader quits [Ping timeout: 252 seconds]
19:31:44DLoader joins
20:01:44Doranwen quits [Remote host closed the connection]
20:02:35Doranwen (Doranwen) joins
20:16:54Doranwen quits [Remote host closed the connection]
20:17:41Doranwen (Doranwen) joins
20:23:42Doranwen quits [Remote host closed the connection]
20:24:26Doranwen (Doranwen) joins
21:30:01nicolas17 joins
21:30:24<nicolas17>JAA: well it seems "tasks waiting to run" in catalogd.archive.org is now higher than last night
21:30:30<nicolas17>so it's getting worse \o/
21:51:09balrog quits [Quit: Bye]
21:52:25<TheTechRobo>Is this accurate? https://lounge.thetechrobo.ca/uploads/6fc00160af3b5629/image.png
21:52:37<TheTechRobo>Found on https://archive.org/~tracey/stats/ after poking around catalogd.archive.org a bit
21:56:22<fireonlive>line go up :D
21:57:53balrog (balrog) joins
21:58:17<@JAA>TIL it's also available under 'stats'.
21:58:19<TheTechRobo>also, btw, https://archive.org/web/petabox.php says 212 PB when https://catalogd.archive.org/report/space.php says ~144,000 TB, am I confused?
21:58:38<nicolas17>Waiting to run: 93896
21:58:39<@JAA>Yes, but that's tasks per second, not number of tasks pending execution.
21:58:39<nicolas17>Running: 4119
21:58:41<nicolas17>is what I see in catalog.php
21:58:52<@JAA>Probably tasks completed per second or similar.
21:59:13<@JAA>petabox.php is 'as of December 2021'.
21:59:33<TheTechRobo>JAA: did they really lose that much space usage in two years, though?
21:59:42<@JAA>It's not a loss.
22:00:11<@JAA>The 141 PiB is unique data, the 212 'PB' (PiB?) are both copies.
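The arithmetic behind the 141 PiB figure, as a small sketch. It assumes the "~144,000 TB" on space.php is really binary TiB, which is what JAA's number implies; the function names are made up. The 212 figure from petabox.php is older (December 2021) and counts both copies, so it is not directly comparable.

```python
def tib_to_pib(tib):
    """Binary units: 1 PiB = 1024 TiB."""
    return tib / 1024


def pib_to_pb(pib):
    """Convert binary PiB (2**50 bytes) to decimal PB (10**15 bytes)."""
    return pib * 2**50 / 10**15


unique_pib = tib_to_pib(144_000)   # 140.625, which rounds to the 141 PiB quoted above
unique_pb = pib_to_pb(unique_pib)  # the same amount expressed in decimal PB
```

The gap between the binary and decimal readings is about 12.6% at this scale, which is why it matters whether a "PB" label really means PiB.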
22:00:52<TheTechRobo>ahhhhh I should have realised
22:00:55<fireonlive>can we use P∅B for when it's not PiB
22:01:02<fireonlive>(we as a society)
22:01:13<fireonlive>so we're clear that the other party knows iB exists and didn't use them
22:01:14<fireonlive>:p
22:02:16<@JAA>P!B
22:02:40<fireonlive>could work too :3
22:03:28<fireonlive>https://en.wikipedia.org/wiki/Zero_(linguistics)
22:03:30<fireonlive>"In the English sentence nobody knows ∅ the zero pronoun plays the role of the object of the verb, and in ∅ makes no difference it plays the role of the subject. Likewise, the zero pronoun in the book ∅ I am reading plays the role of the relative pronoun that in the book that I am reading."
22:03:37<fireonlive>my brain broke for these sentences
22:03:50<fireonlive>🧠🩹
22:11:43<fireonlive>i will say
22:11:58<fireonlive>the new version of the servers look 👌
22:12:19<fireonlive>i would have added a chef there but :chef brings me no chef; whatever library the lounge uses sucks for searching 'moji
22:16:16<imer>👨‍🍳
22:16:56<fireonlive>:D thanks
22:17:02<fireonlive>👨‍🍳👌
22:23:26nulldata quits [Ping timeout: 252 seconds]
22:27:51nulldata (nulldata) joins
22:48:21BigBrain_ quits [Ping timeout: 245 seconds]
22:50:40BigBrain_ (bigbrain) joins
22:54:35<flashfire42>hank: drain of 8T disks (remainder of ia600503) does this mean they are getting bigger disks? is this the hardware thing we are dealing with atm?