| 00:34:39 | | AlsoHP_Archivist quits [Client Quit] |
| 01:28:23 | <@JAA> | 4.5 hours ago, I started a derive. It has so far copied roughly 30 GiB of the item's data to the iw server. :-| |
| 01:37:53 | | HP_Archivist quits [Read error: Connection reset by peer] |
| 01:38:16 | | HP_Archivist (HP_Archivist) joins |
| 01:38:46 | <TheTechRobo> | JAA: pop into IA and copy the bits manually from each hard drive using a magnet and a steady hand |
| 01:39:36 | <@JAA> | Yeah, lol |
| 01:50:11 | <@arkiver> | very steady hand :P |
| 01:58:52 | <nicolas17> | the bits are like all right next to each other |
| 02:03:17 | <fireonlive> | IA should move to this tbh: https://github.com/yarrick/pingfs |
| 02:42:02 | | threedeeitguy39 quits [Ping timeout: 252 seconds] |
| 02:53:46 | <DigitalDragons> | anyone know if it's possible (or how) to make the CDX api spit out all the pages when using the domain search? |
| 02:54:17 | <DigitalDragons> | eg, domain searching example.com will return example.com/foo but not example.com/foo/bar |
| 02:56:54 | <@JAA> | You need to paginate yourself. Add &showNumPages=true to get the number of pages, then do &page=0, &page=1, etc. (resumeKey pagination was pretty broken in some cases as of last time I tried it.) |
| 02:57:43 | <@JAA> | (Also, shameless plug for ia-cdx-search in my little-things repo.) |
| 02:59:34 | <fireonlive> | https://gitea.arpa.li/JustAnotherArchivist/little-things |
| 03:00:40 | <nicolas17> | I'm uploading at 5MiB/s pretty smoothly, hope I'm not killing their capacity /s |
| 03:01:40 | <@JAA> | The uploads go pretty fast, yeah. The processing after the uploads is what's slow. |
| 03:01:45 | <nicolas17> | ah good |
| 03:02:42 | <nicolas17> | I'm uploading this like 2 weeks later than I should have, and it's for preservation just in case apple deletes it (which they have done in the past), so idc if it takes long to process, heck I'd mark it low priority if I could |
| 03:04:32 | <@JAA> | Since everything is smashing into the global task limit wall regularly currently, I don't think marking tasks as low priority would help. |
| 03:05:10 | <@JAA> | There is a way to specify a priority, but I have no idea if/how that works. |
| 03:05:58 | <nicolas17> | low-priority is only useful if enough people know about it and use it |
| 03:06:24 | <fireonlive> | can i make everything of mine urgent? :3 |
| 03:06:42 | <@JAA> | But even then, those tasks would just sit in the queue and effectively lower the global task limit. |
| 03:07:01 | <nicolas17> | ah yeah probably not useful when things are *this* full |
| 03:07:11 | <nicolas17> | also what happens when the global limit is reached? :| |
| 03:07:52 | <fireonlive> | JAA’s upload takes 3-4 business weeks to process |
| 03:07:57 | <@JAA> | You get a 503 on upload. |
| 03:08:03 | <nicolas17> | ah... yikes |
| 03:08:14 | <fireonlive> | oh, i read raised not reached |
| 03:08:28 | <@JAA> | And naturally, that happens when it tries to submit the task, i.e. upon completion of the upload rather than at the start. |
| 03:08:30 | <nicolas17> | is there a graph of global tasks over time? |
| 03:08:58 | <@JAA> | Not sure. There is one for derives, but I haven't found one for all tasks. |
| 03:09:43 | <nicolas17> | like something that means "you'll get 503 on upload" when the graph hits the roof at 100% |
| 03:11:39 | <@JAA> | You can do a check_limit=1 request against the S3 API, but I haven't tried that. https://archive.org/developers/ias3.html#use-limits |
| 03:12:22 | <nicolas17> | oh but that's just boolean |
| 03:12:25 | <@JAA> | And of course, it might just happen that this says it's fine but then you get a 503 anyway by the time the upload completes. |
| 03:12:49 | <nicolas17> | I wanted a graph because then I'd know not only whether we're at the limit but also how far from it we are and how fast we're approaching it :P |
| 03:13:03 | <@JAA> | Yeah |
| 03:13:14 | <@JAA> | If you find one, let me know. :-) |
| 03:13:30 | <nicolas17> | same reason I want a "free disk space on AT targets" graph, how far are we from hitting "max connections -1" |
| 03:19:44 | <DigitalDragons> | I haven't seen a graph but found that you can check https://catalogd.archive.org/catalog.php?summary=1 for the current count |
| 03:20:26 | | threedeeitguy39 (threedeeitguy) joins |
| 04:30:08 | | Iki quits [Read error: Connection reset by peer] |
| 04:48:02 | <nicolas17> | huh |
| 04:49:36 | <nicolas17> | DigitalDragon: that page sometimes shows only the counts, and sometimes shows a task list with 4000 rows |
| 05:09:59 | | whoami quits [Ping timeout: 252 seconds] |
| 05:17:23 | | whoami (whoami) joins |
| 05:20:40 | <fireonlive> | i only see "My Tasks Not Yet Completed" :) |
| 05:21:08 | <fireonlive> | there is "where am I in line?" though https://catalogd.archive.org/catalog.php?whereami=1 |
| 05:22:21 | <fireonlive> | (linked from the summary page) |
| 05:55:03 | <nicolas17> | JAA: https://archive.org/~tracey/stats/derivesg.html this may be it? |
| 05:55:40 | <nicolas17> | or at least a reasonable approximation |
| 05:57:38 | <nicolas17> | hmm no |
| 05:57:51 | <nicolas17> | that says 600 items waiting for derive |
| 05:58:04 | <nicolas17> | catalog says there are 94285 tasks waiting to run |
| 06:12:31 | | BigBrain_ quits [Ping timeout: 245 seconds] |
| 06:14:22 | <nicolas17> | JAA: so I can't simultaneously upload multiple files on the same non-existent item, but after I finish uploading the first file, can I already upload more files, even if the processing tasks didn't run yet and the item doesn't show up on the web yet? |
| 06:14:41 | | BigBrain_ (bigbrain) joins |
| 06:18:24 | <nicolas17> | oh huh |
| 06:18:28 | <nicolas17> | I just got "Please reduce your request rate. - total_tasks_queued exceeds global_limit" |
| 06:18:32 | <nicolas17> | *mid-upload* |
| 06:21:20 | | nicolas17 quits [Client Quit] |
| 06:42:30 | | jtagcat quits [Quit: Bye!] |
| 06:45:54 | | jtagcat (jtagcat) joins |
| 07:23:43 | | Arcorann (Arcorann) joins |
| 07:24:44 | | nulldata quits [Ping timeout: 252 seconds] |
| 07:27:48 | | nulldata (nulldata) joins |
| 07:27:58 | <masterX244> | Sucks when you get that in the middle of a csv upload. Had to manually redo it to resume the upload (luckily i got backup monitoring on my automation so i know what to check if the fail doesn't get noticed immediately) |
| 08:13:46 | | BigBrain_ quits [Ping timeout: 245 seconds] |
| 08:16:08 | | BigBrain_ (bigbrain) joins |
| 09:38:56 | | pabs quits [Ping timeout: 252 seconds] |
| 09:41:33 | | pabs (pabs) joins |
| 11:18:46 | | PredatorIWD_ quits [Read error: Connection reset by peer] |
| 11:25:08 | | PredatorIWD joins |
| 11:25:50 | | Exorcism quits [Quit: issued !quit command] |
| 11:36:07 | | Exorcism (exorcism) joins |
| 11:39:57 | | PredatorIWD_ joins |
| 11:42:08 | | PredatorIWD quits [Ping timeout: 252 seconds] |
| 13:30:37 | | IDK quits [Quit: Connection closed for inactivity] |
| 13:39:52 | | BearFortress quits [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.] |
| 13:41:54 | | Arcorann quits [Ping timeout: 265 seconds] |
| 14:05:55 | | BearFortress joins |
| 14:10:29 | <DigitalDragons> | nicolas17: 600 waiting derives might be correct |
| 14:11:29 | <DigitalDragons> | that 95k is everything (archive.php, derive, modify_xml, virus scan...) |
| 14:15:26 | | BigBrain_ quits [Ping timeout: 245 seconds] |
| 14:18:40 | | BigBrain_ (bigbrain) joins |
| 15:11:41 | | BigBrain_ quits [Ping timeout: 245 seconds] |
| 15:13:52 | | BigBrain_ (bigbrain) joins |
| 16:30:08 | <@JAA> | nicolas17: Derives are only one type of task, yeah, and that's the only graph I know that shows task number not task rate. |
| 16:30:59 | <@JAA> | nicolas17: Correct, you can keep uploading even if the processing didn't run. It seems to check for existence on archive.php task submission or similar. |
| 17:15:22 | | Iki joins |
| 17:27:30 | | Exorcism quits [Client Quit] |
| 17:32:31 | | Exorcism (exorcism) joins |
| 18:25:47 | <@JAA> | > FATAL ERROR: remote dsync failed, NEWLY RSYNCed FILE(S) MAY HAVE BEEN CORRUPTED |
| 18:26:34 | <@JAA> | Nothing to see here, move along. |
| 18:33:33 | | Exorcism quits [Client Quit] |
| 18:35:29 | <fireonlive> | https://mkx9delh5a.execute-api.ca-central-1.amazonaws.com/uploads/3f3bb12cc1fa5ebd/image.jpeg |
| 18:36:24 | <that_lurker> | https://lounge.kuhaon.fun/folder/988351bde3a3d9cf/homer-simpson-i-see.gif |
| 18:38:07 | <@JAA> | https://i.imgur.com/MBlS7Wr.gif |
| 18:38:47 | <that_lurker> | https://lounge.kuhaon.fun/folder/97178a847a24d723/disappearing.gif |
| 19:04:26 | | Exorcism (exorcism) joins |
| 19:24:41 | | DLoader quits [Ping timeout: 252 seconds] |
| 19:31:44 | | DLoader joins |
| 20:01:44 | | Doranwen quits [Remote host closed the connection] |
| 20:02:35 | | Doranwen (Doranwen) joins |
| 20:16:54 | | Doranwen quits [Remote host closed the connection] |
| 20:17:41 | | Doranwen (Doranwen) joins |
| 20:23:42 | | Doranwen quits [Remote host closed the connection] |
| 20:24:26 | | Doranwen (Doranwen) joins |
| 21:30:01 | | nicolas17 joins |
| 21:30:24 | <nicolas17> | JAA: well it seems "tasks waiting to run" in catalogd.archive.org is now higher than last night |
| 21:30:30 | <nicolas17> | so it's getting worse \o/ |
| 21:51:09 | | balrog quits [Quit: Bye] |
| 21:52:25 | <TheTechRobo> | Is this accurate? https://lounge.thetechrobo.ca/uploads/6fc00160af3b5629/image.png |
| 21:52:37 | <TheTechRobo> | Found on https://archive.org/~tracey/stats/ after poking around catalogd.archive.org a bit |
| 21:56:22 | <fireonlive> | line go up :D |
| 21:57:53 | | balrog (balrog) joins |
| 21:58:17 | <@JAA> | TIL it's also available under 'stats'. |
| 21:58:19 | <TheTechRobo> | also, btw, https://archive.org/web/petabox.php says 212 PB when https://catalogd.archive.org/report/space.php says ~144,000 TB, am I confused? |
| 21:58:38 | <nicolas17> | Waiting to run: 93896 |
| 21:58:39 | <@JAA> | Yes, but that's tasks per second, not number of tasks pending execution. |
| 21:58:39 | <nicolas17> | Running: 4119 |
| 21:58:41 | <nicolas17> | is what I see in catalog.php |
| 21:58:52 | <@JAA> | Probably tasks completed per second or similar. |
| 21:59:13 | <@JAA> | petabox.php is 'as of December 2021'. |
| 21:59:33 | <TheTechRobo> | JAA: did they really lose that much space usage in two years, though? |
| 21:59:42 | <@JAA> | It's not a loss. |
| 22:00:11 | <@JAA> | The 141 PiB is unique data, the 212 'PB' (PiB?) are both copies. |
| 22:00:52 | <TheTechRobo> | ahhhhh I should have realised |
| 22:00:55 | <fireonlive> | can we use P∅B for when it's not PiB |
| 22:01:02 | <fireonlive> | (we as a society) |
| 22:01:13 | <fireonlive> | so we're clear that the other party knows iB exists and didn't use them |
| 22:01:14 | <fireonlive> | :p |
| 22:02:16 | <@JAA> | P!B |
| 22:02:40 | <fireonlive> | could work too :3 |
| 22:03:28 | <fireonlive> | https://en.wikipedia.org/wiki/Zero_(linguistics) |
| 22:03:30 | <fireonlive> | "In the English sentence nobody knows ∅ the zero pronoun plays the role of the object of the verb, and in ∅ makes no difference it plays the role of the subject. Likewise, the zero pronoun in the book ∅ I am reading plays the role of the relative pronoun that in the book that I am reading." |
| 22:03:37 | <fireonlive> | my brain broke for these sentences |
| 22:03:50 | <fireonlive> | 🧠🩹 |
| 22:11:43 | <fireonlive> | i will say |
| 22:11:58 | <fireonlive> | the new version of the servers look 👌 |
| 22:12:19 | <fireonlive> | i would have added a chef there but :chef brings me no chef; whatever library the lounge uses sucks for searching 'moji |
| 22:16:16 | <imer> | 👨‍🍳 |
| 22:16:56 | <fireonlive> | :D thanks |
| 22:17:02 | <fireonlive> | 👨‍🍳👌 |
| 22:23:26 | | nulldata quits [Ping timeout: 252 seconds] |
| 22:27:51 | | nulldata (nulldata) joins |
| 22:48:21 | | BigBrain_ quits [Ping timeout: 245 seconds] |
| 22:50:40 | | BigBrain_ (bigbrain) joins |
| 22:54:35 | <flashfire42> | hank: drain of 8T disks (remainder of ia600503) does this mean they are getting bigger disks? is this the hardware thing we are dealing with atm? |
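
Editor's note on the CDX pagination advice at 02:56:54: the approach JAA describes (request `showNumPages=true` first, then fetch `page=0`, `page=1`, and so on) might look roughly like the sketch below. The endpoint URL, the use of `matchType=domain` for the "domain search", and the helper names `cdx_url`, `page_urls`, and `fetch_all` are assumptions for illustration and do not come from the log. This is not the `ia-cdx-search` script mentioned at 02:57:43.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint; the log only refers to "the CDX api".
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def cdx_url(domain, **extra):
    """Build a CDX query URL for a domain search plus any extra parameters.

    matchType=domain is an assumption about what "domain search" means here.
    """
    params = {"url": domain, "matchType": "domain"}
    params.update(extra)
    return CDX_ENDPOINT + "?" + urlencode(params)


def page_urls(domain, num_pages):
    """Return the URLs for page=0 .. page=num_pages-1, as described in the log."""
    return [cdx_url(domain, page=n) for n in range(num_pages)]


def fetch_all(domain):
    """Ask for the page count first, then yield every result line page by page.

    Performs network requests; no retry or rate limiting is attempted.
    """
    with urlopen(cdx_url(domain, showNumPages="true")) as resp:
        num_pages = int(resp.read().decode().strip())
    for url in page_urls(domain, num_pages):
        with urlopen(url) as resp:
            for line in resp.read().decode().splitlines():
                yield line
```

Iterating `fetch_all("example.com")` would then walk every page in order. Page-based pagination is used here rather than `resumeKey` because the log reports the latter as unreliable in some cases.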