| 00:03:35 | | etnguyen03 (etnguyen03) joins |
| 00:24:28 | | wessel1512 joins |
| 00:35:38 | | BlueMaxima joins |
| 00:52:29 | | etnguyen03 quits [Ping timeout: 252 seconds] |
| 01:06:07 | | etnguyen03 (etnguyen03) joins |
| 01:17:35 | <project10> | so I went looking at my 135G zowa warc on IA. Found it at https://archive.org/download/archiveteam_zowa_20230923012400_df2de1d0 but also at https://archive.org/download/archiveteam_zowa_20230924040422_7fbffef8. Why would there be two copies, uploaded on different days with different filenames/timestamps? |
| 01:18:10 | <@JAA> | Probably the item was reclaimed and completed twice (or more times). |
| 01:18:40 | | cascode quits [Ping timeout: 265 seconds] |
| 01:18:59 | <project10> | oh, interesting. I assume IA won't dedupe/reap these and they will show on the WBM as captures on different days? |
| 01:19:10 | <@JAA> | Yes |
| 01:20:07 | | cascode joins |
| 01:20:14 | <project10> | ok, good to know the total size displayed on the tracker is not necessarily indicative of the amount shipped to IA |
| 01:27:08 | | etnguyen03 quits [Ping timeout: 252 seconds] |
| 01:33:48 | | haha joins |
| 01:34:38 | | haha quits [Remote host closed the connection] |
| 01:35:52 | | etnguyen03 (etnguyen03) joins |
| 02:21:17 | | cascode quits [Read error: Connection reset by peer] |
| 02:21:40 | | cascode joins |
| 02:36:14 | | Wohlstand quits [Client Quit] |
| 02:36:40 | | Wohlstand (Wohlstand) joins |
| 02:40:20 | | Wohlstand quits [Client Quit] |
| 02:43:35 | | etnguyen03 quits [Ping timeout: 252 seconds] |
| 02:49:12 | | etnguyen03 (etnguyen03) joins |
| 03:06:04 | <anarcat> | so this debian developer died https://abrahamraji.in/ |
| 03:06:39 | <anarcat> | i'm going to crawl that site and https://wiki.abrahamraji.in/ |
| 03:06:50 | <anarcat> | there's also https://www.youtube.com/@abrahamraji3699/ i'm not sure what to do with |
| 03:07:34 | <anarcat> | there's also https://gitlab.com/avron https://aana.site/@avronr - same |
| 03:08:45 | <anarcat> | oh looks like pabs already did it |
| 03:08:45 | | mindstrut1 quits [Read error: Connection reset by peer] |
| 03:09:04 | <pabs> | anarcat: yeah, well covered |
| 03:09:07 | | mindstrut1 joins |
| 03:09:21 | <pabs> | anarcat: did the youtube in #down-the-tube |
| 03:09:38 | <anarcat> | thanks |
| 03:09:41 | <anarcat> | so sad |
| 03:10:02 | <pabs> | the mastodon I don't think can be saved, too much JS and AT doesn't save fediverse I thought |
| 03:19:13 | <anarcat> | ack |
| 03:20:03 | <pabs> | if we wanted to, this could be repurposed for that https://github.com/jwilk/zygolophodon |
| 03:23:22 | | dumbgoy quits [Ping timeout: 265 seconds] |
| 04:09:55 | | dumbgoy joins |
| 04:10:06 | | Exorcism quits [Remote host closed the connection] |
| 04:10:44 | | Exorcism (exorcism) joins |
| 04:17:26 | | Exorcism quits [Remote host closed the connection] |
| 04:17:54 | | Exorcism (exorcism) joins |
| 04:24:45 | | DogsRNice quits [Read error: Connection reset by peer] |
| 04:35:28 | | Exorcism quits [Remote host closed the connection] |
| 04:36:09 | | Exorcism (exorcism) joins |
| 04:41:54 | | icedice quits [Read error: Connection reset by peer] |
| 04:42:17 | | icedice (icedice) joins |
| 04:46:20 | | Exorcism quits [Remote host closed the connection] |
| 04:47:10 | | Exorcism (exorcism) joins |
| 04:50:44 | | etnguyen03 quits [Client Quit] |
| 04:54:20 | | appledash quits [Remote host closed the connection] |
| 04:54:54 | | cascode quits [Read error: Connection reset by peer] |
| 04:55:05 | | cascode joins |
| 05:08:01 | | Island quits [Read error: Connection reset by peer] |
| 05:46:35 | | Earendil7 quits [Quit: Leaving] |
| 05:48:00 | | Earendil7 (Earendil7) joins |
| 05:48:15 | | magmaus3 quits [Client Quit] |
| 05:50:05 | | magmaus3 (magmaus3) joins |
| 05:51:11 | | decky_e_ joins |
| 05:54:26 | | decky quits [Ping timeout: 252 seconds] |
| 05:56:33 | | BlueMaxima quits [Read error: Connection reset by peer] |
| 06:08:40 | | cascode quits [Ping timeout: 265 seconds] |
| 06:09:05 | | nepeat (nepeat) joins |
| 06:09:50 | | cascode joins |
| 06:10:29 | | tzui joins |
| 06:11:57 | | thunder_steak joins |
| 06:18:45 | | tzui quits [Remote host closed the connection] |
| 06:40:57 | | Dango360 quits [Read error: Connection reset by peer] |
| 06:42:35 | | cascode quits [Read error: Connection reset by peer] |
| 06:42:55 | | Arcorann (Arcorann) joins |
| 06:43:11 | | cascode joins |
| 06:46:08 | | Naruyoko quits [Ping timeout: 252 seconds] |
| 06:48:03 | | Naruyoko joins |
| 07:00:08 | | nfriedly quits [Remote host closed the connection] |
| 07:06:36 | | Unholy23613166180851599 (Unholy2361) joins |
| 07:10:49 | | ffff joins |
| 07:17:09 | | decky joins |
| 07:20:41 | | decky_e_ quits [Ping timeout: 265 seconds] |
| 07:40:13 | | VerifiedJ quits [Quit: Ping timeout (120 seconds)] |
| 07:40:22 | | lunik173 quits [Quit: Ping timeout (120 seconds)] |
| 07:40:26 | | VerifiedJ (VerifiedJ) joins |
| 07:40:34 | | lunik173 joins |
| 08:10:10 | <flashfire42> | I dunno what happened but I am seeing a lot more movement across the warrior projects |
| 08:15:19 | | lukash91 joins |
| 08:16:20 | | lukash9 quits [Ping timeout: 252 seconds] |
| 08:20:08 | | lukash91 quits [Ping timeout: 265 seconds] |
| 08:56:29 | | lun4 quits [Ping timeout: 252 seconds] |
| 08:56:29 | | ave quits [Ping timeout: 252 seconds] |
| 08:58:13 | | lukash9 joins |
| 09:06:44 | | ave (ave) joins |
| 09:06:50 | | lun4 (lun4) joins |
| 09:16:58 | | icedice quits [Client Quit] |
| 09:24:45 | | ymgve_ joins |
| 09:27:17 | | ymgve quits [Ping timeout: 252 seconds] |
| 09:32:44 | | Exdetransitioner (exdetransitioner) joins |
| 09:35:02 | <Exdetransitioner> | does anybody here have access to genspect's chatroom? |
| 09:35:06 | <Exdetransitioner> | https://www.dailydot.com/debug/genspect/ |
| 09:35:28 | <Exdetransitioner> | they claim to run a semi-secret forum where they discuss anti-trans extremist talking points |
| 09:46:21 | | IRC2DC joins |
| 09:49:17 | | IRC2DC quits [Remote host closed the connection] |
| 09:52:27 | | parfait_ quits [Ping timeout: 265 seconds] |
| 09:54:47 | | tsyesika quits [Ping timeout: 252 seconds] |
| 10:00:01 | | railen63 quits [Remote host closed the connection] |
| 10:00:20 | | railen63 joins |
| 10:11:40 | | Exdetransitioner quits [Client Quit] |
| 10:15:39 | | imer quits [Ping timeout: 265 seconds] |
| 10:16:08 | | nfriedly joins |
| 10:29:40 | <thunder_steak> | how is it decided how often a website will be crawled/snapshotted? e.g. http://zwisler.de/ |
| 10:42:46 | | icedice (icedice) joins |
| 10:50:46 | | beario_ joins |
| 10:53:38 | | beario quits [Ping timeout: 252 seconds] |
| 11:10:37 | | Peroniko joins |
| 11:12:20 | | RetiredTurtle quits [Ping timeout: 252 seconds] |
| 11:31:18 | | shreyasminocha quits [Remote host closed the connection] |
| 11:31:18 | | thehedgeh0g quits [Remote host closed the connection] |
| 11:31:18 | | evan quits [Remote host closed the connection] |
| 11:31:22 | | evan joins |
| 11:31:25 | | shreyasminocha (shreyasminocha) joins |
| 11:31:25 | | thehedgeh0g (mrHedgehog0) joins |
| 11:38:33 | | icedice quits [Remote host closed the connection] |
| 11:38:57 | | icedice (icedice) joins |
| 11:58:02 | | JohnnyJ quits [Quit: Ping timeout (120 seconds)] |
| 11:58:24 | | JohnnyJ joins |
| 12:00:23 | | JohnnyJ quits [Client Quit] |
| 12:17:54 | | qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins |
| 12:23:51 | | imer (imer) joins |
| 13:04:20 | | AmAnd0A quits [Ping timeout: 265 seconds] |
| 13:05:07 | | AmAnd0A joins |
| 13:12:37 | | RetiredTurtle joins |
| 13:12:54 | | railen64 joins |
| 13:13:37 | | kiryu quits [Remote host closed the connection] |
| 13:14:49 | | kiryu (kiryu) joins |
| 13:15:27 | | Peroniko quits [Ping timeout: 265 seconds] |
| 13:16:25 | | railen63 quits [Ping timeout: 265 seconds] |
| 13:16:51 | | lflare quits [Read error: Connection reset by peer] |
| 13:19:38 | | lflare (lflare) joins |
| 13:21:14 | | etnguyen03 (etnguyen03) joins |
| 13:33:07 | | imer quits [Killed (NickServ (GHOST command used by imer7))] |
| 13:33:14 | | imer (imer) joins |
| 13:40:37 | | imer quits [Killed (NickServ (GHOST command used by imer7))] |
| 13:40:44 | | imer (imer) joins |
| 13:42:41 | <pabs> | thunder_steak: in what context? for ArchiveBot, usually when the site is closing or there is another reason for doing it |
| 13:43:28 | | imer quits [Killed (NickServ (GHOST command used by imer0))] |
| 13:43:35 | | imer (imer) joins |
| 13:47:40 | | railen64 quits [Remote host closed the connection] |
| 13:47:56 | | railen64 joins |
| 13:51:13 | | Arcorann quits [Ping timeout: 265 seconds] |
| 13:51:22 | | Wohlstand (Wohlstand) joins |
| 13:53:21 | | toss (toss) joins |
| 14:14:55 | | vukky quits [Quit: @ERROR: max connections (-1) reached -- try again later] |
| 14:15:17 | | vukky (vukky) joins |
| 14:17:24 | <thunder_steak> | pabs e.g. http://zwisler.de/ has been snapshotted multiple times but with no constant frequency |
| 14:19:08 | | vukky quits [Client Quit] |
| 14:19:25 | | vukky (vukky) joins |
| 14:24:25 | <pabs> | I guess you mean in web.archive.org. if you click the "About this capture" thing on the top right, you can get some idea |
| 14:25:00 | <pabs> | as you can see here, zero of those were ArchiveTeam ArchiveBot snapshots: https://archive.fart.website/archivebot/viewer/?q=zwisler.de |
| 14:39:45 | <@JAA> | My Canucks forums topic page qwarc grab finished earlier today without any obvious issues. |
| 14:40:47 | | etnguyen03 quits [Ping timeout: 252 seconds] |
| 14:41:41 | | Exorcism quits [Read error: Connection reset by peer] |
| 14:43:20 | <@JAA> | 196068 We could not find that topic. |
| 14:43:20 | <@JAA> | 21026 You do not have permission to view this topic. |
| 14:43:20 | <@JAA> | 122653 There are no posts to show |
| 14:43:28 | | Exorcism (exorcism) joins |
| 14:43:39 | <@JAA> | The rest of the 409104 topic IDs were retrieved. |
| 14:46:18 | | DogsRNice joins |
| 14:49:19 | | railen69 joins |
| 14:53:05 | | railen64 quits [Ping timeout: 265 seconds] |
| 14:59:13 | | thunder_steak quits [Remote host closed the connection] |
| 15:01:35 | | etnguyen03 (etnguyen03) joins |
| 15:07:50 | | Island joins |
| 15:24:30 | | RetiredTurtle quits [Ping timeout: 265 seconds] |
| 15:30:01 | | guest9234 joins |
| 15:31:33 | | icedice quits [Client Quit] |
| 15:31:52 | <@JAA> | I got approximately 6007327 posts, which matches the homepage. :-) |
| 15:33:13 | | Exorcism5 (exorcism) joins |
| 15:34:22 | | Exorcism quits [Read error: Connection reset by peer] |
| 15:34:22 | | Exorcism5 is now known as Exorcism |
| 15:34:27 | <@JAA> | I might try to grab new posts as they're being made until the shutdown if I have time to set that up. |
| 15:35:01 | <@JAA> | Although the post URLs require a topic ID, it doesn't have to be correct; you can do something like https://forum.canucks.com/topic/0-x/?do=findComment&comment=16942183 instead. |
| 15:36:05 | | Exorcism quits [Remote host closed the connection] |
| 15:36:46 | | Exorcism5 (exorcism) joins |
| 15:46:42 | | BigBrain (bigbrain) joins |
| 15:51:18 | | toss quits [Client Quit] |
| 16:04:56 | | HP_Archivist quits [Ping timeout: 252 seconds] |
| 16:05:46 | | albertlarsan68 quits [Quit: The Lounge - https://thelounge.chat] |
| 16:10:21 | | Exorcism5 is now known as Exorcism |
| 16:42:04 | | Dango360 (Dango360) joins |
| 16:58:27 | | HP_Archivist (HP_Archivist) joins |
| 17:01:34 | | IRC2DC joins |
| 17:03:06 | | guest9234 quits [Ping timeout: 265 seconds] |
| 17:03:14 | | etnguyen03 quits [Ping timeout: 252 seconds] |
| 17:12:19 | | Wohlstand quits [Client Quit] |
| 17:12:42 | | nahimgood joins |
| 17:14:03 | | nahimgood quits [Remote host closed the connection] |
| 17:14:25 | | nahimnotgood joins |
| 17:14:26 | | nahimnotgood quits [Remote host closed the connection] |
| 17:14:43 | | aaaa1 joins |
| 17:24:15 | | webuser9995 joins |
| 17:24:23 | | webuser9995 leaves |
| 17:32:27 | | etnguyen03 (etnguyen03) joins |
| 17:49:37 | | IRC2DC quits [Remote host closed the connection] |
| 17:49:45 | | IRC2DC joins |
| 17:53:36 | | IRC2DC quits [Remote host closed the connection] |
| 17:53:48 | | IRC2DC joins |
| 17:54:56 | | etnguyen03 quits [Ping timeout: 252 seconds] |
| 18:25:16 | | HP_Archivist quits [Ping timeout: 265 seconds] |
| 18:28:41 | | parfait_ joins |
| 18:30:34 | | mindstrut1 quits [Read error: Connection reset by peer] |
| 18:32:07 | <@JAA> | FOIAonline completion rate has slowed down due to larger items, now at about a third done and an estimated 3 TiB total. ETA is still on time but only just (a bit over 4 days). |
| 18:32:21 | <@JAA> | (That's based on the rate of the past 6 hours.) |
| 18:34:32 | <@JAA> | Actually, probably closer to 4 TiB. |
| 18:35:27 | <thuban> | hm, rough--chronological ordering suggests sizes will continue to increase |
| 18:37:45 | | qw3rty_ joins |
| 18:38:19 | | qw3rty quits [Ping timeout: 265 seconds] |
| 18:41:50 | | Exorcism|tor (exorcism) joins |
| 18:41:59 | <@JAA> | Yeah |
| 18:43:09 | | mindstrut joins |
| 18:43:16 | | Peroniko joins |
| 18:43:22 | | Peroniko is now authenticated as Peroniko |
| 18:44:03 | <@JAA> | I can try throwing more concurrency at it. My machine is nowhere near its limits. |
| 18:44:21 | <@JAA> | And I haven't seen any rate limiting or blocks whatsoever so far, just some random timeouts. |
| 18:46:54 | <thuban> | seems wise, especially if you can adjust on the fly. what tooling are you using? |
| 18:49:59 | <@JAA> | qwarc |
| 18:50:24 | <@JAA> | I can't adjust the concurrency of running processes, but I can add more processes. :-) |
| 18:50:48 | <thuban> | >:? |
| 18:50:52 | <@JAA> | (I'd have to stop them, ideally gracefully, for the former.) |
| 18:51:46 | <@JAA> | I originally had one process at 25 concurrency, but that was far from ideal because it got blocked sometimes by large downloads. |
| 18:51:52 | <@JAA> | So now it's 5 processes with 5 concurrency each. |
| 18:56:53 | <thuban> | ah, i forgot qwarc runs off a database and everything. it's sufficiently self-organizing that you can just tell new processes to jump in, then? |
| 18:57:57 | <@JAA> | Yep, each process just takes items from the DB, processes them, and writes the new status back (plus any new items it might've discovered, not relevant in this case). |
| 18:58:24 | <thuban> | neat |
| 19:00:22 | | Exorcism|tor quits [Client Quit] |
| 19:01:04 | <@JAA> | It really is pretty much like a local tracker in that respect. That's what I modelled it after conceptually, anyway. |
| 19:02:36 | <@JAA> | Also, some of the timeouts I'm seeing are actually due to large downloads taking time to process, similar to the problems in wpull. |
| 19:03:14 | <@JAA> | Eventually™, I'll refactor that so the actual HTTP stuff happens in a separate thread. |
| 19:15:39 | | AmAnd0A quits [Read error: Connection reset by peer] |
| 19:15:54 | | AmAnd0A joins |
| 19:21:12 | | IRC2DC quits [Remote host closed the connection] |
| 19:21:29 | | aaaa1 quits [Remote host closed the connection] |
| 19:23:29 | | qw3rty_ quits [Ping timeout: 252 seconds] |
| 19:23:30 | | qw3rty joins |
| 19:28:14 | | imer quits [Client Quit] |
| 19:28:44 | | imer (imer) joins |
| 19:31:11 | | qw3rty quits [Ping timeout: 252 seconds] |
| 19:31:14 | | qw3rty_ joins |
| 19:32:27 | | AmAnd0A quits [Ping timeout: 265 seconds] |
| 19:32:43 | | AmAnd0A joins |
| 19:34:13 | | Rootliam joins |
| 19:36:50 | <Rootliam> | I got a response from Jason Scott about yahoo video with "all I can say is all the data is up there, one way or another. There's no other stores out there." |
| 19:37:18 | | AmAnd0A quits [Read error: Connection reset by peer] |
| 19:37:20 | <Rootliam> | I'm not really sure if that means it could have been mixed up with something else or if it wasn't uploaded then it's gone forever |
| 19:37:38 | | AmAnd0A joins |
| 19:37:54 | <thuban> | Rootliam: did you ever open that github issue? |
| 19:38:06 | <Rootliam> | No but I guess I should do that soon |
| 19:42:44 | | programmerq quits [Read error: Connection reset by peer] |
| 19:42:51 | | programmerq (programmerq) joins |
| 19:51:47 | | cascode quits [Ping timeout: 265 seconds] |
| 19:52:30 | | cascode joins |
| 20:05:48 | | Doomaholic quits [Ping timeout: 265 seconds] |
| 20:08:25 | | Doomaholic (Doomaholic) joins |
| 20:09:35 | | programmerq quits [Client Quit] |
| 20:09:57 | | programmerq (programmerq) joins |
| 20:15:21 | | icedice (icedice) joins |
| 20:17:55 | | cascode quits [Read error: Connection reset by peer] |
| 20:18:12 | | cascode joins |
| 20:25:19 | | programmerq quits [Client Quit] |
| 20:52:05 | <flashfire42> | Wait we are completely clogged? Like completely? |
| 21:00:22 | | ThetaDev quits [Client Quit] |
| 21:00:30 | | ThetaDev joins |
| 21:04:41 | | etnguyen03 (etnguyen03) joins |
| 21:18:47 | | BearFortress quits [Ping timeout: 265 seconds] |
| 21:26:08 | | girst quits [Ping timeout: 252 seconds] |
| 21:31:22 | | ffff quits [Remote host closed the connection] |
| 21:39:49 | | Peroniko quits [Read error: Connection reset by peer] |
| 21:41:09 | | Peroniko joins |
| 21:49:39 | | Exorcism quits [Remote host closed the connection] |
| 21:50:18 | | Exorcism (exorcism) joins |
| 22:00:21 | | Rootliam quits [Ping timeout: 265 seconds] |
| 22:00:21 | | Perk quits [Read error: Connection reset by peer] |
| 22:08:24 | | Perk joins |
| 22:08:25 | | Perk7 joins |
| 22:08:35 | | Perk7 quits [Remote host closed the connection] |
| 22:28:33 | | ffff joins |
| 22:33:14 | | AmAnd0A quits [Ping timeout: 252 seconds] |
| 22:33:17 | | AmAnd0A joins |
| 22:43:36 | | flashfire42 quits [Client Quit] |
| 22:43:36 | | mindstrut quits [Read error: Connection reset by peer] |
| 22:43:36 | | kiska quits [Client Quit] |
| 22:43:36 | | Ryz263 quits [Client Quit] |
| 22:43:36 | | s-crypt2 quits [Client Quit] |
| 22:43:50 | | mindstrut joins |
| 22:45:03 | | Ryz263 (Ryz) joins |
| 22:45:03 | | s-crypt2 (s-crypt) joins |
| 22:45:11 | | flashfire42 joins |
| 22:46:47 | | kiska (kiska) joins |
| 22:55:09 | | girst (girst) joins |
| 22:59:56 | | benjins quits [Read error: Connection reset by peer] |
| 23:07:55 | | AmAnd0A quits [Read error: Connection reset by peer] |
| 23:09:05 | | AmAnd0A joins |
| 23:13:13 | | HP_Archivist (HP_Archivist) joins |
| 23:17:28 | | BlueMaxima joins |
| 23:22:45 | | benjins joins |
| 23:24:02 | | flashfire42 is now authenticated as flashfire42 |
| 23:29:37 | <flashfire42> | So is it Optane9 again rewby or is it the transferring stuck? |
| 23:30:57 | | benjinsm joins |
| 23:32:19 | <flashfire42> | Ok looks like it's optane9 that needs a kick if you have access to it JAA. I did a test and Mediafire uses a separate target and one of them went through fine |
| 23:32:29 | <@JAA> | flashfire42: Please stop. |
| 23:33:04 | <@JAA> | Targets are doing target things as well as they can. The situation isn't great, and everyone's aware of it. |
| 23:33:44 | | benjins quits [Ping timeout: 252 seconds] |
| 23:41:24 | | BearFortress joins |
| 23:41:38 | | Exorcism quits [Remote host closed the connection] |
| 23:42:21 | | Exorcism (exorcism) joins |
| 23:44:11 | | RetiredTurtle joins |
| 23:46:56 | | Peroniko quits [Ping timeout: 252 seconds] |
| 23:47:07 | | benjinsmi joins |
| 23:49:39 | | AmAnd0A quits [Read error: Connection reset by peer] |
| 23:50:14 | | benjinsm quits [Ping timeout: 252 seconds] |
| 23:52:13 | | AmAnd0A joins |
| 23:52:22 | | RetiredTurtle is now authenticated as Peroniko |
| 23:56:50 | | Chris5010 quits [Ping timeout: 252 seconds] |
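
Note on the 18:57 exchange: JAA describes each qwarc process as taking items from a shared database, processing them, and writing the new status back, which is why more processes can be started alongside running ones. The sketch below illustrates that general pattern only. It is not qwarc's code or schema; the `items` table, the status values, and the function names `init_db`, `claim_item`, and `run_worker` are all assumptions made up for illustration, and SQLite is used simply because it ships with Python.

```python
import sqlite3


def init_db(conn, names):
    """Create the hypothetical items table and queue the given names as 'todo'."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS items ("
        "id INTEGER PRIMARY KEY, name TEXT UNIQUE, "
        "status TEXT NOT NULL DEFAULT 'todo')"
    )
    conn.executemany(
        "INSERT OR IGNORE INTO items (name) VALUES (?)", [(n,) for n in names]
    )


def claim_item(conn):
    """Claim one 'todo' item and mark it 'in_progress'; return (id, name) or None.

    BEGIN IMMEDIATE takes SQLite's write lock up front, so two workers
    sharing the same database file cannot claim the same row.
    """
    conn.execute("BEGIN IMMEDIATE")
    row = conn.execute(
        "SELECT id, name FROM items WHERE status = 'todo' ORDER BY id LIMIT 1"
    ).fetchone()
    if row is not None:
        conn.execute(
            "UPDATE items SET status = 'in_progress' WHERE id = ?", (row[0],)
        )
    conn.execute("COMMIT")
    return row


def run_worker(conn, process):
    """Claim items until none are left; record 'done' or 'error' for each.

    `process` is a caller-supplied function taking the item name and
    returning True on success. Returns the number of items handled.
    """
    handled = 0
    while True:
        item = claim_item(conn)
        if item is None:
            return handled
        item_id, name = item
        status = "done" if process(name) else "error"
        conn.execute("UPDATE items SET status = ? WHERE id = ?", (status, item_id))
        handled += 1
```

Each worker would open its own connection with `sqlite3.connect(path, isolation_level=None)` so that the explicit `BEGIN IMMEDIATE` / `COMMIT` statements control the transaction. Because the only shared state is the database, starting an additional worker process needs no coordination with the ones already running, which matches the "5 processes with 5 concurrency each" arrangement described in the log.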