00:03:35 | | etnguyen03 (etnguyen03) joins |
00:24:28 | | wessel1512 joins |
00:35:38 | | BlueMaxima joins |
00:52:29 | | etnguyen03 quits [Ping timeout: 252 seconds] |
01:06:07 | | etnguyen03 (etnguyen03) joins |
01:17:35 | <project10> | so I went looking at my 135G zowa warc on IA. Found it at https://archive.org/download/archiveteam_zowa_20230923012400_df2de1d0 but also at https://archive.org/download/archiveteam_zowa_20230924040422_7fbffef8. Why would there be two copies, uploaded on different days with different filenames/timestamps? |
01:18:10 | <@JAA> | Probably the item was reclaimed and completed twice (or more times). |
01:18:40 | | cascode quits [Ping timeout: 265 seconds] |
01:18:59 | <project10> | oh, interesting. I assume IA won't dedupe/reap these and they will show on the WBM as captures on different days? |
01:19:10 | <@JAA> | Yes |
01:20:07 | | cascode joins |
01:20:14 | <project10> | ok, good to know the total size displayed on the tracker is not necessarily indicative of the amount shipped to IA |
01:27:08 | | etnguyen03 quits [Ping timeout: 252 seconds] |
01:33:48 | | haha joins |
01:34:38 | | haha quits [Remote host closed the connection] |
01:35:52 | | etnguyen03 (etnguyen03) joins |
02:21:17 | | cascode quits [Read error: Connection reset by peer] |
02:21:40 | | cascode joins |
02:36:14 | | Wohlstand quits [Client Quit] |
02:36:40 | | Wohlstand (Wohlstand) joins |
02:40:20 | | Wohlstand quits [Client Quit] |
02:43:35 | | etnguyen03 quits [Ping timeout: 252 seconds] |
02:49:12 | | etnguyen03 (etnguyen03) joins |
03:06:04 | <anarcat> | so this debian developer died https://abrahamraji.in/ |
03:06:39 | <anarcat> | i'm going to crawl that site and https://wiki.abrahamraji.in/ |
03:06:50 | <anarcat> | there's also https://www.youtube.com/@abrahamraji3699/ i'm not sure what to do with |
03:07:34 | <anarcat> | there's also https://gitlab.com/avron https://aana.site/@avronr - same |
03:08:45 | <anarcat> | oh looks like pabs already did it |
03:08:45 | | mindstrut1 quits [Read error: Connection reset by peer] |
03:09:04 | <pabs> | anarcat: yeah, well covered |
03:09:07 | | mindstrut1 joins |
03:09:21 | <pabs> | anarcat: did the youtube in #down-the-tube |
03:09:38 | <anarcat> | thanks |
03:09:41 | <anarcat> | so sad |
03:10:02 | <pabs> | the mastodon I don't think can be saved, too much JS and AT doesn't save fediverse I thought |
03:19:13 | <anarcat> | ack |
03:20:03 | <pabs> | if we wanted to, this could be repurposed for that https://github.com/jwilk/zygolophodon |
03:23:22 | | dumbgoy quits [Ping timeout: 265 seconds] |
04:09:55 | | dumbgoy joins |
04:10:06 | | Exorcism quits [Remote host closed the connection] |
04:10:44 | | Exorcism (exorcism) joins |
04:17:26 | | Exorcism quits [Remote host closed the connection] |
04:17:54 | | Exorcism (exorcism) joins |
04:24:45 | | DogsRNice quits [Read error: Connection reset by peer] |
04:35:28 | | Exorcism quits [Remote host closed the connection] |
04:36:09 | | Exorcism (exorcism) joins |
04:41:54 | | icedice quits [Read error: Connection reset by peer] |
04:42:17 | | icedice (icedice) joins |
04:46:20 | | Exorcism quits [Remote host closed the connection] |
04:47:10 | | Exorcism (exorcism) joins |
04:50:44 | | etnguyen03 quits [Client Quit] |
04:54:20 | | appledash quits [Remote host closed the connection] |
04:54:54 | | cascode quits [Read error: Connection reset by peer] |
04:55:05 | | cascode joins |
05:08:01 | | Island quits [Read error: Connection reset by peer] |
05:46:35 | | Earendil7 quits [Quit: Leaving] |
05:48:00 | | Earendil7 (Earendil7) joins |
05:48:15 | | magmaus3 quits [Client Quit] |
05:50:05 | | magmaus3 (magmaus3) joins |
05:51:11 | | decky_e_ joins |
05:54:26 | | decky quits [Ping timeout: 252 seconds] |
05:56:33 | | BlueMaxima quits [Read error: Connection reset by peer] |
06:08:40 | | cascode quits [Ping timeout: 265 seconds] |
06:09:05 | | nepeat (nepeat) joins |
06:09:50 | | cascode joins |
06:10:29 | | tzui joins |
06:11:57 | | thunder_steak joins |
06:18:45 | | tzui quits [Remote host closed the connection] |
06:40:57 | | Dango360 quits [Read error: Connection reset by peer] |
06:42:35 | | cascode quits [Read error: Connection reset by peer] |
06:42:55 | | Arcorann (Arcorann) joins |
06:43:11 | | cascode joins |
06:46:08 | | Naruyoko quits [Ping timeout: 252 seconds] |
06:48:03 | | Naruyoko joins |
07:00:08 | | nfriedly quits [Remote host closed the connection] |
07:06:36 | | Unholy23613166180851599 (Unholy2361) joins |
07:10:49 | | ffff joins |
07:17:09 | | decky joins |
07:20:41 | | decky_e_ quits [Ping timeout: 265 seconds] |
07:40:13 | | VerifiedJ quits [Quit: Ping timeout (120 seconds)] |
07:40:22 | | lunik173 quits [Quit: Ping timeout (120 seconds)] |
07:40:26 | | VerifiedJ (VerifiedJ) joins |
07:40:34 | | lunik173 joins |
08:10:10 | <flashfire42> | I dunno what happened but I am seeing a lot more movement across the warrior projects |
08:15:19 | | lukash91 joins |
08:16:20 | | lukash9 quits [Ping timeout: 252 seconds] |
08:20:08 | | lukash91 quits [Ping timeout: 265 seconds] |
08:56:29 | | lun4 quits [Ping timeout: 252 seconds] |
08:56:29 | | ave quits [Ping timeout: 252 seconds] |
08:58:13 | | lukash9 joins |
09:06:44 | | ave (ave) joins |
09:06:50 | | lun4 (lun4) joins |
09:16:58 | | icedice quits [Client Quit] |
09:24:45 | | ymgve_ joins |
09:27:17 | | ymgve quits [Ping timeout: 252 seconds] |
09:32:44 | | Exdetransitioner (exdetransitioner) joins |
09:35:02 | <Exdetransitioner> | does anybody have access to genspect's chatroom? |
09:35:06 | <Exdetransitioner> | https://www.dailydot.com/debug/genspect/ |
09:35:28 | <Exdetransitioner> | they claim to run a semi-secret forum where they discuss anti-trans extremist talking points |
09:46:21 | | IRC2DC joins |
09:49:17 | | IRC2DC quits [Remote host closed the connection] |
09:52:27 | | parfait_ quits [Ping timeout: 265 seconds] |
09:54:47 | | tsyesika quits [Ping timeout: 252 seconds] |
10:00:01 | | railen63 quits [Remote host closed the connection] |
10:00:20 | | railen63 joins |
10:11:40 | | Exdetransitioner quits [Client Quit] |
10:15:39 | | imer quits [Ping timeout: 265 seconds] |
10:16:08 | | nfriedly joins |
10:29:40 | <thunder_steak> | how is it decided how often a website will be crawled/snapshotted? e.g. http://zwisler.de/ |
10:42:46 | | icedice (icedice) joins |
10:50:46 | | beario_ joins |
10:53:38 | | beario quits [Ping timeout: 252 seconds] |
11:10:37 | | Peroniko joins |
11:12:20 | | RetiredTurtle quits [Ping timeout: 252 seconds] |
11:31:18 | | shreyasminocha quits [Remote host closed the connection] |
11:31:18 | | thehedgeh0g quits [Remote host closed the connection] |
11:31:18 | | evan quits [Remote host closed the connection] |
11:31:22 | | evan joins |
11:31:25 | | shreyasminocha (shreyasminocha) joins |
11:31:25 | | thehedgeh0g (mrHedgehog0) joins |
11:38:33 | | icedice quits [Remote host closed the connection] |
11:38:57 | | icedice (icedice) joins |
11:58:02 | | JohnnyJ quits [Quit: Ping timeout (120 seconds)] |
11:58:24 | | JohnnyJ joins |
12:00:23 | | JohnnyJ quits [Client Quit] |
12:17:54 | | qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins |
12:23:51 | | imer (imer) joins |
13:04:20 | | AmAnd0A quits [Ping timeout: 265 seconds] |
13:05:07 | | AmAnd0A joins |
13:12:37 | | RetiredTurtle joins |
13:12:54 | | railen64 joins |
13:13:37 | | kiryu quits [Remote host closed the connection] |
13:14:49 | | kiryu (kiryu) joins |
13:15:27 | | Peroniko quits [Ping timeout: 265 seconds] |
13:16:25 | | railen63 quits [Ping timeout: 265 seconds] |
13:16:51 | | lflare quits [Read error: Connection reset by peer] |
13:19:38 | | lflare (lflare) joins |
13:21:14 | | etnguyen03 (etnguyen03) joins |
13:33:07 | | imer quits [Killed (NickServ (GHOST command used by imer7))] |
13:33:14 | | imer (imer) joins |
13:40:37 | | imer quits [Killed (NickServ (GHOST command used by imer7))] |
13:40:44 | | imer (imer) joins |
13:42:41 | <pabs> | thunder_steak: in what context? for ArchiveBot, usually when the site is closing or there is another reason for doing it |
13:43:28 | | imer quits [Killed (NickServ (GHOST command used by imer0))] |
13:43:35 | | imer (imer) joins |
13:47:40 | | railen64 quits [Remote host closed the connection] |
13:47:56 | | railen64 joins |
13:51:13 | | Arcorann quits [Ping timeout: 265 seconds] |
13:51:22 | | Wohlstand (Wohlstand) joins |
13:53:21 | | toss (toss) joins |
14:14:55 | | vukky quits [Quit: @ERROR: max connections (-1) reached -- try again later] |
14:15:17 | | vukky (vukky) joins |
14:17:24 | <thunder_steak> | pabs e.g. http://zwisler.de/ has been snapshotted multiple times but with no constant frequency |
14:19:08 | | vukky quits [Client Quit] |
14:19:25 | | vukky (vukky) joins |
14:24:25 | <pabs> | I guess you mean in web.archive.org. if you click the "About this capture" thing on the top right, you can get some idea |
14:25:00 | <pabs> | as you can see here, zero of those were ArchiveTeam ArchiveBot snapshots: https://archive.fart.website/archivebot/viewer/?q=zwisler.de |
14:39:45 | <@JAA> | My Canucks forums topic page qwarc grab finished earlier today without any obvious issues. |
14:40:47 | | etnguyen03 quits [Ping timeout: 252 seconds] |
14:41:41 | | Exorcism quits [Read error: Connection reset by peer] |
14:43:20 | <@JAA> | 196068 We could not find that topic. |
14:43:20 | <@JAA> | 21026 You do not have permission to view this topic. |
14:43:20 | <@JAA> | 122653 There are no posts to show |
14:43:28 | | Exorcism (exorcism) joins |
14:43:39 | <@JAA> | The rest of the 409104 topic IDs were retrieved. |
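The breakdown above can be tallied as a quick check. This is only a sketch under one assumption that the log does not confirm: that 409104 is the total number of topic IDs attempted, so "the rest" is the total minus the three failure categories. The variable names are illustrative.

```python
# Failure counts as quoted in the log, keyed by the message the forum returned.
failures = {
    "We could not find that topic.": 196068,
    "You do not have permission to view this topic.": 21026,
    "There are no posts to show": 122653,
}

# ASSUMPTION: 409104 is the total number of topic IDs attempted.
total_topic_ids = 409104

failed = sum(failures.values())
retrieved = total_topic_ids - failed

print(f"failed: {failed}, retrieved: {retrieved}")
```

If 409104 instead refers to the retrieved topics themselves, the subtraction does not apply and only the sum of the failure categories is meaningful.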
14:46:18 | | DogsRNice joins |
14:49:19 | | railen69 joins |
14:53:05 | | railen64 quits [Ping timeout: 265 seconds] |
14:59:13 | | thunder_steak quits [Remote host closed the connection] |
15:01:35 | | etnguyen03 (etnguyen03) joins |
15:07:50 | | Island joins |
15:24:30 | | RetiredTurtle quits [Ping timeout: 265 seconds] |
15:30:01 | | guest9234 joins |
15:31:33 | | icedice quits [Client Quit] |
15:31:52 | <@JAA> | I got approximately 6007327 posts, which matches the homepage. :-) |
15:33:13 | | Exorcism5 (exorcism) joins |
15:34:22 | | Exorcism quits [Read error: Connection reset by peer] |
15:34:22 | | Exorcism5 is now known as Exorcism |
15:34:27 | <@JAA> | I might try to grab new posts as they're being made until the shutdown if I have time to set that up. |
15:35:01 | <@JAA> | Although the post URLs require a topic ID, it doesn't have to be correct; you can do something like https://forum.canucks.com/topic/0-x/?do=findComment&comment=16942183 instead. |
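The URL trick above can be sketched as a small helper. It builds a post URL from a comment ID alone, using the placeholder topic slug `0-x` from the example in the log. The function name `post_url` is hypothetical, and whether the forum accepts every comment ID this way is not verified here.

```python
from urllib.parse import urlencode

BASE = "https://forum.canucks.com"


def post_url(comment_id: int) -> str:
    """Build a findComment URL with a placeholder topic ID (hypothetical helper)."""
    # "0-x" is the dummy topic slug from the log; per the log, the server
    # locates the post from the comment ID, so the topic need not be correct.
    query = urlencode({"do": "findComment", "comment": comment_id})
    return f"{BASE}/topic/0-x/?{query}"


print(post_url(16942183))
```

For the comment ID in the log, this reproduces the URL quoted above.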
15:36:05 | | Exorcism quits [Remote host closed the connection] |
15:36:46 | | Exorcism5 (exorcism) joins |
15:46:42 | | BigBrain (bigbrain) joins |
15:51:18 | | toss quits [Client Quit] |
16:04:56 | | HP_Archivist quits [Ping timeout: 252 seconds] |
16:05:46 | | albertlarsan68 quits [Quit: The Lounge - https://thelounge.chat] |
16:10:21 | | Exorcism5 is now known as Exorcism |
16:42:04 | | Dango360 (Dango360) joins |
16:58:27 | | HP_Archivist (HP_Archivist) joins |
17:01:34 | | IRC2DC joins |
17:03:06 | | guest9234 quits [Ping timeout: 265 seconds] |
17:03:14 | | etnguyen03 quits [Ping timeout: 252 seconds] |
17:12:19 | | Wohlstand quits [Client Quit] |
17:12:42 | | nahimgood joins |
17:14:03 | | nahimgood quits [Remote host closed the connection] |
17:14:25 | | nahimnotgood joins |
17:14:26 | | nahimnotgood quits [Remote host closed the connection] |
17:14:43 | | aaaa1 joins |
17:24:15 | | webuser9995 joins |
17:24:23 | | webuser9995 leaves |
17:32:27 | | etnguyen03 (etnguyen03) joins |
17:49:37 | | IRC2DC quits [Remote host closed the connection] |
17:49:45 | | IRC2DC joins |
17:53:36 | | IRC2DC quits [Remote host closed the connection] |
17:53:48 | | IRC2DC joins |
17:54:56 | | etnguyen03 quits [Ping timeout: 252 seconds] |
18:25:16 | | HP_Archivist quits [Ping timeout: 265 seconds] |
18:28:41 | | parfait_ joins |
18:30:34 | | mindstrut1 quits [Read error: Connection reset by peer] |
18:32:07 | <@JAA> | FOIAonline completion rate has slowed down due to larger items, now at about a third done and an estimated 3 TiB total. ETA is still on time but only just (a bit over 4 days). |
18:32:21 | <@JAA> | (That's based on the rate of the past 6 hours.) |
18:34:32 | <@JAA> | Actually, probably closer to 4 TiB. |
18:35:27 | <thuban> | hm, rough--chronological ordering suggests sizes will continue to increase |
18:37:45 | | qw3rty_ joins |
18:38:19 | | qw3rty quits [Ping timeout: 265 seconds] |
18:41:50 | | Exorcism|tor (exorcism) joins |
18:41:59 | <@JAA> | Yeah |
18:43:09 | | mindstrut joins |
18:43:16 | | Peroniko joins |
18:43:22 | | Peroniko is now authenticated as Peroniko |
18:44:03 | <@JAA> | I can try throwing more concurrency at it. My machine is nowhere near its limits. |
18:44:21 | <@JAA> | And I haven't seen any rate limiting or blocks whatsoever so far, just some random timeouts. |
18:46:54 | <thuban> | seems wise, especially if you can adjust on the fly. what tooling are you using? |
18:49:59 | <@JAA> | qwarc |
18:50:24 | <@JAA> | I can't adjust the concurrency of running processes, but I can add more processes. :-) |
18:50:48 | <thuban> | >:? |
18:50:52 | <@JAA> | (I'd have to stop them, ideally gracefully, for the former.) |
18:51:46 | <@JAA> | I originally had one process at 25 concurrency, but that was far from ideal because it got blocked sometimes by large downloads. |
18:51:52 | <@JAA> | So now it's 5 processes with 5 concurrency each. |
18:56:53 | <thuban> | ah, i forgot qwarc runs off a database and everything. it's sufficiently self-organizing that you can just tell new processes to jump in, then? |
18:57:57 | <@JAA> | Yep, each process just takes items from the DB, processes them, and writes the new status back (plus any new items it might've discovered, not relevant in this case). |
18:58:24 | <thuban> | neat |
19:00:22 | | Exorcism|tor quits [Client Quit] |
19:01:04 | <@JAA> | It really is pretty much like a local tracker in that respect. That's what I modelled it after conceptually, anyway. |
19:02:36 | <@JAA> | Also, some of the timeouts I'm seeing are actually due to large downloads taking time to process, similar to the problems in wpull. |
19:03:14 | <@JAA> | Eventually™, I'll refactor that so the actual HTTP stuff happens in a separate thread. |
19:15:39 | | AmAnd0A quits [Read error: Connection reset by peer] |
19:15:54 | | AmAnd0A joins |
19:21:12 | | IRC2DC quits [Remote host closed the connection] |
19:21:29 | | aaaa1 quits [Remote host closed the connection] |
19:23:29 | | qw3rty_ quits [Ping timeout: 252 seconds] |
19:23:30 | | qw3rty joins |
19:28:14 | | imer quits [Client Quit] |
19:28:44 | | imer (imer) joins |
19:31:11 | | qw3rty quits [Ping timeout: 252 seconds] |
19:31:14 | | qw3rty_ joins |
19:32:27 | | AmAnd0A quits [Ping timeout: 265 seconds] |
19:32:43 | | AmAnd0A joins |
19:34:13 | | Rootliam joins |
19:36:50 | <Rootliam> | I got a response from Jason Scott about yahoo video with "all I can say is all the data is up there, one way or another. There's no other stores out there." |
19:37:18 | | AmAnd0A quits [Read error: Connection reset by peer] |
19:37:20 | <Rootliam> | I'm not really sure if that means it could have been mixed up with something else or if it wasn't uploaded then it's gone forever |
19:37:38 | | AmAnd0A joins |
19:37:54 | <thuban> | Rootliam: did you ever open that github issue? |
19:38:06 | <Rootliam> | No but I guess I should do that soon |
19:42:44 | | programmerq quits [Read error: Connection reset by peer] |
19:42:51 | | programmerq (programmerq) joins |
19:51:47 | | cascode quits [Ping timeout: 265 seconds] |
19:52:30 | | cascode joins |
20:05:48 | | Doomaholic quits [Ping timeout: 265 seconds] |
20:08:25 | | Doomaholic (Doomaholic) joins |
20:09:35 | | programmerq quits [Client Quit] |
20:09:57 | | programmerq (programmerq) joins |
20:15:21 | | icedice (icedice) joins |
20:17:55 | | cascode quits [Read error: Connection reset by peer] |
20:18:12 | | cascode joins |
20:25:19 | | programmerq quits [Client Quit] |
20:52:05 | <flashfire42> | Wait we are completely clogged? Like completely? |
21:00:22 | | ThetaDev quits [Client Quit] |
21:00:30 | | ThetaDev joins |
21:04:41 | | etnguyen03 (etnguyen03) joins |
21:18:47 | | BearFortress quits [Ping timeout: 265 seconds] |
21:26:08 | | girst quits [Ping timeout: 252 seconds] |
21:31:22 | | ffff quits [Remote host closed the connection] |
21:39:49 | | Peroniko quits [Read error: Connection reset by peer] |
21:41:09 | | Peroniko joins |
21:49:39 | | Exorcism quits [Remote host closed the connection] |
21:50:18 | | Exorcism (exorcism) joins |
22:00:21 | | Rootliam quits [Ping timeout: 265 seconds] |
22:00:21 | | Perk quits [Read error: Connection reset by peer] |
22:08:24 | | Perk joins |
22:08:25 | | Perk7 joins |
22:08:35 | | Perk7 quits [Remote host closed the connection] |
22:28:33 | | ffff joins |
22:33:14 | | AmAnd0A quits [Ping timeout: 252 seconds] |
22:33:17 | | AmAnd0A joins |
22:43:36 | | flashfire42 quits [Client Quit] |
22:43:36 | | mindstrut quits [Read error: Connection reset by peer] |
22:43:36 | | kiska quits [Client Quit] |
22:43:36 | | Ryz263 quits [Client Quit] |
22:43:36 | | s-crypt2 quits [Client Quit] |
22:43:50 | | mindstrut joins |
22:45:03 | | Ryz263 (Ryz) joins |
22:45:03 | | s-crypt2 (s-crypt) joins |
22:45:11 | | flashfire42 joins |
22:46:47 | | kiska (kiska) joins |
22:55:09 | | girst (girst) joins |
22:59:56 | | benjins quits [Read error: Connection reset by peer] |
23:07:55 | | AmAnd0A quits [Read error: Connection reset by peer] |
23:09:05 | | AmAnd0A joins |
23:13:13 | | HP_Archivist (HP_Archivist) joins |
23:17:28 | | BlueMaxima joins |
23:22:45 | | benjins joins |
23:24:02 | | flashfire42 is now authenticated as flashfire42 |
23:29:37 | <flashfire42> | So is it Optane9 again rewby or is it the transferring stuck? |
23:30:57 | | benjinsm joins |
23:32:19 | <flashfire42> | Ok looks like it's optane9 that needs a kick if you have access to it, JAA. I did a test: Mediafire uses a separate target and one of them went through fine |
23:32:29 | <@JAA> | flashfire42: Please stop. |
23:33:04 | <@JAA> | Targets are doing target things as well as they can. The situation isn't great, and everyone's aware of it. |
23:33:44 | | benjins quits [Ping timeout: 252 seconds] |
23:41:24 | | BearFortress joins |
23:41:38 | | Exorcism quits [Remote host closed the connection] |
23:42:21 | | Exorcism (exorcism) joins |
23:44:11 | | RetiredTurtle joins |
23:46:56 | | Peroniko quits [Ping timeout: 252 seconds] |
23:47:07 | | benjinsmi joins |
23:49:39 | | AmAnd0A quits [Read error: Connection reset by peer] |
23:50:14 | | benjinsm quits [Ping timeout: 252 seconds] |
23:52:13 | | AmAnd0A joins |
23:52:22 | | RetiredTurtle is now authenticated as Peroniko |
23:56:50 | | Chris5010 quits [Ping timeout: 252 seconds] |