00:02:42 | | lennier2 joins |
00:05:24 | | lennier2__ quits [Ping timeout: 250 seconds] |
00:21:00 | | etnguyen03 quits [Client Quit] |
00:21:26 | | cascode quits [Ping timeout: 250 seconds] |
00:22:27 | | cascode joins |
00:38:06 | | Ifan joins |
00:43:58 | <Ifan> | Hey. Not sure if this is the right place for this. A whole bunch of US government websites are going down (eg. USAID.gov just now.) Various data sources are at risk of being lost, some of which are essential to an education project I work on. Eg. https://civilrightsdata.ed.gov/data. I heard this is a good place to mention something like this. I |
00:43:58 | <Ifan> | couldn't find any rules page so please let me know if I'm going about this wrong |
00:44:28 | <pokechu22> | Ifan: We're currently working on it; #UncleSamsArchive is the channel for that specifically |
00:44:57 | <Ifan> | Awesome. Thanks. I barely have any idea how this site works. |
00:56:03 | | cascode quits [Ping timeout: 260 seconds] |
00:57:37 | | cascode joins |
00:59:23 | | Naruyoko5 quits [Quit: Leaving] |
01:14:29 | | BlueMaxima_ joins |
01:16:42 | | Naruyoko joins |
01:16:54 | | BlueMaxima quits [Ping timeout: 250 seconds] |
01:17:03 | | cascode quits [Read error: Connection reset by peer] |
01:17:16 | | cascode joins |
01:29:35 | | etnguyen03 (etnguyen03) joins |
01:33:56 | | Webuser178098 joins |
01:48:33 | | Ifan quits [Client Quit] |
02:00:39 | | Webuser178098 quits [Client Quit] |
02:00:48 | | cascode quits [Ping timeout: 260 seconds] |
02:01:11 | | cascode joins |
02:04:58 | <pabs> | pokechu22: are there known ignores for fandom? |
02:05:53 | <pokechu22> | Not really beyond -i mediawiki |
02:21:54 | | scurvy_duck quits [Ping timeout: 250 seconds] |
02:24:08 | | cascode quits [Ping timeout: 260 seconds] |
02:24:40 | | cascode joins |
02:30:54 | | sec^nd quits [Remote host closed the connection] |
02:31:12 | | sec^nd (second) joins |
03:07:21 | | hackbug quits [Remote host closed the connection] |
03:10:07 | | hackbug (hackbug) joins |
03:13:04 | | scurvy_duck joins |
03:16:38 | | pabs quits [Ping timeout: 260 seconds] |
03:16:54 | <nicolas17> | /phoenixdown pabs |
03:17:21 | | pabs (pabs) joins |
03:17:50 | <nulldata> | nicolas17 - phoenix has risen again |
03:35:40 | | nicolas17 is now authenticated as nicolas17 |
03:35:53 | | cascode quits [Ping timeout: 260 seconds] |
03:38:21 | | cascode joins |
03:48:31 | <utulien_> | Ifan - there's a torrent of most of the CDC data up online already. |
03:48:53 | <utulien_> | (not from archiveteam, but some guy on reddit. it's on the internet archive) |
03:50:53 | | etnguyen03 quits [Client Quit] |
03:54:05 | | etnguyen03 (etnguyen03) joins |
03:54:13 | | Webuser303121 joins |
03:54:41 | | Webuser303121 quits [Client Quit] |
04:01:00 | | scurvy_duck quits [Remote host closed the connection] |
04:19:49 | | etnguyen03 quits [Read error: Connection reset by peer] |
05:00:00 | | benjins3 quits [Read error: Connection reset by peer] |
05:08:41 | <tech234a> | DSLReports updated again with weird restriction: "NEWS: The full site corpus is only available (in readonly form) for 5 minutes past each hour, for members and guests." |
05:09:14 | <tech234a> | do we have ArchiveBot tooling to only crawl a website for the first 5 minutes past each hour? |
05:11:23 | <nicolas17> | what the fuck |
05:11:49 | <nicolas17> | tech234a: we could manually pause and resume the AB job at the right moments ig |
05:12:27 | <@JAA> | Is it hosted on DSL with an hourly traffic limit? lol |
05:13:18 | <tech234a> | it gives a 503 error on the error pages, is it possible to have it retry those pages at the right time if it encounters one? |
05:14:11 | <tech234a> | also the error page says the 5 minutes per hour thing "may change at any time" |
05:15:46 | <nicolas17> | so we need someone to stare at the logs and pause/resume, wonderful |
05:15:53 | <nicolas17> | is there an AB job running already? |
05:22:12 | <@JAA> | No point in starting outside of those 5 minutes. |
05:24:41 | <@JAA> | But also, we tried to run it through ArchiveBot a couple years ago. It had to run extremely slowly. Like 5 requests per minute slowly. |
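The pause/resume timing discussed above can be sketched as a small helper. This is only an illustration under stated assumptions: the 5-minute window length is taken from the site's notice (which says it "may change at any time"), the function names are hypothetical, and nothing here calls any ArchiveBot interface.

```python
from datetime import datetime, timedelta

# Assumed from the DSLReports notice; the site says this may change at any time.
WINDOW_MINUTES = 5


def in_window(now: datetime, window_minutes: int = WINDOW_MINUTES) -> bool:
    """True if `now` falls within the first `window_minutes` of the hour."""
    return now.minute < window_minutes


def seconds_until_next_window(now: datetime,
                              window_minutes: int = WINDOW_MINUTES) -> float:
    """Seconds to wait before the next window opens (0 if already inside one)."""
    if in_window(now, window_minutes):
        return 0.0
    # The next window starts at the top of the following hour.
    next_hour = now.replace(minute=0, second=0, microsecond=0) + timedelta(hours=1)
    return (next_hour - now).total_seconds()


def should_retry_later(status_code: int) -> bool:
    """Treat a 503 as 'site closed or rate limited, requeue for a later window'."""
    return status_code == 503
```

A crawler loop could sleep for `seconds_until_next_window(datetime.now())` before each batch and requeue any URL for which `should_retry_later` is true, though the rate limiting reported later in the log suggests this alone would not be enough.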
05:27:44 | | BlueMaxima_ quits [Read error: Connection reset by peer] |
05:47:22 | | Island quits [Read error: Connection reset by peer] |
06:01:26 | <tech234a> | doesn't seem to be working as advertised |
06:04:22 | <@JAA> | I am not surprised. |
06:09:19 | <@JAA> | It's clearly still being worked on, but if it doesn't get better soon, we could try to contact them. |
06:49:33 | | qwertyasdfuiopghjkl2 quits [Ping timeout: 260 seconds] |
06:50:08 | | utulien_ quits [Ping timeout: 260 seconds] |
06:56:41 | | niemasd (niemasd) joins |
06:58:16 | <niemasd> | I have a manual dump of the cdc.gov website from 1/25, and it has some pages that Wayback Machine doesn't. The files seem to have their original timestamps. Is there any way to bulk-add them to Wayback Machine? I can share them with the Archive team if so |
06:59:02 | <niemasd> | I imagine no since there's no way to confirm their validity, but I figured I'd check just in case |
07:01:29 | <@JAA> | → #UncleSamsArchive |
07:02:53 | <niemasd> | Ah, thank you! |
07:03:25 | | niemasd leaves |
07:09:04 | | qwertyasdfuiopghjkl2 (qwertyasdfuiopghjkl2) joins |
07:09:32 | | qwertyasdfuiopghjkl2 quits [Max SendQ exceeded] |
07:30:30 | | earl joins |
07:45:36 | | cascode quits [Ping timeout: 250 seconds] |
07:45:43 | | cascode joins |
07:50:13 | | cascode quits [Ping timeout: 260 seconds] |
07:53:16 | | cascode joins |
08:05:20 | <@JAA> | So DSLReports works now, kind of. |
08:05:31 | <@JAA> | But the rate limit situation isn't looking great. |
08:06:22 | <@JAA> | AB got 503s almost immediately. They're actually the same 'mostly closed' message, but it's obviously rate limiting. |
08:06:32 | <@JAA> | ... and it's down again. |
08:08:28 | <@JAA> | This five minute window thing could only be made to work if we can go *hard* in those five minutes. But as I suspected, doesn't look like we can. |
08:12:08 | | cascode quits [Read error: Connection reset by peer] |
08:12:21 | | cascode joins |
09:05:49 | <steering> | >NEWS: The full site corpus is only available (in readonly form) for 5 minutes past each hour, for members and guests |
09:05:51 | <steering> | wat |
09:08:05 | <steering> | out of control AI scraping? xP |
09:23:17 | <steering> | it's still up right now though? |
09:23:41 | <steering> | oh only the homepage is up |
09:28:20 | | meisnick quits [Quit: Ooops, wrong browser tab.] |
10:02:45 | <Chewie9999> | TheTechRobo: Thanks! I'll try that. I love ntfy.sh :) |
10:11:00 | | earl quits [Client Quit] |
11:11:31 | | neggles quits [Quit: bye friends - ZNC - https://znc.in] |
11:13:56 | | T31M quits [Quit: ZNC - https://znc.in] |
11:15:16 | | T31M joins |
11:15:18 | | T31M is now authenticated as T31M |
11:20:39 | | Stagnant_ quits [Remote host closed the connection] |
11:39:23 | | earl joins |
12:00:04 | | Bleo18260072271962345 quits [Quit: The Lounge - https://thelounge.chat] |
12:02:47 | | Bleo18260072271962345 joins |
12:08:38 | | Webuser654763 joins |
12:34:19 | | SkilledAlpaca418962 quits [Quit: SkilledAlpaca418962] |
12:34:50 | | SkilledAlpaca418962 joins |
12:46:01 | | pixel leaves [Error from remote client] |
12:48:13 | <Webuser654763> | ```Failed to submit discovered URLs.wantreadnil |
12:48:13 | <Webuser654763> | nil``` getting this runner docker for the government grab |
12:49:19 | | pixel (pixel) joins |
12:52:18 | | Stagnant_ (Stagnant) joins |
13:49:40 | | etnguyen03 (etnguyen03) joins |
13:50:31 | | _Dango360 (Dango360) joins |
13:53:56 | | Dango360_ quits [Ping timeout: 250 seconds] |
14:02:35 | | SootBector quits [Remote host closed the connection] |
14:02:53 | | SootBector (SootBector) joins |
14:05:47 | | etnguyen03 quits [Client Quit] |
14:21:07 | | etnguyen03 (etnguyen03) joins |
14:40:17 | | etnguyen03 quits [Client Quit] |
14:43:28 | | etnguyen03 (etnguyen03) joins |
14:50:20 | | Matthww joins |
15:11:18 | <TheTechRobo> | Webuser654763: Should be fixed. |
15:30:51 | | Dango360_ (Dango360) joins |
15:34:02 | | _Dango360 quits [Ping timeout: 250 seconds] |
15:44:24 | | etnguyen03 quits [Client Quit] |
15:45:34 | | Webuser781938 joins |
15:45:43 | | Webuser781938 quits [Client Quit] |
15:46:07 | | etnguyen03 (etnguyen03) joins |
15:53:02 | | Webuser365683 joins |
16:02:59 | | etnguyen03 quits [Client Quit] |
16:03:15 | | Webuser365683 quits [Client Quit] |
17:00:54 | | icedice (icedice) joins |
17:05:25 | | i_have_n0_idea quits [Quit: The Lounge - https://thelounge.chat] |
17:05:43 | | i_have_n0_idea (i_have_n0_idea) joins |
17:09:28 | | PredatorIWD25 quits [Read error: Connection reset by peer] |
17:12:57 | | PredatorIWD25 joins |
17:22:43 | | utulien joins |
17:22:50 | | etnguyen03 (etnguyen03) joins |
17:35:52 | | pseudorizer quits [Quit: ZNC 1.9.1 - https://znc.in] |
17:37:46 | | pseudorizer (pseudorizer) joins |
17:46:19 | | etnguyen03 quits [Client Quit] |
17:50:59 | | etnguyen03 (etnguyen03) joins |
17:53:25 | <h2ibot> | TheTechRobo edited YouTube (+53, Rewrite Wayback Machine section): https://wiki.archiveteam.org/?diff=54319&oldid=53952 |
18:32:21 | | etnguyen03 quits [Client Quit] |
18:37:42 | | etnguyen03 (etnguyen03) joins |
18:46:57 | | etnguyen03 quits [Client Quit] |
19:00:54 | | scurvy_duck joins |
19:10:23 | <eggdrop> | [remind] pokechu22: https://sleepnomoreauction.com/ auctions close shortly cc TheTechRobo |
19:15:03 | | katocala quits [Ping timeout: 260 seconds] |
19:15:16 | | katocala joins |
19:28:28 | | katocala quits [Ping timeout: 260 seconds] |
19:28:43 | <pokechu22> | TheTechRobo: https://transfer.archivete.am/X5AeX/sleepnomoreauction.com_urls_redo.txt - it seems like there are more URLs now (though I'm not sure why?) |
19:28:43 | <eggdrop> | inline (for browser viewing): https://transfer.archivete.am/inline/X5AeX/sleepnomoreauction.com_urls_redo.txt |
19:29:16 | | katocala joins |
19:39:32 | | Webuser654763 quits [Quit: Ooops, wrong browser tab.] |
19:45:10 | | HP_Archivist (HP_Archivist) joins |
19:52:23 | | HP_Archivist quits [Ping timeout: 260 seconds] |
19:56:06 | <tech234a> | DSLReports seems to generally be made available a few minutes before each hour starts as well, though the exact time might be inconsistent |
19:57:10 | | Miki_57 quits [Quit: Leaving] |
20:05:51 | | BornOn420 quits [Remote host closed the connection] |
20:06:24 | | BornOn420 (BornOn420) joins |
20:24:08 | | HP_Archivist (HP_Archivist) joins |
20:24:18 | | Shyy quits [Quit: The Lounge - https://thelounge.chat] |
20:28:34 | | BlueMaxima joins |
20:32:28 | | etnguyen03 (etnguyen03) joins |
20:38:37 | | lennier2_ joins |
20:41:23 | | lennier2 quits [Ping timeout: 260 seconds] |
20:42:08 | | cascode quits [Ping timeout: 250 seconds] |
20:42:43 | | cascode joins |
20:43:28 | | Shyy joins |
21:02:34 | | Shyy quits [Client Quit] |
21:03:19 | | HP_Archivist quits [Read error: Connection reset by peer] |
21:03:59 | | Shyy joins |
21:05:44 | | HP_Archivist (HP_Archivist) joins |
21:08:49 | | HP_Archivist quits [Read error: Connection reset by peer] |
21:13:41 | | HP_Archivist (HP_Archivist) joins |
21:16:49 | | HP_Archivist quits [Read error: Connection reset by peer] |
21:17:14 | | cascode quits [Ping timeout: 250 seconds] |
21:18:20 | | cascode joins |
21:20:05 | | HP_Archivist (HP_Archivist) joins |
21:31:32 | | cascode quits [Ping timeout: 250 seconds] |
21:31:41 | | cascode joins |
21:54:30 | | BlueMaxima quits [Ping timeout: 250 seconds] |
22:14:48 | | etnguyen03 quits [Client Quit] |
22:22:24 | <szczot3k> | How do we grab yt videos once again? |
22:22:29 | <szczot3k> | Preferably to get them into WBM |
22:24:33 | | lunik11 quits [Quit: :x] |
22:24:44 | | Island joins |
22:25:10 | <nstrom|m> | #down-the-tube:hackint.org |
22:25:53 | <nstrom|m> | there's a bot in there if the videos meet the criteria for the project |
22:29:19 | | lunik11 joins |
22:31:50 | | loug8318142 quits [Quit: The Lounge - https://thelounge.chat] |
22:32:20 | | etnguyen03 (etnguyen03) joins |
22:46:31 | | loug8318142 joins |
22:49:11 | | loug8318142 quits [Client Quit] |
22:55:30 | | katocala is now authenticated as katocala |
23:01:13 | | etnguyen03 quits [Client Quit] |
23:05:00 | <TheTechRobo> | pokechu22: Running now |
23:19:52 | | cascode quits [Ping timeout: 250 seconds] |
23:20:38 | | scurvy_duck quits [Ping timeout: 260 seconds] |
23:23:53 | | cascode joins |