| 00:00:30 | | jtagcat quits [Quit: Bye!] |
| 00:00:57 | | jtagcat (jtagcat) joins |
| 00:11:35 | | Arcorann (Arcorann) joins |
| 00:26:59 | | cobertos_ quits [Remote host closed the connection] |
| 00:35:38 | | hitgrr8 quits [Client Quit] |
| 00:55:09 | | nicolas17 quits [Ping timeout: 265 seconds] |
| 00:59:09 | | nicolas17 joins |
| 01:10:10 | | sonick (sonick) joins |
| 01:16:54 | | etnguyen03 quits [Ping timeout: 265 seconds] |
| 01:29:22 | | etnguyen03 (etnguyen03) joins |
| 01:43:58 | | chessnoob280 quits [Ping timeout: 265 seconds] |
| 01:58:13 | | jacksonchen666 quits [Client Quit] |
| 02:43:12 | | HIJIUIGURUY joins |
| 02:43:28 | | HIJIUIGURUY quits [Remote host closed the connection] |
| 02:45:50 | | etnguyen03 quits [Ping timeout: 252 seconds] |
| 02:47:54 | | lunik173 joins |
| 02:55:00 | | Naruyoko5 joins |
| 02:57:00 | | Naruyoko quits [Ping timeout: 258 seconds] |
| 03:00:12 | | emily quits [Client Quit] |
| 03:00:50 | | pseudorizer (pseudorizer) joins |
| 03:00:55 | | etnguyen03 (etnguyen03) joins |
| 03:03:09 | | nicolas17 quits [Client Quit] |
| 03:45:31 | | decky joins |
| 04:21:32 | | etnguyen03 quits [Ping timeout: 265 seconds] |
| 04:26:05 | | etnguyen03 (etnguyen03) joins |
| 04:28:08 | | razul quits [Ping timeout: 252 seconds] |
| 04:57:42 | <@hook54321> | really not a fan of DRM, but i can see why they sent the takedown request with the surrounding legal stuff going on right now |
| 04:58:09 | | etnguyen03 quits [Client Quit] |
| 05:01:03 | <flashfire42> | I mean there are easier ways to get pirated books than crack DRM from IA |
| 05:08:38 | <fireonlive> | 👀 |
| 05:08:53 | <fireonlive> | hook54321: yeah for sure. hope it helps them |
| 05:33:11 | <pcr> | It does probably mean they are going to resist any attempt to perform a (maybe illegal) archival if they announce they are going to need to delete stuff on a day in the future. |
| 05:44:46 | | jasons1 (jasonswohl) joins |
| 05:44:47 | | jasons quits [Client Quit] |
| 05:44:47 | | jasons1 is now known as jasons |
| 05:48:11 | | GhostUser2863 joins |
| 05:56:19 | | BlueMaxima joins |
| 06:03:41 | <thuban> | flashfire42: depends on the book tbqh |
| 06:18:20 | | JohnnyJ joins |
| 07:05:24 | | skyrocket quits [Ping timeout: 258 seconds] |
| 07:06:37 | | Unholy236131 (Unholy2361) joins |
| 07:09:14 | | eroc1990 quits [Ping timeout: 258 seconds] |
| 07:09:17 | | Unholy23613 quits [Ping timeout: 252 seconds] |
| 07:09:17 | | Unholy236131 is now known as Unholy23613 |
| 07:10:55 | | skyrocket joins |
| 07:13:58 | | eroc1990 (eroc1990) joins |
| 07:18:20 | | pabs hmmm at https://godotforums.org/d/35412-sadly-i-think-godot-is-a-scam-im-not-sure-i-can-do-this |
| 07:21:41 | | hitgrr8 joins |
| 07:22:05 | | qwertyasdfuiopghjkl quits [Client Quit] |
| 07:26:36 | | qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins |
| 07:27:53 | <h2ibot> | Yts98 edited 半次元 (+2524, Explain alternate image CDN endpoints): https://wiki.archiveteam.org/?diff=50208&oldid=50196 |
| 07:30:32 | | chessnoob280 joins |
| 07:32:54 | <h2ibot> | Yts98 edited 半次元 (+0): https://wiki.archiveteam.org/?diff=50209&oldid=50208 |
| 07:44:29 | | dumbgoy quits [Ping timeout: 252 seconds] |
| 08:14:55 | | Island quits [Read error: Connection reset by peer] |
| 08:22:50 | | railen63 quits [Remote host closed the connection] |
| 08:25:48 | | railen63 joins |
| 08:34:54 | | razul joins |
| 09:17:03 | | BlueMaxima quits [Read error: Connection reset by peer] |
| 09:23:37 | | GhostUser2863 quits [Ping timeout: 265 seconds] |
| 09:40:03 | | chessnoob280 quits [Ping timeout: 265 seconds] |
| 09:41:22 | | Dango360 quits [Read error: Connection reset by peer] |
| 09:43:29 | | Dango360 (Dango360) joins |
| 09:53:00 | | yts98 leaves |
| 09:53:02 | | yts98 joins |
| 10:00:01 | | railen63 quits [Remote host closed the connection] |
| 10:00:17 | | railen63 joins |
| 10:18:24 | <h2ibot> | PaulWise edited Bugzilla (+30, ghostscript bugzilla): https://wiki.archiveteam.org/?diff=50210&oldid=50178 |
| 10:18:25 | <h2ibot> | PaulWise edited Bugzilla (+53, IRC channel topics idea): https://wiki.archiveteam.org/?diff=50211&oldid=50210 |
| 10:25:26 | <h2ibot> | PaulWise edited Bugzilla (+160, security issue lists idea): https://wiki.archiveteam.org/?diff=50212&oldid=50211 |
| 10:26:26 | <h2ibot> | PaulWise edited Bugzilla (-2, syntax fix): https://wiki.archiveteam.org/?diff=50213&oldid=50212 |
| 10:33:27 | <h2ibot> | PaulWise edited Mailman2 (+213, add IRC and sectrackers as sources of mailman2…): https://wiki.archiveteam.org/?diff=50214&oldid=50180 |
| 10:51:53 | | Mateon2 joins |
| 10:52:22 | | JohnnyJ quits [Client Quit] |
| 10:52:22 | | qwertyasdfuiopghjkl quits [Remote host closed the connection] |
| 10:52:22 | | razul quits [Client Quit] |
| 10:52:22 | | Mateon1 quits [Remote host closed the connection] |
| 10:52:22 | | Mateon2 is now known as Mateon1 |
| 10:52:26 | | JohnnyJ joins |
| 10:52:37 | | razul joins |
| 11:02:25 | | JohnnyJ quits [Read error: Connection reset by peer] |
| 11:35:16 | | eroc1990 quits [Ping timeout: 258 seconds] |
| 11:54:06 | | jacksonchen666 (jacksonchen666) joins |
| 12:02:22 | | eroc1990 (eroc1990) joins |
| 12:02:44 | <h2ibot> | PaulWise edited Bugzilla (+792, add URLs from Debian sectracker): https://wiki.archiveteam.org/?diff=50215&oldid=50213 |
| 12:08:59 | | chessnoob280 joins |
| 12:25:11 | | jc666 (jacksonchen666) joins |
| 12:28:21 | | jacksonchen666 quits [Ping timeout: 245 seconds] |
| 13:19:00 | | chessnoob280 quits [Ping timeout: 265 seconds] |
| 13:24:40 | | lunik173 quits [Client Quit] |
| 13:25:03 | | lunik173 joins |
| 13:39:36 | | jc666 quits [Ping timeout: 245 seconds] |
| 13:52:00 | | jc666 (jacksonchen666) joins |
| 14:08:50 | | W7RFa6AbNFz_ quits [Read error: Connection reset by peer] |
| 14:09:06 | | W7RFa6AbNFz_ joins |
| 14:12:57 | | etnguyen03 (etnguyen03) joins |
| 14:24:46 | | chessnoob280 joins |
| 14:39:44 | | nulldata quits [Ping timeout: 252 seconds] |
| 14:44:41 | | Arcorann quits [Ping timeout: 252 seconds] |
| 14:48:33 | | nulldata (nulldata) joins |
| 14:52:21 | | chessnoob280 is now authenticated as chessnoob280 |
| 14:55:09 | | jc666 is now known as jacksonchen666 |
| 14:58:08 | | chessnoob280 quits [Remote host closed the connection] |
| 15:20:00 | | dumbgoy joins |
| 15:33:38 | | etnguyen03 quits [Ping timeout: 252 seconds] |
| 16:00:27 | <h2ibot> | JAABot edited CurrentWarriorProject (-2): https://wiki.archiveteam.org/?diff=50216&oldid=50153 |
| 16:22:18 | | etnguyen03 (etnguyen03) joins |
| 16:52:06 | | sonick quits [Client Quit] |
| 16:57:57 | | fishingforsoup_ joins |
| 16:58:02 | | gfhh quits [Ping timeout: 258 seconds] |
| 17:01:52 | | fishingforsoup__ quits [Ping timeout: 258 seconds] |
| 17:03:50 | | etnguyen03 quits [Ping timeout: 252 seconds] |
| 17:05:32 | | AmAnd0A quits [Read error: Connection reset by peer] |
| 17:05:49 | | AmAnd0A joins |
| 17:19:44 | | jasons quits [Client Quit] |
| 17:20:15 | | jasons (jasonswohl) joins |
| 17:21:12 | | etnguyen03 (etnguyen03) joins |
| 17:31:16 | | razul quits [Client Quit] |
| 17:32:40 | | razul joins |
| 17:43:26 | | etnguyen03 quits [Ping timeout: 252 seconds] |
| 17:50:55 | | etnguyen03 (etnguyen03) joins |
| 17:54:57 | | chessnoob280 joins |
| 18:13:50 | | etnguyen03 quits [Ping timeout: 265 seconds] |
| 18:49:13 | <pokechu22> | JAA: I have a large set of URLs related to germandocsinrussia.org and historyrussia.org book scans, probably on the order of 10 million across all sites and all unsaved zoom levels. They're incremental IDs with gaps (e.g. https://wwii.germandocsinrussia.org/pages/24/zooms/8, https://wwii.germandocsinrussia.org/pages/1505900/zooms/8 - I haven't figured out the exact maximum |
| 18:49:16 | <pokechu22> | yet) where zoom ranges from 3 to 7 or 8 (0-2 are not used directly, but instead e.g. https://wwii.germandocsinrussia.org/system/pages/000/734/55/images/small/fd4fabbe9f63bf507db8ac35af4e318616146ad4.jpg?1538539960 or x_small or xx_small, and archivebot will have already captured them so we don't need to worry about the random-looking component). I assume qwarc is the best |
| 18:49:18 | <pokechu22> | tool for that, as giving archivebot an !ao < list job with 10 million entries will result in sadness? |
| 18:49:26 | <pokechu22> | If so, what kind of information do you need to do a qwarc job? |
| 19:01:22 | | nicolas17 joins |
| 19:01:30 | | nicolas17 is now authenticated as nicolas17 |
| 19:21:40 | <pokechu22> | For the AB jobs that have finished, I've determined that the highest valid IDs are https://tsamo.germandocsinrussia.org/pages/48045/zooms/8 and https://rgaspi-458-9.germandocsinrussia.org/pages/77762/zooms/8 (and that there are 40593 and 70703 actual valid images in that region respectively, with invalid ones in that range giving 500s and ones outside that range giving 404s). |
| 19:21:43 | <pokechu22> | It seems like zoom 8 gives errors on some URLs (e.g. https://rgaspi-458-9.germandocsinrussia.org/pages/8/zooms/8) for which zoom 7 does work. |
| 19:22:26 | | cm quits [Ping timeout: 252 seconds] |
| 19:23:24 | <pokechu22> | ah, scratch that about 500s, seems to depend on the site as https://wwii.germandocsinrussia.org/pages/163000/zooms/8 and https://wwii.germandocsinrussia.org/pages/165000/zooms/8 are 200 but https://wwii.germandocsinrussia.org/pages/164000/zooms/8 is 404 instead of 500. I'll just wait for AB to finish to get a maximum valid ID instead of trying to do a binary search |
| 19:26:08 | | cm joins |
| 19:35:05 | | eroc1990 quits [Ping timeout: 252 seconds] |
| 19:35:51 | | sec^nd quits [Ping timeout: 245 seconds] |
| 19:40:58 | | sec^nd (second) joins |
| 19:44:17 | | razul quits [Client Quit] |
| 19:45:34 | | razul joins |
| 19:45:54 | <@JAA> | pokechu22: Yeah, loading 10M into AB would be slow. The list input importing in wpull is a bit awkward. It'd probably take a few hours. That's the only sad part though. It's certainly better otherwise because it allows for easy monitoring, request rate adjustment, etc., which isn't really the case with qwarc. |
| 19:46:38 | <@JAA> | And could do it in smaller chunks of course rather than one huge list. |
| 19:47:18 | <@JAA> | It's possible of course with qwarc, just doesn't sound like a great fit unless the site is going down soon and can handle several dozen requests per second. |
| 19:47:28 | <pokechu22> | Alright, I might try it for the smaller ones at least |
| 19:48:10 | <pokechu22> | I'm not aware of any rate-limiting - I'll try tsamo.germandocsinrussia.org at an aggressive rate with AB to see what happens maybe |
| 19:56:32 | <@JAA> | Well, qwarc is about 1 or 2 orders of magnitude faster than AB... |
| 19:57:02 | <@JAA> | (Without trying hard, that is.) |
| 19:57:33 | <@JAA> | Although AB is able to reach something like 20 req/s quite comfortably for images. |
| 19:58:17 | <pokechu22> | Probably we'd be limited by the ping time to russia if anything |
| 19:58:38 | <@JAA> | Right |
| 20:03:29 | <nicolas17> | JAA: https://archive.org/details/csdnsdplist this has a bunch of "screensavers" used on Apple Store demo devices, but it also has the original URLs they were downloaded from, would it be worth putting them in archivebot or something so they're on WBM? |
| 20:09:15 | <@JAA> | (TIL 'H.264 IA' for derived videos.) |
| 20:09:20 | <@JAA> | nicolas17: Maybe, yeah. I wouldn't be opposed to it. |
| 20:14:21 | <fireonlive> | i wonder why the first one is 'sideways' |
| 20:14:34 | <fireonlive> | hm they all seem sideways |
| 20:16:01 | <fireonlive> | the few i checked yday were good though :D |
| 20:16:39 | <nicolas17> | Side data: |
| 20:16:40 | <nicolas17> | displaymatrix: rotation of -90.00 degrees |
| 20:16:51 | <fireonlive> | ahh |
| 20:16:59 | <nicolas17> | which the web player doesn't understand ig |
| 20:17:07 | <fireonlive> | makes sense :) |
| 20:17:26 | <nicolas17> | also it seems many of these are h265 and HDR |
| 20:21:15 | <pokechu22> | ugh, looks like there's also a map view for some pages that's higher resolution, e.g. https://tsamo.germandocsinrussia.org/pages/44716/map - indicated on view-source:https://tsamo.germandocsinrussia.org/ru/nodes/246-delo-234-karta-polozheniya-frantsuzskih-angliyskih-i-belgiyskih-voysk-na-zapadnom-fronte-na-04-05-1918g-m-1-750-000 by map_ids = [44716]; in the JS. Pretty sure |
| 20:21:18 | <pokechu22> | the only way to find those is to download the full warcs :| |
| 20:22:39 | <pokechu22> | (you can plug in any page ID, but most will try to load missing images, and I think there's only a few maps so trying to save them for everything would be a waste of resources) |
| 20:33:23 | | etnguyen03 (etnguyen03) joins |
| 20:44:31 | | Jstar269 joins |
| 20:53:49 | | Jstar269 quits [Remote host closed the connection] |
| 20:59:42 | | Minkafighter52 quits [Client Quit] |
| 20:59:54 | | Minkafighter52 joins |
| 21:01:41 | | ymgve_ quits [Quit: Leaving] |
| 21:05:25 | | IDK_ quits [Client Quit] |
| 21:05:25 | | Minkafighter52 quits [Client Quit] |
| 21:05:31 | | IDK_ joins |
| 21:05:33 | | Minkafighter52 joins |
| 21:07:38 | | Island joins |
| 21:11:08 | | IDK_ quits [Client Quit] |
| 21:11:08 | | Minkafighter52 quits [Client Quit] |
| 21:11:13 | | IDK_ joins |
| 21:11:20 | | Minkafighter52 joins |
| 21:21:08 | | hitgrr8 quits [Client Quit] |
| 21:26:41 | | BigBrain quits [Ping timeout: 245 seconds] |
| 21:27:35 | | BigBrain (bigbrain) joins |
| 22:05:54 | | Jake quits [Quit: Leaving for a bit!] |
| 22:06:10 | | Jake (Jake) joins |
| 22:25:17 | <nicolas17> | JAA: transfer.archivete.am is down |
| 22:25:29 | | BigBrain quits [Remote host closed the connection] |
| 22:25:56 | | BigBrain (bigbrain) joins |
| 22:26:21 | <nicolas17> | Caddy returns Bad Gateway |
| 22:26:23 | <@JAA> | nicolas17: Yes, we have monitoring for that in #nodeping. |
| 22:26:30 | <nicolas17> | ok |
| 22:28:51 | | luckcolors quits [Ping timeout: 258 seconds] |
| 22:33:23 | | luckcolors (luckcolors) joins |
| 23:01:26 | | tzt quits [Ping timeout: 258 seconds] |
| 23:02:59 | | tzt (tzt) joins |
| 23:10:38 | | BlueMaxima joins |
| 23:21:39 | | ymgve joins |
| 23:22:43 | | fook joins |
| 23:46:43 | | Arcorann (Arcorann) joins |
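
The germandocsinrussia.org discussion above (incremental page IDs, zoom levels 3 to 8, and JAA's suggestion to feed ArchiveBot smaller chunks rather than one huge list) can be sketched roughly as follows. This is only an illustration, not something anyone in the channel ran: the function names `zoom_urls` and `chunked`, the starting page ID of 1, and the chunk size are all assumptions, and the maximum ID has to come from a finished job, as pokechu22 describes. IDs with gaps will simply produce URLs that return 404 or 500.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def zoom_urls(host: str, max_page_id: int, zooms: Iterable[int] = range(3, 9)) -> Iterator[str]:
    """Yield /pages/<id>/zooms/<zoom> URLs for every ID from 1 to max_page_id.

    Assumption: IDs start at 1; the log only says they are incremental with gaps.
    Zoom levels 3 to 8 follow the range mentioned in the log.
    """
    zoom_levels = list(zooms)
    for page_id in range(1, max_page_id + 1):
        for zoom in zoom_levels:
            yield f"https://{host}/pages/{page_id}/zooms/{zoom}"


def chunked(urls: Iterable[str], size: int) -> Iterator[List[str]]:
    """Split a URL stream into lists of at most `size` entries."""
    iterator = iter(urls)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


if __name__ == "__main__":
    # 48045 is the highest valid ID reported in the log for the tsamo subdomain.
    # The chunk size of 500000 is an arbitrary, hypothetical choice.
    all_urls = zoom_urls("tsamo.germandocsinrussia.org", 48045)
    for index, chunk in enumerate(chunked(all_urls, 500_000)):
        with open(f"tsamo-zooms-{index:03d}.txt", "w", encoding="utf-8") as handle:
            handle.write("\n".join(chunk) + "\n")
```

Each resulting text file could then be uploaded somewhere and passed to a separate list job, which keeps individual imports short and leaves room to adjust the request rate between chunks.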