00:00:30jtagcat quits [Quit: Bye!]
00:00:57jtagcat (jtagcat) joins
00:11:35Arcorann (Arcorann) joins
00:26:59cobertos_ quits [Remote host closed the connection]
00:35:38hitgrr8 quits [Client Quit]
00:55:09nicolas17 quits [Ping timeout: 265 seconds]
00:59:09nicolas17 joins
01:10:10sonick (sonick) joins
01:16:54etnguyen03 quits [Ping timeout: 265 seconds]
01:29:22etnguyen03 (etnguyen03) joins
01:43:58chessnoob280 quits [Ping timeout: 265 seconds]
01:58:13jacksonchen666 quits [Client Quit]
02:43:12HIJIUIGURUY joins
02:43:28HIJIUIGURUY quits [Remote host closed the connection]
02:45:50etnguyen03 quits [Ping timeout: 252 seconds]
02:47:54lunik173 joins
02:55:00Naruyoko5 joins
02:57:00Naruyoko quits [Ping timeout: 258 seconds]
03:00:12emily quits [Client Quit]
03:00:50pseudorizer (pseudorizer) joins
03:00:55etnguyen03 (etnguyen03) joins
03:03:09nicolas17 quits [Client Quit]
03:45:31decky joins
04:21:32etnguyen03 quits [Ping timeout: 265 seconds]
04:26:05etnguyen03 (etnguyen03) joins
04:28:08razul quits [Ping timeout: 252 seconds]
04:57:42<@hook54321>really not a fan of DRM, but i can see why they sent the takedown request with the surrounding legal stuff going on right now
04:58:09etnguyen03 quits [Client Quit]
05:01:03<flashfire42>I mean there are easier ways to get pirated books than crack DRM from IA
05:08:38<fireonlive>👀
05:08:53<fireonlive>hook54321: yeah for sure. hope it helps them
05:33:11<pcr>It does probably mean they are going to resist any attempt to perform a (maybe illegal) archival if they announce they are going to need to delete stuff on a day in the future.
05:44:46jasons1 (jasonswohl) joins
05:44:47jasons quits [Client Quit]
05:44:47jasons1 is now known as jasons
05:48:11GhostUser2863 joins
05:56:19BlueMaxima joins
06:03:41<thuban>flashfire42: depends on the book tbqh
06:18:20JohnnyJ joins
07:05:24skyrocket quits [Ping timeout: 258 seconds]
07:06:37Unholy236131 (Unholy2361) joins
07:09:14eroc1990 quits [Ping timeout: 258 seconds]
07:09:17Unholy23613 quits [Ping timeout: 252 seconds]
07:09:17Unholy236131 is now known as Unholy23613
07:10:55skyrocket joins
07:13:58eroc1990 (eroc1990) joins
07:18:20pabs hmmm at https://godotforums.org/d/35412-sadly-i-think-godot-is-a-scam-im-not-sure-i-can-do-this
07:21:41hitgrr8 joins
07:22:05qwertyasdfuiopghjkl quits [Client Quit]
07:26:36qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins
07:27:53<h2ibot>Yts98 edited 半次元 (+2524, Explain alternate image CDN endpoints): https://wiki.archiveteam.org/?diff=50208&oldid=50196
07:30:32chessnoob280 joins
07:32:54<h2ibot>Yts98 edited 半次元 (+0): https://wiki.archiveteam.org/?diff=50209&oldid=50208
07:44:29dumbgoy quits [Ping timeout: 252 seconds]
08:14:55Island quits [Read error: Connection reset by peer]
08:22:50railen63 quits [Remote host closed the connection]
08:25:48railen63 joins
08:34:54razul joins
09:17:03BlueMaxima quits [Read error: Connection reset by peer]
09:23:37GhostUser2863 quits [Ping timeout: 265 seconds]
09:40:03chessnoob280 quits [Ping timeout: 265 seconds]
09:41:22Dango360 quits [Read error: Connection reset by peer]
09:43:29Dango360 (Dango360) joins
09:53:00yts98 leaves
09:53:02yts98 joins
10:00:01railen63 quits [Remote host closed the connection]
10:00:17railen63 joins
10:18:24<h2ibot>PaulWise edited Bugzilla (+30, ghostscript bugzilla): https://wiki.archiveteam.org/?diff=50210&oldid=50178
10:18:25<h2ibot>PaulWise edited Bugzilla (+53, IRC channel topics idea): https://wiki.archiveteam.org/?diff=50211&oldid=50210
10:25:26<h2ibot>PaulWise edited Bugzilla (+160, security issue lists idea): https://wiki.archiveteam.org/?diff=50212&oldid=50211
10:26:26<h2ibot>PaulWise edited Bugzilla (-2, syntax fix): https://wiki.archiveteam.org/?diff=50213&oldid=50212
10:33:27<h2ibot>PaulWise edited Mailman2 (+213, add IRC and sectrackers as sources of mailman2…): https://wiki.archiveteam.org/?diff=50214&oldid=50180
10:51:53Mateon2 joins
10:52:22JohnnyJ quits [Client Quit]
10:52:22qwertyasdfuiopghjkl quits [Remote host closed the connection]
10:52:22razul quits [Client Quit]
10:52:22Mateon1 quits [Remote host closed the connection]
10:52:22Mateon2 is now known as Mateon1
10:52:26JohnnyJ joins
10:52:37razul joins
11:02:25JohnnyJ quits [Read error: Connection reset by peer]
11:35:16eroc1990 quits [Ping timeout: 258 seconds]
11:54:06jacksonchen666 (jacksonchen666) joins
12:02:22eroc1990 (eroc1990) joins
12:02:44<h2ibot>PaulWise edited Bugzilla (+792, add URLs from Debian sectracker): https://wiki.archiveteam.org/?diff=50215&oldid=50213
12:08:59chessnoob280 joins
12:25:11jc666 (jacksonchen666) joins
12:28:21jacksonchen666 quits [Ping timeout: 245 seconds]
13:19:00chessnoob280 quits [Ping timeout: 265 seconds]
13:24:40lunik173 quits [Client Quit]
13:25:03lunik173 joins
13:39:36jc666 quits [Ping timeout: 245 seconds]
13:52:00jc666 (jacksonchen666) joins
14:08:50W7RFa6AbNFz_ quits [Read error: Connection reset by peer]
14:09:06W7RFa6AbNFz_ joins
14:12:57etnguyen03 (etnguyen03) joins
14:24:46chessnoob280 joins
14:39:44nulldata quits [Ping timeout: 252 seconds]
14:44:41Arcorann quits [Ping timeout: 252 seconds]
14:48:33nulldata (nulldata) joins
14:55:09jc666 is now known as jacksonchen666
14:58:08chessnoob280 quits [Remote host closed the connection]
15:20:00dumbgoy joins
15:33:38etnguyen03 quits [Ping timeout: 252 seconds]
16:00:27<h2ibot>JAABot edited CurrentWarriorProject (-2): https://wiki.archiveteam.org/?diff=50216&oldid=50153
16:22:18etnguyen03 (etnguyen03) joins
16:52:06sonick quits [Client Quit]
16:57:57fishingforsoup_ joins
16:58:02gfhh quits [Ping timeout: 258 seconds]
17:01:52fishingforsoup__ quits [Ping timeout: 258 seconds]
17:03:50etnguyen03 quits [Ping timeout: 252 seconds]
17:05:32AmAnd0A quits [Read error: Connection reset by peer]
17:05:49AmAnd0A joins
17:19:44jasons quits [Client Quit]
17:20:15jasons (jasonswohl) joins
17:21:12etnguyen03 (etnguyen03) joins
17:31:16razul quits [Client Quit]
17:32:40razul joins
17:43:26etnguyen03 quits [Ping timeout: 252 seconds]
17:50:55etnguyen03 (etnguyen03) joins
17:54:57chessnoob280 joins
18:13:50etnguyen03 quits [Ping timeout: 265 seconds]
18:49:13<pokechu22>JAA: I have a large set of URLs related to germandocsinrussia.org and historyrussia.org book scans, probably on the order of 10 million across all sites and all unsaved zoom levels. They're incremental IDs with gaps (e.g. https://wwii.germandocsinrussia.org/pages/24/zooms/8, https://wwii.germandocsinrussia.org/pages/1505900/zooms/8 - I haven't figured out the exact maximum
18:49:16<pokechu22>yet) where zoom ranges from 3 to 7 or 8 (0-2 are not used directly, but instead e.g. https://wwii.germandocsinrussia.org/system/pages/000/734/55/images/small/fd4fabbe9f63bf507db8ac35af4e318616146ad4.jpg?1538539960 or x_small or xx_small, and archivebot will have already captured them so we don't need to worry about the random-looking component). I assume qwarc is the best
18:49:18<pokechu22>tool for that, as giving archivebot an !ao < list job with 10 million entries will result in sadness?
18:49:26<pokechu22>If so, what kind of information do you need to do a qwarc job?
19:01:22nicolas17 joins
19:21:40<pokechu22>For the AB jobs that have finished, I've determined that the highest valid IDs are https://tsamo.germandocsinrussia.org/pages/48045/zooms/8 and https://rgaspi-458-9.germandocsinrussia.org/pages/77762/zooms/8 (and that there are 40593 and 70703 actual valid images in that region respectively, with invalid ones in that range giving 500s and ones outside that range giving 404s).
19:21:43<pokechu22>It seems like zoom 8 gives errors on some URLs (e.g. https://rgaspi-458-9.germandocsinrussia.org/pages/8/zooms/8) for which zoom 7 does work.
19:22:26cm quits [Ping timeout: 252 seconds]
19:23:24<pokechu22>ah, scratch that about 500s, seems to depend on the site as https://wwii.germandocsinrussia.org/pages/163000/zooms/8 and https://wwii.germandocsinrussia.org/pages/165000/zooms/8 are 200 but https://wwii.germandocsinrussia.org/pages/164000/zooms/8 is 404 instead of 500. I'll just wait for AB to finish to get a maximum valid ID instead of trying to do a binary search
19:26:08cm joins
19:35:05eroc1990 quits [Ping timeout: 252 seconds]
19:35:51sec^nd quits [Ping timeout: 245 seconds]
19:40:58sec^nd (second) joins
19:44:17razul quits [Client Quit]
19:45:34razul joins
19:45:54<@JAA>pokechu22: Yeah, loading 10M into AB would be slow. The list input importing in wpull is a bit awkward. It'd probably take a few hours. That's the only sad part though. It's certainly better otherwise because it allows for easy monitoring, request rate adjustment, etc., which isn't really the case with qwarc.
19:46:38<@JAA>And you could do it in smaller chunks of course rather than one huge list.
19:47:18<@JAA>It's possible of course with qwarc, just doesn't sound like a great fit unless the site is going down soon and can handle several dozen requests per second.
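[editor's sketch] The chunking idea above could look roughly like the following. This is a minimal illustration, not a tool anyone in the channel used: the function names, the starting ID of 1, and the chunk size are assumptions, while the URL pattern, the zoom range 3 to 8, and the tsamo maximum ID of 48045 come from the messages in this log. Because the IDs have gaps, many of the generated URLs will be invalid.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def candidate_urls(host: str, max_page_id: int, zooms: Iterable[int],
                   start_id: int = 1) -> Iterator[str]:
    """Yield one URL per (page ID, zoom level) pair.

    start_id=1 is an assumption; the log only shows that IDs are
    incremental with gaps, not where they begin.
    """
    zoom_list = list(zooms)
    for page_id in range(start_id, max_page_id + 1):
        for zoom in zoom_list:
            yield f"https://{host}/pages/{page_id}/zooms/{zoom}"


def chunked(items: Iterable[str], size: int) -> Iterator[List[str]]:
    """Split any iterable into lists of at most `size` items."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def write_chunks(urls: Iterable[str], size: int, prefix: str) -> List[str]:
    """Write each chunk to its own text file, one URL per line."""
    paths = []
    for index, chunk in enumerate(chunked(urls, size)):
        path = f"{prefix}-{index:04d}.txt"
        with open(path, "w", encoding="utf-8") as handle:
            handle.write("\n".join(chunk) + "\n")
        paths.append(path)
    return paths


if __name__ == "__main__":
    # Host and maximum ID are taken from the log; the chunk size of
    # 100000 URLs per file is an arbitrary illustrative choice.
    urls = candidate_urls("tsamo.germandocsinrussia.org", 48045, range(3, 9))
    for path in write_chunks(urls, 100_000, "tsamo-zooms"):
        print(path)
```

Each resulting file could then be uploaded somewhere and queued as its own list job, which keeps every job small enough to import quickly and lets the request rate be adjusted per job.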
19:47:28<pokechu22>Alright, I might try it for the smaller ones at least
19:48:10<pokechu22>I'm not aware of any rate-limiting - I'll try tsamo.germandocsinrussia.org at an aggressive rate with AB to see what happens maybe
19:56:32<@JAA>Well, qwarc is about 1 or 2 orders of magnitude faster than AB...
19:57:02<@JAA>(Without trying hard, that is.)
19:57:33<@JAA>Although AB is able to reach something like 20 req/s quite comfortably for images.
19:58:17<pokechu22>Probably we'd be limited by the ping time to russia if anything
19:58:38<@JAA>Right
20:03:29<nicolas17>JAA: https://archive.org/details/csdnsdplist this has a bunch of "screensavers" used on Apple Store demo devices, but it also has the original URLs they were downloaded from, would it be worth putting them in archivebot or something so they're on WBM?
20:09:15<@JAA>(TIL 'H.264 IA' for derived videos.)
20:09:20<@JAA>nicolas17: Maybe, yeah. I wouldn't be opposed to it.
20:14:21<fireonlive>i wonder why the first one is 'sideways'
20:14:34<fireonlive>hm they all seem sideways
20:16:01<fireonlive>the few i checked yday were good though :D
20:16:39<nicolas17> Side data:
20:16:40<nicolas17> displaymatrix: rotation of -90.00 degrees
20:16:51<fireonlive>ahh
20:16:59<nicolas17>which the web player doesn't understand ig
20:17:07<fireonlive>makes sense :)
20:17:26<nicolas17>also it seems many of these are h265 and HDR
20:21:15<pokechu22>ugh, looks like there's also a map view for some pages that's higher resolution, e.g. https://tsamo.germandocsinrussia.org/pages/44716/map - indicated on view-source:https://tsamo.germandocsinrussia.org/ru/nodes/246-delo-234-karta-polozheniya-frantsuzskih-angliyskih-i-belgiyskih-voysk-na-zapadnom-fronte-na-04-05-1918g-m-1-750-000 by map_ids = [44716]; in the JS. Pretty sure
20:21:18<pokechu22>the only way to find those is to download the full warcs :|
20:22:39<pokechu22>(you can plug in any page ID, but most will try to load missing images, and I think there's only a few maps so trying to save them for everything would be a waste of resources)
20:33:23etnguyen03 (etnguyen03) joins
20:44:31Jstar269 joins
20:53:49Jstar269 quits [Remote host closed the connection]
20:59:42Minkafighter52 quits [Client Quit]
20:59:54Minkafighter52 joins
21:01:41ymgve_ quits [Quit: Leaving]
21:05:25IDK_ quits [Client Quit]
21:05:25Minkafighter52 quits [Client Quit]
21:05:31IDK_ joins
21:05:33Minkafighter52 joins
21:07:38Island joins
21:11:08IDK_ quits [Client Quit]
21:11:08Minkafighter52 quits [Client Quit]
21:11:13IDK_ joins
21:11:20Minkafighter52 joins
21:21:08hitgrr8 quits [Client Quit]
21:26:41BigBrain quits [Ping timeout: 245 seconds]
21:27:35BigBrain (bigbrain) joins
22:05:54Jake quits [Quit: Leaving for a bit!]
22:06:10Jake (Jake) joins
22:25:17<nicolas17>JAA: transfer.archivete.am is down
22:25:29BigBrain quits [Remote host closed the connection]
22:25:56BigBrain (bigbrain) joins
22:26:21<nicolas17>Caddy returns Bad Gateway
22:26:23<@JAA>nicolas17: Yes, we have monitoring for that in #nodeping.
22:26:30<nicolas17>ok
22:28:51luckcolors quits [Ping timeout: 258 seconds]
22:33:23luckcolors (luckcolors) joins
23:01:26tzt quits [Ping timeout: 258 seconds]
23:02:59tzt (tzt) joins
23:10:38BlueMaxima joins
23:21:39ymgve joins
23:22:43fook joins
23:46:43Arcorann (Arcorann) joins