| 00:00:03 | | HackMii quits [Remote host closed the connection] |
| 00:00:20 | | HackMii (hacktheplanet) joins |
| 00:23:15 | | etnguyen03 quits [Quit: Konversation terminated!] |
| 00:34:32 | | ducky quits [Ping timeout: 260 seconds] |
| 00:36:20 | | ducky (ducky) joins |
| 00:48:29 | | ericgallager quits [Quit: This computer has gone to sleep] |
| 00:58:27 | | ericgallager joins |
| 01:00:51 | | Hackerpcs quits [Quit: Hackerpcs] |
| 01:01:32 | <ericgallager> | has anyone archived Zillow's climate risk scores? https://bsky.app/profile/volts.wtf/post/3m6upiqs67k2c |
| 01:04:54 | <pokechu22> | IIRC zillow is pretty anti-scraping |
| 01:06:56 | | Hackerpcs (Hackerpcs) joins |
| 01:08:42 | | Webuser421296 joins |
| 01:09:01 | | Webuser421296 quits [Client Quit] |
| 01:17:29 | | etnguyen03 (etnguyen03) joins |
| 01:18:01 | <nicolas17> | looking at parti-livestream |
| 01:18:35 | | jason joins |
| 01:19:09 | <nicolas17> | listing taking me 1 second per page ugh |
| 01:19:45 | | Czechball quits [Quit: Quit: Leaving] |
| 01:21:47 | <nicolas17> | 200k files and I don't know how far I am |
| 01:23:33 | <nicolas17> | there are duplicated filessss |
| 01:24:03 | <@JAA> | I once listed a bucket that took almost two months and produced a couple hundred GB of compressed JSONL. Yep, that's how it goes. :-) |
| 01:24:39 | <nicolas17> | example (2MB): |
| 01:24:40 | <nicolas17> | c20d4d80a75defb0364ca43f14486f58cd210597 423581_557ce44a-1597-4555-8221-557bec85c2f0.png |
| 01:24:41 | <nicolas17> | c20d4d80a75defb0364ca43f14486f58cd210597 423581_713af7b0-2fbd-43ef-a963-d9821cf2a95a.png |
| 01:25:02 | | @JAA pretends to be surprised. |
| 01:25:19 | <@JAA> | .png sounds like it might be a thumbnail or similar? |
| 01:26:39 | <nicolas17> | https://media.parti.com/423581_713af7b0-2fbd-43ef-a963-d9821cf2a95a.png (bad content-type) |
| 01:28:03 | <nicolas17> | ...really curious what that decimal number before the uuid means |
| 01:28:22 | | etnguyen03 quits [Client Quit] |
| 01:29:57 | <nicolas17> | 1 million files, 630GiB, still going |
| 01:30:19 | <@JAA> | Yeah, I wouldn't be surprised if this were pretty big, especially if it's HLS with segments. |
| 01:30:43 | <nicolas17> | so far it's all images |
| 01:30:44 | <nicolas17> | ??? |
| 01:34:41 | <@JAA> | I just found that netcup employs Anubis, by the way. Custom message 'This site is protected by <a href="https://www.anexia.com/">ANEXIA</a>.' and the anime girl image is a slow 403. |
| 01:38:51 | <nicolas17> | 1TiB |
| 01:40:24 | <nexussfan> | JAA: custom versions of anubis exist, but AFAIK they all use .within-website |
| 01:42:19 | <nicolas17> | oh god I got to the video streams, this is huge |
| 01:45:19 | <nicolas17> | /channels contains 298 dev-channel-<id> subdirectories which seems... strangely low? |
| 01:47:00 | <nicolas17> | "the web3 creator economy & live stream platform" okay having <300 active users makes more sense now |
| 01:47:42 | <nicolas17> | how come they're using google cloud storage instead of filecoin? :P |
| 01:52:18 | <mystique_altrosky> | cause they need something that actually works |
| 01:57:30 | | etnguyen03 (etnguyen03) joins |
| 02:09:10 | | tzt quits [Remote host closed the connection] |
| 02:09:29 | | tzt (tzt) joins |
| 02:11:09 | | nathang2184 quits [Quit: Ping timeout (120 seconds)] |
| 02:11:28 | | nathang2184 joins |
| 02:19:00 | | nathang2184 quits [Ping timeout: 256 seconds] |
| 02:26:16 | <nicolas17> | list has been running for an hour, 7M files, 5.2TiB, still going |
| 02:26:41 | | beardicus quits [Ping timeout: 272 seconds] |
| 02:26:43 | <nicolas17> | I started from scratch on my VPS which has better latency to google, I'm optimistic it will catch up with my local PC long before it finishes |
| 02:27:50 | | beardicus (beardicus) joins |
| 02:30:17 | | nathang2184 joins |
| 02:30:48 | | Czechball joins |
| 02:31:13 | | Doomaholic (Doomaholic) joins |
| 02:34:21 | | jason quits [Read error: Connection reset by peer] |
| 02:34:46 | | jason joins |
| 02:38:27 | | Wohlstand quits [Quit: Wohlstand] |
| 02:38:27 | | nathang2184 quits [Read error: Connection reset by peer] |
| 02:38:38 | | nathang2184 joins |
| 02:45:40 | | ducky quits [Ping timeout: 260 seconds] |
| 02:46:19 | | nathang2184 quits [Ping timeout: 272 seconds] |
| 02:47:28 | | ducky (ducky) joins |
| 02:48:02 | | sg72 quits [Remote host closed the connection] |
| 02:49:11 | | sg72 joins |
| 02:59:40 | | ducky quits [Ping timeout: 260 seconds] |
| 03:01:35 | | ducky (ducky) joins |
| 03:04:10 | <nicolas17> | 21M files, 17.6TiB, still going |
| 03:06:40 | | ducky quits [Ping timeout: 260 seconds] |
| 03:07:50 | | Island quits [Read error: Connection reset by peer] |
| 03:09:11 | | nathang2184 joins |
| 03:11:53 | | ducky (ducky) joins |
| 03:13:33 | | cultpony quits [Ping timeout: 272 seconds] |
| 03:14:16 | | cultpony (cultpony) joins |
| 03:14:49 | | nathang2184 quits [Ping timeout: 272 seconds] |
| 03:15:56 | <nicolas17> | I estimate 60TB but I could be way off |
| 03:16:56 | | ducky quits [Ping timeout: 260 seconds] |
| 03:24:23 | <h2ibot> | BlankEclair edited List of websites excluded from the Wayback Machine/Partial exclusions (+55, …): https://wiki.archiveteam.org/?diff=58207&oldid=58146 |
| 03:27:50 | | etnguyen03 quits [Client Quit] |
| 03:30:30 | | nathang2184 joins |
| 03:32:11 | | Lord_Nightmare quits [Quit: ZNC - http://znc.in] |
| 03:33:24 | | ducky (ducky) joins |
| 03:35:49 | | Lord_Nightmare (Lord_Nightmare) joins |
| 03:37:37 | | nathang2184 quits [Ping timeout: 272 seconds] |
| 03:40:27 | | etnguyen03 (etnguyen03) joins |
| 03:44:51 | | nathang2184 joins |
| 03:59:21 | <nicolas17> | ok finished channels/ and it seems there's more directories, so all bets are off now |
| 03:59:36 | | DogDisco joins |
| 04:01:19 | <@JAA> | Welcome to S3 bucket listing. |
| 04:04:19 | | etnguyen03 quits [Remote host closed the connection] |
| 04:13:24 | | ducky quits [Ping timeout: 260 seconds] |
| 04:17:59 | <nicolas17> | https://storage.googleapis.com/parti-livestream/?prefix=ivs 9% on this directory now |
| 04:23:05 | <nicolas17> | anyway I'm not sure if archiving this is feasible or useful |
| 04:23:54 | | Webuser669590 joins |
| 04:24:04 | <nicolas17> | by the time I'm done running the list, some videos from >30 days ago may have gotten removed, and there may be some new ones |
| 04:24:14 | | Webuser669590 quits [Client Quit] |
| 04:24:15 | <nicolas17> | I think *live* streams are in this same bucket even (so new HLS segment files are being added every 2 seconds) |
| 04:30:38 | <Guest> | nicolas17: ive seen that livestreams have ids assigned to them (maybe sequentially?), so i think it might have something to do with that |
| 04:31:10 | <nicolas17> | <nicolas17> ...really curious what that decimal number before the uuid means |
| 04:31:15 | <nicolas17> | I suspect it's the *user* ID |
| 04:31:27 | <Guest> | also possible |
| 04:31:48 | <nicolas17> | because under channels/ there's 300 directories with similar numbers, each of them having multiple videos with a timestamp |
| 04:32:02 | <nicolas17> | so that number is the user/channel ID, not the video ID |
| 04:35:01 | | ducky (ducky) joins |
| 04:36:50 | | SootBector quits [Remote host closed the connection] |
| 04:37:57 | | SootBector (SootBector) joins |
| 04:39:59 | <Guest> | only 300? |
| 04:40:34 | <nicolas17> | <nicolas17> "the web3 creator economy & live stream platform" okay having <300 active users makes more sense now |
| 04:40:42 | <Guest> | thats like what kick.com tried to do with twitch |
| 04:41:29 | <nicolas17> | note that if a user hasn't streamed in the last month (so saved streams already expired) and/or doesn't save stream recordings, I won't see the directory as existing at all |
| 04:41:41 | <Guest> | imo theres not much of a point in archiving (especially if theres only 300 channels) |
| 04:41:49 | <Guest> | i thought the bucket was a lot smaller |
| 04:42:24 | <nicolas17> | I don't even know what's in ivs/, maybe it's similar to "clips"? in which case there's a *ton* |
| 04:42:57 | <Guest> | what are the file formats? |
| 04:44:40 | | ducky quits [Ping timeout: 260 seconds] |
| 04:45:20 | | ducky (ducky) joins |
| 04:48:43 | <nicolas17> | HLS |
| 04:48:56 | <nicolas17> | might be they used a different system for past streams |
| 04:49:01 | <nicolas17> | ivs might mean https://aws.amazon.com/ivs/ |
| 04:54:37 | <nicolas17> | maybe the last month of streams is in channels/<id>/archive/<timestamp> but older stuff is in ivs? maybe they migrated systems around that time? 2025/10/27 is the most recent timestamp I see in ivs |
| 04:57:40 | <Guest> | parti was founded in 2017 and ivs was created in 2020, unless they changed the architecture since then (the site is really empty for an 8 year old company) |
| 04:59:46 | <Guest> | its kind of pointless whether they used ivs or not though |
| 05:02:07 | <nicolas17> | it seems ivs/ has stream recordings older than ~2025-10-27... whether they changed systems at that date, or they move them there after a month, doesn't really matter |
| 05:03:44 | <nicolas17> | either way my extrapolation so far says ivs/ is 100TB :p |
| 05:05:26 | <nicolas17> | and considering https://tracker.archiveteam.org/twitch/ was 550TB... |
| 05:06:22 | <Guest> | based on https://ivs.rocks/calculator i dont think they would use ivs |
| 05:06:40 | <nicolas17> | ah well that's for streaming |
| 05:06:53 | <nicolas17> | when the stream is over you can shove it into a regular S3 bucket |
| 05:12:11 | <nicolas17> | paying for 100TB of GCP storage is still no laughing matter tbh |
| 05:15:55 | | sec^nd quits [Remote host closed the connection] |
| 05:16:15 | | sec^nd (second) joins |
| 05:23:52 | | ducky quits [Ping timeout: 260 seconds] |
| 05:26:04 | | ducky (ducky) joins |
| 05:31:20 | | ducky quits [Ping timeout: 260 seconds] |
| 05:31:23 | <nicolas17> | Guest: does 1508 users sound more reasonable? |
| 05:34:09 | | jason quits [Ping timeout: 272 seconds] |
| 05:34:35 | | jason joins |
| 05:36:37 | | ducky (ducky) joins |
| 06:09:36 | | nexussfan quits [Quit: Konversation terminated!] |
| 06:31:10 | | jason quits [Read error: Connection reset by peer] |
| 06:31:35 | | jason joins |
| 06:34:03 | | jason quits [Read error: Connection reset by peer] |
| 06:34:18 | | jason joins |
| 06:58:31 | | Wohlstand (Wohlstand) joins |
| 07:18:53 | | hexagonwin quits [Read error: Connection reset by peer] |
| 07:19:47 | | jason quits [Remote host closed the connection] |
| 07:19:59 | | hexagonwin joins |
| 07:23:00 | | gosc joins |
| 07:26:12 | <gosc> | I can just make a page for the wiki right? I want to add Microsoft Store on there |
| 07:26:29 | <gosc> | it could also be a project in the future, like chrome extensions |
| 07:51:57 | | Dada joins |