01:24:55HP_Archivist (HP_Archivist) joins
01:25:20itachi1706 quits [Ping timeout: 240 seconds]
01:25:50neggles quits [Ping timeout: 240 seconds]
01:25:51itachi1706 (itachi1706) joins
01:26:03neggles (neggles) joins
01:41:38<h2ibot>Pedrosso edited Steam (+1237, Added the steam workshop as its own project as…): https://wiki.archiveteam.org/?diff=51378&oldid=51377
01:41:39<h2ibot>Pedrosso uploaded File:Steam workshop 2023-12-18.png (The main page of…): https://wiki.archiveteam.org/?title=File%3ASteam%20workshop%202023-12-18.png
01:41:40<h2ibot>Pedrosso uploaded File:Steam Workshop v3 2023.png (Steam Workshop Banner Image): https://wiki.archiveteam.org/?title=File%3ASteam%20Workshop%20v3%202023.png
01:47:40<h2ibot>JustAnotherArchivist edited Steam (+131, Move new section to the end of the page;…): https://wiki.archiveteam.org/?diff=51381&oldid=51378
01:49:45<@JAA>Pedrosso: Streams crossed there, your new edit has resulted in a conflict and I had to reject it.
01:50:04<Pedrosso>ok
01:51:41<h2ibot>JustAnotherArchivist edited Steam (+3, Fix Workshop image): https://wiki.archiveteam.org/?diff=51382&oldid=51381
01:51:50<Pedrosso>whoops
01:52:00<@JAA>Welp :-)
01:52:30<@JAA>> Your edit was ignored because no change was made to the text.
01:52:52<Pedrosso>Haha
01:53:00<Pedrosso>Wonderful
01:55:28<@JAA>I'd like to get rid of the 'How can I help?' sections scattered all over the wiki at some point. They're only useful while a DPoS project is active.
01:55:42<@JAA>Which is to say, they're all noise at this point.
01:56:30<Pedrosso>I mentioned that too. I was considering making them collapsible, but that didn't work due to the section headers
01:59:37itachi1706 quits [Client Quit]
02:01:06<@JAA>Maybe we should include a message with similar contents on all {{in progress}} DPoS projects. That should even be possible automatically from the infobox template.
02:01:37<Pedrosso>That sounds efficient
02:02:13itachi1706 (itachi1706) joins
02:02:26<Pedrosso>On the note of the Steam Workshop, I have coded a program ready to download all Portal 2 workshop creations, as long as I know a good way to upload them (not messing up the metadata, using a "nice" format, a "Steam Workshop" collection if needed)
02:03:43<@JAA>Well, since the Workshop is a web interface, it should be WARC and go into the WBM.
02:24:29qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins
02:34:38<Pedrosso>How exactly? As I mentioned on the wiki, the download isn't directly off the pages. On a Steam Workshop page you can only "subscribe" to an item, which does stuff within the app. The script I made uses the API to grab it directly. (If you knew all that already:) how could that be put into a WARC and into the WBM?
02:47:49<pabs>is the API based on HTTP GET or POST?
02:48:13<pabs>if GET, then AB can be fed a list of API URLs to download
02:53:29<nicolas17>what if GET but auth cookies? :P
02:53:35<nicolas17>we need an alignment chart
02:53:50<nicolas17>where plain GET with nice URLs would be lawful good
03:15:32<@JAA>Huh, is that new? I seem to remember downloading files directly from there. This would've been a few years ago though.
03:19:44<Pedrosso>GET, no cookies
03:19:57<Pedrosso>yes, you're right. A list of URLs would work
03:21:00<Pedrosso>actually, disregard that, I got confused. Pull
03:21:05<Pedrosso>Post*
03:21:22<Pedrosso>GET was the one for the comments
03:26:01<Pedrosso>ok so, the API for getting items is POST and it's https://api.steampowered.com/ISteamRemoteStorage/GetPublishedFileDetails/v1/ with publishedfileids[0]=ID_HERE and itemcount=1. It can be used for multiple items at once
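[A minimal sketch of that API call. The endpoint and form fields are from the message above; the response field names (`publishedfiledetails`, `file_url`) are assumptions based on the public Steam Web API docs, not confirmed here.]

```python
import requests

API = "https://api.steampowered.com/ISteamRemoteStorage/GetPublishedFileDetails/v1/"

def get_file_details(published_file_id: str) -> dict:
    # POST form data as described above; no API key or cookies needed.
    resp = requests.post(API, data={
        "itemcount": 1,
        "publishedfileids[0]": published_file_id,
    })
    resp.raise_for_status()
    # 'publishedfiledetails' and 'file_url' are assumed response fields.
    return resp.json()["response"]["publishedfiledetails"][0]

details = get_file_details("3058373765")
print(details.get("title"), details.get("file_url"))
```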
03:26:55<@JAA>Ok, and then the actual download is a simple GET.
03:27:15<@JAA>That can be archived into WARC in a way that would kind of work in the WBM, I think.
03:27:58<Pedrosso>> Ok, and then the actual download is a simple GET.
03:27:58<Pedrosso>It is? Can you give an example if I say the ID is 3058373765?
03:28:17<@JAA>https://steamusercontent-a.akamaihd.net/ugc/2117314083157632215/B7FF5C4548936111546D0F348FECE251F8F4A1E7/
03:28:26<Pedrosso>awesome
03:29:20<@JAA>The server ignores the query string, so that can be (ab)used to retain the file ID context into the WBM.
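[A sketch of that trick, under the assumption stated above that the CDN ignores the query string. The `?id=` parameter name is made up purely to carry context; the point is that the recorded URL, and thus the WBM capture, keeps the file ID attached to the download.]

```python
import requests

cdn_url = "https://steamusercontent-a.akamaihd.net/ugc/2117314083157632215/B7FF5C4548936111546D0F348FECE251F8F4A1E7/"
file_id = "3058373765"

# Append the workshop file ID as a query string; the server is said to
# ignore it, but a WARC of this request keeps the ID <-> file mapping
# findable in the Wayback Machine.
resp = requests.get(cdn_url, params={"id": file_id})
resp.raise_for_status()
with open(f"{file_id}.bin", "wb") as f:
    f.write(resp.content)
```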
03:31:43<@JAA>Can't find an example of a workshop page with a download link from years ago, so I guess I misremembered. Huh.
03:32:38<Pedrosso>here's a (hopefully extensive) list of Portal Steam Workshop IDs https://transfer.archivete.am/VK5SG/steamids.txt.zst
03:33:06atphoenix quits [Remote host closed the connection]
03:33:31<@JAA>It's certainly extensive, more interesting would be whether it's exhaustive. :-)
03:33:43<@JAA>I'm guessing there's an API for that as well?
03:33:49atphoenix (atphoenix) joins
03:34:27<Pedrosso>I meant exhaustive, thanks. I'm not aware of any API for that, so I had the code go through the normal search pages
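[A rough sketch of that search-page approach. The browse URL pattern, the `p` paging parameter, and the appid (620 for Portal 2) are assumptions; real enumeration would page until no new IDs appear and would need a rate limit.]

```python
import re
import requests

# Hypothetical enumeration via the public Workshop browse pages.
ids: set[str] = set()
for page in range(1, 4):  # real code would page until no new IDs appear
    resp = requests.get(
        "https://steamcommunity.com/workshop/browse/",
        params={"appid": 620, "p": page},
    )
    resp.raise_for_status()
    # Workshop item links look like .../sharedfiles/filedetails/?id=NNN
    ids.update(re.findall(r"filedetails/\?id=(\d+)", resp.text))
print(len(ids), "IDs collected")
```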
03:34:47<@JAA>Hmm, maybe IPublishedFileService/QueryFiles.
03:34:51<fireonlive>i seem to remember https://steamdb.info/ being a thing but looks like it's third party
03:34:59fireonlive back to lurk mode
03:36:11<Pedrosso>I think you're right. When I did that pull I didn't have a Steam API key
03:36:22<@JAA>Yeah, that requires an access key. :-/
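[For reference, a sketch of that endpoint. The parameter names (`cursor`, `numperpage`) and cursor-based paging follow the published Steam Web API docs but should be treated as assumptions; `YOUR_KEY` is a placeholder for the required access key.]

```python
import requests

API = "https://api.steampowered.com/IPublishedFileService/QueryFiles/v1/"

def iter_published_file_ids(appid: int, key: str):
    cursor = "*"  # cursor-based paging per the (assumed) API docs
    while True:
        resp = requests.get(API, params={
            "key": key,
            "appid": appid,
            "cursor": cursor,
            "numperpage": 100,
        })
        resp.raise_for_status()
        data = resp.json()["response"]
        items = data.get("publishedfiledetails", [])
        for item in items:
            yield item["publishedfileid"]
        cursor = data.get("next_cursor")
        if not cursor or not items:
            break

# Usage: for fid in iter_published_file_ids(620, "YOUR_KEY"): ...
```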
03:37:27<@JAA>3 billion-ish IDs is perfectly feasible, especially since you can request multiple IDs per request.
03:39:01<@JAA>Assuming Valve lets it happen, that is.
03:39:55<@JAA>We'd want requests like `curl --data 'itemcount=1&publishedfileids%5B0%5D=3058373765' 'https://api.steampowered.com/ISteamRemoteStorage/GetPublishedFileDetails/v1/?itemcount=1&publishedfileids%5B0%5D=3058373765'` into WARC.
03:40:18<@JAA>This still allows looking up file IDs in the WBM by also including it in the URL.
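[The same request as the curl example, sketched in Python and extended to several IDs per call as mentioned above. Mirroring the parameters into the query string (which the API presumably ignores, as in the curl command) is what keeps each ID discoverable in the WBM.]

```python
import requests

API = "https://api.steampowered.com/ISteamRemoteStorage/GetPublishedFileDetails/v1/"

def fetch_details(ids: list[str]) -> list[dict]:
    params = {"itemcount": len(ids)}
    for i, fid in enumerate(ids):
        params[f"publishedfileids[{i}]"] = fid
    # Send the parameters both as the POST body (what the API reads) and
    # as the query string (ignored, but recorded in the WARC/WBM URL).
    resp = requests.post(API, data=params, params=params)
    resp.raise_for_status()
    return resp.json()["response"]["publishedfiledetails"]

batch = fetch_details(["3058373765"])
```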
03:41:44<nicolas17>oh clever
03:42:07<@JAA>We did the same thing for one of the YouTube projects.
03:42:52<Pedrosso>Very nice
03:53:55<Pedrosso>I thought AB could only do URLs
03:55:59<@JAA>Who said anything about AB?
03:59:33<Pedrosso>Oh?
03:59:59<fireonlive>qwarc qwarc 🦆
04:00:03<Pedrosso>hehe
04:03:50DLoader quits [Ping timeout: 240 seconds]
04:04:00<@JAA>:-)
04:04:34<Pedrosso>my only regret is I lack a nice progress bar to stare at
04:06:00DLoader joins
04:17:20<fireonlive>Pedrosso: https://dl.fireon.live/irc/1035455b3b1f59a3/please-wait.gif
04:22:06<nicolas17>Pedrosso: https://twitter.com/neilsardesai/status/1399037054957326339
04:22:06<eggdrop>nitter: https://nitter.net/neilsardesai/status/1399037054957326339
04:47:48<fireonlive>https://9to5mac.com/2023/12/18/apple-halting-apple-watch-series-9-and-apple-watch-ultra-2-sales/
04:49:30<@JAA>If I'm seeing this correctly, slider.kz is a VK and Last.FM index. VK for the audio, and Last.FM for similar artist recommendations.
04:52:26<@JAA>The search endpoint is plainly called vk_auth.php, and the audio URLs are on VK's CDN. The similar artists are less obvious, but the endpoint returns Last.fm's image URLs (which aren't displayed anywhere).
04:54:44<@JAA>So probably virtually no unique data.
04:56:07<@JAA>It's already well past its deadline, by the way; it was supposed to go down at the end of November.
04:56:38<@JAA>Cf. https://nitter.net/x_slider/status/1720341321062228252
05:12:34<fireonlive>interesting service
05:24:23nicolas17 quits [Client Quit]
05:32:57etnguyen03 (etnguyen03) joins
05:45:21<Pedrosso>fireonlive yes
05:49:45DogsRNice quits [Read error: Connection reset by peer]
06:04:59Island quits [Read error: Connection reset by peer]
06:15:20icedice quits [Ping timeout: 240 seconds]
06:22:12etnguyen03 quits [Remote host closed the connection]
06:38:40BlueMaxima quits [Read error: Connection reset by peer]
07:36:15Arcorann (Arcorann) joins
08:16:13qwertyasdfuiopghjkl quits [Remote host closed the connection]
08:17:00qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins
09:21:47<Pedrosso>So, how's the qwarc-ing coming along?
10:00:01Bleo1826 quits [Client Quit]
10:01:17Bleo1826 joins
10:44:54Megame quits [Client Quit]
11:13:50CandidSparrow quits [Ping timeout: 240 seconds]
11:40:28project10 quits [Remote host closed the connection]
11:40:43project10 (project10) joins
11:42:34c3manu (c3manu) joins
11:42:39c3manu quits [Max SendQ exceeded]
11:42:55c3manu (c3manu) joins
11:42:55c3manu quits [Max SendQ exceeded]
11:43:12c3manu (c3manu) joins
11:51:51icedice (icedice) joins
11:59:55icedice2 (icedice) joins
12:00:17icedice2 quits [Remote host closed the connection]
12:00:22icedice quits [Client Quit]
12:02:02kiryu_ joins
12:02:03qwertyasdfuiopghjkl quits [Remote host closed the connection]
12:02:05kiryu quits [Ping timeout: 272 seconds]
12:21:17kiryu_ quits [Remote host closed the connection]
12:32:54kiryu joins
12:32:54kiryu quits [Changing host]
12:32:54kiryu (kiryu) joins
12:39:39tertu quits [Client Quit]
12:41:05tertu (tertu) joins
12:46:50Arcorann quits [Ping timeout: 240 seconds]
13:31:20Gereon9 quits [Ping timeout: 240 seconds]
13:32:18ehmry joins
14:05:21CandidSparrow joins
15:09:07jacksonchen666 (jacksonchen666) joins
15:25:27HP_Archivist quits [Client Quit]
15:30:24<@JAA>Pedrosso: It isn't because both I and my machines are busy with too many other things currently.
15:33:37mr_sarge quits [Ping timeout: 272 seconds]
15:37:03mr_sarge (sarge) joins
15:49:50mcint quits [Ping timeout: 240 seconds]
16:05:18datechnoman quits [Quit: Ping timeout (120 seconds)]
16:05:38datechnoman (datechnoman) joins
16:42:52DogsRNice joins
17:15:28c3manu quits [Remote host closed the connection]
17:34:20c3manu (c3manu) joins
17:41:26<c3manu>so, i will be attending an event between Christmas and New Year's where i'll have more bandwidth than i could possibly use for a span of 4 days. the uplink should be mostly clean (except for some incident response) and temporary (so torrenting will be fine). i don't have big hardware lying around, but i could take a few Raspberry-Pi-like devices with me. what's the most useful thing i could let them do, archiving-wise?
17:46:11Island joins
17:50:14riku_ (riku) joins
17:51:28<murb>obs use more bandwidth ;-)
17:51:50riku quits [Ping timeout: 240 seconds]
17:52:14<c3manu>obs?
17:52:22<@JAA>I think I know which event that is. :-)
17:52:37<c3manu>JAA don't tell me you're going as well :D
17:52:38<murb>obviously
17:52:43<c3manu>murb, ah
17:52:51riku_ is now known as riku
17:53:05<murb>i'll be there.
17:54:20<@JAA>c3manu: Sadly no. :-(
17:54:22<c3manu>murb: ah, no wonder you know the NOC’s slogan then ;)
17:54:46<c3manu>JAA: bummer :/
17:55:51<@JAA>Maybe next year. :-)
17:56:09wyatt8750 joins
17:56:45wyatt8740 quits [Ping timeout: 272 seconds]
17:57:40<c3manu>nice, i’d like to say hi in person one day (if you'd be up for that)
18:01:05<c3manu>but still, is there a project that would make sense to set up on such small hardware? or does it defeat the purpose of having many "smaller" participants, so as not to get blocked entirely?
18:06:15wyatt8750 quits [Ping timeout: 272 seconds]
18:06:47wyatt8740 joins
18:14:13qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins
18:37:20burak321 joins
18:37:50<burak321>pokechu22 well that sucks :( It will remain read-only, but I doubt it will stay that way for too long
18:38:49<pokechu22>Yeah. Depending on how fast the site is and whether it blocks people going too fast, qwarc might be usable, but I don't know too much about how that works
18:40:24<h2ibot>Pokechu22 edited Deathwatch (+140, /* 2023 */ https://wizaz.pl/forum/ read-only…): https://wiki.archiveteam.org/?diff=51383&oldid=51367
18:41:24<h2ibot>Pokechu22 edited Deathwatch (+74, /* 2023 */): https://wiki.archiveteam.org/?diff=51384&oldid=51383
18:42:30<burak321>I didn't even realize it's that big. I guess it's a lost cause then. Thanks for the help
18:45:58<pokechu22>It's a little bit simpler since each individual post doesn't need to be saved, only every page in a thread (e.g. only https://wizaz.pl/forum/showthread.php?t=1283338 https://wizaz.pl/forum/showthread.php?t=1283338&page=2 https://wizaz.pl/forum/showthread.php?t=1283338&page=3 and not https://wizaz.pl/forum/showpost.php?p=88917750&postcount=1
18:46:00<pokechu22>https://wizaz.pl/forum/showpost.php?p=88917912&postcount=2 https://wizaz.pl/forum/showpost.php?p=88919041&postcount=3 ... https://wizaz.pl/forum/showpost.php?p=89694400&postcount=71) - those post links have the same info as the thread pages
18:47:30<pokechu22>716134 threads, but there are probably lots of threads with at least 2 pages... I'd estimate between 1 and 3 million total pages that need to be saved, which is a lot but isn't impossible (it just wouldn't work well for ArchiveBot)
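[A sketch of generating the URL list pokechu22 describes: only the paginated thread views, not the individual showpost.php links. The thread URL pattern is taken from the examples above; `posts_per_page=10` is an assumption about the forum's settings, and the real page count would come from each thread's first page.]

```python
import math

def thread_page_urls(thread_id: int, post_count: int, posts_per_page: int = 10):
    # Only the paginated thread views need saving; the showpost.php links
    # carry the same info. posts_per_page is a guess about the forum config.
    pages = max(1, math.ceil(post_count / posts_per_page))
    yield f"https://wizaz.pl/forum/showthread.php?t={thread_id}"
    for page in range(2, pages + 1):
        yield f"https://wizaz.pl/forum/showthread.php?t={thread_id}&page={page}"

urls = list(thread_page_urls(1283338, 71))
```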
19:35:00<@JAA>Yeah, definitely feasible with qwarc if they have a decently generous rate limit.
19:48:13burak321 quits [Ping timeout: 265 seconds]
20:13:54burak321 joins
20:26:00bf_ joins
20:27:44bf_ quits [Remote host closed the connection]
20:29:39bf_ joins
21:14:07bf_ quits [Remote host closed the connection]
21:23:09qwertyasdfuiopghjkl quits [Client Quit]
21:23:09burak321 quits [Client Quit]
21:35:17<DogsRNice>Mittensquad's Twitter, if anyone wants to archive it: https://twitter.com/mittensquad
21:35:18<eggdrop>nitter: https://nitter.net/mittensquad
21:39:44<@JAA>AB job started.
21:51:12BlueMaxima joins
22:03:14c3manu quits [Remote host closed the connection]
22:13:02crunkster joins
22:16:37crunkster quits [Remote host closed the connection]
22:20:13inedia quits [Ping timeout: 272 seconds]
22:27:21qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins
22:27:35lunik173 quits [Remote host closed the connection]
22:27:57lunik173 joins
22:56:49icedice (icedice) joins