01:24:55 | | HP_Archivist (HP_Archivist) joins |
01:25:20 | | itachi1706 quits [Ping timeout: 240 seconds] |
01:25:50 | | neggles quits [Ping timeout: 240 seconds] |
01:25:51 | | itachi1706 (itachi1706) joins |
01:26:03 | | neggles (neggles) joins |
01:41:38 | <h2ibot> | Pedrosso edited Steam (+1237, Added the steam workshop as its own project as…): https://wiki.archiveteam.org/?diff=51378&oldid=51377 |
01:41:39 | <h2ibot> | Pedrosso uploaded File:Steam workshop 2023-12-18.png (The main page of…): https://wiki.archiveteam.org/?title=File%3ASteam%20workshop%202023-12-18.png |
01:41:40 | <h2ibot> | Pedrosso uploaded File:Steam Workshop v3 2023.png (Steam Workshop Banner Image): https://wiki.archiveteam.org/?title=File%3ASteam%20Workshop%20v3%202023.png |
01:47:40 | <h2ibot> | JustAnotherArchivist edited Steam (+131, Move new section to the end of the page;…): https://wiki.archiveteam.org/?diff=51381&oldid=51378 |
01:49:45 | <@JAA> | Pedrosso: Streams crossed there, your new edit has resulted in a conflict and I had to reject it. |
01:50:04 | <Pedrosso> | ok |
01:51:41 | <h2ibot> | JustAnotherArchivist edited Steam (+3, Fix Workshop image): https://wiki.archiveteam.org/?diff=51382&oldid=51381 |
01:51:50 | <Pedrosso> | whoops |
01:52:00 | <@JAA> | Welp :-) |
01:52:30 | <@JAA> | > Your edit was ignored because no change was made to the text. |
01:52:52 | <Pedrosso> | Haha |
01:53:00 | <Pedrosso> | Wonderful |
01:55:28 | <@JAA> | I'd like to get rid of the 'How can I help?' sections scattered all over the wiki at some point. They're only useful while a DPoS project is active. |
01:55:42 | <@JAA> | Which is to say, they're all noise at this point. |
01:56:30 | <Pedrosso> | I mentioned that too. I was considering making them collapsible but that didn't work due to the section headers |
01:59:37 | | itachi1706 quits [Client Quit] |
02:01:06 | <@JAA> | Maybe we should include a message with similar contents on all {{in progress}} DPoS projects. That should even be possible automatically from the infobox template. |
02:01:37 | <Pedrosso> | That sounds efficient |
02:02:13 | | itachi1706 (itachi1706) joins |
02:02:26 | <Pedrosso> | On the note of the steam workshop, I have coded a program ready to download all portal 2 workshop creations as long as I know a good way to upload it (not messing up the metadata, using a "nice" format, a "steam workshop" collection if needed) |
02:03:43 | <@JAA> | Well, since the Workshop is a web interface, it should be WARC and go into the WBM. |
02:24:29 | | qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins |
02:34:38 | <Pedrosso> | How exactly? As I mentioned in the wiki the download isn't directly off of the pages. On a steam workshop page you can only "subscribe" to an item which does stuff with the app. The script I made uses the API to grab it directly. (if you know all that already): How could that be put into a WARC and into the WBM? |
02:47:49 | <pabs> | is the API based on HTTP GET or POST? |
02:48:13 | <pabs> | if GET, then AB can be fed a list of API URLs to download |
02:53:29 | <nicolas17> | what if GET but auth cookies? :P |
02:53:35 | <nicolas17> | we need an alignment chart |
02:53:50 | <nicolas17> | where plain GET with nice URLs would be lawful good |
03:15:32 | <@JAA> | Huh, is that new? I seem to remember downloading files directly from there. This would've been a few years ago though. |
03:19:44 | <Pedrosso> | GET, no cookies |
03:19:57 | <Pedrosso> | yes, you're right. A list of URLs would work |
03:21:00 | <Pedrosso> | actually, disregard that I got confused. Pull |
03:21:05 | <Pedrosso> | Post* |
03:21:22 | <Pedrosso> | GET was the one for the comments |
03:26:01 | <Pedrosso> | ok so, the API for getting items is POST and it's https://api.steampowered.com/ISteamRemoteStorage/GetPublishedFileDetails/v1/ with publishedfileids[0]=ID_HERE and itemcount=1. It can be used for more items |
03:26:55 | <@JAA> | Ok, and then the actual download is a simple GET. |
03:27:15 | <@JAA> | That can be archived into WARC in a way that would kind of work in the WBM, I think. |
03:27:58 | <Pedrosso> | > Ok, and then the actual download is a simple GET. |
03:27:58 | <Pedrosso> | It is? Can you give an example if I say the ID is 3058373765 |
03:28:17 | <@JAA> | https://steamusercontent-a.akamaihd.net/ugc/2117314083157632215/B7FF5C4548936111546D0F348FECE251F8F4A1E7/ |
03:28:26 | <Pedrosso> | awesome |
03:29:20 | <@JAA> | The server ignores the query string, so that can be (ab)used to retain the file ID context into the WBM. |
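Editor's sketch of the query-string idea JAA describes: since the server reportedly ignores the query string, the file ID can be appended to the download URL so the context survives into the WBM. The function name `tag_download_url` and the parameter name `publishedfileid` are assumptions made for illustration; the log does not say which parameter name would be used.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def tag_download_url(download_url, file_id, param="publishedfileid"):
    """Attach the Workshop file ID to a download URL as a query string.

    `param` is a hypothetical name; any name works if the server
    really ignores the query string, as stated in the log.
    """
    parts = urlsplit(download_url)
    query = urlencode({param: file_id})
    # Rebuild the URL with the new query string and no fragment.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


# The URL JAA posted in reply to the question about ID 3058373765.
example = (
    "https://steamusercontent-a.akamaihd.net/ugc/2117314083157632215/"
    "B7FF5C4548936111546D0F348FECE251F8F4A1E7/"
)
print(tag_download_url(example, 3058373765))
```

The tagged URL would be the one fetched and written to WARC, so a later lookup by file ID can find the capture.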
03:31:43 | <@JAA> | Can't find an example of a workshop page with a download link from years ago, so I guess I misremembered. Huh. |
03:32:38 | <Pedrosso> | here's a (hopefully extensive) list of portal steam workshop ids https://transfer.archivete.am/VK5SG/steamids.txt.zst |
03:33:06 | | atphoenix quits [Remote host closed the connection] |
03:33:31 | <@JAA> | It's certainly extensive, more interesting would be whether it's exhaustive. :-) |
03:33:43 | <@JAA> | I'm guessing there's an API for that as well? |
03:33:49 | | atphoenix (atphoenix) joins |
03:34:27 | <Pedrosso> | I meant exhaustive, thanks. I'm not aware of any API for that so I had the code go through the normal search pages |
03:34:47 | <@JAA> | Hmm, maybe IPublishedFileService/QueryFiles. |
03:34:51 | <fireonlive> | i seem to remember https://steamdb.info/ being a thing but looks like it's third party |
03:34:59 | | fireonlive back to lurk mode |
03:36:11 | <Pedrosso> | I think you're right. When I did that pull I didn't have a steam API key |
03:36:22 | <@JAA> | Yeah, that requires an access key. :-/ |
03:37:27 | <@JAA> | 3 billion-ish IDs is perfectly feasible, especially since you can request multiple IDs per request. |
03:39:01 | <@JAA> | Assuming Valve lets it happen, that is. |
03:39:55 | <@JAA> | We'd want requests like `curl --data 'itemcount=1&publishedfileids%5B0%5D=3058373765' 'https://api.steampowered.com/ISteamRemoteStorage/GetPublishedFileDetails/v1/?itemcount=1&publishedfileids%5B0%5D=3058373765'` into WARC. |
03:40:18 | <@JAA> | This still allows looking up file IDs in the WBM by also including it in the URL. |
03:41:44 | <nicolas17> | oh clever |
03:42:07 | <@JAA> | We did the same thing for one of the YouTube projects. |
03:42:52 | <Pedrosso> | Very nice |
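Editor's sketch of the request shape from JAA's curl example: the form-encoded POST body is mirrored into the URL's query string so the file IDs remain searchable in the WBM. The endpoint and field names come from the log; the helper name `build_request` is an assumption, and no request is actually sent here.

```python
from urllib.parse import urlencode

API_URL = (
    "https://api.steampowered.com/ISteamRemoteStorage/"
    "GetPublishedFileDetails/v1/"
)


def build_request(file_ids):
    """Return (url, body) for one POST covering all given file IDs.

    The body is duplicated into the query string, as in the curl
    command quoted in the log.
    """
    fields = [("itemcount", str(len(file_ids)))]
    fields += [
        (f"publishedfileids[{i}]", str(file_id))
        for i, file_id in enumerate(file_ids)
    ]
    body = urlencode(fields)  # percent-encodes the square brackets
    return f"{API_URL}?{body}", body


url, body = build_request([3058373765])
print(body)
print(url)
```

Passing several IDs to `build_request` yields one batched request, which is what makes a large ID range more tractable; how many IDs the API accepts per request is not stated in the log.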
03:53:55 | <Pedrosso> | I thought AB could only do URLs |
03:55:59 | <@JAA> | Who said anything about AB? |
03:59:33 | <Pedrosso> | Oh? |
03:59:59 | <fireonlive> | qwarc qwarc 🦆 |
04:00:03 | <Pedrosso> | hehe |
04:03:50 | | DLoader quits [Ping timeout: 240 seconds] |
04:04:00 | <@JAA> | :-) |
04:04:34 | <Pedrosso> | my only regret is I lack a nice progress bar to stare at |
04:06:00 | | DLoader joins |
04:17:20 | <fireonlive> | Pedrosso: https://dl.fireon.live/irc/1035455b3b1f59a3/please-wait.gif |
04:22:06 | <nicolas17> | Pedrosso: https://twitter.com/neilsardesai/status/1399037054957326339 |
04:22:06 | <eggdrop> | nitter: https://nitter.net/neilsardesai/status/1399037054957326339 |
04:47:48 | <fireonlive> | https://9to5mac.com/2023/12/18/apple-halting-apple-watch-series-9-and-apple-watch-ultra-2-sales/ |
04:49:30 | <@JAA> | If I'm seeing this correctly, slider.kz is a VK and Last.FM index. VK for the audio, and Last.FM for similar artist recommendations. |
04:52:26 | <@JAA> | The search endpoint is plainly called vk_auth.php, and the audio URLs are on VK's CDN. The similar artists are less obvious, but the endpoint returns Last.fm's image URLs (which aren't displayed anywhere). |
04:54:44 | <@JAA> | So probably virtually no unique data. |
04:56:07 | <@JAA> | It's already well past its deadline, by the way; it was supposed to go down at the end of November. |
04:56:38 | <@JAA> | Cf. https://nitter.net/x_slider/status/1720341321062228252 |
05:12:34 | <fireonlive> | interesting service |
05:24:23 | | nicolas17 quits [Client Quit] |
05:32:57 | | etnguyen03 (etnguyen03) joins |
05:45:21 | <Pedrosso> | fireonlive yes |
05:49:45 | | DogsRNice quits [Read error: Connection reset by peer] |
06:04:59 | | Island quits [Read error: Connection reset by peer] |
06:15:20 | | icedice quits [Ping timeout: 240 seconds] |
06:22:12 | | etnguyen03 quits [Remote host closed the connection] |
06:38:40 | | BlueMaxima quits [Read error: Connection reset by peer] |
07:36:15 | | Arcorann (Arcorann) joins |
08:16:13 | | qwertyasdfuiopghjkl quits [Remote host closed the connection] |
08:17:00 | | qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins |
09:21:47 | <Pedrosso> | So, how's the qwarc-ing coming along? |
10:00:01 | | Bleo1826 quits [Client Quit] |
10:01:17 | | Bleo1826 joins |
10:44:54 | | Megame quits [Client Quit] |
11:13:50 | | CandidSparrow quits [Ping timeout: 240 seconds] |
11:40:28 | | project10 quits [Remote host closed the connection] |
11:40:43 | | project10 (project10) joins |
11:42:34 | | c3manu (c3manu) joins |
11:42:39 | | c3manu quits [Max SendQ exceeded] |
11:42:55 | | c3manu (c3manu) joins |
11:42:55 | | c3manu quits [Max SendQ exceeded] |
11:43:12 | | c3manu (c3manu) joins |
11:51:51 | | icedice (icedice) joins |
11:59:55 | | icedice2 (icedice) joins |
12:00:17 | | icedice2 quits [Remote host closed the connection] |
12:00:22 | | icedice quits [Client Quit] |
12:02:02 | | kiryu_ joins |
12:02:03 | | qwertyasdfuiopghjkl quits [Remote host closed the connection] |
12:02:05 | | kiryu quits [Ping timeout: 272 seconds] |
12:21:17 | | kiryu_ quits [Remote host closed the connection] |
12:32:54 | | kiryu joins |
12:32:54 | | kiryu is now authenticated as kiryu |
12:32:54 | | kiryu quits [Changing host] |
12:32:54 | | kiryu (kiryu) joins |
12:39:39 | | tertu quits [Client Quit] |
12:41:05 | | tertu (tertu) joins |
12:46:50 | | Arcorann quits [Ping timeout: 240 seconds] |
13:31:20 | | Gereon9 quits [Ping timeout: 240 seconds] |
13:32:18 | | ehmry joins |
14:05:21 | | CandidSparrow joins |
15:09:07 | | jacksonchen666 (jacksonchen666) joins |
15:25:27 | | HP_Archivist quits [Client Quit] |
15:30:24 | <@JAA> | Pedrosso: It isn't because both I and my machines are busy with too many other things currently. |
15:33:37 | | mr_sarge quits [Ping timeout: 272 seconds] |
15:37:03 | | mr_sarge (sarge) joins |
15:49:50 | | mcint quits [Ping timeout: 240 seconds] |
16:05:18 | | datechnoman quits [Quit: Ping timeout (120 seconds)] |
16:05:38 | | datechnoman (datechnoman) joins |
16:42:52 | | DogsRNice joins |
17:15:28 | | c3manu quits [Remote host closed the connection] |
17:34:20 | | c3manu (c3manu) joins |
17:41:26 | <c3manu> | so, i will be attending an event between christmas and new years where i’ll have more bandwidth than i could possibly use for a span of 4 days. the uplink should be mostly clean (except for some incident response) and temporary (so torrenting will be fine). i don’t have big hardware lying around, but i could take a few raspberry-pi-like devices with me. what's the most useful thing i could let them do archiving-wise? |
17:46:11 | | Island joins |
17:50:14 | | riku_ (riku) joins |
17:51:28 | <murb> | obs use more bandwidth ;-) |
17:51:50 | | riku quits [Ping timeout: 240 seconds] |
17:52:14 | <c3manu> | obs? |
17:52:22 | <@JAA> | I think I know which event that is. :-) |
17:52:37 | <c3manu> | JAA don't tell me you're going as well :D |
17:52:38 | <murb> | obviously |
17:52:43 | <c3manu> | murb, ah |
17:52:51 | | riku_ is now known as riku |
17:53:05 | <murb> | i'll be there. |
17:54:20 | <@JAA> | c3manu: Sadly no. :-( |
17:54:22 | <c3manu> | murb: ah, no wonder you know the NOC’s slogan then ;) |
17:54:46 | <c3manu> | JAA: bummer :/ |
17:55:51 | <@JAA> | Maybe next year. :-) |
17:56:09 | | wyatt8750 joins |
17:56:45 | | wyatt8740 quits [Ping timeout: 272 seconds] |
17:57:40 | <c3manu> | nice, i’d like to say hi in person one day (if you'd be up for that) |
18:01:05 | <c3manu> | but still, is there a project that would make sense setting up on such small hardware? or does it defeat the purpose of many "smaller" participants to not get blocked entirely? |
18:06:15 | | wyatt8750 quits [Ping timeout: 272 seconds] |
18:06:47 | | wyatt8740 joins |
18:14:13 | | qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins |
18:37:20 | | burak321 joins |
18:37:50 | <burak321> | pokechu22 well that sucks :( It will remain read only, but I doubt it will stay that way for too long |
18:38:49 | <pokechu22> | Yeah. Depending on how fast the site is/if it blocks people going too fast, qwarc might be usable, but I don't know too much about how that works |
18:40:24 | <h2ibot> | Pokechu22 edited Deathwatch (+140, /* 2023 */ https://wizaz.pl/forum/ read-only…): https://wiki.archiveteam.org/?diff=51383&oldid=51367 |
18:41:24 | <h2ibot> | Pokechu22 edited Deathwatch (+74, /* 2023 */): https://wiki.archiveteam.org/?diff=51384&oldid=51383 |
18:42:30 | <burak321> | I didn't even realize it's that big. I guess it's a lost cause then. Thanks for the help |
18:45:58 | <pokechu22> | It's a little bit simpler since each individual post doesn't need to be saved, only every page in a thread (e.g. only https://wizaz.pl/forum/showthread.php?t=1283338 https://wizaz.pl/forum/showthread.php?t=1283338&page=2 https://wizaz.pl/forum/showthread.php?t=1283338&page=3 and not https://wizaz.pl/forum/showpost.php?p=88917750&postcount=1 |
18:46:00 | <pokechu22> | https://wizaz.pl/forum/showpost.php?p=88917912&postcount=2 https://wizaz.pl/forum/showpost.php?p=88919041&postcount=3 ... https://wizaz.pl/forum/showpost.php?p=89694400&postcount=71) - those post links have the same info as the thread pages |
18:47:30 | <pokechu22> | 716134 threads, but there are probably lots of threads with at least 2 pages... I'd estimate between 1 and 3 million total pages that need to be saved, which is a lot but isn't impossible (it just wouldn't work well for archivebot) |
19:35:00 | <@JAA> | Yeah, definitely feasible with qwarc if they have a decently generous rate limit. |
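Editor's sketch of the URL list pokechu22 describes: only the thread pages are needed, not the per-post `showpost.php` links. The URL pattern is taken from the examples in the log; the helper name `thread_page_urls` is an assumption, and the page count per thread is assumed to be known in advance (in practice it would have to be read from the first page of each thread).

```python
def thread_page_urls(thread_id, page_count):
    """List the showthread.php URLs for every page of one thread.

    Page 1 has no `page` parameter, matching the example URLs
    quoted in the log.
    """
    base = f"https://wizaz.pl/forum/showthread.php?t={thread_id}"
    return [
        base if page == 1 else f"{base}&page={page}"
        for page in range(1, page_count + 1)
    ]


for url in thread_page_urls(1283338, 3):
    print(url)
```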
19:48:13 | | burak321 quits [Ping timeout: 265 seconds] |
20:13:54 | | burak321 joins |
20:26:00 | | bf_ joins |
20:27:44 | | bf_ quits [Remote host closed the connection] |
20:29:39 | | bf_ joins |
21:14:07 | | bf_ quits [Remote host closed the connection] |
21:23:09 | | qwertyasdfuiopghjkl quits [Client Quit] |
21:23:09 | | burak321 quits [Client Quit] |
21:35:17 | <DogsRNice> | mittensquad's twitter if anyone wants to archive it https://twitter.com/mittensquad |
21:35:18 | <eggdrop> | nitter: https://nitter.net/mittensquad |
21:39:44 | <@JAA> | AB job started. |
21:51:12 | | BlueMaxima joins |
22:03:14 | | c3manu quits [Remote host closed the connection] |
22:13:02 | | crunkster joins |
22:16:37 | | crunkster quits [Remote host closed the connection] |
22:20:13 | | inedia quits [Ping timeout: 272 seconds] |
22:27:21 | | qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins |
22:27:35 | | lunik173 quits [Remote host closed the connection] |
22:27:57 | | lunik173 joins |
22:56:49 | | icedice (icedice) joins |