02:25:27<klea>lovely https://w.wiki/ WMF seems to have a url shortener: https://w.wiki/94fS example url
02:35:40<@JAA>Yes, dumps are available, too: https://dumps.wikimedia.org/other/shorturls/
02:36:19<klea>oh good
03:22:08DogsRNice_ quits [Read error: Connection reset by peer]
03:30:03tzt quits [Quit: tzt]
03:30:30tzt (tzt) joins
06:10:10ArchivalEfforts quits [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
06:10:19ArchivalEfforts joins
11:10:08archiveDrill quits [Ping timeout: 256 seconds]
14:29:19pabs quits [Ping timeout: 272 seconds]
14:32:18pabs (pabs) joins
15:02:53that_lurker quits [Ping timeout: 272 seconds]
15:07:34that_lurker (that_lurker) joins
15:13:45<justauser>Huh. Do we have https://goo.gl/fb/mthfjx -style covered?
15:20:41<nstrom|m>yeah we did those w/ goo-gl-grab
15:21:38<klea>https://archive.org/details/UrlteamWebCrawls?tab=collection&query=goo-gl
15:21:56<klea>curl -Ls https://archive.org/download/urlteam_2025-08-25-00-17-01/goo-gl.2025-08-25-00-17-01.zip/goo-gl%2F______.txt.xz | unxz -dc
15:22:04<klea>ofc for all items, not just that one
15:25:53<klea>tho maybe this query is better: https://archive.org/details/UrlteamWebCrawls?tab=collection&and%5B%5D=subject%3A%22goo-gl%22
15:27:28<justauser>tl;dr looks painful.
15:27:57<justauser>I wouldn't be surprised if someone/something has the data in one piece.
15:28:15<justauser>Maybe datechnoman already includes those in stash?
15:28:48<justauser>Or that many of them were fed down #//?
15:29:27<klea>i'm not sure.
15:29:44<klea>i can try to download those if you want.
15:29:48<justauser>That's exactly why I asked.
15:30:33<klea>iirc i wrote a python script to get the torrent links, that's easy to repurpose
15:31:04<klea>oh no that used the scrape endpoint thingy :p
15:31:21<klea>i'll just use jq
15:33:55<klea>aaa zip has date
15:39:14<klea>huh, the number of underscores changes :(
15:39:48<klea>and more than one file in some zips
16:03:13<klea>justauser: if you want to download them, here: <https://transfer.archivete.am/WbXO3/urls.txt>, i'm too lazy to extract them and all that sorry
16:03:14<eggdrop>inline (for browser viewing): https://transfer.archivete.am/inline/WbXO3/urls.txt
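[editor's note] The extraction step klea skipped could be sketched roughly as below. This is a minimal sketch, not anything posted in the channel. It assumes each downloaded URLTeam zip keeps its goo.gl data in `.txt.xz` members under a `goo-gl/` directory, as the curl example above suggests. The function name `iter_shortener_lines` and the `prefix` parameter are hypothetical. Matching on prefix and suffix instead of a fixed filename sidesteps both the changing number of underscores and zips holding more than one file.

```python
import io
import lzma
import zipfile
from typing import BinaryIO, Iterator, Union


def iter_shortener_lines(
    zip_source: Union[str, BinaryIO],
    prefix: str = "goo-gl/",  # assumed directory name inside the zip
) -> Iterator[str]:
    """Yield text lines from every .txt.xz member under `prefix` in one zip."""
    with zipfile.ZipFile(zip_source) as archive:
        # Sort for a stable order; member names differ between zips.
        for name in sorted(archive.namelist()):
            if not (name.startswith(prefix) and name.endswith(".txt.xz")):
                continue
            # Stream-decompress the member without unpacking it to disk.
            with archive.open(name) as compressed:
                with lzma.open(
                    compressed, mode="rt", encoding="utf-8", errors="replace"
                ) as text:
                    for line in text:
                        yield line.rstrip("\n")
```

Calling it once per downloaded zip from urls.txt and writing the yielded lines to a single file would give the data "in one piece"; the line format itself is left unparsed here.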
17:04:18TastyWiener95 quits [Ping timeout: 256 seconds]
17:08:17TastyWiener95 (TastyWiener95) joins
18:33:40DogsRNice joins
21:41:52jesterjunk_ joins
21:45:03jesterjunk quits [Ping timeout: 272 seconds]
22:35:35<@JAA>That's the URLTeam data, which is easy to process. goo-gl-grab and #urlteamwasright are separate.
23:16:55atphoenix__ (atphoenix) joins
23:20:03atphoenix_ quits [Ping timeout: 272 seconds]