00:06:22nicolas17 joins
00:08:48nickofni1 quits [Remote host closed the connection]
00:09:04nickofnicks (nickofnicks) joins
00:09:48nickofnicks quits [Remote host closed the connection]
00:10:06nickofnicks (nickofnicks) joins
01:20:38Megame (Megame) joins
02:51:16rfrf joins
02:51:59rfrf quits [Remote host closed the connection]
03:06:36celestial quits [Client Quit]
03:21:30icedice quits [Client Quit]
03:21:54celestial joins
03:24:12<h2ibot>JustAnotherArchivist edited Deathwatch (-4, /* 2023 */ Definitive deadline for OneHallyu): https://wiki.archiveteam.org/?diff=51385&oldid=51384
03:56:45DogsRNice quits [Read error: Connection reset by peer]
03:56:55qwertyasdfuiopghjkl quits [Remote host closed the connection]
04:00:02qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins
04:13:13<fireonlive>from google alerts: https://mastodon.social/@stroughtonsmith/111602502197304440 :p
04:24:20DopefishJustin quits [Ping timeout: 240 seconds]
04:57:25<@JAA>I'm trying to qwarc OneHallyu. It's very slow.
05:12:50wickedplayer494 quits [Ping timeout: 240 seconds]
05:13:17wickedplayer494 joins
05:22:25Island quits [Read error: Connection reset by peer]
05:23:14<@JAA>AQNB reopened a few hours ago and will shut down tomorrow (21st). Running through AB now.
05:24:09<@arkiver>wooh :)
05:25:11<@JAA>The Analysis & Policy Observatory (APO) managed to secure a partnership and isn't shutting down after all. Would be nice to archive anyway, but not as urgent. (Heavy rate limits and bans thwarted the AB attempts.)
05:25:39<@JAA>> After taking a well-deserved break, APO will be re-established in the new year. The website will remain open for you to search and browse our policy and research repository and collections.
05:25:44<@JAA>Whatever 're-established' means exactly.
05:25:53<@JAA>https://apo.org.au/FAQ
05:26:48<nicolas17>hm I think something about this whole "teraleak" stuff should be mentioned in our TestFlight wiki page
05:27:18<nicolas17>in particular since I and others did some indexing of what is in the archived data
05:28:09<@JAA>Yeah, that would probably be a good idea. It was just public data, as I understand it (this was before my time here).
05:28:48<@arkiver>it was just a regular project
05:28:53<@arkiver>like all other projects
05:29:01<fireonlive>i did see arkiver2 as the committer :3
05:29:37<@JAA>It wasn't even a listable S3 bucket probably, right? Just S3 URLs referenced on the website?
05:29:47<nicolas17>I don't know how discovery worked
05:29:55<@arkiver>the tons of attention and "teraleak" branding is mostly people who just found out the "web archiving is pretty cool, because it saves stuff, like games, that are later maybe not available"
05:30:04<@arkiver>found out that*
05:30:13<@arkiver>remember for us it's pretty normal and obvious
05:30:32<@arkiver>for many non-tech people out there it's a first time they really see something like this
05:30:43<@JAA>Yeah, but it might be worth clarifying that we're archiving public data, not hacking into private systems or whatever.
05:30:45<nicolas17>I think good old scraping of previous projects found URLs like http://testflightapp.com/install/000059cc47865bcec060d67ddb11d30b-MTM2NjI4Ng/
05:30:53<@arkiver>JAA: yeah!
05:31:10<fireonlive>someone called it 'leak' initially as well, which caught on and wasn't helpful...
05:31:24<@arkiver>the code is completely public though, people can see there's no hack-y stuff in there.
05:31:30<fireonlive>yep :)
05:31:34<nicolas17>and from there you ended up at the ipa download link like https://testflightapp.com/dashboard/ipa/251aaeaaf0001ec906c157f2f31ddcbd-MTMxODQ0MzI/6280ca3ee10631fd6817100ffd1ee849-MTMzMzc3/
05:31:39<@arkiver>fireonlive: yeah honestly sounds like Discord shouting
05:31:40<nicolas17>which redirected to cloudfront or s3
05:31:47<fireonlive>arkiver: indeed
05:32:22<@arkiver>i'm very happy though nicolas17 is making good use of this, and we're getting attention :)
05:32:24<nicolas17>arkiver: Jason Scott went into the Discord to clarify things
05:32:31<@arkiver>although in a different way would have been better
05:32:33<@arkiver>nicolas17: yeah
05:32:34<@JAA>arkiver: Assuming people (a) take the time to look for the code, (b) read the code, and (c) understand the code. :-)
05:33:06<nicolas17>I know someone who plans to do some analysis on the executables
05:33:08<@arkiver>JAA: yeah but if any "official bodies" get involved or look further into this, they'll find the code and how it's all working
05:33:13<@JAA>Right
05:33:39<@arkiver>just an Archive Team project like all others :)
05:33:42<@arkiver>9 years ago
05:33:54<@arkiver>also this month i'm 10 years with Archive Team!
05:34:03<@JAA>\o/
05:34:04<fireonlive>:D
05:34:08<nicolas17>she works at a company making a decompiler and I think other reverse engineering tools, so 70000 real-world iOS binaries is a goldmine of test cases
05:35:50<@arkiver>nicolas17: yeah :)
05:36:44<nicolas17>arkiver: also for people who want to do stuff on the whole dataset, they're like "wait what is this warc thing"
05:37:09<nicolas17>"you mean I don't have to do 70k requests to the slow web.archive.org hostname to download each individual file?"
05:37:34<@JAA>'I can download over a terabyte at 5 kB/s instead? Amazing!' :-P
05:37:43<@JAA>But yeah
05:38:00<nicolas17>some people who have the storage quickly found the torrents
05:38:05<@JAA>Right
05:38:12<@JAA>Are these torrents complete?
05:38:32<@arkiver>there are torrents?
05:38:36<@arkiver>of testflight
05:38:38<nicolas17>I don't have the storage, so I'm piping wget into an "extract what I need and throw it away" script :P
05:38:56<@JAA>arkiver: We only set noarchivetorrent since a couple years ago, so I'd expect there to be torrents.
05:38:58<nicolas17>archive.org's autogenerated torrents work fine, the warcs are max 50GB
05:39:02<fireonlive>everyone should get a free 1PiB minimum :(
05:39:03<@arkiver>JAA: ah
05:39:12<fireonlive>as a human right!
05:39:21<nicolas17>although they have the usual problems of IA torrents
05:39:37<@arkiver>nicolas17: what are those problems?
05:39:48<nicolas17>textfiles edited the description on the items to clarify where they came from, and added a preview image
05:40:07<nicolas17>which re-generated the torrents
05:40:30<nicolas17>so now the old ones don't work anymore, or at least don't exchange data with people who got the new ones :P
05:40:45<@arkiver>i don't know if we needed that image
05:40:47<@arkiver>on all items
05:41:16<nicolas17>yeah that was questionable, but I think the xml file with the metadata (including description) is in the torrent too
05:41:17<@JAA>I don't think we needed it on any item.
05:41:24<@arkiver>JAA: yeah
05:41:28<@arkiver>just the logo maybe on the collection
05:41:33<nicolas17>so image or not, editing the description would invalidate the torrent anyway
05:41:49<fireonlive>:(
05:42:12<@JAA>Collections can have images, I think? That could've been the appropriate place.
05:42:35nicolas17 beds
05:42:39<@JAA>And a link to the wiki page there would've been good, too.
05:42:50<fireonlive>collection description maybe? hm.
05:43:02<fireonlive>cu nicky
05:46:05wickedplayer494 quits [Ping timeout: 272 seconds]
05:47:13wickedplayer494 joins
06:08:43BlueMaxima quits [Read error: Connection reset by peer]
06:18:25DopefishJustin joins
06:51:14atphoenix quits [Remote host closed the connection]
06:51:54atphoenix (atphoenix) joins
06:54:08atphoenix quits [Remote host closed the connection]
06:54:48atphoenix (atphoenix) joins
06:57:32atphoenix quits [Remote host closed the connection]
06:58:15atphoenix (atphoenix) joins
07:00:49jasons quits [Ping timeout: 272 seconds]
07:02:02jasons (jasons) joins
07:40:01jasons quits [Client Quit]
07:40:31jasons (jasons) joins
07:44:16jasons quits [Client Quit]
07:44:46jasons (jasons) joins
07:47:03Arcorann (Arcorann) joins
07:48:16<fireonlive>https://www.reddit.com/r/DataHoarder/comments/18mjqjd/the_master_tapes_for_all_of_reboot_have_been/
07:48:33<fireonlive>master tapes for “reboot” have been found
08:12:13c3manu (c3manu) joins
08:14:50<flashfire42>Archiveteam wiki down?
08:15:12<fireonlive>indee
08:15:13<fireonlive>d
08:20:00<angenieux>Is it because of the TestFlight "leak" driving traffic to the website?
08:25:13<fireonlive>unsure; another AT wiki went down too
08:32:43<angenieux>is that the https://internetarchive.archiveteam.org/ ?
08:35:28<fireonlive>indee
08:35:33<fireonlive>..d
08:35:51<angenieux>i see
09:17:42qwertyasdfuiopghjkl quits [Remote host closed the connection]
09:17:59qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins
09:37:27c3manu quits [Remote host closed the connection]
09:45:33Webuser775 joins
09:45:45c3manu (c3manu) joins
09:45:46c3manu quits [Max SendQ exceeded]
09:46:14c3manu (c3manu) joins
09:46:14c3manu quits [Remote host closed the connection]
09:46:30c3manu (c3manu) joins
09:46:30c3manu quits [Max SendQ exceeded]
09:46:47c3manu (c3manu) joins
10:00:01Bleo1826 quits [Client Quit]
10:01:17Bleo1826 joins
10:50:18qwertyasdfuiopghjkl quits [Remote host closed the connection]
11:12:02Webuser775 quits [Client Quit]
11:16:57angenieux quits [Quit: The Lounge - https://thelounge.chat]
11:17:27angenieux (angenieux) joins
11:19:31qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins
11:24:51<immibis>Times of Israel should probably get archived, yeah? it has strict cloudflare in front of it so i bet generic efforts didn't get it
11:31:07<datechnoman>Quick question for the mind hive. Playing around with the ia command line tool for the first time. How do I set the period when the files were uploaded/published to download? I am trying to pull all the cdx.gz files for a given period and then process them with the cdxsummary tool. The command I am using is "ia search
11:31:08<datechnoman>'collection:archiveteam_telegram' --itemlist | xargs -r -n 5 ia download --glob '*.cdx.gz'". I assumed there would be a switch such as --date but that does not seem to be the case. If there is a better way to do this please do share. Thanks in advance!
11:40:24<datechnoman>Maybe I could use identifiers or something?
11:41:42<datechnoman>Would also like a way to export all of the cdx.gz download links for that period as I can create a script to run them through the cdxsummary tool
11:42:59<datechnoman>An example being: https://archive.org/download/archiveteam_telegram_20231220082241_fa0afc34/archiveteam_telegram_20231220082241_fa0afc34.cdx.gz
13:00:50Arcorann quits [Ping timeout: 240 seconds]
13:04:51ScenarioPlanet (ScenarioPlanet) joins
13:15:55atphoenix quits [Remote host closed the connection]
13:16:38atphoenix (atphoenix) joins
13:50:11Webuser775 joins
14:05:47katocala quits [Ping timeout: 272 seconds]
14:06:05katocala joins
14:15:17nicolas17 quits [Ping timeout: 272 seconds]
14:32:28Mateon1 quits [Quit: Mateon1]
14:33:17Mateon1 joins
15:12:21hackbug quits [Remote host closed the connection]
15:15:41hackbug (hackbug) joins
15:26:21Megame quits [Client Quit]
15:52:20kiryu quits [Ping timeout: 240 seconds]
15:54:24kiryu joins
15:54:24kiryu quits [Changing host]
15:54:24kiryu (kiryu) joins
15:54:46nicolas17 joins
15:57:15<magmaus3>out of curiosity, does #down-the-tube require any special permissions to use the bot?
15:59:09DogsRNice joins
15:59:50<Pedrosso>checking #down-the-tube, james without mod or voice could do so
16:04:39<nicolas17>magmaus3: technically, no permissions needed
16:05:26<nicolas17>but if you're unsure if some video/channel fits in the archival scope as documented in the wiki, ask an op to approve it before submitting
16:05:53<@JAA>datechnoman: `ia search 'collection:archiveteam_telegram addeddate:[2023-12-01 TO 2023-12-20]' ...` for items created on those days. There's also `publicdate` (exact details of how that's set are unclear to me), and `oai_updatedate` allows to find items that had their most recent changes in some time window. You can use `null` instead of a date to make it an open range search.
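The date-range search described above, together with the request for a list of cdx.gz download links, can be sketched in Python. This is a minimal sketch, not part of the `ia` tool: the helper names `date_range_query` and `cdx_url` are hypothetical, and the URL helper assumes every item's CDX file is named after the item identifier, as in the example link posted earlier.

```python
from datetime import date
from typing import Optional


def date_range_query(collection: str, field: str,
                     start: Optional[date], end: Optional[date]) -> str:
    """Build a search query limited to a date range on `field`.

    `field` would be something like addeddate, publicdate or oai_updatedate.
    Passing None for either bound yields `null`, i.e. an open-ended range.
    """
    lo = start.isoformat() if start else "null"
    hi = end.isoformat() if end else "null"
    return f"collection:{collection} {field}:[{lo} TO {hi}]"


def cdx_url(identifier: str) -> str:
    """Guess the CDX download link for an item.

    Assumption: the file is named <identifier>.cdx.gz, as in the example link.
    """
    return f"https://archive.org/download/{identifier}/{identifier}.cdx.gz"


if __name__ == "__main__":
    print(date_range_query("archiveteam_telegram", "addeddate",
                           date(2023, 12, 1), date(2023, 12, 20)))
    print(cdx_url("archiveteam_telegram_20231220082241_fa0afc34"))
```

The query string would be passed to `ia search '<query>' --itemlist`, and each identifier it prints could then be fed to `cdx_url` to produce a list of links for a downstream script.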
16:07:40SketchCow joins
16:08:08SketchCow quits [Client Quit]
16:08:29<@JAA>A wild sketchy cow appeared.
16:08:58<nicolas17>but you didn't catch it fast enough
16:09:38SketchCow joins
16:09:46<nulldata>Quick, catch him now!
16:09:50<@JAA>:-)
16:39:01Dalek quits []
16:40:16<@JAA>My OneHallyu grab has been running for a while now. It looks like it might be tight. Their server is very slow; I'm getting 4.5 to 6 seconds average response time. Hitting it even harder is unlikely to help.
16:40:53<@JAA>Current ETA is just under 5 days. They're shutting down on the 25th...
16:41:53Dalek (Dalek) joins
17:16:27aninternettroll quits [Read error: Connection reset by peer]
17:16:38aninternettroll (aninternettroll) joins
17:20:13DLoader quits [Ping timeout: 272 seconds]
17:22:09DLoader joins
17:31:00BornOn420_ (BornOn420) joins
17:34:35Island joins
17:34:47BornOn420 quits [Ping timeout: 272 seconds]
17:42:00<ScenarioPlanet>Is that outline of AT Wiki planned?
17:43:59<@JAA>Outage, you mean?
17:44:08<ScenarioPlanet>503 error
17:44:48<@JAA>Not planned and being worked on as mentioned in the #archiveteam topic.
17:52:50DLoader quits [Ping timeout: 240 seconds]
17:53:59DLoader joins
18:14:58DLoader_ joins
18:14:59DLoader_ quits [Excess Flood]
18:15:24DLoader_ joins
18:16:50DLoader quits [Ping timeout: 240 seconds]
18:16:56DLoader_ is now known as DLoader
18:21:29<SketchCow>Totally planned.
18:21:32<SketchCow>We never miss
18:22:49<nulldata>It's a planned unplanned outage.
18:23:25<SketchCow>We're ArchiveTeam, we always work with the assumption everything dies and goes down
18:23:28<SketchCow>Nothing surprises us.
18:25:33<nulldata>The wiki is moving to Fandom so you can enjoy McDonald's ads and random unrelated gameplay videos alongside your archival information!
18:31:29<TheTechRobo>All the information from the wiki is now available on our Discord server
18:41:24c3manu quits [Read error: Connection reset by peer]
19:01:15DLoader_ joins
19:03:20DLoader quits [Ping timeout: 240 seconds]
19:03:27DLoader_ is now known as DLoader
19:04:05<nulldata>Wiki is back. Quick, someone make an 'ansaleak' Twitter account for Yahoo Answers!
19:23:23Naruyoko quits [Remote host closed the connection]
19:23:41Naruyoko joins
19:28:38Naruyoko quits [Client Quit]
19:31:19DLoader quits [Ping timeout: 272 seconds]
19:33:43DLoader joins
19:39:15BlueMaxima joins
20:06:48<SketchCow>So, while I'm here, any other issues I need to be aware of?
20:07:03<SketchCow>I haven't abandoned you kids, I just went out for a pack of cigarettes
20:07:21<SketchCow>That should be my title: Archive Team Co-Founder, Went Out For Pack of Cigarettes
20:08:55<murb>missing, presumed smoked?
20:23:36BornOn420_ quits [Client Quit]
20:29:56BornOn420 (BornOn420) joins
20:35:49DLoader_ joins
20:37:20DLoader quits [Ping timeout: 240 seconds]
20:37:21DLoader_ is now known as DLoader
20:40:29Naruyoko joins
20:53:13RealPerson joins
20:53:28<fireonlive>i like this new title
20:54:51<that_lurker>theres a discord server
20:54:58<that_lurker>👀
20:55:06<fireonlive>absolutely not
20:55:07<@JAA>No
20:56:28<that_lurker>https://lounge.kuhaon.fun/folder/62d856ed4f653aee/3dvas5.gif
20:59:57<nulldata>\ msg fire *phew* that was a close one - lurker almost found out about the secret AT Discord server. See you in vc!
21:00:04<nulldata>oh shit
21:01:13<flashfire42>Discord Server?
21:01:32<flashfire42>*Makes backups of emojis from steam giveaway server and leaves*
21:01:34<flashfire42>Now I am ready
21:02:09Jawastore joins
21:02:46Jawastore quits [Remote host closed the connection]
21:17:04<fireonlive>👀
21:17:13<fireonlive>damn it null
21:22:18Webuser77531 joins
21:22:19Webuser775 quits [Ping timeout: 265 seconds]
21:22:47nfriedly quits [Ping timeout: 272 seconds]
21:28:52nfriedly joins
21:30:13icedice (icedice) joins
21:31:14Ketchup901 quits [Remote host closed the connection]
21:31:33Ketchup901 (Ketchup901) joins
21:58:32<Barto>https://support.google.com/groups/answer/11036538?hl=en uff
22:06:59RealPerson leaves
22:08:49RealPerson joins
22:13:41<h2ibot>Flashfire42 edited List of websites excluded from the Wayback Machine/Partial exclusions (+32): https://wiki.archiveteam.org/?diff=51386&oldid=51376
22:15:42<h2ibot>Flashfire42 edited List of websites excluded from the Wayback Machine/Partial exclusions (+31): https://wiki.archiveteam.org/?diff=51387&oldid=51386
22:29:45<h2ibot>Flashfire42 edited List of websites excluded from the Wayback Machine (+23): https://wiki.archiveteam.org/?diff=51388&oldid=51361
22:33:15RealPerson leaves
22:34:00<nicolas17>checking Safari Tech Preview links against WBM now...
22:36:13<nicolas17>95 versions to go, and I started to get rate limited by WBM
22:39:41<@JAA>nicolas17: Are you checking for truncation, too, or are these things below 2 GiB anyway?
22:40:24<nicolas17>for Safari I found one truncated yes
22:40:29<nicolas17>and included it in my list
22:40:45<@JAA>Ok, good :-)
22:41:13<nicolas17>the headers were weird too
22:41:16<nicolas17>HTTP/2 200 content-length: 1048576 x-archive-orig-x-crawler-content-length: 19521864 x-archive-orig-content-length: 1048576
22:41:52<@JAA>Beautiful
22:42:32<datechnoman>JAA thank you very much for that information. Really appreciate it :)
22:42:45<nicolas17>https://web.archive.org/web/20211208042307/http://appldnld.apple.com/Safari3/061-4602.20080416.t5rGb/SafariSetup.exe (truncated)
22:42:54<nicolas17>https://web.archive.org/web/20231220013857/http://appldnld.apple.com/Safari3/061-4602.20080416.t5rGb/SafariSetup.exe (complete, via archivebot yesterday)
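The truncation check discussed above can be sketched as a header comparison. This is a rough sketch under one loud assumption: that `x-archive-orig-x-crawler-content-length` records the full size the crawler expected, while `content-length` is what the Wayback Machine actually serves. The chat only shows the headers and calls them weird, so treat that interpretation as a guess; the function name `looks_truncated` is hypothetical.

```python
from typing import Mapping, Optional


def looks_truncated(headers: Mapping[str, str]) -> Optional[bool]:
    """Compare the served length with the crawler-recorded length.

    Returns True if the served body is shorter than the recorded size,
    False if it is not, and None if either header is missing or not a number.
    Assumption: x-archive-orig-x-crawler-content-length is the full size.
    """
    # Header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    try:
        served = int(lowered["content-length"])
        recorded = int(lowered["x-archive-orig-x-crawler-content-length"])
    except (KeyError, ValueError):
        return None
    return served < recorded


if __name__ == "__main__":
    # Header values as pasted in the chat for the truncated capture.
    print(looks_truncated({
        "content-length": "1048576",
        "x-archive-orig-x-crawler-content-length": "19521864",
        "x-archive-orig-content-length": "1048576",
    }))
```

Captures without the crawler header return None, so they would still need a manual look or a comparison against the live file size.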
22:55:50<h2ibot>JustAnotherArchivist edited Deathwatch (+287, /* 2023 */ Add Inside Imaging): https://wiki.archiveteam.org/?diff=51389&oldid=51385
23:00:51<h2ibot>JAABot edited List of websites excluded from the Wayback Machine (+0): https://wiki.archiveteam.org/?diff=51390&oldid=51388
23:01:58RealPerson joins
23:06:04hitgrr8 quits [Quit: away]
23:11:34<@JAA>OneHallyu seems to be getting slower. I'm seeing an average response time of over 6 seconds now. ETA: not in time
23:11:50RealPerson leaves
23:13:43<nicolas17>other people archiving maybe? D:
23:13:54RealPerson joins
23:16:37<@JAA>Perhaps
23:21:29RealPerson leaves
23:27:30icedice quits [Client Quit]
23:32:12icedice (icedice) joins
23:57:04icedice quits [Client Quit]