00:06:22 | | nicolas17 joins |
00:08:48 | | nickofni1 quits [Remote host closed the connection] |
00:09:04 | | nickofnicks (nickofnicks) joins |
00:09:48 | | nickofnicks quits [Remote host closed the connection] |
00:10:06 | | nickofnicks (nickofnicks) joins |
01:20:38 | | Megame (Megame) joins |
01:34:30 | | nicolas17 is now authenticated as nicolas17 |
02:51:16 | | rfrf joins |
02:51:59 | | rfrf quits [Remote host closed the connection] |
03:06:36 | | celestial quits [Client Quit] |
03:21:30 | | icedice quits [Client Quit] |
03:21:54 | | celestial joins |
03:24:12 | <h2ibot> | JustAnotherArchivist edited Deathwatch (-4, /* 2023 */ Definitive deadline for OneHallyu): https://wiki.archiveteam.org/?diff=51385&oldid=51384 |
03:56:45 | | DogsRNice quits [Read error: Connection reset by peer] |
03:56:55 | | qwertyasdfuiopghjkl quits [Remote host closed the connection] |
04:00:02 | | qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins |
04:13:13 | <fireonlive> | from google alerts: https://mastodon.social/@stroughtonsmith/111602502197304440 :p |
04:24:20 | | DopefishJustin quits [Ping timeout: 240 seconds] |
04:57:25 | <@JAA> | I'm trying to qwarc OneHallyu. It's very slow. |
05:12:50 | | wickedplayer494 quits [Ping timeout: 240 seconds] |
05:13:17 | | wickedplayer494 joins |
05:13:27 | | wickedplayer494 is now authenticated as wickedplayer494 |
05:22:25 | | Island quits [Read error: Connection reset by peer] |
05:23:14 | <@JAA> | AQNB reopened a few hours ago and will shut down tomorrow (21st). Running through AB now. |
05:24:09 | <@arkiver> | wooh :) |
05:25:11 | <@JAA> | The Analysis & Policy Observatory (APO) managed to secure a partnership and isn't shutting down after all. Would be nice to archive anyway, but not as urgent. (Heavy rate limits and bans thwarted the AB attempts.) |
05:25:39 | <@JAA> | > After taking a well-deserved break, APO will be re-established in the new year. The website will remain open for you to search and browse our policy and research repository and collections. |
05:25:44 | <@JAA> | Whatever 're-established' means exactly. |
05:25:53 | <@JAA> | https://apo.org.au/FAQ |
05:26:48 | <nicolas17> | hm I think something about this whole "teraleak" stuff should be mentioned in our TestFlight wiki page |
05:27:18 | <nicolas17> | in particular since I and others did some indexing of what is in the archived data |
05:28:09 | <@JAA> | Yeah, that would probably be a good idea. It was just public data, as I understand it (this was before my time here). |
05:28:48 | <@arkiver> | it was just a regular project |
05:28:53 | <@arkiver> | like all other projects |
05:29:01 | <fireonlive> | i did see arkiver2 as the committer :3 |
05:29:37 | <@JAA> | It wasn't even a listable S3 bucket probably, right? Just S3 URLs referenced on the website? |
05:29:47 | <nicolas17> | I don't know how discovery worked |
05:29:55 | <@arkiver> | the tons of attention and "teraleak" branding is mostly people who just found out the "web archiving is pretty cool, because it saves stuff, like games, that are later maybe not available" |
05:30:04 | <@arkiver> | found out that* |
05:30:13 | <@arkiver> | remember for us it's pretty normal and obvious |
05:30:32 | <@arkiver> | for many non-tech people out there it's a first time they really see something like this |
05:30:43 | <@JAA> | Yeah, but it might be worth clarifying that we're archiving public data, not hacking into private systems or whatever. |
05:30:45 | <nicolas17> | I think good old scraping of previous projects found URLs like http://testflightapp.com/install/000059cc47865bcec060d67ddb11d30b-MTM2NjI4Ng/ |
05:30:53 | <@arkiver> | JAA: yeah! |
05:31:10 | <fireonlive> | someone called it 'leak' initially as well, which caught on and wasn't helpful... |
05:31:24 | <@arkiver> | the code is completely public though, people can see there's no hack-y stuff in there. |
05:31:30 | <fireonlive> | yep :) |
05:31:34 | <nicolas17> | and from there you ended up at the ipa download link like https://testflightapp.com/dashboard/ipa/251aaeaaf0001ec906c157f2f31ddcbd-MTMxODQ0MzI/6280ca3ee10631fd6817100ffd1ee849-MTMzMzc3/ |
05:31:39 | <@arkiver> | fireonlive: yeah honestly sounds like Discord shouting |
05:31:40 | <nicolas17> | which redirected to cloudfront or s3 |
05:31:47 | <fireonlive> | arkiver: indeed |
05:32:22 | <@arkiver> | i'm very happy though nicolas17 is making good use of this, and we're getting attention :) |
05:32:24 | <nicolas17> | arkiver: Jason Scott went into the Discord to clarify things |
05:32:31 | <@arkiver> | although in a different way would have been better |
05:32:33 | <@arkiver> | nicolas17: yeah |
05:32:34 | <@JAA> | arkiver: Assuming people (a) take the time to look for the code, (b) read the code, and (c) understand the code. :-) |
05:33:06 | <nicolas17> | I know someone who plans to do some analysis on the executables |
05:33:08 | <@arkiver> | JAA: yeah but if any "official bodies" get involved or look further into this, they'll find the code and how it's all working |
05:33:13 | <@JAA> | Right |
05:33:39 | <@arkiver> | just an Archive Team project like all others :) |
05:33:42 | <@arkiver> | 9 years ago |
05:33:54 | <@arkiver> | also this month i'm 10 years with Archive Team! |
05:34:03 | <@JAA> | \o/ |
05:34:04 | <fireonlive> | :D |
05:34:08 | <nicolas17> | she works at a company making a decompiler and I think other reverse engineering tools, so 70000 real-world iOS binaries is a goldmine of test cases |
05:35:50 | <@arkiver> | nicolas17: yeah :) |
05:36:44 | <nicolas17> | arkiver: also for people who want to do stuff on the whole dataset, they're like "wait what is this warc thing" |
05:37:09 | <nicolas17> | "you mean I don't have to do 70k requests to the slow web.archive.org hostname to download each individual file?" |
05:37:34 | <@JAA> | 'I can download over a terabyte at 5 kB/s instead? Amazing!' :-P |
05:37:43 | <@JAA> | But yeah |
05:38:00 | <nicolas17> | some people who have the storage quickly found the torrents |
05:38:05 | <@JAA> | Right |
05:38:12 | <@JAA> | Are these torrents complete? |
05:38:32 | <@arkiver> | there are torrents? |
05:38:36 | <@arkiver> | of testflight |
05:38:38 | <nicolas17> | I don't have the storage, so I'm piping wget into an "extract what I need and throw it away" script :P |
05:38:56 | <@JAA> | arkiver: We only set noarchivetorrent since a couple years ago, so I'd expect there to be torrents. |
05:38:58 | <nicolas17> | archive.org's autogenerated torrents work fine, the warcs are max 50GB |
05:39:02 | <fireonlive> | everyone should get a free 1PiB minimum :( |
05:39:03 | <@arkiver> | JAA: ah |
05:39:12 | <fireonlive> | as a human right! |
05:39:21 | <nicolas17> | although they have the usual problems of IA torrents |
05:39:37 | <@arkiver> | nicolas17: what are those problems? |
05:39:48 | <nicolas17> | textfiles edited the description on the items to clarify where they came from, and added a preview image |
05:40:07 | <nicolas17> | which re-generated the torrents |
05:40:30 | <nicolas17> | so now the old ones don't work anymore, or at least don't exchange data with people who got the new ones :P |
05:40:45 | <@arkiver> | i don't know if we needed that image |
05:40:47 | <@arkiver> | on all items |
05:41:16 | <nicolas17> | yeah that was questionable, but I think the xml file with the metadata (including description) is in the torrent too |
05:41:17 | <@JAA> | I don't think we needed it on any item. |
05:41:24 | <@arkiver> | JAA: yeah |
05:41:28 | <@arkiver> | just the logo maybe on the collection |
05:41:33 | <nicolas17> | so image or not, editing the description would invalidate the torrent anyway |
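Editor's sketch of why a metadata edit splits the swarm, as described above. A torrent is identified by the SHA-1 of its bencoded "info" dictionary, so if the item's metadata XML is part of the torrent (as nicolas17 suspects), any change to that file changes the identifier. The file names and sizes below are hypothetical, and the toy info dictionary omits the piece hashes that a real torrent carries.

```python
import hashlib


def bencode(value) -> bytes:
    """Minimal bencoder covering ints, bytes, str, lists and dicts."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Bencoded dictionaries must have their keys in sorted order.
        items = sorted(
            ((k.encode("utf-8") if isinstance(k, str) else k, v) for k, v in value.items()),
            key=lambda kv: kv[0],
        )
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def info_hash(info: dict) -> str:
    """The swarm identifier: SHA-1 over the bencoded info dictionary."""
    return hashlib.sha1(bencode(info)).hexdigest()


def toy_info(meta_xml_size: int) -> dict:
    """Hypothetical item: one WARC plus a metadata XML of the given size."""
    return {
        "name": "example_item",
        "piece length": 524288,
        "files": [
            {"length": 50_000_000_000, "path": ["example.warc.gz"]},
            {"length": meta_xml_size, "path": ["example_item_meta.xml"]},
        ],
    }


before = info_hash(toy_info(1200))  # metadata before the description edit
after = info_hash(toy_info(1350))   # metadata after the description edit
print(before != after)
```

Peers holding the old and the new torrent announce under different info hashes, which matches the observation that they no longer exchange data with each other.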
05:41:49 | <fireonlive> | :( |
05:42:12 | <@JAA> | Collections can have images, I think? That could've been the appropriate place. |
05:42:35 | | nicolas17 beds |
05:42:39 | <@JAA> | And a link to the wiki page there would've been good, too. |
05:42:50 | <fireonlive> | collection description maybe? hm. |
05:43:02 | <fireonlive> | cu nicky |
05:46:05 | | wickedplayer494 quits [Ping timeout: 272 seconds] |
05:47:13 | | wickedplayer494 joins |
05:47:23 | | wickedplayer494 is now authenticated as wickedplayer494 |
06:08:43 | | BlueMaxima quits [Read error: Connection reset by peer] |
06:18:25 | | DopefishJustin joins |
06:18:25 | | DopefishJustin is now authenticated as DopefishJustin |
06:51:14 | | atphoenix quits [Remote host closed the connection] |
06:51:54 | | atphoenix (atphoenix) joins |
06:54:08 | | atphoenix quits [Remote host closed the connection] |
06:54:48 | | atphoenix (atphoenix) joins |
06:57:32 | | atphoenix quits [Remote host closed the connection] |
06:58:15 | | atphoenix (atphoenix) joins |
07:00:49 | | jasons quits [Ping timeout: 272 seconds] |
07:02:02 | | jasons (jasons) joins |
07:40:01 | | jasons quits [Client Quit] |
07:40:31 | | jasons (jasons) joins |
07:44:16 | | jasons quits [Client Quit] |
07:44:46 | | jasons (jasons) joins |
07:47:03 | | Arcorann (Arcorann) joins |
07:48:16 | <fireonlive> | https://www.reddit.com/r/DataHoarder/comments/18mjqjd/the_master_tapes_for_all_of_reboot_have_been/ |
07:48:33 | <fireonlive> | master tapes for “reboot” have been found |
08:12:13 | | c3manu (c3manu) joins |
08:14:50 | <flashfire42> | Archiveteam wiki down? |
08:15:12 | <fireonlive> | indee |
08:15:13 | <fireonlive> | d |
08:20:00 | <angenieux> | Is it because of the TestFlight "leak" driving traffic to the website? |
08:25:13 | <fireonlive> | unsure; another AT wiki went down too |
08:32:43 | <angenieux> | is that the https://internetarchive.archiveteam.org/ ? |
08:35:28 | <fireonlive> | indee |
08:35:33 | <fireonlive> | ..d |
08:35:51 | <angenieux> | i see |
09:17:42 | | qwertyasdfuiopghjkl quits [Remote host closed the connection] |
09:17:59 | | qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins |
09:37:27 | | c3manu quits [Remote host closed the connection] |
09:45:33 | | Webuser775 joins |
09:45:45 | | c3manu (c3manu) joins |
09:45:46 | | c3manu quits [Max SendQ exceeded] |
09:46:14 | | c3manu (c3manu) joins |
09:46:14 | | c3manu quits [Remote host closed the connection] |
09:46:30 | | c3manu (c3manu) joins |
09:46:30 | | c3manu quits [Max SendQ exceeded] |
09:46:47 | | c3manu (c3manu) joins |
10:00:01 | | Bleo1826 quits [Client Quit] |
10:01:17 | | Bleo1826 joins |
10:50:18 | | qwertyasdfuiopghjkl quits [Remote host closed the connection] |
11:12:02 | | Webuser775 quits [Client Quit] |
11:16:57 | | angenieux quits [Quit: The Lounge - https://thelounge.chat] |
11:17:27 | | angenieux (angenieux) joins |
11:19:31 | | qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins |
11:24:51 | <immibis> | Times of Israel should probably get archived, yeah? it has strict cloudflare in front of it so i bet generic efforts didn't get it |
11:31:07 | <datechnoman> | Quick question for the mind hive. Playing around with the ia command line tool for the first time. How do I set the period when the files were uploaded/published to download? I am trying to pull all the cdx.gz files for a given period and then process them with the cdxsummary tool. The command I am using is "ia search |
11:31:08 | <datechnoman> | 'collection:archiveteam_telegram' --itemlist | xargs -r -n 5 ia download --glob '*.cdx.gz'". I assumed there would be a switch such as --date but that does not seem to be the case. If there is a better way to do this please do share. Thanks in advance! |
11:40:24 | <datechnoman> | Maybe I could use identifiers or something? |
11:41:42 | <datechnoman> | Would also like a way to export all of the cdx.gz download links for that period as I can create a script to run them through the cdxsummary tool |
11:42:59 | <datechnoman> | Any example being: https://archive.org/download/archiveteam_telegram_20231220082241_fa0afc34/archiveteam_telegram_20231220082241_fa0afc34.cdx.gz |
13:00:50 | | Arcorann quits [Ping timeout: 240 seconds] |
13:04:51 | | ScenarioPlanet (ScenarioPlanet) joins |
13:15:55 | | atphoenix quits [Remote host closed the connection] |
13:16:38 | | atphoenix (atphoenix) joins |
13:50:11 | | Webuser775 joins |
14:05:47 | | katocala quits [Ping timeout: 272 seconds] |
14:06:05 | | katocala joins |
14:06:05 | | katocala is now authenticated as katocala |
14:15:17 | | nicolas17 quits [Ping timeout: 272 seconds] |
14:32:28 | | Mateon1 quits [Quit: Mateon1] |
14:33:17 | | Mateon1 joins |
15:12:21 | | hackbug quits [Remote host closed the connection] |
15:15:41 | | hackbug (hackbug) joins |
15:26:21 | | Megame quits [Client Quit] |
15:52:20 | | kiryu quits [Ping timeout: 240 seconds] |
15:54:24 | | kiryu joins |
15:54:24 | | kiryu is now authenticated as kiryu |
15:54:24 | | kiryu quits [Changing host] |
15:54:24 | | kiryu (kiryu) joins |
15:54:46 | | nicolas17 joins |
15:57:15 | <magmaus3> | out of curiosity, does #down-the-tube require any special permissions to use the bot? |
15:59:09 | | DogsRNice joins |
15:59:50 | <Pedrosso> | checking #down-the-tube james without mod or voice could do so |
16:04:39 | <nicolas17> | magmaus3: technically, no permissions needed |
16:05:26 | <nicolas17> | but if you're unsure if some video/channel fits in the archival scope as documented in the wiki, ask an op to approve it before submitting |
16:05:53 | <@JAA> | datechnoman: `ia search 'collection:archiveteam_telegram addeddate:[2023-12-01 TO 2023-12-20]' ...` for items created on those days. There's also `publicdate` (exact details of how that's set are unclear to me), and `oai_updatedate` allows to find items that had their most recent changes in some time window. You can use `null` instead of a date to make it an open range search. |
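Editor's sketch of the date-range search JAA describes, for anyone scripting it. It only builds the query string and the .cdx.gz download links, and does not contact archive.org. The helper names are hypothetical, and the URL pattern (file named after the item identifier) is an assumption taken from datechnoman's single example link.

```python
from datetime import date
from typing import Optional


def build_query(collection: str,
                start: Optional[date],
                end: Optional[date],
                field: str = "addeddate") -> str:
    """Build a range query; a missing bound becomes `null` (open range)."""
    lo = start.isoformat() if start else "null"
    hi = end.isoformat() if end else "null"
    return f"collection:{collection} {field}:[{lo} TO {hi}]"


def cdx_url(identifier: str) -> str:
    """Assumed layout: <identifier>/<identifier>.cdx.gz under /download/."""
    return f"https://archive.org/download/{identifier}/{identifier}.cdx.gz"


query = build_query("archiveteam_telegram", date(2023, 12, 1), date(2023, 12, 20))
print(query)
print(cdx_url("archiveteam_telegram_20231220082241_fa0afc34"))
```

The query string can be passed to `ia search '<query>' --itemlist` as in the command quoted earlier, and each identifier it prints can be fed to `cdx_url` to get a list of links for cdxsummary.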
16:07:40 | | SketchCow joins |
16:08:08 | | SketchCow quits [Client Quit] |
16:08:29 | <@JAA> | A wild sketchy cow appeared. |
16:08:58 | <nicolas17> | but you didn't catch it fast enough |
16:09:38 | | SketchCow joins |
16:09:46 | <nulldata> | Quick, catch him now! |
16:09:50 | <@JAA> | :-) |
16:39:01 | | Dalek quits [] |
16:40:16 | <@JAA> | My OneHallyu grab has been running for a while now. It looks like it might be tight. Their server is very slow; I'm getting 4.5 to 6 seconds average response time. Hitting it even harder is unlikely to help. |
16:40:53 | <@JAA> | Current ETA is just under 5 days. They're shutting down on the 25th... |
16:41:53 | | Dalek (Dalek) joins |
17:16:27 | | aninternettroll quits [Read error: Connection reset by peer] |
17:16:38 | | aninternettroll (aninternettroll) joins |
17:20:13 | | DLoader quits [Ping timeout: 272 seconds] |
17:22:09 | | DLoader joins |
17:31:00 | | BornOn420_ (BornOn420) joins |
17:34:35 | | Island joins |
17:34:47 | | BornOn420 quits [Ping timeout: 272 seconds] |
17:42:00 | <ScenarioPlanet> | Is that outline of AT Wiki planned? |
17:43:59 | <@JAA> | Outage, you mean? |
17:44:08 | <ScenarioPlanet> | 503 error |
17:44:48 | <@JAA> | Not planned and being worked on as mentioned in the #archiveteam topic. |
17:52:50 | | DLoader quits [Ping timeout: 240 seconds] |
17:53:59 | | DLoader joins |
18:14:58 | | DLoader_ joins |
18:14:59 | | DLoader_ quits [Excess Flood] |
18:15:24 | | DLoader_ joins |
18:16:50 | | DLoader quits [Ping timeout: 240 seconds] |
18:16:56 | | DLoader_ is now known as DLoader |
18:21:29 | <SketchCow> | Totally planned. |
18:21:32 | <SketchCow> | We never miss |
18:22:49 | <nulldata> | It's a planned unplanned outage. |
18:23:25 | <SketchCow> | We're ArchiveTeam, we always work with the assumption everything dies and goes down |
18:23:28 | <SketchCow> | Nothing surprises us. |
18:25:33 | <nulldata> | The wiki is moving to Fandom so you can enjoy McDonald's ads and random unrelated gameplay videos along side your archival information! |
18:31:29 | <TheTechRobo> | All the information from the wiki is now available on our Discord server |
18:41:24 | | c3manu quits [Read error: Connection reset by peer] |
19:01:15 | | DLoader_ joins |
19:03:20 | | DLoader quits [Ping timeout: 240 seconds] |
19:03:27 | | DLoader_ is now known as DLoader |
19:04:05 | <nulldata> | Wiki is back. Quick, someone make an 'ansaleak' Twitter account for Yahoo Answers! |
19:23:23 | | Naruyoko quits [Remote host closed the connection] |
19:23:41 | | Naruyoko joins |
19:28:38 | | Naruyoko quits [Client Quit] |
19:31:19 | | DLoader quits [Ping timeout: 272 seconds] |
19:33:43 | | DLoader joins |
19:39:15 | | BlueMaxima joins |
20:06:48 | <SketchCow> | So, while I'm here, any other issues I need to be aware of? |
20:07:03 | <SketchCow> | I haven't abandoned you kids, I just went out for a pack of cigarettes |
20:07:21 | <SketchCow> | That should be my title: Archive Team Co-Founder, Went Out For Pack of Cigarettes |
20:08:55 | <murb> | missing, presumed smoked? |
20:23:36 | | BornOn420_ quits [Client Quit] |
20:29:56 | | BornOn420 (BornOn420) joins |
20:35:49 | | DLoader_ joins |
20:37:20 | | DLoader quits [Ping timeout: 240 seconds] |
20:37:21 | | DLoader_ is now known as DLoader |
20:40:29 | | Naruyoko joins |
20:53:13 | | RealPerson joins |
20:53:28 | <fireonlive> | i like this new title |
20:54:51 | <that_lurker> | theres a discord server |
20:54:58 | <that_lurker> | [emoji] |
20:55:06 | <fireonlive> | absolutely not |
20:55:07 | <@JAA> | No |
20:56:28 | <that_lurker> | https://lounge.kuhaon.fun/folder/62d856ed4f653aee/3dvas5.gif |
20:59:57 | <nulldata> | \ msg fire *phew* that was a close one - lurker almost found out about the secret AT Discord server. See you in vc! |
21:00:04 | <nulldata> | oh shit |
21:01:13 | <flashfire42> | Discord Server? |
21:01:32 | <flashfire42> | *Makes backups of emojis from steam giveaway server and leaves* |
21:01:34 | <flashfire42> | Now I am ready |
21:02:09 | | Jawastore joins |
21:02:46 | | Jawastore quits [Remote host closed the connection] |
21:17:04 | <fireonlive> | [emoji] |
21:17:13 | <fireonlive> | damn it null |
21:22:18 | | Webuser77531 joins |
21:22:19 | | Webuser775 quits [Ping timeout: 265 seconds] |
21:22:47 | | nfriedly quits [Ping timeout: 272 seconds] |
21:28:52 | | nfriedly joins |
21:30:13 | | icedice (icedice) joins |
21:31:14 | | Ketchup901 quits [Remote host closed the connection] |
21:31:33 | | Ketchup901 (Ketchup901) joins |
21:58:32 | <Barto> | https://support.google.com/groups/answer/11036538?hl=en uff |
22:06:59 | | RealPerson leaves |
22:08:49 | | RealPerson joins |
22:13:41 | <h2ibot> | Flashfire42 edited List of websites excluded from the Wayback Machine/Partial exclusions (+32): https://wiki.archiveteam.org/?diff=51386&oldid=51376 |
22:15:42 | <h2ibot> | Flashfire42 edited List of websites excluded from the Wayback Machine/Partial exclusions (+31): https://wiki.archiveteam.org/?diff=51387&oldid=51386 |
22:29:45 | <h2ibot> | Flashfire42 edited List of websites excluded from the Wayback Machine (+23): https://wiki.archiveteam.org/?diff=51388&oldid=51361 |
22:33:15 | | RealPerson leaves |
22:34:00 | <nicolas17> | checking Safari Tech Preview links against WBM now... |
22:36:13 | <nicolas17> | 95 versions to go, and I started to get rate limited by WBM |
22:39:41 | <@JAA> | nicolas17: Are you checking for truncation, too, or are these things below 2 GiB anyway? |
22:40:24 | <nicolas17> | for Safari I found one truncated yes |
22:40:29 | <nicolas17> | and included it in my list |
22:40:45 | <@JAA> | Ok, good :-) |
22:41:13 | <nicolas17> | the headers were weird too |
22:41:16 | <nicolas17> | HTTP/2 200 content-length: 1048576 x-archive-orig-x-crawler-content-length: 19521864 x-archive-orig-content-length: 1048576 |
22:41:52 | <@JAA> | Beautiful |
22:42:32 | <datechnoman> | JAA thank you very much for that information. Really appreciate it :) |
22:42:45 | <nicolas17> | https://web.archive.org/web/20211208042307/http://appldnld.apple.com/Safari3/061-4602.20080416.t5rGb/SafariSetup.exe (truncated) |
22:42:54 | <nicolas17> | https://web.archive.org/web/20231220013857/http://appldnld.apple.com/Safari3/061-4602.20080416.t5rGb/SafariSetup.exe (complete, via archivebot yesterday) |
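Editor's sketch of a truncation check based on the headers nicolas17 pasted. It assumes that `x-archive-orig-x-crawler-content-length` records the size the origin server reported to the crawler, which is an interpretation of that one example and not documented behaviour. The function name is hypothetical.

```python
from typing import Mapping, Optional


def _as_int(value: Optional[str]) -> Optional[int]:
    """Parse a header value as an integer, or return None if unusable."""
    try:
        return int(value) if value is not None else None
    except ValueError:
        return None


def looks_truncated(headers: Mapping[str, str]) -> Optional[bool]:
    """Compare the served length with the length the crawler recorded.

    Returns None when either header is missing or not a number, since
    nothing can be concluded in that case.
    """
    lowered = {k.lower(): v for k, v in headers.items()}
    served = _as_int(lowered.get("content-length"))
    crawler = _as_int(lowered.get("x-archive-orig-x-crawler-content-length"))
    if served is None or crawler is None:
        return None
    return served < crawler


# The header values from the truncated SafariSetup.exe capture above.
pasted = {
    "content-length": "1048576",
    "x-archive-orig-x-crawler-content-length": "19521864",
    "x-archive-orig-content-length": "1048576",
}
print(looks_truncated(pasted))
```

A capture without the crawler header gives None, so those would still need a check by another method such as comparing against a known file size.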
22:55:50 | <h2ibot> | JustAnotherArchivist edited Deathwatch (+287, /* 2023 */ Add Inside Imaging): https://wiki.archiveteam.org/?diff=51389&oldid=51385 |
23:00:51 | <h2ibot> | JAABot edited List of websites excluded from the Wayback Machine (+0): https://wiki.archiveteam.org/?diff=51390&oldid=51388 |
23:01:58 | | RealPerson joins |
23:06:04 | | hitgrr8 quits [Quit: away] |
23:11:34 | <@JAA> | OneHallyu seems to be getting slower. I'm seeing an average response time of over 6 seconds now. ETA: not in time |
23:11:50 | | RealPerson leaves |
23:13:43 | <nicolas17> | other people archiving maybe? D: |
23:13:54 | | RealPerson joins |
23:16:37 | <@JAA> | Perhaps |
23:21:29 | | RealPerson leaves |
23:27:30 | | icedice quits [Client Quit] |
23:32:12 | | icedice (icedice) joins |
23:57:04 | | icedice quits [Client Quit] |