| 00:06:19 | | etnguyen03 quits [Client Quit] |
| 00:09:00 | | Sk1d joins |
| 00:18:06 | <kline> | is there anyone who still has an interest in chaingang? I now have a text serialised version of the bitcoin (blergh) blockchain, and wondering what the best architecture for it would be to upload |
| 00:19:05 | <nicolas17> | what |
| 00:19:42 | <nicolas17> | what is chaingang? |
| 00:19:57 | <kline> | https://wiki.archiveteam.org/index.php/ArchiveTeam_Chain_Gang |
| 00:20:16 | <nicolas17> | oh |
| 00:20:57 | <nicolas17> | isn't that like impossible to get lost? |
| 00:21:24 | <kline> | an otherwise abandoned project to back up blockchains used for cryptocurrencies. i dont personally like them, but i do think bitcoin and probably ethereum could be considered internet-culturally-important and especially right now has taken a bit of a dive with the allocation of computing resources towards LLMs |
| 00:22:42 | <kline> | i dont think its impossible to lose - there is some subset of users of the network who maintain full copies of the chain, often service providers who want to sell analytics etc, but most users do not have a complete copy and only maintain the last-N blocks to allow a bit of history |
| 00:23:10 | <kline> | i think monero has already got some holes in its blockchain but i would need to find a source on that |
| 00:23:16 | <nicolas17> | ah |
| 00:23:30 | <nicolas17> | last time I used this there were two types of clients |
| 00:24:01 | <nicolas17> | full-history and thin-client-ish |
| 00:24:16 | <nicolas17> | guess they finally implemented history pruning |
| 00:25:04 | | wickedplayer494 quits [Ping timeout: 256 seconds] |
| 00:25:17 | <kline> | full history nodes (with indexes) are closing in on 1TB of storage space (about 650GB is binary data, the rest are fast indexes so you dont have to iterate back through the chain every time you want to find a given block) |
| 00:25:50 | <kline> | naturally most people aren't spending a week+ downloading this to get started |
| 00:26:02 | | wickedplayer494 (wickedplayer494) joins |
| 00:31:23 | <kline> | failing anyone specifically interested, a more generic question: i have 206 files of monthly data coming to a grand total of ~850GB. Would it be better to structure this as 206 single-file items under a collection, or 1 item with 206 files? |
| 00:31:45 | <kline> | the largest individual file is 11GB |
| 00:33:21 | <pokechu22> | I don't have an opinion, but it's worth considering how you'd handle adding new data next month |
| 00:34:00 | <@JAA> | Something in between like yearly or quarterly items might be worth considering, too. |
| 00:34:32 | <kline> | pokechu22, you can add new files to an item afterwards, no? |
| 00:34:43 | <@JAA> | How is this data distributed in the actual network? As in, when you set up a full history node, how does it obtain all the data? |
| 00:35:03 | <pokechu22> | Pretty sure you can, though I feel like that can be a bit weird in some situations |
| 00:35:20 | <@JAA> | You can add more files later, yes, but there is a hard item size limit of 1 TiB, so you'd probably run into that pretty soon. |
| 00:35:23 | <nicolas17> | JAA: from other nodes p2p via the custom bitcoin protocol |
| 00:35:38 | <kline> | JAA, it bootstraps itself into the p2p network finding other full nodes, then just iterates through every block in order until its up to date |
| 00:35:49 | <@JAA> | Ah |
| 00:36:03 | <@JAA> | And how does the bootstrapping work? |
| 00:36:36 | <kline> | thats a good question, let me check how it finds its first peer |
| 00:36:38 | <nicolas17> | peers can tell you about other peers, to find the first ones I think there's some hardcoded IPs or a DNS record? |
| 00:37:53 | | SootBector quits [Remote host closed the connection] |
| 00:38:02 | <@JAA> | Yeah, that's what I'd expect, I think. BitTorrent's DHT works like that. |
| 00:39:03 | | SootBector (SootBector) joins |
| 00:39:17 | <kline> | DNS and hardcoded IPs of promised long-term nodes as a backup, apparently |
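The item-layout question above (206 monthly files, ~850GB in total, a hard 1 TiB limit per item) can be sketched as a small grouping check. This is only an illustration: the `YYYY-MM` file-name prefix, the helper names `group_by_year` and `oversized`, and the example sizes are all assumptions, not the real data set.

```python
from collections import defaultdict

TIB = 1024 ** 4  # hard per-item size limit mentioned in the log


def group_by_year(files):
    """Group (name, size_bytes) pairs into one item per year.

    Assumes hypothetical file names that start with 'YYYY-MM'.
    """
    items = defaultdict(list)
    for name, size in files:
        items[name[:4]].append((name, size))
    return dict(items)


def oversized(items, limit=TIB):
    """Return the item keys whose total size exceeds the limit."""
    return [key for key, members in items.items()
            if sum(size for _, size in members) > limit]


# Illustrative data only: 24 monthly files of 4 GB each (not the real sizes).
example = [(f"{2009 + i // 12}-{i % 12 + 1:02d}.txt", 4 * 10**9)
           for i in range(24)]
yearly = group_by_year(example)
```

A single item holding roughly 850 GB would be about 0.77 TiB, so it fits today but leaves little headroom for new monthly files; yearly or quarterly items keep each item far below the limit.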
| 00:43:40 | | nexussfan (nexussfan) joins |
| 00:47:08 | | useretail quits [Quit: Leaving] |
| 00:48:02 | | Sk1d quits [Read error: Connection reset by peer] |
| 01:00:47 | | wickedplayer494 quits [Ping timeout: 272 seconds] |
| 01:01:50 | | wickedplayer494 (wickedplayer494) joins |
| 01:06:12 | | Wohlstand quits [Quit: Wohlstand] |
| 01:14:12 | | etnguyen03 (etnguyen03) joins |
| 01:25:12 | <h2ibot> | PaulWise edited Obstacles (+44, Spigot poisoner): https://wiki.archiveteam.org/?diff=60473&oldid=60388 |
| 01:33:13 | <h2ibot> | PaulWise edited Obstacles (+55, sethrawall): https://wiki.archiveteam.org/?diff=60474&oldid=60473 |
| 01:34:59 | | wickedplayer494 quits [Ping timeout: 272 seconds] |
| 01:39:29 | | wickedplayer494 (wickedplayer494) joins |
| 02:32:33 | | LddPotato quits [Read error: Connection reset by peer] |
| 02:34:49 | | LddPotato (LddPotato) joins |
| 02:44:39 | | Bleo1826007227196234552220 quits [Ping timeout: 272 seconds] |
| 02:44:39 | | LddPotato quits [Read error: Connection reset by peer] |
| 02:46:02 | | LddPotato (LddPotato) joins |
| 02:51:45 | <Yakov> | Speaking of archiving archive.today captures, can't we make a system where users solve captchas towards a pool of browser sessions running at AT? Might not generate WBM-valid warcs but will at least be able to be preserved to some extent |
| 02:52:49 | <Yakov> | Thought of this for a while, where the captcha is streamed to the browser via novnc so people can contribute solving a captcha that's being operated on a remote server |
| 02:53:47 | <Yakov> | Then we can scrape popular sites (like wikis) which reference archive.today so we can build a nice priority list of urls |
| 02:56:56 | <Yakov> | Can technically even be a valid warc if it's acceptable to omit the captcha request from the warc - not really sure about the standards of that |
| 03:03:14 | | LddPotato quits [Read error: Connection reset by peer] |
| 03:04:47 | | LddPotato (LddPotato) joins |
| 03:24:03 | | LddPotato quits [Read error: Connection reset by peer] |
| 03:26:00 | | LddPotato (LddPotato) joins |
| 03:39:07 | | ducky quits [Ping timeout: 272 seconds] |
| 04:04:16 | | APOLLO03a quits [Read error: Connection reset by peer] |
| 04:04:30 | | APOLLO03 joins |
| 04:11:33 | | etnguyen03 quits [Client Quit] |
| 04:12:02 | | etnguyen03 (etnguyen03) joins |
| 04:13:38 | | etnguyen03 quits [Remote host closed the connection] |
| 04:30:37 | | Island quits [Read error: Connection reset by peer] |
| 04:57:56 | | Shjosan quits [Quit: Am sleepy (-, – )…zzzZZZ] |
| 04:58:27 | | Shjosan (Shjosan) joins |
| 05:04:26 | | n9nes quits [Ping timeout: 256 seconds] |
| 05:05:18 | | n9nes joins |
| 05:13:36 | | janos777 quits [Read error: Connection reset by peer] |
| 05:13:36 | | janos778 quits [Read error: Connection reset by peer] |
| 05:24:57 | | BlankEclair leaves |
| 05:25:09 | | BlankEclair (BlankEclair) joins |
| 06:00:07 | | Webuser146109 joins |
| 06:02:12 | | Webuser146109 quits [Client Quit] |
| 06:05:28 | | nexussfan quits [Quit: Konversation terminated!] |
| 07:59:39 | | benjins3_ joins |
| 08:00:06 | | benjins3 quits [Ping timeout: 256 seconds] |
| 08:28:33 | | ichdasich quits [Ping timeout: 272 seconds] |
| 08:29:08 | | ichdasich joins |
| 09:22:04 | | sec^nd quits [Remote host closed the connection] |
| 09:22:27 | | sec^nd (second) joins |
| 09:27:22 | | sec^nd quits [Remote host closed the connection] |
| 09:27:45 | | sec^nd (second) joins |
| 09:28:05 | | ichdasich quits [Ping timeout: 272 seconds] |
| 09:33:15 | | ichdasich joins |
| 09:46:17 | | Wohlstand (Wohlstand) joins |
| 09:57:47 | <h2ibot> | Manu edited Discourse/archived (+103, Queued community.roboticsys.com): https://wiki.archiveteam.org/?diff=60475&oldid=60469 |
| 10:22:54 | | ducky (ducky) joins |
| 10:29:22 | | n9nes quits [Quit: ZNC 1.10.1 - https://znc.in] |
| 10:33:02 | | n9nes joins |
| 11:25:02 | | Webuser364520 joins |
| 11:26:17 | | ducky_ (ducky) joins |
| 11:26:31 | | ducky quits [Ping timeout: 272 seconds] |
| 11:26:54 | | ducky_ is now known as ducky |
| 11:42:07 | | Dada joins |
| 11:51:55 | | Webuser364520 quits [Client Quit] |
| 12:02:48 | | Bleo1826007227196234552220 joins |
| 12:08:57 | | Snivy quits [Ping timeout: 272 seconds] |
| 12:13:05 | <katia> | yeah i thought of doing this for my own archival of Difficult thugs Yakov |
| 12:13:14 | <katia> | browser in novnc + warcprox |
| 12:13:56 | <katia> | don't see why it wouldn't work |
| 13:39:58 | | pedantic-darwin quits [Quit: The Lounge - https://thelounge.chat] |
| 13:40:09 | | Arcorann_ quits [Ping timeout: 272 seconds] |
| 13:57:28 | | cyanbox quits [Read error: Connection reset by peer] |
| 15:14:22 | <justauser> | thedude: For a single page, "wget -p" might be perfectly fine if it likes your IP. |
| 15:16:07 | <justauser> | Everybody interested in archive.today: CAPTCHA difficulty level is allegedly adaptive to server load, so we may want to run during some "quiet time". |
| 15:17:51 | | Island joins |
| 15:48:07 | | anarcat quits [Quit: rebooting] |
| 15:49:00 | <justauser> | https://comp.lain.la/notice/B3Fp2LQ98vYq06V9EG |
| 15:49:53 | <justauser> | Did we try talking yet? |
| 15:50:47 | | anarcat (anarcat) joins |
| 15:51:22 | <klea> | /cc pabs because I think they said about asking them. |
| 16:16:06 | <@arkiver> | i think we don't have historyhub.history.gov archived yet |
| 16:18:56 | | Dada quits [Remote host closed the connection] |
| 16:21:47 | | kansei (kansei) joins |
| 16:22:10 | | kansei- quits [Ping timeout: 256 seconds] |
| 16:22:47 | <@arkiver> | found a way to enumerate all discussions |
| 16:26:33 | <@arkiver> | imer: i may need a very urgent target for historyhub, shutting down on the 13th |
| 16:26:34 | <@arkiver> | today |
| 16:26:43 | <@imer> | ack |
| 16:26:47 | <@arkiver> | less than 100k threads, and they can be enumerated easily |
| 16:27:50 | <@imer> | give me a shout once the tracker is up and i'll set one up |
| 16:27:59 | <@arkiver> | imer: it's up at historyhub |
| 16:28:13 | | @imer checked 5s ago and it wasnt |
| 16:28:24 | <@arkiver> | can we get a target with historyhub_ archiveteam_historyhub_ and "Archive Team History Hub:"? |
| 16:28:31 | <@arkiver> | starting as soon as possible, hurrying |
| 16:28:36 | <@arkiver> | imer: yep i created it seconds ago |
| 16:29:43 | <@imer> | target's up |
| 16:30:04 | <@arkiver> | thanks |
| 16:30:10 | <@arkiver> | if needed we'll pause other projects, but i expect it's not needed |
| 16:30:42 | <@imer> | drone poked too |
| 16:36:20 | <@arkiver> | thank you as well for that |
| 16:37:21 | <@arkiver> | going to keep this simple i think |
| 16:40:09 | <@arkiver> | less than 60k threads |
| 16:40:18 | <@arkiver> | if this is not hosted on a potato we should be good |
| 16:56:00 | | Connection closed. |
| 16:56:12 | | atirclog (atirclog) joins |
| 16:56:12 | | Topic: Lengthy ArchiveTeam-related discussions, questions here | Offtopic: #archiveteam-ot | https://twitter.com/textfiles/status/1069715869994020867 |
| 16:56:12 | | Topic set by AlsoJAA at 2025-09-09 18:44:47Z |
| 16:56:12 | | h2ibot (h2ibot) joins |
| 16:56:12 | | jinn6 (jinn6) joins |
| 16:56:13 | | ax (ax) joins |
| 16:56:13 | | inedia (inedia) joins |
| 16:56:13 | | igloo22225 (igloo22225) joins |
| 16:56:14 | | yasomi joins |
| 16:56:14 | | yasomi is now authenticated as yasomi |
| 16:56:14 | | yasomi quits [Changing host] |
| 16:56:14 | | yasomi (yasomi) joins |
| 16:56:14 | | FalconK (FalconK) joins |
| 16:56:14 | | maxfan8_ (maxfan8) joins |
| 16:56:14 | | d10n joins |
| 16:56:14 | | TunaLobster joins |
| 16:56:14 | | lukash984 joins |
| 16:56:14 | | s-crypt (s-crypt) joins |
| 16:56:14 | | Lord_Nightmare (Lord_Nightmare) joins |
| 16:56:15 | | wyatt8740 joins |
| 16:56:15 | | notSokar joins |
| 16:56:15 | | PredatorIWD25 joins |
| 16:56:15 | | Barto (Barto) joins |
| 16:56:16 | | unknownsrc (unknownsrc) joins |
| 16:56:16 | | Pedrosso joins |
| 16:56:17 | | Fusl (Fusl) joins |
| 16:56:17 | | skankhunt42 joins |
| 16:56:17 | | monika (boom) joins |
| 16:56:17 | | sepro (sepro) joins |
| 16:56:17 | | michaelblob joins |
| 16:56:17 | | Soulflare joins |
| 16:56:18 | | benjins3_ joins |
| 16:56:18 | | kiska52 joins |
| 16:56:18 | | @ChanServ sets mode: +o Fusl |
| 16:56:18 | | fetcher joins |
| 16:56:19 | | phillipsjk joins |
| 16:56:19 | | archiveDrill joins |
| 16:56:19 | | xkey (xkey) joins |
| 16:56:19 | | Stagnant (Stagnant) joins |
| 16:56:19 | | khaoohs joins |
| 16:56:20 | | wessel1512 joins |
| 16:56:20 | | ThreeHM (ThreeHeadedMonkey) joins |
| 16:56:20 | | lexikiq joins |
| 16:56:20 | | h|ca2 (h) joins |
| 16:56:20 | | G4te_Keep3r34924156 joins |
| 16:56:20 | | nickofnicks (nickofnicks) joins |
| 16:56:21 | | skyrocket joins |
| 16:56:21 | | hexagonwin (hexagonwin) joins |
| 16:56:21 | | bilboed0 joins |
| 16:56:21 | | pokechu22 (pokechu22) joins |
| 16:56:22 | | Island joins |
| 16:56:23 | | Current users: Island, pokechu22 (pokechu22), bilboed0, hexagonwin (hexagonwin), skyrocket, nickofnicks (nickofnicks), G4te_Keep3r34924156, h|ca2 (h), lexikiq, ThreeHM (ThreeHeadedMonkey), wessel1512, khaoohs, Stagnant (Stagnant), xkey (xkey), archiveDrill, phillipsjk, fetcher, kiska52, benjins3_, Soulflare, michaelblob, sepro (sepro), monika (boom), skankhunt42, @Fusl (Fusl), Pedrosso, unknownsrc (unknownsrc), Barto (Barto), PredatorIWD25, notSokar, wyatt8740, Lord_Nightmare (Lord_Nightmare), s-crypt (s-crypt), lukash984, TunaLobster, d10n, maxfan8_ (maxfan8), FalconK (FalconK), yasomi (yasomi), igloo22225 (igloo22225), inedia (inedia), ax (ax), jinn6 (jinn6), h2ibot (h2ibot), atirclog (atirclog), trix (trix), BluRaf (BluRaf), fluke, luckcolors (luckcolors), ^ (^), Dalek (Dalek), PC (PC), ice, Radzig, ducky (ducky), peaches, chaoticbee (chaoticbee), krush, Shjosan (Shjosan), Kenshin (Kenshin), APOLLO03, szczot3k (szczot3k), Ryz2 (Ryz), andrewnyr, chunkynutz60, matoro, tmg1|michelson, ConstantK, ichdasich, summerisle (summerisle), phuzion (phuzion), anarcat (anarcat), cptcobalt, russss (russss), magmaus3 (magmaus3), pie_ (pie_), dan-, oxtyped, flotwig, pixeldesu (pixeldesu), atweedie, kf (kf), w0rm (w0rm), Bleo1826007227196234552220, n9nes, sec^nd (second), BlankEclair (BlankEclair), LddPotato (LddPotato), SootBector (SootBector), rohvani, nicolas17 (nicolas17), atphoenix__ (atphoenix), UwU, Dj-Wawa (Dj-Wawa), Hackerpcs (Hackerpcs), fionera (Fionera), BornOn420 (BornOn420), mgrytbak, TheEnbyperor (TheEnbyperor), BennyOtt (BennyOtt), fireatseaparks (fireatseaparks), fangfufu (fangfufu), roverinexile, pabs (pabs), lunik1, Goofybally (Goofybally), Deewiant (Deewiant), Karlett (Karlett), beastbg8 (beastbg8), tubgoat, DopefishJustin (DopefishJustin), Max_G, jspiros (jspiros), allani, nyakase (nyakase), arch (arch), beardicus (beardicus), Sidpatchy (Sidpatchy), qxtal (qxtal), Webuser995648, aninternettroll (aninternettroll), M--mlv|m, legoktm, miksters|m, 
IceCodeNew|m, mat|m1, Joy|m, ampdot|m, will|m, username675f|m, saouroun|m, Passiing|m, gareth48|m, yarnover|m, Claire|m, Tyrasuki|m, Valkum|m, yetanotherarchiver|m, Misty|m, hillow596|m, rain|m, Cronfox|m, noxious, PhoHale|m, mister_x, kaz__|m, ram|m, Fijxu|m, Video, starg2|m, mind_combatant (mind_combatant), Exorcism (exorcism), osiride|m, justauser|m (justauser|m), its_notjack (its_notjack), Alienmaster|m, e2mau|m, katia|m, tomodachi94 (tomodachi94), anon00001|m, mikolaj|m, Adamvoltagex|m, supermariofan67|m, vics, Tom|m1, iCesenberk|m, Roki_100|m, jevinskie, CrispyAlice2, Fletcher (Fletcher), lasdkfj|m, masterx244|m (masterx244|m), audrooku|m, britmob|m, nightpool (nightpool), ragu|m, madpro|m, x9fff00 (x9fff00), jackt1365|m, schwarzkatz|m, haha-whered-it-go|m, nstrom|m, theblazehen|m, andrewvieyra|m, hlgs|m, Ruk8 (Ruk8), pannekoek11|m, joepie91|m, ax|m, victor_vaughn|m, th3z0l4|m, Hans5958 (Hans5958), hexagonwin|m, tech234a (tech234a), triplecamera|m, spearcat|m, cruller, nyuuzyou, upperbody321|m, l0rd_enki|m, nano412510 (nano412510), GhostIsBeHere|m, aaq|m, v1cs, bogsen (bogsen), that_lurker|m, Nulo|m, s-crypt|m|m, gamer191-1|m, qyxojzh|m, octylFractal|m, EvanBoehs|m, flashfire42|m (flashfire42), alexshpilkin, coro, phaeton (phaeton), Vokun (Vokun), noobirc|m, cmostracker|m, trumad|m, nosamu|m, Cydog|m, jwoglom|m, MaxG, superusercode, vexr, wrangle|m, moe-a-m|m, yzqzss (yzqzss), Thibaultmol, finalti|m, Minkafighter|m, GRBaset (GRBaset), akaibu|m, NickS|m, igneousx (igneousx), Ajay, DigitalDragon, xxia|m, mpeter|m, thermospheric, MinePlayersPEMyNey|m, tech234a|m-backup (tech234a), @Sanqui|m (Sanqui), @rewby|m (rewby), twiswist (twiswist), nepeat (nepeat), ats (ats), TheoH7 (TheoH7), CYBERDEV, lumidify (lumidify), VerifiedJ (VerifiedJ), Medowar (Medowar), simon816 (simon816), fuzzy80211 (fuzzy80211), evergreen5, iPwnedYourIOTSmartdog, multisn8 (multisn8), HP_Archivist (HP_Archivist), hackbug (hackbug), driib97 (driib), datechnoman (datechnoman), steering 
(steering), pseudorizer (pseudorizer), tertu2 (tertu), ArchivalEfforts, croissant`, cm, lflare (lflare), Matthww, xarph, @arkiver (arkiver), midou, Ointment8862 (Ointment8862), barry, Guest, Cornelius (Cornelius), Doomaholic (Doomaholic), camrod6362 (camrod), MPThLee (MPThLee), ThetaDev, nukke (nukke), neggles (neggles), ScenarioPlanet (ScenarioPlanet), Coderjo_, Irenes (ireneista), f_ (funderscore), T31M, nimaje, tzt (tzt), DigitalDragons (DigitalDragons), Exorcism|irc (exorcism), Zachava (Zachava), endrift, itachi1706 (itachi1706), stepney141 (stepney141), Mateon1, jonte (jonte4), Dango360 (Dango360), monoxane (monoxane), opl (opl), @rewby (rewby), daxxy, Chris5010 (Chris5010), @imer (imer), ATinySpaceMarine, Jake (Jake), andrew (andrew), za3k, leo60228 (leo60228), knecht (knecht), IDK (IDK), cancername (cancername), superkuh, Church (Church), balrog (balrog), klea (jmjl), asie (asie), colla (colla), alexlehm (alexlehm), eggdrop (eggdrop), programmerq (programmerq), Yakov (Yakov), sknebel (sknebel), Arachnophine (Arachnophine), tuna (tuna), eythian, ivan (ivan), that_lurker (that_lurker), CraftByte (DragonSec|CraftByte), hexa- (hexa-), unlobito (unlobito), nothere, nulldata (nulldata), abirkill (abirkill), Czechball, bb010g (bb010g), raccoon (raccoon), maxfan8 (maxfan8), evan, OctopusET, thehedgeh0g (mrHedgehog0), shreyasminocha (shreyasminocha), alittleglitchy, entrox, DLoader (DLoader), night (night), cultpony (cultpony), graham9, Terbium, void09, runxiyu (runxiyu), justauser (justauser), Xesxen (Xesxen), kdy (kdy), bisector (bisector), @kaz (Kaz), qw3rty, catbottom, mattx433 (mattx433), @ChanServ, katia (katia), jodizzle (jodizzle), @JAA (JAA), erenrich, betamax (betamax), nyany (nyany), Jon (Jon), kallsyms, angenieux2 (angenieux), lea (lea_), @chfoo (chfoo), masterX244 (masterX244), apache2, kuroger (kuroger), @OrIdow6 (OrIdow6), lindowsME, revi (revi), billy549 (Billy549), GradientCat (GradientCat), Stargazers, mete, mikael, jonty (jonty), monohedron 
(monohedron), JSharp (JSharp), todb, Ctrl-S, th3ph3d, zifnab06, riking, murmur, thejsa, citty, mystique_altrosky (mystique_altrosky), siinus (siinus), @AlsoJAA (JAA), colona (colona), noodle-vrax, plcp, b3nzo, @HCross (HCross), @hook54321 (hook54321), [42] (N4Y), loopy, Muad-Dib, @rewby|backup (rewby), girst (girst), justcool393 (justcool393), kokos, mgrandi (mgrandi), Meroje (Meroje), ShadowJonathan (ShadowJonathan), PwnHoaX (PwnHoaX), efi (efi), Jonimus, kpcyrd (kpcyrd) |
| 16:56:23 | | TastyWiener95 (TastyWiener95) joins |
| 16:56:24 | | wickedplayer494 (wickedplayer494) joins |
| 16:56:24 | | dxrt joins |
| 16:56:24 | | nine joins |
| 16:56:25 | | Boppen (Boppen) joins |
| 16:56:25 | | BitByBit (BitByBit) joins |
| 16:56:26 | | chrismeller3 (chrismeller) joins |
| 16:56:26 | | kiska (kiska) joins |
| 16:56:26 | | dxrt is now authenticated as dxrt |
| 16:56:26 | | dxrt quits [Changing host] |
| 16:56:26 | | dxrt (dxrt) joins |
| 16:56:26 | | @ChanServ sets mode: +o dxrt |
| 16:56:27 | | ramsey (ramsey) joins |
| 16:56:30 | | Flashfire42 (flashfire42) joins |
| 16:56:31 | | devkev0 joins |
| 16:56:32 | | JTL (JTL) joins |
| 16:56:34 | | yano (yano) joins |
| 16:56:40 | | Doranwen (Doranwen) joins |
| 16:56:40 | | adamus1red (adamus1red) joins |
| 16:56:42 | | kline (kline) joins |
| 16:56:43 | | valdikss joins |
| 16:56:43 | | kansei (kansei) joins |
| 16:56:45 | | nine is now authenticated as nine |
| 16:56:45 | | nine quits [Changing host] |
| 16:56:45 | | nine (nine) joins |
| 16:56:50 | | mls (mls) joins |
| 16:56:54 | | TheTechRobo (TheTechRobo) joins |
| 16:57:00 | | lennier2_ joins |
| 16:57:04 | | Craigle (Craigle) joins |
| 16:57:09 | | sg72 joins |
| 16:57:09 | | Riku_V (riku) joins |
| 16:57:24 | | Sanqui joins |
| 16:57:25 | | Sluggs (Sluggs) joins |
| 16:57:28 | | fmeppo (fmeppo) joins |
| 16:57:28 | | Sanqui is now authenticated as Sanqui |
| 16:57:28 | | Sanqui quits [Changing host] |
| 16:57:28 | | Sanqui (Sanqui) joins |
| 16:57:28 | | @ChanServ sets mode: +o Sanqui |
| 16:57:28 | | z0ar5 (z0ar) joins |
| 16:57:29 | | _null (_null) joins |
| 16:57:41 | | BearFortress joins |
| 16:57:45 | | HugsNotDrugs joins |
| 16:57:48 | | jacksonchen666 (jacksonchen666) joins |
| 16:58:02 | | TheEnbyperor_ joins |
| 16:58:06 | | Cronfox (Cronfox) joins |
| 16:58:10 | | Suika joins |
| 16:58:13 | | petrichor (petrichor) joins |
| 16:59:40 | | pie_ quits [Client Quit] |
| 17:00:24 | | Juest (Juest) joins |
| 17:01:11 | | bladem (bladem) joins |
| 17:01:13 | | murb (murb) joins |
| 17:01:16 | | chrismrtn (chrismrtn) joins |
| 17:01:20 | | danwellby joins |
| 17:01:28 | | celestial joins |
| 17:01:34 | | Ryz (Ryz) joins |
| 17:04:17 | | @Fusl quits [Client Quit] |
| 17:04:21 | | crullerIRC joins |
| 17:04:25 | | Fusl (Fusl) joins |
| 17:04:25 | | @ChanServ sets mode: +o Fusl |
| 17:05:04 | | linuxgemini (linuxgemini) joins |
| 17:05:12 | | sensitiveParrot (sensitiveParrot) joins |
| 17:16:33 | | pie_ (pie_) joins |
| 17:43:40 | | Suika_ joins |
| 17:43:59 | | Suika quits [Ping timeout: 272 seconds] |
| 17:44:01 | | IDK quits [Quit: Connection closed for inactivity] |
| 17:44:17 | | Dada joins |
| 17:56:13 | <@arkiver> | got the thread pagination in finally |
| 18:19:55 | | UwU quits [Quit: bye] |
| 18:20:37 | | UwU joins |
| 18:22:50 | | anarcat quits [Client Quit] |
| 18:26:02 | <h2ibot> | KleaBot edited List of websites excluded from the Wayback Machine/Partial exclusions (+0, Reordered websites): https://wiki.archiveteam.org/?diff=60478&oldid=60462 |
| 18:29:05 | | IDK (IDK) joins |
| 18:37:39 | | UwU quits [Client Quit] |
| 18:38:16 | | UwU joins |
| 18:43:04 | <@arkiver> | historyhub-grab is up |
| 18:49:30 | | Shard111 (Shard) joins |
| 18:51:57 | | anarcat (anarcat) joins |
| 18:55:33 | | FiTheArchiver joins |
| 18:55:53 | | FiTheArchiver quits [Client Quit] |
| 18:56:02 | <@arkiver> | the historyhub project is up! |
| 18:56:09 | <@arkiver> | very high priority, can shut down any minute |
| 18:56:34 | <@imer> | arkiver: not getting any more items? |
| 18:58:55 | <@arkiver> | imer: not any? |
| 18:58:56 | <@imer> | 1=200 https://historyhub.history.gov/f/discussions/46462/dummy Lua runtime error: historyhub.lua:452: bad argument #1 to 'match' (string expected, got nil) |
| 18:59:06 | <@imer> | arkiver: just got one^ |
| 18:59:17 | <@imer> | discussion:10975 1=200 https://historyhub.history.gov/f/discussions/10975/dummy Lua runtime error: historyhub.lua:452: bad argument #1 to 'match' (string expected, got nil) |
| 18:59:37 | <@arkiver> | huh |
| 19:01:35 | <@arkiver> | imer: very odd, maybe banned? are you able to check? it should give a 301 |
| 19:02:27 | <@imer> | yep "Request unsuccessful. Incapsula incident ID: ..." |
| 19:02:56 | <@arkiver> | urgh :/ |
| 19:02:58 | <@arkiver> | in the HTML? |
| 19:03:00 | <@arkiver> | i'll add a check for it |
| 19:03:15 | <pokechu22> | incapsula-- |
| 19:03:15 | <eggdrop> | [karma] 'incapsula' now has -45 karma! |
| 19:03:28 | <@imer> | there's a few variants apparently |
| 19:04:58 | <@imer> | (dm'd output) |
| 19:06:10 | <@imer> | it does work in browser, so maybe fingerprinting? |
| 19:08:09 | | UwU quits [Remote host closed the connection] |
| 19:09:15 | <@arkiver> | it could be |
| 19:10:03 | | UwU joins |
| 19:10:47 | <@arkiver> | added --secure-protocol=PFS |
| 19:10:55 | <@arkiver> | but it runs for me at least |
| 19:11:22 | <@imer> | some do work so it's probably not just tls fingerprinting? could just throw all the warriors at it and hope we get through |
| 19:12:20 | <@arkiver> | yeah i made it the default project |
| 19:12:29 | <@arkiver> | i see others finishing items now |
| 19:13:41 | | UwU quits [Client Quit] |
| 19:13:43 | <pokechu22> | I immediately got "https://historyhub.history.gov/f/discussions/6662/dummy" -> "You are banned. Sleeping 1800 seconds." (that one seems to redirect to a login page though, not sure if that's related or not) |
| 19:15:30 | | UwU joins |
| 19:15:34 | <IDK> | 1=200 https://historyhub.history.gov/f/discussions/20930/dummy |
| 19:15:34 | <IDK> | You are banned. Sleeping 1800 seconds. |
| 19:15:41 | <IDK> | that was quick |
| 19:15:53 | <unknownsrc> | seems to be country based |
| 19:16:01 | <unknownsrc> | my US vpses work fine, elsewhere doesn't |
| 19:16:16 | <unknownsrc> | and i got banned |
| 19:16:17 | <pokechu22> | I'm in the US on a residential connection |
| 19:17:30 | <@arkiver> | pokechu22: does it work in the browser? |
| 19:18:17 | <IDK> | arkiver: for me, it does work in browser first try, even if you disable js |
| 19:18:34 | <@arkiver> | IDK: alright i'll try to find a banned IP and test with that |
| 19:19:08 | <pokechu22> | For me, view-source:https://historyhub.history.gov/f/discussions/6662/dummy in browser immediately gets a challenge unless cookies are set |
| 19:19:59 | <pokechu22> | ... but https://historyhub.history.gov/f/discussions/6662/dummy in browser doesn't get a challenge? |
| 19:20:13 | <unknownsrc> | hetzner US seems banned |
| 19:20:19 | <tmg1|michelson> | 1=200 https://historyhub.history.gov/f/discussions/22119/dummy |
| 19:20:19 | <tmg1|michelson> | You are banned. Sleeping 1800 seconds. |
| 19:20:22 | <tmg1|michelson> | boo |
| 19:20:50 | <tmg1|michelson> | (vps iceland) |
| 19:20:51 | <pokechu22> | hmm, seems like it works every other request? |
| 19:21:35 | <tmg1|michelson> | (same thing canada residential isp) |
| 19:21:58 | <@arkiver> | pokechu22: got more info on that? |
| 19:22:36 | <@arkiver> | found a banned IP, will do some tests |
| 19:24:04 | <IDK> | yall have a not banned IP? :-) |
| 19:24:04 | | cyanbox joins |
| 19:24:08 | <pokechu22> | Looks like firefox devtools don't log view-source requests (I think that worked before though?), but it seems like the first load without cookies gives `<html>\n<head>\n<META NAME="robots" CONTENT="noindex,nofollow">\n<script src="/_Incapsula_Resource?SWJIYLWA=[REMOVED]">\n</script>\n<body>\n</body></html>\n` and also has set-cookie headers for visid_incap_3185430 and |
| 19:24:10 | <pokechu22> | incap_ses_362_3185430, and those two cookies are sufficient for later requests to work (even without JS) |
| 19:24:16 | <pokechu22> | might be better to discuss details in #UncleSamsArchive though? |
| 19:25:27 | <tmg1|michelson> | tried to set concurrency=1 and that seemed to get one or two successful responses but then ...banned |
| 19:25:37 | <tmg1|michelson> | (on another machine, residential canada isp) |
| 19:26:21 | <@arkiver> | pokechu22: when viewing the website, a little something is POSTed back for the CAPTCHA, that allows one to access it on a simple next try |
| 19:27:35 | <IDK> | zenlayer hk straight up throws 403 :-) |
| 19:27:36 | <IDK> | 1=403 https://historyhub.history.gov/f/discussions/32057/dummy |
| 19:27:44 | <IDK> | I guess thats the hard ban |
| 19:28:00 | <pokechu22> | I'm not seeing any POSTs, just 2 https://historyhub.history.gov/_Incapsula_Resource gets (plus a 3rd one as an image that gets rejected) |
| 19:30:48 | <unknownsrc> | seems like one item completes, and then ban |
| 19:30:49 | | pabs quits [Read error: Connection reset by peer] |
| 19:31:03 | <pokechu22> | ... OK, and if I keep deleting cookies or maybe if I block https://historyhub.history.gov/_Incapsula_Resource then I get the page with the actual captcha (which also has the "Request unsuccessful. Incapsula incident ID" message and does a POST). But I think the initial one which just loads /_Incapsula_Resource happens even if not banned |
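The response variants described so far (a plain 403, the "Incapsula incident ID" page, and the bare `/_Incapsula_Resource` script page) could be told apart with a check along these lines. It is a rough sketch based only on the strings quoted in the log; the function name `classify_response` and the four labels are made up for illustration, and this is not the check used in the actual grab script.

```python
def classify_response(status, body):
    """Roughly classify a response using the markers quoted in the log.

    Returns 'hard_ban', 'blocked', 'challenge', or 'ok'.
    """
    if status == 403:
        # Reported as the "hard ban" case.
        return "hard_ban"
    if "Incapsula incident ID" in body:
        # The "Request unsuccessful" page that carries the captcha.
        return "blocked"
    if "_Incapsula_Resource" in body:
        # Script-only page; reportedly served even when not banned.
        return "challenge"
    return "ok"
```

The order of the checks matters: the captcha page may also reference `/_Incapsula_Resource`, so the incident-ID marker is tested first.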
| 19:31:40 | | pabs (pabs) joins |
| 19:35:53 | <phillipsjk> | On History hub I get an immediate: "1=200 https://historyhub.history.gov/f/discussions/6409/dummy |
| 19:35:54 | <phillipsjk> | You are banned. Sleeping 1800 seconds. |
| 19:35:54 | <phillipsjk> | " message |
| 19:36:23 | <phillipsjk> | Looks like maybe a trap URL |
| 19:37:47 | <@arkiver> | not a trap URL |
| 19:37:56 | <@arkiver> | i'm working on a solution (maybe a crappy one, but it is what it is) |
| 19:38:50 | <phillipsjk> | Last day is apparently today (which I am sure you know) |
| 19:40:31 | | petrichor quits [Ping timeout: 272 seconds] |
| 19:44:48 | <@arkiver> | alright crappy solution incoming |
| 19:46:53 | <@arkiver> | good news is it seems to run well if cookies from browser are provided |
| 19:46:59 | <@arkiver> | also at concurrency 20 |
| 19:47:34 | <@arkiver> | new version is out... this will require some manual work |
| 19:47:49 | <@arkiver> | if you're banned it will ask you to provide cookies through an environment variable |
| 19:47:59 | <pokechu22> | How do you do that with the VM? |
| 19:48:41 | <tmg1|michelson> | or docker? |
| 19:48:52 | <@arkiver> | actually maybe not 20 |
| 19:48:59 | <@arkiver> | pokechu22: currently only docker :/ |
| 19:49:26 | | petrichor (petrichor) joins |
| 19:49:53 | <@arkiver> | i need to see about making some interactive part on the warrior that will ask for information |
| 19:50:09 | | anarcat quits [Client Quit] |
| 19:54:40 | <@arkiver> | yeah looks like a single set of cookies can be used for nearly any concurrency |
| 19:54:41 | <@imer> | god the one time a site has v6 it's making it *more* difficult to bypass |
| 19:55:18 | <@imer> | (can't use browser cookie since the ipv6 of the container is different..) |
| 20:00:06 | | UwU quits [Remote host closed the connection] |
| 20:00:09 | <IDK> | ah thats why its not working |
| 20:00:15 | <IDK> | lemme force v4 real quick |
| 20:00:43 | | UwU joins |
| 20:00:44 | <@arkiver> | sg72: working? |
| 20:04:01 | | cipherrot (petrichor) joins |
| 20:04:12 | | anarcat (anarcat) joins |
| 20:05:51 | | petrichor quits [Ping timeout: 272 seconds] |
| 20:05:56 | <@arkiver> | requeuing for historyhub is enabled, i shall be off now |
| 20:11:06 | <@arkiver> | should be done in a few hours, let's hope they don't kill it before then |
| 20:11:16 | <@arkiver> | there's also some blog entries, we're not getting those at the moment |
| 20:12:50 | <@imer> | i'm not having any luck with the cookies unfortunately |
| 20:13:00 | <IDK> | what command did yall use? im using docker run -d --name ContainerName --label=com.centurylinklabs.watchtower.enable=true --restart=unless-stopped -e HISTORYHUB_COOKIES='[Value]' atdr.meo.ws/archiveteam/historyhub-grab --concurrent 20 IDK |
| 20:13:06 | <IDK> | not working sadly |
| 20:13:56 | <@imer> | arkiver: that's "Cookie: $VALUE" from the browser request and $VALUE goes in the env, right? |
| 20:14:32 | <@arkiver> | imer: yes |
| 20:14:39 | <@arkiver> | but I'll go allow either |
| 20:15:17 | <IDK> | imer: try resolving historyhub.history.gov to 45.60.33.181 instead |
| 20:15:27 | <IDK> | or what your local quad9 resolves for you |
| 20:15:32 | <tmg1|michelson> | run-pipeline3: error: unrecognized arguments: -e HISTORYHUB_COOKIES |
| 20:16:25 | <IDK> | I had received my cookies from 45.60.37.181 before and it did not work |
| 20:18:09 | <@arkiver> | tmg1|michelson: it needs to go to the docker args, not the pipeline args |
| 20:18:26 | <@arkiver> | imer: update is out to allow both with "Cookie: " prepended and not |
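Accepting the cookie value both with and without a leading "Cookie: " could look roughly like the sketch below. The function name `normalize_cookies` and the case-insensitive match are assumptions made for illustration; this is not the code from the actual update.

```python
def normalize_cookies(value):
    """Strip an optional leading 'Cookie:' header name from the value."""
    value = value.strip()
    prefix = "cookie:"
    # Assumption: match the header name case-insensitively.
    if value.lower().startswith(prefix):
        value = value[len(prefix):].strip()
    return value
```

With this, a value pasted as the whole request header line and a value pasted as only the cookie pairs end up identical.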
| 20:18:32 | <@arkiver> | i'll really be off now though |
| 20:18:34 | <@imer> | no luck, also tried setting the same user agent. I need to head off for a bit |
| 20:19:20 | | UwU quits [Client Quit] |
| 20:19:28 | <@arkiver> | imer: thanks for trying though |
| 20:19:34 | <@arkiver> | looks like IDK figured it out too |
| 20:19:48 | <IDK> | yep, tho I also figured out running at 40 will get my cookie banned :-) |
| 20:19:58 | | UwU joins |
| 20:20:00 | <IDK> | 20 as well |
| 20:20:05 | <@arkiver> | the 403? |
| 20:20:24 | | @arkiver runs at 100 |
| 20:20:41 | <IDK> | yep |
| 20:20:56 | <@arkiver> | the 403 is unrelated to the cookie i think |
| 20:21:00 | <@arkiver> | a retry might work |
| 20:21:37 | <tmg1|michelson> | arkiver: that was with IDK 's docker command |
| 20:21:44 | <tmg1|michelson> | minus the label (what's the label do?) |
| 20:22:12 | <tmg1|michelson> | ie the -e literally went to docker ?? |
| 20:22:28 | <IDK> | did you include the ' ? |
| 20:22:33 | <tmg1|michelson> | yes |
| 20:22:50 | <IDK> | hm its working fine for me |
| 20:22:52 | <tmg1|michelson> | -e HISTORYHUB_COOKIES='AWSALB=nzp... |
| 20:22:58 | <@arkiver> | tmg1|michelson: i'd recommend `-e "HISTORYHUB_COOKIES=blabla"` (note the ") |
| 20:23:37 | <IDK> | https://blog.codinghorror.com/content/images/2025/05/works-on-my-machine-v2-2025-jon-galloway-1.png |
| 20:23:41 | <IDK> | as usual |
| 20:23:47 | <tmg1|michelson> | run-pipeline3: error: the following arguments are required: DOWNLOADER |
| 20:23:55 | <@arkiver> | fixed the message to be more clear |
| 20:28:28 | <tmg1|michelson> | -e "HISTORYHUB_COOKIES=AWSALB=nzpv... results in the same above message |
| 20:30:58 | <tmg1|michelson> | also tried -e "HISTORYHUB_COOKIES='AWSALB=nzpv... |
| 20:33:36 | <tmg1|michelson> | [looks like the centurylink watchtower thing is some kind of autoupdate functionality] |
| 20:34:01 | <tmg1|michelson> | IDK: i see hits on the tracker for you what did you change? |
| 20:35:35 | <tmg1|michelson> | nulldata: same for you? what command are you running? |
| 20:35:45 | | UwU quits [Client Quit] |
| 20:36:23 | | UwU joins |
| 20:39:04 | <nulldata> | I'm using a docker-compose file |
| 20:41:34 | <nulldata> | tmg1|michelson - Make sure you're using the entire "Cookie" value under Request Headers. It looks like you're using the "set-cookie" values from the response header. |
| 20:41:58 | <nulldata> | Should start with/contain "visid_incap_" |
| 20:45:45 | <tmg1|michelson> | under request headers i see accept: accept-encoding: accept-language: authorization code: connection: cookie: |
| 20:45:51 | <tmg1|michelson> | i am using the value from 'cookie' |
| 20:46:02 | <tmg1|michelson> | it doesn't start with visid_incap |
| 20:49:01 | <tmg1|michelson> | showing raw request headers shows a little more detail but fundamentally the same data (ie not visid_incap) |
| 20:49:57 | <nulldata> | https://tl.nulldata.foo/uploads/fdd7f36580ba5b5e/image.png |
| 20:50:33 | <tmg1|michelson> | wait, visid_incap is in there |
| 20:51:24 | <tmg1|michelson> | yeah my cookie doesn't start with visid_incap |
| 20:54:30 | <nulldata> | https://tl.nulldata.foo/uploads/3ceeab078bbac1f1/hhdockercompose.zip |
| 20:55:24 | <nulldata> | ^ my docker-compose file. Just open .env, fill your cookie in "HISTORYHUB_COOKIES=", and then run docker-compose up -d on that folder |
| 20:55:25 | <tmg1|michelson> | https://shitposter.world/notice/B3I3dkCJum3ZfjtPYe |
| 21:04:32 | <nulldata> | Or actually with newer docker the command should be "docker compose up -d" |
| 21:05:50 | | Webuser452607 joins |
| 21:06:13 | | Webuser452607 quits [Client Quit] |
| 21:06:40 | <tmg1|michelson> | i was able to get that running but it still says |
| 21:06:42 | <tmg1|michelson> | 1=200 https://historyhub.history.gov/f/discussions/19525/dummy |
| 21:06:42 | <tmg1|michelson> | You are banned. THE SOLUTION: |
| 21:07:24 | <tmg1|michelson> | it's showing that my environment variable is making its way in there: logs from inside the docker container show |
| 21:07:27 | <tmg1|michelson> | Using header Cookie: AWSALB=nzpv/VCFw |
| 21:10:34 | <nulldata> | I dunno. Maybe try getting to the site in a different browser/private tab and see if you get a different cookie to try. |
| 21:12:52 | <@JAA> | Shouldn't all of this be in #UncleSamsArchive? |
| 21:16:17 | <Guest> | is anyone else getting a default nginx page on archive.today? |
| 21:17:56 | <nulldata> | Guest - works here |
| 21:18:07 | <nulldata> | Try archive.is |
| 21:18:09 | <Yakov> | works for me as well |
| 21:18:17 | <Yakov> | .today redirects me to |
| 21:18:21 | <Yakov> | .ph |
| 21:22:40 | <@JAA> | Guest: IIRC, that's some sort of ban. |
| 21:23:39 | <Guest> | thats weird, this is on a residential ip (in the browser too) |
| 21:23:46 | <@JAA> | Also consider clearing your cookies for the domain. |
| 21:24:43 | <Guest> | and archive.is doesnt work either, it just hangs. requests for all of the subdomains hang for a while (until http timeout) after getting the nginx page. |
| 21:24:55 | <Guest> | and clearing cookies didnt work either :p |
| 21:26:00 | <Guest> | this is the url if anyone is interested: <http://archive.today/medium.com/@thequeryabhishk/the-json-performance-hack-that-every-go-developer-should-know-but-90-dont-b7de213c6d66> . archive.today switched to http when it shows the nginx page but i believe the site uses https. |
| 21:29:10 | <@JAA> | That's what I observed the last time as well: nginx page, then timeouts for a while. If you clear your cookies, it should work again after that ban expires. |
| 21:35:43 | <Guest> | thanks that worked |
| 21:35:48 | <phillipsjk> | I posted here because the wiki did not have a page telling me where to go. |
| 21:36:42 | <Guest> | for anyone else that might read from irclogs, you have to wait a little (after the timeouts), clear cookies for the site, and THEN visit the site. otherwise if you clear while on the site it could refresh and you are banned again. |
| 21:42:21 | <Yakov> | What does it take to get banned? because I've never got a default nginx page on archive.today before. |
| 21:45:08 | | ericgallager joins |
| 22:03:57 | | sec^nd quits [*.net *.split] |
| 22:03:57 | | SootBector quits [*.net *.split] |
| 22:04:18 | | sec^nd (second) joins |
| 22:05:06 | | SootBector (SootBector) joins |
| 22:06:18 | | nexussfan (nexussfan) joins |
| 22:06:41 | <Guest> | i had the same issue yesterday but im not sure. maybe someone else can answer. i didnt use archive.today for a whole week before this issue. after that (yesterday) i started getting the errors. |
| 22:25:59 | | etnguyen03 (etnguyen03) joins |
| 22:35:57 | | aninternettroll quits [Ping timeout: 272 seconds] |
| 22:46:09 | | aninternettroll (aninternettroll) joins |
| 22:56:40 | <nicolas17> | is historyhub the default project? if it needs manual intervention for cookies I think it shouldn't be... |
| 23:17:06 | <nicolas17> | https://bsky.app/profile/kendraserra.bsky.social/post/3merjyvt5322g |
| 23:17:30 | <nicolas17> | Ars Technica ran an article with a bunch of fake quotes and then took it down |
| 23:18:08 | <nicolas17> | it was archived via Save Page Now |
| 23:18:23 | <nicolas17> | but this could have been lost very easily |
| 23:19:09 | <nicolas17> | do we need something more systematic to grab news articles immediately on publication? |
| 23:20:53 | <@JAA> | #// might well have fetched it as well. We should be grabbing Ars Technica every 15 minutes there. |
| 23:21:07 | <nicolas17> | oh right that could be delayed |
| 23:21:39 | <nicolas17> | there are currently 3 captures, they are all from SPN, but I didn't take into account that // might have fetched it and it didn't get uploaded/indexed yet |
| 23:31:25 | | Dada quits [Remote host closed the connection] |
| 23:33:54 | | iPwnedYourIOTSmartdog6 joins |
| 23:36:07 | | iPwnedYourIOTSmartdog quits [Ping timeout: 272 seconds] |
| 23:36:08 | | iPwnedYourIOTSmartdog6 is now known as iPwnedYourIOTSmartdog |
| 23:38:03 | | Snivy (Snivy) joins |
| 23:52:51 | | Arcorann_ (Arcorann) joins |
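
Editor's sketch for the historyhub cookie thread (roughly 20:13 to 21:10): three points from the discussion were that `-e` is a docker option and so has to come before the image name, that the value may be pasted with or without a leading "Cookie: ", and that per nulldata the request-header cookie should contain "visid_incap_". The Python below is only an illustration of those points under those assumptions. The helper names (`normalise_cookie`, `looks_like_request_cookie`, `docker_argv`) are hypothetical and are not part of the grab project; the image name and `--concurrent` flag are copied from IDK's command, and the sample cookie values are made up.

```python
import shlex

# Image name as quoted in IDK's command in the log.
IMAGE = "atdr.meo.ws/archiveteam/historyhub-grab"


def normalise_cookie(raw: str) -> str:
    """Strip an optional leading 'Cookie:' so either pasted form is accepted."""
    value = raw.strip()
    if value.lower().startswith("cookie:"):
        value = value[len("cookie:"):].strip()
    return value


def looks_like_request_cookie(value: str) -> bool:
    """Heuristic from the log only: the request Cookie header should
    contain 'visid_incap_'; a bare AWSALB value likely came from set-cookie."""
    return "visid_incap_" in value


def docker_argv(cookie: str, downloader: str, concurrent: int = 1) -> list[str]:
    """Build an argv where -e sits before the image (docker options);
    everything after the image name is handed to the pipeline."""
    return [
        "docker", "run", "-d",
        "-e", f"HISTORYHUB_COOKIES={normalise_cookie(cookie)}",
        IMAGE,
        "--concurrent", str(concurrent),
        downloader,  # the required DOWNLOADER positional argument
    ]


if __name__ == "__main__":
    # Hypothetical cookie value, for illustration only.
    sample = "Cookie: visid_incap_123=abc; AWSALB=xyz"
    if not looks_like_request_cookie(normalise_cookie(sample)):
        print("warning: value does not look like the request Cookie header")
    print(shlex.join(docker_argv(sample, "YOURNICK", concurrent=2)))
```

Passing the argv as a list (for example to `subprocess.run`) sidesteps the shell-quoting questions raised in the thread, since the cookie value is never re-parsed by a shell.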