00:01:49<@JAA>→ #telegrab
00:02:20<nicolas17>JAA: I believe telegram was just an example
00:03:21<kiska>eightthree: For my grafana dashboard please limit yourself to "last 24 hours"
00:03:37<@JAA>Fair
00:03:51<nicolas17>yes, I used 30 days only to take that screenshot, definitely don't use long periods *and* auto-update
00:04:34<kiska>nicolas17: If you did that, I will end you
00:04:51<nicolas17>kiska: https://transfer.archivete.am/inline/5vyqw/screenshot.png was me
00:05:01<kiska>I saw it
00:05:26<kiska>Do you need a "todo" graph?
00:05:34<nicolas17>idk
00:05:51<nicolas17>I took the existing graph and set min=0
00:06:09<nicolas17>and then I realized I had to zoom out a lot to make the line not look entirely flat :D
00:06:22<fireonlive>last 365 days update every 30s?
00:06:49<kiska>I don't have data for that period + I will end you
00:06:58<fireonlive>is that a promise
00:07:01<fireonlive>:3
00:07:17<nicolas17>kiska: https://shipadick.com/collections/ship-a-brick/products/ship-a-brick-custom-message
00:07:26<nicolas17>get this delivered through his window
00:18:08Wohlstand (Wohlstand) joins
00:36:07qwertyasdfuiopghjkl quits [Remote host closed the connection]
00:45:15<h2ibot>Mewtwodestroyer edited List of websites excluded from the Wayback Machine (+19, added http://667u.com/. I can't believe dj…): https://wiki.archiveteam.org/?diff=51698&oldid=51697
00:57:59icedice (icedice) joins
00:58:05jasons (jasons) joins
01:00:18<h2ibot>JAABot edited List of websites excluded from the Wayback Machine (+0): https://wiki.archiveteam.org/?diff=51699&oldid=51698
01:15:20Wohlstand quits [Client Quit]
01:25:16qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins
01:45:38icedice quits [Client Quit]
01:57:50jasons quits [Ping timeout: 240 seconds]
02:18:15lya joins
02:18:46lya quits [Remote host closed the connection]
02:18:51lya joins
02:21:14lya quits [Remote host closed the connection]
02:40:58pabs quits [Remote host closed the connection]
02:50:50pabs (pabs) joins
03:01:50jasons (jasons) joins
03:17:29<fireonlive>dj rainbow ejaculation please :(
03:33:03<eightthree>fireonlive: not to mention that Rank (fowl) Sinatra smell
03:33:43<eightthree>+XD
03:34:07<fireonlive>remember, clean behind your foreskin!
03:35:02<eightthree>is AI used, and if not what is, used to remove "undesirable" material from the queue/archive? i.e. how is moderation achieved?
03:36:37<eightthree>https://wiki.archiveteam.org/index.php/Porn ok...
03:38:21<eightthree>https://wiki.archiveteam.org/index.php/Special:WhatLinksHere/Porn
03:38:21<eightthree>> No pages link to Porn.
03:38:21<eightthree>so that's why I never stumbled on this in the wiki...
03:40:42Ruthalas593 (Ruthalas) joins
03:41:03<fireonlive>i dunno if there's a list of them but there's probably a few pages that aren't linked to by anything
03:41:20<fireonlive>ah, https://wiki.archiveteam.org/index.php?title=Special:LonelyPages&limit=500&offset=0
03:41:26Ruthalas59 quits [Read error: Connection reset by peer]
03:41:27Ruthalas593 is now known as Ruthalas59
03:42:25<fireonlive>TIL we once did a press release: https://wiki.archiveteam.org/index.php/Archive_Team_press_releases
04:00:20jasons quits [Ping timeout: 240 seconds]
04:29:34<fireonlive>eightthree: unfortunately there is no way to only save gay porn
04:29:38<fireonlive>so we must just save it all
04:31:04<eightthree>fireonlive: my human advisors tell me it is recommended to laugh at this joke, hahaha
04:31:17<fireonlive>;)
04:44:08DogsRNice quits [Read error: Connection reset by peer]
05:03:54jasons (jasons) joins
05:50:20<h2ibot>Pokechu22 edited Jira (+180, issues.apache.org failed, will need to update…): https://wiki.archiveteam.org/?diff=51700&oldid=51687
06:01:19jasons quits [Ping timeout: 272 seconds]
06:03:51IRC2DC quits [Ping timeout: 272 seconds]
07:00:41BlueMaxima quits [Read error: Connection reset by peer]
07:03:38<h2ibot>Hina.K edited URLTeam (+43, /* Alive */): https://wiki.archiveteam.org/?diff=51701&oldid=51653
07:04:37jasons (jasons) joins
07:31:09Arcorann (Arcorann) joins
08:01:01jasons quits [Ping timeout: 272 seconds]
08:33:08Overlordz joins
09:04:17jasons (jasons) joins
09:15:56jacksonchen666 (jacksonchen666) joins
09:27:11Lixusboooo joins
09:27:22Lixusboooo quits [Remote host closed the connection]
09:36:21<jacksonchen666>re queer.af: IP address is 65.108.48.233, only DNS doesn't resolve anymore (https://tech.lgbt/@ShadowJonathan/111917612478836930). not updating deathwatch yet.
09:57:09SootBector quits [Remote host closed the connection]
09:57:32SootBector (SootBector) joins
10:00:06Bleo18260 quits [Client Quit]
10:01:30Bleo18260 joins
10:01:59jasons quits [Ping timeout: 272 seconds]
10:10:11Island quits [Read error: Connection reset by peer]
10:21:54jacksonchen666 quits [Client Quit]
10:26:32<@OrIdow6^2>Re logging, I believe I read that the old logger let you prevent a message from being logged by prefixing it with something, do we have that now?
10:26:37@OrIdow6^2 is now known as @OrIdow6
11:01:34<h2ibot>OrIdow6 edited Fediverse (+591, The history of why we don't archive it): https://wiki.archiveteam.org/?diff=51704&oldid=36059
11:01:35<h2ibot>OrIdow6 edited Fediverse (+8, Fix template usage): https://wiki.archiveteam.org/?diff=51705&oldid=51704
11:01:36<h2ibot>OrIdow6 edited Fediverse (+15, Fix template usage, part 2): https://wiki.archiveteam.org/?diff=51706&oldid=51705
11:02:00igloo22225 quits [Quit: The Lounge - https://thelounge.chat]
11:02:25igloo22225 (igloo22225) joins
11:03:35<h2ibot>OrIdow6 edited Mastodon (+139, We don't archive it): https://wiki.archiveteam.org/?diff=51707&oldid=50243
11:04:09<thuban>OrIdow6: cf https://wiki.archiveteam.org/index.php?title=Mastodon&diff=prev&oldid=50243
11:04:45<@OrIdow6>thuban: Huh, thanks
11:05:18<@OrIdow6>Have we actually done this in the last year?
11:05:28jasons (jasons) joins
11:05:30<@OrIdow6>Or just talked about changing this policy
11:05:54<@OrIdow6>Mostly I feel some explanation is owed to the curious for why it's in place
11:06:01<thuban>i don't think so, it's kind of moot due to the technical issues
11:06:47<@OrIdow6>..... any compact description of the technical issues that could go up on the wiki? Or you can edit it yourself
11:07:00<@OrIdow6>("....." not meant to be mocking)
11:08:22<thuban>current mastodon doesn't work without js, so archivebot can't handle it, and we don't have a mastodon project (in part due to the historical policy, in part because mastodon devs have made vague noises about fixing it, and in part for the usual reasons)
11:08:33<@OrIdow6>Ack, will update
11:08:39<@OrIdow6>Thanks
11:10:23<thuban>(iirc you can get individual toots as embeds, but that's quite limited obviously)
11:20:48<aninternettroll>can't you use the API?
11:21:19<aninternettroll>i guess it would require more work though
11:23:07<thuban>right, that would be 'a mastodon project', which we haven't done
11:52:45<h2ibot>OrIdow6 edited Mastodon (-10, No longer a policy, apparently): https://wiki.archiveteam.org/?diff=51711&oldid=51707
12:01:50jasons quits [Ping timeout: 240 seconds]
12:26:52<h2ibot>OrIdow6 edited Fediverse (+396, /* ArchiveTeam and the Fediverse */ Not banned,…): https://wiki.archiveteam.org/?diff=51712&oldid=51706
12:37:09Arcorann quits [Ping timeout: 272 seconds]
12:43:01Wohlstand (Wohlstand) joins
13:05:35jasons (jasons) joins
13:06:08ScenarioPlanet (ScenarioPlanet) joins
13:53:24ell1 (ell) joins
13:54:45ell1 quits [Client Quit]
13:55:08ell1 (ell) joins
14:03:50jasons quits [Ping timeout: 240 seconds]
14:16:50^ quits [Ping timeout: 240 seconds]
14:17:25^ (^) joins
14:35:06SootBector quits [Remote host closed the connection]
14:37:00SootBector (SootBector) joins
14:42:33SootBector quits [Remote host closed the connection]
14:43:00SootBector (SootBector) joins
15:06:05Wohlstand quits [Remote host closed the connection]
15:07:42jasons (jasons) joins
15:30:03sdomi quits [Ping timeout: 272 seconds]
16:05:31jasons quits [Ping timeout: 272 seconds]
16:23:19qwertyasdfuiopghjkl quits [Remote host closed the connection]
16:33:40Darken (Darken) joins
16:38:12<aninternettroll>How would one start a project? Let's say that I wanted to archive mastodon, what are the requirements?
16:44:59Ketchup901 quits [Remote host closed the connection]
16:48:17Ketchup901 (Ketchup901) joins
16:52:48that_lurker quits [Quit: I am most likely running a system update]
16:54:01<thuban>aninternettroll: read and understand the code of prior dpos projects (https://wiki.archiveteam.org/index.php?title=Category:DPoS_project), then write your own with an appropriate pipeline.py (receive items from tracker, invoke wget) and <project>.lua (process response data and make additional requests as appropriate
16:54:03<thuban>https://github.com/ArchiveTeam/wget-lua/wiki#wget-with-lua-hooks)
16:54:13<thuban>this isn't very legible at present, sorry
16:55:18that_lurker (that_lurker) joins
17:03:14zhongfu_ (zhongfu) joins
17:03:32zhongfu quits [Read error: Connection reset by peer]
17:08:22jasons (jasons) joins
17:12:01<aninternettroll>Can I get a bit of a TL;DR? What more is there than to get a giant list of URLs and ask interested people to archive them?
17:14:13icedice (icedice) joins
17:23:57icedice quits [Client Quit]
17:32:21icedice (icedice) joins
17:36:24<thuban>aninternettroll: most sites that can be handled that way go through archivebot. a dpos is really only necessary when (1) there are excessively strict rate limits (rare) or (2) it's non-trivial to generate the list of urls (common).
17:36:34<thuban>a conceptually simple 'item' like 'a user' or 'a post' involves potentially many web pages (user info, pages of comments), page assets (images, javascript), and api interactions (more javascript), all of which have to be retrieved for the item to render properly in the wayback machine.
17:36:39<thuban>it's often not possible to deduce those assets and api calls in advance; they have to be generated dynamically based on what we find in the page data.
17:36:47<thuban>in addition, many sites don't have a public index of all items. we can't simply get a giant list of all users or all posts; we have to start with what we can find and dynamically 'discover' new items through features like authorship (post->user), timelines (user->post), replies (post->post), and mentions (user->user).
17:37:46<thuban>the site-specific project code is what does this dynamic generation. does that answer your question?
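The dynamic discovery thuban describes can be sketched as a breadth-first walk over items. This is only an illustration, not actual ArchiveTeam project code: the `FAKE_SITE` table, the `fetch_links` helper and the `user:`/`post:` item names are all hypothetical, and a real DPoS project would do this work in its pipeline.py and <project>.lua against live responses.

```python
from collections import deque

# Hypothetical in-memory "site": each item maps to the items that its
# page data reveals (authorship, timelines, replies, mentions).
FAKE_SITE = {
    "user:alice": ["post:1", "post:2"],    # timeline (user->post)
    "post:1": ["user:alice", "user:bob"],  # authorship plus a mention
    "post:2": ["user:alice", "post:1"],    # authorship plus a reply (post->post)
    "user:bob": ["post:3"],
    "post:3": ["user:bob"],
}


def fetch_links(item):
    """Stand-in for retrieving an item and parsing its response data."""
    return FAKE_SITE.get(item, [])


def discover(seeds):
    """Start from the items we know about and queue every new item we find."""
    seen = set(seeds)
    queue = deque(seeds)
    order = []
    while queue:
        item = queue.popleft()
        order.append(item)
        for found in fetch_links(item):
            if found not in seen:  # never queue the same item twice
                seen.add(found)
                queue.append(found)
    return order


if __name__ == "__main__":
    print(discover(["user:alice"]))
```

Starting from a single known user, the walk reaches items that no public index would have listed, which is the point thuban makes about sites without a full listing.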
17:41:16<aninternettroll>What does DPoS stand for?
17:44:47<thuban>"Distributed Preservation of Service". https://wiki.archiveteam.org/index.php/DPoS
17:47:33<aninternettroll>So what would such a DPoS script do for let's say mastodon? Is its goal to find all the relevant links to be archived to make the page load? Could the script just go to the relevant API endpoints and call it a day?
17:49:15<aninternettroll>Anyway, thanks! I got the answer I needed at least (if you have a big list of URLs go to #archivebot)
17:49:18<imer>if you want wayback machine playback on archive.org it needs to visit everything a normal page load needs, and then you usually grab any extra info or legacy pages
17:49:33<imer>... that might be linked to*
17:51:11<aninternettroll>How much does archivebot handle? I would assume it can get the pages a page links to (<a href="" />), but I guess it doesn't run javascript to fetch API calls?
17:51:45<imer>yes
17:53:11Darken quits [Ping timeout: 272 seconds]
17:58:52qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins
18:03:57jasons quits [Ping timeout: 272 seconds]
18:04:07<fireonlive>mastodon is broken in AB (and i guess WBM?) now due to the way it is atm
18:04:27<fireonlive>/embed after a single post url works though
18:04:41<aninternettroll>"due to the way it is"? Anything in particular?
18:04:49<fireonlive>but the whole you need js/it's a single page app(?) thing ruins it i believe
18:04:59<fireonlive>started with version 4
18:05:31<fireonlive>they also refused to add any kind of nojs fallback sadly
18:05:59<aninternettroll>I thought the wayback machine did run javascript as well
18:06:27<fireonlive>hmm might just be an AB issue
18:08:29<thuban>wbm does run javascript, but the javascript and anything it requests have to be in the wbm
18:08:51<thuban>archivebot does not run javascript and therefore can't make all the relevant api requests
18:10:17<fireonlive>sheesh the wayback machine is in its dialup era today
18:10:19pseudorizer quits [Quit: ZNC 1.8.2 - https://znc.in]
18:10:20<fireonlive>https://web.archive.org/web/20240212180822/https://infosec.exchange/@radareorg/111919836786322370
18:10:23<fireonlive>just made this
18:10:44<fireonlive>it doesn’t appear to work correctly
18:10:59pseudorizer (pseudorizer) joins
18:11:34<fireonlive>if you toggle save a screenshot some otherwise broken sites work in screenshot only
18:12:47<katia>is there something like archivebot but with a browser?
18:13:59<pokechu22>There used to be chromebot but that's been disabled for a while due to it producing broken files (I don't entirely know the details)
18:14:40rktk quits [Read error: Connection reset by peer]
18:16:30<thuban>katia: different interface obviously, but that's how spn works (though spn is fucking slammed lately)
18:17:01<fireonlive>https://github.com/internetarchive/warcprox
18:17:17<fireonlive>https://github.com/internetarchive/brozzler
18:18:12rktk (rktk) joins
18:21:19icedice quits [Client Quit]
18:24:41<DigitalDragons>I wonder if other fediverse software (Sharkey, Pleroma) have the same problem as mastodon
18:26:26<fireonlive>hmm. if they’re js hellholes perhaps
18:26:58<@JAA>Pleroma has been a JS hell from the start. It was a design goal, I think.
18:28:47<katia>thuban, thinking more wrt. selfhosting
18:30:05Darken (Darken) joins
18:30:08<@JAA>katia: Yeah, look at brozzler.
18:30:28<katia>oh, nice, is this save page now? :D
18:30:32icedice (icedice) joins
18:30:51<@JAA>I believe it's the backend for SPN(2), but not entirely sure.
18:32:00<thuban>it is https://blog.archive.org/2019/10/23/the-wayback-machines-save-page-now-is-new-and-improved/
18:32:15<DigitalDragons>https://web.archive.org/web/20240212182740/https://social.digitaldragon.dev/notes/9p1my38vep7b0009 Sharkey (and by extension I assume its upstream Misskey) doesn't seem to work either
18:32:42<@JAA>Ah, hadn't heard of Sharkey before, but Misskey is also broken, yeah.
18:33:02<@JAA>Note that in some cases SPN will actually archive all relevant information, but it just won't play back correctly.
18:34:16<@JAA>The API response behind that Mastodon example above was actually archived: https://web.archive.org/web/20240212180825/https://infosec.exchange/api/v1/statuses/111919836786322370
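As a rough illustration of the URLs mentioned above, the sketch below derives the API and /embed addresses from a single post URL. The patterns are inferred only from the two infosec.exchange links in this log and from fireonlive's remark about /embed; the function name `related_urls` is hypothetical, and other instances or software may not follow the same layout.

```python
from urllib.parse import urlsplit


def related_urls(post_url):
    """Guess the API and embed URLs for a post URL like https://host/@user/<id>.

    Assumption: the last path segment is the numeric status id, as in the
    example links in the log.
    """
    parts = urlsplit(post_url)
    path = parts.path.rstrip("/")
    status_id = path.rsplit("/", 1)[-1]
    if not status_id.isdigit():
        raise ValueError(f"no numeric status id in {post_url!r}")
    base = f"{parts.scheme}://{parts.netloc}"
    return {
        "api": f"{base}/api/v1/statuses/{status_id}",
        "embed": f"{base}{path}/embed",
    }


if __name__ == "__main__":
    urls = related_urls("https://infosec.exchange/@radareorg/111919836786322370")
    print(urls["api"])
    print(urls["embed"])
```

This only builds the URLs; whether archiving them is enough for playback is exactly the open question in the discussion.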
19:00:11icedice quits [Client Quit]
19:00:16jacksonchen666 (jacksonchen666) joins
19:01:18icedice (icedice) joins
19:07:08jasons (jasons) joins
19:14:28Island joins
19:41:10jacksonchen666 is now known as RJHacker58248
19:41:12RJHacker58248 quits [Ping timeout: 255 seconds]
19:41:14jacksonchen666 (jacksonchen666) joins
19:41:40<fireonlive>oh great
19:42:36<h2ibot>JacksonChen666 edited Deathwatch (+230, update queer.af: date of shutdown is probably…): https://wiki.archiveteam.org/?diff=51715&oldid=51689
19:45:15anarcat quits [Quit: rebooting]
19:45:55anarcat (anarcat) joins
20:07:27jasons quits [Ping timeout: 272 seconds]
20:38:16JustMeCorne quits [Remote host closed the connection]
20:50:47BlueMaxima joins
20:58:58<h2ibot>Pedrosso edited Swedish Public TV (+302, /* SVT Play */ clarified vital signs): https://wiki.archiveteam.org/?diff=51717&oldid=51638
21:10:19jasons (jasons) joins
21:13:00<h2ibot>Barto edited Votes in Switzerland/2024-03-03 (+821, Add Wallis / Valais): https://wiki.archiveteam.org/?diff=51718&oldid=51695
21:13:04<Barto>JAA: ^
21:13:10<@JAA>:-)
21:13:31<Barto>got some 403 in AB though, which is a shame
21:22:02anarchat (anarcat) joins
21:22:12anarcat quits [Client Quit]
21:41:11wyatt8740 quits [Ping timeout: 272 seconds]
21:42:29wyatt8740 joins
21:53:03jacksonchen666 quits [Client Quit]
21:57:39Dango360_ quits [Ping timeout: 272 seconds]
22:00:54ThetaDev quits [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
22:01:01ThetaDev joins
22:05:19Naruyoko quits [Quit: Leaving]
22:05:50jasons quits [Ping timeout: 240 seconds]
22:35:14sm joins
22:35:18Dango360 (Dango360) joins
22:35:24sm quits [Remote host closed the connection]
22:45:54anarchat quits [Read error: Connection reset by peer]
22:46:04anarcat (anarcat) joins
23:05:24<h2ibot>Barto edited Votes in Switzerland/2024-03-03 (+886, Add Fribourg / Freiburg): https://wiki.archiveteam.org/?diff=51719&oldid=51718
23:05:52<Barto>:-)
23:08:16Wohlstand (Wohlstand) joins
23:09:27jasons (jasons) joins
23:18:33Ketchup901 quits [Ping timeout: 255 seconds]
23:18:52Ketchup901 (Ketchup901) joins
23:20:26<fireonlive>:3
23:21:49Ketchup901 quits [Remote host closed the connection]
23:22:01Ketchup901 (Ketchup901) joins
23:52:02Ketchup901 quits [Remote host closed the connection]
23:52:20Ketchup901 (Ketchup901) joins