00:08:36bilboed0 quits [Ping timeout: 256 seconds]
00:09:42<@arkiver>cruller: datechnoman created this list, what do you think? https://transfer.archivete.am/pKymz/extracted_kinet-tv.ne.jp_urls.txt
00:09:43<eggdrop>inline (for browser viewing): https://transfer.archivete.am/inline/pKymz/extracted_kinet-tv.ne.jp_urls.txt
00:16:03bilboed0 joins
00:37:02<h2ibot>Klea edited Paste hosting (+28, Move sites to dead (note: couldn't check all…): https://wiki.archiveteam.org/?diff=57865&oldid=57862
00:37:03<h2ibot>Brad edited Deathwatch (+235, Added Monitor Plus): https://wiki.archiveteam.org/?diff=57866&oldid=57861
00:39:02<h2ibot>Arkiver uploaded File:Jjang0u-icon.png: https://wiki.archiveteam.org/?title=File%3AJjang0u-icon.png
00:51:15<cruller>arkiver: URLs containing "hatena.ne.jp" belong to external services, so they should be excluded from crawl seeding. I also checked hatena and didn't get any new links on kinet-tv.ne.jp. All other URLs are acceptable.
01:00:45<cruller>However, in general, Hatena Antenna (https://a.hatena.ne.jp/) and Hatena Bookmark (https://b.hatena.ne.jp/) are useful for creating such lists. The former is a page update checking service, while the latter is a social bookmarking service, both of which are long-established.
01:14:09etnguyen03 (etnguyen03) joins
01:19:23Guest58 joins
01:23:41Cuphead2527480 quits [Client Quit]
01:28:14<h2ibot>PaulWise edited Paste hosting (+372, fix some formatting): https://wiki.archiveteam.org/?diff=57868&oldid=57865
01:29:38Sanqui_ joins
01:30:40@Sanqui quits [Read error: Connection reset by peer]
01:33:42TastyWiener95 quits [Quit: So long, farewell, auf wiedersehen, good night]
02:01:45Island joins
02:59:40Barto quits [Quit: WeeChat 4.7.1]
02:59:51Barto (Barto) joins
03:14:03Guest58 quits [Client Quit]
03:19:31Guest58 joins
03:20:33etnguyen03 quits [Client Quit]
03:21:50BennyOtt quits [Ping timeout: 256 seconds]
03:22:06Guest58 quits [Client Quit]
03:27:20etnguyen03 (etnguyen03) joins
03:35:55bilboed0 quits [Quit: The Lounge - https://thelounge.chat]
03:40:27bilboed0 joins
03:42:13PredatorIWD251 joins
03:44:13PredatorIWD25 quits [Ping timeout: 272 seconds]
03:44:13PredatorIWD251 is now known as PredatorIWD25
03:48:20etnguyen03 quits [Remote host closed the connection]
04:04:42Guest58 joins
04:18:35Island quits [Read error: Connection reset by peer]
04:29:46Wohlstand quits [Quit: Wohlstand]
04:46:38<ericgallager>I'm getting "SPN internal proxy error." when trying to save this URL in the WBM: https://www.dropsitenews.com/p/jeffrey-epstein-aided-alan-dershowitz-mearsheimer-walt-israel-lobby
04:47:08<@JAA>#internetarchive
04:48:44<h2ibot>Cooljeanius edited Paste hosting (+1, fix link): https://wiki.archiveteam.org/?diff=57869&oldid=57868
04:56:55cyanbox joins
05:36:57Guest58 quits [Ping timeout: 272 seconds]
05:53:05Guest58 joins
06:00:29DogsRNice quits [Read error: Connection reset by peer]
06:44:03thalia quits [Quit: Connection closed for inactivity]
06:45:05BennyOtt (BennyOtt) joins
06:58:24Guest58 quits [Client Quit]
07:14:05Guest58 joins
07:28:25Guest58 quits [Ping timeout: 272 seconds]
07:33:35Commander001 quits [Read error: Connection reset by peer]
07:34:19Commander001 joins
07:35:05Ointment8862 (Ointment8862) joins
08:12:13Commander001 quits [Client Quit]
08:12:22Commander001 joins
08:36:02gosc joins
08:57:56choochaa quits [Remote host closed the connection]
08:58:25choochaa (choochaa) joins
08:59:13Guest58 joins
09:01:50stepney141 quits [Ping timeout: 256 seconds]
09:23:32gosc_1 joins
09:26:12gosc quits [Ping timeout: 256 seconds]
10:00:14DogDisco joins
10:08:50TastyWiener95 (TastyWiener95) joins
10:37:31Webuser109518 joins
10:37:42Webuser109518 quits [Client Quit]
10:43:29gosc_1 quits [Ping timeout: 272 seconds]
11:09:24<DogDisco>Hi, I wanted to ask if #archivebot would be the appropriate tool for archiving Wordpress blogs (I didn't spot a Wordpress specific project). Also if things like Manga Translator blogs are too minor + non-urgent to use resources there. I was thinking of archiving (https://yonkomascans.wordpress.com) since it's old, dead, and has most of its pages missing from Archive.org. Thanks for the help!
11:28:16gosc joins
11:28:56<gosc>hey, I'm wondering if anyone here archived the Simpsons Tapped Out files? I noticed they were already on wayback with the archiveteam collection
11:29:27<gosc>just wanted to ask if archiveteam really got it all; I found a python tool that could download all the game's files, was that used in saving it or something?
11:50:26linuxgemini quits [Quit: Ping timeout (120 seconds)]
11:50:57linuxgemini (linuxgemini) joins
12:00:02Bleo182600722719623455222 quits [Quit: The Lounge - https://thelounge.chat]
12:02:48Bleo182600722719623455222 joins
12:25:22<Dango360>gosc: i did the list for the files; i got as much as i could from the current dlcindexes https://archive.fart.website/archivebot/viewer/job/20240930014128c9p6a
12:25:46<Dango360>(well, at the time; 30th september 2024)
12:26:08khaoohs quits [Read error: Connection reset by peer]
12:26:10atphoenix__ quits [Read error: Connection reset by peer]
12:26:41khaoohs joins
12:26:57atphoenix__ (atphoenix) joins
12:27:25<gosc>Dango360, I see, thanks
12:29:42gosc quits [Client Quit]
12:32:38sg72 quits [Ping timeout: 256 seconds]
12:34:52sg72 joins
13:02:59Wohlstand (Wohlstand) joins
13:11:03Doranwen quits [Ping timeout: 272 seconds]
13:17:27Doranwen (Doranwen) joins
13:21:19cyanbox quits [Read error: Connection reset by peer]
13:30:12SootBector quits [Remote host closed the connection]
13:30:12choochaa quits [Remote host closed the connection]
13:30:33choochaa (choochaa) joins
13:31:24SootBector (SootBector) joins
13:34:52Boppen_ joins
13:38:17Boppen quits [Ping timeout: 272 seconds]
13:40:05Ointment8862 quits [Quit: Lost terminal]
14:20:18HP_Archivist quits [Ping timeout: 256 seconds]
14:22:00HP_Archivist (HP_Archivist) joins
14:35:57cmlow5 joins
14:37:28beardicus quits [Quit: bye]
14:37:52cmlow quits [Ping timeout: 256 seconds]
14:37:52cmlow5 is now known as cmlow
14:37:55Guest58 quits [Quit: My Mac has gone to sleep. ZZZzzz…]
14:38:18Guest58 joins
14:38:37Guest58 quits [Client Quit]
14:39:32Guest58 joins
15:00:32Mateon1 quits [Ping timeout: 256 seconds]
15:02:25Webuser891228 joins
15:03:07Webuser891228 quits [Client Quit]
15:04:27Mateon1 joins
15:23:12Jake quits [Ping timeout: 256 seconds]
15:23:37Jake (Jake) joins
15:29:43beardicus (beardicus) joins
15:34:06beardicus quits [Client Quit]
15:37:21Wohlstand quits [Client Quit]
15:37:36Wohlstand (Wohlstand) joins
15:37:57Hackerpcs quits [Quit: Hackerpcs]
15:39:04beardicus (beardicus) joins
15:45:06Jake quits [Client Quit]
15:45:34Jake (Jake) joins
15:48:16NF885 (NF885) joins
15:49:28Hackerpcs (Hackerpcs) joins
15:50:47<dendory>Not sure if it's been reported yet, I don't see them on the deathwatch page: https://goodenough.us/blog/2025-11-03-we-re-shutting-down-yay-boo-and-ponder/
15:52:09<justauser|m>Ouch. This is serious.
15:56:56<@arkiver>hi checking
15:57:45<@arkiver>does ponder.us have anything public? quick google search `site:ponder.us` says no
15:58:07<@arkiver>yay.boo definitely has
15:58:48<justauser|m>Link at the bottom of yay.boo is a random site.
15:59:11<justauser|m>"Surprise me!", new on each reload.
15:59:33archiveDrill quits [Quit: The Lounge - https://thelounge.chat]
16:00:36archiveDrill joins
16:00:51<@arkiver>we might go and click that a ton of times :P
16:00:57<@arkiver>looking into yay.boo further
16:00:59archiveDrill quits [Read error: Connection reset by peer]
16:02:55<justauser|m>Some pages are JS-heavy, may need mnbot.
16:03:35<justauser|m>https://ta.yay.boo/ in particular selects resources based on RNG.
16:04:03<justauser|m>Fortunately not for the real content, so not archiving that wouldn't be too bad.
16:11:18Gadelhas56287378443 quits [Quit: auf Wiedersehen]
16:12:03Gadelhas562873784438 joins
16:26:31<justauser|m>Color.io shutting down on 12-31: https://www.color.io/about
16:27:53<justauser|m>Looks like the most important part is PWA at https://app.color.io/
16:28:25<justauser|m>Those are supposed to have enough information to download fully and use online, but do we have a way to do so?
16:28:30<@arkiver>sending them a message about yay.boo
16:29:14<TheTechRobo>Funnily enough we do have a wiki page on those "random page" buttons: https://wiki.archiveteam.org/index.php/Collecting_items_randomly
16:29:44<justauser|m>s/online/offline
16:37:43Sluggs (Sluggs) joins
16:42:08<justauser|m>Plugged it into SPN hoping that it does TheRightThing(tm).
17:15:44stepney141 (stepney141) joins
17:19:57<h2ibot>Arkiver uploaded File:Jjong0u.com screenshot.png: https://wiki.archiveteam.org/?title=File%3AJjong0u.com%20screenshot.png
17:36:21<Vokun>I've been getting dangerous site warnings from transfer.archivete.am for the past few days. Did something change?
17:39:19NF885 quits [Client Quit]
17:45:01<h2ibot>Cooljeanius edited Deathwatch (+6, /* 2025 */ let's make a separate page for…): https://wiki.archiveteam.org/?diff=57871&oldid=57866
17:50:02<h2ibot>Cooljeanius created Launchpad (+206, Created page with "{{Infobox project | URL =…): https://wiki.archiveteam.org/?title=Launchpad
17:52:03<pokechu22>DogDisco: Yes, archivebot is appropriate for something like that, and there's currently a decent amount of capacity
17:52:21<pokechu22>Looks like Ryz already ran it for you
17:57:51Webuser696510 joins
17:58:03<h2ibot>Cooljeanius created Dropbox (+314, Created page with "{{Infobox project | URL =…): https://wiki.archiveteam.org/?title=Dropbox
17:59:05<DogDisco>pokechu22 oh that's great! I totally missed it. It must have happened while I was away or I was too unobservant.
17:59:15<Webuser696510>Hi, is it possible to archive http://www.wangfuk.org/? This is a website of the owners of the big apartment fire in HK right now that has killed 36+ ppl. Want to preserve any evidence before they start deleting things.
17:59:15<Webuser696510>Ref; https://www.bbc.co.uk/news/live/c2emg1kj1klt
17:59:18<DogDisco>Ryz thank you so much!
18:00:35<pokechu22>Webuser696510: archivebot job started
18:00:44<Webuser696510>Many thanks!
18:00:45NF885 (NF885) joins
18:01:23SootBector quits [Remote host closed the connection]
18:02:29<Webuser696510>This backups to archive.org right? or archive.is or both?
18:03:09<justauser|m>To Archive.org only.
18:03:11<Webuser696510>ok
18:04:15<Ryz>You're welcome DogDisco, don't hesitate to suggest more websites and ideas in #archivebot
18:09:14<DogDisco>Ryz: Will do!
18:09:32SootBector (SootBector) joins
18:09:47Webuser936687 joins
18:10:03Webuser936687 quits [Client Quit]
18:22:16<pokechu22>Webuser696510: specifically, data should appear at https://archive.fart.website/archivebot/viewer/?q=wangfuk.org in a few hours, and it should be browsable on web.archive.org eventually (normally in a few days, but indexing on web.archive.org has been very slow lately so I don't know when it will show up there)
18:34:09<h2ibot>JustAnotherArchivist created 짱공유닷컴 (+392, Stub page): https://wiki.archiveteam.org/?title=%EC%A7%B1%EA%B3%B5%EC%9C%A0%EB%8B%B7%EC%BB%B4
18:35:00<@arkiver>thanks for creating that JAA
18:36:09<h2ibot>JustAnotherArchivist created Jjang0u.com (+29, Redirected page to [[짱공유닷컴]]): https://wiki.archiveteam.org/?title=Jjang0u.com
18:38:09<h2ibot>JustAnotherArchivist edited 짱공유닷컴 (+63, Add logo & image): https://wiki.archiveteam.org/?diff=57876&oldid=57874
18:38:37<@JAA>Vokun: Yes, something happened. We're on it.
18:39:33<pokechu22>JAA: would it be correct to say that the site is safe as long as you aren't downloading and running suspicious executables someone uploaded to it?
18:40:00<@JAA>pokechu22: Yes
18:40:01<@arkiver>Adobe Aero coming up
18:40:20<nicolas17>man, I remember Aero being teased in an Apple conference
18:43:10<h2ibot>Nicolas17v2 edited Mapillary (-3, data is now in Facebook's servers): https://wiki.archiveteam.org/?diff=57877&oldid=57858
18:43:11<h2ibot>Nicolas17v2 edited Mapillary (-1, fix blank space): https://wiki.archiveteam.org/?diff=57878&oldid=57877
18:43:23<@arkiver>we're mostly going through API data of Adobe Aero
18:55:22tek_dmn quits [Quit: ZNC - https://znc.in]
19:11:17NF885 quits [Client Quit]
19:21:12sec^nd quits [Remote host closed the connection]
19:21:34sec^nd (second) joins
19:26:01<Ryz>Still for some reason itching to solve those Captchas for fun, and wanna have that used for internet archiving purposes >#<;
19:33:18<Webuser696510>pokechu22 many thanks for your quick help
19:53:24tek_dmn (tek_dmn) joins
19:54:46<nulldata>Do we need a channel name for it? I propose #aeronaught lol
20:04:27<h2ibot>Cooljeanius edited Dropbox (+35, it's similar to [[Google Drive]]): https://wiki.archiveteam.org/?diff=57879&oldid=57873
20:05:44<@arkiver>nulldata: i like it
20:08:28<h2ibot>Cooljeanius edited Launchpad (+370, Add note about mailing lists): https://wiki.archiveteam.org/?diff=57880&oldid=57872
20:12:19<klea>it's ok to run more dashboard-repeater instances connecting to ws://archivebot.archivingyoursh.it/stream right?
20:12:26<klea>oh yes, im stupid
20:13:19<klea>Ryz: which captchas?
20:14:09twiswist quits [Quit: twiswist]
20:14:21twiswist (twiswist) joins
20:22:13<klea>could there be some kind of channel that's just a feed of new channels?
20:22:56<@JAA>Yo dawg
20:23:04<klea>?
20:23:16<nicolas17>Ryz: go get one https://data.nicolas17.xyz/samsung-grab/
20:23:52<@JAA>klea needs to work on her meme-fu: https://knowyourmeme.com/memes/xzibit-yo-dawg
20:24:18<@JAA>New project channels are always announced in #archiveteam .
20:24:49<klea>are they announced in some kind of machine parsable way?
20:25:03klea definitively doesn't want to just make some kind of trigger filter to join new AT related channels
20:25:23<nulldata>But what channel would we announce the new channel announcement channel in?
20:25:41<@JAA>No
20:26:05driib97 (driib) joins
20:26:38<klea>inb4 #archiveteam-new-channels with just channel names, separated by either spaces or ','
20:27:45<klea>JAA: that channel's secret :(
20:28:29<klea>aaa, samsung's evil, they make downloads slow on purpose
20:29:52<klea>im afraid to lose my task if i leave my computer now
20:30:46<nulldata>It's available on the secret ArchiveTeam Discord server
20:31:02<klea>im lazy to join discord
20:33:49<hexa->I thought y'all had a secret phpBB forum
20:34:30<steering>klea: you wouldnt be the first one :P
20:34:59<steering>(auto joining new channels)
20:35:08<klea>steering: you made something for that?
20:35:23klea wonders if someone wrote a weechat trigger for that
20:35:27<steering>not i
20:35:42<nulldata>No one tell hexa- about the ArchiveTeam MySpace page
20:35:44<steering>fireonlive made a... uhh... i think znc module i dunno
20:36:21<klea>oh
20:36:23<klea>i'm kind of dumb
20:36:38<steering>class chansnipe(znc.Module):
20:36:38<steering> description = "joins a channel if one is mentioned.. :3"
20:36:39<klea>i should just make a irc bot thingy, and attach it to my soju account
20:36:47<klea>steering: may you share?
20:36:57<steering>yes thats definitely easier than a weechat script xP
20:37:05<klea>im mostly interested in how it figures out what's crap and what's not crap
20:37:25<steering> # replaced [#&] with # because don't want to join local-only channels
20:37:28<steering> irc_channel_regex = r"(?:^|\s)(#[^\x00\x07\x0a\x0d ,:]{1,50})"
20:37:32<steering> channels = re.findall(irc_channel_regex, message.s)
20:37:36<klea>steering: share some transfer link?
20:37:37<nulldata>I remember spamming fireonlive's spam channel with fake channels to make him hit his join limit :P
20:37:49<steering>probably not gonna share it
20:37:53<klea>oh
20:38:09<steering>also what nulldata said
20:38:11<klea>nulldata: so there's a join limit on hackint?
20:38:17<kline>ive come into a couple of very nice manuals for the Microtan 65 - does anyone know of a library/archival place in the south east of England that has a good quality scanner?
20:38:21<steering>also he had to go so far as both a denylist and an ignorelist (idk the difference)
20:38:27<steering>yeah its 200 i think
20:38:28<kline>I have no idea if there are any local archive people to me
20:38:39<steering>CHANLIMIT=#:256
20:38:41<steering>256 rather
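The `CHANLIMIT=#:256` value quoted above is an ISUPPORT token: a comma-separated list of `prefixes:limit` pairs, where an empty limit means no limit for those prefixes. A minimal sketch of reading it, assuming the token arrives as a plain string; the function name `parse_chanlimit` is hypothetical and not from the chat:

```python
def parse_chanlimit(token):
    """Parse an ISUPPORT CHANLIMIT token into {channel_prefix: limit_or_None}."""
    name, _, value = token.partition("=")
    if name != "CHANLIMIT":
        raise ValueError(f"not a CHANLIMIT token: {token!r}")
    limits = {}
    for entry in value.split(","):
        if not entry:
            continue
        prefixes, _, limit = entry.partition(":")
        # One entry may cover several prefixes, e.g. "#&:50".
        for prefix in prefixes:
            # An empty limit means the server imposes no limit for this prefix.
            limits[prefix] = int(limit) if limit else None
    return limits


print(parse_chanlimit("CHANLIMIT=#:256"))
```

An auto-join script could compare the number of channels it is already in against `limits["#"]` before sending another JOIN.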
20:38:57<klea>sad
20:39:33<klea>deny, not join, and tell that you didn't, and ignore, not join, and not tell the user (ie, himself) that his program didn't
20:39:58<steering>more likely deny is channel names and ignore is user nicks but idk
20:40:07<klea>oh yeah that
20:40:09<klea>maybe
20:40:11<klea>idk
20:40:16<steering>the code also explicitly excludes idlerpg and hackint, lol
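The filtering described above can be sketched roughly as follows. This is not fireonlive's module: only the regex is taken from the lines pasted in the chat, while the function name `channels_to_join`, its parameters, and the reading of "deny" as channel names and "ignore" as nicks (steering's guess) are assumptions for illustration. Names are compared with a plain `lower()`, which is simpler than real IRC casemapping.

```python
import re

# Regex as pasted in the chat: '#' channels only, 1 to 50 name characters.
IRC_CHANNEL_REGEX = r"(?:^|\s)(#[^\x00\x07\x0a\x0d ,:]{1,50})"


def channels_to_join(message, sender, joined,
                     deny_channels=frozenset(),
                     ignore_nicks=frozenset(),
                     chan_limit=256):
    """Return the channels mentioned in message that are worth joining.

    joined, deny_channels and ignore_nicks are expected to hold lowercase
    names. chan_limit defaults to the 256 quoted from CHANLIMIT in the chat.
    """
    # Assumption: the ignore list holds nicks whose messages are skipped.
    if sender.lower() in ignore_nicks:
        return []
    picked = []
    for channel in re.findall(IRC_CHANNEL_REGEX, message):
        name = channel.lower()
        # Assumption: the deny list holds channel names never to join.
        if name in deny_channels or name in joined or name in picked:
            continue
        # Stop before exceeding the server's join limit.
        if len(joined) + len(picked) >= chan_limit:
            break
        picked.append(name)
    return picked


print(channels_to_join(
    "New project channels are always announced in #archiveteam .",
    "JAA", joined=set()))
```

Capping at the join limit does not by itself stop the flooding nulldata describes; a per-sender rate limit would be a separate measure.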
20:40:25DogDisco leaves [Ooops, wrong browser tab.]
20:42:03<klea>useless regex for me (parsing the archiveteam wiki xml)
20:43:44<klea>at least one of these should be a valid channel tho: https://transfer.archivete.am/dxbMB/crap.txt
20:43:45<eggdrop>inline (for browser viewing): https://transfer.archivete.am/inline/dxbMB/crap.txt
20:45:00<@JAA>See also https://wiki.archiveteam.org/index.php/Category:Project_with_an_active_dedicated_IRC_channel
20:45:45<klea>im stupid
20:45:48<klea>i can just parse irc =
20:46:03<@JAA>You'll get lots of hits for decommissioned channels.
20:48:45<klea>yeah true
20:51:33nimaje quits [Read error: Connection reset by peer]
20:51:33LddPotato quits [Read error: Connection reset by peer]
20:52:35LddPotato (LddPotato) joins
20:53:03nimaje joins
20:57:42<steering>JAA: so, new rudder :P
20:58:18<steering>watcha want me to do to set it up (or do you still have the control panel deets)
21:00:35<h2ibot>DigitalDragon created Adobe Aero (+731, Created page with "{{Infobox project | title =…): https://wiki.archiveteam.org/?title=Adobe%20Aero
21:30:40<h2ibot>Usernam edited List of websites excluded from the Wayback Machine (+24, …): https://wiki.archiveteam.org/?diff=57882&oldid=57847
21:31:13AK (AK) joins
21:34:40klea wonders what rudder means
21:35:01<klea>i guess another dead site
21:35:14Dada joins
21:37:46skyrocket quits [Ping timeout: 256 seconds]
21:41:05<@JAA>steering: Should still have it, I think. I'll check later.
21:46:20<Ryz>nicolas17, hmm, I'll see on helping that later tonight, need zzz, too hyperfocused and tired from the two Ubisoft websites I had to constantly bap at in #archivebot x_x;
21:46:35<klea>nicolas17: what was the limit?
21:46:38<klea>oh
21:46:49<klea>Ryz: rest a while, don't worry, it's almost done
21:47:22<pokechu22>klea: rudder is an archivebot pipeline owned by steering
21:47:49<klea>oh, i thought it was related to the tracker thingy, when steering mentioned the control panel details
21:54:31wyatt8740 quits [Remote host closed the connection]