00:01:26 | | hexa_ (hexa-) joins |
00:03:01 | | magmaus3 quits [Ping timeout: 260 seconds] |
00:05:41 | | egallager joins |
00:08:16 | | ericgallager quits [Ping timeout: 260 seconds] |
00:09:49 | | Doranwen (Doranwen) joins |
00:14:16 | | cooljeanius joins |
00:17:36 | | egallager quits [Ping timeout: 260 seconds] |
00:39:19 | | dabs quits [Client Quit] |
00:50:44 | | magmaus3 (magmaus3) joins |
01:03:06 | | camrod636 quits [Ping timeout: 260 seconds] |
02:12:53 | <BlankEclair> | huh, i still have my docs for when i was scraping pixiv |
02:13:00 | <BlankEclair> | i think they're okay to share, should i? |
02:16:06 | | pixel leaves [Error from remote client] |
02:18:48 | <BlankEclair> | fyi that they're "Written on 2023-03-31, updated on 2023-06-04" |
02:28:23 | <@JAA> | Pixiv → #pixeled |
02:29:18 | <Juest> | hey, anything been done on vocaroo archival by chance? |
02:36:09 | | entrox quits [Quit: Ping timeout (120 seconds)] |
02:41:28 | | entrox joins |
02:50:20 | <BlankEclair> | https://wiki.archiveteam.org/index.php/Vocaroo |
02:50:23 | <BlankEclair> | doesn't seem like it? |
02:50:26 | <BlankEclair> | and hi juest ^_^ |
02:52:02 | <Juest> | someone with lots of storage might want to get scraping that? |
02:57:55 | <Juest> | hey BlankEclair :) |
03:05:56 | <cancername> | Juest: I was actually planning on doing that anyway for a personal project of mine! so I will probably do that sometime soon :D |
03:13:55 | <pokechu22> | arkiver: https://dictionary.goo.ne.jp/ (closing June 25 per https://help.goo.ne.jp/help/article/2889/) seems to have problematic rate-limiting where, after too many requests, it redirects to https://dictionary.goo.ne.jp/isspam/ with a captcha, so it won't work in archivebot. There is a list of 1.5 URLs in https://dictionary.goo.ne.jp/sitemaps/sitemap.xml though. It probably needs to be done via DPoS |
03:14:07 | <cancername> | I haven't checked in depth yet, but it seems like you can just set Referer: https://vocaroo.com/ and curl /mp3/<ID>, the technical aspect should be trivial then |
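A minimal sketch of the request cancername describes above, assuming the audio really is served at `<base>/mp3/<ID>` when the Referer header is set. The helper name `build_vocaroo_request`, the base URL, and the ID in the usage lines are placeholders for illustration, not verified values. The code only builds the request and does not fetch anything.

```python
import urllib.request


def build_vocaroo_request(base_url: str, audio_id: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for <base_url>/mp3/<audio_id>."""
    url = f"{base_url.rstrip('/')}/mp3/{audio_id}"
    # The Referer value is the one mentioned in the chat; the rest is assumed.
    return urllib.request.Request(url, headers={"Referer": "https://vocaroo.com/"})


if __name__ == "__main__":
    # Placeholder host and ID: substitute real values before fetching.
    req = build_vocaroo_request("https://media.example.invalid", "EXAMPLEID")
    print(req.full_url, req.get_header("Referer"))
```

Passing the result to `urllib.request.urlopen` would perform the download. Whether the server accepts it is untested here, as cancername notes.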
03:49:49 | | BearFortress quits [] |
04:05:06 | | DogsRNice quits [Read error: Connection reset by peer] |
04:08:44 | | BornOn420 quits [Remote host closed the connection] |
04:14:17 | <h2ibot> | BlankEclair edited URLTeam (+371, /* Alive */ Add wiki.surf): https://wiki.archiveteam.org/?diff=55687&oldid=55583 |
04:17:33 | | nicolas17 joins |
04:22:18 | <h2ibot> | BlankEclair edited URLTeam (+216, /* "Official" shorteners */ Add tild.es): https://wiki.archiveteam.org/?diff=55688&oldid=55687 |
04:26:12 | | BornOn420 (BornOn420) joins |
04:34:24 | | BearFortress joins |
04:38:28 | | eythian quits [Quit: http://quassel-irc.org - Chat comfortabel. Waar dan ook.] |
04:40:01 | | eythian joins |
05:19:00 | <nicolas17> | I'm still having weird issues where digitalocean (NYC3) to optane10 uploads are sometimes 40KiB/s and sometimes 20MiB/s, randomly |
05:19:35 | <nicolas17> | when I get 40KiB/s, sometimes I can kill rsync and it will go at 20MiB/s when it retries 1 minute later |
05:36:41 | | nicolas17 quits [Ping timeout: 260 seconds] |
05:54:31 | <h2ibot> | PaulWise edited Wikibot (+107, add more mediawiki URL tips): https://wiki.archiveteam.org/?diff=55689&oldid=55517 |
05:54:36 | | HP_Archivist quits [Quit: Leaving] |
06:06:46 | | HP_Archivist (HP_Archivist) joins |
06:16:38 | <@arkiver> | pokechu22: alright! |
06:16:42 | <@arkiver> | we'll do that |
06:16:45 | <@arkiver> | will get a project up |
06:48:09 | | Megame quits [Quit: Leaving] |
07:47:39 | | Island quits [Read error: Connection reset by peer] |
07:56:24 | | pixel (pixel) joins |
07:59:38 | | camrod636 (camrod) joins |
08:21:59 | <h2ibot> | Exorcism edited Discourse/archived (+90): https://wiki.archiveteam.org/?diff=55690&oldid=55686 |
08:38:33 | | Dada joins |
08:59:04 | <h2ibot> | Exorcism edited Discourse/archived (+87): https://wiki.archiveteam.org/?diff=55691&oldid=55690 |
09:13:38 | | cooljeanius quits [Quit: This computer has gone to sleep] |
09:26:31 | | Riku_V quits [Ping timeout: 260 seconds] |
09:30:32 | | Riku_V (riku) joins |
09:38:40 | <c3manu> | pabs: i've got a bit of a noob question regarding https://wiki.archiveteam.org/index.php/ArchiveBot/Monitoring. when binding a socket to 127.0.0.1:4568 with gunicorn..does that mean it's also only accessible from localhost? |
09:39:03 | <pabs> | right |
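A small sketch of why a 127.0.0.1 bind stays local: the socket listens only on the loopback interface, so other machines cannot reach it. The helper name `bind_loopback` is hypothetical, and port 0 is used here so the OS picks any free port rather than 4568.

```python
import ipaddress
import socket


def bind_loopback(port: int = 0) -> socket.socket:
    """Return a listening TCP socket bound to 127.0.0.1 (port 0 = any free port)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind(("127.0.0.1", port))
    sock.listen()
    return sock


if __name__ == "__main__":
    with bind_loopback() as sock:
        host, port = sock.getsockname()
        # A loopback address is only reachable from the same machine.
        print(host, port, ipaddress.ip_address(host).is_loopback)
```

Binding to 0.0.0.0 instead would listen on every interface, which is the exposure c3manu is trying to avoid.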
09:39:15 | <pabs> | also, I suggest uvnicorn instead |
09:39:26 | <pabs> | er uvicorn |
09:39:31 | <c3manu> | why? |
09:39:43 | <pabs> | I got crashes with gunicorn |
09:39:43 | | f_|DSR quits [Remote host closed the connection] |
09:39:51 | <pabs> | env --chdir ws-repeater/ UPSTREAM=ws://archivebot.com:4568/stream uvicorn app:app --host localhost --port 4568 |
09:40:06 | <pabs> | er, wrong stream URL there |
09:40:15 | <c3manu> | yeah, should use the repeater |
09:40:19 | <c3manu> | thanks :) |
09:40:20 | | f_|DSR (funderscore) joins |
09:40:47 | <c3manu> | just making sure i don't expose the VPS for bonus bandwidth |
09:41:10 | <h2ibot> | PaulWise edited ArchiveBot/Monitoring (+51, uvicorn): https://wiki.archiveteam.org/?diff=55692&oldid=54543 |
09:42:10 | <h2ibot> | PaulWise edited ArchiveBot/Monitoring (-6, formatting bleh): https://wiki.archiveteam.org/?diff=55693&oldid=55692 |
09:42:19 | <c3manu> | pabs++ |
09:42:19 | <eggdrop> | [karma] 'pabs' now has 88 karma! |
09:44:11 | <h2ibot> | Exorcism edited Discourse/archived (+85): https://wiki.archiveteam.org/?diff=55694&oldid=55691 |
10:03:23 | <c3manu> | pabs: is the call on the wiki now correct with both 'uvicorn app:app' and 'gunicorn app:app' in it? |
10:14:35 | | runxiyu quits [Quit: ZNC 1.8.2+deb3.1+deb12u1 - https://znc.in] |
10:16:40 | | runxiyu (runxiyu) joins |
10:36:05 | <c3manu> | i think pabs just forgot to remove the old call |
10:36:19 | <c3manu> | i now am at the point where i am failing to curl the open websocket :S |
10:47:20 | <h2ibot> | Exorcism edited Discourse/archived (+92): https://wiki.archiveteam.org/?diff=55695&oldid=55694 |
10:57:22 | <h2ibot> | Exorcism edited Discourse/archived (+94): https://wiki.archiveteam.org/?diff=55696&oldid=55695 |
11:00:03 | | Bleo182600722719623455 quits [Quit: The Lounge - https://thelounge.chat] |
11:02:47 | | Bleo182600722719623455 joins |
11:03:23 | <h2ibot> | Exorcism edited Discourse/archived (+85): https://wiki.archiveteam.org/?diff=55697&oldid=55696 |
11:17:52 | | sec^nd quits [Remote host closed the connection] |
11:18:19 | | sec^nd (second) joins |
11:19:22 | <pabs> | c3manu: left both, folks can choose which one |
11:19:53 | <pabs> | c3manu: needs curl 8.11.0 or later |
11:20:11 | <pabs> | wiki has a couple of other options if curl doesn't work |
11:23:43 | <c3manu> | with uwsc i'm getting a 403 |
11:24:55 | <c3manu> | not sure what's going on |
11:26:06 | <c3manu> | oh wait. i'm supposed to curl localhost:4568/stream, not /, aren't i? |
11:32:41 | <pabs> | right |
11:33:10 | <pabs> | and use ws:// |
11:35:26 | | balrog quits [Ping timeout: 260 seconds] |
11:41:11 | <c3manu> | all the articles always assume people reading them already know about all the other parts >.< |
11:42:36 | <c3manu> | ..or somebody just wanted to punish me for immediately putting it into a systemd-service ^^" |
11:43:15 | <c3manu> | anyways, got it working now. tried to skip the venv and use system packages instead, which was not the brightest idea. but i only saw those errors using the correct URL |
11:43:53 | <pabs> | btw you don't need the repeater if you are only downloading the stream from one client |
11:44:06 | <pabs> | I'm using system packages too |
11:44:26 | <c3manu> | ah. then i could theoretically just work my way through the missing packages |
11:44:32 | <c3manu> | what system are you running it on? |
11:45:17 | <c3manu> | and i assumed you have multiple 'curl | jq' calls running, checking for different things |
11:46:00 | <pabs> | Debian trixie. yeah I have multiple curl calls |
11:46:22 | <c3manu> | ah, then my understanding of what "one client" means here is probably wrong |
11:46:31 | <c3manu> | how do i check whether it's using the right upstream real quick? |
11:47:02 | <pabs> | one client = curl + jq, or ab2f, or whatever |
11:47:23 | <pabs> | the right upstream is katia's repeater instance ws://archivebot.archivingyoursh.it/stream |
11:48:28 | <c3manu> | i know, but how do i check whether that is what it's actually using? |
11:48:30 | <pabs> | and if you connect to that with curl, or set UPSTREAM to it before running the repeater on your system, then you are using the right one |
11:48:49 | <c3manu> | that's exactly what i wanna test :) |
11:48:55 | <pabs> | I guess you could check with wireshark which IP is being connected to |
11:49:38 | <pabs> | or ss/lsof |
11:49:47 | <pabs> | since its tcp |
11:51:58 | <c3manu> | i tried 'ss -a' but that doesn't list either the ab2f ip or the dashboard one |
11:52:28 | <katia> | h |
11:53:42 | <pabs> | I see it with ss -a -t |
11:54:02 | <pabs> | guess tcp isn't on by default |
11:55:28 | <c3manu> | ah |
11:55:51 | <katia> | c3manu, i made a commit that prints $UPSTREAM on startup, maybe that helps? |
12:00:51 | <katia> | pabs, i got crashes with uvicorn, fwiw... |
12:01:14 | <pabs> | huh. I only got them with the other one :) |
12:02:31 | <h2ibot> | Cancername edited Vocaroo (+312, Expand with information about downloading from…): https://wiki.archiveteam.org/?diff=55698&oldid=50724 |
12:02:32 | <katia> | https://i.katia.sh/2025-05-18-x3rpXX2ohtNZBoolAzC7TN8_ab2f-abws-4569.txt |
12:02:36 | <katia> | idk :D |
12:02:38 | <c3manu> | oh boi..i was wondering why i couldn't see the output on startup, thinking either the venv/python or systemd might swallow them.. |
12:02:49 | <c3manu> | but it just took the application like two minutes to start :D |
12:03:11 | <c3manu> | but it's using the right one |
12:03:15 | <c3manu> | katia++ |
12:03:15 | <eggdrop> | [karma] 'katia' now has 87 karma! |
12:03:18 | <c3manu> | pabs++ |
12:03:19 | <eggdrop> | [karma] 'pabs' now has 89 karma! |
12:03:22 | <c3manu> | thanks for putting up with me :3 |
12:03:37 | <katia> | whoa why did it take two minutes to start |
12:03:42 | <@arkiver> | i see some talk about vocaroo, but i don't see it on deathwatch and https://wiki.archiveteam.org/index.php/Vocaroo doesn't note that it is going offline |
12:03:47 | <@arkiver> | what is the talk of vocaroo about? |
12:03:49 | <katia> | i don't think that's intended |
12:03:54 | <katia> | c3manu++ |
12:03:55 | <katia> | pabs++ |
12:03:55 | <eggdrop> | [karma] 'c3manu' now has 84 karma! |
12:03:58 | <eggdrop> | [karma] 'pabs' now has 90 karma! |
12:04:02 | <c3manu> | katia: no idea, might be my fault as well |
12:04:35 | <c3manu> | 13:59:32 redacted python[5552]: INFO: Uvicorn running on http://localhost:4568 (Press CTRL+C to quit) |
12:04:39 | <c3manu> | 14:01:16 redacted python[5552]: starting up, connecting to ws://archivebot.archivingyoursh.it/stream |
12:04:49 | <katia> | ah hm dunno |
12:04:53 | <@arkiver> | c3manu: pabs: maybe this is better in #archiveteam-dev ? |
12:04:55 | <c3manu> | probably because i'm using uvicorn ;) |
12:05:18 | <pabs> | Juest: ^ vocaroo question above |
12:05:29 | | Wohlstand (Wohlstand) joins |
12:05:32 | <h2ibot> | Cancername edited URLTeam (+247, /* Alive */ Add giveinto.me URL shortener): https://wiki.archiveteam.org/?diff=55699&oldid=55688 |
12:06:29 | <@arkiver> | thanks |
12:18:33 | <Juest> | yeah thanks pabs |
12:19:48 | <Juest> | cancername: im not sure if the original server is still up though, probably not. unfortunately it seems to be mp3 only? despite it allowing uploading any files. i imagine it'll be a brute forcing of the ids? |
12:20:24 | <BlankEclair> | hmm, stackexchange has cloudflare set up interestingly? |
12:20:35 | <BlankEclair> | residential ip + browser + no cookies -> block |
12:20:39 | <BlankEclair> | residential ip + browser + cookies -> allow |
12:20:44 | <BlankEclair> | (oh, and both no javascript) |
12:21:03 | <@arkiver> | Juest: is vocaroo shutting down? |
12:21:53 | <Juest> | arkiver: no, but it'd be nice to archive it. its a lot of permanent storage material of voices apparently, big privacy concern |
12:22:08 | <@arkiver> | ah |
12:22:20 | <@arkiver> | it'd be fine to archive some one-off stuff |
12:22:31 | <@arkiver> | but i don't think there's going to be a large scale project for it at the moment |
12:22:40 | <Juest> | no worries, im not asking for that |
12:25:01 | <Juest> | just happens that vocaroo switched servers at some point in 2014 or 2016 iirc, according to the faq and some blogging, thats why the files are all prefixed with the same "1s" |
12:26:32 | <Juest> | hmm not in the faq, i dont remember where i saw about the servers |
12:28:15 | <Juest> | ah i forgot about the github repo being public |
12:28:19 | | vitzli (vitzli) joins |
12:28:23 | <Juest> | should be easy to tackle :) |
12:29:53 | | Juest sighs |
12:40:58 | <Juest> | as in, the whole vocaroo app is public |
12:41:15 | | fangfufu_ quits [Quit: ZNC 1.8.2+deb3.1+deb12u1 - https://znc.in] |
12:50:10 | <Juest> | someone is personally interested in archiving vocaroo as well, so that's cool |
12:50:32 | <Juest> | i hardly have resources for doing archival |
12:51:32 | | fangfufu joins |
12:51:42 | | fangfufu is now authenticated as fangfufu |
13:00:10 | | vitzli quits [Client Quit] |
13:36:51 | <h2ibot> | Exorcism edited Discourse/archived (+90): https://wiki.archiveteam.org/?diff=55700&oldid=55697 |
13:56:35 | | Webuser368841 joins |
13:56:50 | | Webuser368841 quits [Client Quit] |
14:04:51 | <IceCodeNew|m> | <pokechu22> "needs to be done via DPoS" <- I am willing to assist. Are we going to craft a warrior for this site? |
14:05:17 | | Radzig2 joins |
14:06:56 | <@arkiver> | IceCodeNew|m: it means i'll have a project coming up for it, which will run in the Warrior yes. |
14:07:30 | | Radzig quits [Ping timeout: 258 seconds] |
14:07:30 | | Radzig2 is now known as Radzig |
14:07:55 | <@arkiver> | that being said - we may want a channel for it for coordination |
14:19:57 | <h2ibot> | Exorcism edited Discourse/archived (+269): https://wiki.archiveteam.org/?diff=55701&oldid=55700 |
14:31:14 | <cancername> | Juest: as far as vocaroo goes, IMO the danger is not shutdown, the danger is expiry. files are deleted 12 months after the last access. |
14:39:42 | <Juest> | cancername: ohhh its not really in the faq iirc? |
14:40:13 | <cancername> | Juest: says "Audio is usually kept for a minimum of 12 months, and regularly accessed audio will be kept for even longer, perhaps indefinitely." |
14:40:15 | <Juest> | late to the party i imagine for the majority of things |
14:40:55 | <cancername> | yep :/ |
14:41:16 | <Juest> | better late than never |
14:41:20 | <cancername> | exactly |
14:41:21 | <@arkiver> | cancername: Juest: if they're open to it... they could send URLs our way of audio that is about to be deleted and we can archive it |
14:41:30 | <@arkiver> | feel free to contact them about it, CC archiveteam@archiveteam.org please |
14:41:49 | <cancername> | arkiver: I might, it sounds like a good idea |
14:41:57 | <@arkiver> | Juest: the party as in the projects? |
14:42:04 | <@arkiver> | (that we currently have running) |
14:42:11 | <cancername> | arkiver: I think to archiving vocaroo URLs |
14:42:15 | <@arkiver> | right |
14:42:18 | <Juest> | arkiver: its an expression |
14:42:27 | <@arkiver> | well feel free to send them an email and CC |
14:43:54 | | etnguyen03 (etnguyen03) joins |
14:44:12 | | ThreeHM quits [Quit: WeeChat 4.4.3] |
14:44:13 | <Juest> | thanks for guidance :3 |
14:48:22 | | ThreeHM (ThreeHeadedMonkey) joins |
14:48:22 | | ThreeHM quits [K-Lined] |
15:07:13 | | etnguyen03 quits [Client Quit] |
15:08:35 | | cooljeanius joins |
15:15:13 | | NotGLaDOS quits [Quit: No Ping reply in 180 seconds.] |
15:16:25 | | NotGLaDOS joins |
16:19:26 | | grill (grill) joins |
16:20:25 | | balrog (balrog) joins |
16:30:42 | | sec^nd quits [Remote host closed the connection] |
16:31:07 | | sec^nd (second) joins |
16:44:48 | | cooljeanius quits [Client Quit] |
17:21:24 | <h2ibot> | HadeanEon edited Deaths in 2025 (-134, BOT - Updating page: {{saved}} (128),…): https://wiki.archiveteam.org/?diff=55702&oldid=55681 |
18:02:08 | | etnguyen03 (etnguyen03) joins |
18:05:09 | | Webuser439623 joins |
18:08:25 | | lennier2 joins |
18:10:17 | | Webuser439623 quits [Client Quit] |
18:11:31 | | lennier2_ quits [Ping timeout: 260 seconds] |
18:11:53 | | etnguyen03 quits [Client Quit] |
18:19:04 | | lennier2_ joins |
18:22:02 | | lennier2 quits [Ping timeout: 258 seconds] |
18:27:34 | <h2ibot> | Exorcism edited Discourse/archived (+87): https://wiki.archiveteam.org/?diff=55703&oldid=55701 |
18:47:02 | | etnguyen03 (etnguyen03) joins |
19:06:59 | | etnguyen03 quits [Client Quit] |
19:08:41 | | grill quits [Ping timeout: 260 seconds] |
19:24:05 | | BornOn420 quits [Remote host closed the connection] |
19:24:57 | | BornOn420 (BornOn420) joins |
19:33:52 | | Island joins |
19:41:34 | | nine quits [Quit: See ya!] |
19:41:46 | | nine joins |
19:41:46 | | nine is now authenticated as nine |
19:41:46 | | nine quits [Changing host] |
19:41:46 | | nine (nine) joins |
19:45:37 | | Wohlstand quits [Quit: Wohlstand] |
19:53:09 | | ichdasich quits [Quit: reboot] |
19:54:57 | | yano quits [Quit: WeeChat, https://weechat.org/] |
19:56:52 | | ichdasich joins |
19:58:50 | | yano (yano) joins |
20:40:32 | | Webuser700085 joins |
20:40:58 | | Webuser700085 quits [Client Quit] |
20:50:01 | <h2ibot> | Exorcism edited Discourse/archived (+181): https://wiki.archiveteam.org/?diff=55704&oldid=55703 |
21:14:25 | | Wohlstand (Wohlstand) joins |
21:25:44 | | nine quits [Client Quit] |
21:25:57 | | nine joins |
21:25:57 | | nine is now authenticated as nine |
21:25:57 | | nine quits [Changing host] |
21:25:57 | | nine (nine) joins |
21:35:38 | | Megame (Megame) joins |
21:42:47 | | nine quits [Client Quit] |
21:42:59 | | nine joins |
21:42:59 | | nine is now authenticated as nine |
21:42:59 | | nine quits [Changing host] |
21:42:59 | | nine (nine) joins |
21:54:52 | | cooljeanius joins |
21:56:58 | | dabs joins |
22:59:03 | | Dada quits [Remote host closed the connection] |
23:02:25 | | etnguyen03 (etnguyen03) joins |
23:14:26 | | DogsRNice joins |
23:27:59 | | cooljeanius quits [Client Quit] |
23:38:07 | | ThreeHM (ThreeHeadedMonkey) joins |
23:46:01 | | etnguyen03 quits [Client Quit] |
23:57:06 | | HP_Archivist quits [Quit: Leaving] |