#archiveteam-bs log for 2024-12-31

00:19:19		DogsRNice_ joins
00:22:43		DogsRNice quits [Ping timeout: 252 seconds]
00:42:01		linuxgemini quits [Quit: getting (hopefully fresh) air o/]
00:42:15		loug8318142 quits [Quit: The Lounge - https://thelounge.chat]
00:45:18		linuxgemini (linuxgemini) joins
01:01:16		Webuser984809 joins
01:01:48		Webuser984809 quits [Client Quit]
01:04:25		PredatorIWD29 joins
01:06:21		PredatorIWD2 quits [Ping timeout: 252 seconds]
01:06:21		PredatorIWD29 is now known as PredatorIWD2
01:58:30		stormcynk joins
01:59:55	<stormcynk>	Hi, does anyone happen to know where I could find a live torrent of archiveteam-twitter-stream-2015-05 from here? https://archive.org/download/archiveteam-twitter-stream-2015-05. I've tried several I found and they're dead. An alternate mirror is also great!
02:10:02	<nicolas17>	stormcynk: did you actually find different torrents or were they all the same one? :P
02:16:09		NeonGlitch quits [Client Quit]
02:34:42	<pabs>	https://news.ycombinator.com/item?id=42551900 Tell HN: John Friel my father, internet pioneer and creator of QModem, has died
03:00:43		Webuser481608 joins
03:01:36	<Webuser481608>	Hello.
03:02:53	<Webuser481608>	I think I read https://wiki.archiveteam.org/index.php/Archive.today in the past. I read https://archive.ph/faq today. It would be cool if whoever runs archive.today had free and open source code so anyone could capture webpages as seen on archive.today. So you could download WARCs and have corresponding archive.today-like pages for replay.
03:03:32	<Webuser481608>	*so anyone could capture webpages like as seen on archive.today
03:03:54	<nicolas17>	yeah it would be nice, go let them know :P
03:05:37	<Webuser481608>	It's about archival fixity - a term I heard of in an ipwb issue thread. ipwb = Interplanetary Wayback and that's open source in GitHub.
03:10:24	<Webuser481608>	So archive.today pages usually look like how the page should look like without stuff missing. If you run an Apache HTTP Server HTML file "1370929" doesn't show up as rendered. You have to rename it to "1370929.html" for it to show up as rendered, but lots of mojibaking. If you run an IPFS gateway, HTML file "1370929" will show up as rendered but
03:10:24	<Webuser481608>	then you get a bit of mojibaking and a thing is missing. archive.today = it all looks perfect (but no WARC).
03:11:12	<Webuser481608>	*Apache HTTP Server, HTML file
03:11:54		stormcynk quits [Client Quit]
03:12:35	<Flashfire42>	We have no affiliation with archive.today and no affiliation with archive.org tho we do host our stuff there
03:12:41	<Webuser481608>	I suspect that archive.today uses Selenium. Just faking the user agent alone may not work to bypass CF-block and similar anti-grabbing software. Plus, I've seen things which indicate that archive.is uses Chrome/Selenium.
03:13:11	<Webuser481608>	@Flashfire42 I know. Just wanted to talk about or post about this.
03:14:45	<Webuser481608>	So loading up a browser tab or window for every capture is a bit intensive and heavier than wget or grab-site, so you may have to use a medium or high-powered computer to do that.
03:15:50	<nicolas17>	does Save Page Now use a browser too?
03:16:01	<Webuser481608>	Yeah, I've seen stuff to indicate that.
03:16:26	<Webuser481608>	Such as some error message like "Save Page Now browser crashed while trying to load this URL."
03:16:39	<Webuser481608>	(in web.archive.org)
03:17:35	<@JAA>	Yes, SPN uses brozzler, I believe.
03:18:22	<@JAA>	I suppose 2-3 orders of magnitude qualifies as 'a bit heavier'.
03:20:19	<Webuser481608>	I was thinking of this recently due to https://derpibooru.org/forums/meta/topics/policy-update-regarding-ai-content which says "[ai generated images WILL BE DELETED after 2025-01-06]". There's 33,958 image/video uploads tagged as "ai content" in that website: https://derpibooru.org/search?q=ai+content
03:20:42	<nicolas17>	(and nothing of value will be lost >.>)
03:20:45	<Webuser481608>	So I was seeing if a couple captures of that site would replay OK. It was not so great.
03:21:52	<Webuser481608>	I didn't know about that JAA https://github.com/internetarchive/brozzler
03:24:59	<Webuser481608>	nicolas17 there's a couple of boorus related to that website. There's Twibooru and Ponerpics which apparently have all Derpibooru images/videos mirrored automatically starting in ~2019. However, those aren't WARCs, and worse, the binary file data is messed with so images and videos get "losslessly" "optimized" with tinypng, ffmpeg, and whatever. I
03:24:59	<Webuser481608>	wanted to put in some effort before this mass deletion.
03:25:26	<Webuser481608>	Huh, dunno why my post was split into two posts again.
03:26:13	<TheTechRobo>	(IRC has a protocol line limit of 512 bytes)
03:26:30	<Webuser481608>	Ok
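The 512-byte limit mentioned above covers the whole protocol line, including the command, the target, and the trailing CR LF, which is why long messages arrive as two posts. A minimal sketch of how a client might split a message on word boundaries follows. The function name `split_for_irc` and the `prefix_reserve` value are assumptions for illustration, not taken from any particular IRC client.

```python
def split_for_irc(text, target, max_line=512, prefix_reserve=100):
    """Split text into chunks that each fit in one PRIVMSG protocol line."""
    # Overhead per line: "PRIVMSG <target> :" plus the trailing CR LF.
    # prefix_reserve is a guessed allowance for the ":nick!user@host " prefix
    # that the server prepends when relaying the line to other clients.
    overhead = len(f"PRIVMSG {target} :".encode("utf-8")) + 2 + prefix_reserve
    budget = max_line - overhead

    chunks, current = [], ""
    for word in text.split(" "):
        candidate = word if not current else current + " " + word
        # Measure in UTF-8 bytes, not characters, because the limit is in bytes.
        if len(candidate.encode("utf-8")) <= budget:
            current = candidate
        else:
            if current:
                chunks.append(current)
            current = word  # a single over-long word is left unsplit in this sketch
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk would be sent as its own PRIVMSG, which matches the two-part messages seen in this log.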
03:26:54	<TheTechRobo>	As someone who's been messing around with brozzler for awhile now, I can certainly tell you it's slow.
03:27:27	<Webuser481608>	I never used it. Maybe I will in the future.
03:28:05	<TheTechRobo>	Definitely optimisable to at least some degree, though. For example, right now there's a hardcoded `time.sleep(5)` call after it visits each hashtag to let things load, which can absolutely be done in a more elegant way.
03:28:45	<TheTechRobo>	s/hashtag/anchor/ (Brozzler calls them hashtags)
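One common alternative to a fixed sleep is to poll a readiness condition and return as soon as it holds, keeping the old delay only as an upper bound. The sketch below shows that general pattern. It is not brozzler's code, and `predicate` is a hypothetical callable (for example, a check that the page has stopped loading).

```python
import time


def wait_until(predicate, timeout=5.0, interval=0.1):
    """Poll predicate() until it returns True or timeout seconds elapse.

    Returns True if the condition was met, False if the wait timed out.
    """
    # time.monotonic() is unaffected by system clock adjustments.
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

With `timeout=5.0` the worst case matches the hardcoded five seconds, while pages that settle quickly finish sooner.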
03:29:06	<@JAA>	(There are some sudden feelings of anger building inside me.)
03:29:25	<Webuser481608>	Reminds me of a bug fix I sent to the GNU Wget mailing list: https://lists.gnu.org/archive/html/bug-wget/2024-12/msg00006.html
03:30:38	<TheTechRobo>	JAA: Even the original developer knew it was wrong. :-) https://github.com/internetarchive/brozzler/blob/eb922f515/brozzler/browser.py#L652
03:31:11	<@JAA>	TheTechRobo: My anger was directed at 'hashtags' though.
03:31:26	<TheTechRobo>	Ah, I see. lol
03:31:44	<TheTechRobo>	Are they called anchors? 'fragment' is coming to mind as well.
03:32:16	<@JAA>	Depends on where you look. Fragments in URLs, anchors in HTML.
03:32:46	<TheTechRobo>	Eh, hashtags is close enough :P
03:33:07		@JAA slaps TheTechRobo around a bit with a large trout
03:34:06	<TheTechRobo>	There's fewer load-bearing sleeps than I expected, though. Props to the IA team.
03:35:17	<Webuser481608>	Talk about fragments and anchors makes me think of Shadow DOM / Shadow root. Those are annoying!
03:36:32	<TheTechRobo>	I had no idea what those are, so I looked on MDN and now I really have no idea what those are.
03:39:15	<nicolas17>	JAA: I heard of music teachers getting really annoyed with young students calling notes "D hashtag"
03:39:37	<@JAA>	Eww, yeah
03:40:15	<Webuser481608>	In the past, IA's search, collection, and user pages didn't use shadow DOMs, but now they all do as a result of an update to that website. Such as https://archive.org/details/@someuser has ten uploads for example. If you download that link with wget it will have zero details about any of those uploads. It's like you gotta write a custom Selenium
03:40:15	<Webuser481608>	program to download it every time. Or use some fancy/exotic Browser-based software to download it.
03:40:49	<Webuser481608>	*custom Selenium script
03:40:50	<TheTechRobo>	You'd generally use IA's cli tool for that, though.
03:44:47	<Webuser481608>	About the derpi*u AI/ML deletion situation. Concerns of mine: comments, metadata, open WARCs, no-derive full images, newer uploads. That site does use some rate limiting, not sure how bad though. 2264 pages with no filter https://derpibooru.org/search?page=2264&q=ai+content - if I do do this, download the newest first. Also look at or use API/JSON
03:44:47	<Webuser481608>	to help me download that.
03:47:02	<Webuser481608>	Oh, there's a nightly database dump of that site - https://derpibooru.org/pages/data_dumps - nice that they do that, but it isn't warc or replay or raws.
03:47:39	<Webuser481608>	About 4.8GB per dump and "Note: these dumps do not include images."
03:50:10	<Webuser481608>	Not sure what's going on with that message from "*" above "@JAA slaps TheTechRobo around a bit with a large trout"
03:53:29		TheTechRobo slaps Webuser481608 around a bit with a large trout
03:53:43	<TheTechRobo>	/me
03:55:11	<TheTechRobo>	Or /slap if you're using an IRC client that supports it and don't want to type it all out.
04:02:21		Wohlstand (Wohlstand) joins
04:06:23	<Webuser481608>	Speaking of APIs, I think it's neat that conceptnet.io has a "View this term in the API" ( https://conceptnet.io/c/en/mare ). Imagine if anytime you went to whateversite.com/search?q= that search page had a link which said "view this search as JSON" or "view this search in the API".
04:13:43		Wohlstand quits [Read error: Connection reset by peer]
04:13:43		Wohlstand1 (Wohlstand) joins
04:16:06		Wohlstand1 is now known as Wohlstand
04:17:04		Wohlstand quits [Client Quit]
04:17:07		Wohlstand1 (Wohlstand) joins
04:17:32	<Webuser481608>	How many gigabytes are needed to store 34,000 images/videos? For https://derpibooru.org/tags?tq=id:661924 - I think around 40GB to 80GB. For ~25,000 videos, one to four megabyte per (different site), that ended up being like 50 to 60GB.
04:19:29		Wohlstand1 is now known as Wohlstand
04:23:59		Wohlstand quits [Ping timeout: 252 seconds]
04:28:07		Naruyoko5 quits [Remote host closed the connection]
04:28:54		Naruyoko5 joins
04:32:43	<@OrIdow6>	Webuser481608: Am I correct in my reading that all the removed posts are being copied to https://tantabus.ai/ ?
04:33:09		Naruyoko5 quits [Ping timeout: 252 seconds]
04:33:51		Naruyoko joins
04:39:10	<Webuser481608>	Seems to be broken: https://derpibooru.org/api/v1/json/search/images?q=ai+content = ~1000 total = not true. https://derpibooru.org/api/v1/json/search/images?q=safe = ~2 million = true.
04:40:43	<Webuser481608>	@OrIdow6 Text in the OP of the thread "‘ai content’ images (which includes ai generated) have already been copied to tantabus.ai". Also https://tantabus.ai/images/35312 ~= the 34,000 number that I wrote about. I am guessing that comments and stuff won't be copied over.
04:48:10	<@OrIdow6>	Looks like all relevant pictures are "filtered" and filter options are set by POST
04:53:01	<Webuser481608>	Thanks for saying that, guess I would have given up if you didn't. /api/v1/json/search/images is said to be GET at https://derpibooru.org/pages/api
04:55:43	<Webuser481608>	So the corresponding one is https://derpibooru.org/api/v1/json/search/images?q=ai+content&filter_id=56027 - Everything filter (no filter) and "total":33958. Then just mess with &page= and/or &per_page=
05:00:55	<Webuser481608>	( Unlike on another booru, Twibooru, that doesn't work outside of the API - https://derpibooru.org/search?q=explicit&filter_id=56027 and https://twibooru.org/search?q=explicit&filter_id=2 )
05:11:54	<Webuser481608>	Neat, some metadata copied over. No comments copied - compare https://tantabus.ai/images/15549?q=sha512_hash:968144dc1a74d3293c4dbdc25cbe037d5d5e8b7148612ace416328ea134771cb93d05e734a5b2b5f7bf98e67d33474cb8a064de8c29136a8fd57d22537f6d840 and https://derpibooru.org/images/3310654
05:15:49	<@OrIdow6>	Webuser481608: Going away for several hours but will come back to this later
05:16:18	<Webuser481608>	OK, glad there's some interest in this.
05:19:24	<Webuser481608>	Max 50 per page. First page is 1 and not 0: https://derpibooru.org/api/v1/json/search/images?q=ai+content&filter_id=56027&per_page=999&page=1 = search JSON. https://derpibooru.org/api/v1/json/images/3310654 = full image JSON.
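The paging details above (endpoint, `filter_id=56027`, at most 50 results per page, pages starting at 1, and a total of 33958) are enough to enumerate every search page. A minimal sketch that only builds the URLs and makes no requests follows. The function name `page_urls` is an assumption for illustration.

```python
import math
from urllib.parse import urlencode

API = "https://derpibooru.org/api/v1/json/search/images"


def page_urls(query, filter_id, total, per_page=50):
    """Yield one search URL per result page, numbering pages from 1."""
    pages = math.ceil(total / per_page)
    for page in range(1, pages + 1):
        params = {
            "q": query,
            "filter_id": filter_id,
            "per_page": per_page,
            "page": page,
        }
        # urlencode turns the space in "ai content" into "+".
        yield f"{API}?{urlencode(params)}"


urls = list(page_urls("ai content", 56027, 33958))
print(len(urls))  # number of pages needed at 50 results per page
```

Any real downloader built on this would still need to respect the site's rate limiting, which is not modelled here.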
05:22:38		Webuser481608 quits [Client Quit]
05:39:03		benjins2 quits [Read error: Connection reset by peer]
05:45:53		Mateon1 quits [Ping timeout: 260 seconds]
06:01:28		Mateon1 joins
06:04:54		Webuser015671 joins
06:07:37	<Webuser015671>	I think this channel is publicly logged. Where's the logs? https://archive.fart.website/ ? EMPTY: https://archive.fart.website/bin/irclogger_logs/archiveteam-bs
06:09:59	<pabs>	see https://wiki.archiveteam.org/index.php/Archiveteam:IRC#IRC_Logs
06:10:09		DogsRNice_ quits [Read error: Connection reset by peer]
06:10:40	<Webuser015671>	thanks
06:22:05	<@JAA>	irclogs.archivete.am is not officially publicly operational. I don't know why it's listed there.
06:22:45	<@JAA>	(The logging part is operational, the web part is not.)
06:24:33	<Webuser015671>	So you're running a not-so-good web server for that? I don't get it.
06:33:07	<Webuser015671>	that wiki article: "irclogs.archivete.am is currently the only IRC logger we have running. It is half-broken and therefore was not previously listed on this page, but now it's the only one we have, and broken is preferable to nothing."
06:34:13	<nulldata>	!ig 4v1dllnuwa0np0q828shkpjpo ^https?://www\.gamingonlinux\.com/./page=\d/www\..\.com/
06:44:41	<that_lurker>	nulldata: Ignore noted but not applied :-P
06:49:43	<Webuser015671>	I was asking for logs because my computer crashed; details are in this nice text file titled "Fix for lost/zeroed boot partition" -- https://ipfs.ssi.eecc.de/ipfs/bafkreia7fwz3kbys6b3fm55ucqwzpp62245jxmklhkz5d4cbe7plgscyhq -- which may also show up at https://ar.secret-network.xyz/sAySI4jL5LsZGij08O0fIx3TMlJzc3UCWq5TDZKLgaw after a while. Crash
06:49:43	<Webuser015671>	resulted in my boot partition getting rekt for the 3rd time - fixed in record time by the steps in that text.
07:00:29	<Webuser015671>	That message of mine is kinda "Who cares?" Here's a message related to this channel recently that may be more generally interesting. So I made some posts about My Little Pony web data that's going to soon be deleted. I probably or certainly like MLP:FIM more than "Filly Funtasia". I watched the first two episodes of Filly Funtasia recently: 720p YT
07:00:29	<Webuser015671>	downloads. Bella is sorta cute/pretty and Rose is OK or nice. Bella = best mare I guess. The equines in that show look odd: large legs. Everypony is called "filly" even if they are male: also odd. Filly Funtasia is related to or inspired by MLP. Negative: feels a bit like brainrot dumb cartoon while watching Filly Funtasia. Positive: it's kinda
07:00:29	<Webuser015671>	interesting. Ad-free knowledge base entry for that show which came from Hong Kong but is dubbed in English or is originally English: https://www.wikidata.org/wiki/Q28224547 . Thought I had: a 4chan /mlp/ user would say '"Filly Funtasia"? More like "Filly CuntAsia".' I'd say this message is sorta archive-related because it's details about media and
07:00:29	<Webuser015671>	where you can watch/download it.
07:05:52		Unholy23619246453771312 (Unholy2361) joins
07:06:42		Webuser015671 quits [Client Quit]
08:23:54	<@OrIdow6>	Oh they're gone
08:24:45	<@OrIdow6>	JAA: I'm the one who added the new logs to the wiki, for the reason Webuser quoted
08:25:52	<Fijxu|m>	huh, do people actually use IPFS?
08:26:01	<Fijxu|m>	It uses a ton of RAM for me
08:26:02	<@OrIdow6>	Given that all the other logs are dead there is there any issue with it being listed there, overloading privacy etc? I don't recall hearing the exact way in which it was broken but it seems like you've discouraged it rather than kept it a secret
08:26:33	<Fijxu|m>	I would love to use it more but there is no content there that I'm interested in
08:27:30	<@OrIdow6>	Fijxu|m: I think I ran a node for a few days but it was cumbersome for some reason (perhaps for the reasons you mention)
08:28:09	<@OrIdow6>	!remindme 2 days my little pony
08:28:09	<eggdrop>	[remind] error: "2" (parsed as 1735610400 → 2024-12-31T02:00:00Z) is in the past
08:28:14	<@OrIdow6>	!remindme 2d my little pony
08:28:15	<eggdrop>	[remind] ok, i'll remind you at 2025-01-02T08:28:14Z
08:33:38	<Fijxu|m>	Yeah OrIdow6 it really sucks
08:33:46		katia_dect5284 is now known as katia-
08:33:49	<Fijxu|m>	I had some like 20GB of saved files and kubo was using like 8GB of RAM
08:33:53	<Fijxu|m>	It's just not usable
08:33:57	<Fijxu|m>	I like the concept but I think no one uses it because of that
08:35:13	<nimaje>	I wanted to look into ipfs a bit more to see if it is possible to implement it in a not so resource hungry way (seems like it is used a bit in the cryptocurrency space, they likely care less about stuff being resource hungry than I)
08:52:11	<that_lurker>	is ipfs still used outside of phishing
08:55:36	<@OrIdow6>	that_lurker: Hahaha
09:04:43		Island quits [Read error: Connection reset by peer]
09:23:55		emphatic quits [Ping timeout: 252 seconds]
09:32:50		loug8318142 joins
09:36:18		graham9 quits [Quit: The Lounge - https://thelounge.chat]
09:58:21		loug8318142 quits [Client Quit]
10:01:06		loug8318142 joins
10:01:49		loug8318142 quits [Client Quit]
10:03:05		emphatic joins
10:07:11		loug8318142 joins
10:12:50		bladem quits [Quit: Leaving]
10:14:43		loug8318142 quits [Client Quit]
10:16:21		wyatt8750 quits [Ping timeout: 252 seconds]
10:16:45		wyatt8740 joins
10:23:45		loug8318142 joins
10:56:12	<h2ibot>	Manu edited Discourse/archived (+80, Queued sendegate.de): https://wiki.archiveteam.org/?diff=54128&oldid=54026
11:04:14	<h2ibot>	Manu edited Discourse/archived (+97, queued discourse.bits-und-baeume.org): https://wiki.archiveteam.org/?diff=54129&oldid=54128
11:07:15	<h2ibot>	Manu edited LiveJournal (+66, /* ArchiveBot: Add information from the…): https://wiki.archiveteam.org/?diff=54130&oldid=53415
11:11:36	<c3manu>	!a https://django.wiki.bits-und-baeume.org/ -u firefox
11:11:43	<c3manu>	:O
11:11:54		bladem (bladem) joins
11:22:00		MrMcNuggets (MrMcNuggets) joins
11:40:39		PredatorIWD2 quits [Read error: Connection reset by peer]
12:00:01		Bleo182600722719623 quits [Quit: The Lounge - https://thelounge.chat]
12:02:50		Bleo182600722719623 joins
12:41:59		SkilledAlpaca418962 quits [Quit: SkilledAlpaca418962]
12:45:48		SkilledAlpaca418962 joins
13:14:29		graham9 joins
13:23:57	<kiska>	I guess I'll try and see if I can import my own logs to whatever log thing
13:32:46	<h2ibot>	Bzc6p edited Blogger.hu (+88, dicovery takes a bit more time, also there will…): https://wiki.archiveteam.org/?diff=54131&oldid=54095
13:34:46	<h2ibot>	Bzc6p edited Cafeblog.hu (+84, proclaimed started): https://wiki.archiveteam.org/?diff=54132&oldid=54100
13:37:47	<h2ibot>	Bzc6p edited Volán (+106, /* Centralization, round 3 */ websites saved): https://wiki.archiveteam.org/?diff=54133&oldid=54097
13:38:47	<h2ibot>	Bzc6p edited Volán (-255, /* Related websites */ remove duplicate entry): https://wiki.archiveteam.org/?diff=54134&oldid=54133
14:19:53		Webuser617418 joins
14:20:04		Webuser617418 quits [Client Quit]
14:26:09		Wohlstand (Wohlstand) joins
14:38:53		eroc1990 quits [Quit: The Lounge - https://thelounge.chat]
15:04:41		notarobot joins
15:06:15		driib9 quits [Quit: Ping timeout (120 seconds)]
15:06:32		driib9 (driib) joins
15:45:16	<Barto>	alright, i finally donated to archive.org for fireonlive.
15:45:59	<Barto>	!tell icedice i checked again, fireonlive was 31
15:46:01	<eggdrop>	[tell] ok, I'll tell icedice when they join next
15:48:35	<Barto>	fireonlive++
15:48:35	<eggdrop>	[karma] 'fireonlive' now has 841 karma!
15:55:47		NeonGlitch (NeonGlitch) joins
15:58:07		eroc1990 (eroc1990) joins
16:07:38		NeonGlitch quits [Client Quit]
17:25:40		yasomi quits [Quit: ZNC 1.9.1 - https://znc.in]
17:31:05		yasomi (yasomi) joins
17:32:44	<@JAA>	OrIdow6: Basically, yeah. It goes down sometimes for various reasons. As long as people don't complain about that, fine.
17:41:48	<steering>	i dont think it has ever worked for me :D
17:51:11		NeonGlitch (NeonGlitch) joins
17:53:54	<@JAA>	It's usually fine, but I don't actively monitor it nor is it currently a priority to fix when it breaks. There are a few things I need to do to the backend, then that will change. But next year (heh).
17:56:53	<steering>	Soon(TM)
18:01:00		DogsRNice joins
18:37:03	<that_lurker>	JAA: I can also set up logging stuff if needed.
18:42:00		lennier2 quits [Quit: Going offline, see ya! (www.adiirc.com)]
18:51:29		lennier2 joins
18:57:03		MrMcNuggets quits [Quit: WeeChat 4.3.2]
19:07:41		Webuser156468 joins
19:08:32	<Webuser156468>	This https://irclogs.archivete.am/archiveteam-bs/2024-12-31 is down now - I don't know what happened since I last posted here about a 3DCG cartoon.
19:09:00	<Webuser156468>	More importantly, I have all the full image IDs of all images tagged as "ai content" in derpibooru.org
19:11:56	<Webuser156468>	At the following link with a meaningful filename, 5 days left: https://transfer.archivete.am/dAJBR/2024-12-31_derpi_33961_ai.txt
19:11:59	<eggdrop>	inline (for browser viewing): https://transfer.archivete.am/inline/dAJBR/2024-12-31_derpi_33961_ai.txt
19:22:11	<@arkiver>	happy new year everyone :)
19:22:39	<@arkiver>	we'll see what 2025 brings us - it may be a busy year
19:28:33	<Webuser156468>	Thanks. Hope you have a good time. I also hope that 2025 will be nicer: probably not...
19:30:07	<Webuser156468>	I'm downloading WARCs+raws of those 34,000 webpages: `sort -r 2024-12-31_derpi_33961_ai.txt | tail -n+1 | xargs -d "\n" sh -c 'for args do curl -skL "https://10.0.0.200/cgi-bin/arcmni?url=https://derpibooru.org/images/$args" 1>/dev/null; sleep 0.2; done' _`
19:33:59	<Webuser156468>	.sh file "arcmni" looks something like this: https://github.com/ProximaNova/ipfs-kubo-rpc-api-for-cgi/tree/main/usr/lib/cgi-bin/ - it uses wget and downloads WARCs+raws. Raws go in ./memento/20241231193108/. I'm using ZFS but without dedup on. I didn't want to download CSS files and whatever a million times so I'm using shell script "arcnoi" -
19:33:59	<Webuser156468>	about the same but without wget option --directory-prefix=$basepath/memento/$time/
19:37:08	<Webuser156468>	Sort by file size -- "ls -S ./warc/derpibooru.org/images/" (or "find . [...]") -- may help find any possible rate limiting webpage downloads.
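The size-sorting idea above can also be automated. The sketch below flags files that are much smaller than the typical download, on the assumption (not confirmed in this log) that a rate-limit or error response is noticeably smaller than a real page. The function name `suspicious_files` and the `ratio` threshold are hypothetical.

```python
from pathlib import Path
from statistics import median


def suspicious_files(root, ratio=0.5):
    """Return files under root smaller than ratio * median size, smallest first."""
    sizes = {p: p.stat().st_size for p in Path(root).rglob("*") if p.is_file()}
    if not sizes:
        return []
    cutoff = median(sizes.values()) * ratio
    flagged = [p for p, size in sizes.items() if size < cutoff]
    return sorted(flagged, key=lambda p: sizes[p])
```

For the layout described above it might be called as `suspicious_files("./warc/derpibooru.org/images/")`, and the flagged files would then be inspected by hand before re-downloading.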
19:42:15		NeonGlitch quits [Client Quit]
19:45:47		wickedplayer494 quits [Ping timeout: 252 seconds]
19:48:55	<Webuser156468>	I'm using a ZFS mirror pool across two different HDDs. The hard drives aren't the crappy model(s) that I used in the past. With ZFS, how much fragmentation is too much? "zpool list" says FRAG:56%. The 16.4-TB pool is 92% full. Definitely don't fill a ZFS pool up above 94 or 95% full. While waiting on stats: more thoughts on "Filly Funtasia" (posted
19:48:55	<Webuser156468>	about earlier). Episodes 3 and 4 are maybe better or more gripping than episodes 1 and 2. Episode 1=about Isabella and lying, 2=about Rose and helping one another(?), 3=about an antagonist and subterfuge, 4=about Lynn and getting along with one another.
19:49:59	<Webuser156468>	Stats are in: 200 webpages downloaded from 2024-12-31T19:37:44.634666667Z to 2024-12-31T19:48:41.176222711Z
19:54:24	<Webuser156468>	arcnoi downloads one warc per url. Convert to Unix time: date -d "2024-12-31T19:48:41Z" +%s -> ... -> 200 pages downloaded in 657 seconds = 0.3044 webpages/second with 0.2-second sleep between requests -> will I meet the deadline?
19:56:41	<Webuser156468>	Will take 111,562 seconds at this rate, or about 31 hours, which is less than 5 days, so yes I can hopefully download it all in time.
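The estimate above can be reproduced from the two timestamps and the ID count in the filename (33961). This is a small sketch of that arithmetic only. The function name `estimate` is an assumption, and the timestamps are truncated to whole seconds as in the message above.

```python
from datetime import datetime, timezone


def estimate(start, end, done, total):
    """Return (pages per second, seconds needed for all pages) at the observed rate."""
    elapsed = (end - start).total_seconds()
    rate = done / elapsed
    return rate, total / rate


start = datetime(2024, 12, 31, 19, 37, 44, tzinfo=timezone.utc)
end = datetime(2024, 12, 31, 19, 48, 41, tzinfo=timezone.utc)
rate, seconds = estimate(start, end, done=200, total=33961)
print(f"{rate:.4f} pages/s, {seconds:.0f} s, {seconds / 3600:.1f} h")
```

The result assumes the rate stays constant, so any rate limiting or slower pages later in the run would lengthen it.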
19:56:51		Island joins
19:59:27		Webuser156468 quits [Client Quit]
20:00:01		wickedplayer494 joins
20:00:18		wickedplayer494 is now authenticated as wickedplayer494
20:00:48		BlueMaxima joins
20:18:57		PredatorIWD2 joins
20:49:57		Wohlstand quits [Ping timeout: 252 seconds]
20:50:43		linuxgemini quits [Quit: getting (hopefully fresh) air o/]
20:53:35		linuxgemini (linuxgemini) joins
21:45:17		BlueMaxima quits [Client Quit]
21:47:18		Wohlstand (Wohlstand) joins
21:48:58		HP_Archivist quits [Ping timeout: 260 seconds]
22:01:56	<@OrIdow6>	that_lurker: If you do set up your own logging stuff, it might be useful to combine it with WBM versions of kis ka's logs
22:02:11	<@OrIdow6>	Tho I guess we can just pool all our IRC client logs or something
22:20:08		BornOn420 quits [Remote host closed the connection]
22:20:45		BornOn420 (BornOn420) joins
22:36:28	<h2ibot>	Bear edited List of websites excluded from the Wayback Machine/Partial exclusions (-16, Time ranges were moved.): https://wiki.archiveteam.org/?diff=54136&oldid=54024
22:51:30	<h2ibot>	Bear edited YouTube/Technical details (+232, /* Format codes */ clarifications): https://wiki.archiveteam.org/?diff=54138&oldid=53712
22:58:58	<szczot3k>	happy new year
23:02:49		NeonGlitch (NeonGlitch) joins
23:15:34		BearFortress quits []
23:23:38	<@OrIdow6>	30 minutes til midnight UTC
23:25:05	<Barto>	happy new year gang
23:25:06		NeonGlitch quits [Client Quit]
23:55:21		BearFortress joins
