#archiveteam-bs log for 2021-04-08


00:29:20	<Ryz>	Heya folks, I need help determining whether http://schreibfabrik.de/spielenacht/ is found from http://schreibfabrik.de/ - if not, have to do two archives of it on AB instead of one; I tried to find even a single link from searching those pages a bit harder, but nope
00:29:34	<Ryz>	I can maybe archive it as http://www.schreibfabrik.de/spielenacht but I have no idea if it'll find a link to http://www.schreibfabrik.de/ then
00:30:32	<Ryz>	...Oh, oooooh, it would have to be 3, because archiving http://schreibfabrik.de/spielenacht/ would not get into the photos and videos sections in http://schreibfabrik.de/spielenacht/rueckblick.php - since the URLs are under something like http://www.schreibfabrik.de/img/spielenacht2019/ (and I can't archive it as http://www.schreibfabrik.de/img/ ..
00:30:32	<Ryz>	.uuuuuugh)
00:31:10		enterprisey joins
00:33:35	<enterprisey>	when will the Yahoo Answers project start, or is there a way I can be notified when it does?
00:35:16	<thuban>	enterprisey: watch this channel (or if you're using a warrior, just set it to "archiveteam's choice"; it will be switched over automatically)
00:35:45	<enterprisey>	sounds good, thanks!
00:37:39	<enterprisey>	I looked in the FAQs for anything on the legal implications, if any, of running the warrior, but couldn't find any
00:39:51	<@JAA>	Ryz: It's linked on http://schreibfabrik.de/leipzig.php
00:39:59	<thuban>	^ whoops, forgot which channel i was in. you should watch #noanswers for the yahoo answers project
00:41:27	<Ryz>	Thank you JAA~ Meanwhile I'm building up a small list of websites associated with http://schreibfabrik.de/ or linked from http://schreibfabrik.de/ that I'll be archiving~
00:54:02		enterprisey quits [Remote host closed the connection]
01:00:48	<Ryz>	Oh goodness, sometimes going through the backlog of links to archive is basically hard mode on going through potlinks x_x;
01:03:18		dm4v quits [Ping timeout: 250 seconds]
01:05:57		dm4v joins
01:05:59		dm4v is now authenticated as dm4v
01:05:59		dm4v quits [Changing host]
01:05:59		dm4v (dm4v) joins
01:10:46		qw3rty__ quits [Read error: Connection reset by peer]
01:24:48		lunik1 quits [Read error: Connection reset by peer]
01:25:08		lunik1 joins
01:25:14		lunik1 quits [Client Quit]
01:28:10		lunik1 joins
01:33:06		nico_32_ quits [Ping timeout: 244 seconds]
01:33:14		nico_32 joins
01:42:49		Mineroboter joins
01:45:12		Mineroboter_ quits [Ping timeout: 258 seconds]
01:52:10	<@OrIdow6>	tech234a: Looks like it's been changed to "Archiveteam" now
01:52:40	<tech234a>	Yeah
02:41:08		eyo quits [Quit: WeeChat 2.9]
02:41:38		eyo (eyo) joins
02:43:04		Ryz quits [Remote host closed the connection]
02:43:53		Ryz (Ryz) joins
03:42:56		sec^nd quits [Remote host closed the connection]
03:43:20		sec^nd (second) joins
03:50:04		etnguyen03 quits [Client Quit]
03:55:17	<Ryz>	Need help whether or not https://www.lysator.liu.se/~celeborn/sync/ can be found from https://www.lysator.liu.se/~celeborn/ - there's 5 pages initially but I can't seem to find it at all; I see the phrase 'sync' a bunch but never the link x_x;
04:00:33	<Ryz>	I may need a better way to find those links rather than having to do it manually, and ideally without asking other people for help if possible <_>;
04:04:42	<thuban>	Ryz: is there some reason you can't just run the parent site, then check the archivebot logs and run the child site if it didn't get hit? ab doesn't ascend to parent dirs, so there's no risk of duplication
04:07:12	<Ryz>	If it's a small website, maybe, but if running a much larger website, that sounds like a waste of archiving resources
04:08:33	<thuban>	perhaps i've misunderstood; what exactly is the goal here?
04:10:11	<Ryz>	Okay, I wanna archive https://www.lysator.liu.se/~celeborn/ - but I'm not sure if https://www.lysator.liu.se/~celeborn/sync/ can be found while doing AB; if it's not possible, I would have to archive both of those separately S:
04:11:33	<Ryz>	I'm not sure if trying to find it through search engines would be sufficient or maybe I'm doing that wrong
04:12:35	<thuban>	so you want to archive both of them anyway? i don't see how what i've suggested would be a waste then
04:14:12	<Ryz>	It's more about maintaining a good habit; like would I do something like that if there's a section that's 1 TiB worth of content? If say I archive both links, that would just be 1 TiB waste of duplicate data
04:14:52	<thuban>	which is why you would check the logs from the parent site, to see whether you got the child site already
04:16:45	<Ryz>	I would consider that too much of a delay, especially for my tastes~
04:18:27	<@OrIdow6>	In any case you're basically going to have the same process
04:18:48	<@JAA>	That is the proper way, though it won't work if there are links from that path to other sections of the site that aren't linked elsewhere. All hope is lost in that case without some AB dev work.
04:19:00	<@OrIdow6>	There's no general way to discover whether the child is linked other than by spidering the site
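A minimal Python sketch of the check thuban describes above: once the parent job has finished, scan the URLs it fetched for anything at or below the child path. The function name is hypothetical, and the sketch assumes the fetched URLs are available as a plain list of strings; ArchiveBot's actual log format is not shown in this log.

```python
from urllib.parse import urlsplit


def _with_trailing_slash(path):
    """Normalize a URL path so prefix checks respect directory boundaries."""
    return path if path.endswith("/") else path + "/"


def child_was_crawled(crawled_urls, child_url):
    """Return True if any crawled URL is the child URL or lies beneath it.

    Hostnames are compared exactly (case-insensitively), so
    www.example.org and example.org count as different sites.
    """
    child = urlsplit(child_url)
    child_path = _with_trailing_slash(child.path)
    for url in crawled_urls:
        parts = urlsplit(url)
        if parts.netloc.lower() != child.netloc.lower():
            continue
        if _with_trailing_slash(parts.path).startswith(child_path):
            return True
    return False
```

If this returns False for the parent job's URL list, the child would still need its own job, as discussed above.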
04:43:26	<purplebot>	ArchiveTeam Warrior edited by Tech234a (+51, Share Docker socket to Warrior …) just now -- https://www.archiveteam.org/?diff=46511&oldid=46510
04:44:28	<@JAA>	^ That sounds like a horrible idea, but I don't know enough about Docker to be sure.
05:04:51		Jonboy345 joins
05:07:13		jonboy3452 quits [Ping timeout: 258 seconds]
05:09:12		Gaelan quits [Quit: ZNC 1.8.2 - https://znc.in]
05:09:44		Gaelan (Gaelan) joins
05:10:30		fuzzy802 joins
05:10:30		fuzzy8021 quits [Killed (NickServ (GHOST command used by fuzzy802!~fuzzy8021@173-224-26-244.ptcnet.net))]
05:10:32		fuzzy802 is now known as fuzzy8021
05:10:32		fuzzy8021 is now authenticated as fuzzy8021
05:10:32		fuzzy8021 quits [Changing host]
05:10:32		fuzzy8021 (fuzzy8021) joins
05:10:34		snowpanda joins
05:11:07	<snowpanda>	Hi, I'm curious about why archive.org allows the archiveteam to upload WARCs to archive.org - from what I understand the archiveteam uses software that runs on a user's machine to scrape
05:11:18	<snowpanda>	Doesn't this mean that a user could potentially manipulate/edit the page captures?
05:12:23	<snowpanda>	Ie, couldn't a malicious member of the archiveteam manipulate pages before they are uploaded to the Wayback Machine? Ie, this could result in the Wayback Machine displaying a manipulated/edited page for a given URL
05:34:37		BlueMaxima quits [Read error: Connection reset by peer]
05:38:34		sec^nd quits [Ping timeout: 255 seconds]
05:38:57		Ajay1 joins
05:39:48		Ajay quits [Ping timeout: 258 seconds]
05:49:34		fuzzy802 joins
05:49:34		fuzzy8021 quits [Killed (NickServ (GHOST command used by fuzzy802!~fuzzy8021@173-224-26-244.ptcnet.net))]
05:49:38		fuzzy802 is now known as fuzzy8021
05:49:42		fuzzy8021 is now authenticated as fuzzy8021
05:49:42		fuzzy8021 quits [Changing host]
05:49:42		fuzzy8021 (fuzzy8021) joins
05:49:43		fuzzy8021 quits [Excess Flood]
05:50:03		fuzzy8021 joins
05:50:03		fuzzy8021 is now authenticated as fuzzy8021
05:50:03		fuzzy8021 quits [Changing host]
05:50:03		fuzzy8021 (fuzzy8021) joins
05:52:11		rynomad joins
05:52:36	<snowpanda>	Anyone have any idea about my questions?
05:59:26	<purplebot>	Google Poly edited by CosmicCoyote (+43) just now -- https://www.archiveteam.org/?diff=46512&oldid=46041
05:59:26	<purplebot>	Reddit edited by S-crypt (+33, DoV banned) just now -- https://www.archiveteam.org/?diff=46513&oldid=45215
06:02:51		snowpanda quits [Remote host closed the connection]
06:34:26		rynomad quits [Remote host closed the connection]
06:46:08	<tech234a>	JAA: fair enough, I wanted to get it in the instructions so it could be used in the future. Watchtower also works by bindmounting the Docker socket, which gave me the idea. The Docker container would only be stopped using this method on the Warrior when the shut down button is pressed in the web UI and this socket has been shared with the container. (Without this, the container would immediately be restarted when it shuts down.)
07:09:45	<tech234a>	that said, if someone thinks the socket mount should be removed, feel free to do so
07:30:34		hooway joins
07:38:00		Arcorann (Arcorann) joins
07:42:26	<purplebot>	ArchiveTeam Warrior edited by Tech234a (-51, Reverting) just now -- https://www.archiveteam.org/?diff=46514&oldid=46511
07:42:30	<tech234a>	I decided to take it out for now
08:03:40	<mgrandi>	@snowpanda yeah, that is a risk yes
09:00:25	<purplebot>	URLTeam edited by Aarchi (+0, Fix title capitalization) just now -- https://www.archiveteam.org/?diff=46515&oldid=46480
09:56:28		ThreeHea1 (ThreeHeadedMonkey) joins
09:56:44		ThreeHeadedMonkey quits [Ping timeout: 250 seconds]
10:43:21		AlsoHP_Archivist quits [Read error: Connection reset by peer]
10:44:06		AlsoHP_Archivist joins
10:48:41		ThreeHea1 is now known as ThreeHeadedMonkey
11:09:39		LeGoupil joins
11:12:28		Arcorann_ joins
11:15:36		Arcorann quits [Ping timeout: 250 seconds]
11:51:10		Zopolis4 (Zopolis4) joins
11:51:43	<Zopolis4>	is it possible to incorporate a direct dump of a server into the wbm?
11:56:37		Zopolis4 quits [Remote host closed the connection]
11:58:53		Zopolis4 (Zopolis4) joins
12:02:12		murmur quits [Read error: Connection reset by peer]
12:07:13		murmur joins
12:33:10		katocala quits [Ping timeout: 250 seconds]
12:33:27	<@OrIdow6>	Zopolis4: Not that I know of, though I don't see why you can't run a crawler on the same machine as (or close on the network to) the site
12:35:43		yawkat quits [Ping timeout: 258 seconds]
12:40:50	<Zopolis4>	because the site is dead
12:40:56	<Zopolis4>	but we do have a server dump
12:41:01	<Zopolis4>	well once it gets released
12:43:56	<@OrIdow6>	I think that once ArchiveTeam took such a dump, restored the site, and then crawled it, at some point before I got here
12:50:48		Zopolis4 quits [Remote host closed the connection]
12:56:43		Zopolis4 (Zopolis4) joins
12:56:47		Zopolis4 quits [Remote host closed the connection]
12:59:55		katocala joins
13:01:47		Zopolis4 (Zopolis4) joins
13:01:51	<Zopolis4>	seems a bit of a blunt solution but it'll work
13:02:01		etnguyen03 (etnguyen03) joins
13:02:23		katocala is now authenticated as katocala
13:17:56		yawkat (yawkat) joins
13:30:45		brgtt2 joins
13:45:37		spirit joins
14:07:43		katocala quits [Ping timeout: 258 seconds]
14:07:59		katocala joins
14:08:06		katocala is now authenticated as katocala
14:14:54		superkuh__ joins
14:14:57		superkuh_ quits [Read error: Connection reset by peer]
14:15:30		atphoenix_ (atphoenix) joins
14:17:41		atphoenix quits [Ping timeout: 258 seconds]
14:25:29		Viniter (Viniter) joins
14:31:56		Viniter_ joins
14:33:47		Viniter_ quits [Client Quit]
14:35:22		Viniter quits [Ping timeout: 250 seconds]
14:37:19		Viniter (Viniter) joins
14:47:25	<purplebot>	File:Yahooanswers logo.png overwritten by Arkiver (+0) just now -- https://www.archiveteam.org/?diff=46516&oldid=0
15:13:16		katocala quits [Ping timeout: 258 seconds]
15:14:01		katocala joins
15:18:05		s-crypt quits [Remote host closed the connection]
15:18:05		kiska quits [Remote host closed the connection]
15:18:05		flashfire42 quits [Remote host closed the connection]
15:21:41		Mateon1 quits [Remote host closed the connection]
15:22:00		Mateon1 joins
15:22:21		brgtt2 quits [Read error: Connection reset by peer]
15:22:43		brgtt2 joins
15:23:38		brgtt2 quits [Client Quit]
16:04:50		DogsRNice (Webuser299) joins
16:06:03		Arcorann__ joins
16:09:37		Arcorann_ quits [Ping timeout: 258 seconds]
16:15:22		Arcorann__ quits [Ping timeout: 258 seconds]
16:30:05		Zopolis4 quits [Ping timeout: 244 seconds]
16:41:34		sec^nd (second) joins
16:53:03		sec^nd quits [Remote host closed the connection]
16:54:45		sec^nd (second) joins
17:06:08		sec^nd quits [Remote host closed the connection]
17:07:25		sec^nd (second) joins
17:12:49		sec^nd quits [Remote host closed the connection]
17:14:07		sec^nd (second) joins
17:36:54		snowpanda joins
17:37:23	<snowpanda>	Hi, anyone have an answer for my question for why archive.org allows archive-team to upload WARC page captures to the Wayback Machine?
17:37:50	<@EggplantN>	They allow anyone afaik?
17:37:57	<snowpanda>	Doesn't this introduce the risk that a malicious archiveteam member could edit a page capture and then upload it, thus creating edited/incorrect page history?
17:38:10	<@EggplantN>	Yes it does and if that happens we can take action
17:38:25	<snowpanda>	I don't think they allow anyone, otherwise the Wayback Machine is not reliable as an archive...
17:38:37	<snowpanda>	But how would you detect if it happens?
17:39:04	<snowpanda>	If they allowed anyone to upload then there could be tons of manipulated / edited pages in the archive
17:39:40	<AK>	Theoretically there could be yes, but there is a level of error checking and trust
17:39:50	<@EggplantN>	We trust each other. There is only 3 people who handle the upload process for warrior projects. I trust the other 2 and the other 2 trust me
17:39:55	<@EggplantN>	I have better things to do with my time
17:39:58	<@EggplantN>	As do both of them
17:40:23	<snowpanda>	I see, I thought anyone could download the archiveteam software and run uploads?
17:40:34	<Sanqui>	In general there is no way to protect against a malicious actor anyway
17:40:41	<@EggplantN>	To our targets yes.
17:41:02	<@EggplantN>	Nobody uploads directly to the IA on warrior based projects
17:41:14	<snowpanda>	Ah I see, that makes sense then
17:41:51	<snowpanda>	Sanqui not sure what you mean by that. As far as I'm aware, the Wayback Machine does not put user uploaded warcs from random people into the archive
17:41:54	<jodizzle>	For clarity: anyone can upload WARCs to IA, but only WARCs from whitelisted accounts make it into the WBM
17:42:11	<snowpanda>	Because of the possible lack of authenticity concern
17:42:25	<snowpanda>	jodizzle: Okay, that makes sense
17:43:32	<@JAA>	There was a bug with that whitelisting a few years ago which led to all uploaded WARCs being included in the WBM. And guess what, there was manipulated stuff in there, and that's how the bug was discovered.
17:44:26	<snowpanda>	Interesting, good to know there are safeguards. The archive wouldn't be quite as useful if I couldn't have a good amount of trust in the contents
17:45:49	<jodizzle>	There was a bit of discussion a while back about having the Warrior process do some more evaluation of the content sent to the targets. There was a separate, non-AT project for saving youtube annotations that had a "trust" system based on saving annotations multiple times and people sending similar-looking annotations.
17:46:58	<jodizzle>	As a method for building trust, I mean. If people sent annotations that matched what other people were sending, their worker built trust.
17:47:23	<snowpanda>	Yeah, that sounds like a useful check
17:47:24	<jodizzle>	In principle something like that could be done for the Warriors too, but it would be work
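A rough Python sketch of the agreement-based trust idea jodizzle describes. Everything here is an assumption for illustration: the function name, the scoring (+1 / -1), and especially the use of an exact SHA-256 match to decide that two submissions "agree". It is not how the annotations project or the Warrior actually works.

```python
import hashlib
from collections import Counter


def update_trust(submissions, trust):
    """Adjust per-worker trust based on agreement for one item.

    submissions: dict mapping worker name -> bytes fetched for the same item.
    trust: dict mapping worker name -> integer score; updated and returned.
    """
    if not submissions:
        return trust
    digests = {w: hashlib.sha256(p).hexdigest() for w, p in submissions.items()}
    top_digest, top_count = Counter(digests.values()).most_common(1)[0]
    # Without a strict majority there is no consensus, so nothing changes.
    if top_count * 2 <= len(submissions):
        return trust
    for worker, digest in digests.items():
        delta = 1 if digest == top_digest else -1
        trust[worker] = trust.get(worker, 0) + delta
    return trust
```

Byte-exact hashing is the weakest part of this sketch, since two honest fetches of the same URL can legitimately differ.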
17:48:05	<snowpanda>	Of course I wouldn't publicize whatever security mechanism you choose to use :) makes it easier for people to get around it
17:49:15	<jodizzle>	JAA: Is there any insight on how content was manipulated? I'm curious.
17:53:24		kiskaWeebChat quits [Ping timeout: 250 seconds]
17:54:02	<@JAA>	jodizzle: I have no idea. The above is essentially all I know. 'We found some manipulated content, which shouldn't have entered the WBM, and now it's fixed' or similar.
18:18:27		dm4v_ joins
18:19:11		dm4v quits [Ping timeout: 258 seconds]
18:19:11		dm4v_ is now known as dm4v
18:19:11		dm4v is now authenticated as dm4v
18:19:11		dm4v quits [Changing host]
18:19:11		dm4v (dm4v) joins
18:23:08	<tech234a>	Perhaps another option would be to somehow include HTTPS verification data in the WARCs (probably not in the current standard though)
18:23:54	<Sanqui>	tech234a: there is no such verification data possible with https
18:23:54	<@JAA>	That's impossible.
18:24:36	<tech234a>	Got it. Would have assumed somehow the certificates would sign the page...
18:24:50	<AK>	The biggest downside to the method of having multiple people request the same thing is the performance and size. That turns the 25TB I've done of urls into 50 or 75 if we do 2 or 3 downloads to check
18:24:58	<AK>	On some projects there's barely time to get everything once
18:25:02	<AK>	Let alone 2/3 times
18:25:31	<snowpanda>	AK: I see the Wayback Machine does display the source of the capture though, so I can tell if something is from a Wayback crawl or from another collection
18:25:34	<Sanqui>	tech234a: nope. HTTPS, or rather TLS, does not provide non-repudiation by design
18:26:07	<AK>	snowpanda: yep it should be clear if it was us or someone else
18:26:10	<Sanqui>	(I've looked into the same possibility)
18:27:10	<tech234a>	Interesting
18:28:17	<@OrIdow6>	(From last time this came up) Part of the problem with comparing different warrior results is that pages change
18:28:39	<AK>	Or get served from different places
18:28:46	<AK>	(Anycast dns or a cdn)
18:28:52	<@OrIdow6>	Random tokens generated server-side, some sort of multithreaded page generation that rearranges the order of things, A/B testing
18:29:26	<@OrIdow6>	So you would need to do a large amount of work to define what counts as equivalent
18:29:56	<snowpanda>	Hmm yeah, I guess for being confident about reliability the best way is maybe still to just check the source of the capture
18:30:32	<@OrIdow6>	(Not an exact quote of what I wrote last time, but I think that was basically it)
18:30:38	<snowpanda>	I'm at least pretty confident that page captures in the Wayback Machine that were from Wayback crawls or the "save page now" feature don't have the risk of user manipulation
18:31:11	<snowpanda>	Or at least, shouldn't have the risk of user manipulation as long as the Wayback Machine's software systems don't have security holes :)
18:32:40	<AK>	Fairly certain they use an identifiable user agent
18:32:46	<AK>	So that's still not guaranteed
18:33:26	<snowpanda>	AK: Why would an identifiable user agent introduce risk of user manipulation?
18:33:40	<snowpanda>	Oh I guess from the website owner themselves is what you're saying
18:33:56	<AK>	Yeah from the website owner
18:34:08	<snowpanda>	Ie, the website owner could serve different content based on the user agent. I was talking about user manipulation from a third party with no control over the website
18:37:07	<AK>	But the way AT works means there is an element of trust
18:37:10	<AK>	And sometimes things get weird
18:37:31	<AK>	We archive the location that the dns gives us at the time
18:37:50	<AK>	If dns returns 0.0.0.0 because the new domain owner is weird, we will archive whatever website your local ip returns
18:37:51	<AK>	https://web.archive.org/web/20210214121720/https://rapidshare.com/
18:37:58	<AK>	In this case, my "Oh hello" page from nginx
18:38:35	<@JAA>	Connecting to 0.0.0.0 should fail though. You mean 127.0.0.0/8?
18:38:47	<AK>	I thought it was returning 0.0.0.0 at the time, lemme check irc logs again
18:39:19	<AK>	rapidshare.com returns 0.0.0.0
18:39:29	<@JAA>	Mhm
18:39:34	<@JAA>	0.0.0.0 is non-routable though.
18:39:36	<AK>	Which for whatever reason meant the pipeline requested against the nginx on the host
18:40:02	<@JAA>	That sounds like something's very broken then.
18:40:39	<@OrIdow6>	Looks like that is how wget acts
18:41:07	<AK>	I think on Ubuntu 0.0.0.0 can refer to default route
18:41:16	<AK>	Curl returns my website if I "curl rapidshare.com"
18:41:20	<@OrIdow6>	Oh, could be the OS too, I'm on Debian
18:41:29	<snowpanda>	AK: Hmm but for Wayback Machine's crawls and "save page now" features, it should be secure from third parties who don't have control over the website right?
18:41:35	<@OrIdow6>	Yeah, looks like it
18:42:14	<AK>	snowpanda: Yep it should be, I tend to trust wayback machine and check a couple of crawls before+after to confirm whether something changed or it stayed the same
18:42:48	<@JAA>	tech234a: TLS establishes a shared symmetric key between client and server. There is no asymmetric signature or similar, so the client can manipulate it freely. Or rather, the client can't possibly prove to someone else that the data wasn't modified.
18:43:13	<@JAA>	(This is true even for AES-GCM cipher suites etc. They don't authenticate the contents.)
18:44:14	<@JAA>	This is another fun one due to 'search example.org' lines in resolv.conf: https://web.archive.org/web/*/http://www/
18:46:05	<yano>	heh, fun
18:47:34	<@OrIdow6>	You're connected to Internet, please refresh your page.
18:50:04	<@OrIdow6>	(From someone running #//)
18:51:19		snowpanda quits [Remote host closed the connection]
18:53:02	<AK>	For about 5 seconds I did contemplate changing my "Oh Hello" to a joke for people looking back later. But I decided that wasn't helpful
18:53:32	<@OrIdow6>	Realistically, I think maybe something should be added to prevent this in projects
18:53:53	<@OrIdow6>	Not sure how, though, there are several things going on that cause this
18:54:20	<tech234a>	Start providing our own DNS server?
18:55:02		Daloader_ joins
18:55:16	<AK>	You'll get ddos'd I think when we spin up large
18:55:33	<tech234a>	True
18:55:37	<AK>	I think the best option is gonna be having the workers/warriors check what the domain resolves to
18:55:44	<@OrIdow6>	Though I suppose it doesn't do much damage as long as it's confined to "weird" URLs
18:55:58	<AK>	And cancelling if it resolves to a local domain, or to something else
18:56:05	<@OrIdow6>	(I've seen this in the "real" WBM crawls, too)
18:56:26	<@JAA>	tech234a: Own DNS server doesn't really fix that problem I mentioned. We'd need to implement the lookups directly in wget or whatever.
18:57:02	<tech234a>	Hmm
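A minimal Python sketch of the worker-side check AK suggests: resolve the hostname first and skip the item if any address is unspecified (0.0.0.0), loopback, private, or otherwise non-public. The function names are hypothetical and this is not part of any actual pipeline code. It also uses the system resolver, so it does not by itself address the resolv.conf `search` case JAA mentions.

```python
import ipaddress
import socket


def is_public_address(ip_text):
    """Return True only for addresses that look publicly routable."""
    # Drop an IPv6 zone ID such as "%eth0" before parsing.
    ip = ipaddress.ip_address(ip_text.split("%")[0])
    return not (
        ip.is_unspecified
        or ip.is_loopback
        or ip.is_private
        or ip.is_link_local
        or ip.is_multicast
        or ip.is_reserved
    )


def resolves_publicly(hostname, port=80):
    """Resolve hostname and return True only if every address is public."""
    try:
        infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return False
    addresses = {info[4][0] for info in infos}
    return bool(addresses) and all(is_public_address(a) for a in addresses)
```

A worker could call `resolves_publicly(host)` before fetching and cancel the item when it returns False, which would have caught the rapidshare.com case above.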
19:01:28		snowpanda joins
19:04:35		snowpanda quits [Remote host closed the connection]
19:58:23		LeGoupil quits [Client Quit]
20:00:41		katocala is now authenticated as katocala
20:15:31		AlsoIDK quits [Remote host closed the connection]
20:16:24		thuban quits [Ping timeout: 250 seconds]
20:18:19		thuban joins
20:22:54		Daloader_ quits [Ping timeout: 250 seconds]
20:27:24		jtagcat quits [Quit: Bye!]
20:47:26	<purplebot>	ArchiveTeam Warrior edited by Tech234a (-12, Change restart policy for Warrior …) just now -- https://www.archiveteam.org/?diff=46517&oldid=46514
20:47:55		rsn quits [Ping timeout: 258 seconds]
20:48:09	<tech234a>	^ that solves the immediate reboot problem when shutting down using the web interface
20:48:25	<tech234a>	thanks Fu sl
20:50:54	<thuban>	does shutdown of the host count as 'failure'?
20:52:22	<tech234a>	Yes, so it will restart the container on restart (I tested by using the restart Docker option on Windows)
20:52:38	<thuban>	ah, cool
20:53:12		jtagcat (jtagcat) joins
20:53:22		rsn joins
21:07:20		marked quits [Remote host closed the connection]
21:07:47		marked joins
21:08:26		DogsRNice_ (Webuser299) joins
21:09:03		s-crypt (s-crypt) joins
21:09:12		flashfire42 (flashfire42) joins
21:09:23		DogsRNice quits [Ping timeout: 258 seconds]
21:09:47		kiska (kiska) joins
21:20:23	<SketchTheCow>	Hey Jason,
21:20:23	<SketchTheCow>	My company is working on setting up a kind of "crowdfunded X-Prize" product, and would like to use it to drum up support and incentive for archivists working on saving Yahoo Answers data before it gets blinkered out. In researching the situation, I found archiveteam, and realized that y'all have a substantial headstart in winning any eventual prize-pool. I think there may be an opportunity to
21:20:29	<SketchTheCow>	work together such that we can maximize the net percentage of archiving done by the deadline.
21:20:32	<SketchTheCow>	Would you, or someone else running point on the Yahoo project, be open to a quick chat sometime this week?
21:20:35	<SketchTheCow>	Best,
21:20:38	<SketchTheCow>	-Ryan
21:20:40	<SketchTheCow>	.... I'll be declining
21:22:15	<@EggplantN>	Free money SketchTheCow :P
21:22:16	<@EggplantN>	/s
21:23:02	<Barto>	ask him to donate to archive.org instead.
21:26:36		Viniter quits [Ping timeout: 250 seconds]
21:43:04		rsn_ joins
21:43:56		rsn quits [Ping timeout: 250 seconds]
21:47:11		LeighR (LeighR) joins
21:50:55		Eighty quits [Remote host closed the connection]
22:01:20	<SketchTheCow>	I literally did that.
22:01:29	<SketchTheCow>	Like, that was my actual response
22:01:36	<SketchTheCow>	So I am glad we're all in lockstep
22:01:38		rsn joins
22:02:50	<Ajay1>	they sent the same message in the yahoo answers IRC channel a few days ago
22:03:49		rsn_ quits [Ping timeout: 258 seconds]
22:05:19		rsn_ joins
22:07:45		rsn quits [Ping timeout: 250 seconds]
22:09:23		rsn joins
22:11:14		rsn_ quits [Ping timeout: 250 seconds]
22:13:47		Wayward quits [Ping timeout: 258 seconds]
22:17:18		AlsoHP_Archivist quits [Ping timeout: 250 seconds]
22:18:05		AlsoHP_Archivist joins
22:19:42	<tech234a>	FYI It looks like you can save a blank revision to a redirect page to fix the caching problem (literally save it without changing anything, it won't log as a revision but it seems to clear the cache)
22:19:46	<tech234a>	on the wiki
22:20:31	<Ajay1>	when I tried that, it didn't allow it to be accepted
22:21:25	<tech234a>	Is your account manually moderated or automoderated?
22:21:39	<Ajay1>	manual
22:21:52	<@JAA>	Yeah, the mod tool doesn't like empty revisions.
22:21:58	<tech234a>	Yeah it probably worked for me because mine is automoderated
22:22:27	<@JAA>	Have you tried action=purge?
22:22:40	<@JAA>	That clears the cache directly.
22:23:26	<tech234a>	I wasn't aware that existed, good to know
22:28:15		LeighR leaves
22:28:26	<purplebot>	Yahoo! Answers edited by Ajay (+59, Added new tracker and that archiving …) just now -- https://www.archiveteam.org/?diff=46518&oldid=46501
22:29:25	<purplebot>	GeoCities edited by C-Nagy (+9, Fixed a few links) just now -- https://www.archiveteam.org/?diff=46519&oldid=45435
23:05:55		AlsoHP_Archivist quits [Ping timeout: 258 seconds]
23:06:17		AlsoHP_Archivist joins
23:07:26	<purplebot>	FTP/List edited by Pokechu22 (+412, consistent whitespace before {{online}}; …) just now -- https://www.archiveteam.org/?diff=46521&oldid=46288
23:11:52		rsn_ joins
23:14:44		rsn quits [Ping timeout: 258 seconds]
23:23:26	<purplebot>	Yahoo! Answers edited by Ajay (-59, Undo revision 46518 by [[Special:Contributions/Ajay\|Ajay]] …) 17 minutes ago -- https://www.archiveteam.org/?diff=46520&oldid=46518
23:45:01		hooway quits [Client Quit]
