00:11:14<G4te_Keep3r>speaking of hacker news, https://hackernoon.com/ and i know im forgetting another big one
00:15:09lennier1 quits [Client Quit]
00:15:25lennier1 (lennier1) joins
00:19:13shoghicp quits [Ping timeout: 258 seconds]
00:22:32shoghicp (shoghicp) joins
00:23:15Arcorann (Arcorann) joins
00:27:14<Ajay>for stack exchange sites, some have archives on kiwix https://wiki.kiwix.org/wiki/Content_in_all_languages
00:34:56Mateon1 quits [Ping timeout: 258 seconds]
00:36:41ddd joins
00:44:21BlueMaxima joins
00:47:25<purplebot>BNK48 edited by Gridkr (+1301) just now -- https://www.archiveteam.org/?diff=46573&oldid=45746
00:47:25<purplebot>Yahoo! Answers edited by Iki (+172, + notice for read-only) just now -- https://www.archiveteam.org/?diff=46575&oldid=46544
00:48:14<@arkiver>JAA: an archiveteam domains page was created on the wiki, i approved it, but FYI (notice should be here shortly)
00:48:27<purplebot>ArchiveTeam Domains created by Nintendofan885 (+2403, Created page with "This page lists …) just now -- https://www.archiveteam.org/?diff=46577&oldid=0
00:53:23<benjins>arkiver: I added Tinkercad to the deathwatch page
01:00:15Mateon1 joins
01:00:46dm4v quits [Read error: Connection reset by peer]
01:04:30dm4v joins
01:04:33dm4v quits [Changing host]
01:04:33dm4v (dm4v) joins
01:15:55Mineroboter joins
01:18:00Mineroboter_ quits [Ping timeout: 250 seconds]
01:30:47britmob joins
01:53:08Arcorann quits [Ping timeout: 258 seconds]
01:56:26<purplebot>Deathwatch edited by Benjins (+274, Add Tinkercad to Deathwatch) just now -- https://www.archiveteam.org/?diff=46578&oldid=46506
02:03:24ddd quits [Remote host closed the connection]
02:22:48nertzy (nertzy) joins
02:30:03l18cp joins
02:34:28ddd joins
02:41:43ddd quits [Client Quit]
02:50:54Jonboy345 quits [Read error: Connection reset by peer]
02:52:04Jonboy345 joins
02:52:33pcr leaves
02:52:34pcr joins
03:22:02<@arkiver>benjins: thank you
03:40:28Jonboy345 quits [Ping timeout: 258 seconds]
03:40:47Jonboy345 joins
03:44:17DogsRNice quits [Read error: Connection reset by peer]
03:52:35l18cp quits [Remote host closed the connection]
03:56:52etnguyen03 quits [Client Quit]
03:59:10qw3rty_ joins
04:02:40qw3rty quits [Ping timeout: 250 seconds]
04:15:16Webuser169 joins
04:15:48Wayward- (wayward) joins
04:16:32Wayward quits [Ping timeout: 250 seconds]
04:52:21Sylirana quits [Ping timeout: 244 seconds]
04:52:40Sylirana (Sylirana) joins
05:00:09<masterX244>G4te_Keep3r: stackexchange does a regular xml dump of all content to archive.org already
05:02:27<G4te_Keep3r>wait so they archive themselves for us? amazing :)
05:04:47<masterX244>They know that the content is valuable for the future :)
05:08:23<G4te_Keep3r>i guess y!a is historical but that is future thinking so should be less shocking, but still a big surprise...probably also less stress on their servers to just dump the data than have us scrape and crawl and grab
05:17:38Mateon1 quits [Remote host closed the connection]
05:17:59Mateon1 joins
05:41:47Webuser169 quits [Remote host closed the connection]
05:41:55BlueMaxima quits [Client Quit]
06:37:15Eighty quits [Remote host closed the connection]
06:40:22Eighty (Eighty) joins
06:58:36yawkat quits [Ping timeout: 250 seconds]
07:01:22yawkat (yawkat) joins
07:07:00<@OrIdow6>Anyone know of any good SOCKS or HTTP proxies I can use for Aimix-Z?
07:08:59<@OrIdow6>Or any other way I can easily "burn" IP addresses during the development process without messing up everything else I have running every time I use the vanilla wget UA or whatever
07:10:30Zopolis4 (Zopolis4) joins
07:17:20Half-Gray joins
07:31:06hooway joins
07:33:23Half-Gray quits [Client Quit]
07:36:45<masterX244>Try cheap cloud instances at hetzner. if you get a fresh ip on each create it should work.
07:42:15<masterX244>Got to do the same after finishing a discovery crawl since i burned an IP myself on tm-exchange... (also: that site has a "dead range" in search results on the tmnforever subsite due to an error 500 after page 19400 forwards and page 400 if you reverse the results. Almost 2 million ids hidden (iterating through those atm. Crawler code is full of dirty hacks already))
07:44:26spirit joins
07:51:03<@OrIdow6>I was wondering if there is anything less convoluted than something like that
07:53:15<masterX244>if the ip doesnt have to remain the same i would hafe TOR-ed it (thats how i pull tmx discovery down atm, just needing all tracks and replays for the grab-site crawl) but you probably need a nonchanging one until its burned
07:53:29<masterX244>s/hafe/have/
07:53:36Webuser955 joins
08:19:25Arcorann (Arcorann) joins
08:28:35Webuser955 quits [Remote host closed the connection]
09:11:57HackMii_ quits [Remote host closed the connection]
09:12:18HackMii_ (hacktheplanet) joins
10:11:51LeighR (LeighR) joins
11:09:04grawity quits [Ping timeout: 250 seconds]
11:09:16grawity (grawity) joins
11:46:21LeighR quits [Client Quit]
12:20:47l18cp joins
12:21:22l18cp quits [Remote host closed the connection]
12:28:39Iki quits [Remote host closed the connection]
12:54:51<@OrIdow6>So it is at the point that Aimix-Z is blocking me for loading 2 pages in a real browser from a fresh IP address
13:06:59LeighR (LeighR) joins
13:14:08systwi_ (systwi) joins
13:15:05systwi quits [Ping timeout: 258 seconds]
13:20:01etnguyen03 (etnguyen03) joins
13:24:38LeGoupil joins
13:34:51paul2520 (paul2520) joins
13:35:13<paul2520>maybe someone saw this already but https://twitter.com/Michael81803750/status/1384687354921717762 -- can we get his entire Twitter & website crawled?
13:35:22<masterX244>wtf..... seems like they resist any discovery... got tm-x phase 1 done by splitting it and running 18 sections in parallel. Now to clean up and sort that crawldata for phase2 ingestion
13:54:28<LeighR>paul2520: https://wiki.archiveteam.org/index.php/ArchiveBot - do them a favor and give the link to his personal website (don't make them check Twitter). Not sure what they can do about his Twitter feed
13:59:51<LeighR>paul2520: someone's already on it :)
14:00:24<paul2520>thanks LeighR
14:01:52<LeighR>looks like they scraped his Twitter, check in #archivebot to see about his site (you may need to post that link directly)
14:05:44<paul2520>will do
15:09:35Jonboy345 quits [Read error: Connection reset by peer]
15:16:59Arcorann quits [Ping timeout: 258 seconds]
15:45:31spirit quits [Client Quit]
15:50:32Jonboy345 joins
16:17:43celestial quits [Quit: ZNC 1.8.0 - https://znc.in]
16:18:06celestial joins
16:21:18godane (godane) joins
16:33:07ragu joins
16:36:05Qub3d (Qub3d) joins
16:37:23Qub3d quits [Client Quit]
17:02:01nerdguy1138 quits [Ping timeout: 258 seconds]
17:16:12nerdguy1138 (nerdguy1138) joins
17:18:23<masterX244>got all trackids, crawling for the replay-ids and other derived data atm. will start the grab-site crawl once that second stage has finished (and the metadata created by my tool will be uploaded, too)
17:18:28<masterX244>@JAA
17:20:25hilda quits [Ping timeout: 258 seconds]
17:22:37hilda joins
17:42:26nertzy_ joins
17:44:42nertzy quits [Ping timeout: 250 seconds]
17:49:33Barto quits [Ping timeout: 258 seconds]
17:52:15Barto (Barto) joins
17:57:59DogsRNice (Webuser299) joins
18:31:23cmlow joins
18:40:16eyo is now known as xkey
18:40:52xkey quits [Changing host]
18:40:52xkey (xkey) joins
18:42:01xkey quits [Quit: WeeChat 2.9]
18:42:23xkey (eyo) joins
19:04:55cmlow quits [Changing host]
19:04:55cmlow (cmlow) joins
19:19:28lennier2 joins
19:22:19lennier1 quits [Ping timeout: 258 seconds]
19:22:19lennier2 is now known as lennier1
19:26:26Iki joins
19:37:03lunik1 quits [Quit: :x]
19:52:00aarchi quits [Read error: Connection reset by peer]
19:52:00@HCross quits [Read error: Connection reset by peer]
19:52:00Dragnog quits [Read error: Connection reset by peer]
19:52:00Dallas quits [Read error: Connection reset by peer]
19:52:01themadpro quits [Write error: Connection reset by peer]
19:57:40Dragnog joins
19:57:43aarchi (aarchi) joins
19:57:53Dallas (Dallas) joins
19:57:58HCross (HCross) joins
19:57:58@ChanServ sets mode: +o HCross
20:02:02themadpro (themadpro) joins
20:02:45lunik1 joins
20:06:31pcr leaves
20:08:09LeGoupil quits [Client Quit]
20:12:02pcr joins
20:21:30broadways joins
21:24:50hilda_ joins
21:27:26hilda quits [Ping timeout: 250 seconds]
21:34:22etnguyen03 quits [Ping timeout: 250 seconds]
21:45:49etnguyen03 (etnguyen03) joins
22:15:30LeighR quits [Ping timeout: 244 seconds]
22:22:08paul2520 quits [Remote host closed the connection]
22:25:53hooway quits [Client Quit]
22:29:27broadways quits [Ping timeout: 244 seconds]
23:05:42Qub3d (Qub3d) joins
23:22:35Qub3d quits [Client Quit]
23:26:08paul2520 (paul2520) joins
23:47:49paul2520 quits [Remote host closed the connection]