00:05:44 | | Mist8kenGAS quits [Remote host closed the connection] |
00:24:58 | | etnguyen03 quits [Client Quit] |
00:26:38 | | xkey quits [Quit: WeeChat 4.4.3] |
00:27:12 | | xkey (xkey) joins |
00:36:27 | | wickedplayer494 quits [Remote host closed the connection] |
00:43:33 | | wickedplayer494 joins |
00:43:47 | | wickedplayer494 is now authenticated as wickedplayer494 |
00:48:56 | | wickedplayer494 quits [Ping timeout: 276 seconds] |
00:50:56 | | wickedplayer494 joins |
00:51:21 | | wickedplayer494 is now authenticated as wickedplayer494 |
00:53:35 | | etnguyen03 (etnguyen03) joins |
01:16:02 | | etnguyen03 quits [Client Quit] |
01:31:29 | | etnguyen03 (etnguyen03) joins |
01:32:51 | | nine quits [Quit: See ya!] |
01:33:04 | | nine joins |
01:33:04 | | nine is now authenticated as nine |
01:33:04 | | nine quits [Changing host] |
01:33:04 | | nine (nine) joins |
02:01:21 | | kiska (kiska) joins |
02:10:16 | | cmlow quits [Ping timeout: 260 seconds] |
02:24:04 | | etnguyen03 quits [Client Quit] |
02:24:38 | <pabs> | cuphead2527480: nitter.net blocks ArchiveBot, so not feasible. https://wiki.archiveteam.org/index.php/Twitter#Workarounds |
02:24:54 | <pabs> | c3manu: ^ |
03:48:13 | | DogsRNice quits [Read error: Connection reset by peer] |
04:27:57 | | Webuser852265 joins |
04:30:35 | | Webuser852265 quits [Client Quit] |
04:43:28 | | ericgallager quits [Quit: This computer has gone to sleep] |
04:47:18 | | ThetaDev quits [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.] |
04:48:05 | | ThetaDev joins |
05:41:49 | | Exorcism quits [Quit: Ping timeout (120 seconds)] |
05:42:07 | | Exorcism (exorcism) joins |
05:50:38 | | HP_Archivist (HP_Archivist) joins |
06:17:30 | | Island quits [Read error: Connection reset by peer] |
06:17:30 | | Island_ quits [Read error: Connection reset by peer] |
06:21:06 | | DopefishJustin quits [Ping timeout: 260 seconds] |
06:21:39 | | DopefishJustin joins |
06:21:39 | | DopefishJustin is now authenticated as DopefishJustin |
06:29:51 | | IRC2DC quits [Ping timeout: 260 seconds] |
06:52:59 | | pixel (pixel) joins |
07:06:01 | | pokechu22 quits [Ping timeout: 260 seconds] |
07:12:14 | | pokechu22 (pokechu22) joins |
07:33:00 | | ducky (ducky) joins |
08:01:35 | | Dada joins |
08:31:03 | | Naruyoko5 joins |
08:34:20 | | Naruyoko quits [Ping timeout: 276 seconds] |
08:49:56 | | archiveDrill quits [Quit: The Lounge - https://thelounge.chat] |
08:51:03 | | archiveDrill joins |
08:52:14 | | archiveDrill quits [Client Quit] |
08:53:00 | | archiveDrill joins |
09:48:20 | | ericgallager joins |
10:02:00 | | Lunarian1 (LunarianBunny1147) joins |
10:05:41 | | LunarianBunny1147 quits [Ping timeout: 260 seconds] |
10:06:30 | | Wohlstand (Wohlstand) joins |
10:55:16 | | DLoader quits [Ping timeout: 260 seconds] |
10:59:02 | | DLoader (DLoader) joins |
10:59:21 | | Wohlstand quits [Ping timeout: 260 seconds] |
11:00:06 | | Bleo182600722719623455 quits [Quit: The Lounge - https://thelounge.chat] |
11:00:40 | | Wohlstand (Wohlstand) joins |
11:02:50 | | Bleo182600722719623455 joins |
11:07:49 | | arch quits [Remote host closed the connection] |
11:07:53 | | arch joins |
11:28:56 | | arch quits [Remote host closed the connection] |
11:30:58 | | arch joins |
11:37:35 | | Dango360 quits [Quit: Leaving] |
11:44:25 | | grill (grill) joins |
11:58:16 | | Dango360 (Dango360) joins |
12:07:18 | | ericgallager quits [Client Quit] |
12:12:18 | | ConstantK quits [Quit: Ping timeout (120 seconds)] |
12:12:32 | | ConstantK joins |
12:15:08 | | DLoader quits [Client Quit] |
12:37:56 | | DLoader (DLoader) joins |
12:39:44 | | Mateon1 joins |
12:46:09 | | Mateon1 quits [Remote host closed the connection] |
12:46:50 | | Mateon1 joins |
13:05:23 | | grill quits [Ping timeout: 276 seconds] |
13:10:22 | | yasomi quits [Quit: ZNC 1.9.1 - https://znc.in] |
13:15:56 | | yasomi (yasomi) joins |
13:41:15 | | PredatorIWD253 joins |
13:45:01 | | PredatorIWD25 quits [Ping timeout: 260 seconds] |
13:45:01 | | PredatorIWD253 is now known as PredatorIWD25 |
13:50:15 | | arch quits [Read error: Connection reset by peer] |
13:58:48 | | arch joins |
14:19:08 | <cuphead2527480> | Okay yeah, I see. We will have to wait for a big instance that doesn't block ArchiveBot, or maybe somehow a private nitter instance is made exclusively for ArchiveBot addresses. Until then... it's complicated |
14:20:05 | | yasomi quits [Client Quit] |
14:22:56 | | Wohlstand quits [Quit: Wohlstand] |
14:29:02 | | yasomi (yasomi) joins |
15:14:33 | | ducky_ (ducky) joins |
15:15:26 | | ducky quits [Ping timeout: 260 seconds] |
15:15:26 | | ducky_ is now known as ducky |
15:36:30 | <c3manu> | pabs: i know. i didn't know this would be the suggestion, i just told them to post it here, whatever it is :) |
15:43:46 | <h2ibot> | Manu edited Deathwatch (+354, Add Blackblogs.org): https://wiki.archiveteam.org/?diff=55818&oldid=55794 |
15:48:38 | | spirit joins |
15:50:13 | | BornOn420 quits [Remote host closed the connection] |
15:50:54 | | BornOn420 (BornOn420) joins |
15:52:32 | | andrew (andrew) joins |
15:53:42 | <@arkiver> | pabs: does it seem like fc2 can handle significant numbers of requests |
15:55:11 | <@arkiver> | pabs: what can you tell me about fc2? i remember reading it requires googlebot |
16:00:25 | <@arkiver> | ah yes googlebot |
16:00:28 | <@arkiver> | project coming |
16:05:41 | <@arkiver> | pabs: how were these sitemaps collected? |
16:06:51 | <@arkiver> | fc2 is very simple, no difficult stuff needed, so we'll just do an item per URL |
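The "item per URL" approach mentioned above can be sketched roughly as follows. This is a minimal illustration, not the actual project code: it assumes the sitemaps follow the standard sitemaps.org XML schema, and the `url:` item prefix, the function names, and the sample URLs are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemaps.org sitemap files.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

# Hypothetical sample sitemap; the real sitemaps are not shown in the log.
SAMPLE_SITEMAP = """
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/blog-entry-1.html</loc></url>
  <url><loc>https://example.com/blog-entry-2.html</loc></url>
  <url><loc>https://example.com/blog-entry-1.html</loc></url>
</urlset>
""".strip()


def urls_from_sitemap(xml_text):
    """Return every <loc> URL found in a sitemap document, in order."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc") if loc.text]


def items_from_urls(urls):
    """Turn URLs into one work item each, dropping duplicates but keeping order.

    The "url:" prefix is an assumed naming scheme for illustration only.
    """
    seen = set()
    items = []
    for url in urls:
        if url not in seen:
            seen.add(url)
            items.append("url:" + url)
    return items


if __name__ == "__main__":
    for item in items_from_urls(urls_from_sitemap(SAMPLE_SITEMAP)):
        print(item)
```

Keeping one URL per item means a failed fetch only requeues that single URL, which fits a site described as needing "no difficult stuff".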
16:09:24 | | arch quits [Remote host closed the connection] |
16:09:34 | | arch joins |
16:18:27 | | sepro quits [Quit: Bye!] |
16:24:12 | | sepro (sepro) joins |
16:36:19 | <@arkiver> | pabs: is the fc2 blog going away as well? |
16:48:50 | | grill (grill) joins |
16:55:49 | <@arkiver> | i see the sitemaps came from cruller |
16:55:59 | <@arkiver> | imer: we have an emergency FC2 project coming up... |
16:57:02 | <@imer> | ack |
17:00:33 | <@arkiver> | yeah it's going to be the "fc2" tracker |
17:00:47 | <@arkiver> | imer: would it be possible to get a target for this? i think this will not be *huge* |
17:00:56 | <@imer> | sure thing |
17:00:59 | <@arkiver> | pabs: what is the rate limiting like? |
17:01:04 | <@arkiver> | imer: thanks a lot!! |
17:01:07 | <@arkiver> | we would have |
17:01:10 | <@arkiver> | archiveteam_fc2_ |
17:01:11 | <@arkiver> | fc2_ |
17:01:15 | <@arkiver> | Archive Team FC2: |
17:03:29 | <@imer> | arkiver: target is added |
17:03:48 | <@imer> | shoved onto n905na so if we do need speed we can go ham |
17:04:00 | <@arkiver> | imer: thank you! |
17:04:06 | <@arkiver> | we're starting tomorrow |
17:04:14 | <@arkiver> | i'm too tired now, want to be around in case something goes wrong |
17:08:37 | | ericgallager joins |
17:08:42 | | driib9 quits [Quit: The Lounge - https://thelounge.chat] |
17:11:49 | | driib9 (driib) joins |
17:28:20 | <@imer> | makes sense. have a good rest when you do :) |
17:29:41 | <kiska> | fc2 wss listening |
17:31:08 | <@arkiver> | thanks imer |
17:31:15 | <@arkiver> | kiska: what? |
17:31:19 | <@arkiver> | ah nvm |
17:31:21 | <pokechu22> | There's also https://piyo.fc2.com/ but that's running decently in #archivebot |
17:55:23 | | datechnoman1 (datechnoman) joins |
17:55:32 | | datechnoman quits [Read error: Connection reset by peer] |
17:55:32 | | datechnoman1 is now known as datechnoman |
17:58:35 | | terry joins |
17:58:36 | | NotGLaDOS quits [Read error: Connection reset by peer] |
17:59:48 | | BornOn420 quits [Remote host closed the connection] |
18:00:27 | | BornOn420 (BornOn420) joins |
18:31:11 | <h2ibot> | HadeanEon edited Deaths in 2012 (+417, BOT - Updating page: {{saved}} (195),…): https://wiki.archiveteam.org/?diff=55819&oldid=55655 |
18:31:12 | <h2ibot> | HadeanEon edited Deaths in 2012/list (+34, BOT - Updating list): https://wiki.archiveteam.org/?diff=55820&oldid=55656 |
18:42:58 | <beastbg8> | http://clubs.dir.bg/ this forum seems to be on its last legs; it barely works. it has discussions from 1999 onwards. at some point in the early 2000s it was the most used forum in bulgaria; now it's barely used. i feel like it's the next thing they're about to close after the glog (custom .dir.bg) sites. can you add it for consideration? |
18:43:13 | <h2ibot> | HadeanEon edited Deaths in 2013 (+406, BOT - Updating page: {{saved}} (211),…): https://wiki.archiveteam.org/?diff=55821&oldid=55657 |
18:43:14 | <h2ibot> | HadeanEon edited Deaths in 2013/list (+23, BOT - Updating list): https://wiki.archiveteam.org/?diff=55822&oldid=55658 |
18:59:36 | | sighsloth1090 quits [Quit: WeeChat 4.6.0] |
19:00:30 | | Island joins |
19:17:57 | | BornOn420 quits [Remote host closed the connection] |
19:18:31 | | BornOn420 (BornOn420) joins |
19:19:39 | | PredatorIWD25 quits [Read error: Connection reset by peer] |
19:24:20 | | spirit quits [Quit: Leaving] |
19:24:29 | <h2ibot> | HadeanEon edited Deaths in 2016 (+402, BOT - Updating page: {{saved}} (131),…): https://wiki.archiveteam.org/?diff=55823&oldid=55663 |
19:24:30 | <h2ibot> | HadeanEon edited Deaths in 2016/list (+39, BOT - Updating list): https://wiki.archiveteam.org/?diff=55824&oldid=55664 |
19:24:33 | | PredatorIWD25 joins |
19:35:22 | | Wohlstand (Wohlstand) joins |
19:52:07 | | Wohlstand quits [Client Quit] |
19:54:16 | | grill quits [Ping timeout: 260 seconds] |
20:08:16 | | Snivy quits [Ping timeout: 260 seconds] |
20:13:33 | | Wohlstand (Wohlstand) joins |
20:13:39 | | Snivy (Snivy) joins |
20:15:41 | | lennier2_ quits [Ping timeout: 276 seconds] |
20:15:52 | | lennier2_ joins |
20:23:15 | | ahm258760 quits [Remote host closed the connection] |
20:23:18 | | Wohlstand quits [Client Quit] |
20:50:41 | <h2ibot> | HadeanEon edited Deaths in 2020/list (+1, BOT - Updating list): https://wiki.archiveteam.org/?diff=55825&oldid=55743 |
20:59:39 | | arch quits [Remote host closed the connection] |
21:00:09 | | arch joins |
21:25:47 | <h2ibot> | HadeanEon edited Deaths in 2022 (+464, BOT - Updating page: {{saved}} (217),…): https://wiki.archiveteam.org/?diff=55826&oldid=55675 |
21:25:48 | <h2ibot> | HadeanEon edited Deaths in 2022/list (+26, BOT - Updating list): https://wiki.archiveteam.org/?diff=55827&oldid=55676 |
21:45:50 | <h2ibot> | HadeanEon edited Deaths in 2023 (+425, BOT - Updating page: {{saved}} (181),…): https://wiki.archiveteam.org/?diff=55828&oldid=55677 |
21:45:51 | <h2ibot> | HadeanEon edited Deaths in 2023/list (+24, BOT - Updating list): https://wiki.archiveteam.org/?diff=55829&oldid=55678 |
22:07:53 | <h2ibot> | HadeanEon edited Deaths in 2024 (+782, BOT - Updating page: {{saved}} (224),…): https://wiki.archiveteam.org/?diff=55830&oldid=55744 |
22:07:54 | <h2ibot> | HadeanEon edited Deaths in 2024/list (+48, BOT - Updating list): https://wiki.archiveteam.org/?diff=55831&oldid=55745 |
22:13:33 | | Dada quits [Remote host closed the connection] |
22:16:25 | | pie_ quits [] |
22:16:34 | | pie_ (pie_) joins |
22:17:43 | | etnguyen03 (etnguyen03) joins |
22:17:55 | <h2ibot> | HadeanEon edited Deaths in 2025 (+4217, BOT - Updating page: {{saved}} (130),…): https://wiki.archiveteam.org/?diff=55832&oldid=55808 |
22:17:56 | <h2ibot> | HadeanEon edited Deaths in 2025/list (+328, BOT - Updating list): https://wiki.archiveteam.org/?diff=55833&oldid=55809 |
22:43:24 | <cancername> | hey y'all, I apologize if this is an uninformed question, but I noticed some URL exploration getting taken from common crawl. my question is: do y'all just take the URLs crawled by CC, or also extract the hrefs from it? if you do, how? I noticed the tabular index doesn't contain any outlinks, so one would have to download the WARCs to scan them, correct? |
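The href-extraction step asked about above could look something like this. It is a minimal sketch using only the Python standard library, and it assumes the HTML payload has already been read out of a WARC response record. The log does not say how Archive Team actually does this; `HrefCollector`, `extract_outlinks`, and the sample URLs are hypothetical names for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class HrefCollector(HTMLParser):
    """Collect the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links and drop any #fragment part.
                absolute, _fragment = urldefrag(urljoin(self.base_url, value))
                self.links.append(absolute)


def extract_outlinks(html_text, base_url):
    """Return unique http(s) outlinks from an HTML document, in page order."""
    parser = HrefCollector(base_url)
    parser.feed(html_text)
    parser.close()
    seen = set()
    outlinks = []
    for link in parser.links:
        if link.startswith(("http://", "https://")) and link not in seen:
            seen.add(link)
            outlinks.append(link)
    return outlinks


if __name__ == "__main__":
    page = '<p><a href="/about">About</a> <a href="https://other.example/x#top">X</a></p>'
    for url in extract_outlinks(page, "https://example.com/dir/page.html"):
        print(url)
```

Only `<a href>` links are handled here; other link-bearing attributes (`src`, `<link href>`, and so on) are deliberately left out to keep the sketch short.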
23:08:19 | | APOLLO_03 joins |
23:09:06 | | APOLLO03 quits [Ping timeout: 260 seconds] |
23:35:29 | | etnguyen03 quits [Client Quit] |
23:58:09 | | etnguyen03 (etnguyen03) joins |