00:17:07 | | lennier2_ joins |
00:20:35 | | lennier2 quits [Ping timeout: 276 seconds] |
00:42:11 | | lennier2 joins |
00:45:26 | | lennier2_ quits [Ping timeout: 260 seconds] |
00:47:00 | <cruller> | Hi, I'm from Japan and I have been personally archiving diary{,1,2,3}.fc2.com. I'd like to share some information. |
00:56:56 | <cruller> | Around April 29, I collected user IDs from the Wayback Machine CDX, Common Crawl CDX, Google Search, and Brave Search and extracted only those that actually existed. |
00:56:56 | <cruller> | Here's the result. https://transfer.archivete.am/SnZQe/diary_fc2_urls_another.txt |
00:56:57 | <eggdrop> | inline (for browser viewing): https://transfer.archivete.am/inline/SnZQe/diary_fc2_urls_another.txt |
00:59:59 | <@JAA> | cruller: Nice, thank you! Do you still have the list of users that didn't exist? We'd want to create a record of that, too. |
01:03:20 | <cruller> | JAA: I think I probably have it. I'll look for it. |
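The ID collection step described above could look roughly like the following. This is only a sketch: it assumes user URLs follow the ed.cgi shape that cruller gives later in this log, and the regex and the function name `extract_user_ids` are illustrative, not taken from cruller's actual tooling. It also does not perform the "actually existed" check, which would need live requests.

```python
import re

# Assumed URL shape (diary, diary1, diary2, diary3 hosts), based on the
# ed.cgi pattern mentioned elsewhere in this log. Not confirmed as exhaustive.
USER_URL = re.compile(
    r"^https?://diary[123]?\.fc2\.com/cgi-sys/ed\.cgi/([^/?#]+)"
)


def extract_user_ids(urls):
    """Return unique user IDs from candidate URLs, in first-seen order."""
    seen = {}
    for url in urls:
        match = USER_URL.match(url.strip())
        if match:
            # dict keys keep insertion order, so this deduplicates stably
            seen.setdefault(match.group(1), None)
    return list(seen)
```

Candidate URLs from several sources (CDX dumps, search results) could simply be concatenated and passed in as one iterable; whether IDs are shared across the four hosts is not stated in the log, so this sketch ignores the host.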
01:07:23 | | lennier2 quits [Ping timeout: 276 seconds] |
01:31:01 | <cruller> | After doing a bit of crawling based on this list, I was able to guess that pages (aside from requisites) fall into the following general types:... (full message at <https://matrix.hackint.org/_irc/v1/media/download/AcauXfLrAl0gpVgUfgVBsEf7PXjH56mtVv2RVgPt56SN4Mmv0V9BSP0GvXMRmmAKYnIjWu2Bd893v8pRJ11mdWZCfghpW3ygAGhhY2tpbnQub3JnL2VqenpldEJadkVxempURWt0YlB3dFFmUw>) |
01:37:59 | | pabs quits [Read error: Connection reset by peer] |
01:38:38 | | pabs (pabs) joins |
01:38:55 | <cruller> | So, I first saved http://diary.fc2.com/cgi-sys/ed.cgi/{user_ID}/, checked from there which months exist, and then saved only http://diary.fc2.com/cgi-sys/ed.cgi/{user_ID}/?Y={year}&M={month}&all=1 and its requisites. |
01:38:55 | <cruller> | This is already done. I can share the CDX file (546 MB) if needed. |
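As a rough illustration of the URL scheme cruller describes, the per-month URLs could be generated as below. This is a sketch under assumptions: the function names are hypothetical, the month is passed through exactly as given (the log does not say whether it is zero-padded), and discovering which months exist from the index page is left out.

```python
from urllib.parse import urlencode

# URL template as quoted in the log; only the diary.fc2.com host is shown there.
BASE = "http://diary.fc2.com/cgi-sys/ed.cgi/{user_id}/"


def index_url(user_id):
    """URL of the user's index page, fetched first to find existing months."""
    return BASE.format(user_id=user_id)


def month_url(user_id, year, month):
    """URL of the all-entries view for one month (Y, M, all=1)."""
    query = urlencode({"Y": year, "M": month, "all": 1})
    return index_url(user_id) + "?" + query


def month_urls(user_id, months):
    """Build month URLs from an iterable of (year, month) pairs."""
    return [month_url(user_id, y, m) for y, m in months]
```

The (year, month) pairs would come from parsing the index page, which is not shown because the log does not describe its markup.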
01:54:04 | <cruller> | In addition, I did a brute force search on Google yesterday... (full message at <https://matrix.hackint.org/_irc/v1/media/download/ASBccz7Rxtk5RB0p5Udeh8vfTAj4GrVY4k9YOH9GsvLr3nvIFYOrMVI7a0cl-EBxhT3A9616udTVuj5Ln-j2BCpCfghqrT9gAGhhY2tpbnQub3JnL2xJcEhyZ0prZ0JrWGhmWVh5d1NETHRlWg>) |
02:00:32 | <cruller> | Re my earlier "I think I probably have it. I'll...": I found it. This CSV is from before merging the Google/Brave search results and before checking whether the users actually exist. |
02:00:32 | <cruller> | https://transfer.archivete.am/LQlkh/diary_fc2_urls_another_dirty.csv |
02:00:33 | <eggdrop> | inline (for browser viewing): https://transfer.archivete.am/inline/LQlkh/diary_fc2_urls_another_dirty.csv |
02:39:20 | | Wohlstand quits [Client Quit] |
02:40:52 | | Wohlstand (Wohlstand) joins |
02:41:48 | | Wohlstand quits [Client Quit] |
02:56:58 | | hexagonwin quits [Remote host closed the connection] |
03:33:33 | | lennier2 joins |
03:36:28 | <cruller> | <triplecamera|m> "This is an interesting project....." <- The LOC has archived them [https://webarchive.loc.gov/all/20120608063459/http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-828-operating-system-engineering-fall-2006/labs/] and the entire MIT OCW [https://www.loc.gov/item/lcwaN0004360/] (as you may have already noticed). |
04:18:06 | | lennier2_ joins |
04:21:05 | | lennier2 quits [Ping timeout: 276 seconds] |
05:15:54 | | lennier2 joins |
05:19:01 | | lennier2_ quits [Ping timeout: 260 seconds] |
05:19:52 | | magmaus3 quits [Remote host closed the connection] |
05:21:29 | | magmaus3 (magmaus3) joins |
05:43:12 | | Island quits [Read error: Connection reset by peer] |
06:06:59 | <h2ibot> | PaulWise edited Anubis (+51, anubis on more wikis): https://wiki.archiveteam.org/?diff=55811&oldid=55765 |
06:08:38 | <@arkiver> | pabs: do we need an emergency warrior project? |
06:08:55 | <pabs> | I'm thinking yes |
06:10:14 | <pabs> | the AB job is 7qwz68jnobcw90l4utbhabotd, the 9mil queue includes onsite and ignored offsite stuff |
06:10:21 | <pabs> | there are a lot of onsite domains though |
06:12:12 | <pokechu22> | The queue on 7qwz68jnobcw90l4utbhabotd only got to robots.txt/sitemap.xml a bit ago (which happens after it made it through the initial list) |
06:23:24 | | magmaus3 quits [Read error: Connection reset by peer] |
06:23:35 | | magmaus3 (magmaus3) joins |
06:38:42 | | ArchivalEfforts quits [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.] |
06:38:51 | | ArchivalEfforts joins |
06:44:10 | | Dada joins |
06:52:35 | | xkey quits [Quit: WeeChat 4.4.3] |
06:52:50 | | xkey (xkey) joins |
06:52:53 | | xkey quits [Client Quit] |
06:53:20 | | xkey (xkey) joins |
06:55:22 | | skyrocket quits [Quit: ZNC 1.8.2+deb2build5 - https://znc.in] |
07:25:57 | | Wohlstand (Wohlstand) joins |
07:29:28 | | skyrocket joins |
07:35:09 | | monoxane (monoxane) joins |
08:28:01 | | NatTheCat quits [Ping timeout: 260 seconds] |
09:01:16 | | Doomaholic quits [Read error: Connection reset by peer] |
09:02:05 | | Doomaholic (Doomaholic) joins |
09:36:18 | | arch quits [Remote host closed the connection] |
09:36:26 | | arch joins |
09:51:32 | | T31M quits [Quit: ZNC - https://znc.in] |
09:52:21 | | T31M joins |
09:53:11 | | T31M is now authenticated as T31M |
10:00:42 | <h2ibot> | Manu edited Mailman/2 (+6, /* Queued lists.cubik.org */): https://wiki.archiveteam.org/?diff=55812&oldid=55631 |
10:09:43 | <h2ibot> | Manu edited Mailman/2 (+37, /* Grabbed lists.cypherpunks.ca */): https://wiki.archiveteam.org/?diff=55813&oldid=55812 |
11:00:02 | | Bleo182600722719623455 quits [Quit: The Lounge - https://thelounge.chat] |
11:02:45 | | Bleo182600722719623455 joins |
11:04:32 | | arch_ joins |
11:04:51 | | arch quits [Remote host closed the connection] |
11:04:52 | | arch_ is now known as arch |
11:06:51 | <h2ibot> | Manu edited Mailman/2 (+53, /* Queued lists.danga.com */): https://wiki.archiveteam.org/?diff=55814&oldid=55813 |
11:36:33 | | Webuser813225 joins |
11:39:08 | <Webuser813225> | How do I open libgen_compact.rar / libgen_compact.sql? With MariaDB? |
11:43:20 | <nimaje> | From the .sql extension I would expect an SQL dump using standard SQL syntax. Is it a text file? (If it is a standard SQL dump, then any SQL database system should be able to import it.) |
11:45:46 | | aninternettroll quits [Ping timeout: 260 seconds] |
11:48:46 | | aninternettroll (aninternettroll) joins |
13:04:31 | | datechnoman quits [Ping timeout: 260 seconds] |
13:13:35 | | datechnoman (datechnoman) joins |
13:22:40 | | Wohlstand quits [Quit: Wohlstand] |
13:24:41 | | Wohlstand (Wohlstand) joins |
14:07:24 | | Webuser813225 quits [Client Quit] |
15:05:37 | | Webuser188366 joins |
15:06:08 | | Webuser188366 quits [Client Quit] |
15:18:39 | | Island joins |
15:18:39 | | Island_ joins |
15:25:08 | | Cuphead2527480 (Cuphead2527480) joins |
15:27:05 | <Cuphead2527480> | Guys, why don't we try nitter.net? It's a Twitter proxy. Its firewall is very sensitive, though (by that I mean if you make something like 5 requests in less than a second, your IP is blocked for 17 hours, so you must wait about 10 seconds between requests unless you know of a Nitter instance without this restriction). It's not exactly the Twitter domain, but it does the job |
15:27:05 | <Cuphead2527480> | until a new workaround is found. |
15:27:53 | <Cuphead2527480> | I came here because manu suggested it. |
15:40:16 | | pseudorizer quits [Ping timeout: 260 seconds] |
15:41:04 | | pseudorizer (pseudorizer) joins |
15:48:08 | | ^ quits [Ping timeout: 276 seconds] |
15:48:16 | | ^ (^) joins |
15:48:43 | <h2ibot> | Manu edited Mailman/2 (+29, /* Queued lists.devloop.org.uk */): https://wiki.archiveteam.org/?diff=55815&oldid=55814 |
15:49:43 | <h2ibot> | Manu edited Mailman/2 (+59, /* Add note regarding lists.debconf.org */): https://wiki.archiveteam.org/?diff=55816&oldid=55815 |
15:56:52 | | Mateon1 quits [Quit: Mateon1] |
16:14:08 | | datechnoman quits [Ping timeout: 276 seconds] |
16:22:05 | | Cuphead2527480 quits [Client Quit] |
16:22:50 | | Wohlstand quits [Client Quit] |
16:30:19 | | datechnoman (datechnoman) joins |
16:38:50 | <h2ibot> | DigitalDragon edited Wikibot (+0, user agents for everything): https://wiki.archiveteam.org/?diff=55817&oldid=55775 |
16:40:18 | | BornOn420 quits [Remote host closed the connection] |
16:40:57 | | BornOn420 (BornOn420) joins |
16:52:11 | | grill (grill) joins |
17:08:05 | | datechnoman quits [Ping timeout: 276 seconds] |
17:24:22 | | datechnoman (datechnoman) joins |
17:29:33 | | datechnoman quits [Ping timeout: 276 seconds] |
17:30:13 | | datechnoman (datechnoman) joins |
17:34:50 | | lennier2_ joins |
17:35:23 | | datechnoman quits [Ping timeout: 276 seconds] |
17:38:06 | | lennier2 quits [Ping timeout: 260 seconds] |
18:18:13 | | datechnoman (datechnoman) joins |
18:21:29 | | ATinySpaceMarine quits [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.] |
18:46:35 | | datechnoman quits [Client Quit] |
18:46:54 | | datechnoman (datechnoman) joins |
18:55:59 | | datechnoman quits [Ping timeout: 276 seconds] |
19:03:01 | | datechnoman (datechnoman) joins |
19:07:56 | | datechnoman quits [Ping timeout: 260 seconds] |
19:11:26 | | Flashfire42 quits [Ping timeout: 260 seconds] |
19:11:26 | | Ryz2 quits [Ping timeout: 260 seconds] |
19:11:26 | | s-crypt quits [Ping timeout: 260 seconds] |
19:11:26 | | kiska quits [Ping timeout: 260 seconds] |
19:35:57 | | datechnoman (datechnoman) joins |
19:36:01 | | Webuser491448 joins |
19:36:31 | | Webuser491448 quits [Client Quit] |
19:45:23 | | datechnoman quits [Ping timeout: 276 seconds] |
19:47:04 | | s-crypt (s-crypt) joins |
19:47:07 | | Ryz2 (Ryz) joins |
19:47:23 | | Flashfire42 joins |
19:49:14 | <c3manu> | for the record: i suggested making suggestions like that in #archiveteam-bs. i didn't know the suggestion was to use nitter.net. |
19:49:30 | <@JAA> | To repeat it here from the earlier discussion in #archivebot: no, we won't abuse random Nitter instances without the operator's approval. |
19:49:40 | <@JAA> | nitter.net specifically also blocks AB, FWIW. |
20:02:11 | | AlsoHP_Archivist quits [Ping timeout: 260 seconds] |
20:02:56 | | grill quits [Ping timeout: 276 seconds] |
20:17:53 | | datechnoman (datechnoman) joins |
20:53:30 | | Flashfire42 is now authenticated as flashfire42 |
20:55:16 | | datechnoman quits [Ping timeout: 260 seconds] |
20:55:35 | | lennier2_ quits [Ping timeout: 276 seconds] |
20:56:59 | | lennier2_ joins |
21:23:34 | | datechnoman (datechnoman) joins |
21:32:59 | | DogsRNice joins |
21:40:39 | | nine quits [Quit: See ya!] |
21:40:52 | | nine joins |
21:40:52 | | nine is now authenticated as nine |
21:40:52 | | nine quits [Changing host] |
21:40:52 | | nine (nine) joins |
21:44:30 | | cuphead2527480 (Cuphead2527480) joins |
21:45:24 | <cuphead2527480> | Anyone know how to get voiced in the #archivebot channel? |
21:46:12 | <nicolas17> | lurk moar |
22:01:53 | | datechnoman quits [Ping timeout: 276 seconds] |
22:06:49 | | datechnoman (datechnoman) joins |
22:10:27 | | etnguyen03 (etnguyen03) joins |
22:10:42 | | yano quits [Quit: WeeChat, https://weechat.org/] |
22:12:36 | | yano (yano) joins |
22:19:26 | | datechnoman quits [Ping timeout: 276 seconds] |
22:34:09 | | datechnoman (datechnoman) joins |
22:40:32 | <pokechu22> | https://s3.us-east-1.amazonaws.com/rds.nsrl.nist.gov/software/NSRL_free_bags_README.htm - this has 15 TB of data according to the CSV file |
22:40:52 | <pokechu22> | (or will have) |
22:45:31 | | datechnoman quits [Ping timeout: 260 seconds] |
22:45:31 | | terry quits [Ping timeout: 260 seconds] |
22:53:01 | | datechnoman (datechnoman) joins |
22:57:21 | | NotGLaDOS joins |
22:59:54 | | Dada quits [Remote host closed the connection] |
23:12:30 | | ATinySpaceMarine joins |
23:23:16 | | etnguyen03 quits [Client Quit] |
23:26:01 | | pixel leaves [Error from remote client] |
23:46:59 | | etnguyen03 (etnguyen03) joins |
23:52:00 | | i_have_n0_idea3 quits [Read error: Connection reset by peer] |
23:52:12 | | i_have_n0_idea3 (i_have_n0_idea) joins |
23:57:08 | | etnguyen03 quits [Client Quit] |
23:57:42 | | etnguyen03 (etnguyen03) joins |
23:58:13 | | Mist8kenGAS (Mist8kenGAS) joins |