00:18:59loug4 quits [Client Quit]
00:19:23sebs quits [Remote host closed the connection]
00:21:22@rewby quits [Ping timeout: 255 seconds]
00:34:13Mateon1 quits [Ping timeout: 272 seconds]
00:35:44Mateon1 joins
00:38:05BlueMaxima quits [Read error: Connection reset by peer]
00:46:48<h2ibot>Switchnode edited Deathwatch (+217, /* 2024 */ add 123guestbook): https://wiki.archiveteam.org/?diff=52359&oldid=52345
00:54:33rewby (rewby) joins
00:54:33@ChanServ sets mode: +o rewby
01:01:41RealPerson joins
01:10:02Macsteel (Macsteel) joins
01:15:55benjins2 joins
01:22:26parfait (kdqep) joins
01:42:27RealPerson leaves
01:43:08nertzy quits [Client Quit]
01:45:16RealPerson joins
01:48:12RealPerson leaves
01:48:37RealPerson joins
01:51:33<thuban>someone with common crawl indices want to grep for *.123guestbook.com? (imer?)
01:51:59<imer>sure, can start that. will take a day or two
01:52:21<thuban>thanks!
01:53:47RealPerson leaves
01:59:43eightthree quits [Ping timeout: 272 seconds]
02:06:40eightthree joins
02:47:48Hackerpcs quits [Client Quit]
02:50:39Hackerpcs (Hackerpcs) joins
03:22:26nertzy joins
03:24:31wickedplayer494 quits [Ping timeout: 255 seconds]
03:24:54wickedplayer494 joins
03:29:28wickedplayer494 quits [Ping timeout: 255 seconds]
03:30:01wickedplayer494 joins
03:49:24sec^nd quits [Remote host closed the connection]
03:50:07sec^nd (second) joins
03:51:58wickedplayer494 quits [Ping timeout: 255 seconds]
03:52:20wickedplayer494 joins
03:54:21Dango360 quits [Ping timeout: 272 seconds]
03:54:57Dango360 (Dango360) joins
03:58:56eroc1990 quits [Quit: The Lounge - https://thelounge.chat]
03:59:25eroc1990 (eroc1990) joins
04:18:25muklumsum quits [Ping timeout: 272 seconds]
04:20:30muklumsum joins
04:35:31muklumsum quits [Ping timeout: 272 seconds]
04:37:10muklumsum joins
04:50:08Island quits [Read error: Connection reset by peer]
05:20:46<thuban>here are some i got from the latest cc web graph: https://transfer.archivete.am/NtANm/123guestbook_subdomains_cc-main-2024-feb-apr-may.txt (this is the case where `!a <` should be fine, fwiw)
05:20:47<eggdrop>inline (for browser viewing): https://transfer.archivete.am/inline/NtANm/123guestbook_subdomains_cc-main-2024-feb-apr-may.txt
05:21:36<thuban>i tried some brute-forcing, but i hit it too hard (fell over at first, some 429s when i scaled back but not enough)
05:28:16<thuban>(interesting: deleted and not found are both 404, but they render differently https://treasure.123guestbook.com/ / https://travelers.123guestbook.com/)
05:31:15DogsRNice quits [Read error: Connection reset by peer]
05:39:19sludge joins
05:54:59<@hook54321>Are there any easy ways for someone to find their Google+ page from the grab or just if they happen to have the profile URL?
05:59:18Macsteel quits [Client Quit]
06:40:13nertzy quits [Client Quit]
07:00:42Lord_Nightmare quits [Client Quit]
07:04:43Lord_Nightmare (Lord_Nightmare) joins
07:09:25Unholy23619246453771 quits [Ping timeout: 272 seconds]
07:33:22Arcorann (Arcorann) joins
07:36:39nicolas17 quits [Ping timeout: 272 seconds]
07:39:15nicolas17 joins
08:20:37Sluggs quits [Ping timeout: 255 seconds]
08:51:35BlueMaxima joins
09:00:00Bleo1826007227196 quits [Client Quit]
09:01:20Bleo1826007227196 joins
09:03:22muklumsum quits [Ping timeout: 255 seconds]
09:05:51muklumsum joins
09:29:01muklumsum quits [Ping timeout: 255 seconds]
09:33:21muklumsum joins
09:33:53BlueMaxima quits [Read error: Connection reset by peer]
09:36:53loug4 joins
09:46:37muklumsum_ joins
09:46:51<IDK>https://freshcut.gg/ is shutting down tomorrow
09:47:45muklumsum quits [Ping timeout: 272 seconds]
09:56:01muklumsum_ quits [Ping timeout: 255 seconds]
10:00:58muklumsum joins
10:06:49muklumsum quits [Ping timeout: 255 seconds]
10:10:12muklumsum joins
10:16:43muklumsum quits [Ping timeout: 255 seconds]
10:17:04decky_e joins
10:18:54muklumsum joins
10:24:08<h2ibot>Exorcism edited LEGO Insiders Community (+4): https://wiki.archiveteam.org/?diff=52360&oldid=52344
10:29:19loug4 quits [Client Quit]
10:29:51loug4 joins
10:46:56<katia>IDK, it's running on archivebot now but i'm not sure it's crawling more than just the user profiles / no media seems to be gotten from those links - those are all via JS to a graphql endpoint :/
10:47:20<katia>oh there are some mp4s now hm :D
10:48:13muklumsum quits [Ping timeout: 255 seconds]
10:53:16muklumsum joins
11:01:13benjins2 quits [Ping timeout: 272 seconds]
11:02:31muklumsum_ joins
11:04:23muklumsum quits [Ping timeout: 272 seconds]
11:04:44benjins2 joins
11:06:15muklumsum joins
11:07:33muklumsum_ quits [Ping timeout: 272 seconds]
11:09:00nertzy joins
11:11:34<IDK>iirc the things are under storage.googleapis.com and a cdn domain
11:15:05<IDK>And it's probably not even mp4s, it's ts
11:21:48<IDK>tiki flashback lol
11:27:49muklumsum quits [Ping timeout: 255 seconds]
11:30:21muklumsum joins
12:00:30<@OrIdow6>Common crawl indices?
12:02:03<datechnoman>Think they meant indexes
12:04:53<@OrIdow6>More indexes of the links from common crawl? Or lists of the URLs that they've crawled? (And if the latter case what's the advantage over IA CDX?)
12:14:10systwi quits [Ping timeout: 255 seconds]
12:23:50sludge_ joins
12:24:31sludge quits [Ping timeout: 255 seconds]
12:28:21<masterx244|m><OrIdow6> "More indexes of the links from..." <- if you got them locally you can crunch the data much faster
12:28:51<masterx244|m>(did that at imgone times to hunt imgur links out of laion5B and friends)
12:30:37AK quits [Client Quit]
12:31:21AK (AK) joins
12:32:30systwi (systwi) joins
13:33:51Arcorann quits [Ping timeout: 272 seconds]
13:40:51Wohlstand (Wohlstand) joins
13:51:50ArchivalEfforts quits [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
13:57:05nertzy quits [Client Quit]
14:33:13Guest54 joins
14:43:19emtee joins
14:43:42emtee quits [Client Quit]
14:57:02evanim quits [Read error: Connection reset by peer]
14:57:05evanim (evanim) joins
15:21:38MrMcNuggets (MrMcNuggets) joins
15:27:19<that_lurker>Has anyone tried to AB https://osintukraine.com/ yet?
16:22:21datechnoman quits [Remote host closed the connection]
16:23:05datechnoman (datechnoman) joins
16:39:23<IDK>https://9to5google.com/2024/06/12/youtube-ad-injection/, does this affect #down-the-tube?
16:57:08Overlordz quits [Read error: Connection reset by peer]
16:57:09Overlordz joins
17:00:06Doranwen quits [Read error: Connection reset by peer]
17:00:22bladem quits [Ping timeout: 255 seconds]
17:00:33Doranwen (Doranwen) joins
17:01:49bladem (bladem) joins
17:14:24Island joins
17:39:09datechnoman quits [Remote host closed the connection]
17:39:35datechnoman (datechnoman) joins
17:40:14MrMcNuggets quits [Client Quit]
17:51:04driib quits [Quit: Ping timeout (120 seconds)]
17:51:19driib (driib) joins
18:44:23datechnoman quits [Remote host closed the connection]
18:44:47datechnoman (datechnoman) joins
19:08:10datechnoman quits [Client Quit]
19:08:31loug4 quits [Read error: Connection reset by peer]
19:08:33datechnoman (datechnoman) joins
19:09:09loug4 joins
19:20:30datechnoman quits [Read error: Connection reset by peer]
19:20:55datechnoman (datechnoman) joins
19:25:08Gereon026 (Gereon) joins
19:27:31Gereon02 quits [Ping timeout: 255 seconds]
19:27:31Gereon026 is now known as Gereon02
19:53:52datechnoman quits [Remote host closed the connection]
19:55:01datechnoman (datechnoman) joins
20:26:42datechnoman quits [Remote host closed the connection]
20:27:04datechnoman (datechnoman) joins
20:43:43decky_e quits [Read error: Connection reset by peer]
20:50:21JaffaCakes118 (JaffaCakes118) joins
20:52:14<JaffaCakes118>So Triage (a malware analysis platform) has started going more corporate and is slowly getting rid of their free users and deleting the analyses. They have a sitemap with every single url on triage, is there a way we can get this archived with archivebot or something? https://tria.ge/sitemap.xml
21:04:12datechnoman quits [Remote host closed the connection]
21:04:43flotwig quits [Ping timeout: 255 seconds]
21:04:58datechnoman (datechnoman) joins
21:11:55flotwig joins
21:36:06datechnoman quits [Remote host closed the connection]
21:37:02datechnoman (datechnoman) joins
21:42:41BlueMaxima joins
21:45:52Wohlstand quits [Client Quit]
21:46:02Wohlstand (Wohlstand) joins
21:51:08loug4 quits [Client Quit]
21:53:13loug4 joins
21:54:11pabs quits [Ping timeout: 272 seconds]
21:55:21pabs (pabs) joins
21:57:15JaffaCakes118 quits [Remote host closed the connection]
21:57:23JaffaCakes118 (JaffaCakes118) joins
21:59:15datechnoman quits [Remote host closed the connection]
21:59:37<nulldata>^looks like katia has thrown it in AB
22:00:10datechnoman (datechnoman) joins
22:08:13muklumsum_ joins
22:10:01muklumsum quits [Ping timeout: 272 seconds]
22:23:19Wohlstand quits [Ping timeout: 272 seconds]
22:24:01BlueMaxima quits [Read error: Connection reset by peer]
22:25:08datechnoman quits [Remote host closed the connection]
22:25:49datechnoman (datechnoman) joins
22:41:28Barto quits [Ping timeout: 255 seconds]
22:41:55mgrytbak quits [Ping timeout: 255 seconds]
22:42:13Barto (Barto) joins
22:44:06JohnnyJ quits [Read error: Connection reset by peer]
22:49:43datechnoman quits [Remote host closed the connection]
22:50:07datechnoman (datechnoman) joins
22:56:19ThreeHM quits [Ping timeout: 255 seconds]
23:03:44yarrow quits [Read error: Connection reset by peer]
23:06:58yarrow (yarrow) joins
23:09:49muklumsum_ quits [Ping timeout: 255 seconds]
23:10:08ThreeHM (ThreeHeadedMonkey) joins
23:10:16datechnoman quits [Remote host closed the connection]
23:11:03datechnoman (datechnoman) joins
23:15:36datechnoman quits [Client Quit]
23:16:25JaffaCakes118 quits [Remote host closed the connection]
23:16:53JaffaCakes118 (JaffaCakes118) joins
23:18:56muklumsum joins
23:41:27datechnoman (datechnoman) joins
23:46:17<@OrIdow6>masterx244|m: So it's the URLs that CC captured?
23:46:57<@OrIdow6>If so how is that specifically better than IA CDX?
23:53:26datechnoman quits [Client Quit]
23:54:21<imer>IA cdx is a superset actually since it includes common crawl data too (in a recent comparison with JAA for #webroasting I found one URL the IA index didn't have... which was a 404, incidentally)
23:55:31datechnoman (datechnoman) joins