| 00:18:59 | | loug4 quits [Client Quit] |
| 00:19:23 | | sebs quits [Remote host closed the connection] |
| 00:21:22 | | @rewby quits [Ping timeout: 255 seconds] |
| 00:34:13 | | Mateon1 quits [Ping timeout: 272 seconds] |
| 00:35:44 | | Mateon1 joins |
| 00:38:05 | | BlueMaxima quits [Read error: Connection reset by peer] |
| 00:46:48 | <h2ibot> | Switchnode edited Deathwatch (+217, /* 2024 */ add 123guestbook): https://wiki.archiveteam.org/?diff=52359&oldid=52345 |
| 00:54:33 | | rewby (rewby) joins |
| 00:54:33 | | @ChanServ sets mode: +o rewby |
| 01:01:41 | | RealPerson joins |
| 01:10:02 | | Macsteel (Macsteel) joins |
| 01:15:55 | | benjins2 joins |
| 01:22:26 | | parfait (kdqep) joins |
| 01:42:27 | | RealPerson leaves |
| 01:43:08 | | nertzy quits [Client Quit] |
| 01:45:16 | | RealPerson joins |
| 01:48:12 | | RealPerson leaves |
| 01:48:37 | | RealPerson joins |
| 01:51:33 | <thuban> | someone with common crawl indices want to grep for *.123guestbook.com? (imer?) |
| 01:51:59 | <imer> | sure, can start that. will take a day or two |
| 01:52:21 | <thuban> | thanks! |
| 01:53:47 | | RealPerson leaves |
| 01:59:43 | | eightthree quits [Ping timeout: 272 seconds] |
| 02:06:40 | | eightthree joins |
| 02:47:48 | | Hackerpcs quits [Client Quit] |
| 02:50:39 | | Hackerpcs (Hackerpcs) joins |
| 03:22:26 | | nertzy joins |
| 03:24:31 | | wickedplayer494 quits [Ping timeout: 255 seconds] |
| 03:24:54 | | wickedplayer494 joins |
| 03:25:01 | | wickedplayer494 is now authenticated as wickedplayer494 |
| 03:29:28 | | wickedplayer494 quits [Ping timeout: 255 seconds] |
| 03:30:01 | | wickedplayer494 joins |
| 03:30:13 | | wickedplayer494 is now authenticated as wickedplayer494 |
| 03:49:24 | | sec^nd quits [Remote host closed the connection] |
| 03:50:07 | | sec^nd (second) joins |
| 03:51:58 | | wickedplayer494 quits [Ping timeout: 255 seconds] |
| 03:52:20 | | wickedplayer494 joins |
| 03:54:21 | | Dango360 quits [Ping timeout: 272 seconds] |
| 03:54:57 | | Dango360 (Dango360) joins |
| 03:58:56 | | eroc1990 quits [Quit: The Lounge - https://thelounge.chat] |
| 03:59:25 | | eroc1990 (eroc1990) joins |
| 04:18:25 | | muklumsum quits [Ping timeout: 272 seconds] |
| 04:20:30 | | muklumsum joins |
| 04:35:31 | | muklumsum quits [Ping timeout: 272 seconds] |
| 04:37:10 | | muklumsum joins |
| 04:50:08 | | Island quits [Read error: Connection reset by peer] |
| 04:52:16 | | wickedplayer494 is now authenticated as wickedplayer494 |
| 05:20:46 | <thuban> | here are some i got from the latest cc web graph: https://transfer.archivete.am/NtANm/123guestbook_subdomains_cc-main-2024-feb-apr-may.txt (this is the case where `!a <` should be fine, fwiw) |
| 05:20:47 | <eggdrop> | inline (for browser viewing): https://transfer.archivete.am/inline/NtANm/123guestbook_subdomains_cc-main-2024-feb-apr-may.txt |
| 05:21:36 | <thuban> | i tried some brute-forcing, but i hit it too hard (fell over at first, some 429s when i scaled back but not enough) |
| 05:28:16 | <thuban> | (interesting: deleted and not found are both 404, but they render differently https://treasure.123guestbook.com/ / https://travelers.123guestbook.com/) |
| 05:31:15 | | DogsRNice quits [Read error: Connection reset by peer] |
| 05:39:19 | | sludge joins |
| 05:39:42 | | sludge is now authenticated as sludge |
| 05:54:59 | <@hook54321> | Are there any easy ways for someone to find their Google+ page from the grab or just if they happen to have the profile URL? |
| 05:59:18 | | Macsteel quits [Client Quit] |
| 06:40:13 | | nertzy quits [Client Quit] |
| 07:00:42 | | Lord_Nightmare quits [Client Quit] |
| 07:04:43 | | Lord_Nightmare (Lord_Nightmare) joins |
| 07:09:25 | | Unholy23619246453771 quits [Ping timeout: 272 seconds] |
| 07:33:22 | | Arcorann (Arcorann) joins |
| 07:36:39 | | nicolas17 quits [Ping timeout: 272 seconds] |
| 07:39:15 | | nicolas17 joins |
| 08:20:37 | | Sluggs quits [Ping timeout: 255 seconds] |
| 08:51:35 | | BlueMaxima joins |
| 09:00:00 | | Bleo1826007227196 quits [Client Quit] |
| 09:01:20 | | Bleo1826007227196 joins |
| 09:03:22 | | muklumsum quits [Ping timeout: 255 seconds] |
| 09:05:51 | | muklumsum joins |
| 09:29:01 | | muklumsum quits [Ping timeout: 255 seconds] |
| 09:33:21 | | muklumsum joins |
| 09:33:53 | | BlueMaxima quits [Read error: Connection reset by peer] |
| 09:36:53 | | loug4 joins |
| 09:46:37 | | muklumsum_ joins |
| 09:46:51 | <IDK> | https://freshcut.gg/ is shutting down tomorrow |
| 09:47:45 | | muklumsum quits [Ping timeout: 272 seconds] |
| 09:56:01 | | muklumsum_ quits [Ping timeout: 255 seconds] |
| 10:00:58 | | muklumsum joins |
| 10:06:49 | | muklumsum quits [Ping timeout: 255 seconds] |
| 10:10:12 | | muklumsum joins |
| 10:16:43 | | muklumsum quits [Ping timeout: 255 seconds] |
| 10:17:04 | | decky_e joins |
| 10:18:54 | | muklumsum joins |
| 10:24:08 | <h2ibot> | Exorcism edited LEGO Insiders Community (+4): https://wiki.archiveteam.org/?diff=52360&oldid=52344 |
| 10:29:19 | | loug4 quits [Client Quit] |
| 10:29:51 | | loug4 joins |
| 10:46:56 | <katia> | IDK, it's running on archivebot now but i'm not sure it's crawling more than just the user profiles / no media seems to be gotten from those links - those are all via JS to a graphql endpoint :/ |
| 10:47:20 | <katia> | oh there are some mp4s now hm :D |
| 10:48:13 | | muklumsum quits [Ping timeout: 255 seconds] |
| 10:53:16 | | muklumsum joins |
| 11:01:13 | | benjins2 quits [Ping timeout: 272 seconds] |
| 11:02:31 | | muklumsum_ joins |
| 11:04:23 | | muklumsum quits [Ping timeout: 272 seconds] |
| 11:04:44 | | benjins2 joins |
| 11:06:15 | | muklumsum joins |
| 11:07:33 | | muklumsum_ quits [Ping timeout: 272 seconds] |
| 11:09:00 | | nertzy joins |
| 11:11:34 | <IDK> | iirc the things are under storage.googleapis.com and a cdn domain |
| 11:15:05 | <IDK> | And it's probably not even mp4s, it's ts |
| 11:21:48 | <IDK> | tiki flashback lol |
| 11:27:49 | | muklumsum quits [Ping timeout: 255 seconds] |
| 11:30:21 | | muklumsum joins |
| 12:00:30 | <@OrIdow6> | Common crawl indices? |
| 12:02:03 | <datechnoman> | Think they meant indexes |
| 12:04:53 | <@OrIdow6> | More indexes of the links from common crawl? Or lists of the URLs that they've crawled? (And if the latter case what's the advantage over IA CDX?) |
| 12:14:10 | | systwi quits [Ping timeout: 255 seconds] |
| 12:23:50 | | sludge_ joins |
| 12:24:31 | | sludge quits [Ping timeout: 255 seconds] |
| 12:28:21 | <masterx244|m> | <OrIdow6> "More indexes of the links from..." <- if you got them locally you can crunch the data much faster |
| 12:28:51 | <masterx244|m> | (did that at imgone times to hunt imgur links out of laion5B and friends) |
| 12:30:37 | | AK quits [Client Quit] |
| 12:31:21 | | AK (AK) joins |
| 12:32:30 | | systwi (systwi) joins |
| 13:33:51 | | Arcorann quits [Ping timeout: 272 seconds] |
| 13:40:51 | | Wohlstand (Wohlstand) joins |
| 13:51:50 | | ArchivalEfforts quits [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.] |
| 13:57:05 | | nertzy quits [Client Quit] |
| 14:33:13 | | Guest54 joins |
| 14:43:19 | | emtee joins |
| 14:43:42 | | emtee quits [Client Quit] |
| 14:57:02 | | evanim quits [Read error: Connection reset by peer] |
| 14:57:05 | | evanim (evanim) joins |
| 15:21:38 | | MrMcNuggets (MrMcNuggets) joins |
| 15:27:19 | <that_lurker> | Has anyone tried to AB https://osintukraine.com/ yet? |
| 16:22:21 | | datechnoman quits [Remote host closed the connection] |
| 16:23:05 | | datechnoman (datechnoman) joins |
| 16:39:23 | <IDK> | https://9to5google.com/2024/06/12/youtube-ad-injection/, does this affect #down-the-tube? |
| 16:57:08 | | Overlordz quits [Read error: Connection reset by peer] |
| 16:57:09 | | Overlordz joins |
| 17:00:06 | | Doranwen quits [Read error: Connection reset by peer] |
| 17:00:22 | | bladem quits [Ping timeout: 255 seconds] |
| 17:00:33 | | Doranwen (Doranwen) joins |
| 17:01:49 | | bladem (bladem) joins |
| 17:14:24 | | Island joins |
| 17:39:09 | | datechnoman quits [Remote host closed the connection] |
| 17:39:35 | | datechnoman (datechnoman) joins |
| 17:40:14 | | MrMcNuggets quits [Client Quit] |
| 17:51:04 | | driib quits [Quit: Ping timeout (120 seconds)] |
| 17:51:19 | | driib (driib) joins |
| 18:44:23 | | datechnoman quits [Remote host closed the connection] |
| 18:44:47 | | datechnoman (datechnoman) joins |
| 19:08:10 | | datechnoman quits [Client Quit] |
| 19:08:31 | | loug4 quits [Read error: Connection reset by peer] |
| 19:08:33 | | datechnoman (datechnoman) joins |
| 19:09:09 | | loug4 joins |
| 19:20:30 | | datechnoman quits [Read error: Connection reset by peer] |
| 19:20:55 | | datechnoman (datechnoman) joins |
| 19:25:08 | | Gereon026 (Gereon) joins |
| 19:27:31 | | Gereon02 quits [Ping timeout: 255 seconds] |
| 19:27:31 | | Gereon026 is now known as Gereon02 |
| 19:53:52 | | datechnoman quits [Remote host closed the connection] |
| 19:55:01 | | datechnoman (datechnoman) joins |
| 20:26:42 | | datechnoman quits [Remote host closed the connection] |
| 20:27:04 | | datechnoman (datechnoman) joins |
| 20:43:43 | | decky_e quits [Read error: Connection reset by peer] |
| 20:50:21 | | JaffaCakes118 (JaffaCakes118) joins |
| 20:52:14 | <JaffaCakes118> | So Triage (a malware analysis platform) has started going more corporate and is slowly getting rid of their free users and deleting the analyses. They have a sitemap with every single URL on Triage; is there a way we can get this archived with archivebot or something? https://tria.ge/sitemap.xml |
| 21:04:12 | | datechnoman quits [Remote host closed the connection] |
| 21:04:43 | | flotwig quits [Ping timeout: 255 seconds] |
| 21:04:58 | | datechnoman (datechnoman) joins |
| 21:11:55 | | flotwig joins |
| 21:36:06 | | datechnoman quits [Remote host closed the connection] |
| 21:37:02 | | datechnoman (datechnoman) joins |
| 21:42:41 | | BlueMaxima joins |
| 21:45:52 | | Wohlstand quits [Client Quit] |
| 21:46:02 | | Wohlstand (Wohlstand) joins |
| 21:51:08 | | loug4 quits [Client Quit] |
| 21:53:13 | | loug4 joins |
| 21:54:11 | | pabs quits [Ping timeout: 272 seconds] |
| 21:55:21 | | pabs (pabs) joins |
| 21:57:15 | | JaffaCakes118 quits [Remote host closed the connection] |
| 21:57:23 | | JaffaCakes118 (JaffaCakes118) joins |
| 21:59:15 | | datechnoman quits [Remote host closed the connection] |
| 21:59:37 | <nulldata> | ^looks like katia has thrown it in AB |
| 22:00:10 | | datechnoman (datechnoman) joins |
| 22:08:13 | | muklumsum_ joins |
| 22:10:01 | | muklumsum quits [Ping timeout: 272 seconds] |
| 22:23:19 | | Wohlstand quits [Ping timeout: 272 seconds] |
| 22:24:01 | | BlueMaxima quits [Read error: Connection reset by peer] |
| 22:25:08 | | datechnoman quits [Remote host closed the connection] |
| 22:25:49 | | datechnoman (datechnoman) joins |
| 22:41:28 | | Barto quits [Ping timeout: 255 seconds] |
| 22:41:55 | | mgrytbak quits [Ping timeout: 255 seconds] |
| 22:42:13 | | Barto (Barto) joins |
| 22:44:06 | | JohnnyJ quits [Read error: Connection reset by peer] |
| 22:49:43 | | datechnoman quits [Remote host closed the connection] |
| 22:50:07 | | datechnoman (datechnoman) joins |
| 22:56:19 | | ThreeHM quits [Ping timeout: 255 seconds] |
| 23:03:44 | | yarrow quits [Read error: Connection reset by peer] |
| 23:06:58 | | yarrow (yarrow) joins |
| 23:09:49 | | muklumsum_ quits [Ping timeout: 255 seconds] |
| 23:10:08 | | ThreeHM (ThreeHeadedMonkey) joins |
| 23:10:16 | | datechnoman quits [Remote host closed the connection] |
| 23:11:03 | | datechnoman (datechnoman) joins |
| 23:15:36 | | datechnoman quits [Client Quit] |
| 23:16:25 | | JaffaCakes118 quits [Remote host closed the connection] |
| 23:16:53 | | JaffaCakes118 (JaffaCakes118) joins |
| 23:18:56 | | muklumsum joins |
| 23:41:27 | | datechnoman (datechnoman) joins |
| 23:46:17 | <@OrIdow6> | masterx244|m: So it's the URLs that CC captured? |
| 23:46:57 | <@OrIdow6> | If so how is that specifically better than IA CDX? |
| 23:53:26 | | datechnoman quits [Client Quit] |
| 23:54:21 | <imer> | IA CDX is a superset actually, since it includes common crawl data too (in a recent comparison with JAA for #webroasting I found one URL the IA index didn't have... which was a 404, incidentally) |
| 23:55:31 | | datechnoman (datechnoman) joins |