| 00:06:00 | | wickedplayer494 quits [Ping timeout: 265 seconds] |
| 01:16:11 | | kiryu quits [Client Quit] |
| 01:19:24 | | Arcorann (Arcorann) joins |
| 01:27:01 | | icedice quits [Client Quit] |
| 01:33:42 | | wickedplayer494 joins |
| 01:33:52 | | wickedplayer494 is now authenticated as wickedplayer494 |
| 01:41:17 | | etnguyen03 quits [Ping timeout: 252 seconds] |
| 01:47:52 | <pabs> | arkiver: there are definitely times when AB is at or above the job limit + 5 pending, especially when flashfire42 is around doing ISP stuff. I tend to avoid doing proactive stuff a fair bit, unless we are more than a few jobs below the limit and things seem quiet |
| 01:48:28 | <pabs> | and when/if snscrape comes back, there will be a big backlog of twitter archiving to do |
| 01:49:01 | <flashfire42> | Sorry bout that hahaha |
| 01:50:23 | <pabs> | and there will be other situations where higher peak capacity is useful; for eg the adelaide university merger is going to be tons of jobs due to the many subdomains |
| 01:51:53 | <pabs> | no, I think you're doing good stuff flashfire42 :) |
| 01:52:25 | <pabs> | anyway, I'm sure we will always be able to reach whatever the job limit is :) |
| 02:12:09 | | etnguyen03 (etnguyen03) joins |
| 02:25:17 | | owen joins |
| 02:25:41 | | etnguyen03 quits [Ping timeout: 265 seconds] |
| 02:30:59 | <owen> | What's the easiest way to archive a mid-sized portion of a website? (ex. example.com/stuff-to-archive/*) |
| 02:33:25 | | threedeeitguy3 quits [Ping timeout: 265 seconds] |
| 02:33:32 | | threedeeitguy3 (threedeeitguy) joins |
| 02:36:40 | <pabs> | owen: archivebot |
| 02:37:02 | <pabs> | just pass us the URL and we will run it, then everything happens automatically after that |
| 02:37:35 | <pabs> | that needs a directory index or some other link based mechanism that lists all subcontents though |
| 02:38:50 | | BlueMaxima quits [Read error: Connection reset by peer] |
| 02:40:14 | | etnguyen03 (etnguyen03) joins |
| 02:47:27 | | DogsRNice__ joins |
| 02:47:35 | | threedeeitguy3 quits [Client Quit] |
| 02:50:38 | | DogsRNice__ quits [Remote host closed the connection] |
| 02:50:50 | | DogsRNice__ joins |
| 02:51:47 | | DogsRNice_ quits [Ping timeout: 265 seconds] |
| 02:54:33 | | threedeeitguy39 (threedeeitguy) joins |
| 03:01:40 | | owen quits [Client Quit] |
| 03:05:49 | | DogsRNice__ quits [Read error: Connection reset by peer] |
| 03:06:10 | <pokechu22> | The alternative technique is saving the entire site :P |
| 03:12:39 | <project10> | AB job 8k26biu6lro5cb6vi3awnu3z8 is a chonky one |
| 03:20:06 | <@arkiver> | #shreddit is restarted |
| 03:20:42 | <@arkiver> | pabs: so, we're holding back now? |
| 03:21:20 | <pabs> | some folks are occasionally yeah |
| 03:24:06 | | decagon__ joins |
| 03:27:26 | | krvme quits [Ping timeout: 252 seconds] |
| 03:27:52 | <Ryz> | Reminder: inactive user content on Google (excluding YouTube) may start being deleted in December 2023 S: |
| 03:33:29 | <@arkiver> | when IA is all fine again with taking data, got great plans for expanding our archiving - or well especially plans for #// |
| 03:33:48 | <@arkiver> | we'll significantly increase our coverage of 'important stuff' |
| 03:36:55 | <fireonlive> | awesome possum |
| 03:44:29 | | etnguyen03 quits [Ping timeout: 252 seconds] |
| 03:47:21 | | etnguyen03 (etnguyen03) joins |
| 04:00:18 | <h2ibot> | JAABot edited CurrentWarriorProject (-4): https://wiki.archiveteam.org/?diff=50742&oldid=50671 |
| 04:00:26 | | BigBrain_ quits [Ping timeout: 245 seconds] |
| 04:00:36 | <pabs> | Ryz: does that include public blogspot/blogger/etc stuff? |
| 04:02:26 | | BigBrain_ (bigbrain) joins |
| 04:09:59 | | kiryu (kiryu) joins |
| 04:10:12 | | kiryu quits [Client Quit] |
| 04:11:31 | <Ryz> | pabs, yes...S: |
| 04:13:40 | <Ryz> | The problem with Blogger user number IDs is that they give 429s pretty easily, at least when running ArchiveBot, which is why I would want this to get off the ground as soon as possible... |
| 04:13:46 | <Ryz> | arkiver? |
| 04:27:40 | <pabs> | fuck |
| 04:29:36 | | pabs . o O 0 ( #Y ) |
| 04:31:42 | | pabs has 1323 URLs in his blogspot archive TODO... |
| 04:31:50 | <fireonlive> | x_x |
| 04:33:26 | <pabs> | ISTR with blogspot it is easy to enumerate lots of blogspot starting with one blog, see what other blogs that author has, and same for all the commenters |
| 04:33:59 | | pabs checks shell history for some terrible oneliners |
| 04:34:19 | <pabs> | also there's tons of spammers on blogspot |
| 04:35:45 | <fireonlive> | yeah one of the sites i want to get archived eventually is just 99% overrun with spam (it's also js-hell-frontend-on-top-of-phpBB2) :/ |
| 04:35:47 | <fireonlive> | sad to se |
| 04:35:49 | <fireonlive> | see |
| 04:36:36 | | kiryu (kiryu) joins |
| 04:38:48 | <pabs> | https://transfer.archivete.am/gJyh0/blogspot-profile-enumerator.sh |
| 04:40:59 | <pabs> | https://transfer.archivete.am/sKHm2/pabs-archive-blogspot-todo.txt |
| 04:43:53 | | etnguyen03 quits [Client Quit] |
| 05:17:26 | | Chris5010 quits [Ping timeout: 252 seconds] |
| 05:22:30 | | Chris5010 (Chris5010) joins |
| 05:30:16 | | kiryu quits [Remote host closed the connection] |
| 05:30:50 | <shinji257> | I got a couple of tasks that keep getting stuck at "Lua runtime error: reddit.lua:286: attempt to call global 'unicode_codepoint_as_utf8' (a nil value)"? They are reddit project tasks. |
| 05:36:35 | | kiryu (kiryu) joins |
| 05:47:14 | <imer> | shinji257: that's known I think, #shreddit is the project channel :) |
| 05:47:33 | <imer> | Just waiting for a fix, should be sorted later today |
| 05:55:12 | | erkinalp quits [Remote host closed the connection] |
| 06:31:08 | | nicolas17 quits [Ping timeout: 252 seconds] |
| 07:01:11 | | dumbgoy quits [Ping timeout: 265 seconds] |
| 07:05:01 | | Unholy236131661808515 quits [Remote host closed the connection] |
| 07:06:36 | | Unholy236131661808515 (Unholy2361) joins |
| 07:26:10 | | themadpro (themadpro) joins |
| 08:07:23 | | nulldata quits [Ping timeout: 252 seconds] |
| 08:10:42 | | nulldata (nulldata) joins |
| 08:44:20 | | kiryu quits [Client Quit] |
| 08:45:23 | | Island quits [Read error: Connection reset by peer] |
| 08:51:31 | | kiryu (kiryu) joins |
| 09:00:44 | | nulldata quits [Ping timeout: 252 seconds] |
| 09:04:03 | | nulldata (nulldata) joins |
| 09:15:55 | <pabs> | does anyone know if AB looks at <a href> links inside HTML comments? |
| 09:17:49 | | Exorcism is now known as Exorcism_ |
| 09:24:37 | | Exorcism (exorcism) joins |
| 09:35:31 | | themadpro quits [Client Quit] |
| 09:43:01 | <mgrandi> | https://www.msn.com/en-us/news/technology/atari-pulls-nostalgia-power-move-and-buys-homebrew-community-forum/ar-AA1grqaA, I've heard rumblings that they are going to purge boards on https://forums.atariage.com , dunno how easy it is to archive , it's an Invision forum board |
| 09:52:28 | <pabs> | there is an AB job in progress |
| 09:52:46 | <pabs> | and the forum has been saved before, 2021 or 2019 IIRC |
| 09:53:08 | <pabs> | unfortunately we had to restart the job a couple of times and slow it down a fair bit |
| 09:57:56 | <mgrandi> | Awesome |
| 09:59:17 | <pabs> | got the main website too and some other subdomains |
| 10:00:01 | | railen63 quits [Remote host closed the connection] |
| 10:00:17 | | railen63 joins |
| 10:57:52 | | JensRex quits [] |
| 10:58:24 | | JensRex (JensRex) joins |
| 13:12:10 | | aa joins |
| 13:14:00 | | aa quits [Remote host closed the connection] |
| 13:26:56 | | Arcorann quits [Ping timeout: 252 seconds] |
| 14:03:08 | | railen63 quits [Remote host closed the connection] |
| 14:05:02 | | railen63 joins |
| 14:05:10 | | Webuser10794 joins |
| 14:05:43 | | Webuser10794 quits [Remote host closed the connection] |
| 14:06:17 | | Webuser10794 joins |
| 14:07:26 | | Webuser693 joins |
| 14:07:34 | | driib quits [Quit: The Lounge - https://thelounge.chat] |
| 14:08:31 | | Webuser10794 quits [Remote host closed the connection] |
| 14:09:05 | <Webuser693> | Hey, do you have the video link, it's called https://www.youtube.com/watch?v=fUVrK6089fs |
| 14:11:37 | | Webuser693 quits [Remote host closed the connection] |
| 14:12:22 | | driib (driib) joins |
| 14:25:11 | | DogsRNice joins |
| 14:25:56 | | etnguyen03 (etnguyen03) joins |
| 14:43:11 | | kiryu quits [Client Quit] |
| 14:47:59 | | nncandy joins |
| 14:48:31 | | nncandy quits [Remote host closed the connection] |
| 14:49:32 | | etnguyen03 quits [Ping timeout: 265 seconds] |
| 14:50:47 | <shinji257> | imer: acknowledged |
| 14:54:22 | | gfhh quits [Ping timeout: 265 seconds] |
| 14:55:35 | | etnguyen03 (etnguyen03) joins |
| 14:57:23 | | gfhh joins |
| 15:11:51 | | kiryu joins |
| 15:11:51 | | kiryu is now authenticated as kiryu |
| 15:11:51 | | kiryu quits [Changing host] |
| 15:11:51 | | kiryu (kiryu) joins |
| 15:28:45 | | dumbgoy joins |
| 15:37:34 | <h2ibot> | Bzc6p edited ArchiveTeam Domains (+37, /* archiveteam.hu */ Lecsű is discontinued): https://wiki.archiveteam.org/?diff=50743&oldid=50703 |
| 15:42:34 | <h2ibot> | Bzc6p edited Deathwatch (-3, /* 2023 */ fix grammar): https://wiki.archiveteam.org/?diff=50744&oldid=50741 |
| 15:43:35 | <h2ibot> | Bzc6p edited Valhalla (+0, /* Physical Options */ typo): https://wiki.archiveteam.org/?diff=50745&oldid=50740 |
| 16:00:44 | | fede joins |
| 16:01:03 | <fede> | hello |
| 16:01:25 | <fede> | is this like an archiving project? |
| 16:02:43 | <that_lurker> | This is the team that does the projects. You can find info about current and old archiving projects in the wiki https://wiki.archiveteam.org/index.php/Main_Page |
| 16:03:06 | <that_lurker> | On the page of every project you can also find the corresponding IRC channel. |
| 16:04:04 | | zhongfu quits [Ping timeout: 258 seconds] |
| 16:04:13 | <fede> | there's no everyplay archive right? |
| 16:05:49 | | zhongfu (zhongfu) joins |
| 16:07:06 | <imer> | https://wiki.archiveteam.org/index.php/Everyplay doesn't look like it |
| 16:07:25 | <fede> | thats so sad |
| 16:07:33 | <fede> | i lost all my videos |
| 16:11:56 | | AmAnd0A quits [Ping timeout: 252 seconds] |
| 16:12:41 | <TheTechRobo> | yeah, I unfortunately haven't been able to find anyone who archived it |
| 16:12:45 | | AmAnd0A joins |
| 16:19:01 | | kiryu quits [Client Quit] |
| 16:22:31 | | Naruyoko quits [Quit: Leaving] |
| 16:26:14 | | qw3rty quits [Ping timeout: 252 seconds] |
| 16:27:55 | | iCaotix quits [Read error: Connection reset by peer] |
| 16:28:09 | | iCaotix joins |
| 16:39:52 | | gfhh quits [Read error: Connection reset by peer] |
| 16:40:22 | | Naruyoko joins |
| 16:42:27 | | gfhh joins |
| 16:53:45 | | szczot3k quits [Ping timeout: 265 seconds] |
| 16:53:55 | | kiryu (kiryu) joins |
| 16:57:29 | | szczot3k (szczot3k) joins |
| 17:15:46 | | icedice (icedice) joins |
| 17:21:18 | | etnguyen03 quits [Ping timeout: 265 seconds] |
| 17:25:39 | | lunik173 quits [Ping timeout: 265 seconds] |
| 17:32:44 | | szczot3k quits [Client Quit] |
| 17:33:21 | | szczot3k (szczot3k) joins |
| 17:45:53 | | fede quits [Remote host closed the connection] |
| 17:46:52 | | AmAnd0A quits [Read error: Connection reset by peer] |
| 17:47:19 | | AmAnd0A joins |
| 17:48:21 | | jacksonchen666 quits [Ping timeout: 245 seconds] |
| 18:07:27 | | AmAnd0A quits [Read error: Connection reset by peer] |
| 18:09:34 | | AmAnd0A joins |
| 18:09:56 | | AlsoHP_Archivist joins |
| 18:12:16 | | AmAnd0A quits [Read error: Connection reset by peer] |
| 18:12:34 | | AmAnd0A joins |
| 18:24:11 | <h2ibot> | Exorcism edited DokuWiki (+92): https://wiki.archiveteam.org/?diff=50746&oldid=50527 |
| 18:26:11 | <h2ibot> | Exorcism edited Wordpress.com (+106): https://wiki.archiveteam.org/?diff=50747&oldid=28940 |
| 18:36:04 | | Hackerpcs quits [Quit: Hackerpcs] |
| 18:38:35 | | Hackerpcs (Hackerpcs) joins |
| 18:40:12 | | AlsoHP_Archivist quits [Client Quit] |
| 18:43:52 | | AmAnd0A quits [Read error: Connection reset by peer] |
| 18:43:56 | | AmAnd0A joins |
| 18:57:20 | | erkinalp joins |
| 19:02:05 | <@JAA> | pabs: AB parses the HTML and then walks the element tree. It shouldn't see anything in comments. |
| 19:04:31 | | driib quits [Client Quit] |
| 19:16:34 | | qwertyasdfuiopghjkl quits [Client Quit] |
| 19:18:53 | <@arkiver> | thuban: is there any update on the orange sites coming back? |
| 19:19:21 | <h2ibot> | Myusernameisanything edited University Web Hosting (-7, Changing not saved yet tag to lost.): https://wiki.archiveteam.org/?diff=50748&oldid=47676 |
| 19:19:22 | <h2ibot> | Myusernameisanything edited List of websites excluded from the Wayback Machine (+57, Added 2 links): https://wiki.archiveteam.org/?diff=50749&oldid=50702 |
| 19:19:23 | <h2ibot> | Myusernameisanything edited BluWiki (+10, If there are about 20 dumps, it is partially…): https://wiki.archiveteam.org/?diff=50750&oldid=27576 |
| 19:19:24 | <h2ibot> | Gridkr edited List of websites excluded from the Wayback Machine (+20, Add https://nexo.com/): https://wiki.archiveteam.org/?diff=50751&oldid=50749 |
| 19:20:59 | | driib (driib) joins |
| 19:33:49 | | qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins |
| 19:35:01 | | BigBrain_ quits [Ping timeout: 245 seconds] |
| 19:36:00 | | AmAnd0A quits [Read error: Connection reset by peer] |
| 19:36:07 | | AmAnd0A joins |
| 19:36:38 | | AmAnd0A quits [Read error: Connection reset by peer] |
| 19:36:56 | | AmAnd0A joins |
| 19:37:18 | | BigBrain_ (bigbrain) joins |
| 19:55:42 | | etnguyen03 (etnguyen03) joins |
| 20:00:29 | <h2ibot> | JAABot edited List of websites excluded from the Wayback Machine (+0): https://wiki.archiveteam.org/?diff=50754&oldid=50751 |
| 20:04:32 | <h2ibot> | JustAnotherArchivist edited SoundCloud (+236, Datetimeify, add 2019 projectn't, add…): https://wiki.archiveteam.org/?diff=50755&oldid=48897 |
| 20:09:14 | | Island joins |
| 20:41:59 | | etnguyen03 quits [Ping timeout: 252 seconds] |
| 20:56:00 | | Miki57 joins |
| 21:00:08 | | Miki_57 quits [Ping timeout: 252 seconds] |
| 21:08:51 | | erkinalp quits [Remote host closed the connection] |
| 21:15:43 | <project10> | #archivebot jobs submit discovered things into the backfeed system, yes? |
| 21:19:23 | | AmAnd0A quits [Ping timeout: 252 seconds] |
| 21:20:06 | | AmAnd0A joins |
| 21:28:51 | <TheTechRobo> | i don’t think so, assuming you mean e.g. queuing imgur URLs in #imgone |
| 21:30:36 | <@JAA> | project10: No, there's zero interaction between AB and DPoS projects. |
| 21:36:33 | <project10> | well the genesis of my question was seeing #telegrab items submitted via AB (job 1ty54jgyh2n6iv2ri6o0gbbbp) |
| 21:37:25 | | fireonlive quits [Excess Flood] |
| 21:37:57 | | fireonlive (fireonlive) joins |
| 21:42:07 | | AmAnd0A quits [Read error: Connection reset by peer] |
| 21:42:32 | | AmAnd0A joins |
| 21:44:17 | <@JAA> | That's just me archiving URLs shared in AT channels so our logs aren't full of dead links in the future. |
| 21:44:21 | | iCaotix quits [Read error: Connection reset by peer] |
| 21:44:36 | <project10> | oh :) |
| 21:46:06 | | iCaotix joins |
| 21:46:32 | <fireonlive> | JAA++ |
| 21:47:10 | <that_lurker> | we need commode points system here as well :P |
| 21:54:01 | | etnguyen03 (etnguyen03) joins |
| 21:56:03 | | nicolas17 joins |
| 21:56:55 | | lunik173 joins |
| 22:23:26 | <fireonlive> | JAA++ |
| 22:23:27 | <eggdrop> | karma for 'JAA' is now 1 |
| 22:23:30 | <fireonlive> | lol |
| 22:28:41 | | etnguyen03 quits [Ping timeout: 252 seconds] |
| 22:31:05 | <nicolas17> | 2 files remaining and I'll finish getting the listing of all yahoo-videos .tar.bz2 files |
| 22:31:35 | | etnguyen03 (etnguyen03) joins |
| 22:32:13 | <nicolas17> | my intention was to get *.tar.bz2 first while I wrote a more efficient script to get the .tar lists, which of course I haven't actually started yet so I'll have to continue the .tar files the slow way |
| 22:38:50 | <@JAA> | ++fireonlive |
| 22:39:03 | <fireonlive> | f |
| 22:39:13 | <@JAA> | Pff, doesn't even understand pre-incrementing. |
| 22:39:29 | <fireonlive> | :p |
| 22:44:58 | <TheTechRobo> | eggdrop— |
| 22:45:26 | <TheTechRobo> | Oh thanks the lounge i really needed that transformation |
| 22:45:56 | <@JAA> | The Lounge-- |
| 22:45:57 | <eggdrop> | [karma] 'The Lounge' is now at -1 |
| 22:46:33 | <@JAA> | The Lounge-- |
| 22:46:35 | <eggdrop> | [karma] 'The Lounge' is now at -1 |
| 22:46:43 | <@JAA> | Ah, works with a normal space, too. :-) |
| 22:46:51 | <Terbium> | The Lounge++ |
| 22:46:51 | <eggdrop> | [karma] 'The Lounge' is now at 0 |
| 22:46:54 | <fireonlive> | TheTechRobo: i do believe that was iOS |
| 22:46:56 | <fireonlive> | :P |
| 22:47:05 | <TheTechRobo> | Oh thanks apple then |
| 22:47:10 | <Terbium> | iPhone--\ |
| 22:47:12 | <Terbium> | iPhone-- |
| 22:47:13 | <eggdrop> | [karma] 'iPhone' is now at -1 |
| 22:47:18 | <fireonlive> | ! |
| 22:47:26 | <TheTechRobo> | Dictating how I type letters, thanks Timmy |
| 22:48:28 | <fireonlive> | >not knowing how to configure text replacement |
| 22:49:39 | <@JAA> | This can go in -ot now. :-) |
| 22:49:49 | <fireonlive> | :) |
| 22:49:51 | <@JAA> | Apparently my FuzzyMemories.TV crawl is nearly done. |
| 22:50:39 | <@JAA> | It has a bit of pagination to hunt down but has already retrieved most /watch/ pages and the accompanying videos (that aren't 404s). |
| 22:52:52 | | BearFortress quits [Ping timeout: 265 seconds] |
| 22:52:57 | <@JAA> | Specifically, video IDs go to 4794, and my crawl has retrieved 4668 as of a couple minutes ago. |
| 22:53:16 | <@JAA> | ~100 GiB so far |
| 22:55:35 | <@JAA> | 4054 actual videos as of just now based on some crude log grepping. |
| 23:00:51 | | benjinsm joins |
| 23:00:56 | | Naruyoko5 joins |
| 23:04:28 | | Naruyoko quits [Ping timeout: 265 seconds] |
| 23:04:28 | | benjins quits [Ping timeout: 265 seconds] |
| 23:09:39 | | eythian quits [Client Quit] |
| 23:10:03 | | eythian joins |
| 23:27:16 | | icedice quits [Client Quit] |
| 23:30:46 | | BlueMaxima joins |
| 23:36:53 | | eggdrop quits [Ping timeout: 252 seconds] |
| 23:39:26 | | AmAnd0A quits [Remote host closed the connection] |
| 23:39:38 | | AmAnd0A joins |
| 23:44:02 | | AmAnd0A quits [Ping timeout: 252 seconds] |
| 23:47:58 | | systwi quits [Ping timeout: 265 seconds] |
| 23:48:07 | | systwi__ (systwi) joins |
| 23:48:19 | | eggdrop (eggdrop) joins |
| 23:48:59 | | railen63 quits [Remote host closed the connection] |
| 23:53:27 | | railen63 joins |
| 23:56:54 | | octylFractal|m joins |
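
Note on the 09:15:55 question and JAA's 19:02:05 answer: a crawler that parses the HTML and only looks at real elements will not pick up `<a href>` links that sit inside HTML comments. The sketch below illustrates that behaviour with Python's standard-library `html.parser`. It is a stand-in, not ArchiveBot's actual code; the `LinkCollector` class, the `extract_links` helper, and the `SAMPLE` markup are all hypothetical names made up for this illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values from <a> start tags only (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Called only for real elements. Comment contents are delivered to
        # handle_comment() as one opaque string and are never tokenised
        # into tags, so links inside comments do not reach this method.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    """Return the hrefs of all <a> elements found in the parsed markup."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


# Made-up sample: one live link and one link hidden inside a comment.
SAMPLE = '<p><a href="/visible">x</a><!-- <a href="/hidden">y</a> --></p>'

if __name__ == "__main__":
    print(extract_links(SAMPLE))
```

A regex-based extractor run over the raw bytes would behave differently and would find both links, which is why the answer depends on how a given tool discovers URLs.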