00:00:52 | <fireonlive> | skeletor wins |
00:03:04 | | etnguyen03 (etnguyen03) joins |
00:05:52 | <vokunal|m> | I'm not sure where the number comes from, but one source stated 2,398,412 posts |
00:18:21 | <@JAA> | At the bottom on https://www.he-man.org/forums/boards/forum.php |
00:32:50 | <@JAA> | (And 'the source' is https://old.reddit.com/r/Archiveteam/comments/17rps3j/hemanorg_forums_shutting_down_after_over_20_years/ I guess.) |
00:38:50 | <vokunal|m> | ahhhhhh. thanks |
00:48:34 | | etnguyen03 quits [Ping timeout: 265 seconds] |
01:25:14 | | etnguyen03 (etnguyen03) joins |
01:31:49 | | mcint joins |
01:50:55 | | HP_Archivist quits [Client Quit] |
01:51:04 | <Arcorann> | Anyone know what's going on with HikariNoAkari? I heard there was some drama and it was shutting down |
01:52:09 | <thuban> | it's been discussed, but not with any particular insight |
01:52:12 | <@JAA> | Yeah, apparently: https://i.imgur.com/jeJSEu6.jpg |
02:05:25 | | Pedrosso quits [Ping timeout: 265 seconds] |
02:11:21 | | Pedrosso joins |
02:56:10 | | Island quits [Remote host closed the connection] |
02:56:10 | | Island joins |
02:56:10 | | DogsRNice_ joins |
02:56:12 | | DogsRNice quits [Remote host closed the connection] |
02:56:12 | | kdqep__ quits [Remote host closed the connection] |
02:56:12 | | ScenarioPlanet quits [Remote host closed the connection] |
02:56:13 | | parfait_ joins |
02:56:18 | | ScenarioPlanet (ScenarioPlanet) joins |
03:06:48 | | BearFortress_ joins |
03:06:48 | | Island_ joins |
03:06:48 | | Scen joins |
03:09:15 | | pabs quits [Ping timeout: 272 seconds] |
03:10:12 | | Pedrosso quits [Ping timeout: 266 seconds] |
03:10:12 | | BearFortress quits [Ping timeout: 266 seconds] |
03:11:09 | | ScenarioPlanet quits [Ping timeout: 265 seconds] |
03:11:09 | | Island quits [Ping timeout: 265 seconds] |
03:11:19 | | Doran is now known as Doranwen |
03:20:43 | | pabs (pabs) joins |
03:42:11 | | etnguyen03 quits [Ping timeout: 272 seconds] |
03:52:51 | | pabs quits [Client Quit] |
03:55:28 | | pabs (pabs) joins |
04:01:53 | | etnguyen03 (etnguyen03) joins |
04:07:27 | | kiryu quits [Remote host closed the connection] |
04:08:59 | | kiryu (kiryu) joins |
04:12:05 | <@JAA> | So the AB job for Hikari No Akari finished, but it timed out on almost all sitemaps, i.e. I'm not sure it's complete. It does appear to have gone through the pagination, but new posts being added could still have led to some things getting missed. |
04:26:31 | | dumbgoy quits [Ping timeout: 272 seconds] |
05:09:35 | | apache2 quits [Ping timeout: 272 seconds] |
05:15:26 | | DogsRNice_ quits [Read error: Connection reset by peer] |
06:00:53 | | etnguyen03 quits [Ping timeout: 272 seconds] |
06:10:31 | | nicolas17 quits [Client Quit] |
06:12:03 | | etnguyen03 (etnguyen03) joins |
06:15:01 | | Dango360_ joins |
06:18:37 | | Dango360 quits [Ping timeout: 272 seconds] |
06:31:06 | | etnguyen03 quits [Client Quit] |
06:40:18 | | _Dango360 joins |
06:42:02 | | _Dango360 quits [Client Quit] |
06:42:21 | | Dango360 (Dango360) joins |
06:44:35 | | Dango360_ quits [Ping timeout: 272 seconds] |
06:45:29 | | Dango360_ joins |
06:49:08 | | Dango360 quits [Ping timeout: 265 seconds] |
06:53:11 | | Island_ quits [Read error: Connection reset by peer] |
07:03:37 | | Perk8 joins |
07:03:53 | | kdqep__ joins |
07:05:29 | | Perk quits [Ping timeout: 272 seconds] |
07:05:29 | | Perk8 is now known as Perk |
07:07:30 | | parfait_ quits [Ping timeout: 265 seconds] |
07:17:06 | | parfait_ joins |
07:21:02 | | kdqep__ quits [Ping timeout: 265 seconds] |
07:56:05 | <h2ibot> | PaulWise edited Bugzilla (+78, updates): https://wiki.archiveteam.org/?diff=51119&oldid=50954 |
08:04:52 | | jacksonchen666 (jacksonchen666) joins |
08:32:39 | | Dango360_ quits [Client Quit] |
08:58:50 | | onkel joins |
09:00:08 | | onkel quits [Remote host closed the connection] |
09:02:06 | | jacksonchen666 quits [Ping timeout: 245 seconds] |
09:12:17 | | jacksonchen666 (jacksonchen666) joins |
09:15:33 | | parfait_ quits [Client Quit] |
09:36:13 | | lukash9 quits [Ping timeout: 272 seconds] |
09:45:26 | | jacksonchen666 quits [Ping timeout: 245 seconds] |
09:52:13 | | jacksonchen666 (jacksonchen666) joins |
09:57:53 | | jacksonchen666 quits [Client Quit] |
10:00:03 | | Bleo1 quits [Client Quit] |
10:01:20 | | Bleo1 joins |
10:02:36 | | JohnnyJ joins |
10:14:26 | | JohnnyJ quits [Client Quit] |
10:17:11 | | JohnnyJ joins |
10:23:50 | | JohnnyJ quits [Client Quit] |
10:55:09 | | SF quits [Ping timeout: 265 seconds] |
10:55:25 | | SF joins |
11:08:41 | | SF quits [Ping timeout: 272 seconds] |
11:21:52 | | SF joins |
11:28:29 | | sec^nd quits [Remote host closed the connection] |
11:30:27 | | sec^nd (second) joins |
11:45:15 | | Scen is now known as ScenarioPlanet |
11:45:30 | | ScenarioPlanet is now authenticated as ScenarioPlanet |
12:09:34 | | T31M_ joins |
12:10:13 | | Perk6 joins |
12:10:13 | | TheTechRobo quits [Client Quit] |
12:10:13 | | T31M quits [Client Quit] |
12:10:13 | | mattx433 quits [Client Quit] |
12:10:13 | | katocala quits [Remote host closed the connection] |
12:10:13 | | Perk quits [Client Quit] |
12:10:13 | | nulldata quits [Client Quit] |
12:10:13 | | Bleo1 quits [Client Quit] |
12:10:13 | | T31M_ is now known as T31M |
12:10:14 | | Perk6 is now known as Perk |
12:10:14 | | Bleo1 joins |
12:10:16 | | katocala joins |
12:10:18 | | mattx433 (mattx433) joins |
12:10:23 | | nulldata (nulldata) joins |
12:10:41 | | TheTechRobo (TheTechRobo) joins |
12:16:58 | | Wohlstand (Wohlstand) joins |
12:29:47 | | dumbgoy joins |
12:34:14 | | BearFortress_ quits [Ping timeout: 265 seconds] |
12:39:31 | | BearFortress joins |
13:01:25 | | Arcorann quits [Ping timeout: 272 seconds] |
13:03:43 | | Wohlstand quits [Ping timeout: 265 seconds] |
13:04:37 | | HP_Archivist (HP_Archivist) joins |
13:18:13 | <betamax> | high chance it's already been done, but Minnesota is / has run a flag-design contest, all* 2123 designs are on the site: https://serc.mnhs.org/flags |
13:18:37 | <betamax> | *all => I'm pretty sure that they're no longer accepting submissions. not 100%, though |
13:19:22 | <betamax> | I'm not going to put it in AB because I have only glanced at the site and am not sure if extra work is needed to capture the "click each submission to view the larger-size image" |
14:15:43 | | katocala is now authenticated as katocala |
14:20:48 | | etnguyen03 (etnguyen03) joins |
14:50:01 | | sec^nd quits [Ping timeout: 245 seconds] |
14:52:50 | | sec^nd (second) joins |
14:54:43 | | Megame (Megame) joins |
15:09:21 | | etnguyen03 quits [Ping timeout: 272 seconds] |
15:41:23 | | icedice (icedice) joins |
15:43:29 | | etnguyen03 (etnguyen03) joins |
15:52:39 | | DogsRNice joins |
16:07:27 | | icedice quits [Client Quit] |
16:13:37 | | guest joins |
16:13:43 | <guest> | hello. |
16:13:54 | <guest> | like using archive bot |
16:18:27 | <fireonlive> | i’m glad you do! |
16:18:39 | <fireonlive> | it’s a nice bit |
16:18:42 | <fireonlive> | bot* |
16:27:10 | <nulldata> | Bit Bot |
16:29:55 | | Wohlstand (Wohlstand) joins |
16:29:59 | | Wohlstand quits [Client Quit] |
16:33:34 | <DogsRNice> | beep boop |
16:34:17 | | atphoenix_ (atphoenix) joins |
16:35:12 | <@JAA> | betamax: Looks like the large images work just fine without JS. They're standard links. I've thrown it into AB. |
16:37:23 | | atphoenix__ quits [Ping timeout: 272 seconds] |
16:37:59 | | guest quits [Remote host closed the connection] |
16:43:54 | | lennier2 joins |
16:46:53 | | lennier1 quits [Ping timeout: 272 seconds] |
17:12:13 | | etnguyen03 quits [Ping timeout: 272 seconds] |
17:20:26 | <ScenarioPlanet> | https://transfer.archivete.am/l5gdO/static.spore.com-ids-2016.txt.zst - Full (?) list of Spore.com creation IDs including INVALID/PURGED/BANNED statuses, as of 2016. |
17:21:21 | <ScenarioPlanet> | 20741764 entries ^ |
17:26:08 | | etnguyen03 (etnguyen03) joins |
17:30:04 | | Wohlstand (Wohlstand) joins |
17:34:59 | | Naruyoko quits [Remote host closed the connection] |
17:35:17 | | Naruyoko joins |
17:43:35 | | sd quits [Quit: sd] |
17:43:57 | | sd (sd) joins |
17:46:34 | | Mateon1 joins |
17:54:48 | <pokechu22> | ScenarioPlanet: what speed and concurrency can that be run at? |
17:55:13 | | Pedrosso joins |
17:55:18 | <pokechu22> | oh, wait, those are just numeric IDs, so it can't be run directly |
17:55:33 | | icedice (icedice) joins |
17:55:44 | <pokechu22> | 17:20 <ScenarioPlanet> https://transfer.archivete.am/l5gdO/static.spore.com-ids-2016.txt.zst - Full (?) list of Spore.com creation IDs including INVALID/PURGED/BANNED statuses, as of 2016. |
17:55:47 | <pokechu22> | 17:21 <ScenarioPlanet> 20741764 entries ^ |
17:56:00 | <pokechu22> | Pedrosso: might find that interesting |
17:56:11 | <Pedrosso> | Hey uh pokechu22, I've been looking over archivebots logs of "https://davoonline.com/phpBB3?archiveteam". It's been archiving a lot of the same login page with different "?*" things |
17:56:23 | <Pedrosso> | Also yes, I find it very interesting |
17:56:36 | <pokechu22> | Yeah, that doesn't look great :| |
17:56:52 | <pokechu22> | well, it makes sense for viewtopic, but https://davoonline.com/phpBB3/ucp.php?style=17&mode=login&redirect=search.php%3Fauthor_id%3D6234%26sd%3Dd%26sk%3Dt%26sr%3Dposts%26st%3D0%26start%3D40%26style%3D17 isn't useful |
17:56:56 | | parfait (kdqep) joins |
17:57:11 | <Pedrosso> | idk too much about the archivebot. I've seen ignorelists used. Idk how they work but can a "just ignore all mode=login lol" work? |
17:58:01 | <pokechu22> | Yeah, mode=login would ignore any URLs with the text mode=login in it, while ^https://davoonline.com/phpBB3/ucp\.php.*[?&]mode=login ignores anything starting with https://davoonline.com/phpBB3/ucp.php and containing either ?mode=login or &mode=login |
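The difference between the two ignore patterns pokechu22 describes can be checked with a small Python sketch. It assumes the patterns are applied as unanchored regex searches against the full URL (as with Python's `re.search`); the example.org URL and the helper name `is_ignored` are made up for illustration and are not part of ArchiveBot.

```python
import re

# The two patterns from the message above.
BROAD = r"mode=login"
NARROW = r"^https://davoonline.com/phpBB3/ucp\.php.*[?&]mode=login"


def is_ignored(pattern: str, url: str) -> bool:
    """Assumption: an ignore applies when the pattern is found anywhere in the URL."""
    return re.search(pattern, url) is not None


urls = [
    "https://davoonline.com/phpBB3/ucp.php?style=17&mode=login",
    "https://davoonline.com/phpBB3/viewtopic.php?style=17&p=18852",
    "https://example.org/index.php?mode=login",  # hypothetical offsite URL
]

# Print (broad match, narrow match, url) for each test URL.
for url in urls:
    print(is_ignored(BROAD, url), is_ignored(NARROW, url), url)
```

Under that assumption, the broad pattern also catches the offsite URL, while the narrow one only affects ucp.php on this forum.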
17:58:37 | <pokechu22> | Now, https://davoonline.com/phpBB3/viewtopic.php?style=17&p=18852 is a bit weird too since I'm not sure where the style=17 is coming from - I haven't seen other styles though so maybe it's fine? |
17:59:48 | <pokechu22> | hmm, no, URLs from https://davoonline.com/phpBB3/ don't have style=17 but once a URL with style=17 is retrieved that same parameter is added to everything else... and it looks identical to without it |
18:00:18 | <@JAA> | pokechu22: https://transfer.archivete.am/inline/68759/2y5iu7ey1kzbspqay7vlkbkuq-trace |
18:01:04 | <pokechu22> | ... ah, it came from one link that has style=17 on it in https://davoonline.com/phpBB3/viewtopic.php?p=41535#p41535 it looks like |
18:01:10 | | benjins joins |
18:01:12 | <pokechu22> | Probably best to just nuke that then |
18:01:18 | <pokechu22> | we don't need to save everything twice |
18:01:46 | <Pedrosso> | Is it able to be modified live with these ignores? |
18:02:05 | <Pedrosso> | Oh, nice |
18:02:10 | <pokechu22> | Yep, you can add and remove ignores as needed (and adjust the speed and concurrency at which it runs too) |
18:03:17 | | Island joins |
18:03:39 | <Pedrosso> | As for the spore.com archive, I didn't know it would be able to archive users, but I'm glad it found its way, hah |
18:04:10 | <pokechu22> | Looking at http://archivebot.com/ignores/2y5iu7ey1kzbspqay7vlkbkuq we have /ucp\.php\?mode=(login|delete_cookies|pm) in the forums ignoreset - which doesn't work with the random style=17 in the middle |
18:04:47 | | benjinsm quits [Ping timeout: 272 seconds] |
18:07:08 | <@JAA> | Yeah, that should probably be /ucp\.php\?(.*&)?mode=(login|delete_cookies|pm)(&|$) instead. |
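To see why the existing forums ignore misses these URLs and how JAA's suggested replacement behaves, both patterns can be tried against URLs from this discussion. This is only a sketch: it assumes ignore patterns are matched with an unanchored regex search, the variable names are illustrative, and the second URL is shortened from the one quoted earlier.

```python
import re

# Pattern currently in the forums ignore set, and the suggested replacement.
OLD = re.compile(r"/ucp\.php\?mode=(login|delete_cookies|pm)")
NEW = re.compile(r"/ucp\.php\?(.*&)?mode=(login|delete_cookies|pm)(&|$)")

cases = [
    "https://davoonline.com/phpBB3/ucp.php?mode=login",
    "https://davoonline.com/phpBB3/ucp.php?style=17&mode=login&redirect=search.php",
    "https://davoonline.com/phpBB3/viewtopic.php?style=17&p=18852",
]

# Map each URL to (matched by OLD, matched by NEW).
results = {url: (bool(OLD.search(url)), bool(NEW.search(url))) for url in cases}

for url, (old_hit, new_hit) in results.items():
    print(f"old={old_hit!s:5} new={new_hit!s:5} {url}")
```

The optional `(.*&)?` group is what lets `mode=` appear after other parameters such as `style=17`, and neither pattern touches viewtopic URLs.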
18:30:22 | | Mateon2 joins |
18:30:26 | | atphoenix_ quits [Remote host closed the connection] |
18:30:26 | | Naruyoko quits [Remote host closed the connection] |
18:30:26 | | Island quits [Remote host closed the connection] |
18:30:26 | | benjins quits [Remote host closed the connection] |
18:30:26 | | parfait quits [Remote host closed the connection] |
18:30:26 | | icedice quits [Remote host closed the connection] |
18:30:29 | | mattx433 quits [Client Quit] |
18:30:29 | | Mateon1 quits [Remote host closed the connection] |
18:30:29 | | Mateon2 is now known as Mateon1 |
18:30:29 | | benjins joins |
18:30:31 | | Naruyoko joins |
18:30:33 | | mattx433 (mattx433) joins |
18:30:46 | | Island joins |
18:30:48 | | icedice (icedice) joins |
18:30:49 | | atphoenix_ (atphoenix) joins |
18:30:51 | | parfait (kdqep) joins |
18:46:51 | | benjinsm joins |
18:50:45 | | benjins quits [Ping timeout: 265 seconds] |
18:52:52 | <vokunal|m> | Since AB grabs a lot more than just the URLs on a site, is there a way to determine whether a job will finish in time or not? Based on the current rate, he-man.org could grab ~3.3m if all goes well, but those aren't necessarily all URLs on the site itself and probably include external links |
18:56:44 | <pokechu22> | The easiest approach is to use --no-offsite if it seems like it'll be close to not finishing, and then manually run the offsite links afterwards (but that requires having the job's database manually saved since links skipped by --no-offsite don't end up in the log) |
19:13:41 | | rktk (rktk) joins |
19:16:33 | <Pedrosso> | What's the deal with all the 600,001-600,001 ms delays? |
19:17:37 | | null quits [Ping timeout: 272 seconds] |
19:18:23 | <that_lurker> | most of those are in a pipeline that has been offline for about a year |
19:19:09 | <@JAA> | s/most of // |
19:19:51 | <that_lurker> | JAA: Didn't you plan to remove the pipeline? Or is it on the todo list |
19:20:04 | <@JAA> | Yeah, the latter. |
19:21:00 | | nicolas17 joins |
19:30:32 | | Pedrosso quits [Remote host closed the connection] |
19:50:49 | | lunik173 quits [Quit: :x] |
19:51:14 | | lunik173 joins |
19:56:12 | | benjinsm quits [Read error: Connection reset by peer] |
20:05:55 | | benjins joins |
20:54:05 | | BlueMaxima joins |
21:12:45 | | itachi1706 quits [Quit: Bye :P] |
21:13:15 | | itachi1706 (itachi1706) joins |
21:18:33 | | hitgrr8 quits [Quit: away] |
21:36:44 | | Island quits [Remote host closed the connection] |
21:36:44 | | qwertyasdfuiopghjkl quits [Remote host closed the connection] |
21:36:59 | | Island joins |
21:41:13 | <fireonlive> | https://t.me/zlibrary_official/41 "Sad news! Yesterday a large number of our domains were seized again. We should highlight that the majority of the seized domains were not mirrors of the Z-Library website, but they were separate sub-projects, containing only books in rare languages of the world, and their blocking is confusing. For instance, these |
21:41:13 | <fireonlive> | domains included books in Tamil, Mongolian, Catalan, Urdu, Pashto, and other languages." |
21:47:11 | | icedice2 (icedice) joins |
21:47:16 | | BlueMaxima_ joins |
21:47:26 | | TheTechRobo quits [Client Quit] |
21:47:26 | | nulldata quits [Client Quit] |
21:47:26 | | parfait quits [Remote host closed the connection] |
21:47:26 | | atphoenix_ quits [Remote host closed the connection] |
21:47:26 | | icedice quits [Remote host closed the connection] |
21:47:26 | | Naruyoko quits [Remote host closed the connection] |
21:47:26 | | BlueMaxima quits [Remote host closed the connection] |
21:47:27 | | Naruyoko joins |
21:47:31 | | atphoenix__ (atphoenix) joins |
21:47:32 | | parfait (kdqep) joins |
21:47:43 | | nulldata (nulldata) joins |
21:47:52 | | TheTechRobo (TheTechRobo) joins |
21:48:15 | | benjinsm joins |
21:52:09 | | benjins quits [Ping timeout: 272 seconds] |
21:52:51 | | Naruyoko5 joins |
21:53:04 | | parfait quits [Remote host closed the connection] |
21:53:04 | | Naruyoko quits [Remote host closed the connection] |
21:53:04 | | mattx433 quits [Client Quit] |
21:53:09 | | mattx433 (mattx433) joins |
21:53:18 | | parfait (kdqep) joins |
21:55:31 | | Pedrosso joins |
21:55:42 | <Pedrosso> | Lost connection, did I miss anything? |
21:57:27 | <Pedrosso> | Thanks to whoever added esporo to the archivebot :] |
21:59:07 | | Barto quits [Ping timeout: 272 seconds] |
22:06:05 | | pabs quits [Ping timeout: 272 seconds] |
22:18:23 | | Barto (Barto) joins |
22:36:08 | | pabs (pabs) joins |
22:47:15 | | etnguyen03 quits [Ping timeout: 272 seconds] |
22:50:43 | | Pedrosso quits [Remote host closed the connection] |
22:53:05 | | Pedrosso joins |
23:18:04 | | dumbgoy_ joins |
23:22:05 | | dumbgoy quits [Ping timeout: 272 seconds] |
23:32:05 | | mindstrut joins |
23:32:20 | | mindstrut quits [Client Quit] |
23:33:44 | | Arcorann (Arcorann) joins |
23:59:34 | <h2ibot> | Vokunal edited Deathwatch (+11): https://wiki.archiveteam.org/?diff=51120&oldid=51118 |