| 00:01:32 | | APOLLO03 quits [Read error: Connection reset by peer] |
| 00:03:15 | | APOLLO03 joins |
| 00:03:39 | | Yakov8 is now known as Yakov |
| 00:05:09 | <klea> | 2026-02-26 00:03:56 <nulldata> https://learn.redhat.com/ <- Shutting down March 31, 2026. Running in AB just gives HTTP code 202. https://learn.redhat.com/t5/Red-Hat-Learning-Community-News/Evolving-how-we-learn-together/ba-p/57899 (forwarded from #archiveteam) |
| 00:05:33 | <nicolas17> | fuck sake |
| 00:06:04 | <nicolas17> | pokechu22: au site again not wanting to load in my browser, what's the max page number? |
| 00:06:09 | <nicolas17> | I'm at 2300 |
| 00:06:24 | <pokechu22> | https://www.classification.gov.au/classification-ratings/latest-classification-decisions?field_rating%5B1%5D=1&page=192046 |
| 00:06:38 | <pokechu22> | so you're at 1% |
| 00:06:47 | <billybobbyjoe> | nicolas17: 3840935 total lmao |
| 00:07:58 | <nicolas17> | billybobbyjoe: no like, pages in the *list* |
| 00:08:12 | <nicolas17> | there's 20 movies/games/things listed in each page |
| 00:08:40 | <billybobbyjoe> | well yes; each movie/game/thing is also, itself, a page. |
| 00:09:25 | <billybobbyjoe> | so there are 3840935 title pages AND 190447 |
| 00:09:35 | <billybobbyjoe> | pages for the list |
| 00:12:28 | | etnguyen03 quits [Client Quit] |
| 00:13:21 | <billybobbyjoe> | not to mention the numerous filters, each base list page can be filtered in 16 different combinations (base included) |
| 00:14:10 | | tekulvw (tekulvw) joins |
| 00:15:03 | <billybobbyjoe> | and i don't wanna show up here just to burden yall but unfortunately in my 10 minutes of searching i've found numerous other extremely important .gov.au domains almost entirely absent lol |
| 00:16:04 | <nicolas17> | I'm only collecting the list to get the URLs of all the titles |
| 00:16:14 | <billybobbyjoe> | right ok |
| 00:16:36 | <nicolas17> | to actually archive the list for WBM it has to be done differently |
| 00:19:03 | | tekulvw quits [Ping timeout: 272 seconds] |
| 00:24:45 | <billybobbyjoe> | legislation.gov.au, the literal national register of laws for the entire nation: 170795 pieces of legislation, only 1731 total captures |
| 00:24:46 | <billybobbyjoe> | data.gov.au, the national collated register of open data: 109,441 datasets, 6021 total captures |
| 00:24:46 | <billybobbyjoe> | aph.gov.au, anything parliament house related: 7749 total captures, 1.8 million individual Hansard pages alone (i'd guess hansard is only ~20 percent of that whole domain) |
| 00:24:46 | <billybobbyjoe> | i could go on really forever but yeah, full sweep of the entire gov.au domain would be ideal fr |
| 00:24:46 | <billybobbyjoe> | i say that as if i have any idea what the fuck i'm doing lmao |
| 00:25:15 | <billybobbyjoe> | if yall want help i will offer it but someone will need to explain this all to me because i have no idea what half of this means |
| 00:25:31 | <billybobbyjoe> | willing to learn but very confused haha |
| 00:28:01 | <billybobbyjoe> | willing to commit some time to scan all these subdomains for search records |
| 00:28:38 | <billybobbyjoe> | the ones i listed are the ones i use on a regular basis and know off the top of my head so who truly knows how wide reaching this is |
| 00:31:02 | | APOLLO03 quits [Client Quit] |
| 00:31:43 | <pokechu22> | If you can identify important sites, that would be helpful. I do see https://www.legislation.gov.au/gazettes/historic/2004 has a few captures and https://www.legislation.gov.au/files/gazettes/historic/2004/2004GN01.pdf has none |
| 00:33:19 | | APOLLO03 joins |
| 00:33:52 | <pokechu22> | https://www.legislation.gov.au/ doesn't seem to be akamai so I can probably run that one in archivebot directly - the site seems to be somewhat scripty but also has versions that work without javascript |
| 00:35:00 | | tekulvw (tekulvw) joins |
| 00:40:05 | | tekulvw quits [Ping timeout: 268 seconds] |
| 00:46:02 | | etnguyen03 (etnguyen03) joins |
| 00:52:45 | | tekulvw joins |
| 00:52:45 | | tekulvw is now authenticated as tekulvw |
| 00:57:21 | | tekulvw quits [Ping timeout: 268 seconds] |
| 00:57:27 | | etnguyen03 quits [Client Quit] |
| 00:59:37 | <klea> | 2026-02-26 00:58:03 <Slimm> https://www.theverge.com/news/884824/corsair-ending-drop-shopping-site drop.com will be ceasing operations next month and likely purged/shut down (forums, site, reviews, etc) (forwarded from #archiveteam) |
| 01:02:45 | | Arcorann__ quits [Ping timeout: 272 seconds] |
| 01:12:44 | | Express826 joins |
| 01:13:19 | | Express826 quits [Client Quit] |
| 01:17:28 | | APOLLO03 quits [Client Quit] |
| 01:17:53 | | APOLLO03 joins |
| 01:20:12 | | lennier2_ quits [Read error: Connection reset by peer] |
| 01:20:28 | | lennier2_ joins |
| 01:20:55 | | tekulvw (tekulvw) joins |
| 01:29:21 | | tekulvw quits [Ping timeout: 272 seconds] |
| 01:30:33 | | tekulvw (tekulvw) joins |
| 01:38:52 | | tekulvw quits [Remote host closed the connection] |
| 01:39:09 | | tekulvw (tekulvw) joins |
| 01:54:05 | | tekulvw quits [Ping timeout: 268 seconds] |
| 02:14:29 | | sec^nd quits [Remote host closed the connection] |
| 02:14:52 | | tekulvw (tekulvw) joins |
| 02:14:59 | | sec^nd (second) joins |
| 02:21:05 | | billybobbyjoe quits [Quit: Ooops, wrong browser tab.] |
| 02:25:05 | | tekulvw quits [Ping timeout: 272 seconds] |
| 02:26:21 | | pabs quits [Ping timeout: 272 seconds] |
| 02:27:10 | | sec^nd quits [Remote host closed the connection] |
| 02:27:41 | | sec^nd (second) joins |
| 02:34:10 | | tekulvw (tekulvw) joins |
| 02:42:49 | | tekulvw quits [Ping timeout: 272 seconds] |
| 02:49:32 | | tekulvw (tekulvw) joins |
| 02:56:51 | | etnguyen03 (etnguyen03) joins |
| 02:58:50 | | tekulvw quits [Ping timeout: 268 seconds] |
| 03:09:05 | | APOLLO03 quits [Client Quit] |
| 03:09:28 | | APOLLO03 joins |
| 03:10:10 | | etnguyen03 quits [Client Quit] |
| 03:18:00 | | etnguyen03 (etnguyen03) joins |
| 03:21:52 | | tekulvw (tekulvw) joins |
| 03:25:20 | | pabs (pabs) joins |
| 03:26:35 | | tekulvw quits [Ping timeout: 268 seconds] |
| 03:31:57 | | tekulvw joins |
| 03:31:57 | | tekulvw is now authenticated as tekulvw |
| 03:36:42 | <h2ibot> | Pokechu22 edited ArchiveBot/Ignore (+49, /* Pinterest */ s.pinimg.com/webapp too): https://wiki.archiveteam.org/?diff=60556&oldid=60495 |
| 03:39:49 | | tekulvw quits [Ping timeout: 272 seconds] |
| 03:42:15 | | PredatorIWD253 joins |
| 03:44:53 | | PredatorIWD25 quits [Ping timeout: 272 seconds] |
| 03:44:53 | | PredatorIWD253 is now known as PredatorIWD25 |
| 03:51:59 | | tekulvw (tekulvw) joins |
| 03:56:10 | | etnguyen03 quits [Client Quit] |
| 03:59:27 | | tekulvw quits [Ping timeout: 272 seconds] |
| 04:00:38 | | legoktm quits [Quit: http://quassel-irc.org - Chat comfortably. Anywhere.] |
| 04:00:41 | | legoktm joins |
| 04:01:28 | | etnguyen03 (etnguyen03) joins |
| 04:10:18 | | etnguyen03 quits [Remote host closed the connection] |
| 04:16:11 | | APOLLO03 quits [Client Quit] |
| 04:17:06 | | APOLLO03 joins |
| 04:19:02 | | tekulvw (tekulvw) joins |
| 04:26:41 | | tekulvw quits [Ping timeout: 272 seconds] |
| 04:30:22 | | tekulvw (tekulvw) joins |
| 04:33:42 | | nine quits [Quit: See ya!] |
| 04:33:55 | | nine joins |
| 04:33:56 | | nine is now authenticated as nine |
| 04:33:56 | | nine quits [Changing host] |
| 04:33:56 | | nine (nine) joins |
| 04:36:05 | | Arcorann__ (Arcorann) joins |
| 04:37:17 | | Island_ quits [Read error: Connection reset by peer] |
| 04:37:30 | | tekulvw quits [Ping timeout: 268 seconds] |
| 05:04:38 | | n9nes quits [Ping timeout: 268 seconds] |
| 05:05:13 | | n9nes joins |
| 05:13:50 | | APOLLO03 quits [Read error: Connection reset by peer] |
| 05:14:42 | | DogsRNice quits [Read error: Connection reset by peer] |
| 05:14:57 | | APOLLO03 joins |
| 05:18:12 | <nicolas17> | I misread the number and thought I was almost done with the classification ratings |
| 05:18:16 | <nicolas17> | I am in fact almost 10% done |
| 05:30:43 | <Hans5958> | Is #losttenure official? Why is it not announced on #archiveteam (or mentioned on the MOTD; it still refers to #archiveteam-bs) |
| 05:31:16 | <nicolas17> | what |
| 05:31:37 | <nicolas17> | we don't put every new project channel in the MOTD |
| 05:32:45 | <Hans5958> | I mentioned it since it still says "We know about the Tenor API → #archiveteam-bs" |
| 05:33:02 | <Hans5958> | If it is official then it should point to #losttenure, no? |
| 05:33:16 | | SootBector quits [Remote host closed the connection] |
| 05:33:26 | <Hans5958> | (on #archiveteam) |
| 05:34:33 | | SootBector (SootBector) joins |
| 05:35:55 | <nicolas17> | oh |
| 05:36:13 | <nicolas17> | I missed that part x_x |
| 05:40:19 | | APOLLO03 quits [Client Quit] |
| 05:41:23 | | APOLLO03 joins |
| 05:42:25 | | eythian quits [Quit: http://quassel-irc.org - Chat comfortably. Anywhere.] |
| 05:43:59 | | eythian joins |
| 05:50:55 | | pabs quits [Ping timeout: 272 seconds] |
| 05:52:06 | | pabs (pabs) joins |
| 06:03:29 | | APOLLO03 quits [Client Quit] |
| 06:03:30 | | LddPotato quits [Read error: Connection reset by peer] |
| 06:04:24 | | LddPotato (LddPotato) joins |
| 06:04:29 | | APOLLO03 joins |
| 06:07:57 | <pabs> | hmm, archive.today is returning "Server Error" for me on archival, anyone else? |
| 06:10:19 | <pabs> | nicolas17: NLA uses Brozzler btw, so they can get JSy things |
| 06:11:29 | | APOLLO03 quits [Client Quit] |
| 06:12:54 | | APOLLO03 joins |
| 06:13:57 | <pokechu22> | Same |
| 06:14:26 | | nexussfan quits [Quit: Konversation terminated!] |
| 06:21:48 | | LddPotato quits [Read error: Connection reset by peer] |
| 06:23:08 | | LddPotato (LddPotato) joins |
| 06:30:52 | | lennier2_ quits [Read error: Connection reset by peer] |
| 06:31:08 | | lennier2_ joins |
| 06:35:06 | | LddPotato quits [Read error: Connection reset by peer] |
| 06:35:50 | | LddPotato (LddPotato) joins |
| 06:45:52 | | tekulvw (tekulvw) joins |
| 06:51:05 | | tekulvw quits [Ping timeout: 272 seconds] |
| 06:51:57 | | LddPotato quits [Read error: Connection reset by peer] |
| 06:52:35 | | LddPotato (LddPotato) joins |
| 07:00:12 | | APOLLO03 quits [Client Quit] |
| 07:00:31 | | APOLLO03 joins |
| 07:01:05 | | tekulvw (tekulvw) joins |
| 07:03:36 | | LddPotato quits [Read error: Connection reset by peer] |
| 07:04:03 | | LddPotato (LddPotato) joins |
| 07:06:07 | | tekulvw quits [Ping timeout: 268 seconds] |
| 07:11:59 | | Webuser613121 joins |
| 07:12:14 | | Webuser613121 quits [Client Quit] |
| 07:36:57 | | tekulvw (tekulvw) joins |
| 07:41:45 | | tekulvw quits [Ping timeout: 272 seconds] |
| 08:03:03 | | lennier2 joins |
| 08:03:17 | | APOLLO03 quits [Ping timeout: 272 seconds] |
| 08:03:39 | | APOLLO03 joins |
| 08:06:27 | | lennier2_ quits [Ping timeout: 272 seconds] |
| 08:06:56 | | fangfufu_ joins |
| 08:09:01 | | fangfufu quits [Ping timeout: 268 seconds] |
| 08:09:43 | <nicolas17> | ok my curl requests are not working anymore |
| 08:10:00 | | GodzFire quits [Quit: Ooops, wrong browser tab.] |
| 08:12:43 | <nicolas17> | got it working again... will do only 2 concurrent instead of 3 |
| 08:23:25 | | APOLLO03a joins |
| 08:23:53 | | APOLLO03 quits [Read error: Connection reset by peer] |
| 08:34:24 | | APOLLO03a quits [Read error: Connection reset by peer] |
| 08:35:49 | | APOLLO03 joins |
| 08:43:28 | | APOLLO03 quits [Client Quit] |
| 08:43:54 | | APOLLO03 joins |
| 09:06:33 | <TheoH7> | Just got round to uploading the archive I did of https://community.jisc.ac.uk to IA: |
| 09:06:38 | <TheoH7> | https://archive.org/details/community.jisc.ac.uk-2026-02-16-8e903462-00000.warc |
| 09:09:47 | <TheoH7> | This is the one where I'd successfully got links to https://community.ja.net to redirect to the same domain, as that was just an alias for the same IPs. |
| 09:11:00 | | APOLLO03a joins |
| 09:11:18 | | APOLLO03 quits [Ping timeout: 268 seconds] |
| 09:27:31 | | ducky quits [Ping timeout: 272 seconds] |
| 09:29:39 | | ducky (ducky) joins |
| 09:38:10 | | tekulvw (tekulvw) joins |
| 09:45:53 | | tekulvw quits [Ping timeout: 272 seconds] |
| 09:48:18 | | APOLLO03a quits [Read error: Connection reset by peer] |
| 09:52:27 | | nulldata-alt1 quits [Quit: Ping timeout (120 seconds)] |
| 09:52:46 | | APOLLO03 joins |
| 10:07:25 | | Guest quits [Ping timeout: 272 seconds] |
| 10:07:41 | | Washuu joins |
| 10:15:14 | | arch quits [Remote host closed the connection] |
| 10:15:46 | | arch (arch) joins |
| 10:18:01 | | APOLLO03 quits [Client Quit] |
| 10:18:50 | | APOLLO03 joins |
| 10:25:37 | | Guest joins |
| 10:30:51 | | Guest quits [Ping timeout: 272 seconds] |
| 10:33:59 | | APOLLO03 quits [Client Quit] |
| 10:35:28 | | APOLLO03 joins |
| 10:46:24 | | Dada joins |
| 10:48:45 | | Guest joins |
| 10:55:31 | | Guest quits [Ping timeout: 268 seconds] |
| 11:01:23 | | Guest joins |
| 11:01:28 | | croissant_ joins |
| 11:05:23 | | croissant quits [Ping timeout: 268 seconds] |
| 11:05:50 | | tekulvw (tekulvw) joins |
| 11:06:37 | | Guest quits [Ping timeout: 268 seconds] |
| 11:07:49 | | APOLLO03 quits [Client Quit] |
| 11:09:28 | | APOLLO03 joins |
| 11:10:45 | | tekulvw quits [Ping timeout: 272 seconds] |
| 11:11:53 | | Guest joins |
| 11:17:05 | | Guest quits [Ping timeout: 272 seconds] |
| 11:20:15 | | APOLLO03 quits [Client Quit] |
| 11:20:46 | | APOLLO03 joins |
| 11:31:29 | | APOLLO03 quits [Client Quit] |
| 11:33:45 | | APOLLO03 joins |
| 11:42:35 | | APOLLO03 quits [Client Quit] |
| 11:42:52 | | APOLLO03 joins |
| 12:00:01 | | Bleo1826007227196234552220 quits [Quit: The Lounge - https://thelounge.chat] |
| 12:02:43 | | Bleo1826007227196234552220 joins |
| 12:09:36 | | tekulvw (tekulvw) joins |
| 12:12:26 | | khaoohs__ quits [Read error: Connection reset by peer] |
| 12:13:05 | | khaoohs__ joins |
| 12:14:43 | | tekulvw quits [Ping timeout: 272 seconds] |
| 12:15:06 | | atphoenix_ quits [Read error: Connection reset by peer] |
| 12:15:41 | | atphoenix_ (atphoenix) joins |
| 12:16:26 | | Snivy quits [Quit: Ping timeout (120 seconds)] |
| 12:18:18 | | Snivy (Snivy) joins |
| 12:22:43 | <@Fusl> | nulldata: re https://learn.redhat.com/ looks like it wants not only proper UA but also cookies that are generated on first visit |
| 12:49:44 | | Washuu quits [Client Quit] |
| 13:12:44 | | cyanbox quits [Read error: Connection reset by peer] |
| 13:37:24 | | xtheaurisx joins |
| 13:37:36 | | xtheaurisx quits [Client Quit] |
| 13:40:51 | | Arcorann__ quits [Ping timeout: 272 seconds] |
| 13:49:06 | | APOLLO03 quits [Client Quit] |
| 13:50:11 | | APOLLO03 joins |
| 14:05:27 | | APOLLO03 quits [Read error: Connection reset by peer] |
| 14:07:53 | | APOLLO03 joins |
| 14:17:47 | | midou quits [Ping timeout: 268 seconds] |
| 14:22:41 | | midou joins |
| 14:38:20 | <@arkiver> | on learn.redhat.com, should be good with AB *i think* |
| 14:41:55 | <masterx244|m> | sometimes a sacrificial first URL (with some meaningless extra parameters) to prime the cookies can be a trick there. had to do that at a crawl once, too, where i had the first URL of the URL list copied and a garbage parameter added to get an unparametered one with the right flags set. there was an ad interstitial on the first visit of that site that had to be skipped since it was useless for archival, and putting that onto a different URL |
| 14:41:55 | <masterx244|m> | was the easiest method |
| 14:45:23 | | FiTheArchiver joins |
| 14:52:17 | | FiTheArchiver quits [Client Quit] |
| 14:54:01 | | simon816 quits [Quit: ZNC 1.10.1 - https://znc.in] |
| 14:54:10 | | ^ quits [Ping timeout: 268 seconds] |
| 14:54:14 | | ^ (^) joins |
| 15:00:43 | | simon816 (simon816) joins |
| 15:02:11 | | ^ quits [Ping timeout: 268 seconds] |
| 15:02:12 | | midou quits [Ping timeout: 268 seconds] |
| 15:02:17 | | ^ (^) joins |
| 15:05:35 | | catbottom quits [Quit: ZNC 1.9.1+deb2+b3 - https://znc.in] |
| 15:06:51 | | catbottom joins |
| 15:10:11 | | tekulvw (tekulvw) joins |
| 15:15:13 | | tekulvw quits [Ping timeout: 272 seconds] |
| 15:28:59 | | APOLLO03 quits [Client Quit] |
| 15:30:13 | | APOLLO03 joins |
| 15:30:26 | <@arkiver> | we have a little emergency project coming for https://numerabilis.u-paris.fr/medica/bibliotheque-numerique/ |
| 15:30:32 | <@arkiver> | shutting down on the 28th |
| 15:31:47 | | Nekroschizofrenetyk joins |
| 15:31:48 | <eggdrop> | [tell] Nekroschizofrenetyk: [2026-02-21T12:13:57Z] <justauser> https://www.olawsky.de/schlesien/forum.html works for me. Want an AB run? |
| 15:32:01 | <justauser> | Actually, already started. |
| 15:32:13 | <Nekroschizofrenetyk> | Hi |
| 15:32:40 | <Nekroschizofrenetyk> | direct links to separate messages work for you? Like this: https://www.olawsky.de/forum/messages/10462.html |
| 15:32:47 | <Nekroschizofrenetyk> | Yeah, that would be great! |
| 15:32:51 | <justauser> | Should be all done already. |
| 15:33:19 | <justauser> | About 1G, list of URLs saved here: https://archive.org/download/archiveteam_archivebot_go_20260224194814_be59de7b/www.olawsky.de-inf-20260224-172857-bfa95-meta.warc.gz |
| 15:34:16 | <Nekroschizofrenetyk> | oh, yes |
| 15:34:24 | <Nekroschizofrenetyk> | great, fantastic! |
| 15:34:32 | <Nekroschizofrenetyk> | by the way |
| 15:34:46 | <Nekroschizofrenetyk> | are you still going to run the Archiwum Allegro project? |
| 15:34:55 | <Nekroschizofrenetyk> | because it might be pointless |
| 15:35:18 | <justauser> | I think it concluded with "potato". |
| 15:35:20 | <Nekroschizofrenetyk> | urls of archival auctions redirect to current ones or similar |
| 15:38:43 | | Nekroschizofrenetyk quits [Client Quit] |
| 15:50:39 | | lennier2_ joins |
| 15:52:19 | | APOLLO03 quits [Client Quit] |
| 15:53:38 | | APOLLO03 joins |
| 15:53:51 | | lennier2 quits [Ping timeout: 272 seconds] |
| 16:04:15 | | iPwnedYourIOTSmartdog quits [Quit: Ping timeout (120 seconds)] |
| 16:04:29 | | iPwnedYourIOTSmartdog joins |
| 16:07:46 | | VerifiedJ quits [Remote host closed the connection] |
| 16:08:31 | | VerifiedJ (VerifiedJ) joins |
| 16:10:19 | | ducky quits [Ping timeout: 272 seconds] |
| 16:20:08 | | Island joins |
| 16:28:07 | | ducky (ducky) joins |
| 16:32:00 | | tekulvw (tekulvw) joins |
| 16:41:21 | | tekulvw quits [Ping timeout: 272 seconds] |
| 16:58:16 | | APOLLO03 quits [Client Quit] |
| 16:58:42 | | APOLLO03 joins |
| 17:03:33 | | tekulvw (tekulvw) joins |
| 17:08:36 | | tekulvw quits [Ping timeout: 268 seconds] |
| 17:15:50 | | tekulvw (tekulvw) joins |
| 17:20:37 | | tekulvw quits [Ping timeout: 272 seconds] |
| 17:21:13 | | Webuser726658 joins |
| 17:23:12 | | lennier2_ quits [Read error: Connection reset by peer] |
| 17:23:26 | | lennier2_ joins |
| 17:30:48 | | APOLLO03 quits [Ping timeout: 268 seconds] |
| 17:32:01 | | APOLLO03 joins |
| 17:44:39 | | Wohlstand (Wohlstand) joins |
| 17:47:57 | | tekulvw (tekulvw) joins |
| 17:52:55 | | tekulvw quits [Ping timeout: 272 seconds] |
| 17:54:30 | | APOLLO03 quits [Client Quit] |
| 17:55:07 | | APOLLO03 joins |
| 17:56:12 | | rover joins |
| 17:57:59 | | roverinexile quits [Ping timeout: 272 seconds] |
| 18:08:13 | | APOLLO03 quits [Client Quit] |
| 18:08:51 | | APOLLO03 joins |
| 18:08:56 | | DogsRNice joins |
| 18:25:22 | | tekulvw (tekulvw) joins |
| 18:32:28 | | tekulvw quits [Ping timeout: 268 seconds] |
| 18:34:09 | | APOLLO03 quits [Client Quit] |
| 18:34:58 | | APOLLO03 joins |
| 18:51:56 | | tekulvw (tekulvw) joins |
| 18:56:53 | | tekulvw quits [Ping timeout: 272 seconds] |
| 19:10:22 | | APOLLO03a joins |
| 19:10:42 | | APOLLO03 quits [Ping timeout: 268 seconds] |
| 19:15:15 | | aninternettroll quits [Ping timeout: 272 seconds] |
| 19:21:44 | | tekulvw (tekulvw) joins |
| 19:21:48 | | APOLLO03a quits [Ping timeout: 268 seconds] |
| 19:23:08 | | APOLLO03 joins |
| 19:24:40 | | aninternettroll (aninternettroll) joins |
| 19:26:39 | | tekulvw quits [Ping timeout: 272 seconds] |
| 19:26:58 | <@arkiver> | imer: for whenever you are around, we have a (i think not huge) project coming up with close deadline for Medica - Bibliothèque Numérique. i made the tracker under "medica" |
| 19:27:10 | <@arkiver> | whenever possible could add a target under "medica" with |
| 19:27:13 | <@arkiver> | archiveteam_medica |
| 19:27:16 | <@arkiver> | medica_ |
| 19:27:31 | <@imer> | yep, i’ll set that up soon |
| 19:27:32 | <@arkiver> | Archive Team Medica Bibliothèque numérique: |
| 19:27:36 | <@arkiver> | thanks a lot! |
| 19:27:41 | <@arkiver> | i will be starting this after sleep |
| 19:27:51 | <@arkiver> | so no need to get it up immediately |
| 19:27:53 | <@arkiver> | good night :) |
| 19:28:02 | <@arkiver> | or day, or something :) |
| 19:32:37 | | DogsRNice_ joins |
| 19:35:22 | | DogsRNice quits [Ping timeout: 268 seconds] |
| 19:36:01 | | tekulvw (tekulvw) joins |
| 19:36:57 | <nicolas17> | oh no |
| 19:37:11 | <nicolas17> | the classification.gov.au website added more titles |
| 19:38:14 | <nicolas17> | will I have to start over? I don't even know if these are ordered by date |
| 19:39:43 | <pokechu22> | I'm pretty sure it's ordered by date |
| 19:40:02 | <pokechu22> | at least "latest-classification-decisions" makes it sound like it is |
| 19:41:13 | | tekulvw quits [Ping timeout: 272 seconds] |
| 19:41:44 | <nicolas17> | so say I got pages 1-46200, then they added more titles, if I continue from 46201 will I get duplicates or will I get missed items? that's the problem >_< |
| 19:42:51 | <nicolas17> | oh looks like I also have two random pages in 39xxx that failed |
| 19:43:00 | <pokechu22> | I think you'll get duplicates, assuming you're going in order |
| 19:43:06 | <nicolas17> | guess I'll run it during the weekend and hope I can get a consistent list |
| 19:43:17 | <pokechu22> | redoing the failed pages is more annoying - you'll need to retry both those pages and several pages after them |
| 19:44:00 | <nicolas17> | checking for duplicates now |
| 19:45:56 | | petrichor quits [Quit: ZNC 1.10.1 - https://znc.in] |
| 19:47:36 | <nicolas17> | title count increased by 1623, and I have 1720 duplicates, mismatch probably caused by those two retried pages... |
| 19:47:45 | | petrichor (petrichor) joins |
| 19:50:59 | <nicolas17> | ETA 20 hours (: |
| 19:53:31 | | cipherrot (petrichor) joins |
| 19:55:47 | | petrichor quits [Ping timeout: 272 seconds] |
| 19:58:27 | <nicolas17> | how are we going to archive this for real anyway? qwarc? |
| 19:59:21 | <nicolas17> | you need some akamai cookie, which seems to stop working after a few hours, and I get a new one from a real browser |
| 19:59:48 | | tekulvw (tekulvw) joins |
| 20:04:39 | | tekulvw quits [Ping timeout: 272 seconds] |
| 20:05:15 | | etnguyen03 (etnguyen03) joins |
| 20:10:21 | | APOLLO03 quits [Ping timeout: 272 seconds] |
| 20:11:08 | | APOLLO03 joins |
| 20:16:23 | <Dango360> | that's a lot of fortnite maps |
| 20:16:44 | <nicolas17> | where |
| 20:17:42 | <Dango360> | i'm just guessing here, but they must be rating every map that gets a certain amount of visits |
| 20:18:05 | <nicolas17> | idk what you're talking about |
| 20:18:49 | <Dango360> | i can't access the website anymore, but the example you showed `𝐒𝐇𝐀𝐃𝐎𝐖 𝐑𝐏 - 𝐕𝟐 - 🏙` was a fortnite map |
| 20:19:08 | <nicolas17> | oh |
| 20:20:19 | <Dango360> | im getting ERR_HTTP2_PROTOCOL_ERROR when i try to access it |
| 20:20:38 | <nicolas17> | does https://www.classification.gov.au/sitemap.xml load? |
| 20:20:57 | <Dango360> | i cleared the cookies/site data and it works again |
| 20:21:03 | <nicolas17> | yep... |
| 20:22:53 | | tekulvw (tekulvw) joins |
| 20:25:40 | <pokechu22> | akamai-- |
| 20:25:40 | <eggdrop> | [karma] 'akamai' now has -225 karma! |
| 20:32:06 | | tekulvw quits [Ping timeout: 268 seconds] |
| 20:54:40 | | tekulvw (tekulvw) joins |
| 20:56:09 | | APOLLO03 quits [Ping timeout: 268 seconds] |
| 20:59:14 | | tekulvw quits [Ping timeout: 268 seconds] |
| 20:59:47 | | Webuser726658 quits [Quit: Ooops, wrong browser tab.] |
| 21:08:13 | | tekulvw (tekulvw) joins |
| 21:17:16 | | Wohlstand quits [Quit: Wohlstand] |
| 21:17:41 | | etnguyen03 quits [Client Quit] |
| 21:20:39 | | tekulvw quits [Ping timeout: 272 seconds] |
| 21:25:31 | | roverinexile joins |
| 21:27:37 | | rover quits [Ping timeout: 272 seconds] |
| 21:32:29 | | tekulvw (tekulvw) joins |
| 21:46:29 | <nicolas17> | now getting several 403s with "A high volume of simultaneous submissions from your network have been made to this website and the security tools used to protect this website have interpreted this as a possible attack on the site." |
| 21:46:39 | <nicolas17> | on like half my requests, randomly |
| 21:47:14 | <nicolas17> | adding a delay doesn't seem to change anything so I might as well go fast |
| 21:47:20 | | tekulvw quits [Ping timeout: 268 seconds] |
| 21:48:58 | | Guest joins |
| 21:58:51 | | tekulvw (tekulvw) joins |
| 21:59:40 | | crullerIRC quits [Ping timeout: 268 seconds] |
| 22:01:20 | | crullerIRC joins |
| 22:03:59 | | tekulvw quits [Ping timeout: 268 seconds] |
| 22:08:30 | | etnguyen03 (etnguyen03) joins |
| 22:21:49 | | Hackerpcs quits [Remote host closed the connection] |
| 22:22:43 | | Hackerpcs (Hackerpcs) joins |
| 22:28:31 | | APOLLO03 joins |
| 22:33:05 | <PredatorIWD25> | Does !a on #telegrab work for all users? |
| 22:35:08 | <klea> | I believe !a is not limited there, yes. |
| 22:38:39 | | tekulvw (tekulvw) joins |
| 22:40:44 | <nicolas17> | !remindme 1h Safari Tech Preview |
| 22:40:45 | <eggdrop> | [remind] ok, i'll remind you at 2026-02-26T23:40:45Z |
| 22:43:27 | | tekulvw quits [Ping timeout: 268 seconds] |
| 22:45:18 | | Arcorann__ (Arcorann) joins |
| 22:47:52 | | TheEnbyperor_ quits [Read error: Connection reset by peer] |
| 22:48:03 | | TheEnbyperor quits [Ping timeout: 272 seconds] |
| 22:48:09 | | TheEnbyperor joins |
| 22:48:13 | | TheEnbyperor_ (TheEnbyperor) joins |
| 22:49:41 | | Dada quits [Remote host closed the connection] |
| 22:53:42 | | PredatorIWD25 is now known as PredatorIWD |
| 23:01:21 | | TheEnbyperor_ quits [Ping timeout: 272 seconds] |
| 23:01:21 | | TheEnbyperor quits [Ping timeout: 272 seconds] |
| 23:03:47 | | nexussfan (nexussfan) joins |
| 23:09:09 | | APOLLO03 quits [Client Quit] |
| 23:09:37 | | APOLLO03 joins |
| 23:11:19 | | APOLLO03 quits [Client Quit] |
| 23:13:15 | | ljcool2006 quits [Quit: Leaving] |
| 23:15:01 | | TheEnbyperor (TheEnbyperor) joins |
| 23:15:03 | | matoro quits [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.] |
| 23:15:04 | | TheEnbyperor_ joins |
| 23:16:46 | | atphoenix__ (atphoenix) joins |
| 23:16:50 | | matoro joins |
| 23:16:57 | | tekulvw (tekulvw) joins |
| 23:19:43 | | atphoenix_ quits [Ping timeout: 272 seconds] |
| 23:20:31 | | matoro quits [Client Quit] |
| 23:20:41 | | matoro joins |
| 23:21:21 | | Island quits [Read error: Connection reset by peer] |
| 23:21:29 | | khaoohs__ quits [Read error: Connection reset by peer] |
| 23:21:37 | | tekulvw quits [Ping timeout: 272 seconds] |
| 23:22:02 | | phillipsjk quits [Remote host closed the connection] |
| 23:22:15 | | phillipsjk joins |
| 23:22:21 | | Island joins |
| 23:22:36 | | khaoohs joins |
| 23:22:37 | | Synbi joins |
| 23:22:41 | | beardicus1 quits [Quit: Ping timeout (120 seconds)] |
| 23:22:47 | | ScenarioPlanet quits [Quit: Ping timeout (120 seconds)] |
| 23:23:30 | | Island quits [Remote host closed the connection] |
| 23:23:38 | | khaoohs quits [Remote host closed the connection] |
| 23:23:43 | | Zachava quits [Quit: ZNC disconnected] |
| 23:23:55 | | Synbi quits [Client Quit] |
| 23:24:00 | | TheTechRobo quits [Quit: Ping timeout (120 seconds)] |
| 23:24:02 | | khaoohs joins |
| 23:24:47 | | TheTechRobo (TheTechRobo) joins |
| 23:25:06 | | Zachava (Zachava) joins |
| 23:25:08 | | khaoohs quits [Remote host closed the connection] |
| 23:26:03 | | Church quits [Ping timeout: 272 seconds] |
| 23:27:24 | | Synbi joins |
| 23:27:51 | <Synbi> | are yall looking into archiving myrient? |
| 23:28:10 | | Island joins |
| 23:28:30 | | Island quits [Remote host closed the connection] |
| 23:28:48 | | Island joins |
| 23:29:34 | <nicolas17> | what is the website domain? |
| 23:29:44 | | khaoohs joins |
| 23:30:01 | <multisn8> | https://myrient.erista.me -- see https://t.me/myrient/107 for the shutdown announcement |
| 23:30:08 | | khaoohs quits [Remote host closed the connection] |
| 23:30:25 | | khaoohs joins |
| 23:30:29 | <nicolas17> | "Myrient sets the standard for video game preservation and takes a different approach by focusing on accessibility" |
| 23:30:32 | <nicolas17> | maybe we can contact them |
| 23:31:14 | <nicolas17> | https://myrient.erista.me/files/Internet%20Archive/ this should probably be excluded |
| 23:32:22 | | Synbi quits [Client Quit] |
| 23:32:37 | | v01d joins |
| 23:40:45 | <eggdrop> | [remind] nicolas17: Safari Tech Preview |
| 23:42:22 | <PredatorIWD> | When crawling sites for archival, do we archive mediafire, gdrive, mega etc links that are found? If not why not and is there a list being kept somewhere of scraped links that will be used one day to do so? |
| 23:42:54 | <PredatorIWD> | I can look into making some software to save that data |
| 23:48:36 | | Synbi joins |
| 23:49:30 | | tekulvw (tekulvw) joins |
| 23:52:21 | | Church (Church) joins |
| 23:52:26 | <ericgallager> | https://wiki.archiveteam.org/index.php/MediaFire |
| 23:52:44 | <ericgallager> | https://wiki.archiveteam.org/index.php/Google_Drive |
| 23:54:33 | | tekulvw quits [Ping timeout: 272 seconds] |
| 23:54:35 | <h2ibot> | Cooljeanius edited Mega (+8, use URL template): https://wiki.archiveteam.org/?diff=60557&oldid=59728 |
| 23:55:35 | <h2ibot> | Cooljeanius edited Mega (+20, "Mega.nz" redirects here): https://wiki.archiveteam.org/?diff=60558&oldid=60557 |
| 23:56:09 | <PredatorIWD> | Those don't say if the archivebot and other projects by default save everything found as the sites are crawled, nor if there is a list of links saved for the future if they are not, and I'm sure JS heavy sites like mega probably aren't supported, but yet they also hold a lot of important data |
| 23:58:39 | | Webuser710364 joins |
| 23:58:59 | | Webuser710364 quits [Client Quit] |
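
The pagination question raised around 19:41 (resume from the next page after new titles were added: duplicates or missed items?) can be checked with a small simulation. This is only a sketch under stated assumptions: the list is assumed to be newest-first with 20 items per page, as suggested in the log, and the `title-N` names, the list sizes, and the helper functions are all hypothetical. Nothing here talks to the real site.

```python
PAGE_SIZE = 20  # the log mentions 20 titles per list page


def get_page(listing, page):
    """Return one page (1-indexed) of a newest-first listing."""
    start = (page - 1) * PAGE_SIZE
    return listing[start:start + PAGE_SIZE]


def crawl(listing, first_page, last_page):
    """Collect items from first_page..last_page inclusive."""
    items = []
    for page in range(first_page, last_page + 1):
        items.extend(get_page(listing, page))
    return items


def dedupe(items):
    """Drop repeated items, keeping first-seen order."""
    return list(dict.fromkeys(items))


# Hypothetical data: 200 titles, newest first.
listing = [f"title-{n}" for n in range(200, 0, -1)]
first_half = crawl(listing, 1, 5)  # pages 1-5, i.e. 100 items

# Assumption: 30 new titles appear at the front while the crawl is paused.
listing = [f"title-{n}" for n in range(230, 200, -1)] + listing
total_pages = -(-len(listing) // PAGE_SIZE)  # ceiling division
second_half = crawl(listing, 6, total_pages)

collected = first_half + second_half
unique = dedupe(collected)
duplicates = len(collected) - len(unique)

# Titles that existed before the pause but never got collected.
missed_old = {f"title-{n}" for n in range(1, 201)} - set(unique)
```

Under these assumptions, resuming from the next page re-fetches exactly as many old items as were added and misses none of the older ones; the newly added titles themselves sit on the first pages and would still need a separate fetch. A duplicate count that differs from the number of added titles, as in the log, would then point to something else, such as the retried pages.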