00:01:07 | | icedice quits [Read error: Connection reset by peer] |
00:03:32 | | le0n quits [Quit: see you later, alligator] |
00:04:07 | | icedice (icedice) joins |
00:17:13 | | le0n (le0n) joins |
00:36:38 | | nic9070 quits [Read error: Connection reset by peer] |
00:38:08 | | nic9070 (nic) joins |
00:58:09 | | jasons quits [Ping timeout: 272 seconds] |
01:04:21 | | kiryu_ joins |
01:06:14 | | kiryu__ joins |
01:07:39 | | kiryu quits [Ping timeout: 272 seconds] |
01:10:11 | | kiryu_ quits [Ping timeout: 272 seconds] |
01:37:43 | <pabs> | fishingforsoup: you can do it in #down-the-tube, please read the channel topic and the scope on the YouTube wiki page though |
02:00:59 | | jasons (jasons) joins |
02:40:25 | | systwi_ joins |
02:48:02 | | lennier1 quits [Remote host closed the connection] |
02:57:13 | | jasons quits [Ping timeout: 272 seconds] |
03:01:26 | | nic9070 quits [Client Quit] |
03:02:21 | | nic9070 (nic) joins |
03:07:38 | | a joins |
03:08:38 | <a> | Does anyone know if purevolume.com/IslandViewDrive got archived in the purevolume job? I can't seem to find it in the cdx files |
03:17:41 | <@JAA> | a: Looks like WARCs 00002 and 00014 should be relevant. Note that it got archived in lowercase. |
03:20:00 | <@JAA> | + 00046 |
03:21:07 | <@JAA> | That should get you started at least. |
03:21:33 | <@JAA> | There might be more in some later WARC. |
03:22:03 | <@JAA> | `curl -sL https://archive.org/download/archiveteam_archivebot_go_20180828220002/www.purevolume.com-inf-20180424-221829-97mda-meta.warc.gz | zstdgrep -Fia islandviewdrive` is all I did. |
03:22:34 | <@JAA> | The data for a particular time should usually be in the item with the next highest timestamp in its name. |
03:23:11 | <a> | How do I find those 3 WARCs? |
03:23:16 | <@JAA> | https://archive.fart.website/archivebot/viewer/job/2018042422182997mda |
03:23:50 | <a> | Thank you :) |
03:33:42 | | lukash96 joins |
03:34:20 | | lukash9 quits [Ping timeout: 240 seconds] |
03:34:20 | | lukash96 is now known as lukash9 |
03:44:38 | | sonick quits [Client Quit] |
03:45:45 | | qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins |
04:00:28 | | jasons (jasons) joins |
04:03:17 | <a> | @JAA I found http://www.purevolume.com/islandviewdrive/albums/The+Sun+Don%27t+Shine+In+Your+TV?PageSpeed=noscript in that list. Trying to find which WARC it's in and not finding it. |
04:08:05 | <@JAA> | a: Look at the timestamp at the beginning of the line, then look for the first WARC that's in an item (last column) with a timestamp bigger than that. |
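A minimal sketch of the lookup described above, assuming the timestamps in the job viewer list are 14-digit strings in the same YYYYMMDDhhmmss form as the suffix of the item name. Only archiveteam_archivebot_go_20180828220002 appears in this log; the other item names and the function names are hypothetical, made up for illustration.

```python
from bisect import bisect_right
from typing import List, Optional


def item_timestamp(item_name: str) -> str:
    """Return the trailing timestamp of an item name (assumed 14 digits)."""
    return item_name.rsplit("_", 1)[-1]


def first_item_after(url_timestamp: str, item_names: List[str]) -> Optional[str]:
    """Find the first item whose timestamp is strictly greater than url_timestamp.

    Fixed-width digit strings sort the same way lexicographically and
    numerically, so plain string comparison is enough here.
    """
    ordered = sorted(item_names, key=item_timestamp)
    stamps = [item_timestamp(name) for name in ordered]
    idx = bisect_right(stamps, url_timestamp)
    return ordered[idx] if idx < len(ordered) else None


# Hypothetical item list; only the last name is taken from the log.
items = [
    "archiveteam_archivebot_go_20180501000001",
    "archiveteam_archivebot_go_20180612000001",
    "archiveteam_archivebot_go_20180828220002",
]
print(first_item_after("20180425120000", items))
```

If no item has a later timestamp, the function returns None, which would suggest the data has not been uploaded into an item yet.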
04:09:01 | <a> | ok |
04:09:12 | | emberquill080 quits [Client Quit] |
04:09:37 | | emberquill080 (emberquill) joins |
04:12:44 | | a quits [Remote host closed the connection] |
04:32:34 | | emberquill080 quits [Client Quit] |
04:33:17 | | emberquill080 (emberquill) joins |
04:37:28 | | hackbug quits [Remote host closed the connection] |
04:39:39 | | hackbug (hackbug) joins |
04:41:54 | | Craigle quits [Client Quit] |
04:42:25 | | Craigle (Craigle) joins |
05:01:20 | | jasons quits [Ping timeout: 240 seconds] |
05:36:39 | | Island quits [Read error: Connection reset by peer] |
06:05:16 | | jasons (jasons) joins |
06:19:52 | | DogsRNice quits [Read error: Connection reset by peer] |
06:26:51 | | le0n quits [Ping timeout: 272 seconds] |
06:29:33 | | le0n (le0n) joins |
06:32:22 | | Arcorann (Arcorann) joins |
06:52:00 | | lennier1 (lennier1) joins |
07:03:20 | | jasons quits [Ping timeout: 240 seconds] |
07:13:37 | | ymgve_ is now known as ymgve |
07:53:53 | | riku quits [Quit: WeeChat 4.1.2] |
07:54:20 | | @rewby quits [Ping timeout: 240 seconds] |
08:06:51 | | jasons (jasons) joins |
08:24:06 | | qwertyasdfuiopghjkl quits [Remote host closed the connection] |
08:32:27 | | qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins |
09:04:33 | | jasons quits [Ping timeout: 272 seconds] |
09:14:19 | | riku (riku) joins |
09:19:20 | | bf_ joins |
09:32:17 | | lizardexile joins |
10:00:01 | | Bleo18260 quits [Client Quit] |
10:01:27 | | Bleo18260 joins |
10:07:26 | | jasons (jasons) joins |
10:09:17 | | riku quits [Client Quit] |
10:27:03 | | lizardexile quits [Client Quit] |
10:49:52 | | AK quits [Quit: AK] |
11:04:53 | | jasons quits [Ping timeout: 272 seconds] |
11:08:20 | | ScenarioPlanet quits [Ping timeout: 240 seconds] |
11:08:50 | | Pedrosso quits [Ping timeout: 240 seconds] |
11:08:50 | | TheTechRobo quits [Ping timeout: 240 seconds] |
11:12:11 | | Pedrosso joins |
11:13:49 | | tbc1887 quits [Quit: Ping timeout (120 seconds)] |
11:14:10 | | tbc1887 (tbc1887) joins |
11:14:12 | | TheTechRobo (TheTechRobo) joins |
11:14:12 | | TheTechRobo quits [Excess Flood] |
11:15:57 | | ScenarioPlanet (ScenarioPlanet) joins |
11:17:16 | | TheTechRobo (TheTechRobo) joins |
11:32:25 | | riku (riku) joins |
11:36:20 | | TheTechRobo quits [Ping timeout: 240 seconds] |
11:36:20 | | ScenarioPlanet quits [Ping timeout: 240 seconds] |
11:36:20 | | Pedrosso quits [Ping timeout: 240 seconds] |
11:37:58 | | Pedrosso joins |
11:38:06 | | ScenarioPlanet (ScenarioPlanet) joins |
11:38:23 | | TheTechRobo (TheTechRobo) joins |
11:43:29 | | tbc1887 quits [Client Quit] |
11:49:25 | | Pedrosso quits [Client Quit] |
11:49:25 | | ScenarioPlanet quits [Client Quit] |
11:49:25 | | TheTechRobo quits [Client Quit] |
11:49:36 | | Pedrosso joins |
11:49:41 | | ScenarioPlanet (ScenarioPlanet) joins |
11:49:59 | | TheTechRobo (TheTechRobo) joins |
11:52:57 | | riku quits [Client Quit] |
12:07:47 | | VerifiedJ quits [Client Quit] |
12:07:58 | | jasons (jasons) joins |
12:09:50 | | riku (riku) joins |
12:33:35 | | VerifiedJ (VerifiedJ) joins |
13:02:20 | | rewby (rewby) joins |
13:02:20 | | @ChanServ sets mode: +o rewby |
13:09:20 | | jasons quits [Ping timeout: 240 seconds] |
13:10:50 | | Arcorann quits [Ping timeout: 240 seconds] |
13:18:10 | | eroc19905 quits [Client Quit] |
13:18:38 | | eroc1990 (eroc1990) joins |
13:28:41 | <h2ibot> | OrIdow6 edited List of websites excluded from the Wayback Machine (+22, https://shivae.com/): https://wiki.archiveteam.org/?diff=51568&oldid=51567 |
13:33:41 | <h2ibot> | OrIdow6 edited List of websites excluded from the Wayback Machine (+22, https://shivae.net/): https://wiki.archiveteam.org/?diff=51569&oldid=51568 |
13:34:42 | <h2ibot> | OrIdow6 edited List of websites excluded from the Wayback Machine (+21, shivae.org (I do not mean to be doing these all…): https://wiki.archiveteam.org/?diff=51570&oldid=51569 |
13:34:49 | <@OrIdow6^2> | Apropos of the above, thoughts on archiving sites like the above, where a web 1.0 site is excluded from the WBM presumably by its creator/owner? |
13:35:38 | <@OrIdow6^2> | Excluded but no indication that it is otherwise intended to be ephemeral |
13:38:41 | <qwertyasdfuiopghjkl> | Bilibili Comics ( https://www.bilibilicomics.com , https://manga.bilibili.com ) is shutting down on 2024-02-29 06:00 GMT. The site requires JS, so ArchiveBot doesn't get anything useful. |
13:40:05 | <@OrIdow6^2> | Holy moly that is some timing |
13:40:15 | <@OrIdow6^2> | Right as I am looking thru webcomics |
13:41:14 | <@OrIdow6^2> | In queue for Deathwatch qwertyasdfuiopghjkl or should I add it? |
13:48:02 | <@OrIdow6^2> | Oof, there's |
13:48:07 | <@OrIdow6^2> | some POST going on there |
13:52:14 | <qwertyasdfuiopghjkl> | OrIdow6^2: I haven't had time to make an account on the wiki yet, so would be nice if you could add it yourself. |
13:54:16 | <@OrIdow6^2> | Alright qwertyasdfuiopghjkl, thanks |
13:54:45 | <h2ibot> | OrIdow6 edited Deathwatch (+144, /* 2024 */ https://www.bilibilicomics.com/ Feb 29): https://wiki.archiveteam.org/?diff=51571&oldid=51560 |
13:59:05 | <imer> | who designs these apis to use POST instead of GET :< |
14:00:47 | <h2ibot> | JAABot edited List of websites excluded from the Wayback Machine (+0): https://wiki.archiveteam.org/?diff=51572&oldid=51570 |
14:03:20 | <qwertyasdfuiopghjkl> | The shutdown notice on https://www.bilibilicomics.com links to https://manga.bilibili.com/blackboard/activity-eKy5XrQQmj.html?lang=en , so I assume https://manga.bilibili.com will also probably be shut down. Other relevant subdomains I found are https://m.bilibilicomics.com (mobile site) and https://uat-www.bilibilicomics.com (looks like some sort |
14:03:21 | <qwertyasdfuiopghjkl> | of test thing). |
14:03:49 | <@OrIdow6^2> | It has POST but beyond that the basic viewing at least is determinate |
14:04:00 | <@OrIdow6^2> | So a hypothetical future POST-capable WBM should work with it |
14:04:04 | <qwertyasdfuiopghjkl> | also https://uat-m.bilibilicomics.com |
14:13:09 | | jasons (jasons) joins |
14:16:19 | | AK (AK) joins |
14:21:33 | <@OrIdow6^2> | Anyhow seems like the AB job is picking up on a lot, maybe we'll just be able to parse out the images from the WARC
14:21:46 | <@OrIdow6^2> | Parse out the pertinent data
14:24:16 | <qwertyasdfuiopghjkl> | All the pages are identical if you look at view-source: so I don't think that'll be much help |
14:28:43 | | lizardexile joins |
14:34:27 | <qwertyasdfuiopghjkl> | actually, looks like some of them are different |
14:57:27 | | sonick (sonick) joins |
15:10:20 | | jasons quits [Ping timeout: 240 seconds] |
15:28:18 | | sec^nd quits [Ping timeout: 255 seconds] |
15:41:34 | | DogsRNice joins |
16:06:21 | | lennier1 quits [Ping timeout: 272 seconds] |
16:14:16 | | jasons (jasons) joins |
16:16:23 | | lizardexile quits [Client Quit] |
16:58:14 | | sec^nd (second) joins |
17:13:20 | | jasons quits [Ping timeout: 240 seconds] |
17:52:11 | | Megame (Megame) joins |
17:55:48 | <Vokun> | Is the Runescape forum read only yet? Was there a time/timezone given? |
17:57:00 | | Wohlstand (Wohlstand) joins |
17:57:10 | <@JAA> | I didn't see a time or timezone anywhere. |
17:58:18 | <@JAA> | Looks like it might be locked though. Last post I can find was at '10:00:56', though it doesn't mention a time zone there either. |
17:58:28 | <@JAA> | https://secure.runescape.com/m=forum/forums?55,56,613,66067321,goto,102 |
18:00:37 | <@JAA> | I think it's UTC. |
18:01:03 | <@JAA> | Judging by WBM snapshots of the homepage, at least. |
18:07:18 | <@JAA> | (Also, nice song choice for the last post ever in a large forum.) |
18:12:52 | | Island joins |
18:16:52 | | jasons (jasons) joins |
18:17:14 | | Hackerpcs quits [Quit: Hackerpcs] |
18:20:11 | | Hackerpcs (Hackerpcs) joins |
18:22:37 | <fireonlive> | huh, yes indeed lol |
18:22:59 | <@JAA> | bear.community is already dead, doesn't resolve. It was supposed to shut down on the 30th. |
18:23:20 | <fireonlive> | i wonder what forum software that is. "Quick find code: 55-56-613-66067321" |
18:23:42 | <@JAA> | I'm pretty sure it's a custom thing. |
18:23:48 | <fireonlive> | ah that'd make sense |
18:23:52 | | fireonlive pours one out for the gay bears |
18:24:29 | <@JAA> | Forum pagination is limited to 50 pages. |
18:24:56 | <fireonlive> | 🤦♂️ |
18:25:33 | <@JAA> | Thread URLs contain four IDs, and it appears you need all of them. |
18:26:08 | <@JAA> | First two appear to identify the subforum, but not sure how. No clue what the third number is. |
18:26:58 | <@JAA> | The third one seems to be up to three digits. |
18:27:47 | <@JAA> | The fourth is the actual thread ID, and it tops out somewhere around 66293580. |
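A rough sketch of splitting such a URL into its IDs, based only on the two URLs quoted in this log. It assumes the comma-separated query holds the IDs first and that "goto,N" selects a page; the function names are hypothetical and the forum's real URL scheme may have more cases.

```python
from typing import List, Optional, Tuple
from urllib.parse import urlsplit


def parse_forum_url(url: str) -> Tuple[List[int], Optional[int]]:
    """Split the query of a forum URL into (IDs, page).

    Assumption: everything before an optional "goto" token is a numeric ID,
    and the token after "goto" is a page number.
    """
    parts = urlsplit(url).query.split(",")
    page = None
    if "goto" in parts:
        i = parts.index("goto")
        page = int(parts[i + 1])
        parts = parts[:i]
    return [int(p) for p in parts], page


def quick_find_code(ids: List[int]) -> str:
    """Join the IDs with dashes, matching the 'Quick find code' format."""
    return "-".join(str(n) for n in ids)


ids, page = parse_forum_url(
    "https://secure.runescape.com/m=forum/forums?55,56,613,66067321,goto,102"
)
print(ids, page, quick_find_code(ids))
```

A subforum listing URL such as forums?14,15,goto,50 yields only two IDs, which fits the guess that the first two numbers identify the subforum.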
18:29:45 | <@JAA> | Between 47 forums, 1000 possible third IDs, and 66 million thread IDs, we're looking at 3.1 trillion possible combinations. :-| |
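The arithmetic behind that estimate, as a quick check. The three factors come from the messages above; the constant and function names are made up here, and the second figure simply swaps in the approximate maximum thread ID mentioned earlier.

```python
FORUMS = 47                      # number of forums, from the log
THIRD_IDS = 1000                 # third ID is up to three digits: 0..999
THREAD_IDS_ROUNDED = 66_000_000  # "66 million thread IDs"
THREAD_ID_MAX_SEEN = 66_293_580  # approximate top thread ID, from the log


def combinations(forums: int, third_ids: int, thread_ids: int) -> int:
    """Size of the search space if every ID combination had to be tried."""
    return forums * third_ids * thread_ids


rounded = combinations(FORUMS, THIRD_IDS, THREAD_IDS_ROUNDED)
upper = combinations(FORUMS, THIRD_IDS, THREAD_ID_MAX_SEEN)
print(f"{rounded:,}")  # 3,102,000,000,000
print(f"{upper:,}")    # 3,115,798,260,000
```

Either way the result is about 3.1 trillion, which is why brute-forcing the IDs is impractical.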
18:33:21 | <@JAA> | Oh, old threads might not be kept. Some thread links from the WBM from 2015 don't work, at least. |
18:37:35 | <@JAA> | Same with 2021 |
18:38:19 | <@JAA> | I'm looking at the 'General' forums, the second largest. |
18:38:56 | <@JAA> | Same with early 2023 |
18:39:09 | <@JAA> | https://secure.runescape.com/m=forum/forums?14,15,goto,50 goes back to June. |
18:39:47 | <@JAA> | Yep, links on https://web.archive.org/web/20230530224940/https://secure.runescape.com/m=forum/forums?14,15 are also dead. |
18:40:12 | <@JAA> | So I guess they only keep 50 pages worth of threads per subforum. |
18:43:36 | <fireonlive> | oh wow o_o |
18:43:41 | <fireonlive> | talk about ephemeral |
18:45:29 | <@JAA> | Random Redditor comments from 2016 confirm it: https://old.reddit.com/r/runescape/comments/41l3q8/in_the_rs_forum_is_there_a_way_to_see_all_your/ |
18:49:06 | | RealPerson leaves |
18:52:20 | | Wohlstand quits [Client Quit] |
19:15:05 | | jasons quits [Ping timeout: 272 seconds] |
19:15:52 | | jtagcat quits [Quit: Bye!] |
19:16:19 | | jtagcat (jtagcat) joins |
19:22:41 | | Chris5010 quits [Ping timeout: 272 seconds] |
19:40:30 | | ctag quits [Read error: Connection reset by peer] |
19:40:49 | | ctag (ctag) joins |
19:43:22 | | Chris5010 (Chris5010) joins |
20:18:17 | | jasons (jasons) joins |
20:28:59 | | Carnildo_again is now known as Carnildo |
20:32:39 | <@JAA> | On the plus side, this probably means a simple recursive AB job (already running) is enough to capture everything that still exists. |
20:32:56 | <@JAA> | Although bruteforcing the IDs would've been fun. |
20:34:38 | | sonick quits [Client Quit] |
20:45:33 | | Dango360 (Dango360) joins |
21:01:49 | | lennier1 (lennier1) joins |
21:02:15 | | BlueMaxima joins |
21:11:34 | | datechnoman quits [Quit: The Lounge - https://thelounge.chat] |
21:12:10 | | datechnoman (datechnoman) joins |
21:15:20 | | jasons quits [Ping timeout: 240 seconds] |
21:31:11 | <h2ibot> | JustAnotherArchivist edited Deathwatch (+347, Dead sites are dead): https://wiki.archiveteam.org/?diff=51573&oldid=51571 |
21:41:06 | | magmaus3 quits [Quit: Ping timeout (120 seconds)] |
21:41:20 | | magmaus3 (magmaus3) joins |
22:18:57 | | jasons (jasons) joins |
22:20:39 | | lennier1 quits [Ping timeout: 272 seconds] |
22:23:07 | | Arcorann (Arcorann) joins |
22:31:25 | | BearFortress quits [Ping timeout: 272 seconds] |
23:05:47 | | lukash98 joins |
23:06:20 | | lukash9 quits [Ping timeout: 240 seconds] |
23:06:20 | | lukash98 is now known as lukash9 |
23:18:17 | | jasons quits [Ping timeout: 272 seconds] |
23:20:32 | | Webuser442 joins |
23:20:50 | <Webuser442> | Hello, anyone here? I need a little help.
23:21:12 | <pokechu22> | Hi |
23:21:49 | <Webuser442> | Hi pokechu22, I have a question about the Warrior tool on VirtualBox: can you or anyone tell me how to archive a website?
23:22:31 | <pokechu22> | Warrior is used to run projects that someone else has set up - it's good for large sites which need a lot of people, but it's not a tool you can use to save a small site for yourself |
23:22:51 | <pokechu22> | https://github.com/ArchiveTeam/grab-site is better for that kind of thing (though I haven't used it myself) |
23:23:49 | <Webuser442> | Then I misunderstood... |
23:24:06 | <pokechu22> | What site are you trying to archive? |
23:24:46 | <Webuser442> | I'll take a look at it. Various pages; the Wayback site does not archive all domains.
23:38:04 | | Webuser442 quits [Remote host closed the connection] |
23:38:05 | | lennier1 (lennier1) joins |