| 00:04:15 | | Flashfire42 is now authenticated as flashfire42 |
| 03:14:51 | <HP_Archivist> | WBM captured this page: https://encompass.com/model/EPSE10000XLPH |
| 03:15:17 | <HP_Archivist> | But what I want to do is capture the outlinks for each part listed on each of the 7 pages. Any suggestions other than manually going in and doing that over again for all? |
| 03:22:41 | <pabs> | HP_Archivist: locally download all the pages, extract outlinks from each of them, compile into a list, upload to transfer, do #archivebot !ao < $transferurl |
| 03:23:13 | <HP_Archivist> | I used HTTrack to do that, but couldn't figure out where to look for the links |
| 03:23:17 | <HP_Archivist> | pabs ^ |
| 03:24:00 | <pabs> | personally I would just do this: curl -sL $page | pup 'a attr{href}' | sort -u |
| 03:24:27 | <pabs> | have you got the URLs to the other pages? seems to need JS? |
| 03:25:13 | <pabs> | (pup is an HTML parser; the 'a attr{href}' selector means find all <a> tags and grab their href attributes) |
| 03:28:53 | <HP_Archivist> | No, that's just it. Need an easy way to grab those URLs to the other pages/individual parts pages |
| 03:29:11 | <pabs> | parts pages will work with pup |
| 03:29:20 | <pabs> | the 7 pages not sure, will check |
| 03:31:03 | <HP_Archivist> | Luckily, WBM did just fine with the 7 pages |
| 03:31:11 | <HP_Archivist> | Did not grab the parts pages though |
| 03:33:12 | <HP_Archivist> | I mean, the individual parts pages don't have much detail from this partner supplies company. The pages don't list much other than part number and some vague name. But they're still important to save as reference |
| 03:36:04 | <pabs> | lol, those "pages" are all in the same HTML file |
| 03:36:20 | <pabs> | so a simple pup href extraction works, sec |
| 03:37:48 | <HP_Archivist> | Ah, that's why SPN did fine with it |
| 03:40:05 | <pabs> | if you did SPN outlinks then it should have all pages |
| 03:40:14 | <pabs> | but if not, here is what pup got: https://transfer.archivete.am/S4sVE/encompass.com-model-EPSE10000XLPH-outlinks.txt |
| 03:40:53 | <pabs> | edit that, reupload it, do something like this in the #archivebot chan: !ao < https://transfer.archivete.am/S4sVE/encompass.com-model-EPSE10000XLPH-outlinks.txt |
| 03:40:59 | <HP_Archivist> | I did, but maybe I just need to give it time for them to show up in WBM, because clicking each part link just brings up a not-crawled SPN page |
| 03:41:07 | <HP_Archivist> | Many thanks, pabs |
| 03:41:19 | <HP_Archivist> | I'll submit it in ab in a few |
| 03:42:05 | <HP_Archivist> | This particular scanner from Epson is used in the VGPC community and is also widely used generally for large format archival scanning. So, any reference to its technical literature and parts list identification is good to save |
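
For readers without pup installed, the href extraction pabs describes above (`curl -sL $page | pup 'a attr{href}' | sort -u`) can be roughly approximated with the Python standard library. This is only a sketch, not what was run in the channel: the names `HrefCollector` and `extract_outlinks` are made up for illustration, the sample HTML and its `/part/...` paths are hypothetical, and two steps are assumptions beyond the pup one-liner (relative links are resolved against the page URL, and non-HTTP links such as `mailto:` are dropped) so the result can be used directly as a URL list.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class HrefCollector(HTMLParser):
    """Collect the href of every <a> tag, similar to pup 'a attr{href}'."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = set()  # a set de-duplicates, like `sort -u`

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Assumption: resolve relative links against the page URL.
                absolute = urljoin(self.base_url, value)
                # Assumption: keep only http(s) links; skip mailto:, javascript:, etc.
                if urlparse(absolute).scheme in ("http", "https"):
                    self.links.add(absolute)


def extract_outlinks(html, base_url):
    """Return the sorted, unique absolute links found in an HTML string."""
    collector = HrefCollector(base_url)
    collector.feed(html)
    return sorted(collector.links)


if __name__ == "__main__":
    # Hypothetical sample markup; a real run would read the saved page instead.
    sample = (
        '<a href="/part/123">Part A</a>'
        '<a href="/part/123">Part A again</a>'
        '<a href="mailto:sales@example.com">Contact</a>'
    )
    page_url = "https://encompass.com/model/EPSE10000XLPH"
    for url in extract_outlinks(sample, page_url):
        print(url)
```

Writing the printed lines to a text file, one URL per line, gives a list in the same shape as the one pabs uploaded to transfer for `!ao <`.
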
| 04:11:28 | | Justin[home] is now known as DopefishJustin |
| 04:21:43 | | fireonlive quits [Client Quit] |
| 04:23:27 | | fireonlive (fireonlive) joins |
| 06:01:46 | | AlsoHP_Archivist joins |
| 06:02:02 | | atphoenix__ (atphoenix) joins |
| 06:02:24 | | s-crypt20 (s-crypt) joins |
| 06:02:35 | | sepro8 (sepro) joins |
| 06:02:35 | | fireonlive quits [Client Quit] |
| 06:02:35 | | Flashfire42 quits [Client Quit] |
| 06:02:35 | | s-crypt2 quits [Client Quit] |
| 06:02:35 | | nulldata quits [Client Quit] |
| 06:02:35 | | kiska quits [Client Quit] |
| 06:02:36 | | s-crypt20 is now known as s-crypt2 |
| 06:02:39 | | TheTechRobo quits [Client Quit] |
| 06:02:39 | | sepro quits [Client Quit] |
| 06:02:39 | | HP_Archivist quits [Remote host closed the connection] |
| 06:02:39 | | atphoenix_ quits [Remote host closed the connection] |
| 06:02:39 | | sepro8 is now known as sepro |
| 06:02:39 | | andrew quits [Client Quit] |
| 06:02:39 | | project10 quits [Client Quit] |
| 06:02:39 | | Ryz quits [Client Quit] |
| 06:02:52 | | nulldata (nulldata) joins |
| 06:03:11 | | Ryz (Ryz) joins |
| 06:04:03 | | kiska (kiska) joins |
| 06:04:12 | | fireonlive (fireonlive) joins |
| 06:04:37 | | TheTechRobo (TheTechRobo) joins |
| 06:22:23 | | Jake quits [Ping timeout: 272 seconds] |
| 06:28:55 | | Jake (Jake) joins |
| 06:37:46 | | Flashfire42 joins |
| 06:37:51 | | Flashfire42 is now authenticated as flashfire42 |
| 06:56:44 | | Arcorann (Arcorann) joins |
| 06:59:06 | | project10 (project10) joins |
| 08:44:40 | | qwertyasdfuiopghjkl quits [Remote host closed the connection] |
| 09:48:24 | | qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins |
| 09:56:30 | | [42] quits [Remote host closed the connection] |
| 09:57:25 | | [42] (N4Y) joins |
| 11:53:02 | | dxrt_ joins |
| 11:53:11 | | project10 quits [Client Quit] |
| 11:53:11 | | TheTechRobo quits [Client Quit] |
| 11:53:12 | | fireonlive quits [Client Quit] |
| 11:53:12 | | [42] quits [Max SendQ exceeded] |
| 11:53:12 | | dxrt quits [Client Quit] |
| 11:53:12 | | qwertyasdfuiopghjkl quits [Client Quit] |
| 11:53:12 | | balrog quits [Client Quit] |
| 11:53:21 | | [42] (N4Y) joins |
| 11:53:57 | | balrog (balrog) joins |
| 11:53:57 | | project10 (project10) joins |
| 11:54:44 | | fireonlive (fireonlive) joins |
| 11:57:06 | | fireonlive quits [Killed (NickServ (GHOST command used by fireonlive9))] |
| 11:57:42 | | project10 quits [Client Quit] |
| 11:57:57 | | project10 (project10) joins |
| 11:58:35 | | fireonlive (fireonlive) joins |
| 11:59:40 | | TheTechRobo (TheTechRobo) joins |
| 12:01:36 | | TheTechRobo quits [Excess Flood] |
| 12:03:59 | | TheTechRobo (TheTechRobo) joins |
| 12:44:55 | | Arcorann quits [Ping timeout: 272 seconds] |
| 14:11:31 | | Matthww1192 joins |
| 14:13:35 | | Matthww119 quits [Ping timeout: 272 seconds] |
| 14:13:35 | | Matthww1192 is now known as Matthww119 |
| 17:30:31 | | nicolas17 joins |
| 21:16:01 | | project10 quits [Ping timeout: 272 seconds] |
| 21:21:52 | | project10 (project10) joins |
| 21:44:59 | | Jake quits [Client Quit] |
| 21:45:53 | | Jake (Jake) joins |
| 21:48:49 | | qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins |
| 21:51:02 | | Jake quits [Client Quit] |
| 21:52:55 | | Jake (Jake) joins |
| 22:30:42 | | BearFortress_ joins |
| 22:34:34 | | BearFortress quits [Ping timeout: 265 seconds] |
| 22:42:43 | | TheTechRobo quits [Client Quit] |
| 22:42:43 | | project10 quits [Client Quit] |
| 22:43:29 | | project10 (project10) joins |
| 22:48:02 | | TheTechRobo (TheTechRobo) joins |
| 23:00:25 | | BearFortress_ quits [Read error: Connection reset by peer] |
| 23:00:31 | | BearFortress joins |
| 23:00:41 | | AlsoHP_Archivist quits [Read error: Connection reset by peer] |
| 23:01:29 | | AlsoHP_Archivist joins |
| 23:01:43 | | atphoenix_ (atphoenix) joins |
| 23:01:49 | | sepro quits [Client Quit] |
| 23:02:01 | | fireonlive quits [Client Quit] |
| 23:02:01 | | Jake quits [Client Quit] |
| 23:02:09 | | s-crypt21 (s-crypt) joins |
| 23:02:20 | | sepro (sepro) joins |
| 23:02:20 | | Justin[home] joins |
| 23:02:20 | | Justin[home] is now authenticated as DopefishJustin |
| 23:02:22 | | Flashfire424 joins |
| 23:02:25 | | G4te_Keep3r3492 quits [Quit: Ping timeout (120 seconds)] |
| 23:02:31 | | Ryz quits [Client Quit] |
| 23:02:32 | | geezabiscuit quits [Read error: Connection reset by peer] |
| 23:02:39 | | nulldata3 (nulldata) joins |
| 23:02:47 | | G4te_Keep3r3492 joins |
| 23:02:49 | | dxrt_ quits [Client Quit] |
| 23:03:00 | | Lord_Nightmare quits [Client Quit] |
| 23:03:00 | | geezabiscuit joins |
| 23:03:00 | | geezabiscuit is now authenticated as geezabiscuit |
| 23:03:00 | | geezabiscuit quits [Changing host] |
| 23:03:00 | | geezabiscuit (geezabiscuit) joins |
| 23:03:07 | | dxrt joins |
| 23:03:09 | | dxrt is now authenticated as dxrt |
| 23:03:09 | | dxrt quits [Changing host] |
| 23:03:09 | | dxrt (dxrt) joins |
| 23:03:10 | | Ryz (Ryz) joins |
| 23:03:15 | | nulldata quits [Read error: Connection reset by peer] |
| 23:03:16 | | nulldata3 is now known as nulldata |
| 23:03:20 | | Lord_Nightmare (Lord_Nightmare) joins |
| 23:03:27 | | Jake (Jake) joins |
| 23:03:38 | | fireonlive (fireonlive) joins |
| 23:03:54 | | kiska6 (kiska) joins |
| 23:04:19 | | s-crypt2 quits [Ping timeout: 272 seconds] |
| 23:04:19 | | DopefishJustin quits [Ping timeout: 272 seconds] |
| 23:04:19 | | s-crypt21 is now known as s-crypt2 |
| 23:04:57 | | Flashfire42 quits [Ping timeout: 272 seconds] |
| 23:04:57 | | kiska quits [Ping timeout: 272 seconds] |
| 23:04:57 | | atphoenix__ quits [Ping timeout: 272 seconds] |
| 23:04:57 | | kiska6 is now known as kiska |
| 23:04:58 | | Flashfire424 is now known as Flashfire42 |
| 23:16:39 | | DLoader_ joins |
| 23:16:39 | | DLoader_ quits [Excess Flood] |
| 23:17:27 | | DLoader_ joins |
| 23:18:15 | | rewby quits [Ping timeout: 272 seconds] |
| 23:18:40 | | rewby (rewby) joins |
| 23:20:09 | | DLoader quits [Ping timeout: 272 seconds] |
| 23:20:09 | | DLoader_ is now known as DLoader |
| 23:40:36 | | Flashfire42 is now authenticated as flashfire42 |