Just a script or two to download public Navy instructions (OPNAVINST/SECNAVINST).

mpyne-navy/spider-navy-instructions

Navy Instruction Downloader

Overview

This uses GNU wget to spider the U.S. Navy's public instructions website ("Department of Navy Issuances") for PDF files and downloads them (and no other files) into a dedicated directory.
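As a rough illustration of the download step, here is a minimal sketch that feeds a pre-compiled URL list to wget. It is not the repository's actual script: the function name `download_instructions`, its arguments, and the default directory name `instructions` are all made up for this example, and it fetches from a list rather than crawling the site.

```shell
# Hypothetical helper (not from the repository): download every URL listed
# in a file into one dedicated directory.
#   $1  file containing one URL per line
#   $2  destination directory (defaults to "instructions", an assumed name)
download_instructions() {
    url_file=$1
    dest_dir=${2:-instructions}

    # Create the destination directory if it does not exist yet.
    mkdir -p "$dest_dir" || return 1

    # --input-file        read the URLs to fetch from url_file
    # --directory-prefix  save all downloads under dest_dir
    # --no-clobber        skip files that have already been downloaded
    # --wait=1            pause one second between requests
    wget --input-file="$url_file" \
         --directory-prefix="$dest_dir" \
         --no-clobber \
         --wait=1
}
```

It could be called as `download_instructions urls.txt pdfs`, where both names are placeholders. Because the list already contains only `.pdf` links, no extra file-type filtering is done here.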

Compiling the instruction list

The Navy's instructions website relies on JavaScript, which wget can't run. The quickest way I found was to paginate manually in Chrome DevTools and scrape out the list of URLs by running this in the console:

x = $$("tr a.ms-listlink[href$='.pdf']").map(a => a.href).join("\n");

Once it has run, Chrome will ask whether you want to copy the result to the clipboard, which is what I did (to then paste into the file). At the time I did this, getting the ~1100 instructions in their table took 3 separate page loads, 500 at a time.

The [href$='.pdf'] part ensures that only the non-canceled instructions are selected (canceled instructions still have a hyperlink, but it normally seems to go to a .docx file explaining that the instruction is unavailable).

Of course, there are also non-canceled instructions that still aren't public. These still link to a PDF, but it's a PDF saying that you can't have this instruction. Luckily the Navy has adopted a sane naming convention for instruction filenames, so filtering these out is relatively easy as well.

There's probably an easy way to do this using sed(1) but I just ended up using vim to find and delete lines matching this regex: \/[CSF][0-9][^/]\+$. That is, searching within the last path component of the URL after all other / characters, find filenames starting with C, S or F and then a digit. These are the instructions that are Confidential (C), Secret (S) or For Official Use Only (F) respectively.
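For anyone who would rather not do this step in vim, the same filter can be sketched with grep. This is only an illustration: the function name `filter_public_instructions` and the file names in the usage line are made up. The pattern is the vim regex from above rewritten as an extended regular expression, where vim's `\+` becomes `+`.

```shell
# Hypothetical helper (not from the repository): read URLs on stdin and
# write only the ones that look public to stdout.
filter_public_instructions() {
    # -E  use extended regular expressions, so + means "one or more"
    # -v  invert the match: drop every line whose last path component
    #     starts with C, S or F followed by a digit
    grep -E -v '/[CSF][0-9][^/]+$'
}
```

Usage would look something like `filter_public_instructions < all-urls.txt > public-urls.txt`. The `[^/]+$` part is what pins the match to the last path component, so a directory earlier in the URL that happens to start with one of those letters and a digit is not treated as a match.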
