The goal of the SEC13Flist package is to provide functions for working with the official list of Section 13(f) securities. The functions SEC_13F_list and SEC_13F_list_local parse the PDF list from SEC.gov for the supplied year and quarter and return a data frame with the list of securities, maintaining the same structure as the official list. Both functions append YEAR and QUARTER columns to the list. The returned data frame can be customized and filtered according to your needs.
SEC_13F_list reaches out to the SEC.gov website and requires tweaks if the landing page changes. In case of a breaking change on the landing page, you can use SEC_13F_list_local to parse a file downloaded to a local folder.
SEC_13F_list requires a user agent to be set up before attempting a download from the sec.gov website. For details on how to set up the user agent and on the maximum request rate, please refer to https://www.sec.gov/os/accessing-edgar-data. The user agent can be set via options(HTTPUserAgent=...).
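As a minimal sketch of the setup described above, the user agent can be set once per R session before calling SEC_13F_list. The string below is a placeholder, not a value taken from this package; replace it with your own organization name and contact email, following the SEC guidance linked above.

```r
# Placeholder value (an assumption for illustration): use your own
# organization name and contact email here.
options(HTTPUserAgent = "Sample Company Name admin@example.com")

# Confirm what R will send with HTTP requests in this session
getOption("HTTPUserAgent")
```

The option only lasts for the current session, so it could also be placed in a startup file such as .Rprofile if that suits your workflow.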
The functions isCusip, isSedol, and isIsin verify the check digit of a security identifier, computing it from the leading characters of the identifier (all characters except the final check digit). Each function returns TRUE for a correct identifier and FALSE for an incorrect one.
Pseudocode for the CUSIP, SEDOL, and ISIN checksum calculations is available at Wikipedia - CUSIP, Wikipedia - SEDOL, and Wikipedia - ISIN; R/C/C++ implementations are at Rosettacode - CUSIP, Rosettacode - SEDOL, and Rosettacode - ISIN.
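To illustrate the kind of check involved, here is a small base R sketch of the publicly documented CUSIP check digit calculation (the "modulus 10 double-add-double" scheme). It is not the package's isCusip implementation; the function names cusip_check_digit and is_valid_cusip_sketch are hypothetical and used only for this example.

```r
# Hypothetical helper: compute the check digit from the first 8 characters.
cusip_check_digit <- function(cusip8) {
  stopifnot(is.character(cusip8), length(cusip8) == 1, nchar(cusip8) == 8)
  chars <- strsplit(toupper(cusip8), "")[[1]]
  # Character values: 0-9 -> 0..9, A-Z -> 10..35, "*" -> 36, "@" -> 37, "#" -> 38
  alphabet <- c(0:9, LETTERS, "*", "@", "#")
  values <- match(chars, alphabet) - 1
  if (anyNA(values)) return(NA_integer_)  # character outside the CUSIP alphabet
  # Double the value at every even position (2nd, 4th, 6th, 8th)
  even <- seq_along(values) %% 2 == 0
  values[even] <- values[even] * 2
  # Add the tens and units digits of each value, then take the tens complement
  total <- sum(values %/% 10 + values %% 10)
  as.integer((10 - total %% 10) %% 10)
}

# Hypothetical helper: TRUE if the 9th character matches the computed digit.
is_valid_cusip_sketch <- function(cusip) {
  if (!is.character(cusip) || length(cusip) != 1 || is.na(cusip) || nchar(cusip) != 9) {
    return(FALSE)
  }
  expected <- cusip_check_digit(substr(cusip, 1, 8))
  !is.na(expected) && substr(cusip, 9, 9) == as.character(expected)
}

is_valid_cusip_sketch("037833100")  # TRUE
is_valid_cusip_sketch("037833101")  # FALSE (wrong check digit)
```

Like isCusip, this sketch takes a single nine-character string rather than a vector.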
You can install the current development version from GitHub with:
remotes::install_github("yanlesin/SEC13Flist")
CUSIP: chr - CUSIP number of the security
HAS_LISTED_OPTION: chr - An asterisk indicates that the security has a listed option; each option is individually listed with its own CUSIP number immediately below the name of the security having the option
ISSUER_NAME: chr - Issuer Name
ISSUER_DESCRIPTION: chr - Issuer Description
STATUS: chr - “ADDED” (the security has become a Section 13(f) security) or “DELETED” (the security has ceased to be a 13(f) security since the date of the last list)
YEAR: int - Year of the list
QUARTER: int - Quarter of the list
These are basic examples of usage:
library(SEC13Flist)
library(tidyverse)

## Return list for Q3 2018
SEC13Flist_2018_Q3 <- SEC_13F_list(2018, 3)

## Customizing
SEC13Flist_current <- SEC_13F_list(2023, 3) |>
  filter(STATUS != "DELETED") |> # Filter out records with STATUS "DELETED"
  select(-YEAR, -QUARTER) # Remove YEAR and QUARTER columns

## Verifying CUSIP
## isCusip is not vectorized and requires a single nine-character CUSIP as input,
## so the data frame is processed row by row (CUSIPs are not unique)
verify_CUSIP <- SEC_13F_list(2023, 3) |>
  rowwise() |>
  mutate(VALID_CUSIP = isCusip(CUSIP)) # Validate each CUSIP
According to FAQ section of CUSIP Global Services:
Can firms take CGS Data from public sources and create their own database without signing a license agreement with CGS?
CGS Data is publicly available in some offering documents and from other sources. Firms can elect to collect this information and store it in their internal databases for non-commercial use, provided that the source of such information permitted the reproduction and use of such information. However, CGS’s experience has been that the CGS data generally has not come from publicly available sources but rather from other sources such as a CGS Authorized Distributor or through improperly scraping websites of CGS customers with valid CGS’ licenses. Most end-user customers of CGS Data prefer to enter into a license agreement with CGS for authorized use and to enjoy the benefits of the integrity and functionality of downloadable, timely and accurate data (either from CGS directly or from an Authorized Distributor).
This discussion at stackexchange describes a problem with CUSIP codes for CALL and PUT options that is still present in the current list.
This FundApps support article describes how FundApps (a software provider for regulatory compliance) addresses the quality issue with CUSIP codes by including all option securities that share the first six characters of their CUSIP code with the main issue (marked with * in the HAS_LISTED_OPTION field of the list).
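The six-character matching idea can be sketched in base R on a toy data frame. The CUSIP values below are made-up placeholders, not real identifiers. How rows without a listed option are encoded in HAS_LISTED_OPTION (empty string versus NA) is an assumption here, so the comparison uses %in% to tolerate both.

```r
# Toy data with made-up placeholder CUSIPs (not real securities)
toy_list <- data.frame(
  CUSIP = c("AAAAAA100", "AAAAAA900", "AAAAAA950", "BBBBBB100"),
  HAS_LISTED_OPTION = c("*", "", "", ""),
  ISSUER_DESCRIPTION = c("COM", "CALL", "PUT", "COM"),
  stringsAsFactors = FALSE
)

# The first six characters of a CUSIP identify the issuer
toy_list$ISSUER_PREFIX <- substr(toy_list$CUSIP, 1, 6)

# Prefixes of main issues flagged as having listed options
flagged_prefixes <- unique(
  toy_list$ISSUER_PREFIX[toy_list$HAS_LISTED_OPTION %in% "*"]
)

# Keep every row (main issue and options) that shares a flagged prefix
related <- toy_list[toy_list$ISSUER_PREFIX %in% flagged_prefixes, ]
related
```

On a real list the same steps would apply to the data frame returned by SEC_13F_list, using the CUSIP and HAS_LISTED_OPTION columns described above.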