hospitalstars

CMS issues star ratings annually for hospitals, and they make the data (Care Compare reports) and the algorithm (SAS package) publicly available. However, that can be difficult to work with because the data is spread out over dozens of archives of CSV files, and the algorithm is implemented with SAS.

The hospitalstars package has an R implementation of the most recent (April 2021, July 2022, July 2023) versions of the SAS package, and prepared input data from each quarterly Care Compare report from January 2018 through October 2023.

Installation

Before installing this package on Windows, you need to download and install RTools for your version of R. If RStudio is open, you may need to restart it.

This has only been tested on Windows and Ubuntu. The two steps to install there are:

Install the devtools package.
Install the development version of hospitalstars with the command below.

devtools::install_github("jimurick/hospitalstars")

Hospital and Measure Info

The table hospitalstars::hospital_info_df has basic information for each hospital.

library(hospitalstars)

hospital_info_df[1:5, c("PROVIDER_ID", "hospital_name")]
#>   PROVIDER_ID                   hospital_name
#> 1      010001 SOUTHEAST HEALTH MEDICAL CENTER
#> 2      010005        MARSHALL MEDICAL CENTERS
#> 3      010006    NORTH ALABAMA MEDICAL CENTER
#> 4      010007        MIZELL MEMORIAL HOSPITAL
#> 5      010008     CRENSHAW COMMUNITY HOSPITAL

The next chunk pulls 3 of Geisinger’s bigger hospitals.

suppressWarnings(library(tidyverse))

geisinger_df <-
  hospital_info_df %>%
  filter(grepl("GEISINGER.*CENTER$", hospital_info_df$hospital_name))
  
geisinger_df %>%
  select(PROVIDER_ID, hospital_name)
#>   PROVIDER_ID                           hospital_name
#> 1      390001      GEISINGER-COMMUNITY MEDICAL CENTER
#> 2      390006                GEISINGER MEDICAL CENTER
#> 3      390270 GEISINGER WYOMING VALLEY MEDICAL CENTER

There’s also a table measure_info_df with one row for each combination of a measure appearing in a Care Compare report. The column report_date gives the first day of the month that the Care Compare report was released.

This chunk shows the 7 mortality measures and their IDs. Some measure names are a little long, so they’re limited to 60 characters here.

mortality_df <-
  measure_info_df %>%
  filter(report_date == "2023-10-01", measure_group == "Mortality")

mortality_df %>%
  transmute(measure_id, measure_name = str_sub(measure_name, end = 60))
#>     measure_id                                                 measure_name
#> 1  MORT_30_AMI      Acute Myocardial Infarction (AMI) 30-Day Mortality Rate
#> 2 MORT_30_CABG 30-Day All-Cause Mortality Following Coronary Artery Bypass 
#> 3 MORT_30_COPD Chronic Obstructive Pulmonary Disease (COPD) 30-Day Mortalit
#> 4   MORT_30_HF                     Heart Failure (HF) 30-Day Mortality Rate
#> 5   MORT_30_PN                              Pneumonia 30-Day Mortality Rate
#> 6  MORT_30_STK            Acute Ischemic Stroke (STK) 30-Day Mortality Rate
#> 7       PSI_04 Death rate among surgical inpatients with serious treatable

Historical Measure Data

The function historical_measure_data() pulls historical scores for each quarter since January 2018 for the given hospitals and measures.

mortality_history_df <-
  historical_measure_data(
    measure_ids = mortality_df$measure_id,
    provider_ids = geisinger_df$PROVIDER_ID
  )

mortality_history_df %>% head()
#>    report_date PROVIDER_ID  measure_id score
#> 1:  2018-01-01      390001 MORT_30_AMI 0.136
#> 2:  2018-01-01      390006 MORT_30_AMI 0.135
#> 3:  2018-01-01      390270 MORT_30_AMI 0.158
#> 4:  2018-05-01      390001 MORT_30_AMI 0.136
#> 5:  2018-05-01      390006 MORT_30_AMI 0.135
#> 6:  2018-05-01      390270 MORT_30_AMI 0.158

The format of this data is easy to use with ggplot.

geis_names_df <-
  geisinger_df %>%
  select(PROVIDER_ID, hospital_name)

mortality_history_df %>%
  inner_join(geis_names_df, by = "PROVIDER_ID") %>%
  ggplot(aes(report_date, score,
             group = hospital_name, color = hospital_name)) +
  geom_line() +
  facet_wrap(~measure_id, scales = "free_y") +
  theme(legend.position = c(0.75, 0.15)) +
  ggtitle("Geisinger Scores for Mortality Measures")

Star Rating Algorithm

In order to calculate star ratings, you need an input table with all hospitals’ data. The star_rating_input() function returns a data.frame made of Care Compare data. Its argument report_date specifies the first day of the month the Care Compare report was released.

Then compute_star_scores() will calculate the star ratings (using the July 2023 version by default). The results are returned in a copy of the input table with extra columns added.

input_df <- star_rating_input(report_date = "2023-01-01")
output_df <- compute_star_scores(input_df)

output_df %>%
  filter(PROVIDER_ID %in% geisinger_df$PROVIDER_ID) %>%
  select(PROVIDER_ID, peer_group = n_groups, summary_score, stars)
#>   PROVIDER_ID peer_group summary_score stars
#> 1      390001          5    0.09732463     4
#> 2      390006          5    0.30614614     4
#> 3      390270          5    0.35972998     4

hospitalstars also includes a copy of the SAS Package’s output, which was computed in SAS.

sas_package$v202307$output_df %>%
  select(
    PROVIDER_ID, peer_group = Total_measure_group_cnt,
    summary_score, stars = star
  ) %>%
  filter(PROVIDER_ID %in% geisinger_df$PROVIDER_ID)
#>   PROVIDER_ID peer_group summary_score stars
#> 1      390001          5    0.09956553     4
#> 2      390006          5    0.30740967     4
#> 3      390270          5    0.36152747     4

It may be interesting to note that the results in the last two code chunks above are not the same, because the SAS Package’s input file has some differences with the Care Compare report from January 2023 (the data used for the 2023 Star Ratings). If you use the SAS Package’s input file in R here, you should get the same output as you would in SAS.

compute_star_scores(sas_package$v202307$input_df) %>%
  filter(PROVIDER_ID %in% geisinger_df$PROVIDER_ID) %>%
  select(PROVIDER_ID, peer_group = n_groups, summary_score, stars)
#>   PROVIDER_ID peer_group summary_score stars
#> 1      390001          5    0.09956553     4
#> 2      390006          5    0.30740967     4
#> 3      390270          5    0.36152747     4

To see the specific measure values where the two disagree, you could use a query like this, which counts how many disagreements there are by measure.

r_unpivoted_df <- input_df %>%
  pivot_longer(-PROVIDER_ID, names_to = "measure_id", values_to = "score_r")
sas_unpivoted_df <- sas_package$v202307$input_df %>%
  pivot_longer(-PROVIDER_ID, names_to = "measure_id", values_to = "score_sas")

r_unpivoted_df %>%
  full_join(sas_unpivoted_df, by = c("PROVIDER_ID", "measure_id")) %>%
  mutate(
    score_r = replace_na(score_r, -1000),
    score_sas = replace_na(score_sas, -1000)
  ) %>%
  filter(score_r != score_sas) %>%
  filter(!grepl("(NUMB_COMP|RATE_P|_DEN|_DEN_PRED)$", measure_id)) %>%
  count(measure_id) %>%
  arrange(desc(n)) %>%
  head()
#> # A tibble: 6 × 2
#>   measure_id       n
#>   <chr>        <int>
#> 1 OP_18B         478
#> 2 IMM_3          133
#> 3 HCP_COVID_19   131
#> 4 OP_22          111
#> 5 OP_13          110
#> 6 OP_29          101

What-if Analysis

Suppose we wanted to see how improving mortality scores would change the final star ratings.

We’ll reuse the input_df dataframe created in Star Rating Algorithm above. The current mortality scores are

input_df %>%
  inner_join(geis_names_df, by = "PROVIDER_ID") %>%
  select_at(c("hospital_name", mortality_df$measure_id))
#>                             hospital_name MORT_30_AMI MORT_30_CABG MORT_30_COPD
#> 1      GEISINGER-COMMUNITY MEDICAL CENTER       0.125        0.035        0.088
#> 2                GEISINGER MEDICAL CENTER       0.117        0.032        0.083
#> 3 GEISINGER WYOMING VALLEY MEDICAL CENTER       0.109        0.022        0.087
#>   MORT_30_HF MORT_30_PN MORT_30_STK PSI_04
#> 1      0.136      0.146       0.112 162.13
#> 2      0.101      0.130       0.113 174.50
#> 3      0.094      0.122       0.134 140.58

Create a new input table where those scores are improved by 5%.

improvement <-
  ifelse(input_df$PROVIDER_ID %in% geisinger_df$PROVIDER_ID, 0.95, 1)

new_input_df <- input_df
for (m_id in mortality_df$measure_id)
  new_input_df[[m_id]] <- new_input_df[[m_id]] * improvement

new_input_df %>%
  inner_join(geis_names_df, by = "PROVIDER_ID") %>%
  select_at(c("hospital_name", mortality_df$measure_id))
#>                             hospital_name MORT_30_AMI MORT_30_CABG MORT_30_COPD
#> 1      GEISINGER-COMMUNITY MEDICAL CENTER     0.11875      0.03325      0.08360
#> 2                GEISINGER MEDICAL CENTER     0.11115      0.03040      0.07885
#> 3 GEISINGER WYOMING VALLEY MEDICAL CENTER     0.10355      0.02090      0.08265
#>   MORT_30_HF MORT_30_PN MORT_30_STK   PSI_04
#> 1    0.12920     0.1387     0.10640 154.0235
#> 2    0.09595     0.1235     0.10735 165.7750
#> 3    0.08930     0.1159     0.12730 133.5510

Then recompute the star scores.

new_output_df <- compute_star_scores(new_input_df)

new_output_df %>%
  inner_join(geis_names_df, by = "PROVIDER_ID") %>%
  select(hospital_name, peer_group = n_groups, summary_score, stars)
#>                             hospital_name peer_group summary_score stars
#> 1      GEISINGER-COMMUNITY MEDICAL CENTER          5     0.2187725     4
#> 2                GEISINGER MEDICAL CENTER          5     0.4196913     5
#> 3 GEISINGER WYOMING VALLEY MEDICAL CENTER          5     0.4658343     5

So improving all 7 Mortality scores by 5% changed the overall star ratings from 4, 4 and 4 to 4, 5 and 5.

Name		Name	Last commit message	Last commit date
Latest commit History 15 Commits
R		R
data		data
man		man
renv		renv
src		src
tests		tests
.Rbuildignore		.Rbuildignore
.Rprofile		.Rprofile
.gitignore		.gitignore
DESCRIPTION		DESCRIPTION
LICENSE		LICENSE
LICENSE.md		LICENSE.md
NAMESPACE		NAMESPACE
README.Rmd		README.Rmd
README.md		README.md
hospitalstars.Rproj		hospitalstars.Rproj
renv.lock		renv.lock

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Licenses found

Repository files navigation

hospitalstars

Installation

Hospital and Measure Info

Historical Measure Data

Star Rating Algorithm

What-if Analysis

About

Licenses found

Releases

Packages

Contributors 2

Languages

License

Licenses found

jimurick/hospitalstars

Folders and files

Latest commit

History

Repository files navigation

hospitalstars

Installation

Hospital and Measure Info

Historical Measure Data

Star Rating Algorithm

What-if Analysis

About

Topics

Resources

License

Licenses found

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages