-
Notifications
You must be signed in to change notification settings - Fork 0
/
code-cleaning.qmd
581 lines (377 loc) · 23.5 KB
/
code-cleaning.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
# Gathering and Cleaning Data! {#sec-gathering-and-cleaning-data}
```{r setup, warning = FALSE, message=FALSE}
#| include: false
library(tidyverse)
library(readxl)
knitr::opts_chunk$set(warning = FALSE, message = FALSE, eval=FALSE)
```
Information for the Fiscal Futures all funds database comes from detailed information provided by the Illinois Comptroller's office. Currently the database includes 300,000 individual records for fiscal years 1998 to 2022.
- Do the FOIA request. In a week or so, they send the expenditure and revenue data as excel files.
- Checks whether there are any new agencies, re-used funds etc. Create a list of funds, agencies, fund names, etc. for the new year and compare it to the immediate prior year to identify new funds.
- Update the funds_ab_in file which shows the use of funds. Use criteria to determine if the new funds should be in or out of the all-funds frame.
- Change the variable names to be consistent with other files such as AGENCYNAME --\> agency_name
- Once variable names are shared over all years of data, combine past years with newest year. All revenue files are in a \`revenue\` folder that I reference when I set the working directory. When adding new fiscal years, put the the newest year of data for revenue and expenditures in their respective folders.
Normally, when your receive the new fiscal year files from the Comptrollers office, you will need to change the variable names so that they are consistent with past years. This is an example of reading in the new file and changing the variable names. They seem to change almost every year in the file received from the FOIA so if the code breaks here, check to make sure that the columns you are trying to rename exist and are spelled correctly! Once variables are the same, you will want to save the file as a csv file in its Revenue/Expenditure file and bind all past years and the current year together in one dataframe.
```{r}
#| code-fold: false
#| eval: false
#Example code below: Read in excel file and rename columns so that it plays well with the other years' files.
revenue_fy23 <- read_xlsx("Fis_Fut_Rev_2023_Final.xlsx") %>%
rename(fy = 'FY',
fund = 'FUND',
fund_name = 'FUND NAME',
agency = 'AGENCY',
agency_name = 'AGENCY NAME',
source = 'REVENUE SOURCE',
source_name = 'REV SRC NAME',
receipts = 'AMOUNT'
)
exp_fy23 <- read_xlsx("Fis_Fut_Exp_2023_Final.xlsx") %>%
rename(fy = 'FY',
fund = 'FUND',
fund_name = 'FUND NAME',
agency = 'AGENCY',
agency_name = 'AGENCY NAME',
appr_org = 'ORGANIZATION',
org_name = 'ORGANIZATION NAME',
obj_seq_type = 'APPROPRIATION',
wh_approp_name = 'APPROPRIATION NAME',
# exp_net_xfer = 'NET OF TRANS AMOUNT',
expenditure = 'EXPENDED'
)
# %>%
# # these come from ioc_source file after merging
# mutate(data_source = "exp IOC Aug 2022",
# object = ,
# seq = ,
# type = ,
# fund_cat = FIND_COLUMN, #create fund_cat column
# fund_cat_name = FIND_NAME) # create fund_cat_name column
```
The code chunk below takes the .dta files for all fiscal years before FY 2022 and binds them together. Variable names were manually changed by past researchers so that they were consistent across years.
For FY 2023 and after, .dta files can be avoided entirely and .csv or excel files will be used. All files before this year had been saved and passed on as .dta files for Stata code before the transition to R in Fall 2022.
<!--- Dug deep in the box files and found old Excel files for each year of reqeuested data. Tried to make a new Database folder that has the excel files before manipulating but didn't finish. --->
For years after fy22: add line of code to bind csv files after binding the dta files together. Variable names must be identical to merge files together.
<!--- Code chunk below not updated. Is in other .rmd files for fy2023--->
```{r create-rev-csv-FY23, include=FALSE, eval=FALSE}
# years after fy22: add line of code to bind csv files after binding the dta files.
setwd("./data/ioc_data_received/revenue")
allrevfiles23 = list.files(path = "./data/ioc_data_received/revenue", pattern = ".dta") %>% lapply(read_dta) %>% bind_rows
#Fy21: 62294 observations, 13 variables
#FY22: 65094 obs, 13 vars
#FY23: 57822 obs
write_csv(allrevfiles23, "./data/allrevfiles23.csv")
setwd("./data/ioc_data_received/expenditures")
allexpfiles23 = list.files(path = "./data/ioc_data_received/expenditures", pattern = ".dta") %>% lapply(read_dta) %>% bind_rows
#fy21 213372 observations, 20 variables
# fy22 225587 obs, 21 vars.
# fy23 238089 obs
write_csv(allexpfiles23, ".data/allexpfiles23.csv")
```
Code below reads in the csv files created in chunks above (allrevfiles.csv and allrexpfiles.csv). These files contain all years of data combined into one file BEFORE any recoding is done. Do not use this file for summing categories because it is just an inbetween step before recoding revenue and expenditure categories. It also does not contain changes in funds or agencies.
```{r readCSVs, warning=FALSE, message=FALSE}
# combined in past chunks called create-rev-csv and create-exp-csv
allrevfiles23 <- read_csv("./data/allrevfiles_Nov132023.csv") #combined but not recoded
allexpfiles23 <- read_csv("./data/allexpfiles_Nov132023.csv") #combined but not recoded
```
### Inspecting new FY23 data files
Data files for closed years have been obtained from IOC. The numbers of funds, agencies, organizations, and revenue sources below were found by using pivot tables in the codebook files in the FY2022 Box folder. This could also be done using R and grouping new files by fund, agency, source number, source names, etc..
Revenue File:
- 684 Fund Numbers
- 80 Agencies
- 1184 Revenue source numbers
- 1156 revenue source names
Expenditure File:
- 708 Fund Numbers
- 107 Agencies
- 98 Organization Numbers
- 313 Organization names
### Finding new agencies and funds
General steps:
1. Identify new and reused funds for newest fiscal year.
2. Recode funds to take into account different fund numbers/names over the years. See \[Recoding New and Reused Funds\] for code chunk that does this.
3. Update `fund_ab_in_CURRENTFY.xlsx` with any changes from previous fiscal year.
**New Agencies, Funds, and Organizations from Expenditure files:**
- Using the code below, I found 2 agencies, multiple funds, and a couple organizations that had not been specifically mentioned in the funds_ab_in file - AWM, FY2022.
- Note: One new org name and org number combo has an expenditure of \$600 million for pensions payment.
```{r eval=FALSE}
#allrevfiles23 <- read_csv("allrevfiles23.csv") #combined but not recoded
#allexpfiles23 <- read_csv("allexpfiles23.csv") #combined but not recoded
cur_fy <- 2023 # Create variable for current fiscal year to save time for updating in future years.
#### From Expenditure Data #####
# agencies referenced in any year before 2023:
agencies_past <- allexpfiles23 %>%
filter(fy < cur_fy) %>%
mutate(agency == as.character(agency)) %>%
group_by(agency, agency_name) %>% unique() %>%
summarize(expenditure = sum(expenditure, na.rm = TRUE)) %>%
drop_na() %>%
arrange(agency)
# agencies_past # 148 agencies ever
# agencies in 2023 data:
agencies23 <- allexpfiles23 %>%
filter(fy == cur_fy) %>%
mutate(agency == as.character(agency)) %>%
group_by(agency, agency_name) %>%
summarize(expenditure = sum(expenditure, na.rm = TRUE))
# 280 and 533 are new agency codes from old code cleaning methodology.
# When creating fiscal future categories for the fiscal gap, 280 is assigned to Group = 920 ( Judicial) expenditure category
# 533 was assigned to "949. Other Boards and Commissions"
# no new agencies from fy22 to fy23:
anti_join(agencies23, agencies_past,
by = c("agency", "agency_name")) %>%
arrange(agency)
funds_past <- allexpfiles23 %>%
filter(fy < cur_fy) %>%
mutate(fund == as.character(fund)) %>%
group_by(fund, fund_name) %>%
summarize(count = n(), Expenditure = sum(expenditure, na.rm = TRUE)) %>%
drop_na()
funds23 <- allexpfiles23 %>%
filter(fy == cur_fy) %>%
mutate(fund == as.character(fund)) %>%
group_by(fund, fund_name) %>%
summarize(count = n(), Expenditure = sum(expenditure, na.rm = TRUE)) %>%
unique()
# 26 funds were in FY23 data that were not in past data:
anti_join(funds23, funds_past,
by = c("fund", "fund_name")) %>%
arrange(fund)
# Each year these must examined closely to determine if they are a new fund, a reused fund, or if there is just a slight difference in the name of the fund
# orgs_past for orgs in the past = 920 org groups ever
orgs_past <- allexpfiles23 %>%
filter(fy < cur_fy) %>%
mutate(appr_org == as.character(appr_org)) %>%
group_by(appr_org, org_name) %>% unique() %>%
summarize(Expenditure = sum(expenditure, na.rm = TRUE)) %>%
drop_na()
# orgs_past # 916 org groups ever
orgs23 <- allexpfiles23 %>%
filter(fy == cur_fy) %>%
mutate(appr_org = as.character(appr_org)) %>%
group_by(appr_org, org_name) %>%
summarize(Expenditure = sum(expenditure, na.rm = TRUE))
# orgs22 # 399 org groups this year
# 3 org number and org name combos are new for FY2023:
# Usually not a big deal, orgs are not used much in the gap calculation.
# but good to stay aware and look for any change that may be important.
anti_join(orgs23, orgs_past,
by = c("appr_org", "org_name")) %>%
arrange(appr_org)
```
> Largest change for FY23 was the \$184 million Grocery Tax Replacement fund and \$20 million for fund 0156.
**New Revenue Funds, Sources, and New Agencies:**
```{r eval=FALSE}
#### From Revenue Data ####
# agencies_past # 108 agencies ever
agencies_past <- allrevfiles23 %>% filter(fy < cur_fy) %>%
mutate(agency == as.character(agency)) %>%
group_by(agency, agency_name) %>% unique() %>%
summarize(Receipts = sum(receipts, na.rm = TRUE)) %>%
drop_na()
agencies23 <- allrevfiles23 %>%
filter(fy == cur_fy) %>%
mutate(agency == as.character(agency)) %>%
group_by(agency, agency_name) %>%
summarize(Receipts = sum(receipts, na.rm = TRUE))
# Finds 533 and 280 again.
anti_join(agencies23, agencies_past, by = c("agency", "agency_name")) %>%
arrange(agency)
funds_past <- allrevfiles23 %>%
filter(fy < cur_fy) %>%
mutate(fund == as.character(fund)) %>%
group_by(fund, fund_name) %>%
summarize(count = n(), Receipts = sum(receipts, na.rm = TRUE)) %>%
drop_na()
funds23 <- allrevfiles23 %>%
filter(fy == cur_fy) %>%
mutate(fund == as.character(fund)) %>%
group_by(fund, fund_name) %>%
summarize(count = n(), Receipts = sum(receipts, na.rm = TRUE)) %>%
unique() %>%
drop_na()
# 13 revenue funds were in FY23 revenue data that were not in past data
# some could be small fund name changes
anti_join(funds23, funds_past, by = c("fund", "fund_name")) %>%
arrange(fund)
sources_past <- allrevfiles23 %>%
filter(fy < cur_fy) %>%
mutate(source == as.character(source)) %>%
group_by(source, source_name) %>%
summarize(count = n(), Receipts = sum(receipts, na.rm = TRUE)) %>%
drop_na()
sources23 <- allrevfiles23 %>%
filter(fy == cur_fy) %>%
mutate(source == as.character(source)) %>%
group_by(source, source_name) %>%
summarize(count = n(), Receipts = sum(receipts, na.rm = TRUE)) %>%
unique()
# 16 revenue sources were in FY23 data that were not in past data
# some could be small source name changes:
anti_join(sources23, sources_past, by = c("source", "source_name")) %>%
arrange(source)
```
### Recoding New and Reused Funds
::: {.callout-warning icon="false"}
Remember: `allrevfiles` and `allexpfiles` contain the data received from the comptroller without recoding agencies or funds that have changed between fiscal years. Variables were renamed when needed to have consistent names but funds and agencies have NOT been recoded for consistency in these files.
:::
New or reused funds were manually added to the `funds_ab_in.xlsx` file and determined if they should or should not be included in Fiscal Future calculations.
For funds that were reused once, a 9 replaces the 0 as the first digit. If reused twice, then the first two values are 10.
- Ex. 0350 --\> 9350 because its use changed.
- Ex. 0367 becomes 10367 because its use has changed twice now. There was fund 0367 originally, then its use changed and it was recoded as 9367, and now it changed again so it is a 10367.
- Excel file also has alternative ways to name funds (e.g. 0397-A and 0397-B) and variables for the year that the fund stopped being used.
New or reused funds revenue file recoding:
```{r recode-rev-funds}
#| code-fold: false
# if first character is a 0, replace with a 9 if its purpose has changed
rev_1998_2023 <- allrevfiles23 %>%
mutate(fund = ifelse(fy < 2002 & fund %in% c("0730", "0241", "0350", "0367", "0381", "0382", "0526", "0603", "0734", "0913", "0379"), str_replace(fund, "0","9"), fund)) %>%
mutate(fund = ifelse (fy < 2008 & fund %in% c("0027", "0033", "0037", "0058", "0062", "0066", "0075", "0083", "0116", "0119", "0120", "0122", "0148", "0149", "0157", "0158", "0166", "0194", "0201", "0209", "0211", "0217", "0223", "0231", "0234", "0253", "0320", "0503", "0505", "0512", "0516", "0531", "0532", "0533", "0547", "0563", "0579", "0591", "0606", "0616", "0624", "0659", "0662", "0665", "0676", "0710",
"0068", "0076", "0115", "0119", "0168", "0182", "0199", "0241", "0307", "0506", "0509", "0513"), str_replace(fund, "0","9"), fund)) %>%
mutate(fund = ifelse(fy < 2016 & fund %in% c("0263", "0399", "0409"), str_replace(fund, "0","9"), fund)) %>%
mutate(fund = ifelse(fy < 2017 & fund == "0364", str_replace(fund, "0","9"), fund)) %>%
mutate(fund = ifelse(fy < 2018 & fund %in% c("0818", "0767", "0671", "0593", "0578"), str_replace(fund, "0","9"), fund)) %>%
mutate(fund = ifelse(fy>1999 & fy < 2018 & fund == "0231", "10231", fund) ) %>%
mutate(fund = ifelse(fy < 2019 & fund %in% c("0161", "0489", "0500", "0612", "0893", "0766"), str_replace(fund, "0","9"), fund)) %>%
mutate(fund = ifelse(fy < 2020 & fund %in% c("0254", "0304", "0324", "0610", "0887", "0908", "0939", "0968"), str_replace(fund, "0","9"), fund)) %>%
mutate(fund = ifelse(fy < 2021 & fund %in% c("0255", "0325", "0348", "0967", "0972"), str_replace(fund, "0","9"), fund) ) %>%
#2022 changes
mutate(fund = ifelse(fy < 2022 & fund %in% c("0110","0165","0351", "0392", "0393", "0422", "0544", "0628", "0634", "0656", "0672", "0683", "0723", "0742", "0743"), str_replace(fund, "0","9"), as.character(fund))) %>% # replaces first 0 it finds with a 9
mutate(fund = ifelse(fy < 2022 & fund == "0367", "10367", as.character(fund)) # fund reused for 3rd time
) %>%
# 2023 fund changes
mutate(fund = ifelse(fy < 2023 & fund %in% c("0099","0210","0246", "0250", "0264", "0268", "0300", "0311", "0403", "0448","0645", "0727", "0729", "0791"), str_replace(fund,"0", "9"), as.character(fund))) %>%
mutate(fund = ifelse(fy < 2023 & fund == "0734" , "10734", as.character(fund)),
fund = ifelse(fy<2023 & fund == "0820", "10820", as.character(fund) ) # fund reused for 3rd time
) %>%
# other fund changes noticed on April 1, 2024
mutate(fund = ifelse(fy < 2023 & fund %in% c("0490", "0684", "0747", "0869"), str_replace(fund,"0", "9"), as.character(fund)))
```
Expenditure recoding:
```{r recode-exp-funds}
# if first character is a 0, replace with a 9
exp_1998_2023 <- allexpfiles23 %>%
mutate(fund = ifelse(fy < 2002 & fund %in% c("0730", "0241", "0350", "0367", "0381", "0382", "0526", "0603", "0734", "0913", "0379"), str_replace(fund, "0","9"), fund)) %>%
mutate(fund = ifelse(fy < 2008 & fund %in% c("0027", "0033", "0037", "0058", "0062", "0066", "0075", "0083", "0116", "0119", "0120", "0122", "0148", "0149", "0157", "0158", "0166", "0194", "0201", "0209", "0211", "0217", "0223", "0231", "0234", "0253", "0320", "0503", "0505", "0512", "0516", "0531", "0532", "0533", "0547", "0563", "0579", "0591", "0606", "0616", "0624", "0659", "0662", "0665", "0676", "0710",
"0068", "0076", "0115", "0119", "0168", "0182", "0199", "0241", "0307", "0506", "0509", "0513"), str_replace(fund, "0","9"), fund)) %>%
mutate(fund = ifelse(fy < 2016 & fund %in% c("0263", "0399", "0409"), str_replace(fund, "0","9"), fund)) %>%
mutate(fund = ifelse(fy < 2017 & fund == "0364", str_replace(fund, "0","9"), fund)) %>%
mutate(fund = ifelse(fy < 2018 & fund %in% c("0818", "0767", "0671", "0593", "0578"), str_replace(fund, "0","9"), fund)) %>%
mutate(fund = ifelse(fy>1999 & fy < 2018 & fund == "0231", "10231", fund) ) %>%
mutate(fund = ifelse(fy < 2019 & fund %in% c("0161", "0489", "0500", "0612", "0893", "0766"), str_replace(fund, "0","9"), fund)) %>%
mutate(fund = ifelse(fy < 2020 & fund %in% c("0254", "0304", "0324", "0610", "0887", "0908", "0939", "0968"), str_replace(fund, "0","9"), fund)) %>%
mutate(fund = ifelse(fy < 2021 & fund %in% c("0255", "0325", "0348", "0967", "0972"), str_replace(fund, "0","9"), fund)) %>%
#2022 changes
mutate(fund = ifelse(fy < 2022 & fund %in% c("0110","0165","0351", "0392", "0393", "0422", "0544", "0628", "0634", "0656", "0672", "0683","0723", "0742", "0743"), str_replace(fund, "0","9"), as.character(fund))) %>% # replaces first 0 it finds with a 9
mutate(fund = ifelse(fy < 2022 & fund == "0367", "10367", as.character(fund)) # fund reused for 3rd time
) %>%
# 2023 fund changes
mutate(fund = ifelse(fy < 2023 & fund %in% c("0099","0210","0246", "0250", "0264", "0268", "0300", "0311", "0403", "0448","0645", "0727", "0729", "0791"), str_replace(fund,"0", "9"), as.character(fund))) %>%
mutate(fund = ifelse(fy < 2023 & fund == "0734" , "10734", as.character(fund)),
fund = ifelse(fy<2023 & fund == "0820", "10820", as.character(fund) ) # fund reused for 3rd time
) %>%
# other fund changes noticed on April 1, 2024
mutate(fund = ifelse(fy < 2023 & fund %in% c("0490", "0684", "0747", "0869"), str_replace(fund,"0", "9"), as.character(fund)))
```
::: {.callout-tip icon="false"}
The `funds_ab_in.xlsx` file contains the following information: all funds that have existed since 1998, if they still exist, indicates if fund numbers have been reused for varying purposes, and is updated yearly with new fund numbers used by the IOC.
:::
```{r create-exp_temp}
agencies_exp <- exp_1998_2023 %>%
distinct(agency, agency_name) %>%
filter(!is.na(agency_name)) %>%
mutate(agency = as.character(agency))
funds_exp <- exp_1998_2023 %>%
distinct(fund, fund_name) %>%
filter(!is.na(fund_name))
funds_ab_in_2023 = readxl::read_excel("data/funds_ab_in_2023.xlsx")
exp_temp <- exp_1998_2023 %>%
select(-fund_name) %>%
arrange(fund, fy) %>%
# join funds_ab_in_2023 to exp_temp
left_join(funds_ab_in_2023, by = "fund")
# Agency == 799 for Statutory transfers
# Object == 1993 is for Interfund cash transfers
exp_temp <- exp_temp %>%
mutate(transfer = ifelse(org_name == "TRANSFERS", 1, 0),
trans_agency = ifelse(org_name == "TRANSFERS",
str_sub(obj_seq_type,1,3), NA),
trans_type = ifelse(org_name == "TRANSFERS",
str_sub(obj_seq_type, 4,9), NA)) %>%
mutate(
object = ifelse(transfer == 0, as.character(str_sub(obj_seq_type, 1, 4)), NA_character_), # appropriation object
sequence = ifelse(transfer == 0, as.character(str_sub(obj_seq_type, 5,6)), NA_character_),
type = ifelse(transfer == 0, str_sub(obj_seq_type, 7,8), NA_character_) # appropriation type
)
```
- the initial combined and years of data are saved as dataframes named `exp_1998_2023` and `rev_1998_2023`. These are then saved as exp_temp and rev_temp while recoding variables. This is BEFORE creating Fiscal Future category groups and further data cleaning done in the Calculating the Fiscal Gap page.
```{r remove-all_obs_df, include=FALSE}
#include = FALSE in the chunk settings run the code but do not include the chunk in the html output
# remove from computer memory to free up space (in case your computer needs it)
rm(allexpfiles23)
rm(allrevfiles23)
```
Update Agencies: Some agencies have merged with others or changed names over time.
**\[\[ TO DO: Add appendix item of Current agencies (with indented past agencies) from the Methodology Document. \]\]**
```{r agencies-exp}
# recodes old agency numbers to consistent agency number
exp_temp <- exp_temp %>%
mutate(agency = case_when(
(agency=="438"| agency=="475" |agency == "505") ~ "440",
# financial institution & professional regulation &
# banks and real estate --> coded as financial and professional reg
agency == "473" ~ "588", # nuclear safety moved into IEMA
(agency =="531" | agency =="577") ~ "532", # coded as EPA
(agency =="556" | agency == "538") ~ "406", # coded as agriculture
agency == "560" ~ "592", # IL finance authority (fire trucks and agriculture stuff)to state fire marshal
agency == "570" & fund == "0011" ~ "494", # city of Chicago road fund to transportation
TRUE ~ (as.character(agency))))
#
# exp_temp <- exp_temp %>% filter(!is.na(fy))
# ## Fill in missing names for funds
# exp_temp <- exp_temp %>%
# select(-c(agency_name)) %>%
# left_join(funds_exp) %>%
# left_join(agencies_exp)
```
For aggregating revenue, use the `rev_1998_2023` file, join the `funds_ab_in_2023` file to it, and then join the `ioc_source_type` file. Remember: You need to update the `funds_ab_in` and `ioc_source_type` file every year!
```{r}
#| code-fold: false
rev_temp <- inner_join(rev_1998_2023, funds_ab_in_2023, by = "fund") %>%
arrange(source)
funds_rev <- rev_temp %>% distinct(fund, fund_name_ab) %>% filter(!is.na(fund_name_ab))
agencies_rev <- rev_temp %>% distinct(agency, agency_name) %>% filter(!is.na(agency_name)) %>%
mutate(agency = as.character(agency))
# need to update the ioc_source_type file every year!
ioc_source_type <- readxl::read_xlsx("./data/ioc_source_updated23_AWM.xlsx")
rev_temp <- left_join(rev_temp, ioc_source_type, by = "source")
sources_rev <- rev_temp %>% distinct(source, source_name_AWM) %>% filter(!is.na(source_name_AWM))
# fund info to revenue for all years
rev_temp <- rev_temp %>%
select(-c(fund_name, agency_name,
source_name_AWM)) %>%
left_join(funds_rev) %>%
left_join(sources_rev)
# automatically used source, source name does not match for the join to work using source_name
# recodes old agency numbers to consistent agency number
rev_temp <- rev_temp %>%
mutate(agency = case_when(
(agency=="438"| agency=="475" |agency == "505") ~ "440", # financial institution & professional regulation &
# banks and real estate --> coded as financial and professional reg
agency == "473" ~ "588", # nuclear safety moved into IEMA
(agency =="531" | agency =="577") ~ "532", # coded as EPA
(agency =="556" | agency == "538") ~ "406", # coded as agriculture
agency == "560" ~ "592", # IL finance authority (fire trucks and agriculture stuff)to state fire marshal
agency == "570" & fund == "0011" ~ "494", # city of Chicago road fund to transportation
TRUE ~ (as.character(agency)))) %>%
#left_join(sources_rev) %>%
left_join(agencies_rev) # add correct agency names back in
```
### Export Intermediate Files
```{r eval=FALSE}
write_csv(rev_temp, file = "./data/rev_temp.csv")
write_csv(exp_temp, file = "./data/exp_temp.csv")
```
::: {.callout-tip icon="false"}
The `exp_temp.csv` file and `rev_temp.csv` file are most likely the data that non-Fiscal Future researchers would want to use for their own analyses. These files should have consistent agencies and funds tracked from 1998 to the present.
:::