HospitalGeneratorRDF_V2 is an updated version of the software presented in HospitalGeneratorRDF. The changes with respect to the previous version are aimed at producing a hospital design with specific characteristics. This software has been used in the work "TODO: HACER PAPER" with DOI doi: TODO to create the dataset used for the experiments.
Like its previous version, HospitalGeneratorRDF_V2 is a program that, based on the output of H-Outbreak (a simulation model of the movements of patients inside a hospital), creates a knowledge graph (KG) in RDF and RDF* according to the data model presented in "Spatiotemporal Data Modelling for Epidemiological Research in Hospitals" (10.1109/JBHI.2024.3417224), with the minor changes presented in HospitalKG_changes.
Since H-Outbreak does not cover all the classes and relations from the data model, HospitalGeneratorRDF completes it by adding Floors, Areas, Corridors, Rooms, Beds, Services and HospitalizationUnits, and the relations between them. It also creates different subclasses of Events: Hospitalization, Radiology, Surgery and Death.
IMPORTANT: Also read the file PARAMS.md
IMPORTANT: The files in the Program\Input folder are those used to generate the dataset for the work TODO: HACER PAPER. The files in the Program\OutputRDF_star folder are the dataset used for the experiments in TODO: HACER PAPER.
Below, we present some other related repositories that may be of interest to you:
- HospitalKG_changes: It is also linked to doi: TODO.
- HospitalEdgeWeigths: It is also linked to doi: TODO.
- STeMECH: Code for doi: TODO.
- HospitalGeneratorRDF: Previous version of this software. In the folder Description of this repository, there is an exhaustive description of how the process of generating random hospitals works.
We have based several characteristics of the hospital on the main building of the Virgen de la Arrixaca University Clinical Hospital, Region of Murcia, Spain. These characteristics are:
- Number of Services for hospitalisations
- Number of HospitalizationUnits (HU) per Service
- Number of operating theatres.
- Number of rooms for radiology and other diagnostic imaging techniques (we refer to all of them as radiology).
- Number of beds for A&E
- Number of beds for Intensive Care (IC)
The hospital will have four Floors:
- Ground Floor: On this Floor there will be ICU rooms, A&E rooms, radiology rooms and operating theatres.
- Upper Floors: These 3 Floors will be used for hospitalisations.
Next, there is a brief description of the services and each kind of Floor.
The hospital will have:
- 17 Services for Hospitalisations.
- 1 Service for Radiology.
- 1 Service for A&E.
- 1 Service for IC.
- There isn't a Service for only surgeries. Each Service will be in charge of its own surgeries.
Below, we present the distribution of HospitalizationUnits per Service:
- The 17 Services for hospitalisations, the A&E Service and the IC Service will each have 1 HU for surgeries.
- A&E Service will have 1 HU for patient stays.
- IC Service will have 1 HU for patient stays.
- Radiology Service will have 6 HU.
- The Services for hospitalisations will have a certain number of HospitalizationUnits. There will be:
  - 7 Services with 1 HU. These Services are: S0, S1, S2, S10, S11, S12 and S13.
  - 3 Services with 2 HU. These Services are: S2, S6 and S8.
  - 4 Services with 3 HU. These Services are: S3, S4, S9 and S16.
  - 1 Service with 4 HU. This Service is: S7.
  - 1 Service with 5 HU. This Service is: S15.
  - 1 Service with 6 HU. This Service is: S14.
The ground Floor has a layout with 2 rows (Units) and 5 columns (Blocks). On this Floor there will be:
- Operating theatres: 27. Each operating theatre will have 1 Bed.
- Rooms for Radiology: 24. Each room will have 1 Bed. The distribution of Rooms per HU is:
  - 1 HU with 8 Rooms.
  - 2 HU with 2 Rooms.
  - 3 HU with 3 Rooms.
- A&E Rooms: 4 Rooms. Each room will have 4 Beds. In total, there will be 16 A&E Beds.
- IC Rooms: 4 Rooms. Each room will have 5 Beds. In total, there will be 20 IC Beds.
The following figure shows a schematic representation of the ground Floor with its Areas and the number of Corridors and Rooms in each.
To finish this section, we have defined 4 LogicZones. Each one covers one of the spaces in this Floor, that is, one for surgery, one for radiology, one for A&E and one for IC.
There will be 3 Floors for hospitalisations above the ground Floor. These three Floors will have layouts with 2 rows (Units) and 4 columns (Blocks). The number and distribution of the Corridors will depend on the number of Rooms on each Floor, and they are assigned following the same process as in the first version of this software. The total number of Rooms in the hospital is random, but it is approximately 320. These Rooms will be distributed evenly between the three Floors, so there will be approximately 105 Rooms per Floor.
The 17 Services for hospitalisations will be distributed between these three Floors such that each Floor has 13 or 14 HU and no Service is divided between two Floors. Specifically, the Services per Floor will be:
- Floor 1: 7 Services (S0-S6), 13 HU.
- Floor 2: 7 Services (S7-S13), 13 HU.
- Floor 3: 3 Services (S14-S16), 14 HU.
We have refined the random Floor generation algorithm to balance the number of Corridors and Rooms per Area. We have also improved the system to manage the Event creation and the writing of the KG files.
Each HU will have 8±2 Rooms with the following probabilities:
- 8 Rooms: 50%
- 7 Rooms: 15%
- 9 Rooms: 15%
- 6 Rooms: 10%
- 10 Rooms: 10%
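As a minimal sketch of how this distribution could be sampled (not the repository's actual implementation), the snippet below draws a Room count per HU with the standard library. The names ROOM_PROBABILITIES, sample_rooms_per_hu and expected_rooms_per_hu are hypothetical. It also shows why the total number of Rooms is approximately 320: the expected value is 8 Rooms per HU, and the three hospitalisation Floors hold 13 + 13 + 14 = 40 HU.

```python
import random

# Room-count probabilities per HospitalizationUnit, copied from the list above.
ROOM_PROBABILITIES = {8: 0.50, 7: 0.15, 9: 0.15, 6: 0.10, 10: 0.10}

# HU on the three hospitalisation Floors (Floor 1, Floor 2, Floor 3).
TOTAL_HU = 13 + 13 + 14


def sample_rooms_per_hu(n_hu, rng=None):
    """Draw one Room count for each of the n_hu HospitalizationUnits."""
    rng = rng or random.Random()
    sizes = list(ROOM_PROBABILITIES)
    weights = list(ROOM_PROBABILITIES.values())
    return rng.choices(sizes, weights=weights, k=n_hu)


def expected_rooms_per_hu():
    """Mean of the distribution: sum of (Room count * probability)."""
    return sum(size * p for size, p in ROOM_PROBABILITIES.items())


def expected_total_rooms():
    """Expected number of Rooms over all hospitalisation HU (about 320)."""
    return expected_rooms_per_hu() * TOTAL_HU
```

Calling sample_rooms_per_hu(TOTAL_HU, random.Random(0)) returns 40 values between 6 and 10; their sum varies from run to run around the expected total.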
The source code is currently hosted on github.com/LorenaPujante/HospitalGeneratorRDF.
The program is written in Python 3.10, and no external packages are needed.
Before the first execution of the code, all output folders with their subfolders must be created. These folders are listed in section 5.
The input for HospitalGeneratorRDF_V2 must be:
- The two files obtained as output from H-Outbreak: movements.csv and patients.csv.
- Two additional files that are obtained by adding new code to H-Outbreak:
  - hospital.txt: This file has a CSV-like representation of the layout of beds and wards from the H-Outbreak output.
  - roomsHU.txt: Each line of this file represents the triplet (Service, HU inside the Service, number of Rooms of the HU).
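To illustrate the triplet structure of roomsHU.txt, here is a hedged sketch of a reader. It assumes one comma-separated triplet per line (for example "3,1,8"); the real delimiter and field types are not described here and may differ. The function name parse_rooms_hu is hypothetical and is not part of the repository.

```python
from collections import defaultdict


def parse_rooms_hu(lines):
    """Parse (Service, HU, number of Rooms) triplets.

    ASSUMPTION: each non-empty line is "service,hu,rooms" separated by commas.
    Returns {service: {hu: number_of_rooms}} with Service and HU kept as text.
    """
    rooms = defaultdict(dict)
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        service, hu, n_rooms = (field.strip() for field in line.split(","))
        rooms[service][hu] = int(n_rooms)
    return dict(rooms)
```

Because the function accepts any iterable of strings, it can be fed an open file object or a plain list of lines when experimenting.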
The input data used to generate the data for doi: TODO is in the directory Program/Input
To get the extra files and set some of the specific characteristics of the hospital to be created, we need to modify two files from H-Outbreak: hospital.py and simulation.py. The code with the changes is in the files Modifications/mod_hospital.py and Modifications/mod_simulation.py.
To run the program in the terminal, go to the folder containing the program and run: python main.py
The main function receives as parameters the following:
- index: Index to start numbering (property id) the objects that are created to complete the hospital layout. It is recommended that this index be greater than the last id of the elements in the H-Outbreak hospital layout.
- huPerService: Number of HospitalizationUnits to create per Service.
- nFloors: Number of Floors that the hospital will have. This number must be greater than or equal to 2.
- huPerFloor: Number of HospitalizationUnits that each Floor (except the ground floor) will have.
- nRows, nColumns: Number of rows and columns of the grid that divides each Floor (except the ground floor).
- startDateTime: Date and time of the first step from the simulation generated by H-Outbreak. The next steps will be dated from this parameter.
- optionFloorHU: When it is not possible to create a suitable hospital layout with the specified number of HospUnits per Floor, this parameter selects whether to keep the number of HospUnits per Floor and modify nFloors (option 1) or to keep the number of Floors and modify huPerFloor (option 2). If this parameter is None, the option will be requested in the terminal.
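The constraints stated in this list can be sketched as a small parameter bundle. This is only an illustration: the class GeneratorParams and its validate method are hypothetical and do not reflect the real signature of the main function. The field types (in particular the sequences used for huPerService and huPerFloor) are assumptions, and the example values for index, huPerService and startDateTime are placeholders; the values actually used are in PARAMS.md.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional, Sequence


@dataclass
class GeneratorParams:
    """Hypothetical container for the parameters described above."""
    index: int
    huPerService: Sequence[int]   # ASSUMPTION: one entry per Service
    nFloors: int
    huPerFloor: Sequence[int]     # ASSUMPTION: one entry per upper Floor
    nRows: int
    nColumns: int
    startDateTime: datetime
    optionFloorHU: Optional[int] = None  # 1, 2, or None (ask in the terminal)

    def validate(self):
        """Check only the constraints that the parameter list states."""
        if self.nFloors < 2:
            raise ValueError("nFloors must be greater than or equal to 2")
        if self.nRows < 1 or self.nColumns < 1:
            raise ValueError("nRows and nColumns must be positive")
        if self.optionFloorHU not in (None, 1, 2):
            raise ValueError("optionFloorHU must be 1, 2 or None")


# Placeholder values; only nFloors, huPerFloor, nRows and nColumns
# follow the hospital description given earlier.
example = GeneratorParams(
    index=10000,
    huPerService=[1, 2, 3],
    nFloors=4,
    huPerFloor=[13, 13, 14],
    nRows=2,
    nColumns=4,
    startDateTime=datetime(2024, 1, 1, 8, 0),
)
example.validate()
```

Validating before generation makes an invalid nFloors or optionFloorHU fail early instead of midway through the layout creation.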
In the file PARAMS.md we present the values for the parameters of H-Outbreak and HospitalGeneratorRDF_V2 used to create the dataset for the work doi: TODO.
After running the program, the output is written to the following folders:
- OutputCSV: Folder with the nodes and edges of the graph in the form of CSV files.
- OutputRDF: Folder with the nodes and edges of the RDF knowledge graph in the form of N-Triples files.
- OutputRDF_star: Folder with the nodes and edges of the RDF* knowledge graph in the form of N-Triples files (nodes) and Turtle files (edges).
- OutputSummary: Folder with two summary files:
  - EpisodeSummary.txt: This file shows how many episodes and events there are for each patient with their description, id, and start and end dates. For Events, it also shows their subclass and to which Bed they are connected.
  - HospitalSummary.txt: This file shows a list of all the services, hospitalisation units, and locations. For each element, it presents its id, description and several lists with all the other elements of the spatial dimension to which it is connected.
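To make the difference between the RDF and RDF* outputs concrete, the sketch below formats the same edge in both styles. Everything in it is an assumption for illustration: the base IRI, the local names (Event1, placedIn, Bed1, start) and the choice of annotating the edge with a date are hypothetical and are not taken from the data model or from the generated files.

```python
# All IRIs and property names here are hypothetical placeholders.
BASE = "http://example.org/hospital/"
XSD_DATETIME = "http://www.w3.org/2001/XMLSchema#dateTime"


def iri(local_name):
    """Wrap a local name in angle brackets as a full IRI."""
    return f"<{BASE}{local_name}>"


def ntriples_edge(subject, predicate, obj):
    """Plain RDF edge written as one N-Triples line."""
    return f"{iri(subject)} {iri(predicate)} {iri(obj)} ."


def turtle_star_edge(subject, predicate, obj, annotations):
    """RDF* edge: the triple itself, then one quoted-triple line per property."""
    triple = f"{iri(subject)} {iri(predicate)} {iri(obj)}"
    lines = [f"{triple} ."]
    for prop, value in annotations.items():
        lines.append(
            f'<< {triple} >> {iri(prop)} "{value}"^^<{XSD_DATETIME}> .'
        )
    return "\n".join(lines)
```

In the plain RDF form an edge cannot carry properties directly, whereas the quoted triple in the RDF* form attaches them to the edge itself.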
Repeated runs will replace existing files.
The output folders must be created before running the code for the first time.
Inside each folder there must be two more folders: Classes and Relations. They must also be created before running the code for the first time.
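The folder structure can be prepared with a few lines of Python. This is a minimal sketch under two assumptions: that the output folders live directly under the directory passed as base_dir, and that the Classes and Relations subfolders are needed in all four output folders, including OutputSummary. The function name create_output_folders is hypothetical.

```python
from pathlib import Path

OUTPUT_FOLDERS = ("OutputCSV", "OutputRDF", "OutputRDF_star", "OutputSummary")
SUBFOLDERS = ("Classes", "Relations")


def create_output_folders(base_dir):
    """Create every output folder with its Classes and Relations subfolders.

    ASSUMPTION: the folders are located directly inside base_dir.
    Folders that already exist are left untouched.
    """
    created = []
    for folder in OUTPUT_FOLDERS:
        for sub in SUBFOLDERS:
            path = Path(base_dir) / folder / sub
            path.mkdir(parents=True, exist_ok=True)
            created.append(path)
    return created
```

For example, create_output_folders("Program") could be run once before the first execution of main.py, adjusting the base directory to wherever the program expects its output.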