Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proper procedures to load partitioned mesh #134

Open
zhangchonglin opened this issue Dec 11, 2024 · 3 comments
Open

Proper procedures to load partitioned mesh #134

zhangchonglin opened this issue Dec 11, 2024 · 3 comments

Comments

@zhangchonglin
Copy link
Contributor

Currently, the following procedures are used to load partitioned mesh in both XGCm and GITRm following PUMIPic example:

  • read in the full mesh
  • load in the partition file and create PICparts
  • delete the full mesh as necessary

These procedures are not optimal at all and will encounter memory issue as the mesh size becomes large (>20-50 million mesh elements for example, depending on the GPU memory). They are similarly to this test:

//**********Load the mesh in serial everywhere*************//
Omega_h::Mesh mesh = Omega_h::read_mesh_file(argv[1], lib.self());
int dim = mesh.dim();
int ne = mesh.nents(dim);
if (rank == 0)
printf("Mesh loaded with <v e f r> %d %d %d %d\n", mesh.nverts(), mesh.nedges(),
mesh.nfaces(), mesh.nelems());
//********* Load the partition vector ***********//
pumipic::Input input(mesh, argv[2], pumipic::Input::FULL, pumipic::Input::FULL);
pumipic::Mesh picparts(input);

The proper way would be to read in the partitioned mesh directly. It's not clear if below function is already providing the necessary steps:

void read(Omega_h::Library* library, Omega_h::CommPtr comm, const char* path,
Mesh* mesh) {
const char* prefix = splitPath(path);
char dir[4096];
sprintf(dir, "%s_%d.ppm", path, comm->size());
if (!Omega_h::filesystem::exists(dir)) {
printError("Directory %s does not exist\n", dir);
return;
}
char mesh_file[4096];
sprintf(mesh_file, "%s/%s_%d.osh", dir, prefix, comm->rank());
char ppm_file[4096];
sprintf(ppm_file, "%s/%s_%d.ppm", dir, prefix, comm->rank());
mesh->picpart =
new Omega_h::Mesh(Omega_h::binary::read(mesh_file, library->self()));
std::ifstream in_str(ppm_file);
if (!in_str) {
printError("Cannot open file %s\n", ppm_file);
return;
}
#ifdef OMEGA_H_USE_ZLIB
bool compress = true;
#else
bool compress = false;
#endif
bool swap = !is_little_endian_cpu();
Omega_h::I8 version;
Omega_h::binary::read_value(in_str, version, swap);
Omega_h::I8 is_full_mesh_int;
Omega_h::binary::read_value(in_str, is_full_mesh_int, swap);
mesh->is_full_mesh = is_full_mesh_int;
for (int i = 0; i < 4; ++i) {
if (version >= 2) {
//Read num_entites
Omega_h::binary::read_value(in_str, mesh->num_entites[i], swap);
}
//Read num_cores
Omega_h::binary::read_value(in_str, mesh->num_cores[i], swap);
//Read buffered_parts
Omega_h::binary::read_array(in_str, mesh->buffered_parts[i], compress, swap);
//Read offset_ents_per_rank_per_dim
Omega_h::binary::read_array(in_str, mesh->offset_ents_per_rank_per_dim[i],
compress, swap);
//Read ent_to_comm_arr_index_per_dim
Omega_h::binary::read_array(in_str, mesh->ent_to_comm_arr_index_per_dim[i],
compress, swap);
//Read is complete part
Omega_h::binary::read_array(in_str, mesh->is_complete_part[i], compress, swap);
//read num_bounds
Omega_h::binary::read_value(in_str, mesh->num_bounds[i], swap);
//read num_boundaries
Omega_h::binary::read_value(in_str, mesh->num_boundaries[i], swap);
//read boundary_parts
Omega_h::binary::read_array(in_str, mesh->boundary_parts[i], compress, swap);
//read offset_bounded_per_dim
Omega_h::binary::read_array(in_str, mesh->offset_bounded_per_dim[i],
compress, swap);
//read bounded_ent_ids
Omega_h::binary::read_array(in_str, mesh->bounded_ent_ids[i], compress, swap);
}
mesh->commptr = comm;
int world_rank;
MPI_Comm_rank(comm->get_impl(), &world_rank);
if (!world_rank) {
char buffer[1024];
char* ptr = buffer + sprintf(buffer, "PumiPIC Mesh read <v e f");
if (mesh->dim() == 3)
ptr += sprintf(ptr, " r");
ptr += sprintf(ptr, "> (%ld %ld %ld", mesh->num_entites[0], mesh->num_entites[1],
mesh->num_entites[2]);
if (mesh->dim() == 3)
ptr += sprintf(ptr, " %ld", mesh->num_entites[3]);
ptr += sprintf(ptr, ")");
printInfo("%s\n", buffer);
}
//Create load balancer after reading in the mesh
mesh->ptcl_balancer = new ParticleBalancer(*mesh);
}

The above function is used in the following test:
// //Create picparts using classification with the full mesh buffered and minimum safe zone
// p::Input input(full_mesh, argv[2], bufferMethod, safeMethod);
// if(!comm_rank)
// input.printInfo();
// MPI_Barrier(MPI_COMM_WORLD);
// p::Mesh picparts(input);
p::Mesh picparts;
pumipic::read(&lib, lib.world(), argv[1], &picparts);
o::Mesh* mesh = picparts.mesh();
mesh->ask_elem_verts(); //caching adjacency info

@cwsmith
Copy link
Contributor

cwsmith commented Dec 11, 2024

I thought I had remembered using pre-constructed meshes for performance testing. The read(...) function you found looks correct and the write(...) function to produce the constructed mesh is also in src/pumipic_file.cpp.

This doesn't solve the high memory usage issue, but may give you a chance to run the mesh construction job on nodes (possibly using a CPU only build) with more memory as a pre-processing step before running an analysis.

@cwsmith
Copy link
Contributor

cwsmith commented Dec 11, 2024

@dhyan1272 @Angelyr Any thoughts?

@zhangchonglin
Copy link
Contributor Author

@cwsmith: yes, looking further we need to write the partition file *.ppm file in a node with more memory to get the partitioned mesh and partition information.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants