BeamDyn performance improvements #2399

Merged 10 commits on Sep 20, 2024

@deslaughter (Collaborator) commented Aug 29, 2024

This PR is ready to merge.

Feature or improvement description
This PR contains several commits developed during the Tight Coupling project to improve BeamDyn performance. They use the LAPACK_GEMM routines to perform matrix multiplication inside BeamDyn during element construction. These changes have shown some performance improvement, especially when using the Intel MKL library.
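
As a rough illustration of the idea, the sketch below multiplies two small matrices with a GEMM call and cross-checks the result against the `matmul` intrinsic. It is not BeamDyn code: it calls the standard BLAS routine `dgemm` directly rather than the LAPACK_GEMM wrapper named above (whose interface is not shown in this PR), and the array names and sizes are made up. It assumes the program is linked against a BLAS implementation (for example a reference BLAS or Intel MKL).

```fortran
program gemm_sketch
   ! Illustrative only: names and sizes are assumptions, not taken from BeamDyn.
   implicit none
   integer, parameter :: dp = kind(1.0d0)
   integer, parameter :: m = 6, k = 6, n = 4
   real(dp) :: A(m,k), B(k,n), C(m,n)
   external :: dgemm   ! standard BLAS routine, resolved at link time

   call random_number(A)
   call random_number(B)

   ! C = 1.0*A*B + 0.0*C; the leading dimensions are the row counts of A, B, and C.
   call dgemm('N', 'N', m, n, k, 1.0_dp, A, m, B, k, 0.0_dp, C, m)

   ! Cross-check against the matmul intrinsic.
   if (maxval(abs(C - matmul(A, B))) > 1.0e-12_dp) then
      error stop 'dgemm result differs from matmul'
   end if
   print *, 'dgemm and matmul agree'
end program gemm_sketch
```

Whether a GEMM call beats `matmul` or hand-written loops depends on the matrix sizes, the compiler, and the BLAS library in use, which is consistent with the gains being most visible with Intel MKL.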

This PR also changes some low-level code in ModMesh, NWTC_Num, and ModMesh_Mapping, which performance profiling identified as hotspots.

Impacted areas of the software

  • BeamDyn.f90 - use LAPACK_GEMM in several routines, use subroutines to simplify some calculations
  • BeamDyn_Subs.f90 - use select case instead of if statements in BD_CrvCompose
  • NWTC_Num - Add PURE to Cross_Product functions to hint that these have no side effects (maybe the compiler can inline them)
  • ModMesh.f90 - use select case instead of if statements in MeshCopy to process CtrlCode, should be easier to optimize
  • ModMesh_Mapping.f90 - use maxval intrinsic instead of looping to find max value in matrix.
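
The three smaller changes listed above (a `PURE` cross product, `select case` in place of an `if` chain, and the `maxval` intrinsic in place of a search loop) can be sketched in a few lines. This is a hedged, standalone illustration rather than the code from the PR: the module, the function names, and the control-code constants are hypothetical stand-ins.

```fortran
module hotspot_sketch
   implicit none
   integer, parameter :: dp = kind(1.0d0)
   ! Hypothetical control codes, standing in for the CtrlCode values in MeshCopy.
   integer, parameter :: CODE_NEW = 1, CODE_UPDATE = 2
contains
   ! PURE declares that the function has no side effects.
   pure function cross(a, b) result(c)
      real(dp), intent(in) :: a(3), b(3)
      real(dp) :: c(3)
      c(1) = a(2)*b(3) - a(3)*b(2)
      c(2) = a(3)*b(1) - a(1)*b(3)
      c(3) = a(1)*b(2) - a(2)*b(1)
   end function cross

   ! select case replaces an if / else if chain on an integer code.
   function describe(code) result(label)
      integer, intent(in) :: code
      character(len=7) :: label
      select case (code)
      case (CODE_NEW)
         label = 'new'
      case (CODE_UPDATE)
         label = 'update'
      case default
         label = 'unknown'
      end select
   end function describe
end module hotspot_sketch

program hotspot_demo
   use hotspot_sketch
   implicit none
   real(dp) :: mat(3,3), z(3)

   mat = reshape([1.0_dp, 5.0_dp, 2.0_dp, 9.0_dp, 3.0_dp, 4.0_dp, &
                  7.0_dp, 8.0_dp, 6.0_dp], [3, 3])

   ! maxval scans the whole matrix; no explicit double loop is needed.
   if (maxval(mat) /= 9.0_dp) error stop 'unexpected maximum'

   ! x cross y should give z = (0, 0, 1).
   z = cross([1.0_dp, 0.0_dp, 0.0_dp], [0.0_dp, 1.0_dp, 0.0_dp])
   if (any(z /= [0.0_dp, 0.0_dp, 1.0_dp])) error stop 'unexpected cross product'

   if (describe(CODE_UPDATE) /= 'update') error stop 'unexpected label'
   if (describe(99) /= 'unknown') error stop 'unexpected default label'
   print *, 'all checks passed'
end program hotspot_demo
```

None of these changes alters results by itself; they mainly give the compiler more information to optimize with.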

Test results, if applicable

The following test references were updated:

  • Ideal_Beam_Free_Free_Linear
  • 5MW_Land_BD_Linear
  • 5MW_Land_BD_Linear_Aero

@@ -173,8 +173,8 @@ type(BD_MiscVarType) function simpleMiscVarType(nqp, dof_node, elem_total, nodes
 call AllocAry(m%qp%RR0mEta, 3, nqp, elem_total, 'qp_RR0mEta', ErrStat, ErrMsg)
 call AllocAry(m%DistrLoad_QP, 6, nqp, elem_total, 'DistrLoad_QP', ErrStat, ErrMsg)

-CALL AllocAry(m%qp%uuu, dof_node ,nqp,elem_total, 'm%qp%uuu displacement at quadrature point',ErrStat,ErrMsg)
-CALL AllocAry(m%qp%uup, dof_node/2,nqp,elem_total, 'm%qp%uup displacement prime at quadrature point',ErrStat,ErrMsg)
+call AllocAry(m%qp%uuu, dof_node, nqp, elem_total, 'm%qp%uuu displacement at quadrature point', ErrStat, ErrMsg)
Review comment (Collaborator): Indenting looks a bit odd here.

@andrew-platt marked this pull request as draft September 4, 2024 20:09

@andrew-platt (Collaborator) commented:

Results are slightly different on linearization. It is unclear whether this is a numerical artifact or a real issue. Will explore further and either continue with this PR or close it.

@ptrbortolotti (Contributor) commented:

The rotor-only structural-only comparison for the IEA15 looks good. This is the same plot as Figure 1 in https://iopscience.iop.org/article/10.1088/1742-6596/2767/2/022018/pdf
[Figure: campbell_iea15_of_vs_h2_struct_rotor_fixedrpm_3p5p4]

@ptrbortolotti (Contributor) commented:

Solution with tower DOFs is also good

[Figure: campbell_iea15_of_vs_h2_struct_towdt_rpm_3p5p4]

@ptrbortolotti (Contributor) commented:

[Figure: campbell_iea15_of_vs_h2_qs_aero_new2]

no notable differences with Figure 4 in https://iopscience.iop.org/article/10.1088/1742-6596/2767/2/022018/pdf

this plot was for commit f93ef05

sims are running for 56a97f6

we'll post new results tonight/tomorrow morning

@ptrbortolotti (Contributor) commented:

all good for 56a97f6

[Figure: campbell_iea15_of_vs_h2_qs_aero_new3]

@andrew-platt marked this pull request as ready for review September 20, 2024 15:55

@andrew-platt (Collaborator) commented:

Decided that we really don't need to keep the error handling from the LAPACK_GEMM calls. The only errors that would be reported from there are matrix size mismatches. While in principle I like the idea of keeping that error handling, in practice it would never be triggered, and it adds a whole lot of error handling through the call stack that is annoying to implement. So I'm fine with ignoring those errors in this case.
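
To make the trade-off concrete, here is a minimal sketch of what a size-checked multiply with `ErrStat`/`ErrMsg` reporting could look like. It is an assumption-laden illustration, not the LAPACK_GEMM wrapper itself: the routine name `checked_multiply`, the error-code constants, and the use of `matmul` for the product are all hypothetical.

```fortran
module checked_multiply_sketch
   implicit none
   integer, parameter :: dp = kind(1.0d0)
   ! Hypothetical status codes for this sketch only.
   integer, parameter :: ERR_NONE = 0, ERR_FATAL = 4
contains
   subroutine checked_multiply(A, B, C, ErrStat, ErrMsg)
      real(dp),         intent(in)  :: A(:,:), B(:,:)
      real(dp),         intent(out) :: C(:,:)
      integer,          intent(out) :: ErrStat
      character(len=*), intent(out) :: ErrMsg

      ErrStat = ERR_NONE
      ErrMsg  = ''

      ! The only failure mode: the shapes of A, B, and C are not conformable.
      if (size(A,2) /= size(B,1) .or. size(C,1) /= size(A,1) &
          .or. size(C,2) /= size(B,2)) then
         ErrStat = ERR_FATAL
         ErrMsg  = 'checked_multiply: matrix sizes do not match'
         return
      end if

      C = matmul(A, B)
   end subroutine checked_multiply
end module checked_multiply_sketch

program checked_multiply_demo
   use checked_multiply_sketch
   implicit none
   real(dp) :: A(2,3), B(3,2), C(2,2), Bad(2,2)
   integer :: ErrStat
   character(len=64) :: ErrMsg

   A = 1.0_dp
   B = 2.0_dp
   Bad = 0.0_dp

   ! Conformable shapes: no error, and every entry of C is 3*1*2 = 6.
   call checked_multiply(A, B, C, ErrStat, ErrMsg)
   if (ErrStat /= ERR_NONE) error stop 'unexpected error'
   if (any(C /= 6.0_dp)) error stop 'unexpected product'

   ! A is 2x3 but Bad is 2x2, so the check must fail.
   call checked_multiply(A, Bad, C, ErrStat, ErrMsg)
   if (ErrStat /= ERR_FATAL) error stop 'mismatch was not detected'
   print *, trim(ErrMsg)
end program checked_multiply_demo
```

Every caller of such a routine has to test `ErrStat` and pass the failure up to its own caller, which is the call-stack overhead described above. When the array shapes are fixed at allocation time, the check can never fail.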

@andrew-platt merged commit e28e1a0 into OpenFAST:rc-3.5.4 Sep 20, 2024
19 checks passed