-
Notifications
You must be signed in to change notification settings - Fork 78
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add replicatedDimCount to ESMF_ArrayGet, and improve some unit tests of replicated dimensions #187
Conversation
This allows us to remove some duplicated code to count the number of replicated dimensions, both within and outside of ESMCI_Array.C. The motivation for doing this now is that I anticipate adding another place where we need to get this count in ESMCI_IO.C. In order to test this new function (which is not exposed via the Fortran API), I have added a new C++ unit test class, ESMCI_ArrayUTest.C.
Previously, two files were getting overwritten by subsequent tests. This made it impossible to examine the results. I am introducing new files to avoid this overwriting.
Add more variability to the arrays. This started as an attempt to fix what appear to be messed up values in some of the I/O of Arrays with replicated dimensions, especially this one: netcdf Array_repli1_first { dimensions: myData_dim001 = 10 ; myData_dim002 = 3 ; variables: double myData(myData_dim002, myData_dim001) ; data: myData = 0, _, 12345, _, 24690, _, 37035, _, 49380, 61725, 0, _, 12345, _, 24690, _, 37035, _, 49380, 61725, 0, _, 12345, _, 24690, _, 37035, _, 49380, 61725 ; } But this pattern of missing data persists no matter how I fill the arrays. So I'm giving up on trying to fix this pattern of missing data, but am leaving in place this slight improvement in the setting of values (so that there's more variability from one point to another).
These tests are now completely redundant with the recently-added tests at the Fortran API level.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -385,6 +385,8 @@ namespace ESMCI { | |||
const int *getTotalLBound() const {return totalLBound;} | |||
const int *getTotalUBound() const {return totalUBound;} | |||
int getTensorCount() const {return tensorCount;} | |||
int getReplicatedDimCount() const |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that I don't have a return code (rc) for this function, since nothing can currently lead to it failing. I was thinking of this as just a small step beyond a simple getter. Does that seem reasonable, or should this have a place-holder rc?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that this fine. New code on the C++ side should rely on exception throwing anyway in case of error, instead of explicit RC return.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However, I would go a step further, and implement the actual determination of the replicated dim count right here under int getReplicatedDimCount()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something like:
int getReplicatedDimCount() const {
int replicatorCount = 0;
for (auto i=0; i<distgrid->getDimCount(); i++)
if (distgridToArrayMap[i] == 0) ++replicatorCount;
return replicatorCount;
}
Then get rid of the getReplicatedDimCountImpl()
altogether.
@@ -129,7 +129,7 @@ contains | |||
! !INTERFACE: | |||
! Private name; call using ESMF_ArrayGet() | |||
subroutine ESMF_ArrayGetDefault(array, keywordEnforcer, arrayspec, typekind, & | |||
rank, localarrayList, indexflag, distgridToArrayMap, & | |||
rank, replicatedDimCount, localarrayList, indexflag, distgridToArrayMap, & |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this seem like a good place to insert this new optional output argument? Another possible position is after dimCount
, though the stuff there seems related solely to the distGrid
, so that didn't feel quite as appropriate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about putting it /after/ the dimCount
variable further back in the list. Seems a bit more appropriate to me. But all pretty subjective.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bill, please see my comments in the code. My sense is that the implementation can be simplified. Everything else looks good to me.
@@ -385,6 +385,8 @@ namespace ESMCI { | |||
const int *getTotalLBound() const {return totalLBound;} | |||
const int *getTotalUBound() const {return totalUBound;} | |||
int getTensorCount() const {return tensorCount;} | |||
int getReplicatedDimCount() const |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that this fine. New code on the C++ side should rely on exception throwing anyway in case of error, instead of explicit RC return.
@@ -385,6 +385,8 @@ namespace ESMCI { | |||
const int *getTotalLBound() const {return totalLBound;} | |||
const int *getTotalUBound() const {return totalUBound;} | |||
int getTensorCount() const {return tensorCount;} | |||
int getReplicatedDimCount() const |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However, I would go a step further, and implement the actual determination of the replicated dim count right here under int getReplicatedDimCount()
.
@@ -385,6 +385,8 @@ namespace ESMCI { | |||
const int *getTotalLBound() const {return totalLBound;} | |||
const int *getTotalUBound() const {return totalUBound;} | |||
int getTensorCount() const {return tensorCount;} | |||
int getReplicatedDimCount() const |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something like:
int getReplicatedDimCount() const {
int replicatorCount = 0;
for (auto i=0; i<distgrid->getDimCount(); i++)
if (distgridToArrayMap[i] == 0) ++replicatorCount;
return replicatorCount;
}
Then get rid of the getReplicatedDimCountImpl()
altogether.
@@ -503,6 +505,8 @@ namespace ESMCI { | |||
static void superVecParam(Array *array, int localDeCount, | |||
bool superVectorOkay, int superVecSizeUnd[3], int *superVecSizeDis[2], | |||
int &vectorLength); | |||
private: | |||
static int getReplicatedDimCountImpl(int dimCount, int *distgridToArrayMap); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think there is need of a static implementation of this feature. It seems like a perfect instance method to me. See my other comments.
@@ -129,7 +129,7 @@ contains | |||
! !INTERFACE: | |||
! Private name; call using ESMF_ArrayGet() | |||
subroutine ESMF_ArrayGetDefault(array, keywordEnforcer, arrayspec, typekind, & | |||
rank, localarrayList, indexflag, distgridToArrayMap, & | |||
rank, replicatedDimCount, localarrayList, indexflag, distgridToArrayMap, & |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about putting it /after/ the dimCount
variable further back in the list. Seems a bit more appropriate to me. But all pretty subjective.
//----------------------------------------------------------------------------- | ||
#undef ESMC_METHOD | ||
#define ESMC_METHOD "ESMCI::Array::getReplicatedDimCountImpl()" | ||
int Array::getReplicatedDimCountImpl(int dimCount, int *distgridToArrayMap) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hope this can be removed.
int replicatorCount = 0; // initialize | ||
for (int i=0; i<dimCount; i++) | ||
if (distgridToArrayMapArray[i] == 0) ++replicatorCount; | ||
int replicatorCount = getReplicatedDimCountImpl(dimCount, distgridToArrayMapArray); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change to getReplicatedDimCount()
.
int replicatorCount = 0; // initialize | ||
for (int i=0; i<dimCount; i++) | ||
if (distgridToArrayMapArray[i] == 0) ++replicatorCount; | ||
int replicatorCount = getReplicatedDimCountImpl(dimCount, distgridToArrayMapArray); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change to getReplicatedDimCount()
.
@theurich - thanks a lot for your comments here. I only have a minute right now, so have just skimmed them for now, but I wanted to explain the need (I think) for |
Yes, being functional inside |
And on second thought, it is really a DistGrid method. Still fine to route it through the ESMCI::Array and ESMF_Array API for convenience (as you have it), but I think the implementation should be an instance method of ESMCI::DistGrid. |
Per Gerhard's suggestion
@theurich - I made your suggested change to the placement of the Yes, let's talk tomorrow about your other comments. I thought that it didn't make sense to put this new function in the DistGrid class because it's mainly working with |
Yes, the DistGrid method was a dumb idea, with half my brain out the door already yesterday afternoon. Just ignore it please. |
I have checked the new API documentation and confirmed that it looks good. |
@theurich - I think I have made the changes you suggested (making a member variable for number of replicated dimensions); can you look this over and let me know if it looks good to you now? However, I didn't see where / how to change the calculation of buffer size in the serialize / deserialize. If that's needed, can you point me to what I need to do for it? (Unit tests currently pass, so maybe nothing more is needed?) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This all looks good, except for one small change needed (answering your question about the serialize buffer size inquire part).
Code at
esmf/src/Infrastructure/Array/src/ESMCI_Array.C
Line 4468 in 13ac9ca
ip += 2 + 2*tensorCount + 2*distgrid->getDimCount () + |
ip += 2 + 2*tensorCount + 2*distgrid->getDimCount () +
rank + delayout->getDeCount ();
needs to be changed to
ip += 3 + 2*tensorCount + 2*distgrid->getDimCount () +
rank + delayout->getDeCount ();
i.e. bump up the first number by one, because there is one more int
member now.
Need to add 1 for the new int
Thanks @theurich ! I made that change, reran local testing (all good) and have pushed it. Once you approve, I'll merge this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bill, this looks good to me now. Please merge at your convenience. Thanks!
This started with my work towards addressing #184 . That turned out to be a bigger problem than I anticipated, but some of the work towards addressing that issue seemed worth preserving.
This PR:
(1) Introduces a method to get the replicated dimension count of an array and uses this method in place of some previously-duplicated code (in both ESMCI_Array.C and ESMCI_IO.C).
(2) Adds a replicatedDimCount optional output argument to ESMF_ArrayGet, with associated unit tests.
(3) Improves unit tests of Arrays with replicated dimensions in ESMF_ArrayIOUTest.F90 - most notably, outputting the results of different tests to different files to aid later examination, and making the values in the Array vary within each PET to make the tests more rigorous.