Error handling under OpenMP (exception handling in shared-memory programming) #11

Open
crazyzlj opened this issue Apr 1, 2017 · 1 comment

crazyzlj commented Apr 1, 2017

Throwing exceptions inside OpenMP

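In SEIMS, exceptions are handled through the ModelException class, which inherits from exception. When a module runs into an error it only needs to throw, i.e. throw ModelException(); the exception is caught and the error message is printed where the main function makes the call, i.e. in invoke.cpp.
When a module carries out its computation, the calculations for the cells within a subbasin, or for the cells within each layer of a flow-direction-based layering, are independent of one another, so thread-level parallelism through shared memory (OpenMP) is easy to apply. For example, we often write code like the following: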

omp_set_num_threads(4);
#pragma omp parallel for
for (int i = 0; i < m_nCells; i++) {
    DoSomethingThatMayThrowAnException();
    if (ExceptionOccurred) {
        throw ModelException("MID", "FunctionName", "ErrorMessage!\n");
    }
}

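What we expect here is that as soon as the computation hits an error, the main program catches the exception, prints the error message, releases memory cleanly, and exits, like this:
[Image: expected output with the exception caught]
In practice, however (when more than one exception occurs across the m_nCells iterations), we get this:
[Image: the program actually crashes]
The program crashes, memory is not fully released, and the only option is to close the program.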

What is the cause?

Exceptions may not be thrown and caught across omp constructs. That is, if a code inside an omp for throws an exception, the exception must be caught before the end of the loop iteration; and an exception thrown inside a parallel section must be caught by the same thread before the end of the parallel section.
Guide into OpenMP: Easy multithreading programming for C++ by Joel Yliluoma

In other words, an exception must be caught by the same thread that threw it, and it must be caught within the current code block.
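As a minimal sketch of that rule, the try/catch below sits entirely inside the loop body, so whichever thread throws also catches before its iteration ends. ComputeCell, ProcessCells, and the "negative value is an error" condition are hypothetical names and assumptions made for illustration; they are not part of SEIMS.

```cpp
#include <stdexcept>
#include <vector>

// Hypothetical per-cell computation that throws on invalid input.
double ComputeCell(double value) {
    if (value < 0.0) throw std::invalid_argument("negative cell value");
    return value * 2.0;
}

// Processes all cells in parallel. The try/catch is inside the loop body,
// so an exception is caught by the thread that threw it, before that
// iteration ends. ok[i] records whether cell i succeeded.
std::vector<char> ProcessCells(const std::vector<double>& in,
                               std::vector<double>& out) {
    const int n = static_cast<int>(in.size());
    out.assign(in.size(), 0.0);
    std::vector<char> ok(in.size(), 1);
#pragma omp parallel for
    for (int i = 0; i < n; i++) {
        try {
            out[i] = ComputeCell(in[i]);
        } catch (const std::exception&) {
            ok[i] = 0;  // each iteration writes only its own slot, so no race
        }
    }
    return ok;
}
```

The status array uses char rather than bool because std::vector<bool> packs bits, which would make concurrent writes to neighbouring elements unsafe.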

How can it be solved?

Searching Google for the keywords “C++ exception openmp loop” turns up several solutions, such as Stack Overflow and Breaking Out of Loops in OpenMP, and even some papers, such as Towards an Error Model for OpenMP and Managing C++ OpenMP Code and Its Exception Handling.

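The solutions all come down to two approaches:

  • Write explicit try{...} catch{...} statements inside the loop body, and add some OpenMP directives for synchronization
  • Collect the exception information inside the loop body, and throw the exception after the loop has finished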
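One way the two approaches above can be combined is sketched here using std::exception_ptr from standard C++11: each thread catches its own exception, one of them is stored under an OpenMP critical section, and it is rethrown on the calling thread once the parallel region has ended. ScaleCell, ScaleCells, and the critical-section name capture_error are hypothetical and chosen only for this example; this is not the approach SEIMS currently uses.

```cpp
#include <exception>
#include <stdexcept>
#include <vector>

// Hypothetical per-cell computation that throws on invalid input.
double ScaleCell(double value) {
    if (value < 0.0) throw std::invalid_argument("negative cell value");
    return value * 2.0;
}

// Runs the loop in parallel. If any iteration throws, the first exception
// to be captured (which iteration that is depends on scheduling) is
// rethrown on the calling thread after the parallel region has ended.
void ScaleCells(const std::vector<double>& in, std::vector<double>& out) {
    const int n = static_cast<int>(in.size());
    out.assign(in.size(), 0.0);
    std::exception_ptr firstError;  // shared by all threads
#pragma omp parallel for
    for (int i = 0; i < n; i++) {
        try {
            out[i] = ScaleCell(in[i]);
        } catch (...) {
            // Caught in the throwing thread; the critical section lets only
            // one thread at a time touch firstError.
#pragma omp critical(capture_error)
            {
                if (!firstError) firstError = std::current_exception();
            }
        }
    }
    if (firstError) std::rethrow_exception(firstError);
}
```

Compared with counting errors, this keeps the original exception type and message, at the cost of exposing one more OpenMP directive in the loop body.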

To keep changes to the existing code small, and to avoid exposing too many OpenMP directives, we currently use a simple implementation:

omp_set_num_threads(4);
size_t errCount = 0;
#pragma omp parallel for reduction(+:errCount)
for (int i = 0; i < m_nCells; i++) {
    DoSomethingThatMayThrowAnException();
    if (ExceptionOccurred) {
        PrintErrorMessage();
        errCount++;
    }
}
if (errCount > 0) {
    throw ModelException("MID", "FunctionName", "ErrorMessage!\n");
}
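For reference, a self-contained version of the pattern above might look like the following sketch. DemoModelException is only a stand-in whose three-string constructor is assumed from the throw statement in the snippet; the real ModelException class may differ. CheckCells and the "negative value is an error" condition are likewise hypothetical.

```cpp
#include <cstdio>
#include <stdexcept>
#include <string>
#include <vector>

// Stand-in for SEIMS's ModelException (assumed shape: module ID, function
// name, message). The real class may differ.
class DemoModelException : public std::runtime_error {
public:
    DemoModelException(const std::string& mid, const std::string& func,
                       const std::string& msg)
        : std::runtime_error(mid + "::" + func + ": " + msg) {}
};

// Hypothetical check: a cell value counts as an error when it is negative.
void CheckCells(const std::vector<double>& cells) {
    const int n = static_cast<int>(cells.size());
    int errCount = 0;
#pragma omp parallel for reduction(+:errCount)
    for (int i = 0; i < n; i++) {
        if (cells[i] < 0.0) {
            // Report inside the loop, but do not throw here.
            std::fprintf(stderr, "cell %d: invalid value %f\n", i, cells[i]);
            errCount++;  // per-thread copy, summed when the loop ends
        }
    }
    // The parallel region is over, so throwing from here is safe.
    if (errCount > 0) {
        throw DemoModelException("MID", "CheckCells",
                                 std::to_string(errCount) + " cell(s) failed");
    }
}
```

A caller can wrap CheckCells in an ordinary try/catch and print e.what(), in the same way the main function is described as doing in invoke.cpp.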

TBD

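Please keep this in mind when writing code.
If you have a better solution, please leave a comment below and we will improve this step by step!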

Further reading

@aibbgiser

It's really helpful. Cool!
