Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/585 celix conditions #588

Merged
merged 38 commits into from
Jul 31, 2023
Merged

Feature/585 celix conditions #588

merged 38 commits into from
Jul 31, 2023

Conversation

pnoltes
Copy link
Contributor

@pnoltes pnoltes commented Jul 11, 2023

This PR introduces the celix_condition service interface and several framework-specific celix_condition services. The celix_condition is based on the OSGi Condition Specification, but has been adapted for Apache Celix / C.

This PR includes:

  • The celix_condition interface.
  • Updated framework documentation for the newly added celix_condition services.
  • Refactored framework.c with some minor improvements, including some error injection for framework_start failure.
  • A separate celix_framework_bundle.c source file which handles the framework-specific services (celix_condition services), specifically:
    • true condition.
    • "framework.ready" condition.
    • "framework.error" condition.
    • "components.ready" condition.
  • A fix for the query command so that bundle 0 (framework) services are queried.
  • A private celix_framework_shutdownAsync function which is now used by the framework bundle to shut down the framework.

@PengZheng PengZheng self-requested a review July 12, 2023 01:42
@PengZheng
Copy link
Contributor

Please note that ScheduledEventTestSuite.CxxWaitForScheduledEvent failed.

Copy link
Contributor

@PengZheng PengZheng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A very useful mechanism!
I already used marker services similarly in my day job. Glad to have upstream support.

However, the current implementation of "components.ready" condition is costly.
Also I suspect its usefulness for a dynamic component framework as Celix:
when the framework becomes ready, anyone is free to install a bundle by hand.
The concept seems not well-defined.
Moreover, as analyzed in #588 (comment), it can subtly interfere with our existing tests in unexpected way.
I suggest we postpone the addition of "components.ready" until a well-defined concept and an efficient solution is worked out.

"framework.ready" and "framework.error" are indeed useful concepts.
Considering some bundles keeping producing events during their life cycle, the requirement of empty event queue does not make much sense. Moreover, it will hurts start-up speed unnecessarily. One big benefit of using Celix is that we can move whichever bundle to the front of the start-up list so that we can get whatever functionality we need as soon as possible, leaving the framework to deal with complex dependency management.

documents/framework.md Show resolved Hide resolved
documents/framework.md Outdated Show resolved Hide resolved
libs/framework/include/celix_condition.h Show resolved Hide resolved
libs/framework/src/celix_framework_bundle.c Outdated Show resolved Hide resolved
return CELIX_BUNDLE_EXCEPTION;
}

fw_addFrameworkListener(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return value should be checked.

}
}
return status;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a bundle fails to start, status is not set correctly (it is still CELIX_SUCCESS).
We need a test case for this.

}
location = strtok_r(NULL, delims, &save_ptr);
}
} else {
fw_log(fw->logger, CELIX_LOG_LEVEL_ERROR, "Could not auto install bundles, out of memory.");
}
celix_utils_freeStringIfNotEqual(autoStartBuffer, autoStart);
return status;;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need error code for out of memory case.

"Celix framework started with uuid %s",
celix_framework_getUUID(framework));
} else {
fw_log(framework->logger, CELIX_LOG_LEVEL_ERROR, "Celix framework failed to start");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unnecessary, status == CELIX_SUCCESS

libs/framework/src/framework.c Show resolved Hide resolved
fw_fireFrameworkEvent(framework, OSGI_FRAMEWORK_EVENT_STARTED, CELIX_SUCCESS);
} else {
//note not returning an error, because the framework is started, but not all bundles are started/installed
fw_logCode(framework->logger, CELIX_LOG_LEVEL_ERROR, status, "Could not auto start or install all configured bundles");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

status == CELIX_SUCCESS

@PengZheng
Copy link
Contributor

PengZheng commented Jul 12, 2023

The crash of CelixFrameworkUtilsErrorInjectionTestSuite.testIsBundleUrlValid is very interesting.

https://github.com/apache/celix/actions/runs/5525141973/jobs/10078372242

[2023-07-11T21:47:10] AddressSanitizer[  error] [celix_framework] [celix_framework_utils_resolveFileBundleUrl:96] Failed(No such file or directory) to resolve bundle location 'non-existing.zip', taking into account the cwd and Celix bundle path 'bundles'.
:DEADLYSIGNAL
=================================================================
[2023-07-11T21:47:10] [  error] [celix_framework] [celix_framework_utils_locateEmbeddedBundle:147] Framework exception(0x11177): "Failed to locate embedded bundle symbols for bundle 'embedded://simple_test_bundle1'.";
 Cause: Inject dl error
[2023-07-11T21:47:10] [  error] [celix_framework] [celix_framework_utils_locateEmbeddedBundle:147] Cannot allocate memory(0xc): "Failed to locate embedded bundle symbols for bundle 'embedded://simple_test_bundle1'.";
 Cause: Cannot allocate memory
[2023-07-11T21:47:10] [  error] [celix_framework] [celix_framework_utils_locateEmbeddedBundle:147] Cannot allocate memory(0xc): "Failed to locate embedded bundle symbols for bundle 'embedded://simple_test_bundle1'.";
 Cause: Cannot allocate memory
==35135==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x7f1cd907e839 bp 0x7f1cd3cfe9c0 sp 0x7f1cd3cfe128 T13)
==35135==The signal is caused by a READ memory access.
==35135==Hint: address points to the zero page.
    #0 0x7f1cd907e838 in __sanitizer::internal_strnlen(char const*, unsigned long) ../../../../src/libsanitizer/sanitizer_common/sanitizer_libc.cc:212
    #1 0x7f1cd8fce234 in __interceptor_strndup ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:379
    #2 0x559452ee5d88 in serviceRegistration_createInternal /home/runner/work/celix/celix/libs/framework/src/service_registration.c:54
    #3 0x559452ee5a6e in serviceRegistration_create /home/runner/work/celix/celix/libs/framework/src/service_registration.c:35
    #4 0x559452ee86b1 in serviceRegistry_registerServiceInternal /home/runner/work/celix/celix/libs/framework/src/service_registry.c:194
    #5 0x559452eec2f4 in celix_serviceRegistry_registerService /home/runner/work/celix/celix/libs/framework/src/service_registry.c:734
    #6 0x559452ed07ed in fw_handleEventRequest /home/runner/work/celix/celix/libs/framework/src/framework.c:1374
    #7 0x559452ed1225 in fw_handleEvents /home/runner/work/celix/celix/libs/framework/src/framework.c:1432
    #8 0x559452ed28a9 in fw_eventDispatcher /home/runner/work/celix/celix/libs/framework/src/framework.c:1597
    #9 0x7f1cd8414608 in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x8608)
    #10 0x7f1cd7fed132 in __clone (/lib/x86_64-linux-gnu/libc.so.6+0x11f132)

It was caused by celix_utils_strdup failure in one of celix_framework_registerServiceAsync calls introduced by this PR.

Of course, CELIX_EI_UNKNOWN_CALLER should be avoided at the first place.
This crash tells us more: an unexpected service registration together with corresponding events process could interact with tests more subtly than we thought.

If we remove "components.ready" for now, adding celix_framework_waitForEmptyEventQueue after a framework instance is created should protect us from most hazards in tests. In production, we shall never do that since it will hurt startup speed.

If we keep "components.ready", this will be more difficult to fix.

@pnoltes
Copy link
Contributor Author

pnoltes commented Jul 12, 2023

Please note that ScheduledEventTestSuite.CxxWaitForScheduledEvent failed.

I though I had this covered, but I will look into this.

@pnoltes
Copy link
Contributor Author

pnoltes commented Jul 12, 2023

The crash of CelixFrameworkUtilsErrorInjectionTestSuite.testIsBundleUrlValid is very interesting.

https://github.com/apache/celix/actions/runs/5525141973/jobs/10078372242

[2023-07-11T21:47:10] AddressSanitizer[  error] [celix_framework] [celix_framework_utils_resolveFileBundleUrl:96] Failed(No such file or directory) to resolve bundle location 'non-existing.zip', taking into account the cwd and Celix bundle path 'bundles'.
:DEADLYSIGNAL
=================================================================
[2023-07-11T21:47:10] [  error] [celix_framework] [celix_framework_utils_locateEmbeddedBundle:147] Framework exception(0x11177): "Failed to locate embedded bundle symbols for bundle 'embedded://simple_test_bundle1'.";
 Cause: Inject dl error
[2023-07-11T21:47:10] [  error] [celix_framework] [celix_framework_utils_locateEmbeddedBundle:147] Cannot allocate memory(0xc): "Failed to locate embedded bundle symbols for bundle 'embedded://simple_test_bundle1'.";
 Cause: Cannot allocate memory
[2023-07-11T21:47:10] [  error] [celix_framework] [celix_framework_utils_locateEmbeddedBundle:147] Cannot allocate memory(0xc): "Failed to locate embedded bundle symbols for bundle 'embedded://simple_test_bundle1'.";
 Cause: Cannot allocate memory
==35135==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x7f1cd907e839 bp 0x7f1cd3cfe9c0 sp 0x7f1cd3cfe128 T13)
==35135==The signal is caused by a READ memory access.
==35135==Hint: address points to the zero page.
    #0 0x7f1cd907e838 in __sanitizer::internal_strnlen(char const*, unsigned long) ../../../../src/libsanitizer/sanitizer_common/sanitizer_libc.cc:212
    #1 0x7f1cd8fce234 in __interceptor_strndup ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:379
    #2 0x559452ee5d88 in serviceRegistration_createInternal /home/runner/work/celix/celix/libs/framework/src/service_registration.c:54
    #3 0x559452ee5a6e in serviceRegistration_create /home/runner/work/celix/celix/libs/framework/src/service_registration.c:35
    #4 0x559452ee86b1 in serviceRegistry_registerServiceInternal /home/runner/work/celix/celix/libs/framework/src/service_registry.c:194
    #5 0x559452eec2f4 in celix_serviceRegistry_registerService /home/runner/work/celix/celix/libs/framework/src/service_registry.c:734
    #6 0x559452ed07ed in fw_handleEventRequest /home/runner/work/celix/celix/libs/framework/src/framework.c:1374
    #7 0x559452ed1225 in fw_handleEvents /home/runner/work/celix/celix/libs/framework/src/framework.c:1432
    #8 0x559452ed28a9 in fw_eventDispatcher /home/runner/work/celix/celix/libs/framework/src/framework.c:1597
    #9 0x7f1cd8414608 in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x8608)
    #10 0x7f1cd7fed132 in __clone (/lib/x86_64-linux-gnu/libc.so.6+0x11f132)

It was caused by celix_utils_strdup failure in one of celix_framework_registerServiceAsync calls introduced by this PR.

Of course, CELIX_EI_UNKNOWN_CALLER should be avoided at the first place. This crash tells us more: an unexpected service registration together with corresponding events process could interact with tests more subtly than we thought.

If we remove "components.ready" for now, adding celix_framework_waitForEmptyEventQueue after a framework instance is created should protect us from most hazards in tests. In production, we shall never do that since it will hurt startup speed.

If we keep "components.ready", this will be more difficult to fix.

Yeah indeed interesting and good that you already analyzed the root cause. I did introduce a "CELIX_FRAMEWORK_CONDITION_SERVICES_ENABLED" config property so that the condition services can be disabled when testing. But this is a warning that we should be careful with timing sensitive actions in the framework, especially in combination with error injection test suites.

@codecov-commenter
Copy link

codecov-commenter commented Jul 24, 2023

Codecov Report

Merging #588 (ab9f1f8) into master (d40cada) will increase coverage by 0.04%.
The diff coverage is 99.65%.

❗ Current head ab9f1f8 differs from pull request most recent head 43a9131. Consider uploading reports for the commit 43a9131 to get more accurate results

@@            Coverage Diff             @@
##           master     #588      +/-   ##
==========================================
+ Coverage   78.56%   78.60%   +0.04%     
==========================================
  Files         234      237       +3     
  Lines       35431    35623     +192     
==========================================
+ Hits        27836    28002     +166     
- Misses       7595     7621      +26     
Files Changed Coverage Δ
libs/framework/src/framework.c 84.95% <98.86%> (+1.97%) ⬆️
...nts_ready_check/src/celix_components_ready_check.c 100.00% <100.00%> (ø)
...check/src/celix_components_ready_check_activator.c 100.00% <100.00%> (ø)
bundles/shell/shell/src/query_command.c 79.36% <100.00%> (-4.64%) ⬇️
libs/framework/include/celix/Framework.h 100.00% <100.00%> (ø)
libs/framework/include/celix/FrameworkFactory.h 100.00% <100.00%> (ø)
libs/framework/include/celix/UseServiceBuilder.h 100.00% <100.00%> (ø)
libs/framework/src/celix_framework_bundle.c 100.00% <100.00%> (ø)
libs/framework/src/celix_framework_factory.c 100.00% <100.00%> (+12.50%) ⬆️

... and 8 files with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@pnoltes
Copy link
Contributor Author

pnoltes commented Jul 30, 2023

Added a ENABLE_TESTING_ON_CI cmake/conan option that is enabled when building on CI servers.

The ENABLE_TESTING_ON_CI is used in the scheduled event and components_ready_check tests to increase to allowed timing error margin when running on a CI environment.

Note that on my physical systems the error margin is a factor 10 stricter, but - based on the flaky test results - this is not possible on the virtual CI servers. And because the goal of the scheduled events is not to support very precise or high frequent scheduled events, IMO relaxing the timing requirements on the CI servers is acceptable.

@pnoltes
Copy link
Contributor Author

pnoltes commented Jul 30, 2023

Separated the "components.,ready" from the framework bundle to a separate bundle named "components_ready_check". Also decreased the component.ready check frequency, which was (incorrectly) too high.

@pnoltes pnoltes requested a review from PengZheng July 30, 2023 18:28
Copy link
Contributor

@PengZheng PengZheng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@pnoltes pnoltes merged commit 1ff71f6 into master Jul 31, 2023
@pnoltes pnoltes deleted the feature/585-celix-conditions branch July 31, 2023 21:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants