Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unify Y-build and I-build configurations and their tests #2625

Open
1 task done
HannesWell opened this issue Dec 3, 2024 · 19 comments
Open
1 task done

Unify Y-build and I-build configurations and their tests #2625

HannesWell opened this issue Dec 3, 2024 · 19 comments
Labels
enhancement New feature or request

Comments

@HannesWell
Copy link
Member

HannesWell commented Dec 3, 2024

Recently I have reworked the definition of I-build tests to reduce duplication and to make them more unified and simpler.
This has the effect that the definition of Y-build tests has now significantly diverged from the I-build tests and I intended to apply the same simplifications for Y-build tests as well.

And while looking into all the definitions of I-builds and Y-builds I noticed that they have a lot in common and wonder respectively think we should try to unify their definitions and try to define both pipelines from one source, while considering the points where they are different.
But this leads to the questions what exactly the differences between I-builds and Y-builds are? What I have found so far is

Together with #1950 this should to further simplify the overall releng.
Of course a unification would have to happen in multiple steps.

An alternative solution would be to just develop ECJ's support for upcoming java versions on the master branch and stop using another branch. AFAICT this has already been discussed respectively is currently discussed but will not be implemented soon.

The questions I have to the JDT-team:

  • Are you fine with the proposed general goal?
  • Who of you is interested to discuss the details that come along the path of making this happen?

@jarthana, @stephan-herrmann or @iloveeclipse are you interested in discussing this or can you tell who is interested?

Community

  • I understand suggesting an enhancement doesn't mandate anyone to implement it. Other contributors may consider this suggestion, or not, at their own convenience. The most efficient way to get it fixed is that I implement it myself and contribute it back as a good quality patch to the project.
@HannesWell HannesWell added the enhancement New feature or request label Dec 3, 2024
@stephan-herrmann
Copy link
Contributor

I am interested in seeing successful Y-builds with useful test result, ideally with no big gaps around the times of release, or infra changes. If unification helps to this end, then surely I fully agree with the goals, and your description looks correct to me.

I don't, however, have much insight into the inner workings of I-builds nor Y-builds, so I don't know in which detail discussion I could help. I'm already looking into P-builds, which is about all I can contribute in terms of releng builds.

And while it's correct that the matter of master vs BETA_JAVA24 (all of jdt, not just jdt.core) is being discussed widely and for a long time already, it's pretty clear that the mere fact of branching will stay for the foreseeable future, for more than one reason.

@HannesWell
Copy link
Member Author

I am interested in seeing successful Y-builds with useful test result, ideally with no big gaps around the times of release, or infra changes. If unification helps to this end, then surely I fully agree with the goals, and your description looks correct to me.

It will simplify the maintenance of the Y-build and will make it quicker to apply changes and it will also mean that the basic configurations that are meant to be equal, stay equal. So I think this could be a contribution to that overall goal.

I don't, however, have much insight into the inner workings of I-builds nor Y-builds, so I don't know in which detail discussion I could help. I'm already looking into P-builds, which is about all I can contribute in terms of releng builds.

It's mainly when I encounter differences to the I-build and I have to find out if they are really necessary or not. Then I hope to get an answer from you or somebody else from the JDT-team. But the technical Jenkins details I can hopefully handle by myself.

And while it's correct that the matter of master vs BETA_JAVA24 (all of jdt, not just jdt.core) is being discussed widely and for a long time already, it's pretty clear that the mere fact of branching will stay for the foreseeable future, for more than one reason.

Yes, I just wanted to mention that option, knowing that it's at the moment not feasible. :)

HannesWell added a commit to HannesWell/eclipse.platform.releng.aggregator that referenced this issue Dec 5, 2024
This simplifies later additions of new os-arch combinations.

Part of eclipse-platform#2625
HannesWell added a commit that referenced this issue Dec 5, 2024
This simplifies later additions of new os-arch combinations.

Part of #2625
HannesWell added a commit to HannesWell/eclipse.platform.releng.aggregator that referenced this issue Dec 5, 2024
and slightly clean-up existing pipeline for linux.
Unify numbers like time-outs and numbers of retained builds.

Part of eclipse-platform#2625
HannesWell added a commit to HannesWell/eclipse.platform.releng.aggregator that referenced this issue Dec 5, 2024
and slightly clean-up existing pipeline for linux.
Unify numbers like time-outs and numbers of retained builds.

Part of eclipse-platform#2625
HannesWell added a commit that referenced this issue Dec 5, 2024
and slightly clean-up existing pipeline for linux.
Unify numbers like time-outs and numbers of retained builds.

Part of #2625
HannesWell added a commit to HannesWell/eclipse.platform.releng.aggregator that referenced this issue Dec 5, 2024
- Simplify environment variable definitions and tool handling
- Remove unnecessary print-outs and whitespace
- Remove unused or unnecessary parameters and property definitions

Part of eclipse-platform#2625
HannesWell added a commit that referenced this issue Dec 5, 2024
- Simplify environment variable definitions and tool handling
- Remove unnecessary print-outs and whitespace
- Remove unused or unnecessary parameters and property definitions

Part of #2625
@HannesWell
Copy link
Member Author

HannesWell commented Dec 5, 2024

For your information I have now done the following further step to unify Y-build and I-build tests:

The tasks executed respectively the behavior should be the same, but please let me know if you encounter any problem upon the next execution. Respectively if you don't plan to run another Y-build soon, do you mind if I start one?

Does the JDT team mind if obsolete Y-build test jobs, i.e. those that are no longer in use, are removed from https://ci.eclipse.org/releng/job/YPBuilds/ or do you need them for some time for reference?

@HannesWell
Copy link
Member Author

HannesWell commented Dec 7, 2024

For your information I have now done the following further step to unify Y-build and I-build tests:

To me it looks like the first execution after these changes were ok, without regressions due to infrastructure changes.

@stephan-herrmann or @jarthana I have two questions about the Y-build setup:

  1. Is there a reason why Y-build tests don't run on windows? Is it just to save precious computation time on the windows machine?

  2. Would it be possible to use the jdk installations provided by default for Eclipse Jenkins instances in Y-build tests on Linux?
    At the moment specific open-jdks are download for each build:

    [javaVersion: 17, javaHome: '''installJDK('https://download.java.net/java/GA/jdk17.0.2/dfd4a8d0985749f896bed50d7138ee7f/8/GPL/openjdk-17.0.2_linux-x64_bin.tar.gz')''' ],
    [javaVersion: 21, javaHome: '''installJDK('https://download.java.net/java/GA/jdk21/fd2272bbf8e04c3dbaee13770090416c/35/GPL/openjdk-21_linux-x64_bin.tar.gz')''' ],
    [javaVersion: 24, javaHome: '''installJDK('https://download.java.net/java/early_access/jdk24/18/GPL/openjdk-24-ea+18_linux-x64_bin.tar.gz')''' ]

    But EF Jenkins provides all these JDKs already, even EA builds for not yet released versions:
    See 'openjdk-ea-latest' at https://github.com/eclipse-cbi/jiro/wiki/Tools-(JDK,-Maven,-Ant)#openjdk
    @fredg02 can you tell how often the EA builds are updated for all Jenkins instances?

To the JDT-devs:
Would you be fine with changing to the provided 'openjdk-ea-latest' for the Y-build
Java 24 tests and switch to temurin for all others?
Just like all other builds:
https://github.com/eclipse-cbi/jiro/wiki/Tools-(JDK,-Maven,-Ant)#eclipse-temurin

@stephan-herrmann
Copy link
Contributor

Does the JDT team mind if obsolete Y-build test jobs, i.e. those that are no longer in use, are removed from https://ci.eclipse.org/releng/job/YPBuilds/ or do you need them for some time for reference?

IMHO old jobs can be removed once the current set of jobs (build & test) is functional.

Let's perhaps just wait for one more +1 from @jarthana or @mpalat

@stephan-herrmann
Copy link
Contributor

  1. Is there a reason why Y-build tests don't run on windows? Is it just to save precious computation time on the windows machine?

I'll have to pass this one, as I don't know the reason.

  1. Would it be possible to use the jdk installations provided by default for Eclipse Jenkins instances in Y-build tests on Linux?
    At the moment specific open-jdks are download for each build:
    [javaVersion: 17, javaHome: '''installJDK('https://download.java.net/java/GA/jdk17.0.2/dfd4a8d0985749f896bed50d7138ee7f/8/GPL/openjdk-17.0.2_linux-x64_bin.tar.gz')''' ],
    [javaVersion: 21, javaHome: '''installJDK('https://download.java.net/java/GA/jdk21/fd2272bbf8e04c3dbaee13770090416c/35/GPL/openjdk-21_linux-x64_bin.tar.gz')''' ],
    [javaVersion: 24, javaHome: '''installJDK('https://download.java.net/java/early_access/jdk24/18/GPL/openjdk-24-ea+18_linux-x64_bin.tar.gz')''' ]

But EF Jenkins provides all these JDKs already, even EA builds for not yet released versions:
See 'openjdk-ea-latest' at https://github.com/eclipse-cbi/jiro/wiki/Tools-(JDK,-Maven,-Ant)#openjdk
@fredg02 can you tell how often the EA builds are updated for all Jenkins instances?

IMHO, the specific JDK version is not relevant here, as long as we have some "good" EA build of the current stream.

To the JDT-devs: Would you be fine with changing to the provided 'openjdk-ea-latest' for the Y-build Java 24 tests and switch to temurin for all others? Just like all other builds: https://github.com/eclipse-cbi/jiro/wiki/Tools-(JDK,-Maven,-Ant)#eclipse-temurin

No objections from my side.

@jarthana
Copy link
Contributor

jarthana commented Dec 9, 2024

Does the JDT team mind if obsolete Y-build test jobs, i.e. those that are no longer in use, are removed from https://ci.eclipse.org/releng/job/YPBuilds/ or do you need them for some time for reference?

IMHO old jobs can be removed once the current set of jobs (build & test) is functional.

Let's perhaps just wait for one more +1 from @jarthana or @mpalat

No objections from me either.

1. Is there a reason why Y-build tests don't run on windows? Is it just to save precious computation time on the windows machine?

Just checked with @MohananRahul who says that it's configured but not configured to run automatically.

To the JDT-devs: Would you be fine with changing to the provided 'openjdk-ea-latest' for the Y-build Java 24 tests and switch to temurin for all others? Just like all other builds: https://github.com/eclipse-cbi/jiro/wiki/Tools-(JDK,-Maven,-Ant)#eclipse-temurin

I agree with Stephan. No problem for me.

@jarthana
Copy link
Contributor

jarthana commented Dec 9, 2024

  1. Is there a reason why Y-build tests don't run on windows? Is it just to save precious computation time on the windows machine?

Just checked with @MohananRahul who says that it's configured but not configured to run automatically.

Rahul just dug this up for me, thanks Rahul!
#457
Still doesn't explain why, though.

@fredg02
Copy link
Contributor

fredg02 commented Dec 9, 2024

@fredg02 can you tell how often the EA builds are updated for all Jenkins instances?

They are mostly updated on demand.

HannesWell added a commit to HannesWell/eclipse.platform.releng.aggregator that referenced this issue Dec 19, 2024
HannesWell added a commit to HannesWell/eclipse.platform.releng.aggregator that referenced this issue Dec 19, 2024
For the Linux Y-Build for Java-24 the latest Temurin early-access build
is downloaded.

Part of eclipse-platform#2625
HannesWell added a commit that referenced this issue Dec 19, 2024
For the Linux Y-Build for Java-24 the latest Temurin early-access build
is downloaded.

Part of #2625
@HannesWell
Copy link
Member Author

Thanks for your agreement, I have unified this with:

Rahul just dug this up for me, thanks Rahul!
#457

Thanks. Then just leave it as it is for now. The load is already high enough.

FYI after the I-build and all it's tests have been moved to Java-21 I applied the same change for the Y-build:

HannesWell added a commit that referenced this issue Dec 23, 2024
This reduces the number of places to adapt in case the I-/Y-build
test-configurations are changed and in general reduces the difference
between I-builds and Y-builds.
Furthermore it reduces duplication introduced to handle the small
difference between I- and Y-builds.

Part of #2625
HannesWell added a commit to HannesWell/eclipse.platform.releng.aggregator that referenced this issue Dec 27, 2024
Update 'API_PREV_REF_LABEL' for Y- and P-build to be in sync with the
value defined for I-builds.
Remove unused 'TESTED_BUILD_TYPE' variable.

Part of
eclipse-platform#2625
HannesWell added a commit to HannesWell/eclipse.platform.releng.aggregator that referenced this issue Dec 27, 2024
Instead of duplicating the 'buildproperties.txt' template for each build
type, keep only variables with common values in the general
'buildproperties.txt' and define differing variables in the Jenkins
pipeline and add it dynamically to the generated/derived
'buildproperties.shsource/properties'.

Remove unused 'TESTED_BUILD_TYPE' variable.
Update 'API_PREV_REF_LABEL' for Y- and P-build to be in sync with the
value defined for I-builds.

Part of
eclipse-platform#2625
HannesWell added a commit to HannesWell/eclipse.platform.releng.aggregator that referenced this issue Dec 27, 2024
Instead of duplicating the 'buildproperties.txt' template for each build
type, keep only variables with common values in the general
'buildproperties.txt' and define differing variables in the Jenkins
pipeline and add it dynamically to the generated/derived
'buildproperties.shsource/properties'.

Remove unused 'TESTED_BUILD_TYPE' variable.
Update 'API_PREV_REF_LABEL' for Y- and P-build to be in sync with the
value defined for I-builds.

Part of
eclipse-platform#2625
HannesWell added a commit that referenced this issue Dec 27, 2024
Instead of duplicating the 'buildproperties.txt' template for each build
type, keep only variables with common values in the general
'buildproperties.txt' and define differing variables in the Jenkins
pipeline and add it dynamically to the generated/derived
'buildproperties.shsource/properties'.

Remove unused 'TESTED_BUILD_TYPE' variable.
Update 'API_PREV_REF_LABEL' for Y- and P-build to be in sync with the
value defined for I-builds.

Part of
#2625
HannesWell added a commit to HannesWell/eclipse.platform.releng.aggregator that referenced this issue Dec 27, 2024
HannesWell added a commit to HannesWell/eclipse.platform.releng.aggregator that referenced this issue Dec 29, 2024
HannesWell added a commit to HannesWell/eclipse.platform.releng.aggregator that referenced this issue Dec 29, 2024
HannesWell added a commit to HannesWell/eclipse.platform.releng.aggregator that referenced this issue Dec 29, 2024
@stephan-herrmann
Copy link
Contributor

Test job https://ci.eclipse.org/releng/job/YPBuilds/job/ep435Y-unit-linux-x86_64-java24/ aborts since 2024-12-31.

Error is:

java.lang.OutOfMemoryError: Java heap space
Also:   org.jenkinsci.plugins.workflow.actions.ErrorAction$ErrorId: a030fa77-332d-4e01-8dec-f10f51ace8d9
Caused: java.io.IOException: Remote call on JNLP4-connect connection from 10.40.71.91/10.40.71.91:47894 failed
	at hudson.remoting.Channel.call(Channel.java:1116)
	at hudson.FilePath.act(FilePath.java:1228)
	at hudson.FilePath.act(FilePath.java:1217)
	at PluginClassLoader for junit//hudson.tasks.junit.JUnitParser.parseResult(JUnitParser.java:146)
	at PluginClassLoader for junit//hudson.tasks.junit.JUnitResultArchiver.parse(JUnitResultArchiver.java:177)
	at PluginClassLoader for junit//hudson.tasks.junit.JUnitResultArchiver.parseAndSummarize(JUnitResultArchiver.java:282)
	at PluginClassLoader for junit//hudson.tasks.junit.pipeline.JUnitResultsStepExecution.run(JUnitResultsStepExecution.java:62)
	at PluginClassLoader for junit//hudson.tasks.junit.pipeline.JUnitResultsStepExecution.run(JUnitResultsStepExecution.java:27)
	at PluginClassLoader for workflow-step-api//org.jenkinsci.plugins.workflow.steps.SynchronousNonBlockingStepExecution.lambda$start$0(SynchronousNonBlockingStepExecution.java:47)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
	at java.base/java.lang.Thread.run(Thread.java:857)
Finished: FAILURE

Could this be connected to changes done here?

@HannesWell
Copy link
Member Author

Test job https://ci.eclipse.org/releng/job/YPBuilds/job/ep435Y-unit-linux-x86_64-java24/ aborts since 2024-12-31.

Error is:

java.lang.OutOfMemoryError: Java heap space

Could this be connected to changes done here?

I saw that as well, but since the jobs are more and more unified (which is the goal of this issue) and Java-23 respectively I-builds don't show this I was hoping that the cause is an increased memory consumption (or maybe even a memory leak?) with java-24. Is that possible?

To help us in finding the cause I replayed the latest build, with -XX:+HeapDumpOnOutOfMemoryError enabled (as described in https://www.baeldung.com/java-heap-dump-capture#automatically):
https://ci.eclipse.org/releng/job/YPBuilds/job/ep435Y-unit-linux-x86_64-java24/15/

Maybe that's something we should enable in general because it has no cost if no OOM error happens and helps to track down memory issues (also for the builds not only the tests)?

@stephan-herrmann
Copy link
Contributor

I was hoping that the cause is an increased memory consumption (or maybe even a memory leak?)

As the error happens while parsing test results, the leak might simply be a flood of failures that blows up memory. But due to the error we cannot see those test failures :-/

I wonder if the memory limit is defined by the executor or as a build parameter?

@HannesWell
Copy link
Member Author

HannesWell commented Jan 4, 2025

I was hoping that the cause is an increased memory consumption (or maybe even a memory leak?)

As the error happens while parsing test results, the leak might simply be a flood of failures that blows up memory. But due to the error we cannot see those test failures :-/

Good point. I didn't look at the logs at that detail 😅

I wonder if the memory limit is defined by the executor or as a build parameter?

For Linux the tests run in a kubernetes pod and we can indeed use a pod with increased resource limits by replacing

agent {
  label 'ubuntu-2404'
}

with

agent {
  kubernetes {
    inheritFrom 'ubuntu-2404'
    yaml """
apiVersion: v1
kind: Pod
spec:
  containers:
  - name: "jnlp"
    resources:
      limits:
        memory: "8Gi"
        cpu: "4000m"
      requests:
        memory: "4Gi"
        cpu: "2000m"
"""
  }
}

I have started another replay where I applied this:

You can see the exact diff in the job configuration at (you probably have to be logged in into Jenkins):

@HannesWell
Copy link
Member Author

HannesWell commented Jan 5, 2025

It failed again :/
But you could have a look at the test result files directly:
https://ci.eclipse.org/releng/job/YPBuilds/job/ep435Y-unit-linux-x86_64-java24/16/artifact/workarea/Y20250104-1000/eclipse-testing/results/xml/

Some of them have multiple MiB, but I don't see an error or failure. And although the number of tests didn't change, the size of org.eclipse.jdt.core.tests.compiler suite almost doubled.
Maybe the configuration changed? E.g. I see 22 hits for
<testcase classname="org.eclipse.jdt.core.tests.compiler.regression.TestAll" name="test067 - 16" (randomly chosen)

Maybe this also leads to an endless loops in the test-result reading step?

@stephan-herrmann
Copy link
Contributor

It failed again :/ But you could have a look at the test result files directly: https://ci.eclipse.org/releng/job/YPBuilds/job/ep435Y-unit-linux-x86_64-java24/16/artifact/workarea/Y20250104-1000/eclipse-testing/results/xml/

Some of them have multiple MiB, but I don't see an error or failure. And altought the number of tests didn't change, the size of org.eclipse.jdt.core.tests.compiler suite almost doubled. Maybe the configuraiton changed? E.g. I see (randomly choosen) 22 hits for <testcase classname="org.eclipse.jdt.core.tests.compiler.regression.TestAll" name="test067 - 16"

Maybe this also leads to a endless in the test-result reading step?

Thanks! This led to a candidate culprit:
The recent merge from master to BETA_JAVA24 removed -Dcompliance.jre.24=1.8,11,17,21,24 from the test invocation resulting in all tests being executed at 16 distinct compliance levels rather than the intended 5. This didn't surface in PR builds, because the bogus change was in test.xml, which is not used in maven builds, but apparently this is what still drives tests during production builds.

Over to eclipse-jdt/eclipse.jdt.core#3517

@merks
Copy link
Contributor

merks commented Jan 5, 2025

Nice teamwork. 👍

HannesWell added a commit to HannesWell/eclipse.platform.releng.aggregator that referenced this issue Jan 16, 2025
The existing differences are not relevant and probably only due to
omitted synchronization.

Part of eclipse-platform#2625
HannesWell added a commit to HannesWell/eclipse.platform.releng.aggregator that referenced this issue Jan 16, 2025
The existing differences are not relevant and probably only due to
omitted synchronization.
For the 'dropTemplateFileName' property, rely on the default, which is
exactly the new unify filename.

Part of eclipse-platform#2625
HannesWell added a commit that referenced this issue Jan 16, 2025
The existing differences are not relevant and probably only due to
omitted synchronization.
For the 'dropTemplateFileName' property, rely on the default, which is
exactly the new unify filename.

Part of #2625
@HannesWell
Copy link
Member Author

I have just pushed another unification of the result website via:

The result shouldn't be really different as before.

@stephan-herrmann I'm also reading your thread on the jdt-dev mailing list about how to compile ECJ on the Y-build not with the latest I-build but with the latest Y-build (if I'm not mistaken that's the actual intention isn't it?).
I have an idea for that, but have to verify it first before implementing it. I'll try to propose a patch in the next days.

@stephan-herrmann
Copy link
Contributor

@stephan-herrmann I'm also reading your thread on the jdt-dev mailing list about how to compile ECJ on the Y-build not with the latest I-build but with the latest Y-build (if I'm not mistaken that's the actual intention isn't it?).

I think you got me right. There's just two things I'm not 100% sure about:

  • at least in PR builds we use the compiler built in the very same build, I assume this is not happening in any production builds?
  • you say "to compile ECJ", but isn't it actually about all compilation during that build, not just ECJ self-compilation?

At the bottom line, the same dog fooding from beta branches, as I-builds do for the master branch would be great (and for the rare use case discussed recently, this would prevent a strange hiccup).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

5 participants