Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Explicitencoding cleanup #1631

Draft
wants to merge 25 commits into
base: master
Choose a base branch
from

Conversation

carstenartur
Copy link
Contributor

@carstenartur carstenartur commented Sep 8, 2024

Port of "explicit encoding" cleanup from https://github.com/carstenartur/sandbox to eclipse

What it does

Some java api has been designed originally without a strong focus on safety on usage of different charsets. Because of that apis have simple string parameters to determine the encoding to use or even a default encoding that is based on the environment settings. The situation has improved a lot and you should explicitly set the charset. In doing so for type safety better switch to Charset class.

For an example what the result looks similar to look at
58ae341

That is what this cleanup is trying to support.

Starting with Java 10

1. 

ByteArrayOutputStream ba=new ByteArrayOutputStream();
String result=ba.toString();

Case 1: Option "keep_behavior"

ByteArrayOutputStream ba=new ByteArrayOutputStream();
String result=ba.toString(Charset.defaultCharset());

Case 2: Option "enforce_utf8"

ByteArrayOutputStream ba=new ByteArrayOutputStream();
String result=ba.toString(StandardCharsets.UTF_8);

2.

ByteArrayOutputStream ba=new ByteArrayOutputStream();
String result=ba.toString("UTF-8");

ByteArrayOutputStream ba=new ByteArrayOutputStream();
String result=ba.toString(StandardCharsets.UTF_8);

Starting with Java 10

Reader r=Channels.newReader(ch,"UTF-8");
Reader r=Channels.newReader(ch,StandardCharsets.UTF_8);

Starting with Java 10

Channels.newWriter(ch,"UTF-8");
Channels.newWriter(ch,StandardCharsets.UTF_8);

Starting with Java 18
Currently, jdt.ui supports test environments only up to Java 17, which means tests cannot be executed for this version.

1.

Charset.forName("UTF-8");
StandardCharsets.UTF_8;

2.

Charset.forName("UTF-16");
StandardCharsets.UTF_16;

Starting with Java 5

1.

new java.util.Formatter(new File(), "UTF-8");
new java.util.Formatter(new File(), StandardCharsets.UTF_8);

2.

new java.util.Formatter(new File(), "UTF-8",new java.util.Locale());
new java.util.Formatter(new File(), StandardCharsets.UTF_8,new java.util.Locale());

...

Starting with Java 5

new InputStreamReader(InputStream in, "UTF-8");
new InputStreamReader(InputStream in, StandardCharsets.UTF_8);

Starting with Java 5

new OutputStreamWriter os=new OutputStreamWriter(InputStream in,"UTF-8");
new OutputStreamWriter os=new OutputStreamWriter(InputStream in, StandardCharsets.UTF_8);

Starting with Java 5

Stream fw=new PrintStream("file.txt", "UTF-8");
Stream fw=new PrintStream("file.txt", StandardCharsets.UTF_8);

Starting with Java 10

Properties.storeToXML(java.io.OutputStream,"UTF-8");
Properties.storeToXML(java.io.OutputStream,StandardCharsets.UTF_8);

Starting with Java 10

new java.util.Scanner(new File("asdf"),"UTF-8");
new java.util.Scanner(new File("asdf"),StandardCharsets.UTF_8);

Starting with Java 10

String s=new String(byte[],"UTF-8");
String s=new String(byte[],StandardCharsets.UTF_8);

Starting with Java 5

String.getBytes("Utf-8")
String.getBytes(StandardCharsets.UTF_8);

Starting with Java 10

java.net.URLDecoder.decode("asdf","UTF-8");
java.net.URLDecoder.decode("asdf",StandardCharsets.UTF_8);

There are still some missing parts:

  1. Now using a Java 22 test environment since there is no Java 18 test enviroment support in jdt ui. Should be fine.
  2. In a number of cases the UnsupportedEncodingException is removed now. Several classes behave different. So there is no common approach to treat the situation.
  3. The code for PrintWriter,FileReader should be reimplemented in the way the other cleanups do it
  4. 3 different options are supported "KEEP" , "USE_UTF8" ,"USE_UTF8_AGGREGATE"
    KEEP means that the current behavior of the code is not changed
    USE_UTF-8 means that in all cases where there is no fixed encoding we add a charset UTF-8
    USE_UTF-8_AGGREGATE means the same as USE_UTF-8 but additionally add a constant to the class and refers to it
  5. NLS Comments are currently not removed
  6. This cleanup appears on the code style page although only using the first choice allows to keep the current behavior!

image

How to test

Use the cleanup on one of the affected java apis.

Author checklist

@carstenartur
Copy link
Contributor Author

@jjohnstn would it be acceptable to have a separat test project to hold junit5 based tests until they can be backported to the junit 4 standard jdt.ui complies to?

@akurtakov
Copy link
Contributor

Note that if you have separate project it would not be automatically executed by Ibuilds unless you do the extra work to register it like done in eclipse-platform/eclipse.platform.releng.aggregator#2322 .

@carstenartur
Copy link
Contributor Author

Ok, what is your advice, @akurtakov ?

@akurtakov
Copy link
Contributor

JUnit 5 can run junit 4 tests so migrating the suites to junit 5 suites should allow migrating tests to junit 5 one by one while in the same bundle.

@carstenartur
Copy link
Contributor Author

Ok, so you mean moving the tests to the usual test folder and add it to a suite?

It is a pity that this approach does not allow to add code coverage computation and clear splitting of test and supporting code.

Fine with me. I need help for parts of the implementation anyway:
How to provide a Java 18 based test environment?
How to remove different exceptions catched after applying the fix?

@carstenartur carstenartur changed the title Explicitencoding Explicitencoding cleanup Sep 9, 2024
@carstenartur carstenartur marked this pull request as ready for review September 15, 2024 16:38
@jjohnstn
Copy link
Contributor

Hi @carstenartur The Java 18 stubs are something I have meant to look into. The recipe for stubs did not work for me and I need to get some help from the IBM folks or Stephanne.

Copy link
Contributor

@jjohnstn jjohnstn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

General comments.

  1. new plug-ins need about.html files (added to sources as well) - won't matter if you move them to existing plug-ins
  2. new plug-ins should be 1.0.0 rather than match platform 4.34 (jdt.ui and jdt.ui.tests for example are lower versions) - again moot if you move into existing test plug-ins
  3. nearly all files including test sources and property files should have licensing headers (see other JDT plug-ins if in doubt)
  4. plug-in name and provider name should be supplied by plugin.properties file - moot when moved
  5. remove commented out code bits unless they will be added at a later date
  6. not sure about how this is added to Java Feature tab - there are Java 10 and 18 specific items but you only add Java 6 options and this appears after Java 5 and not between 5 and 7 (should it just be on or off in a different tab like code style?)
  7. last encoding option has no preview changes (as far as I can see - single constant) and isn't very clear (is it needed or should it be defaulted)?
  8. perhaps a little more description in feature option could help users to understanding what it is doing

@carstenartur
Copy link
Contributor Author

JUnit 5 can run junit 4 tests so migrating the suites to junit 5 suites should allow migrating tests to junit 5 one by one while in the same bundle.

@akurtakov I have copied the tests to the usual project without backporting them to junit 4. They can be executed now if you start them explicitly but do not run as part of the junit 4 based jdt test suites.
Any suggestion to get rid of the need to port them back to junit 4?

@carstenartur carstenartur force-pushed the explicitencoding branch 4 times, most recently from 1725922 to a170e61 Compare September 21, 2024 19:11
@jjohnstn
Copy link
Contributor

Hi @carstenartur You should never do a merge commit in a PR. You need to instead do a rebase on up to date master. You need to locally do a hard reset to the commit before the merge, then do a rebase on top of up to date master, then do a force push to fix this.

@carstenartur
Copy link
Contributor Author

Hi @carstenartur You should never do a merge commit in a PR. You need to instead do a rebase on up to date master. You need to locally do a hard reset to the commit before the merge, then do a rebase on top of up to date master, then do a force push to fix this.

Hi @jjohnstn, thanks for hint but I think I can simply solve it by choosing a "rebase update" in the web interface after the next change is merged to master. Dunno why it is not configured as default in jdts github setup. I would have done that beforehand but the webinterface of github does not allow it in case of a conflict it seems. So to solve one conflict that showed up with the libraries in one manifest file I had no choice at least as long as I solve the conflict in the web interface of github.
But it does not matter, I'm not in hurry. We can merge next year - currently I have no idea how to run Jave 18 based tests anyway.

@carstenartur
Copy link
Contributor Author

Hi @carstenartur You should never do a merge commit in a PR. You need to instead do a rebase on up to date master. You need to locally do a hard reset to the commit before the merge, then do a rebase on top of up to date master, then do a force push to fix this.

IMG_8844
This usually heals the issue

@carstenartur
Copy link
Contributor Author

Hi @carstenartur You should never do a merge commit in a PR. You need to instead do a rebase on up to date master. You need to locally do a hard reset to the commit before the merge, then do a rebase on top of up to date master, then do a force push to fix this.

IMG_8845
Usually the „allow merge commits“ featue could simply be unchecked in the project configuration for jdt.ui to not have that accidently. But of course I know nothing about the policy for that in jdt.

@jjohnstn
Copy link
Contributor

Hi @carstenartur There has been discussion about why the merge is offered and I don't believe it can be done. I am guessing it is due to some github lack of function (I also preferred gerrit). If the rebase fixes it great, otherwise, you will have to do stuff on your local system because github lacks a lot of git functionality.

@carstenartur carstenartur force-pushed the explicitencoding branch 2 times, most recently from 1725922 to c01dca4 Compare September 28, 2024 08:15
@carstenartur
Copy link
Contributor Author

General comments.

  1. new plug-ins need about.html files (added to sources as well) - won't matter if you move them to existing plug-ins

I removed the extra plugin projects. So this time I do not try to port the code coverage computation from my sandbox project.

  1. new plug-ins should be 1.0.0 rather than match platform 4.34 (jdt.ui and jdt.ui.tests for example are lower versions) - again moot if you move into existing test plug-ins

so this is no longer needed.

  1. nearly all files including test sources and property files should have licensing headers (see other JDT plug-ins if in doubt)

have to check that…

  1. plug-in name and provider name should be supplied by plugin.properties file - moot when moved

no longer needed

  1. remove commented out code bits unless they will be added at a later date

sure

  1. not sure about how this is added to Java Feature tab - there are Java 10 and 18 specific items but you only add Java 6 options and this appears after Java 5 and not between 5 and 7 (should it just be on or off in a different tab like code style?)

Yes, I agree. Maybe on the tab for cleanups that might chane behavior - not really sure about this.

  1. last encoding option has no preview changes (as far as I can see - single constant) and isn't very clear (is it needed or should it be defaulted)?

I have to find out how to consolidate all occurences of the encodings inserted into one constant per project. That was what I wanted because in my use cases that was what I needed most of the time.

  1. perhaps a little more description in feature option could help users to understanding what it is doing

yes, thats true. I‘ll try to improve that but don‘t hold your breath:)

@carstenartur carstenartur force-pushed the explicitencoding branch 3 times, most recently from f1a22fc to bd56b08 Compare October 6, 2024 19:07
@carstenartur carstenartur force-pushed the explicitencoding branch 2 times, most recently from 5488280 to 4e60dfa Compare October 9, 2024 17:30
@carstenartur
Copy link
Contributor Author

Hi @carstenartur Could you handle the one API error for your code? The others indicating difference in .class files I am not sure why they occur but I got them for a PR that didn't touch that code at all. Could you also revisit my original comments some of which you had mentioned that you would look into and give a status update since you have made numerous commits after? If the PR isn't complete, just let me know. I would like to try to get it in for 4.35 M1.

Hi Jeff,
I already updated the description. I think I resolved many of your comments. Have to check again to see what is still missing.
Especially two points described in the description are not ready yet:

The code for PrintWriter,FileReader should be reimplemented in the way the other cleanups do it. The current implementation was made with former java apis in mind.
2)
NLS Comments are currently not removed. I was looking for an easy way to concatenate cleanups. That way I could just reuse the cleanup to fix nls issues but I failed to find a way so far. This could have been useful in other cases as well. In the junit test you can see it partly works though.

Besides there are changes in a unrelated test that was constantly failing I had to change to get it run through.

So - unfortunately not ready, sorry.

@carstenartur carstenartur force-pushed the explicitencoding branch 2 times, most recently from ecd4a56 to 324ffca Compare December 8, 2024 16:52
@carstenartur carstenartur marked this pull request as draft December 8, 2024 17:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request noteworthy Noteworthy feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants