Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure that Static recursive bindings always go through pre-allocation #8

Open
wants to merge 508 commits into
base: let-rec-propagate
Choose a base branch
from

Conversation

lthls
Copy link
Owner

@lthls lthls commented Sep 14, 2023

This PR introduces a Constant classification and relies on it to ensure that Static bindings always get pre-allocated.

The patch is less simple than I would have liked for multiple reasons.

First, classes. The translation of classes (and all object code in general) generates recursive bindings that do not go through Rec_check. These bindings can be actually recursive, in which case they often match the shape of a Static binding, but when they're not actually recursive the compilation pass can also generate code that returns a function call, which should be Dynamic.
So it looks like we might have to patch the class compilation code so that it also returns the classification, but after a bit more digging it is not enough.
When I said "often" earlier, I left out cases like class c1 = object end and c2 = c1, which is compiled as an alias recursive definition which would normally be forbidden by Rec_check. But other checks on class definitions force these aliases to always occur after the classes they alias to, so as long as the compiler initializes the recursive bindings in the same order as the definitions the current scheme happens to work.
So my solution is to add a new classification for classes and keep the old behaviour in this case.
ocaml#8956 chooses another solution, which is to always pre-allocate classes (they are always blocks of size 4); I've left the possibility open here in a comment.

Second, Flambda can lift recursive definitions to symbols, transforming a Static binding into a Constant one. Typically, in the test letrec-compilation/record_with.ml, both records can be statically allocated. So I could either change the expression computations to dynamically detect constants, or patch the Flambda code to change the classification when appropriate. I chose the second option (but the first one might have been better, see next point).

Finally, the order of evaluation for recursive definitions depends on the classification (as shown by the test letrec-compilation/evaluation_order_1.ml), so I decided to be as precise as possible for the classification of constants, making the classification in Rec_check a bit more complicated than necessary.
In the end, it turns out that the test would pass even if the evaluation order changes; it only failed because Tree [] was classified as Static but the size computation treated it as a constant, so my new assertions triggered. Allowing Static bindings to be re-classified as constants during size computation, as discussed earlier, would have solved the problem.

@lthls
Copy link
Owner Author

lthls commented Sep 14, 2023

I'm going to try the alternate solution discussed above, allowing switching from Static to Constant at any point and detecting that during the size computation.

@lthls lthls force-pushed the let-rec-constant-classification branch 2 times, most recently from 6a892d5 to 95b4a90 Compare September 26, 2023 11:45
smuenzel and others added 27 commits October 17, 2023 15:53
Identify mismatched class type parameters and class parameters by ordinal in error messages.
Compiler's ocaml-variants.opam always includes the VERSION number, so
there's no need to edit it further.
* Building the compiler with ThreadSanitizer and running the testsuite caused too
many reports in OCaml 5 and was disabled (see ocaml#11040).

Since then, the work on TSan support for OCaml programs has led to fix a number
of those data races and temporarily silence the ones that are waiting to be
investigated (see ocaml#11040 again). As a result, running the testsuite with
`--enable-tsan` is now a cheap and effective way of detecting new data races
that may be introduced in the runtime.

A second good reason to restore the TSan CI is that it will detect early if a
recent change has accidentally broken TSan instrumentation (as has happened
before as an accidental consequence of removing a symbol
ocaml#12383 (review)), or
other issues (e.g. a new test revealed a TSan limitation with signals
ocaml#12561 (comment)).

Adding this test to the Github Actions CI arguably lengthens the runs (a GHA
run on amd64 Linux with TSan takes about 50 minutes). This PR therefore
suggests the compromise of enabling it on the Inria CI which is run on every
merge.

* Disable tests parallel/catch_break with tsan

* CI sanitizers: Use clang 14

clang 13 thread sanitizer produces different, less precise traces.
Also, clang 14 is the default version in Ubuntu 22.04 LTS.

---------

Co-authored-by: Xavier Leroy <xavier.leroy@college-de-france.fr>
Simplify `opam pin` command in `HACKING.adoc`
…rity-in-simplif

Stop `Simplif` from causing disagreement between runtime arity and syntatic arity of a function
parsing: Attach a location to the RHS of Ptyp_alias
Merge compilerlibs/Makefile.compilerlibs into the root Makefile
This is captured by the generic framework and does thus not need
to be written here.

(Follow-up to PR ocaml#12321 merging ocamltest/Makefile into the root Makefile)
This variable was used in ocamldoc/Makefile, by the test targets that got
removed in PR ocaml#12615.
This commit introduces two private build variables: OCAMLDOC_TARGET
and OCAMLDOC_OPT_TARGET.
The ./ prefix in front of $(OCAMLDOC) and $(OCAMLDOC_OPT) is not useful
since the definitions of these variables are already prefixed with
$(ROOTDIR).
Makefile.best_ocamldoc was included in two files:

1. In ocamldoc/Makefile, which also includes

  $(ROOTDIR)/Makefile.best_binaries

So that the definitions which were in Makefile.best_ocamldoc remain available.

2. In api_docgen/ocamldoc/Makefile which also includes
Makefile.best_binaries via api_docgen/Makefile.common
STDLIB_MANPAGES is still defined in Makefile.config.in for backwards
compatibility, whereas build_libraries_manpages is defined in
Maekfile.build_config.in and thus remains private.

Co-authored-by: David Allsopp <david.allsopp@metastack.com>
Before this commit, the build_libraries_manpages variable (aka STDLIB_MANPAGES)
was set to true if the build of the manpages for libraries was
enabled at configure time, whether ocamldoc (which is required
 to build those manpages) was enabled or not. It was thus the build
system's responsibility to determine whether the manpages for
libraries should be built / installed,
by testing both the
build_ocamldoc and the build_libraries_manpages variables.

However, it is known at configure time whether ocamldoc is available
or not, which makes it possible to set build_libraries_manpages to
true only if the manpages for libraries have been requested AND
ocamldoc has been enabled.

This is what is done in this commit, leading to a simplification
on the build system's side since it becomes enough to test only
one variable, namely build_libraries_manpages, rather than two
like before.

Co-authored-by: David Allsopp <david.allsopp@metastack.com>
Merge ocamldoc/Makefile into the root Makefile
dra27 and others added 24 commits November 17, 2023 15:50
)

This is achieved by adding a CAML_BA_SUBARRAY flag that is honored by caml_ba_alloc.

Fixes: ocaml#12491
Closes: ocaml#12500
IFS not set as in the other calls to ocamltest
The previous "trick" set IFS to "\r\n" and then used parameter
separatation to strip trailing \r characters. That's a bit nefarious,
and it turns out it doesn't work (for some reason) in MSYS2.

The "simple" alternative is to pipe the output of ocamltest through tr,
but given that the while loop reading the results must necessarily use
IFS in order to read entire lines, instead this alternate trick abuses
IFS to strip the \r as a field delimiter to the while loop.
Fix ocamlnat by registering frametables correctly
Update `msvs-promote-path` to upstream's 0.6.0
* Add the following functions:

- `Random.int_in_range`
- `Random.int32_in_range`
- `Random.int64_in_range`
- `Random.nativeint_in_range`
- `Random.State.int_in_range`
- `Random.State.int32_in_range`
- `Random.State.int64_in_range`
- `Random.State.nativeint_in_range`

`Random.int_in_range ~min ~max` randomly draws an integer in the range
[min,max] (bounds included), and likewise for the other functions.

* Add chi2 tests for these functions.

* Fixes `Random.full_int` to ensure that the results are consistent between
32-bit OCaml (with 31-bit integers), 64-bit OCaml (with 63-bit integers)
and JS-of-OCaml (with 32-bit integers).  This was guaranteed in OCaml 4
but was lost in OCaml 5.0.

Co-authored-by: Xavier Leroy <xavier.leroy@college-de-france.fr>
`caml_oldify_one` is now `oldify_one`.
Fix summary computation in ocamltest
The compilerlibs were no longer compiled with -linkall, due to a
bogous line introduced in commit e1c2928,
which was part of PR ocaml#12586 (Merge compilerlibs/Makefile.compilerlibs
into the root Makefile).

The line was using a wrong GNU make syntax and a wrong variable name.

The present commit fixes both the syntax and the variable name.
Co-authored-by: Samuel Hym <samuel.hym@rustyne.lautre.net>
Avoid the MinGW implementation of this library so that the MinGW and
MSVC ports use the same WinAPI code.

Don't define HAS_UNISTD under Windows hosts, always guard its
inclusion with this macro, defined in "caml/config.h".

Co-authored-by: Antonin Décimo <antonin@tarides.com>
Co-authored-by: Samuel Hym <samuel.hym@rustyne.lautre.net>
@lthls lthls force-pushed the let-rec-constant-classification branch from b228bb1 to b1da323 Compare November 24, 2023 12:58
Use that to assert that Static bindings always have a known size
- Add Value_rec_types modules
- Rename Rec_check to Value_rec_check
@lthls lthls force-pushed the let-rec-constant-classification branch from b1800be to c3cf856 Compare November 24, 2023 13:16
lthls pushed a commit that referenced this pull request Aug 13, 2024
…l#13294)

The toplevel printer detects cycles by keeping a hashtable of values
that it has already traversed.

However, some OCaml runtime types (at least bigarrays) may be
partially uninitialized, and hashing them at arbitrary program points
may read uninitialized memory. In particular, the OCaml testsuite
fails when running with a memory-sanitizer enabled, as bigarray
printing results in reads to uninitialized memory:

```
==133712==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x4e6d11 in caml_ba_hash /var/home/edwin/git/ocaml/runtime/bigarray.c:486:45
    #1 0x52474a in caml_hash /var/home/edwin/git/ocaml/runtime/hash.c:251:35
    #2 0x599ebf in caml_interprete /var/home/edwin/git/ocaml/runtime/interp.c:1065:14
    #3 0x5a909a in caml_main /var/home/edwin/git/ocaml/runtime/startup_byt.c:575:9
    #4 0x540ccb in main /var/home/edwin/git/ocaml/runtime/main.c:37:3
    #5 0x7f0910abb087 in __libc_start_call_main (/lib64/libc.so.6+0x2a087) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
    #6 0x7f0910abb14a in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x2a14a) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
    #7 0x441804 in _start (/var/home/edwin/git/ocaml/runtime/ocamlrun+0x441804) (BuildId: 7a60eef57e1c2baf770bc38d10d6c227e60ead37)

  Uninitialized value was created by a heap allocation
    #0 0x47d306 in malloc (/var/home/edwin/git/ocaml/runtime/ocamlrun+0x47d306) (BuildId: 7a60eef57e1c2baf770bc38d10d6c227e60ead37)
    #1 0x4e7960 in caml_ba_alloc /var/home/edwin/git/ocaml/runtime/bigarray.c:246:12
    #2 0x4e801f in caml_ba_create /var/home/edwin/git/ocaml/runtime/bigarray.c:673:10
    #3 0x59b8fc in caml_interprete /var/home/edwin/git/ocaml/runtime/interp.c:1058:14
    #4 0x5a909a in caml_main /var/home/edwin/git/ocaml/runtime/startup_byt.c:575:9
    #5 0x540ccb in main /var/home/edwin/git/ocaml/runtime/main.c:37:3
    #6 0x7f0910abb087 in __libc_start_call_main (/lib64/libc.so.6+0x2a087) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
    #7 0x7f0910abb14a in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x2a14a) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
    #8 0x441804 in _start (/var/home/edwin/git/ocaml/runtime/ocamlrun+0x441804) (BuildId: 7a60eef57e1c2baf770bc38d10d6c227e60ead37)

SUMMARY: MemorySanitizer: use-of-uninitialized-value /var/home/edwin/git/ocaml/runtime/bigarray.c:486:45 in caml_ba_hash
```

The only use of hashing in genprintval is to avoid cycles, that is, it
is only useful for OCaml values that contain other OCaml values
(including possibly themselves). Bigarrays cannot introduce cycles,
and they are always printed as "<abstr>" anyway.

The present commit proposes to be more conservative in which values
are hashed by the cycle detector to avoid this issue: we skip hashing
any value with tag above No_scan_tag -- which may not contain any
OCaml values.

Suggested-by: Gabriel Scherer <gabriel.scherer@gmail.com>

Signed-off-by: Edwin Török <edwin.torok@cloud.com>
Co-authored-by: Edwin Török <edwin.torok@cloud.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.