creating code for other languages than C #38

Open
benibela opened this issue Oct 18, 2020 · 19 comments

Comments

@benibela

How much effort would it take to extend the transpiler to create code for other languages, e.g. Pascal?

@nigeltao
Collaborator

I'd expect the effort required to be relatively straightforward. A design goal from day one was to accommodate other target languages. Long term, doc/roadmap.md already lists:

  • Generate Go code.
  • Generate Rust code.

and the directory structure has a test/c directory in anticipation of others such as test/pascal.

However, in the short term, Wuffs-the-language is still changing relatively rapidly, and such changes are harder to make the more target languages (C, Go, Pascal, Rust, etc) there are.

You are obviously welcome to write your own experimental Wuffs-to-Pascal transpiler, using the github.com/google/wuffs/lang/... Go packages (start with github.com/google/wuffs/lang/generate based on how github.com/google/wuffs/internal/cgen uses it), but I'd rather not merge any such pull requests until Wuffs-the-language has stabilized.

@adsharma

I've recently added support for the following languages in py2many on top of cpp/rust:

  • Julia
  • Nim
  • Go
  • Dart
  • Kotlin

However, it doesn't have the same level of support as wuffs for parsing files in a secure way. It does however do a few things in this general direction (checking overflows when you add u8 + u8 for example).

@nigeltao
Collaborator

checking overflows when you add u8 + u8 for example

Where is this done? I don't see any overflow checking in
https://github.com/adsharma/py2many/blob/main/tests/expected/fib.go

@adsharma

adsharma commented Feb 19, 2021 via email

@benibela
Author

I've recently added support for the following languages in py2many on top of cpp/rust:

But no Pascal? That does not help me

It does however do a few things in this general direction (checking overflows when you add u8 + u8 for example).

I am not using Pascal for fun, but because I thought it was the safest language 15 years ago. In particular, Pascal has integer overflow checking and range checking on strings/arrays. That way it prevents almost all overflows (although in practice, people disable overflow checking in release builds to make it run faster).

I asked because the floating point parsing in FreePascal is both very slow and incorrectly rounded. I wanted to use Eisel-Lemire parsing in Pascal. But I had no time to implement it myself. I do not want to use the C library, because I made an open-source project, and if that combines different languages, people complain they cannot compile it (although the most common complaint is that they cannot compile Pascal). So if the Wuffs parsing could be ported to Pascal, it would be perfect.

@nigeltao
Collaborator

I wanted to use Eisel-Lemire parsing in Pascal. But I had no time to implement it myself.

It shouldn't be hard to port. It's only 80 lines of code (and 700 lines of data tables):
https://github.com/golang/go/blob/release-branch.go1.16/src/strconv/eisel_lemire.go

@nigeltao
Collaborator

nigeltao commented Feb 19, 2021

https://github.com/adsharma/py2many/blob/main/common/inference.py#L132

The test case that exercises this code path is called infer-ops.py

https://github.com/adsharma/py2many/blob/main/tests/expected/infer-ops.go

says:

func add8(x uint64, y uint64) uint64 {
	return (x + y)
}

and that can still overflow. Similarly for any size_t like type on 64-bit systems, typically used in any pointer-length bounds checks, right?

Conversely, how do you write an overflow-checked fib function, when recursion means that you can't just keep widening the types?
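
For illustration, here is a minimal Go sketch of the alternative to widening: detect the overflow at run time and report it as an error. This is not what py2many generates, and it is not how Wuffs works (Wuffs rejects possibly-overflowing arithmetic at compile time). The names `checkedAdd`, `checkedFib` and `errOverflow` are hypothetical; `bits.Add64` is from the Go standard library.

```go
package main

import (
	"errors"
	"fmt"
	"math/bits"
)

// errOverflow is a hypothetical sentinel error used only in this sketch.
var errOverflow = errors.New("uint64 overflow")

// checkedAdd returns x+y, or errOverflow if the sum does not fit in a uint64.
func checkedAdd(x, y uint64) (uint64, error) {
	sum, carry := bits.Add64(x, y, 0)
	if carry != 0 {
		return 0, errOverflow
	}
	return sum, nil
}

// checkedFib returns the nth Fibonacci number (fib(0) = 0, fib(1) = 1),
// or errOverflow if that number does not fit in a uint64.
func checkedFib(n int) (uint64, error) {
	if n <= 0 {
		return 0, nil
	}
	a, b := uint64(0), uint64(1)
	// Invariant at the top of each iteration: a = fib(i-1), b = fib(i).
	for i := 1; i < n; i++ {
		next, err := checkedAdd(a, b)
		if err != nil {
			return 0, err
		}
		a, b = b, next
	}
	return b, nil
}

func main() {
	fmt.Println(checkedFib(10)) // 55 <nil>
	fmt.Println(checkedFib(93)) // the largest Fibonacci number that fits in a uint64
	fmt.Println(checkedFib(94)) // reports errOverflow instead of wrapping around
}
```

The types never widen here: every intermediate value stays a `uint64`, and the price is that each addition can fail.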

@adsharma

adsharma commented Feb 19, 2021 via email

@adsharma

adsharma commented Feb 19, 2021 via email

@nigeltao
Collaborator

size_t + size_t = usize_t?

size_t is already an unsigned type. There is no usize_t in C, only size_t and ssize_t.

What do you suggest for those cases?

I'm sorry, but I don't have a good suggestion, because I don't think the approach can fundamentally work. At some point you have a widest integer type, and you can't widen further when you add two of them. I'm also skeptical about any imperative programming language that doesn't allow x = x + 1, where the left hand side's type is obviously the type of x, but the right hand side's type has to be wider.

@benibela
Author

I wanted to use Eisel-Lemire parsing in Pascal. But I had no time to implement it myself.

It shouldn't be hard to port. It's only 80 lines of code (and 700 lines of data tables):
https://github.com/golang/go/blob/release-branch.go1.16/src/strconv/eisel_lemire.go

That code looks easy. I was looking at the Wuffs-generated C code last year rather than the Go code, which was harder. Even harder when I tried to understand the blog posts first. But it might not be future proof to port anything to Pascal anymore.

Pascal was my first language in school. So I have a soft spot for it. But in 2021 I have to go with languages with a larger ecosystem and type safety.

Have you looked at Rust or Kotlin? They have these capabilities as well.

I have ported some parts to Kotlin, but its multiplatform support is not mature yet and it does not support 32-bit Linux. Rust is focused on safety, but it panics all the time. It is less "panic-safe" than Pascal. A language that never panics could quickly turn Rust into a legacy language.

Anyways, I do not have time to port my entire project at once. But step-by-step would have worked. Port one function to a popular, memory-safe, panic-safe language (like Wuffs? Is it popular?) that has a Pascal code generator, and then the next function. Keep distributing the Pascal code, until each function is written in the new language a few years later, and then only distribute it in the new language...

At some point you have a widest integer type, and you can't widen further when you add two of them. I'm also skeptical about any imperative programming language that doesn't allow x = x + 1, where the left hand side's type is obviously the type of x, but the right hand side's type has to be wider.

I made my own language, too. If there is a looming overflow, it switches to an arbitrary precision decimal type. That is the best way for a scripting language, but not appropriate for a system language
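
A rough Go sketch of that "promote on overflow" idea, under loud assumptions: it uses `math/big.Int` (arbitrary-precision integers) as a stand-in for the arbitrary-precision decimal type mentioned above, and `addPromoting` is a hypothetical name, not code from any project in this thread.

```go
package main

import (
	"fmt"
	"math/big"
	"math/bits"
)

// addPromoting adds two uint64 values. The fast path uses machine
// arithmetic; if the sum would overflow, it redoes the addition with
// arbitrary precision. promoted reports which path was taken.
func addPromoting(x, y uint64) (sum *big.Int, promoted bool) {
	lo, carry := bits.Add64(x, y, 0)
	if carry == 0 {
		// No overflow: the machine result is exact.
		return new(big.Int).SetUint64(lo), false
	}
	// Overflow: fall back to arbitrary-precision addition.
	s := new(big.Int).SetUint64(x)
	s.Add(s, new(big.Int).SetUint64(y))
	return s, true
}

func main() {
	sum, promoted := addPromoting(2, 3)
	fmt.Println(sum, promoted) // 5 false

	sum, promoted = addPromoting(^uint64(0), 1)
	fmt.Println(sum, promoted) // 18446744073709551616 true
}
```

The fallback allocates on the heap and has no fixed size, which is one reason this fits a scripting language better than a systems language.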

@adsharma

adsharma commented Feb 20, 2021 via email

@adsharma

adsharma commented Feb 20, 2021 via email

@nigeltao
Collaborator

My thinking is that 64 bit overflow is extremely rare in practice,

Rare still means exploitable. https://blog.chromium.org/2012/05/tale-of-two-pwnies-part-1.html discusses remote code execution due in part to a size_t overflow (and size_t is often 64 bits).
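
To make the risk concrete, here is a generic Go sketch of how a 64-bit wraparound can defeat a bounds check. It is an illustration only and does not reproduce the bug described in the linked post; `inBoundsNaive` and `inBoundsSafe` are hypothetical names.

```go
package main

import "fmt"

// inBoundsNaive checks offset+length <= bufLen. The addition can wrap
// around modulo 2^64, so a huge length can make the check pass.
func inBoundsNaive(bufLen, offset, length uint64) bool {
	return offset+length <= bufLen
}

// inBoundsSafe performs the same check without any addition, so there
// is nothing that can overflow: the subtraction is guarded by the
// preceding offset <= bufLen test.
func inBoundsSafe(bufLen, offset, length uint64) bool {
	return offset <= bufLen && length <= bufLen-offset
}

func main() {
	const bufLen = 16
	offset := uint64(8)
	length := ^uint64(0) - 3 // a huge, attacker-chosen length

	fmt.Println(inBoundsNaive(bufLen, offset, length)) // true: 8 + length wraps to 4
	fmt.Println(inBoundsSafe(bufLen, offset, length))  // false
}
```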

If py2many isn't overflow-proof, that's fine, but then it's not really playing the same game that Wuffs is, so the Wuffs issue tracker probably isn't the best place to discuss it.

@adsharma

adsharma commented Feb 21, 2021 via email

@eliasnaur

I'd rather not merge any such pull requests until Wuffs-the-language has stabilized.

What's the status here, in particular for Go output? I'd like to port a library to Go, but Wuffs would give me portability and additional safety guarantees.

@nigeltao
Collaborator

I don't think that Wuffs-the-language is stable enough yet. Sorry.

For example, commits 2f18dc6 and 2865c5b just landed a week ago, each adding new methods to the "slice of T" types.

@adsharma

adsharma commented Jan 2, 2025

Here's a python version of parse.wuffs

py2many/py2many#678

It passes tests. But some of the semantics are questionable, since the comparisons are happening in the python arbitrary precision int domain. (u32 from ctypes can't be compared AFAIK).

Would love more eyes on the translation to python and how it could be improved.

A subsequent step would be to generate rust/go code from the python example.


4 participants