Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle escaping of { and } in FormattableString #3890

Merged
merged 6 commits into from
Sep 19, 2024

Conversation

roboz0r
Copy link
Contributor

@roboz0r roboz0r commented Sep 18, 2024

Fixes #3889 by replacing {{ and }} with { and }. Adds tests to confirm correct unescaping.

@roboz0r
Copy link
Contributor Author

roboz0r commented Sep 18, 2024

Not sure how I missed this last night but this change achieves the expected output for JS template literals but it breaks an existing test:

    let s5: FormattableString = $"I have {{escaped braces}} and %%percentage%%"
    s5.Format |> equal "I have {{escaped braces}} and %percentage%"

There is a difference between _.Format and _.ToString()

open System
let s:FormattableString = $"""I have {{escaped braces}} and %%percentage%%"""
s.Format
s.ToString()

--- FSI Output ---

- s.Format;;
val s: System.FormattableString = I have {escaped braces} and %percentage%
val it: string = "I have {{escaped braces}} and %percentage%"

> s.ToString();;
val it: string = "I have {escaped braces} and %percentage%"

I am unsure whether it is simply I need to change the implementation of _.Format or if I have misunderstood the problem more fundamentally?

@MangelMaxime
Copy link
Member

I am not sure what direction to take either. I never played deeply with string interpolation escaping or the custom GetStrings from Fable.

While looking for a fix my approach was different.

Instead of having a direct access to the underlying strs. What I did, is create a getStrings function in JavaScript which handles the call to cssNew.GetStrings().

My reasoning is it looks like the escaping problem only happens when using GetStrings for fixing it only for this call make sense to me.

And it also, fix it at a single place where your solution needs to modify the string/parts at 2 different places.

CleanShot 2024-09-18 at 18 21 18

What do you think of this approach ?

@roboz0r
Copy link
Contributor Author

roboz0r commented Sep 18, 2024

How does that approach affect the generated template literals? Specifically, the template literal shouldn't include {{

I think that StringTests.js should look like:

equal("I have {{escaped braces}} and %percentage%", getFormat(fmt_2`I have {escaped braces} and %percentage%`));

@MangelMaxime
Copy link
Member

How does that approach affect the generated template literals? Specifically, the template literal shouldn't include {{

It doesn't affect it and the tests are all greens.

CleanShot 2024-09-18 at 18 34 49

I am not sure what we want as an output actually 😅

I would have liked if it was as simple as we want the user enter is what we want in the output. But it seems like it will be more difficult than that

let formatString0 : FormattableString = $"""{{}}"""
let formatString1 : FormattableString = $"""{{}}"""
let formatString2 : FormattableString = $$"""{}"""
let formatString3 : FormattableString = $$$"""{}"""
let formatString4 : FormattableString = $$$"""{{}}"""

Current implementation (compatible with my proposition)

export const formatString0 = fmt`{{}}`;
export const formatString1 = fmt`{{}}`;
export const formatString2 = fmt`{{}}`;
export const formatString3 = fmt`{{}}`;
export const formatString4 = fmt`{{{{}}}}`;

Your implementation

export const formatString0 = fmt`{}`;
export const formatString1 = fmt`{}`;
export const formatString2 = fmt`{}`;
export const formatString3 = fmt`{}`;
export const formatString4 = fmt`{{}}`;

@MangelMaxime
Copy link
Member

Looking at Fable code, contrary to my primary fear it seems like String.get_Format is only used when working withFormattableString.

This is the only place I see it being used:

let formattableString
(com: ICompiler)
(_ctx: Context)
r
(t: Type)
(i: CallInfo)
(thisArg: Expr option)
(args: Expr list)
=
match i.CompiledName, thisArg, args with
// Even if we're going to wrap it again to make it compatible with FormattableString API, we use a JS template string
// because the strings array will always have the same reference so it can be used as a key in a WeakMap
// Attention, if we change the shape of the object ({ strs, args }) we need to change the resolution
// of the FormattableString.GetStrings extension in Fable.Core too
| "Create", None, [ StringConst str; Value(NewArray(ArrayValues args, _, _), _) ] ->
let matches = Regex.Matches(str, @"\{\d+(.*?)\}") |> Seq.cast<Match> |> Seq.toArray
let hasFormat = matches |> Array.exists (fun m -> m.Groups[1].Value.Length > 0)
let tag =
if not hasFormat then
Helper.LibValue(com, "String", "fmt", Any) |> Some
else
let fmtArg =
matches
|> Array.map (fun m -> makeStrConst m.Groups[1].Value)
|> Array.toList
|> makeArray String
Helper.LibCall(com, "String", "fmtWith", Any, [ fmtArg ]) |> Some
let holes =
matches
|> Array.map (fun m ->
{|
Index = m.Index
Length = m.Length
|}
)
let template = makeStringTemplate tag str holes args |> makeValue r
// Use a type cast to keep the FormattableString type
TypeCast(template, t) |> Some
| "get_Format", Some x, _ -> Helper.LibCall(com, "String", "getFormat", t, [ x ], ?loc = r) |> Some
| "get_ArgumentCount", Some x, _ -> getFieldWith r t (getField x "args") "length" |> Some
| "GetArgument", Some x, [ idx ] -> getExpr r t (getField x "args") idx |> Some
| "GetArguments", Some x, [] -> getFieldWith r t x "args" |> Some
| _ -> None

Something to not too, is that how we decide to represents output in this PR will probably not only have impact on JavaScript but also the others targets. Because it seems like they are using the same output format.

@roboz0r Is there a case where not using the current output format cause issue?

@ncave @dbrattli Any opinion on how we should handle FormattableString representation ?

@roboz0r
Copy link
Contributor Author

roboz0r commented Sep 18, 2024

I think that my implementation is the right way to go then. I don't think template literals require escaping of { so having

export const formatString0 = fmt`{{}}`;

Doesn't give the right semantics as

// JS
`{{}}` === "{{}}" // true
`{{}}` === "{}" // false
//F#
$"""{{}}""" = "{}" // true

This would be possibly breaking existing code but if existing template literals were always wrapped in fmt() then it wouldn't affect anything.

If we make the change then fmt wouldn't be required for js functions accepting template literals directly.

@ncave
Copy link
Collaborator

ncave commented Sep 18, 2024

@MangelMaxime IMO we should try to keep string interpolation as similar to F# on .NET as possible:
https://github.com/dotnet/docs/blob/main/docs/fsharp/language-reference/interpolated-strings.md#extended-syntax-for-string-interpolation

roboz0r referenced this pull request in fable-compiler/Fable.Lit Sep 19, 2024
@roboz0r
Copy link
Contributor Author

roboz0r commented Sep 19, 2024

One of the things I don't like about doing unescaping in getStrings is it's causing runtime string allocations each time it is called. Ideally, we should just be passing through the runtime optimized TemplateStringsArray unchanged when passing a format string to a JS API.

Trying to stick closer to the .NET API makes me think we should reexamine the FormattableString type definition in String.ts

interface FormattableString {
strs: TemplateStringsArray,
args: any[],
fmts?: string[]
}

I sketched up a type that more closely matches the .NET type definition of FormattableString. I'm really not a JS dev so please change any non-idiomatic usage.

public abstract class FormattableString : IFormattable
{
    [StringSyntax(StringSyntaxAttribute.CompositeFormat)]
    public abstract string Format { get; }
    public abstract object?[] GetArguments();
    public abstract int ArgumentCount { get; }
    public abstract object? GetArgument(int index);
    public abstract string ToString(IFormatProvider? formatProvider);

    string IFormattable.ToString(string? ignored, IFormatProvider? formatProvider) =>
        ToString(formatProvider);

    public static string Invariant(FormattableString formattable)
    {
        ArgumentNullException.ThrowIfNull(formattable);
        return formattable.ToString(Globalization.CultureInfo.InvariantCulture);
    }

    public static string CurrentCulture(FormattableString formattable)
    {
        ArgumentNullException.ThrowIfNull(formattable);
        return formattable.ToString(Globalization.CultureInfo.CurrentCulture);
    }

    public override string ToString() =>
        ToString(Globalization.CultureInfo.CurrentCulture);
}

Concept

abstract class FormattableString {
    // Should this just be an interface?
    // Unsure about default naming conventions for JS/TS.
    abstract get Format(): string;
    abstract get Arguments(): any[] | undefined;
    abstract get ArgumentCount(): number;
    abstract getArgument(index: number): any;
    // toString() should return the formatted string. Currently in fable it returns [object Object]
    abstract toString(): string; // TODO: How to add IFormatProvider?
}

// Long Name. Consider "TemplatedString", "FableString", etc.
class TemplatedFormattableString extends FormattableString {
    readonly strs: TemplateStringsArray
    readonly fmts?: ReadonlyArray<string> // The constant portion of the string is truly immutable.
    readonly args?: any[] // The arguments are mutable, to match the .NET API.

    constructor(strs: TemplateStringsArray, fmts: string[], args: any[]) {
        super();
        this.strs = strs;
        this.fmts = fmts;
        this.args = args;
    }

    static createWithFmts(fmts: string[]) {
        return (strs: TemplateStringsArray, ...args: any[]) => {
            if (args.length !== strs.length - 1 || fmts.length !== args.length) {
                throw new Error("The number of format strings must match the number of expressions.");
            }
            let x = new TemplatedFormattableString(strs, fmts, args);
        }
    }

    static create(strs: TemplateStringsArray, ...args: any[]) {
        if (args.length !== strs.length - 1) {
            throw new Error("The number of format strings must match the number of expressions.");
        }
        return new TemplatedFormattableString(strs, [], args);
    }

    get Format(): string {
        // Inefficient, but it works to return the .NET style format string.
        // Consider caching the result in a private field.
        return this.fmts
            ? this.strs.map((s, _) => s.replace('{', '{{').replace('}', '}}'))
                .reduce((acc, newPart, index) => acc + `{${String(index - 1) + this.fmts![index - 1]}}` + newPart)
            : this.strs.map((s, _) => s.replace('{', '{{').replace('}', '}}'))
                .reduce((acc, newPart, index) => acc + `{${index - 1}}` + newPart);
    }

    get Arguments(): any[] | undefined {
        return this.args;
    }

    get ArgumentCount(): number {
        return this.args?.length ?? 0;
    }

    getArgument(index: number): any {
        return this.args![index];
    }

    private static applyFormat(format: string, arg: any): string {
        // TODO: Use the format string to format the argument.
        return arg.toString();
    }

    toString(): string {
        if (this.args === undefined) {
            return this.strs[0];
        } else {
            return this.strs.reduce((acc, newPart, i) => acc + (TemplatedFormattableString.applyFormat(this.fmts![i - 1], this.args![i - 1])) + newPart);
        }
    }
}

function fmt(strs: TemplateStringsArray, ...args: any[]) { return TemplatedFormattableString.create(strs, args); }
function fmtWith(fmts: string[]) { return (strs: TemplateStringsArray, ...args: any[]) => TemplatedFormattableString.createWithFmts(fmts)(strs, args); }

Usage

let speedOfLight = 299792.458  
let frmtStr = $"The speed of light is {speedOfLight:N3} km/s." : FormattableString

Becomes

const speedOfLight = 299792.458;
const frmtStr = fmtWith([":N3"])`The speed of light is ${X_speedOfLight} km/s.`;

let color = "blue"
Lit.css 
        $""":host {{
            color: {color};
        }}"""

Becomes

function css(strs: TemplateStringsArray, ...args: any[]) {
    // As imported by e.g. lit
}

const color = "blue";
const s = fmt`:host {
            color: ${color};
        }`;
// Default path for template literals has no extra string allocations provided we unescape "{{" to "{" at compile time.
return css(s.strs, s.args);

@MangelMaxime
Copy link
Member

We avoid creating classes in fable-library so bundler have a chance to remove dead code. I don't know if now days they can do it for classes or not.

Trying to stick closer to the .NET API makes me think we should reexamine the FormattableString type definition in String.ts

What we mean by that in general, is that the behaviour at runtime should be same. Not necessary that we use a class if they use a class, etc.

@MangelMaxime MangelMaxime merged commit d61c993 into fable-compiler:main Sep 19, 2024
18 checks passed
@MangelMaxime
Copy link
Member

Note, I just said that we avoid using class in fable-library but working on another fix I just found that we do use class sometimes 😅.

So let's say it depends and can be used.

@roboz0r I merged the PR as is because it seemed to fix your original issue and from what I understood using a class would just re-structured but with the same behaviour.

If I was wrong feel free to tell me and we can merge another PR.

@roboz0r
Copy link
Contributor Author

roboz0r commented Sep 19, 2024

@MangelMaxime Thanks for your help getting this one completed. The class version didn't change any behaviour but was just trying to better capture the semantics of the .NET API. I'll trust your judgement whether to go with a class def or leave the interface as is.

While I was working on it, I did identify that FormattableString.ToString() needs an implementation to produce the formatted string instead of returning [object Object]. That can be a separate issue/task.

@MangelMaxime
Copy link
Member

Yes, there a few places where if you use printfn "%A" myValue then you get [object Object].

But if you use JS.console.log(myValue) you often get something more useable. I never really dig this issue and always assume it was a limitation to how some types are represented or how Fable handle printing values. But if we can improve it that's good :)

Thanks for the work done in this issue, I want to fix a few other ones before making a release but should not take too long.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

System.FormattableString results in doubled '{' in JavaScript output
3 participants