refcounted_he_(new|fetch)_pvn: Don't roll-own code #22638

Merged 14 commits on Nov 28, 2024

Commits on Nov 28, 2024

  1. d1f2d9e
  2. utf8.c: White-space only

    Outdent and reflow some comments and code in preparation for moving
    them out of the loop.
    khwilliamson committed Nov 28, 2024
    13add96
  3. utf8_to_bytes() Move failure code out of loop

    This is for clarity.  All this very-unlikely-to-be-used code was in the
    middle of what is really going on, creating a distraction.
    khwilliamson committed Nov 28, 2024
    f0b9efb
  4. utf8_to_bytes: Refactor loop

    The previous version did not make sure in all cases that it wasn't
    reading beyond the end of the buffer.  Also, the first pass through the
    input string has already ruled out most problems, so the full
    generality of the macro UTF8_IS_DOWNGRADEABLE_START isn't needed here,
    which simplifies things.
    khwilliamson committed Nov 28, 2024
    b0a6997
  5. utf8_to_bytes: Update and fix comments.

    These were misleading.  On ASCII platforms, many calls to this function
    won't use the per-word algorithm.  That's only done for long-enough
    strings.
    khwilliamson committed Nov 28, 2024
    7c96860
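
The per-word idea mentioned in commit 5 can be sketched roughly as follows. This is an illustration only, not the code in utf8.c: the function name first_high_bit_byte, the WORD_SCAN_MIN_LEN threshold, and the fixed 8-byte word are all assumptions made for the example. It looks for the first byte with its high bit set, examining a word at a time when the string is long enough and falling back to a per-byte loop otherwise.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical threshold: shorter strings use only the per-byte loop. */
#define WORD_SCAN_MIN_LEN (2 * sizeof(uint64_t))

/* Returns the offset of the first byte in s[0..len) whose high bit is
 * set, or len if every byte is ASCII. */
static size_t first_high_bit_byte(const unsigned char *s, size_t len)
{
    size_t i = 0;

    if (len >= WORD_SCAN_MIN_LEN) {
        const uint64_t high_bits = UINT64_C(0x8080808080808080);

        /* Examine eight bytes per iteration while a full word remains. */
        while (len - i >= sizeof(uint64_t)) {
            uint64_t word;

            memcpy(&word, s + i, sizeof word);  /* safe unaligned load */
            if (word & high_bits)
                break;          /* some byte in this word is non-ASCII */
            i += sizeof word;
        }
    }

    /* Per-byte loop: pins down the exact byte and handles the tail. */
    while (i < len && s[i] < 0x80)
        i++;

    return i;
}
```

The word loop never reads past the end of the buffer, because it only runs while at least sizeof(uint64_t) bytes remain.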
  6. utf8_to_bytes: Rename variable

    The new name, s0, is used with this meaning in more places elsewhere,
    and is more descriptive.
    khwilliamson committed Nov 28, 2024
    51f780c
  7. Add preliminary utf8_to_bytes_()

    This is an internal function, designed to be an extension of
    utf8_to_bytes(), with a slightly different API.  This commit just adds
    it and calls it only from utf8_to_bytes().
    
    Future commits will extend this API.
    khwilliamson committed Nov 28, 2024
    2be76dc
  8. utf8_to_bytes_: Add const

    This variable should not be changed by the function.
    khwilliamson committed Nov 28, 2024
    dc5f672
  9. utf8_to_bytes_: Add argument, macro

    The argument is currently unused.  The macro is a public-facing API
    that calls this function with the correct argument.
    khwilliamson committed Nov 28, 2024
    67dfdb5
  10. utf8_to_bytes_: Slight refactor

    This makes the next commit smaller.
    khwilliamson committed Nov 28, 2024
    342278c
  11. utf8_to_bytes_: Add non-destructive write option

    This lets the function either overwrite the input or instead create
    new memory for the result.  It changes bytes_from_utf8() to use this
    new capability instead of being a near duplication of the core code of
    this function.
    
    Prior to this commit, bytes_from_utf8() just allocated memory the size
    of the original string and started copying into it.  When it came to a
    sequence that wasn't convertible, it stopped and freed the copy.  The
    new behavior first checks that the string is convertible, before the
    malloc.  The advantage is that there is no malloc unless it is sure to
    be useful; the disadvantage is an extra pass through the input string,
    though that pass is per-word.
    
    The next commit will introduce another advantage.
    
    Thanks to Tony Cook for the 'free_me' idea.
    khwilliamson committed Nov 28, 2024
    82a91b9
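
A minimal sketch of the check-before-malloc approach from commit 11, assuming an ASCII platform. The names check_downgradeable and downgrade_copy, and the signature, are hypothetical and are not the real utf8_to_bytes_() interface. Only the free_me idea is taken from the commit message: the caller gets back a pointer to free if memory was created, or NULL if not. Returning an all-ASCII input unchanged without allocating is an assumption of this sketch, and the checking pass here is per-byte rather than per-word.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* First pass: true if every character is below 256, i.e. each byte is
 * ASCII or belongs to a two-byte sequence starting with 0xC2 or 0xC3.
 * Sets *changed if at least one two-byte sequence was seen. */
static bool check_downgradeable(const unsigned char *s, size_t len,
                                bool *changed)
{
    *changed = false;
    for (size_t i = 0; i < len; ) {
        if (s[i] < 0x80) {
            i++;
        }
        else if ((s[i] == 0xC2 || s[i] == 0xC3)
                 && i + 1 < len && (s[i + 1] & 0xC0) == 0x80)
        {
            *changed = true;
            i += 2;
        }
        else {
            return false;       /* not representable as single bytes */
        }
    }
    return true;
}

/* Hypothetical API.  On success, *out points at the converted bytes,
 * *out_len is their count, and *free_me is the allocation the caller
 * must free(), or NULL if none was made.  On failure nothing is
 * allocated. */
static bool downgrade_copy(const unsigned char *s, size_t len,
                           const unsigned char **out, size_t *out_len,
                           void **free_me)
{
    bool changed;

    *free_me = NULL;
    if (!check_downgradeable(s, len, &changed))
        return false;           /* checked before malloc: nothing to undo */

    if (!changed) {             /* all ASCII: the input is already right */
        *out = s;
        *out_len = len;
        return true;
    }

    unsigned char *d = malloc(len);     /* worst-case size; len > 0 here */
    if (d == NULL)
        return false;

    size_t n = 0;
    for (size_t i = 0; i < len; ) {
        if (s[i] < 0x80) {
            d[n++] = s[i++];
        }
        else {                  /* validated above, so s[i + 1] exists */
            d[n++] = (unsigned char) (((s[i] & 0x03) << 6)
                                      | (s[i + 1] & 0x3F));
            i += 2;
        }
    }

    *out = d;
    *out_len = n;
    *free_me = d;
    return true;
}
```

A caller can use *out whether or not memory was created, and afterwards call free() on *free_me unconditionally, since free(NULL) is a no-op.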
  12. utf8_to_bytes_: Calculate needed malloc size

    Prior to this commit, the size malloced was just the same as the length
    of the input string, which is the worst-case scenario.  This commit
    changes that, so that the new pass through the input (introduced in the
    previous commit) also calculates the needed length.
    
    The additional cost of doing this is minimal.  It has advantages on a
    very long string with lots of sequences that are convertible.
    khwilliamson committed Nov 28, 2024
    eaf0c6e
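
The size calculation in commit 12 rests on simple arithmetic: each convertible two-byte sequence becomes a single byte, so the needed length is the input length minus the number of such sequences. The sketch below validates and counts in one pass on an ASCII platform. The name downgraded_length is hypothetical, and this version is per-byte for readability, whereas the commit message describes the real pass as per-word.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true and stores the number of bytes the converted string
 * needs in *needed if s[0..len) contains only characters below 256;
 * returns false otherwise, leaving *needed untouched. */
static bool downgraded_length(const unsigned char *s, size_t len,
                              size_t *needed)
{
    size_t out = 0;
    size_t i = 0;

    while (i < len) {
        if (s[i] < 0x80) {
            i += 1;             /* ASCII: one byte in, one byte out */
        }
        else if ((s[i] == 0xC2 || s[i] == 0xC3)
                 && i + 1 < len && (s[i + 1] & 0xC0) == 0x80)
        {
            i += 2;             /* two bytes in, one byte out */
        }
        else {
            return false;       /* not convertible */
        }
        out++;
    }

    *needed = out;
    return true;
}
```

For an input consisting entirely of two-byte sequences, the exact size is half the worst-case size, which is where the saving on long strings comes from.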
  13. utf8_to_bytes_: Add ability to return a mortalized pv

    This is a non-destructive conversion of the input into native bytes,
    with any new memory required set for destruction via SAVEFREEPV.  The
    caller thus doesn't have to be concerned at all with whether memory was
    created or not.
    
    A new macro is created that calls this internal function with the
    correct parameter to force this behavior.
    khwilliamson committed Nov 28, 2024
    1a87086
  14. 285efd1