Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] Improve language coherence for the behaviour of offsets and containers: object handlers #14920

Draft
wants to merge 58 commits into
base: master
Choose a base branch
from

Conversation

Girgias
Copy link
Member

@Girgias Girgias commented Jul 11, 2024

Part of https://wiki.php.net/rfc/container-offset-behaviour

This PR mainly deals with the object handler part of the RFC.

@iluuu1994
Copy link
Member

The benchmark shows a degradation larger than I expected (0.22% for Symfony). Any idea where this is coming from?

@Girgias
Copy link
Member Author

Girgias commented Jul 11, 2024

The benchmark shows a degradation larger than I expected (0.22% for Symfony). Any idea where this is coming from?

Some ideas that might cause this:

  • The new algorithm for isset() which means we are calling more userland functions
  • Not caching the function pointers in something like the arrayaccess_funcs_ptr struct, so we always need to look them up
  • Opcache not really being implemented, as the benchmarking CI pipeline used to segfault prior to me doing the minimal opcache changes

Comment on lines +30 to +31
bool (*/* const */ has_dimension)(zend_object *object, zval *offset);
zval *(*/* const */ fetch_dimension)(zend_object *object, zval *offset, zval *rv);
Copy link
Contributor

@morrisonlevi morrisonlevi Jul 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've looked at a few places and didn't see any. Does the engine ever call has_dimension without ever calling one of the other handlers? It seems like it's always followed by one.

Potentially, we should stop calling has_dimension and rely on the error handling that still seems to exist in the other dimension handlers. It certainly looks like they cannot assume that they are only called when has_dimension is true. And in the case of hash lookups, the lookup is going to get repeated if we call has_dimension and read_dimension, for example.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's reasonable to "just" call the read_dimension and not call the had_dimension handler, as this would be confusing for users.

I suppose what I could do is what ArrayAccess used to do is cache the zend_function pointers so we don't need to look them up.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's reasonable to "just" call the read_dimension and not call the had_dimension handler, as this would be confusing for users.

Can you expand? What's confusing?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The following class and code should not cause any warnings (or even exceptions) to be emitted.

I don't see how you'd ensure no warnings (which may or may not be promoted to exceptions) or exceptions are being thrown in this instance.

class Map implements DimensionReadable {
    private array $map = [];

    public function offsetGet(mixed $offset): mixed {
        return $this->map[$offset];
    }

    public function offsetExists(mixed $offset): bool {
        return array_key_exists($offset, $this->map);
    }
}

$map = new Map();
var_dump(isset($map['undef']));

Copy link
Contributor

@morrisonlevi morrisonlevi Jul 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't mean userland, I mean internally. There are C-level calls to has_dimension which get followed up by calls to things like read_dimension. Note that in your example the hash lookup happens twice. Internal calls should be able to avoid that, I think. Or at least, I would try to design the APIs that way. If read_dimension returns NULL, it clearly did not have that dimension. See what I'm saying? Look at zend_weakmap_has_dimension and zend_weakmap_read_dimension for what I mean; it's effectively doing all the same work twice if you call has_dimension before read_dimension. Should be able to coalesce the two, for internal handlers (not userland interface methods).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The difference between a C handler and the userland function is basically non-existent, having it be different is just confusing.

The hash look_up would have only happened if you were calling the userland version anyway and did not have an optimized C version of the handler.

The API is literally just has_dimension == call to offsetExists(), read_dimension == call to offsetGet(), etc.
But all the internal extensions basically implement an overloaded C version which does not call a userland function.

Also about the hash look-up, this should be a non-issue now as I decided to use a union to either store the zend_function* for userland classes or the C handler function for internal extensions.

@SakiTakamachi
Copy link
Member

PDO part looks good to me.

Shot in the dark via CI
This is a seperate part of the RFC, but to already have a test
The implementation is the best I could come up after this was noticed
I need to implement the autovivify handler in ArrayObject

This reverts commit 5c9fc22.
@Girgias Girgias force-pushed the zend_container_offset_semantics branch from 89509f0 to 3a53217 Compare July 18, 2024 02:14
Not sure why did this not fail localy
@Girgias Girgias force-pushed the zend_container_offset_semantics branch from fe5d079 to 8e5d564 Compare July 18, 2024 03:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants