PHPLIB-1137: Reject PackedArrays when expecting documents #1117

Merged (7 commits) on Jun 27, 2023

Member

@alcaeus alcaeus commented Jun 22, 2023

PHPLIB-1137

This PR adds a new is_document function that is used to validate input options. While we accept array|object for documents, a PackedArray instance is one of those objects where we are absolutely sure that it is not a document.
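
As a rough illustration of the behaviour described above, the sketch below reproduces the check with a local stand-in class. The empty `PackedArray` class is an assumption made so the example runs without the MongoDB extension; in the library the check targets the extension's `PackedArray` type.

```php
<?php

// Stand-in for the extension's PackedArray class (assumption for this sketch).
final class PackedArray
{
}

// Same expression as the is_document() function added by this PR.
function is_document($document): bool
{
    return is_array($document) || (is_object($document) && ! $document instanceof PackedArray);
}

var_dump(is_document(['x' => 1]));        // bool(true)
var_dump(is_document(new stdClass()));    // bool(true)
var_dump(is_document([1, 2, 3]));         // bool(true): PHP lists are not rejected
var_dump(is_document(new PackedArray())); // bool(false)
var_dump(is_document('foo'));             // bool(false)
```

Note that a plain PHP list still passes; only values that are certainly not documents are rejected.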

This PR also includes a commit to refactor most data providers that provide options to operation classes. This was done to give the individual data sets a name, making it easier to spot failing tests. Due to the large number of unrelated changes in that first commit, I suggest reviewing the commits separately.

@alcaeus alcaeus requested review from jmikola and GromNaN June 22, 2023 14:28
src/functions.php
*/
function is_document($document): bool
{
    return is_array($document) || (is_object($document) && ! $document instanceof PackedArray);
Member

Noting that we don't bother checking whether an array is a list, since APIs that take arrays as documents generally use an object cast to ensure proper BSON encoding as a document type.

Member

Ok, but can a list be a valid value for the commands? Can 0 be used as a field name?

If not, it may help developers if they're notified as soon as possible that the value they've passed is a list of documents, when a single document is expected.

Member Author

> Ok, but can a list be a valid value for the commands?

Strictly speaking, a list is not a valid document, even though it has been treated as such in PHPLIB/PHPC.

> Can 0 be used as a field name?

Yes. Arrays are serialised like objects in BSON, but denoted through a special type. Likewise, inserting a structure like [0 => 'foo', 2 => 'bar'] would always yield an object with "0" and "2" as keys (since the value isn't a list).

> If not, it may help developers if they're notified as soon as possible that the value they've passed is a list of documents, when a single document is expected.

I agree, which is why we're prohibiting PackedArray, as we're always sure that it's an array, not an object. For PHP arrays however this isn't as simple and sometimes performance sensitive, which is why it wasn't added before (I'll leave it to @jmikola to confirm this). There is one more difference: with a structure like [0 => 'foo', 1 => 'bar'], the trained eye sees an array, but it can easily be used as an object. A PackedArray is by definition always an array, which is why we're prohibiting it.
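
The list-versus-document distinction discussed in this thread can be sketched in plain PHP. This is an analogy only: `json_encode()` stands in for BSON encoding so the example runs without the MongoDB extension, and `is_list_sketch()` is a hypothetical helper, not part of the library.

```php
<?php

// Hypothetical helper; PHP 8.1+ ships array_is_list() with the same meaning.
function is_list_sketch(array $value): bool
{
    return $value === [] || array_keys($value) === range(0, count($value) - 1);
}

$list   = [0 => 'foo', 1 => 'bar'];
$sparse = [0 => 'foo', 2 => 'bar'];

var_dump(is_list_sketch($list));   // bool(true)
var_dump(is_list_sketch($sparse)); // bool(false)

// Sequential keys encode as an array, anything else as an object.
echo json_encode($list), PHP_EOL;          // ["foo","bar"]
echo json_encode($sparse), PHP_EOL;        // {"0":"foo","2":"bar"}

// An object cast forces object (document-like) encoding even for a list.
echo json_encode((object) $list), PHP_EOL; // {"0":"foo","1":"bar"}
```

The last line mirrors the object cast mentioned earlier in the thread: once cast, a list is encoded with "0" and "1" as field names.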

Member

@jmikola jmikola Jun 26, 2023

> PHP arrays however this isn't as simple and sometimes performance sensitive

I'm not sure what this is referring to. Regarding performance, I know we definitely didn't want to call bsonSerialize() too many times for differentiating things that might encode as BSON documents or arrays.

I agree with the change here, though. PackedArray is the only object that we know for certain will not be a BSON document.

@@ -139,6 +140,11 @@ public function provideInvalidIntegerValues()
        return $this->wrapValuesForDataProvider($this->getInvalidIntegerValues());
    }

    public function provideInvalidUpdateValues()
Member

Is this only used for Operation classes? If so, you can consider moving it to MongoDB\Tests\Operation\TestCase, which is where I put other providers.

Member Author

Ah, good point. I've moved this and getInvalidUpdateValues for consistency.

This also provides a nice refactoring opportunity down the line once we make all our data providers static (which is required in later PHPUnit versions): we can offload these data providers to separate classes or traits and import them as necessary.

Member

Moving our data providers and utility functions into traits has been on my mind for a few months now. Static methods are great, but I suppose there's nothing holding up an immediate trait migration today.

I created PHPLIB-1170 to track this.

Do all versions of PHPUnit allow static data providers? I saw that sebastianbergmann/phpunit@9caafe2 now requires static as of PHPUnit 10+, but I'm curious if anything is holding us up from making the change today. Our PHP 7.2+ requirement means we'll only be using PHPUnit 8.5+.

Member Author

Yes, static data providers have been allowed for quite some time but have only become required as of PHPUnit 10. We could make that change already.
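
A minimal sketch of what a static, trait-based provider with named data sets might look like. The trait, class, and method names are hypothetical, and the `foreach` loop merely stands in for the test runner so the example runs without PHPUnit.

```php
<?php

// Hypothetical trait; the thread only proposes moving providers into traits.
trait ProvidesInvalidDocumentValues
{
    // Static, as PHPUnit 10+ requires. The string keys name each data set,
    // which makes a failing case easier to identify in test output.
    public static function provideInvalidDocumentValues(): Generator
    {
        yield 'integer' => [123];
        yield 'float' => [3.14];
        yield 'string' => ['foo'];
        yield 'bool' => [true];
    }
}

// Any test case can import the provider instead of inheriting it.
final class ExampleTestCase
{
    use ProvidesInvalidDocumentValues;
}

// Stand-in for the test runner: a static provider needs no test case instance.
foreach (ExampleTestCase::provideInvalidDocumentValues() as $name => $arguments) {
    printf("%s: %s\n", $name, gettype($arguments[0]));
}
```

Because the provider is static, it can be called without constructing the test case, which is what allows it to be shared through traits or separate classes.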

@alcaeus alcaeus force-pushed the phplib-1137-reject-packedarray branch 2 times, most recently from 65883a2 to 73af4cb on June 26, 2023 07:12
@alcaeus alcaeus requested a review from jmikola June 26, 2023 07:12
Member

@GromNaN GromNaN left a comment

It's great to have factored out this recurring code pattern.

- if (isset($options['filter']) && ! is_array($options['filter']) && ! is_object($options['filter'])) {
-     throw InvalidArgumentException::invalidType('"filter" option', $options['filter'], 'array or object');
+ if (isset($options['filter']) && ! is_document($options['filter'])) {
+     throw InvalidArgumentException::invalidType('"filter" option', $options['filter'], 'document');
Member

"document" is not a native PHP type. The term should be defined somewhere. Maybe update the message in MongoDB\Exception\InvalidArgumentException.

- Expected %s to have type "document" but found "string"
+ Expected %s to have type "document" (array or object) but found "string"

Member Author

Ah, yes. I was trying to avoid "array or object" since that would include all objects including PackedArray. I can add it if we want to gloss over the packed array case. What do you think @jmikola?

Member

I thought I left a comment about my concern with using "document" as a type since we use actual PHP types everywhere else. I can't find that so perhaps I deleted the draft before submitting my last review.

In any event, I agree that something like "document" (array or object) would be preferable. We don't need to call out PackedArray specifically, as anyone using that should understand that it's not a document -- and if it leads to a bug report it'd be a nice opportunity to educate someone :)

Member

If you pass document (array or object) to the invalidType() factory method I suppose it's going to be wrapped in double quotes. Perhaps you just want to pass document and let the factory method deal with formatting -- but if that's a bit hairy (especially with the existing logic to handle arrays) then maybe just let it get quoted as-is.

Member

@GromNaN GromNaN Jun 26, 2023

> Perhaps you just want to pass document and let the factory method deal with formatting

That's what I was thinking: factor the message formatting into the exception class.

Member Author

I've created a new expectedDocumentType factory for this case - it seemed a more sensible choice than messing around with the formatting.
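
A hedged sketch of what such a factory could look like, using the message wording proposed earlier in this thread. The class name, method signature, and type-detection logic here are assumptions for illustration; the library's actual `expectedDocumentType` implementation may differ.

```php
<?php

// Hypothetical stand-in for the library's exception class.
final class DocumentTypeException extends InvalidArgumentException
{
    // Assumed signature: the option name plus the offending value.
    public static function expectedDocumentType(string $name, $value): self
    {
        // Report the class name for objects and the PHP type otherwise.
        $found = is_object($value) ? get_class($value) : gettype($value);

        return new self(sprintf(
            'Expected %s to have type "document" (array or object) but found "%s"',
            $name,
            $found
        ));
    }
}

$e = DocumentTypeException::expectedDocumentType('"filter" option', 'foo');
echo $e->getMessage(), PHP_EOL;
// Expected "filter" option to have type "document" (array or object) but found "string"
```

Keeping the wording in a dedicated factory means call sites only pass the option name and value, and the quoting of "document" stays out of the general-purpose `invalidType()` formatting.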

Member

@jmikola jmikola left a comment

LGTM once you add "array or object" to the "document" type exception message.

@alcaeus alcaeus force-pushed the phplib-1137-reject-packedarray branch from 028bd7e to e9e5636 on June 27, 2023 08:27
@alcaeus alcaeus force-pushed the phplib-1137-reject-packedarray branch from e9e5636 to 3af1d1f on June 27, 2023 08:43
@alcaeus alcaeus merged commit 88731a5 into mongodb:v1.16 Jun 27, 2023
11 checks passed
@alcaeus alcaeus deleted the phplib-1137-reject-packedarray branch June 27, 2023 13:54
alcaeus added a commit that referenced this pull request Jun 27, 2023
* v1.16:
  PHPLIB-1137: Reject PackedArrays when expecting documents (#1117)
GromNaN added a commit to GromNaN/mongo-php-library that referenced this pull request Jun 27, 2023
GromNaN added a commit to GromNaN/mongo-php-library that referenced this pull request Jul 7, 2023
GromNaN added a commit to GromNaN/mongo-php-library that referenced this pull request Jul 18, 2023
GromNaN added a commit to GromNaN/mongo-php-library that referenced this pull request Jul 24, 2023