Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v5.0.0 compatible with PuppeteerSharp >= 19.0.2 and tested #2

Closed
wants to merge 12 commits into from

Conversation

CaCTuCaTu4ECKuu
Copy link

  • Bump dependencies accordinng to PuppeteerSharp

PuppeteerSharp => 19.0.0 removed Newtonsoft.Json dependency

  • Update and fix

  • Bump PuppeteerSharp.Dom version

  • Update tests to work with PuppeteerSharp>=19.0.2

Update tests to use System.Text.Json

  • Add DomHandle.ParseJSValueTo

  • Values parsing in DomHandle.Evaluate and CodeGen.Partial Element.GetAttributeAsync

PuppeteerSharp use types of variables from JS (maybe due to transfer from Newtonsoft.Json to System.Text.Json) so it's not working and actually it make sense since C# is stronly typed

Try to parse evaluation result to requested generic type

Use ParseJSValueTo to parse evaluation result

  • Update README.md

PuppeteerSharp => 19.0.0 removed Newtonsoft.Json dependency
* Update and fix
* Bump PuppeteerSharp.Dom version
Update tests to use System.Text.Json
PuppeteerSharp use types of variables from JS (maybe due to transfer from Newtonsoft.Json to System.Text.Json) so it's not working and actually it make sense since C# is stronly typed

Try to parse evaluation result to requested generic type
Optimize code
Make public static
Use ParseJSValueTo to parse evaluation result
Missing ConfigureAwait
Copy link
Contributor

@amaitland amaitland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR 👍

Comments inline.

Given we are no responsible for type cohesion, I think whatever implementation is chosen we should look at allowing the user an extension point to hook the conversion and customize it. There are likely many changes in behavior that will be hard to account for.

README.md Outdated Show resolved Hide resolved

## Values parsing

**PuppeteerSharp** considers variable types from within JavaScript when requesting typed values on pages.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not quite sure what you mean

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Chech on this
hardkoded/puppeteer-sharp#2805

await page.EvaluateFunctionAsync<int>("() => '123'") dont work but
await page.EvaluateFunctionAsync<int>("() => Number('123')") works

And there's said

That's part of switching to System.Text.Json in v19. System.Text.Json is more strict than Newtonsoft.

PuppeteerSharp.Dom/DomHandle.cs Show resolved Hide resolved
}
var valueIsNull = value == null || value.Value.ValueKind == JsonValueKind.Null || value.Value.ValueKind == JsonValueKind.Undefined;
var nullable = returnType.IsClass;
if (Nullable.GetUnderlyingType(returnType) != null)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nullable.GetUnderlyingType needs to only be called once and result stored.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My laziness, though I'm not sure if it's really necessary. I believe that code like this is optimized during compilation. Is it not?

PuppeteerSharp.Dom/DomHandle.cs Show resolved Hide resolved
PuppeteerSharp.Dom/DomHandle.cs Show resolved Hide resolved
PuppeteerSharp.Dom/DomHandle.cs Outdated Show resolved Hide resolved
@CaCTuCaTu4ECKuu
Copy link
Author

There was few reasons why I choose to manually cast.
One of them was that TypeDescriptor.GetConverter was used in PuppeteerSharp library and while I was looking why my types are not casting I initially tried to implement it there, but while debugging I found why and how (System.Text.Json). So the problem was that PuppeteerSharp.Dom expect old behavior when values are casted so I dug into this library with mindset thought that Converters didnt work well with JsonElement, but now thinking about it - might be that I already get tired and my brains stop thinking trying to solve problem.
Other reason is that code like this optimized good when compiling so it wont hurt, at best looks lousy

I suggested that there are few cases that will hit more often than other considering library wrappings so I play with order, however I didnt run any tests at that matter

To be honest, I just wanted to fix it quickly and make it good enough so that if someone wants to improve code itself or performance, they can do it themselves and send a pull request :)

@CaCTuCaTu4ECKuu
Copy link
Author

I may look into using TypeDescriptor.GetConverter instead individual handling but I'm not sure if it worth it.
In terms of look it will fit better I guess, but I don't exactly know how it compares performance wise, maybe someone knows better

Actually I remember about custom type converter for JsonElement but I dont quite remember why I reject that option

To summarize my thoughts

  • PuppeteerSharp starting from v19 introduced breaking change and I think PuppeteerSharp.Dom major version should increase
  • PuppeteerSharp.Dom should parse types same way as before (I believe it will break on real sites otherwise)
  • I don't like ParseJSValueTo method and feel that it could be resolved in much better way (like TypeConverters)
  • README is on you then, I just add some comments, better extra than none

Now, when I look at it as a whole and after using and experiencing I at least know point where it may break and how, so I think I can remake some things, test and commit as there is no hurry now when it's working. So what are we doing?

@amaitland
Copy link
Contributor

Thanks for the PR, I've manually cherry picked some of the parts.

5.0.40 should be available for download shortly.

  • PuppeteerSharp.Dom should parse types same way as before (I believe it will break on real sites otherwise)

There were only a couple of methods that appear to be effected, I've removed those.

  • Element.GetAttributeAsync<T> has been removed, you can now only use GetAttributeAsync which returns a string (which matches the return type of the underlying getAttribute method).
  • CssStyleDeclaration.GetPropertyValueAsync<T> has been removed, you can now only use GetPropertyValueAsync which returns a string (which matches the return type of the underlying getPropertyValue method).

For now I'm not going to bring in the type conversion changes, can revisit this later if it looks like there are lots of issues. It'll highlight where the return types don't match the HTML spec, which should be fixed.

It might be worth trying to see if you can get the changes merged into PuppeteerSharp directly.

Thanks again for the PR.

@amaitland amaitland closed this Nov 9, 2024
@CaCTuCaTu4ECKuu
Copy link
Author

Not sure about removing types altogether, it feels like someone may relying on it.
Though you can write your own implementation, wrap it with extension so no code should be changed and be sure to have control over it.
I guess thats a way to go as well, anyway if you use recent version of PuppeteerSharp Evaluate function, which is more demanded, already break so you'll need to make changes to code regardless

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants