[FEATURE REQUEST]: .Net 6.0/7.0 Support #1149
Comments
I have used the current version, 2.1.1, to write Delta format from .NET 7.0. I used SDKMAN to install Java and Spark. The dotnet app (dotnetapp) was compiled to a native Ubuntu 22.04 binary and run with spark-submit:
spark-submit --packages io.delta:delta-core_2.12:2.0.2 --class org.apache.spark.deploy.dotnet.DotnetRunner --master local ./bin/Release/dotnetapp/microsoft-spark-3-2_2.12-2.1.1.jar ./bin/Release/dotnetapp/dotnetapp
There is still the bug with the UDFs.
UDFs are not working in Polyglot Notebooks due to #1131.
Could you provide a working solution to make UDFs work in Polyglot Notebooks by working with the Polyglot team (@claudiaregio)? Basically, it is the directory path problem associated with Polyglot.
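The spark-submit invocation quoted in the comment above can be laid out as a small script, which makes the individual arguments easier to read and change. This is only a sketch: the package coordinates, class name, and paths are copied from the comment, while `APP_DIR`, `SUBMIT_CMD`, and the `DRY_RUN` switch are hypothetical names introduced here for illustration. By default it only prints the command, so it can be tried without Spark installed.

```shell
# Directory holding the published app; path taken from the comment above.
APP_DIR="./bin/Release/dotnetapp"

# Build the argument list as positional parameters (POSIX sh friendly).
set -- \
  --packages io.delta:delta-core_2.12:2.0.2 \
  --class org.apache.spark.deploy.dotnet.DotnetRunner \
  --master local \
  "$APP_DIR/microsoft-spark-3-2_2.12-2.1.1.jar" \
  "$APP_DIR/dotnetapp"

# Human-readable form of the command (hypothetical helper variable).
SUBMIT_CMD="spark-submit $*"

# DRY_RUN is a hypothetical switch: 1 (the default) prints the command,
# anything else actually runs spark-submit, which must then be on PATH.
if [ "${DRY_RUN:-1}" = "1" ]; then
  echo "$SUBMIT_CMD"
else
  spark-submit "$@"
fi
```

Setting `DRY_RUN=0` in the environment runs the job for real; the jar name has to match the Spark and Scala versions of the local installation.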
Hi @AFFogarty & @GeorgeS2019, thank you for the quick reply! We are heavily reliant on this library for our solution, which is now ready for production. The rest of our application is on .NET 6.0, and we would like this library to be upgraded as well. We are currently using the main branch and it is all working fine on .NET 6.0, as we are not using either UDFs or Polyglot Notebooks. However, as we are going to production, we would like an official version, and it appears #1131 is a security vulnerability that would fail some security checks. We are also looking for a complete port of Spark along with MLlib. We would greatly appreciate a new version of this library with full compatibility with the latest version of Spark.
Hi Team (@imback82, @Niharikadutta, @dbeavon, @suhsteve, @AFFogarty, @bamurtaugh),
First off, this library is great and I want to commend all the hard work that has gone into it. Just my two cents here, but I think it would be a good idea to consider a release with Binary Serializer still in place for the following reasons,
Hope my comments are clear. I look forward to hearing what others think.
@bmazzarol Well communicated, very much appreciated 👏 We need to find ways, fast, to keep iterating on and improving this project.
Hi Team: Also, is there a chance this library can be merged with SynapseML (https://github.com/microsoft/SynapseML)? It appears to be actively developed, has better technology for generating Spark bindings without much delay, and has many other features integrated. Thanks!
Could you provide more information? UDFs are only an issue with Polyglot Notebooks. Could you elaborate further, so that others can add more information and we iteratively get closer to a suitable solution? I wonder if the block is due to legal issues rather than the software implementation. Why is there no incentive to address this at the software level for the .NET community? AGAIN: leaving this without moving forward could have UNDESIRABLE consequences for the entire Microsoft Big Data analytics offering.
@GeorgeS2019 Will do my best! Spark Connect is a built-in set of gRPC bindings included with Spark 3.4+. It provides a low-level API that can be used to drive Spark in a very similar way to how this project works; in fact, the latest version of PySpark already supports this client mode. This solves the serializer issue, as it uses protobuf behind a defined gRPC contract. However, my understanding is that a UDF needs to run on the Spark workers and be written in one of the supported languages to work via Spark Connect. It is not my intention to propose a solution, though. I just want to argue for a roadmap to be created and an "as is" release to be considered, so progress can be made incrementally. At the very least, a counterargument against an "as is" release would be good, so the community can understand more issues that might not have been considered.
Is it feasible to make .NET one of the supported languages (e.g. Python, R, Go, according to the diagram)? I am still fuzzy on this. Are you familiar with IKVM? It targets .NET 6, and it makes it possible to load Java code files and compile them into .NET within VS2022. If IKVM is feasible, then keeping Spark.NET always up to date is no longer an issue.
I wonder if it is potentially feasible to replace the JVM part of the diagram with IKVM.NET? https://cwiki.apache.org/confluence/display/SPARK/PySpark+Internals
My understanding is that IKVM.NET allows Java programs to run on .NET, so it would not be equivalent to Py4J, which is essentially a Python interpreter running on the JVM with access to the JVM memory space. But again, my main point was that there are lots of ways to move forward, and all of them take time and require planning. The bigger issue this creates is that, in the meantime, there is no supported release of this library.
Hi @bmazzarol! It appears it is going to be a long time before there is any new version of this library. We'll explore alternatives. Thank you for the clarification!
What alternative(s) are you expecting? |
@Vislesha |
@GeorgeS2019, we are moving to Java-based APIs for our analytics engine so we don't have to play catch-up with compatible libraries. It's going to be time-consuming, but it looks like the better alternative.
You have abandoned it, but not everyone has YET. So, do consider leaving it open even if you are no longer interested.
@GeorgeS2019! Sure, |
This issue is certainly of interest to me. We are considering using Spark and Spark .NET, but this issue raises some obvious concerns.
I'm testing with .Net 8 on OSS and Azure HDI. @AFFogarty It has been almost a year since you mentioned the concern related to I'm eager to help get this merged. Let me know how we can help. I will start testing it on OSS and HDI as soon as possible. Can we get this merged? And after that I will have follow-up changes to migrate to .net 8. They are basically the same as your old changes to migrate to .net 6. |
Thanks for helping to keep this project moving forward.
Hi Team (@imback82 , @Niharikadutta , @dbeavon, @suhsteve, @AFFogarty, @bamurtaugh),
Is there ever going to be a new version of this library with .NET 6.0/7.0 support? There have been no updates or new versions for a long time, and many PRs are still pending. Could someone please provide guidance on the timeline or the future of this project?
Thanks