[Bug] Apple Metal/MPS -- TVM/MLC-LLM won't compile from source #2540
Comments
Likely you don't need to turn on ARM Compute and MPS, since we generate our own Metal code.
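For reference, a Metal-only TVM configure along those lines might look like the following. This is only a sketch: the flag names are TVM's standard CMake options, but the directory layout and the rest of the invocation are assumptions.

```sh
# Sketch: configure TVM with the Metal runtime only, leaving MPS and the
# Arm Compute Library integration off (paths and -j count are illustrative).
cd tvm && mkdir -p build && cd build
cmake .. -DUSE_METAL=ON -DUSE_MPS=OFF -DUSE_ARM_COMPUTE_LIB=OFF -DUSE_LLVM=ON
make -j"$(sysctl -n hw.ncpu)"
```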
Hm, funny -- I came back here to comment that I got MLC-LLM to compile, without MPS on. The problem is that now, when I went to compile a model, I kept getting
And when I included --device metal
I also have device="mps" set as an environment variable, and MTLDevice=1.
Ah, you need to write device="metal".
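For instance, a compile invocation along these lines passes the Metal target explicitly. This is a sketch only: the model directory and output filename are hypothetical placeholders.

```sh
# Sketch: the model path and output filename are placeholders.
# `--device metal` selects the Metal target; "mps" is not a device name here.
mlc_llm compile ./dist/Codestral-q4f16_1-MLC/mlc-chat-config.json \
    --device metal \
    -o ./dist/libs/Codestral-q4f16_1-metal.dylib
```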
🤦‍♂️ It's always something so stupid, agh! Thank you. Welp, while I'm here: I neglected to mention the warning that always serves as a precursor to the 4 "foundational" errors I had mentioned.
Edit: TVM just compiled with MPS off! (!!! -- because I was able to use ACL, MKL, and whatever else.)
I have a feeling this is the answer to my prayers. I had swapped out DMLC-Core and the related files that were changed when that was updated, without luck, and then sought to check out the last time those threading/pool files were modified. I haven't tried those files yet, but I figure it must be Apple's policies with threading, and this seems to confirm it. But I don't know -- I just glazed over it and saw a couple of apples, though maybe I compared them to a couple of oranges; it just came up in my search. I also noticed this change.
I was able to compile TVM/MLC, but it's producing segmentation faults on weight conversion of my Codestral model. I've also gotten errors compiling a 3-bit OmniQuant Llama model (gen_config worked fine) and when trying to chat with an AQLM 2-bit model I had managed to compile previously. However, I'm not sure the .dylib I had compiled was legitimate (I used no_quant for gen_config/compilation), so I'd need to double back for another compile anyway.

It seems the root cause is that runtime/metal/metal_common.h only defines a protected CopyDataFromTo(...); there is no public CopyDataFromTo. There's also no GetCommandQueue (used in metal_api.mm and conv.mm, IIRC) defined in metal_common.h. I remedied those issues; it was then only a matter of introducing a public CopyDataFromTo function and the GetCommandQueue definition in conv.mm, gemm.mm (the contrib/MPS files), and metal_api.mm. That's if the resolution is kosher -- judging from the errors I'm experiencing, it breaks something important. I'm guessing the data is supposed to be transferred in a protected state, and that there was simply a discrepancy (in conv.mm or metal_api?) or a missing definition (GetCommandQueue) preventing that protected data transfer.
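To double-check which declarations the checked-out headers actually contain, something like this works. It is a sketch: the 3rdparty/tvm prefix assumes the usual TVM submodule location inside an MLC-LLM checkout, and the file list is based on the paths mentioned above.

```sh
# Sketch: list CopyDataFromTo / GetCommandQueue declarations in the Metal
# runtime and the MPS contrib sources (paths assume the mlc-llm repo layout).
grep -nE "CopyDataFromTo|GetCommandQueue" \
    3rdparty/tvm/src/runtime/metal/metal_common.h \
    3rdparty/tvm/src/runtime/metal/metal_device_api.mm \
    3rdparty/tvm/src/runtime/contrib/mps/conv.mm \
    3rdparty/tvm/src/runtime/contrib/mps/gemm.mm
```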
[1] 63036 segmentation fault mlc_llm convert_weight /Users/zack/.home/local/models/Uncensored_Llama-70B

I'm losing my sanity at this point. My Python/Poetry appear to be ARM64, so that rules out that possibility; I checked because all the multiprocessing errors I've seen here have been related to that or to other user errors. My last remaining guess -- and I wish I'd just turned it off when I turned off the Arm Compute Library -- is BLAS. I'm feeling a bit dumb now, really, because I believe when these issues started and I posted this in TVM, I noted that Apple BLAS was suspect. I think the code is out of date, because it shoots warnings (and, before I messed around with the Metal/MPS code, errors!) about relying on sgemm, dgemm, and whatever other functions that are deprecated, and I tried everything under the sun to force CMake to incorporate the new Apple BLAS, without any change. So I'll be turning that off now, too. I do normally have tons of modules on: AOTExec, UMA, BNNS, Threads, RPC, CPP TVM, CPP RPC, Profiler, Graph Executor, CoreML, TCMalloc, MLIR, Pipeline. I think that might be it 😂😅
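As a quick sanity check on the architecture question, something along these lines reports what the interpreter and machine are actually running as (a generic sketch, nothing MLC-specific):

```sh
# Sketch: confirm the Python interpreter is a native arm64 build rather than
# an x86_64 one running under Rosetta 2.
python -c "import platform; print(platform.machine(), platform.python_version())"
sysctl -n sysctl.proc_translated 2>/dev/null   # 1 means this shell is translated
uname -m
```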
The CMake module for OpenMP (Modules/OpenMP.cmake) should be updated, because there's nothing Apple-friendly in it.
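For anyone hitting that, a common workaround is to hand CMake's FindOpenMP the Homebrew libomp explicitly. This is a sketch assuming a Homebrew install; the variable names are the generic FindOpenMP ones, nothing TVM-specific.

```sh
# Sketch: point CMake's FindOpenMP at Homebrew's libomp on macOS.
brew install libomp
LIBOMP_PREFIX="$(brew --prefix libomp)"
cmake .. \
  -DOpenMP_C_FLAGS="-Xpreprocessor -fopenmp -I${LIBOMP_PREFIX}/include" \
  -DOpenMP_CXX_FLAGS="-Xpreprocessor -fopenmp -I${LIBOMP_PREFIX}/include" \
  -DOpenMP_C_LIB_NAMES=omp \
  -DOpenMP_CXX_LIB_NAMES=omp \
  -DOpenMP_omp_LIBRARY="${LIBOMP_PREFIX}/lib/libomp.dylib"
```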
codegen copycc.txt

Sadly, stock, with all options off (except Metal), I still get segfault errors on convert_weight + compile 😭

Edit: 🤦 -- turns out the dang executable binary wasn't updating, no wonder nothing was happening. I just got it working with a stock build. Time to try to piece it back up to the full shebang (a clean-rebuild sketch follows below).

Edit 2: Also, this should be added to metal_device_api.mm, under ICHECK_LT(index, devices.size()) << "Invalid device id " << index;

Edit 3: Seems I've gotten everything on besides MPS -- hopefully that can be fixed sooner rather than later!
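In case it saves someone else the same confusion, the clean-rebuild sketch mentioned above: wiping the build tree guarantees stale binaries can't be picked up. The flag set and library names are assumptions about a stock Metal-only build.

```sh
# Sketch: remove the build tree so stale artifacts can't linger, rebuild, and
# check that the libraries on disk carry this build's timestamp.
rm -rf build && mkdir build && cd build
cmake .. -DUSE_METAL=ON
make -j"$(sysctl -n hw.ncpu)"
ls -l libtvm.dylib libtvm_runtime.dylib
```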
🐛 Bug
To Reproduce
Steps to reproduce the behavior:
I've compiled each of them a few times. But since I updated and attempted to compile, I've been unable to (except once -- I'm not sure if that was pure luck or a matter of it being a stock, no-features build). For that one little success, it was off a fresh git repo download, whereas when I have dropped the features back to stock after a failed build, it still fails.
Features that seem to exacerbate the issue: BLAS, MKL, CoreML, Arm Compute Lib. -- basically anything that would go through MPS -- and it causes this foundational error. For example, there may be an error about the inability to find, for the file src/runtime/contrib/ACL/allocator.cc, the headers acl/runtime/IAllocator.h and Core/Types.h. That doesn't make sense, since I've gone out of my way to add the precise directories IAllocator lives in to my include flags/CMake config (ACL/arm_compute/core and ACL/arm_compute/runtime).
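For reference, pointing TVM at an ACL checkout is typically done at configure time along these lines. This is a sketch: /path/to/ComputeLibrary is a placeholder, and the second flag accepting a path is how TVM's Arm Compute Library docs describe the option.

```sh
# Sketch: enable the Arm Compute Library codegen and point the runtime
# integration at a local ComputeLibrary tree (path is a placeholder).
cmake .. \
  -DUSE_ARM_COMPUTE_LIB=ON \
  -DUSE_ARM_COMPUTE_LIB_GRAPH_EXECUTOR=/path/to/ComputeLibrary
```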
I originally posted about this issue in TVM 6 days ago...but it is inactive. mlc-ai/relax#321
More errors/general information for context, if needed.
Environment

- How you installed MLC-LLM (conda, source): Source (GitHub)
- How you installed TVM-Unity (pip, source): Source (GitHub)
- TVM Unity Hash Tag (python -c "import tvm; print('\n'.join(f'{k}: {v}' for k, v in tvm.support.libinfo().items()))", applicable if you compile models): Haven't actually reinstalled TVM as a Python instance, just built.

Additional context
Sadly, I couldn't find anything on the net about how to fix this error. I figured there'd be a lot of these MTLCommandQueue errors, but nothing concrete turned up.
I only tried compiling MLC-LLM once or twice, and that was with my TVM (the one that compiled that one time -- well, the .dylibs, but not the pure, 100% instance; I had attempted more builds after). I suppose I'll give compiling MLC with a third-party TVM a try, but I need to manipulate MLC's quantization file so I can import a custom-quantized model of mine.