ggml-matmul with pytorch
- set your cuda path in setup.py, line 24
- install pytorch >= 2.1.2
- build ggml from source: cd ggml-master && mkdir build && cd build && cmake .. && export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/path/to/cuda/lib64 && cmake --build . --config Release -j 8
- pip install .
- test.py is a usage example
- my_gguf.py is copied from https://github.com/kvcache-ai/ktransformers/blob/main/ktransformers/util/custom_gguf.py
- gguf file is downloaded from https://huggingface.co/TheBloke/Mixtral-8x7B-Instruct-v0.1-GGUF
- ggml-master/src/ggml-cuda.cu lines 2270-2275
- ggml-master/include/ggml-cuda.h line 45
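
If you are not sure where your cuda toolkit lives, the following is a minimal sketch for locating it before editing setup.py and exporting LD_LIBRARY_PATH. The helper name `find_cuda_home` is hypothetical and not part of this repo; the lookup order (CUDA_HOME / CUDA_PATH, then the location of nvcc, then /usr/local/cuda) is an assumption about a typical Linux install, so check the result against your own system.

```python
import os
import shutil
from pathlib import Path
from typing import Optional


def find_cuda_home() -> Optional[Path]:
    """Best-effort guess at the CUDA toolkit root (hypothetical helper).

    Returns None when no candidate directory is found.
    """
    # 1. Environment variables that many CUDA installs set.
    for var in ("CUDA_HOME", "CUDA_PATH"):
        value = os.environ.get(var)
        if value and Path(value).is_dir():
            return Path(value)

    # 2. nvcc normally lives in <cuda root>/bin/nvcc.
    nvcc = shutil.which("nvcc")
    if nvcc:
        return Path(nvcc).resolve().parent.parent

    # 3. Conventional default location on Linux (assumption).
    default = Path("/usr/local/cuda")
    return default if default.is_dir() else None


if __name__ == "__main__":
    cuda_home = find_cuda_home()
    if cuda_home is None:
        print("cuda toolkit not found; set CUDA_HOME manually")
    else:
        print(f"cuda root: {cuda_home}")
        print(f"lib64 dir: {cuda_home / 'lib64'}")
```

The printed cuda root is a candidate for the path in setup.py line 24, and the lib64 dir is a candidate for the /path/to/cuda/lib64 part of the LD_LIBRARY_PATH export.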
support for pytorch cuda graph (I have no idea for now)