REQUEST: gfx803 support #173
Comments
I have never had that generation of AMD cards myself, so I have never tried to add support for it. If you have time to try, I could first add some very basic component support for it to some build branch. The build would probably fail at some point, but at least we could test whether rocminfo and amd-smi detect the card, and also test whether hipcc and OpenCL apps work. And maybe also llama.cpp, vllm, etc., which, unlike PyTorch with GPU acceleration, do not require the whole stack to work.
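For those first checks, something along these lines (a sketch; the exact grep pattern is just an assumption about how rocminfo prints the agent name):

```sh
# Does the ROCm runtime enumerate the card?
rocminfo | grep -i gfx803

# Does the management tool see it?
amd-smi list
```
|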
Happy to run it for you. I've successfully run ROCm stuff using, I think, xuhuisheng's docker (linked below), but the ROCm version was too low, and my main usage is whisperx, which uses CTranslate2.
It would definitely be great to be able to run llama locally, especially on GPU so the CPU doesn't get bogged down, e.g. as a code assistant.
As far as I understood, the process was basically just to have ROCm reasonably updated so one can build ROCm PyTorch from source (or ideally just use the appropriate existing rocm-pytorch version).
There is also https://github.com/arlo-phoenix/CTranslate2-rocm
This would be the ultimate ideal: faster-whisper and whisperx (vs vanilla whisper) use CTranslate2, and simply by doing so become competitive on CPU (whisperx CTranslate2 CPU) with what was already fast on GPU (whisper). Running CTranslate2 on the GPU should therefore give a significant performance uplift.
|
Can you manually add the line gfx803 to the build_cfg.user file and then start the build with the command "./babs.sh -b" to see how it works? gfx803 is such an old card that I do not believe I have time to start testing it myself, but if there are some easy-to-fix build errors, I can try to help with those.
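Something like this, in other words (a sketch; whether build_cfg.user takes one GPU target per line is an assumption, and the next comment discusses which config file actually takes effect):

```sh
# Append gfx803 as an extra GPU target and kick off the build
echo "gfx803" >> build_cfg.user
./babs.sh -b
```
|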
Editing the build_cfgs doesn't seem to affect babs?
EDIT: edited user_config.sh to include gfx803, running now.
(base) (Sun Jan 12 13:08:27) c@archb rocm_sdk_builder$ cat /home/c/rocm_sdk_builder/docs/notes/containers/config/build_cfg_all.user
|
(base) (Sun Jan 12 18:18:07) c@archb rocm_sdk_builder$ ./babs.sh -b
[0] BINFO_APP_NAME: rocm-core
|
@chboishabba, so you have put gfx803 manually as a build target in build_cfg.user? Can you check whether you have any of these files in the directory builddir/021_rocFFT/library/src?
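For instance, a quick look could be (assuming the kernel caches use a .db suffix, which matches the "db files" wording below):

```sh
# List whatever rocFFT left behind in the helper/cache directory
ls -l builddir/021_rocFFT/library/src/ | grep -E -i 'db|helper'
```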
After building itself, the rocFFT project will first try to launch a helper application. If you have these helper apps but not the db files, in theory it could also be launched manually, in the style of:
|
This is just a wild guess, but another thing to try is to remove these two lines (see below) from src_projects/rocFFT/library/src/CMakeLists.txt and then do
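For reference, the two lines in question, as quoted verbatim later in this thread:

```cmake
list( REMOVE_ITEM AMDGPU_TARGETS_AOT gfx803 )
list( REMOVE_ITEM AMDGPU_TARGETS_AOT gfx900 )
```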
|
source /opt/rocm_sdk_612/bin/env_rocm.sh
|
I've been told by people in the KoboldCPP Discord that ROCm 6.0 works for my card, the RX 570, which is GFX803 and which I've been struggling to get working with ROCm 5.5. This is exactly what was told to me; maybe it will help out here:
Additionally, they told me that there are some distros that actually build their own ROCm targeting architectures like GFX803, including Debian. (Debian uses 6.2 though, so I dunno if that would work or not, but the guy who was using 6.0, whom I quoted all this from, said he did all of this before they did whatever they did to rocBLAS, and he suspects 6.2 would work now.) Hopefully this helps. |
Cheers, I've tried just installing via yay on Arch with (I know it's very messy). I've also installed Kobold and a few other things, so we'll see what happens. I really don't know enough to say which solution is the right one...
Here's RVS
|
Nice, so you got rocFFT to build for gfx803 and gfx900 now? Did you remove the "list( REMOVE_ITEM AMDGPU_TARGETS_AOT gfx803 )" and "list( REMOVE_ITEM AMDGPU_TARGETS_AOT gfx900 )" lines from the CMakeLists.txt for that? Basic hipcc and OpenCL examples could probably already work now if you try to build and run them from the directories /opt/rocm_sdk_612/docs/examples/hipcc and /opt/rocm_sdk_612/docs/examples/opencl. To get rocBLAS to build, both Tensile and rocBLAS will most likely need to be patched to add gfx803 and gfx900 support. Both repositories in rocm_sdk_builder contain patches for some other GPUs for which I have added support earlier.
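A minimal smoke test could look like this (a sketch; square.cpp is a made-up file name, as I have not checked which examples actually ship in those directories):

```sh
# Load the SDK environment, then build and run one HIP example
source /opt/rocm_sdk_612/bin/env_rocm.sh
cd /opt/rocm_sdk_612/docs/examples/hipcc
hipcc square.cpp -o square && ./square
```
|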
This lines up with what a Docker image I have for ROCm 5.5 was doing:
WORKDIR /src
# Download deps
RUN --mount=type=cache,target=/var/cache/apt,rw --mount=type=cache,target=/var/lib/apt,rw \
apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
gfortran \
git \
python-is-python3 \
nano \
wget \
make \
pkg-config \
libnuma1 \
cmake \
libopenblas-dev \
ninja-build
# Download rocBLAS
RUN --mount=type=cache,target=/tmp/cache/download,rw \
curl -L -o /tmp/cache/download/rocBLAS-${ROCM_VERSION}.tar.gz https://github.com/ROCmSoftwarePlatform/rocBLAS/archive/rocm-${ROCM_VERSION}.tar.gz \
&& tar -xf /tmp/cache/download/rocBLAS-${ROCM_VERSION}.tar.gz -C /src \
&& rm -f /tmp/cache/download/rocBLAS-${ROCM_VERSION}.tar.gz
# Download tensile
RUN --mount=type=cache,target=/tmp/cache/download,rw \
curl -L -o /tmp/cache/download/Tensile-${ROCM_VERSION}.tar.gz https://github.com/ROCmSoftwarePlatform/Tensile/archive/rocm-${ROCM_VERSION}.tar.gz \
&& tar -xvf /tmp/cache/download/Tensile-${ROCM_VERSION}.tar.gz -C /src \
&& rm -f /tmp/cache/download/Tensile-${ROCM_VERSION}.tar.gz
# gfx803 fix
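# (r9nano is the Fiji/gfx803 family; these Tensile asm logic files reportedly
#  broke gfx803 builds, so this workaround drops them)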
RUN rm -rf /src/rocBLAS-rocm-${ROCM_VERSION}/library/src/blas3/Tensile/Logic/asm_full/r9nano*
# Download rocSPARSE
RUN --mount=type=cache,target=/tmp/cache/download,rw \
curl -L -o /tmp/cache/download/rocSPARSE-${ROCM_VERSION}.tar.gz https://github.com/ROCmSoftwarePlatform/rocSPARSE/archive/rocm-${ROCM_VERSION}.tar.gz \
&& tar -xf /tmp/cache/download/rocSPARSE-${ROCM_VERSION}.tar.gz -C /src \
&& rm -f /tmp/cache/download/rocSPARSE-${ROCM_VERSION}.tar.gz
# Download Magma
RUN --mount=type=cache,target=/tmp/cache/download,rw \
wget -O /tmp/cache/download/magma-${MAGMA_VERSION}.tar.gz https://icl.utk.edu/projectsfiles/magma/downloads/magma-${MAGMA_VERSION}.tar.gz \
&& tar -xf /tmp/cache/download/magma-${MAGMA_VERSION}.tar.gz -C /src \
&& rm -f /tmp/cache/download/magma-${MAGMA_VERSION}.tar.gz
# Build and make rocBLAS, Tensile
WORKDIR /src/rocBLAS-rocm-${ROCM_VERSION}
COPY patches /patches
RUN patch -Np1 -d /src/Tensile-rocm-${ROCM_VERSION} -i /patches/Tensile-fix-fallback-arch-build.patch
RUN patch -Np1 -d /src/rocBLAS-rocm-${ROCM_VERSION} -i /patches/rocBLAS-configure-but-dont-build.patch
RUN --mount=type=cache,target=/var/cache/apt,rw --mount=type=cache,target=/var/lib/apt,rw \
apt-get update \
&& DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends cmake \
&& DEBIAN_FRONTEND=noninteractive ./install.sh \
--cmake_install \
--dependencies \
--test_local_path /src/Tensile-rocm-${ROCM_VERSION} \
--architecture "gfx803" \
--logic asm_full \
--msgpack \
&& rm -rf /var/lib/apt/lists/*
RUN make -C build/release -j$(nproc) TENSILE_LIBRARY_TARGET
# Build and make rocSPARSE
WORKDIR /src/rocSPARSE-rocm-${ROCM_VERSION}
RUN cmake \
-Wno-dev \
-B build \
-S "/src/rocSPARSE-rocm-${ROCM_VERSION}" \
-DCMAKE_BUILD_TYPE=Release \
-DCMAKE_C_COMPILER=/opt/rocm-${ROCM_VERSION}/bin/hipcc \
-DCMAKE_CXX_COMPILER=/opt/rocm-${ROCM_VERSION}/bin/hipcc \
-DCMAKE_INSTALL_PREFIX=/opt/rocm-${ROCM_VERSION} \
-DBUILD_FILE_REORG_BACKWARD_COMPATIBILITY=ON \
-DCPACK_SET_DESTDIR=OFF \
-DCPACK_PACKAGING_INSTALL_PREFIX=/opt/rocm-${ROCM_VERSION} \
-DROCM_PATH="${ROCM_PATH}" \
-DAMDGPU_TARGETS="gfx803"
RUN cmake --build build --target package -j$(nproc)
However, I am unsure if these patches exist for ROCm 6+ (or if they even exist, but you say you already have some patches, so that's :D), or if they're necessary given what the guy in the KoboldCPP Discord said above in my comment.
So I guess he's saying AMD already fixed the rocBLAS issues for GFX803? |
I'm just out at the moment, but I compiled rocBLAS from the AUR and I don't think it worked... It finishes compiling, and I also installed ZLUDA, which I think is why torch.cuda.is_available = true, but I still get a segfault when trying to run Kobold with ROCm... The Vulkan implementation works well for Kobold on the RX580; it seems processing is about 2-3x faster than CPU. I'll try making those modifications later, as I could see stuff about gfx1010 and a few others, but nothing for gfx803 other than the target arch.
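For reference, the check I mean is roughly this (torch.version.hip should print None on a non-ROCm build, which helps tell a ZLUDA/CUDA wheel apart from a real ROCm one):

```sh
python -c "import torch; print(torch.version.hip, torch.cuda.is_available())"
```
|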
Wait, why are you installing ZLUDA? Isn't that just a drop-in CUDA replacement? I didn't think ZLUDA was needed if one could get ROCm 6.0.3 to compile for GFX803 and just compile Tensile and rocBLAS and whatever against it, so one could use that instead of CUDA (not sure if this is actually how it works, tbh; I am new to AI). Or is this just something specific that Windows people do? I've been mainly sticking to Linux for AI stuff because it's just easier to work with. |
- do not remove gfx803 (AMD RX 470/480) and gfx900 (Vega 64) from the list of devices for which the FFT db is generated
- only build-tested, to help solve a problem raised in #173
- more testing of the stack needed by people having these GPUs

Signed-off-by: Mika Laitio <lamikr@gmail.com>
@AWilliams17 or @chboishabba |
@mika will do
@austin yes, it is a CUDA drop-in. We are still testing the rocm build; I ended up using the Vulkan version of Kobold in the meantime.
On Thu, 23 Jan 2025 at 05:28, Mika Laitio wrote:
I have added the rocFFT patch for gfx803. I can only build-test this, but I have now been able to build the whole rocm sdk builder stack with gfx803 selected as a target. Can you try to update to the latest version and try again?
|
@chboishabba wrong (mika) account :) |
@mika apologies :) Honestly, not sure what I expected to happen replying by email, haha. I imagine I just git clone the updated repo, then something like ./babs.sh --clean binfo/core/021_rocFFT.binfo as you said before?
@chboishabba Sorry, I missed your question earlier. If you are using the rocm_sdk_builder directory that you have used earlier, then that will work and start building from 021_rocFFT.binfo. babs.sh checks from the builddir directory which projects have already been built and only builds the missing ones.
That command will basically delete the "builddir/021_rocFFT" directory so that it gets rebuilt the next time you run "./babs.sh -b".
The update command that I recommend for just pulling my latest source changes will check which projects have been modified and then run "babs.sh --clean" for those projects. And if you really only want to build a single project, you can use
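(The exact commands got trimmed out of this reply; below is just a sketch of the clean-and-rebuild flow, pieced together only from commands already quoted in this thread.)

```sh
# Drop the rocFFT build state, then rebuild whatever is missing
./babs.sh --clean binfo/core/021_rocFFT.binfo
./babs.sh -b
```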
|
Hiya!
I got pretty excited seeing this, as gfx803 is generally not supported in recent ROCm, and I presumed it would be included in this project.
I'm running an RX580, which was highly sought after during the GPU shortages of COVID... I believe the latest ROCm supporting gfx803 and gfx900 is 5.4.2.
https://github.com/jrcichra/rocm-pytorch-gfx803
https://github.com/xuhuisheng/rocm-build/tree/develop/gfx803
https://wiki.archlinux.org/title/AMD_Radeon_Instinct_MI25#ROCm