100 CPU threads & 240GB RAM to make @risc_v #AI @amd #ROCm and #t2linux https://www.twitch.tv/videos/2421181919
ffs, why does their Docker image only support Navi 31 and not Navi 32?
https://hub.docker.com/r/rocm/pytorch
I just wish both #Nvidia and #AMD would stop with that whole licensing bullshit around #CUDA and #ROCm and just include that damn stuff in the default driver.
I just want to run #Codestral on my local machine so I can use it with non-public code. It will be troublesome enough to cram it into 16GB of VRAM (rough numbers below).
#computer #Linux #AI
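Back-of-the-envelope only, nothing measured: assuming Codestral's ~22B parameters and typical bytes-per-weight, the weights alone come out roughly like this:

```python
# Rough VRAM estimate for a ~22B-parameter model (assumed sizes, not measured).
params = 22e9  # Codestral 22B

for name, bytes_per_param in [("FP16", 2.0), ("Q8", 1.0), ("Q4", 0.5)]:
    weights_gb = params * bytes_per_param / 1024**3
    print(f"{name}: ~{weights_gb:.0f} GB for weights alone")

# FP16: ~41 GB, Q8: ~20 GB, Q4: ~10 GB -- so only a ~4-bit quant
# (plus KV cache and runtime overhead) has a real chance of fitting in 16 GB.
```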
we did it! #amd #ROCm #HPE compute stack and #AI acceleration are now (mostly) available in https://t2linux.com for #riscv https://www.twitch.tv/videos/2416606393 #t2sde #t2linux
Last night I was up until 2AM trying to get #truenas #amd drivers installed inside of a #docker #container so that #ollama would actually use the #gpu. I was so close. It sees the GPU, it sees it has 16GB of VRAM, and then it uses the #cpu anyway.
TrueNAS locks down the file system at the root level, so if you want to do much of anything, you have to do it inside a container. So I made a container for the #rocm drivers, which, by the way, comes to something like 40GB in size.
It's detecting the GPU, but I don't know if the ollama container is missing some commands it may need, e.g. rocminfo or rocm-smi (quick check sketched below).
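For anyone stuck at the same point, this is roughly what I'd check from inside the container (a sketch, not a verified fix; it assumes the ROCm tools would be on PATH and /dev/kfd plus /dev/dri are passed through):

```python
# Minimal container-side GPU visibility check (sketch; assumes ROCm userspace
# tools are installed in the container and the GPU devices are passed through).
import shutil
import subprocess

for tool in ("rocminfo", "rocm-smi"):
    path = shutil.which(tool)
    if path is None:
        print(f"{tool}: not found in this container")
        continue
    # Print the first lines of output so agent names and VRAM show up.
    out = subprocess.run([tool], capture_output=True, text=True).stdout
    print(f"--- {tool} ({path}) ---")
    print("\n".join(out.splitlines()[:15]))
```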
Another alternative is one I don't really want: installing either #debian or windows as a VM. Windows, because I previously tested the application that runs locally in Windows on this machine and it was super fast. It isn't ideal in terms of RAM usage, but I may be able to run the models more easily with the #windows drivers than the #linux ones.
But anyway, last night was too much of #onemoreturn for a weeknight.
The B-17 Bomber was amazing and helped win WWII. I flew on one in 2002 as a tourist - I have family members who were ball turret gunners - a bad place to be.
This video was shot on Hi-8, and thankfully I digitized it (at 720x480) back in the day. Now I've upscaled it with local AI (to 1408x954) and the improvement is astounding.
Sadly, this actual B17 crashed in 2019: https://en.wikipedia.org/wiki/2019_Boeing_B-17_Flying_Fortress_crash
Aiter: AI Tensor Engine for ROCm
https://rocm.blogs.amd.com/software-tools-optimization/aiter:-ai-tensor-engine-for-rocm™/README.html
Even now, Thrust as a dependency is one of the main reasons why we have a #CUDA backend, a #HIP / #ROCm backend and a pure #CPU backend in #GPUSPH, but not a #SYCL or #OneAPI backend (which would allow us to extend hardware support to #Intel GPUs). <https://doi.org/10.1002/cpe.8313>
This is also one of the reasons why we implemented our own #BLAS routines when we introduced the semi-implicit integrator. A side effect of this choice is that it allowed us to develop the improved #BiCGSTAB that I've had the opportunity to mention before <https://doi.org/10.1016/j.jcp.2022.111413>. Sometimes I do wonder if it would be appropriate to “excorporate” it into its own library for general use, since it's something that would benefit others. OTOH, this one was developed specifically for GPUSPH and it's tightly integrated with the rest of it (including its support for multi-GPU), and refactoring to turn it into a library like cuBLAS is
a. too much effort
b. probably not worth it.
Again, following @eniko's original thread, it's really not that hard to roll your own (see the sketch below), and probably less time-consuming than trying to wrangle your way through an API that may or may not fit your needs.
6/
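(For the curious: here's a minimal, textbook unpreconditioned BiCGSTAB in Python/NumPy, just to show how little code "rolling your own" takes. This is the plain algorithm from the literature, not GPUSPH's improved variant.)

```python
import numpy as np

def bicgstab(A, b, x0=None, tol=1e-8, maxiter=1000):
    """Textbook unpreconditioned BiCGSTAB for solving A @ x = b."""
    x = np.zeros_like(b) if x0 is None else x0.copy()
    r = b - A @ x
    r_hat = r.copy()                      # shadow residual, fixed throughout
    rho_old = alpha = omega = 1.0
    v = p = np.zeros_like(b)
    for _ in range(maxiter):
        rho = r_hat @ r
        beta = (rho / rho_old) * (alpha / omega)
        p = r + beta * (p - omega * v)
        v = A @ p
        alpha = rho / (r_hat @ v)
        s = r - alpha * v                 # intermediate residual
        t = A @ s
        omega = (t @ s) / (t @ t)         # stabilization step
        x = x + alpha * p + omega * s
        r = s - omega * t
        if np.linalg.norm(r) < tol * np.linalg.norm(b):
            break
        rho_old = rho
    return x
```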
CRIUgpu: Transparent Checkpointing of GPU-Accelerated Workloads
AI rabbit hole ... I've been playing with Ollama and some Stable Diffusion tools on my MacBook Pro M2 Max and my Linux desktop ... the desktop is way faster and only has an RX 6800 in it, so of course I'm now thinking about an RX 7900 XTX ... (I don't do Nvidia cards) ...
Anyone have experience with this upgrade? Is going from 16GB of VRAM to 24GB going to make a massive difference?
Using radeontop I can see it's using all 16GB at some points, but not consistently ... and I'm not sure if that's an issue or a feature. I believe #rocm still has some issues.
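One way to answer that for a specific workload, instead of eyeballing radeontop: watch peak VRAM from the process itself. A small sketch, assuming a ROCm build of PyTorch (where the torch.cuda API maps to HIP):

```python
# Sketch: report free/total VRAM and PyTorch's peak allocation.
# Assumes a ROCm build of PyTorch; torch.cuda is the HIP device there.
import torch

assert torch.cuda.is_available(), "no ROCm/HIP device visible"
free, total = torch.cuda.mem_get_info(0)
print(f"free: {free / 1024**3:.1f} GB / total: {total / 1024**3:.1f} GB")

# ... run your model/workload here ...

print(f"peak allocated: {torch.cuda.max_memory_allocated(0) / 1024**3:.1f} GB")
```

If the peak sits right at 16GB, the workload is likely spilling or being throttled, and 24GB would help; if it peaks well below, probably not.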
Generating images with #ComfyUI on a Ryzen 7 #8845HS w/ Radeon #780M ( #Linux )
https://blog.pastwind.org/2025/02/ryzen-7-8845hs-w-radeon-780mcomfyuilinux.html
It took a lot of trial and error to find the recipe that works... not least because every #ROCm install means downloading more than 30GB of files!!!
tl;dr, straight to the conclusions:
OS: Ubuntu 22.04 (because ROCm 6.1 only supports this version and older)
ROCm: <= 6.1.2; 6.2 and 6.3 both fail to run properly
PyTorch: <= 2.4.1; 2.5.1 prints an unsupported-hardware warning ("UserWarning: Attempting to use hipBLASLt on an unsupported architecture! Overriding blas backend to hipblas") and images sometimes fail to generate correctly. 2.6 and above don't run at all. Use the build from the official PyTorch site rather than the one provided by AMD.
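A quick way to confirm which combination you actually ended up with (a small sketch, assuming a ROCm build of PyTorch is installed):

```python
# Print the PyTorch/ROCm combo in use (torch.version.hip is None on CUDA builds).
import torch

print("torch:", torch.__version__)   # e.g. something like 2.4.1+rocm6.1
print("hip:", torch.version.hip)
if torch.cuda.is_available():
    print("device:", torch.cuda.get_device_name(0))
```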
#sydbox 3.32.0 is released! We now officially support #GPU access for #ROCm and #nVIDIA! See the release mail here: https://is.gd/kN1rUt and here is a profile auto-generated by #pandora for #hashcat accessing an #nVIDIA #GPU using #cuda libraries: https://dpaste.com/6DQ97T2DM #exherbo #linux #security
I also uploaded the slides for my talk in the #hpc devroom at #fosdem about the #programming models in #ROCm.
The video has been reviewed too and is waiting to be released to the public. You can get the slides and the video (once released) at https://fosdem.org/2025/schedule/event/fosdem-2025-5143-programming-models-with-the-rocm-compiler/
#HPC devroom starting early with great content and good participation!
I’ll be talking about #ROCm at 11:35 — come and say hello or follow the livestream at https://fosdem.org/2025/schedule/event/fosdem-2025-5143-programming-models-with-the-rocm-compiler/