The caveat is that a library will sometimes expect an older version of CUDA than the one you have installed.
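If you run into that, a quick sanity check is to see which CUDA build your framework is actually using. Here's a minimal sketch with PyTorch, assuming that's the stack the library in question sits on:

```python
import torch

# The CUDA version PyTorch was built against; an older library may
# expect an earlier version than this.
print("PyTorch built with CUDA:", torch.version.cuda)

# Whether the driver/runtime combination actually works.
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
```

Comparing that output against the version listed in the library's install notes usually tells you whether you need a different build.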
The VRAM on the GPU does make a difference, so if you start running into limits it would be worth looking at another GPU, or increasing your system RAM so that runtimes which support it can offload model layers to the CPU.
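To get a feel for how close you are to those limits, you can query the card's memory. Another small PyTorch sketch, assuming a single CUDA device at index 0:

```python
import torch

# Minimal sketch: report total VRAM and what PyTorch has allocated,
# assuming one CUDA-capable GPU at index 0.
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    total = props.total_memory / 1024**3
    used = torch.cuda.memory_allocated(0) / 1024**3
    print(f"Total VRAM: {total:.1f} GiB, allocated by PyTorch: {used:.1f} GiB")
else:
    print("No CUDA device found")
```

(`nvidia-smi` gives you the same picture from the command line, including what other processes are using.)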
However, I wouldn't worry too much right away. It's more important to get started, get an understanding of how these local LLMs operate, and take advantage of the optimisations the community is making to keep them accessible. Not everyone has a 5090, and if local LLMs stayed in the realm of high-end hardware, they wouldn't be worth the time.