Sibling commenter tvararu is correct. 2023 Apple MacBook with 128GiB RAM, all available to the GPU. No quantisation required :)
Other sibling commenter refulgentis is correct too. The Apple M{1-3} Max chips have up to 400GB/s memory bandwidth. I think that's noticeably faster than any other consumer CPU out there, but it's slower than a top Nvidia GPU. If the GPU has to read the entire 96GB model for each generated token, that limits unquantised performance to about 4 tokens/s at best. However, as the "Mixtral" model under discussion is a mixture-of-experts, it doesn't have to read the whole model for each token, so it might go faster. Probably still single-digit tokens/s unquantised, though.
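The bandwidth-limit arithmetic above can be sketched in a few lines. Rough numbers only: the ~13B active parameters per token for Mixtral is an assumption based on Mistral's announcement (2 of 8 experts active, ~47B total params), and 16-bit weights are assumed for the unquantised case.

```python
# Back-of-envelope: autoregressive decoding is memory-bandwidth bound,
# so tokens/s <= memory bandwidth / bytes of weights read per token.

def max_tokens_per_sec(bandwidth_gb_s: float, bytes_read_gb: float) -> float:
    """Upper bound on decode speed when each token requires streaming
    `bytes_read_gb` of weights from memory."""
    return bandwidth_gb_s / bytes_read_gb

# Dense case: the whole 96 GB of weights is read for every token.
dense = max_tokens_per_sec(400, 96)

# MoE case (assumption): Mixtral activates ~13B of its ~47B params per
# token, so at 16 bits/weight roughly 26 GB is read instead of 96 GB.
moe = max_tokens_per_sec(400, 26)

print(f"dense upper bound: {dense:.1f} tok/s")  # ~4.2 tok/s
print(f"MoE upper bound:   {moe:.1f} tok/s")    # ~15.4 tok/s
```

Real throughput lands below these bounds, since attention KV-cache reads and compute overhead aren't counted here.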