
How is this better than or different from Suno, besides the API? I'm assuming that since you are smaller, the quality is not as good and the depth not as wide.



Suno's RVQ-token-based language model is tuned to give you an acceptable song that most of their userbase would prefer every single time, but it isn't very diverse. Our diffusion model is much more diverse and has higher vocal audio quality, but the results aren't always consistent (just like Flux et al.). However, since we have unlimited generations, this can be worked around. We're also never going to preference-tune our model, because I think the stuff that is lost in that process is valuable.


I use both. Sonauto sounds more "real" and varied than what I can get with Suno.



