

srush 20 days ago | on: Composer: Building a fast frontier model with RL

Our primary focus is on RL post-training. We think that is the best way to get the model to be a strong interactive agent.

comex 20 days ago

So, yes, but you won’t say what the base model is? :)

typpilol 19 days ago

It seems to be some sort of Sonnet model: a lot of people on Twitter are reporting that it likes to spam documentation, like Sonnet 4.5.
