Yes, eventually we think there is more value in owning the entire stack than jus...

xnx · 2025-06-21T18:46:47 1750531607

> Few ideas we were thinking of: integrating a small LLM

Chrome has a built-in LLM: https://developer.chrome.com/docs/ai/built-in

pickpuck · 2025-06-20T19:37:50 1750448270

> building a more AI friendly DOM

You might consider the Accessibility Tree and its semantics. Plain divs are basically filtered out so you're left with interactive objects and some structural/layout cues.
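To make the filtering idea concrete, here is a minimal sketch in Python. It is a toy model, not how any browser actually computes its accessibility tree: the `DomNode` and `AxNode` classes, the two tag-to-role tables, and the `to_ax_tree` and `render` helpers are all hypothetical names made up for illustration. It also simplifies by discarding text inside role-less elements, which real accessibility trees generally keep as static text.

```python
from dataclasses import dataclass, field

# Hypothetical, heavily simplified tag -> role tables (assumption for this sketch;
# real browsers derive roles from much richer rules and ARIA attributes).
INTERACTIVE_ROLES = {"a": "link", "button": "button", "input": "textbox"}
STRUCTURAL_ROLES = {"nav": "navigation", "main": "main", "h1": "heading", "ul": "list"}


@dataclass
class DomNode:
    tag: str
    text: str = ""
    children: list["DomNode"] = field(default_factory=list)


@dataclass
class AxNode:
    role: str
    name: str
    children: list["AxNode"] = field(default_factory=list)


def to_ax_tree(node: DomNode) -> list[AxNode]:
    """Convert a toy DOM subtree into accessibility-style nodes.

    Elements without a role (e.g. plain divs) are dropped and their
    descendants are hoisted up to the nearest ancestor that has a role.
    """
    kids = [ax for child in node.children for ax in to_ax_tree(child)]
    role = INTERACTIVE_ROLES.get(node.tag) or STRUCTURAL_ROLES.get(node.tag)
    if role is None:
        return kids  # wrapper element: keep only what is inside it
    return [AxNode(role, node.text, kids)]


def render(nodes: list[AxNode], depth: int = 0) -> list[str]:
    """Flatten the tree into indented lines, a compact form to hand to a model."""
    lines = []
    for n in nodes:
        label = f'{n.role} "{n.name}"' if n.name else n.role
        lines.append("  " * depth + label)
        lines.extend(render(n.children, depth + 1))
    return lines


page = DomNode("div", children=[
    DomNode("div", children=[
        DomNode("nav", children=[
            DomNode("a", "Home"),
            DomNode("div", children=[DomNode("a", "Docs")]),
        ]),
    ]),
    DomNode("div", children=[DomNode("button", "Sign in")]),
])

print("\n".join(render(to_ax_tree(page))))
```

Four nested wrapper divs disappear, and what remains is one landmark, two links, and a button, which is roughly the shape of input the comment is suggesting for an agent.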

faxmeyourcode · 2025-06-21T19:03:04 1750532584

I've been trying (albeit not very hard) to build an accessibility library and toolset that can be exposed via an MCP server. I think it has the potential to be much more ergonomic for generalized computer-use agents than tools like Playwright or the classic screenshot approach. Low-latency computer use is another thing that I'd like to solve.

The issue is that the macOS and Windows accessibility APIs are opaque and I have no idea what I'm doing, so I'm forced to vibe code it all, which is not turning out too well... :-)

I suffer from mild carpal tunnel, so I want to build a really low-latency computer-use agent that can do anything on my computer without me having to learn the Talon voice syntax or some other traditional accessibility software like macOS Dictation.

pickpuck · 2025-06-21T21:37:25 1750541845

Neat, is it on GitHub?

faxmeyourcode · 2025-06-21T23:03:58 1750547038

Not yet. I've gone through a few prototypes that haven't really worked, and nothing has stuck well enough to justify a repo.

I will try to publish something on gh this weekend.