I imagine there are various disabilities where audio readings greatly simplify people's lives. Those listeners are probably appreciative of anything accurate, regardless of whether it's humans talking or not.
I sometimes use llm from the command line, for instance with a fragment, or piping a resource from the web with curl, and then pick up the cid with `llm gtk-chat --cid MYCID`.
I'm actually planning on abandoning Simon's infra soon. I want a multi-stream, routing-based solution that is more aware of the modern API advancements.
The Unix shell is good at being the glue between programs. We've increased the dimensionality with LLMs.
Some kind of ports based system like named pipes with consumers and producers.
Maybe something like gRPC or NATS (https://github.com/nats-io). MQTT might also work. Network transparent would be great.
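For what it's worth, the skeleton of a ports-based setup can be mocked up with nothing but named pipes. This is a toy sketch, not gRPC/NATS, and the make_port/produce/consume names are made up for illustration:

```python
import os, tempfile, threading

# Hypothetical sketch: a "port" is just a named pipe. Producers write
# newline-delimited messages, consumers read them; a real router would
# open several and multiplex between them (e.g. with select()).
def make_port(path):
    os.mkfifo(path)                      # create the named pipe
    return path

def produce(path, messages):
    with open(path, "w") as port:        # blocks until a reader attaches
        for msg in messages:
            port.write(msg + "\n")

def consume(path):
    with open(path) as port:             # blocks until a writer attaches
        return [line.rstrip("\n") for line in port]

port = make_port(os.path.join(tempfile.mkdtemp(), "llm-out"))
writer = threading.Thread(target=produce, args=(port, ["hello", "world"]))
writer.start()
received = consume(port)
writer.join()
```

Everything past this - discovery, backpressure, network transparency - is exactly what NATS/MQTT give you for free, which is why I'd reach for those in practice.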
(Also check out https://github.com/day50-dev/llmehelp which features a tmux tool I built on top of Simon's llm. I use it every day. Really. It's become indispensable)
The conversational context is nice. The ongoing command building is convenient and the # syntax carryover makes a lot of sense!
My next step is recursion and composability. I want to be able to do things contextualized. Stuff like this:
$ echo PUBLIC_KEY=(( get the user's public key pertaining to the private key for this repo )) >> .env
or some other contextually complex thing that is actually fairly simple, just tedious to code. Then I want that <as the code> so people collectively program and revise stuff <at that level as the language>.
Then you can do this through composability like so:
with ((find the variable store for this repo by looking in the .gitignore)) as m:
    ((write in the format of m))SSH_PUBLICKEY=(( get the user's public key pertaining to the private key for this repo ))
or even recursively:
((
    ((
        ((rsync, rclone, or similar)) with compression
    ))
    $HOME exclude ((find directories with secrets))
    ((read the backup.md and find the server))
    ((make sure it goes to the right path))
));
It's not a fully formed syntax yet, but eventually people will be able to compile publicly shared snippets as specific to their own context, and you get abstract infra management at a fractional complexity.
It's basically GCC's RTL but for LLMs.
The point of this approach is that your building blocks remain atomic, simple, dumb things that even a 1B model can reliably handle - kinda like the guarantee of the RTL.
Then if you want to move from terraform to opentofu or whatever, who cares ... your stuff is in the llm metalanguage ... it's just a different compile target.
It's kinda like PHP. You just go along like normal and occasionally break form for the special metalanguage whenever you hit a point of contextual variance.
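To make the expansion loop concrete, here's a toy of how it might work. Everything here is hypothetical: expand resolves the innermost (( )) spans first, and the dictionary stands in for the model call a real implementation would make:

```python
import re

# Find an innermost (( ... )) span: content with no nested parens.
INNER = re.compile(r"\(\(([^()]+)\)\)")

def expand(text, resolve):
    # Repeatedly replace innermost spans with the resolver's answer
    # until none remain. `resolve` is where the LLM call would go.
    while True:
        text, n = INNER.subn(lambda m: resolve(m.group(1).strip()), text)
        if n == 0:
            return text

# Stub resolver standing in for a small model.
answers = {"get the user's public key": "ssh-ed25519 AAAAexample user@host"}
out = expand("PUBLIC_KEY=(( get the user's public key ))",
             lambda q: answers[q])
```

Nesting falls out for free: the innermost span resolves first, and its answer becomes part of the enclosing span's text before that span is resolved.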
The real solution is semantic routing. You want to be able to define routing rules based on something like mdast (https://github.com/syntax-tree/mdast). I've built a few hacked versions. This would not only allow for things like terminal rendering, it's also a great complement to tool calling. Being able to siphon and multiplex inputs matters for a future where Cerebras-like speeds become more common; dynamic, configurable stream routing will unlock quite a few more use cases.
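mdast itself lives in the JS ecosystem, so take this as a shape-of-the-idea sketch rather than anything real: a crude block splitter standing in for the parser, with each block routed to a registered sink (a real version would match on mdast node types - code, table, heading, and so on):

```python
# Toy semantic router over a Markdown stream: classify top-level blocks
# (fenced code vs. prose) and hand each to the sink registered for it.
def route_markdown(text, sinks):
    in_fence, buf = False, []

    def flush(kind):
        if buf:
            sinks[kind]("\n".join(buf))
            buf.clear()

    for line in text.splitlines():
        if line.startswith("```"):
            # A fence boundary closes whichever block we were in.
            flush("code" if in_fence else "prose")
            in_fence = not in_fence
        else:
            buf.append(line)
    flush("code" if in_fence else "prose")

code_chunks, prose_chunks = [], []
route_markdown(
    "intro text\n```\nprint('hi')\n```\ncoda",
    {"code": code_chunks.append, "prose": prose_chunks.append},
)
```

Swap the sinks for "send to terminal renderer", "send to tool dispatcher", "send to a cheaper model" and you have the routing table I mean.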
We have cost, latency, context window and model routing but I haven't seen anything semantic yet. Someone's going to do it, might as well be me.
Neat! I've written streaming Markdown renderers in a couple of languages for quickly displaying streaming LLM output. Nice to see I'm not the only one! :)
It's a wildly nontrivial problem if you're trying to only be forward moving and want to minimize your buffer.
That's why everybody else either rerenders (such as rich) or relies on the whole buffer (such as glow).
I didn't write Streamdown for fun - there was genuinely no suitable tool that did what I needed.
Also various models have various ideas of what markdown should be and coding against CommonMark doesn't get you there.
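The forward-only constraint is the crux. A minimal illustration, nothing like the real thing - it handles only **bold** and holds nothing but the current unfinished line in its buffer, emitting each line the moment its newline arrives:

```python
import re

BOLD = re.compile(r"\*\*(.+?)\*\*")

class StreamRenderer:
    # Forward-only: emitted lines are never revisited; the buffer is
    # only ever the trailing partial line.
    def __init__(self, emit):
        self.emit, self.buf = emit, ""

    def feed(self, chunk):
        self.buf += chunk
        while "\n" in self.buf:
            line, self.buf = self.buf.split("\n", 1)
            self.emit(BOLD.sub("\x1b[1m\\1\x1b[0m", line))

out = []
r = StreamRenderer(out.append)
r.feed("plain line\n**bo")   # bold span still open: held in the buffer
r.feed("ld** done\n")        # line completes, renders, and is emitted
```

The hard part is everything this skips: spans that may or may not close, tables whose column widths depend on rows you haven't seen, and dialect quirks per model.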
Then there are other things. You have to check individual character width and the language family to do proper word wrap. I've seen a number of interesting tmux and alacritty bugs while doing multi-language support.
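For the width problem specifically, Python's unicodedata exposes the East Asian Width property a renderer has to consult. A sketch of the cell counting this implies (real wrapping also has to deal with combining marks, ZWJ sequences, and so on):

```python
import unicodedata

def cell_width(s):
    # Wide ("W") and Fullwidth ("F") characters occupy two terminal
    # cells; counting len(s) would wrap CJK text in the wrong place.
    return sum(2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
               for ch in s)
```

So "hello" is 5 cells but a three-character CJK string is 6 - wrap at a column count computed from len() and the terminal tears.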
The only real break I do is I render h6 (######) as muted grey.
Compare:
for i in $(seq 1 6); do
    printf "%${i}sh${i}\n\n-----\n" | tr " " "#";
done | pv -bqL 30 | sd -w 30
to swapping out `sd` with `glow`. You'll see glow's lag - waiting for that EOF is annoying.
Also try sd -b 0.4 or even -b 0.7,0.8,0.8 for a nice blue. It's a bit easier to configure than the usual catalog of themes that requires recompilation after modification, like with pygments.
A simple bash script provides quick command-line access to the tool. Output is paged, syntax-highlighted markdown.
echo "$@" | llm "Provide a brief response to the question, if the question is related to command provide the command and short description" | bat --plain -l md
I've thought about redoing it because my needs are things like
$ ls | wtf which endpoints do these things talk to, give me a map and line numbers.
What this will eventually become is "ai-grep", built transparently on https://ast-grep.github.io/, where the llm writes the complicated query (these coding agents all seem to use ripgrep, but this works better).
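A sketch of what that wrapper could look like. generate_pattern is a stub where the model call would go, and the exact ast-grep flags should be checked against its docs before trusting this:

```python
import subprocess

def generate_pattern(question):
    # Stub: a real version would prompt a model with the question plus
    # ast-grep's pattern syntax and return the model's pattern.
    return "requests.get($URL)"

def ai_grep(question, lang="python", run=False):
    # The model's only job is translating a vague question into a
    # structural pattern; ast-grep does the exact matching.
    cmd = ["ast-grep", "run", "--pattern", generate_pattern(question),
           "--lang", lang]
    if run:
        return subprocess.run(cmd, capture_output=True, text=True).stdout
    return cmd

cmd = ai_grep("which endpoints does this code talk to?")
```

The division of labor is the point: a small model writing a pattern is cheap and checkable, while the matching itself stays deterministic.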
Conceptual grep is what I've wanted my whole life.
Semantic routing, which I alluded to above, could get this to work progressively so you quickly get adequate results which then pareto their way up as the token count increases.
Really you'd like some tempering, like a coreutils timeout(1) but for simplex optimization.
> DO NOT include the file name. Again, DO NOT INCLUDE THE FILE NAME.
Lmao. Does it work? I hate that it needs to be repeated (in general). ChatGPT couldn't care less about following my instructions; through the API it probably would?
The consistency in the quality and sharpness of the photos isn't lost on me. There's obviously a lot of curation in this collection - must be some work!