Show HN: SnapQL – Desktop app to query Postgres with AI

gabrielruttner · 2025-06-20T16:07:27 1750435647

This is nice -- we're heavy users of postgresql and haven't found the right tool here yet.

I could see this being incredible if it had a set of performance related queries or ran explain analyze and offered some interpreted results.

Can this be run fully locally with a local llm?

stephancill · 2025-06-20T16:18:00 1750436280

just opened a PR for local llm support https://github.com/NickTikhonov/snap-ql/pull/11

nicktikhonov · 2025-06-20T16:36:23 1750437383

Merged! Thanks Stephan

nicktikhonov · 2025-06-20T16:08:20 1750435700

Thank you for the feedback. Please feel free to raise some issues on the repo and we can jam this out there

joshstrange · 2025-06-21T11:46:55 1750506415

I might test this out, but I worry that it suffers from the same problems that I ran into the last time I played with LLMs writing queries. Specifically not understanding your schema. It might understand relations but most production tables have oddly named columns, potentially columns that changed function over time, potentially deprecated columns, internal-lingo columns, and the list goes on.

Granted, I was using 3.5 at the time, but even with heavy prompting and trying to explain what certain tables/columns are used for, feeding it the schema, and feeding it sample rows, more often than not it produced garbage. Maybe 4o/o3/Claude4/etc can do better now, but I’m still skeptical.

brulard · 2025-06-21T17:11:03 1750525863

I got better results with Claude Code + PostgreSQL MCP. I let claude understand my drizzle schema first, and i can instruct it to also look at the usage of some entities in the code. Then it is smarter in understanding what the data represents.

liquidki · 2025-06-21T13:05:13 1750511113

I think this is the achilles heel of LLM-based AI: the attention mechanisms are far, far, inferior to a human, and I haven't seen much progress here. I regularly test models by feeding in a 20-30 minute transcript of a podcast and ask them to state the key points.

This is not a lot of text, maybe 5 pages. I then skim it myself in about 2-3 minutes and I write down what I would consider the key points. I compare the results and I find the AI usually (over 50% of the time) misses 1 or more points that I would consider key.

I encourage everyone to reproduce this test just to see how well current AI works for this use case.

For me, AI can't adequately do one of the first things that people claim it does really well (summarization). I'll keep testing, maybe someday it will be satisfactory in this, but I think this is a basic flaw in the attention mechanism that will not be solved by throwing more data and more GPUs at the problem.

joshstrange · 2025-06-21T13:44:32 1750513472

> I encourage everyone to reproduce this test just to see how well current AI works for this use case.

I do this regularly and find it very enlightening. After I’ve read a news article or done my own research on a topic I’ll ask ChatGPT to do the same.

You have to be careful when reading its response to not grade on a curve, read it as if you didn’t do the research and you don’t know the background. I find myself saying “I can see why it might be confused into thinking X but it doesn’t change the fact that it was wrong/misleading”.

I do like when LLMs cite their sources, mostly because I find out they’re wrong. Many times I’ve read a summary, then followed it to the source, read the entire source, and realized it says nothing of the sort. But almost always, I can see where it glued together pieces of the source, incorrectly.

A great micro example of this is the Apple Siri summaries for notifications. Every time they mess up hilariously I can see exactly how they got there. But it’s also a mistake that no human would ever make.

pu_pu · 2025-06-21T13:35:42 1750512942

This is not a difficult problem to solve. We can add the schema, columns and column descriptions in the system prompt. It can significantly improve performance.

All it will take is a form where the user supplies details about each column and relation. For some reason, most LLM based apps don't add this simple feature.
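As a rough illustration of the idea above, here is a minimal sketch of turning per-column descriptions into system prompt text. Everything in it is an assumption made for illustration: the `Column` and `Table` classes, the `build_system_prompt` helper, and the example `orders` table are hypothetical and are not taken from SnapQL or any particular app.

```python
from dataclasses import dataclass, field


@dataclass
class Column:
    name: str
    sql_type: str
    description: str = ""      # human-written meaning of the column
    deprecated: bool = False   # columns the model should be told to avoid


@dataclass
class Table:
    name: str
    description: str
    columns: list[Column] = field(default_factory=list)


def build_system_prompt(tables: list[Table]) -> str:
    """Render the schema plus its descriptions as plain text."""
    lines = ["You write PostgreSQL queries. Use only the schema below.", ""]
    for table in tables:
        lines.append(f"Table {table.name}: {table.description}")
        for col in table.columns:
            note = col.description or "(no description)"
            if col.deprecated:
                note += " DEPRECATED: do not use."
            lines.append(f"  - {col.name} ({col.sql_type}): {note}")
        lines.append("")
    return "\n".join(lines).rstrip()


# Hypothetical table with a cryptic column name and a deprecated column.
orders = Table(
    name="orders",
    description="One row per customer order.",
    columns=[
        Column("id", "bigint", "Primary key."),
        Column("st", "smallint", "Order status: 0 = open, 1 = shipped."),
        Column("cash", "numeric", "Old payment amount.", deprecated=True),
    ],
)

prompt = build_system_prompt([orders])
print(prompt)
```

Whether a model actually respects these notes is a separate question; the sketch only shows how the descriptions could be collected and rendered.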

joshstrange · 2025-06-21T13:38:41 1750513121

It’s not a difficult problem to solve, I did it, last year, with 3.5, it didn’t help. That’s not to say that newer models wouldn’t do better, but I have tried this approach. It is a difficult problem to actually get working.

pu_pu · 2025-06-21T13:49:03 1750513743

So, I have not tried it on a very complex database myself, so I can't comment on how well it will work in production systems. I have tried this approach with a single BigQuery table and it worked pretty well for my toy example.

If by 3.5 you mean ChatGPT 3.5 you should absolutely try it with newer models, there is a huge difference in capabilities.

joshstrange · 2025-06-21T13:53:14 1750513994

Yes, ChatGPT 3.5, this testing was a while back. I’m sure it has improved but I doubt it’s solid enough for me to trust.

Example/clean/demo datasets it does very well on. Incredibly impressive even. But on real world schema/data for an app developed over many years, it struggled. Even when I could finally prompt my way into getting it to work for 1 type of query, my others would randomly break.

It would have been easier to just provide tools for hard-coded queries if I wanted to expose a chat interface to the data.
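A minimal sketch of what "tools for hard-coded queries" could look like: the chat layer picks a report name and parameters, and never supplies SQL text. The `REPORTS` catalogue, the `run_report` helper, and the `sales` table are hypothetical, and SQLite stands in for Postgres so the example is self-contained.

```python
import sqlite3

# Hypothetical catalogue of vetted, parameterized queries.
REPORTS = {
    "sales_for_month": (
        "SELECT COALESCE(SUM(amount), 0) FROM sales "
        "WHERE strftime('%Y-%m', sold_on) = :month"
    ),
    "top_customers": (
        "SELECT customer, SUM(amount) AS total FROM sales "
        "GROUP BY customer ORDER BY total DESC, customer LIMIT :n"
    ),
}


def run_report(conn, name, params):
    """Run a vetted query by name; unknown names are rejected."""
    if name not in REPORTS:
        raise KeyError(f"unknown report: {name}")
    return conn.execute(REPORTS[name], params).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (customer TEXT, amount REAL, sold_on TEXT)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("acme", 100.0, "2025-05-03"),
        ("acme", 50.0, "2025-05-20"),
        ("globex", 70.0, "2025-05-11"),
        ("globex", 30.0, "2025-06-01"),
    ],
)

may_total = run_report(conn, "sales_for_month", {"month": "2025-05"})
top = run_report(conn, "top_customers", {"n": 1})
print(may_total, top)
```

The trade-off is coverage: users can only ask what the catalogue anticipates, but every query that runs has been written and reviewed by someone who knows the schema.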

nicktikhonov · 2025-06-21T12:05:51 1750507551

might be possible to solve this with prompt configuration. e.g. you'd be able to explain to the llm all the weird naming conventions and unintuitive mappings

joshstrange · 2025-06-21T12:13:57 1750508037

I did that the last time (again, only with 3.5, things have hopefully improved in this area).

And I could potentially see LLMs being useful to generate the “bones” of a query for me but I’d never expose it to end-users (which was what I was playing with). So instead of letting my users do something like “What were my sales for last month?” I could use LLMs to help build queries that were hardcoded for various reports.

The problem is that I know SQL, I’m pretty good at it, and I have a perfect understanding of my company’s schema. I might ask an LLM a generic SQL question but trying to feed it my schema just leads to (or rather “led to” in my trials before) prompt hell. I spent hours tweaking the prompts, feeding it more context, begging it to ignore the “cash” column that has been deprecated for 4+ years, etc. After all of that it still would make simple mistakes that I had specifically warned against.

pu_pu · 2025-06-21T13:44:58 1750513498

Can you please add support to add descriptions of each column and enumerated types?

For example, if a column contains 0 or 1 encoding the absence or presence of something, LLMs need to know what 0 and 1 stand for. Same goes for column names because they can be cryptic in production databases.

jasonthorsness · 2025-06-20T14:34:35 1750430075

Looks useful! And the system prompt didn't require too much finessing. I wonder how it would work with models later than gpt-4o; in my own dabbling, gpt-4o wasn't quite there yet, and the latest models are getting really good.

For analytical purposes, this text-to-SQL is the future; it's already huge with Snowflake (https://www.snowflake.com/en/engineering-blog/cortex-analyst...).

nicktikhonov · 2025-06-20T15:28:59 1750433339

Appreciate the input! I'd love to be able to support more models. That's one of the issues in the repo right now. And I'd be more than happy to welcome contributions to add this and other features

anshumankmr · 2025-06-20T15:57:07 1750435027

Would love to contribute. I have made a fork, will try and raise a PR if contributions are welcome.

Question, how are you testing this? Like doing it on dummy data is a bit too easy. These models, even 4o, falter when it comes to something really specific to a domain (like I work with supply chain data and other column names specific to the work that I do, that only makes sense to me and my team, but wouldn't make any sense to an LLM unless it somehow knows what those columns are)

nicktikhonov · 2025-06-20T16:09:39 1750435779

I'm using my own production databases at the moment. But it might be quite nice to be able to generate complex databases with dummy data in order to test the prompts at the higher levels of complexity!

And thank you for offering to contribute. I'll be very active on GitHub!

sgarland · 2025-06-20T23:36:14 1750462574

Genuinely do not understand the point of these tools. There is already a practically natural language to query RDBMS; it’s called SQL. I guarantee you, anyone who knows any other language could learn enough SQL to do 99% of what they wanted in a couple of hours. Give it a day of intensive study, and you’d know the rest. It’s just not that complicated.

brulard · 2025-06-21T01:06:07 1750467967

SQL is simple for simple needs, basic joins and some basic aggregates. Even that you won't learn in 2 hours. And that is just scratching the surface of what can be done in SQL and what you need to query. With LLMs and tools like this you simply say what you need in english, you don't need to understand the normalizations, m:n relation tables, CTEs, functions, JSON access operators, etc.

sgarland · 2025-06-21T02:04:34 1750471474

For reference, I’m a DBRE. IMO, yes, most people can learn basic joins and aggregates in a couple of hours, but that is subjective.

> you don’t need to understand the normalizations

You definitely should. Normalizing isn’t that difficult of a concept, Wikipedia has terrific descriptions of each level.

As to the rest, maybe read docs? This is my primary frustration with LLMs in general: people seem to believe that they’re just as good of developers as someone who has read the source documentation, because a robot told them the answer. If you don’t understand what you’re doing, you cannot possibly understand the implications and trade-offs.

aurareturn · 2025-06-21T11:13:18 1750504398

Thank goodness 99% don’t want to understand everything. Otherwise, you wouldn’t be paid very well at your job, right?

physix · 2025-06-21T02:15:21 1750472121

Without having looked at it, I would assume the value comes from not having to know the data model in great detail, such that you can phrase your query using natural language, like

"Give me all the back office account postings for payment transfers of CCP cleared IRD trades which settled yesterday with a payment amount over 1M having a value date in two days"

That's what I'd like to be able to say and get an accurate response.

v5v3 · 2025-06-21T08:43:38 1750495418

In a business, a management decision maker has to rely on a DB analyst if any query they have cannot be answered by any front end tool they have been given. And that introduces latency to the process.

A 100% accurate ai powered solution would have many customers.

But can this generation of llms produce 100% accuracy?

nicktikhonov · 2025-06-20T23:45:58 1750463158

and yet this was on the front page of hacker news for an entire day :D

it's all about friction. why spend minutes writing a query when you can spend 5 seconds speaking the result you want and get 90-100% of the way there.

sgarland · 2025-06-20T23:49:33 1750463373

Mostly because you don’t know if it’s correct unless you know SQL. It’s entirely too easy to get results that look correct but aren’t, especially when using windowing functions and the like.

But honestly, most queries I’ve ever seen are just simple joins, which shouldn’t take you 5 minutes to write.
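One common way a query can "look correct but not be" is join fan-out inflating an aggregate. The sketch below uses a made-up `orders`/`items` schema and SQLite as a stand-in for Postgres; it is an illustration of the general failure mode, not something taken from SnapQL's output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL);
CREATE TABLE items (order_id INTEGER, sku TEXT);
INSERT INTO orders VALUES (1, 100.0), (2, 40.0);
INSERT INTO items VALUES (1, 'a'), (1, 'b'), (1, 'c'), (2, 'd');
""")

# Plausible-looking but wrong: each order row repeats once per item,
# so order 1's total is counted three times.
wrong = conn.execute(
    "SELECT SUM(o.total) FROM orders o JOIN items i ON i.order_id = o.id"
).fetchone()[0]

# Correct for "revenue from orders that have at least one item":
# filter with EXISTS so each order is counted once.
right = conn.execute(
    "SELECT SUM(o.total) FROM orders o "
    "WHERE EXISTS (SELECT 1 FROM items i WHERE i.order_id = o.id)"
).fetchone()[0]

print(wrong, right)
```

Both queries run without error and return a single number, which is exactly why the wrong one is hard to spot without reading the SQL.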

AdieuToLogic · 2025-06-21T01:01:30 1750467690

> Mostly because you don’t know if it’s correct unless you know SQL. It’s entirely too easy to get results that look correct but aren’t ...

This is the fundamental problem when attempting to use "GenAI" to make program code, SQL or otherwise. All one would have to do is substitute SQL with language/library of choice above and it would be just as applicable.

sgarland · 2025-06-21T02:17:02 1750472222

Fully agree, I just harp on SQL because a. It’s my niche b. It always seems to be a “you can know this, but it doesn’t really matter” thing even for people who regularly interact with RDBMS, and it drives me bonkers.

brulard · 2025-06-21T01:13:56 1750468436

> most queries I’ve ever seen are just simple joins

Good for you. Some of us deal with more complex queries, even if it may not seem so from the outside. For example getting hierarchical data based on parent_id, while having non-trivial conditions for the parents and the children or product search queries which need to use trigram functions with some ranking, depending on product availability across stores and user preferences.

I agree knowing SQL is still useful, but more for double checking the queries from LLMs than for trying to build queries yourself.

sgarland · 2025-06-21T02:15:24 1750472124

> getting hierarchical data based on parent_id

So, an adjacency list (probably, though there are many alternatives, which are usually better). That’s not complex, that’s a self-join.
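For readers unfamiliar with the term, here is a small sketch of an adjacency list queried with a self-join. The `category` table is hypothetical and SQLite stands in for Postgres. The recursive CTE in the second query is an addition for arbitrary depth and goes beyond the plain self-join mentioned above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE category (id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT);
INSERT INTO category VALUES
  (1, NULL, 'root'), (2, 1, 'tools'), (3, 1, 'toys'), (4, 2, 'hammers');
""")

# One level: a plain self-join pairs each row with its parent.
pairs = conn.execute("""
    SELECT child.name, parent.name
    FROM category AS child
    JOIN category AS parent ON parent.id = child.parent_id
    ORDER BY child.id
""").fetchall()

# Arbitrary depth: a recursive CTE walks the same parent_id links,
# starting from the 'tools' row (id = 2).
subtree = conn.execute("""
    WITH RECURSIVE tree(id, name, depth) AS (
        SELECT id, name, 0 FROM category WHERE id = 2
        UNION ALL
        SELECT c.id, c.name, tree.depth + 1
        FROM category AS c JOIN tree ON c.parent_id = tree.id
    )
    SELECT name, depth FROM tree ORDER BY depth, id
""").fetchall()

print(pairs)
print(subtree)
```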

> trigram functions

That’s an indexing decision, not a query. It’s also usually a waste: if you’re doing something like looking up a user by email or name, and you don’t want case sensitivity to wreck your plan, then use a case-insensitive collation for that column.
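A minimal sketch of the case-insensitive collation idea, using SQLite's built-in `NOCASE` collation as a stand-in. This is an assumption made to keep the example runnable: Postgres configures case-insensitive comparison differently, so treat the syntax here as illustrative only. The `users` table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The collation is declared on the column, so plain equality and the
# index on that column both compare without regard to ASCII case.
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT COLLATE NOCASE)"
)
conn.execute("CREATE INDEX users_email ON users (email)")
conn.execute("INSERT INTO users (email) VALUES ('Ada@Example.com')")

row = conn.execute(
    "SELECT id FROM users WHERE email = 'ada@example.com'"
).fetchone()
print(row)
```

The query itself stays a simple equality lookup, with no `lower()` calls or pattern matching.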

> I agree knowing SQL is still useful, but more for double checking the queries from LLMs

“I agree knowing Python / TypeScript / Golang is still useful, but more for double checking the queries from LLMs.” This sounds utterly absurd, because it is. Why SQL is seen as a nice-to-have instead of its reality - the beating heart of every company - is beyond me.

brulard · 2025-06-21T10:22:42 1750501362

Your Python / TypeScript etc. argument is a strawman, that's why it sounds absurd. Your arguments would hold better if an average person was good and very quick at learning and memorizing complex new things. I don't know if you work with people like that, but that's definitely not the norm. Even developers know little SQL unless it's their specific focus.

In the original comment you said:

> I guarantee you, anyone who knows any other language could learn enough SQL to do 99% of what they wanted in a couple of hours. Give it a day of intensive study, and you’d know the rest. It’s just not that complicated.

Well your "guarantee" does not hold up. Where I live, every college level developer went through multiple semesters of database courses and yet I don't see these people proficient in SQL. In a couple of hours? 99% of what they need? Absurd

sgarland · 2025-06-21T17:17:56 1750526276

It's not a strawman, it's reductio ad absurdum. SQL and Python are both languages that are commonly used. It would be (currently; who knows in a few years) laughable if someone said they didn't need to deeply understand Python to be able to correctly write Python at an employable level, modulo experience levels - I don't expect a Junior to know the vagaries of the language, e.g. that bools are aliased to integers.
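For anyone who has not run into the bool/int quirk mentioned above, a short sketch of what it means in practice (the variable names are just for illustration):

```python
# bool is a subclass of int, and True/False compare equal to 1/0.
is_subclass = issubclass(bool, int)
equal_to_ints = (True == 1) and (False == 0)

# Booleans therefore take part in arithmetic.
total = sum([True, False, True])

# They also collide with integer dictionary keys: True hashes and
# compares like 1, so this assignment overwrites the existing entry.
lookup = {1: "one"}
lookup[True] = "yes"

print(is_subclass, equal_to_ints, total, lookup)
```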

> Even developers know little SQL unless it's their specific focus.

Yes, and I believe this to be deeply problematic. We don't generally allow people to use a language they don't understand in production, except for SQL.

> Where I live, every college level developer went through multiple semesters of database courses and yet I don't see these people proficient in SQL.

That's horrifying.

Look, while I would love it if everyone writing SQL knew relational algebra, basic set theory, and the ins and outs of their specific RDBMS implementation, I think the below suffices for the majority of work in web dev:

    SELECT: extract the columns that are named, optionally with an alias with AS (or simply a space)
    FROM: the [main] table to extract columns from
    [INNER] JOIN: an additional table to examine, returning only their intersection
    LEFT [OUTER] JOIN: an additional table to examine, returning everything in the LHS table, as well as any matches from the RHS table, with NULLs filling in missing data
    RIGHT [OUTER] JOIN: the same as LEFT JOIN, but with the logic swapped
    FULL [OUTER] JOIN: an additional table to examine, returning the union of both tables, regardless of matches
    ON: an expression to use for joining tables, generally consisting of at least one column from each table to match
    WHERE: a predicate (or series of predicates, with boolean operators joining them) to use for filtering the result set
    ORDER BY: one or more columns to order the result set by, in either ascending (ASC) or descending (DESC) order
    GROUP BY: one or more columns (though strictly speaking, this number must match the number of non-aggregated columns in the SELECT) to group the result set by
    LIMIT: a limit for the maximum number of rows returned

You're telling me that given a simple educational schema like Northwind Traders, and the documentation for their RDBMS, that someone who already knows a programming language couldn't use the above to figure it out in a fairly short order?
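To make the clause list above concrete, here is a sketch of one query that uses most of it. It runs against a tiny made-up `customers`/`orders` schema (not Northwind), with SQLite standing in for whatever RDBMS is in use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, country TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1, 'Ana', 'MX'), (2, 'Bo', 'SE'), (3, 'Cy', 'SE');
INSERT INTO orders VALUES (1, 1, 20.0), (2, 2, 15.0), (3, 2, 5.0);
""")

rows = conn.execute("""
    SELECT c.name AS customer, COUNT(o.id) AS order_count  -- SELECT with aliases
    FROM customers AS c                                    -- main table
    LEFT JOIN orders AS o ON o.customer_id = c.id          -- keep customers with no orders
    WHERE c.country = 'SE'                                 -- filter rows
    GROUP BY c.name                                        -- one row per customer
    ORDER BY order_count DESC                              -- busiest first
    LIMIT 10                                               -- cap the result size
""").fetchall()

print(rows)
```

Because of the LEFT JOIN, a customer with no orders still appears, with a count of 0, since `COUNT(o.id)` ignores the NULLs filled in for the missing match.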

throwmeaway9876 · 2025-06-20T16:23:47 1750436627

Great tool!

Pardon my technical ignorance, but what exactly is OpenAI's API being used for in this?

nicktikhonov · 2025-06-20T16:37:10 1750437430

OpenAI LLM is used to generate SQL based on a combination of a user prompt and the database schema.

jpb0104 · 2025-06-20T15:50:30 1750434630

I like this a lot. I am looking forward to having something similar built into Metabase.

sirjaz · 2025-06-20T15:17:45 1750432665

Looks like a good idea. Any reason you didn't use React native?

nicktikhonov · 2025-06-20T15:20:51 1750432851

Not really - I had some previous experience with electron and wanted to finish the core feature set in a few hours, so just went with what I already know.

s1mplicissimus · 2025-06-20T14:54:57 1750431297

Are there plans to support other LLM sources, in particular ollama?

nicktikhonov · 2025-06-20T15:21:15 1750432875

Yes! https://github.com/NickTikhonov/snap-ql/issues/1

s1mplicissimus · 2025-06-20T16:08:36 1750435716

awesome, looking forward to try it with a self hosted model

kebsup · 2025-06-20T14:44:46 1750430686

I was looking for something like this that supports graphs.

nicktikhonov · 2025-06-20T14:49:19 1750430959

Graph generation is next on the list.

JofArnold · 2025-06-20T16:53:22 1750438402

Neo4j?

iJohnDoe · 2025-06-20T21:50:26 1750456226

Which MCP is the recommended or “official” one for SQLite and PostgreSQL for use with Cursor?

revskill · 2025-06-20T15:04:38 1750431878

What's the underlying model to enable this ?

nicktikhonov · 2025-06-20T15:21:32 1750432892

Currently OpenAI 4o

revskill · 2025-06-20T15:27:41 1750433261

So u already train all knowledgebase or fine tune? Would love to know how can u evaluate correctness.

ramoz · 2025-06-20T17:38:35 1750441115

They don't; it's simply a zero-shot text-to-SQL interface. App development started 2 days ago.

https://github.com/NickTikhonov/snap-ql/blob/main/src/main/l...

thedudeabides5 · 2025-06-20T19:19:24 1750447164

data engineering about to be eaten by llms

zicon35 · 2025-06-20T11:18:28 1750418308

congrats on the launch! This looks very interesting

GarrickDrgn · 2025-06-20T14:29:46 1750429786

Am I misunderstanding something? How is this "Everything runs locally" if it's talking to OpenAI's APIs?

whilenot-dev · 2025-06-20T14:59:56 1750431596

This app is using OpenAI via the ai package[0][1], so "Everything runs locally" is definitely misleading.

[0]: https://github.com/NickTikhonov/snap-ql/blob/409e937fa330deb...

[1]: https://github.com/vercel/ai

piskov · 2025-06-20T15:10:13 1750432213

I guess he means there is no proxy between you and openai. API key won’t leak, etc.

nicktikhonov · 2025-06-20T15:22:07 1750432927

What I meant was that it isn't a web app and I don't store your connection strings or query results. I'll make this more clear

kokanee · 2025-06-20T16:19:48 1750436388

It is a web app, though. You just aren't running the server, OpenAI is. And you're packaging the front end in electron instead of chrome to make it feel as if it all runs locally, even though it doesn't.

Side note: I don't see a license anywhere, so technically it isn't open source.

omega3 · 2025-06-20T15:27:29 1750433249

You might not but openai does.

nicktikhonov · 2025-06-20T15:52:04 1750434724

That makes no sense. OpenAI doesn't know the secret database connection string or any query results. Perhaps you should have read the code before making baseless claims.

nessbot · 2025-06-20T15:55:07 1750434907

But it knows what you're querying, which depending on what you're doing may also give away a good bit about what's in the DB.

doctorpangloss · 2025-06-20T15:37:41 1750433861

API gateways could accept public keys instead of generating bearer tokens. Then the private key could reside in an HSM, and apps like this could give HSMs requests to sign. IMO even though this could be done in an afternoon, everyone - Apple and Google, the CDN / WAF provider, the service provider - is too addicted to the telemetry.

esafak · 2025-06-20T14:07:25 1750428445

If you can do this, can't you create a read-only user and use it with a database MCP like https://github.com/executeautomation/mcp-database-server ? Am I missing something?
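A small sketch of the read-only idea. In Postgres this would typically be a dedicated role limited to SELECT. To stay self-contained, the example below instead opens a SQLite file in read-only mode through a URI, which is a stand-in and not how SnapQL or the linked MCP server is configured. The temporary file path and table are hypothetical.

```python
import os
import sqlite3
import tempfile

# Create a small database file with a normal read-write connection.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
rw = sqlite3.connect(path)
rw.execute("CREATE TABLE t (x INTEGER)")
rw.execute("INSERT INTO t VALUES (1)")
rw.commit()
rw.close()

# Reopen it read-only; this is the connection an AI tool would be handed.
ro = sqlite3.connect(f"file:{path}?mode=ro", uri=True)
rows = ro.execute("SELECT x FROM t").fetchall()

try:
    ro.execute("DELETE FROM t")
    write_blocked = False
except sqlite3.OperationalError:
    # SQLite refuses writes on a read-only connection.
    write_blocked = True

print(rows, write_blocked)
```

The point is that the restriction is enforced by the database connection, not by trusting the generated SQL.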

nicktikhonov · 2025-06-20T14:24:22 1750429462

You can set up an MCP and use it in your existing AI app, but this is, afaiu, the first open source standalone app that gives you an interface familiar from other SQL workspace tools. I built it to be a familiar but much more powerful experience for both technical and nontechnical people.

esafak · 2025-06-20T14:49:42 1750430982

There are competitors with a GUI too, such as https://www.sqlchat.ai/ and https://www.jetbrains.com/datagrip/features/ai/

I wish you luck in refining your differentiation.

BenderV · 2025-06-20T15:17:05 1750432625

Shameless plug, our own tool => https://www.myriade.ai

> I wish you luck in refining your differentiation.

Can't agree more with you. It's about distribution (which Snowflake/Databricks/... have) or differentiation.

Still, chatting with your data is already working and useful for lots.

nicktikhonov · 2025-06-20T15:27:33 1750433253

The first doesn't have good UX and the second isn't open source. SnapQL is both :) But I'll find new ways to differentiate for sure, it's part of the fun of building.

un1970ix · 2025-06-20T16:51:59 1750438319

Your project is source-available, not open-source. Consider adding a license.

ramoz · 2025-06-20T17:40:38 1750441238

https://dbeaver.com/docs/dbeaver/AI-Smart-Assistance/

bobbyraduloff · 2025-06-20T18:21:12 1750443672

[flagged]

nicktikhonov · 2025-06-20T18:32:14 1750444334

Interesting lead. What else would they be looking for in a tool like this? My bad re the video, I'll make sure not to toggle dark mode in the next one.

jaimin888patel · 2025-06-20T11:36:12 1750419372

awesome work nick, literally been asking for a vibe coding SQL interface for months

nicktikhonov · 2025-06-20T15:29:23 1750433363

thanks Jaimin. happy you finally found what you were looking for :D