
As a FOSS project which publishes debs, it's a real hassle using current tooling. The tools are heavily biased towards the needs of distro maintainers, forcing projects into overly complex workflows that are overkill for a small number of packages. Features that would benefit smaller repositories with more frequent updates are limited or missing.

The other option is to DIY without tooling (essentially, write custom tooling as scripts or whatever), which is messy and full of pitfalls for the inexperienced.


Yup, this was similar to our experience. If publishing is still a source of toil for you, feel free to reach out at eliza@attunehq.com - we'd love to see if we can do any dev work or hosting pro bono for your open source projects.


I agree, but only for situations where the probabilistic nature is acceptable. It would be the same if you had a large team of humans doing the same work. Inevitably misclassifications would occur on an ongoing basis.

Compare this to the situation where you have a team develop schemas for your datasets which can be tested and verified, and fixed in the event of errors. You can't really "fix" an LLM or human agent in that way.

So I feel like computing traditionally excelled at many tasks that humans couldn't do - computers are crazy fast and don't make mistakes, as a rule. LLMs remove this speed and accuracy, becoming something more like scalable humans (their "intelligence" is debatable, but possibly a moving target - I've yet to see an LLM that I would trust more than a very junior developer). LLMs (and ML generally) will always have higher error margins; it's how they can do what they do.


Yes, but I see it as multiple steps. Perhaps the LLM solution has some probabilistic issues that only get you 80% of the way there, but that probably already gives you some ideas on how to better solve the problem. In this case the problem is somewhat intractable because of the size and complexity of the way the data is stored. So in my example the first step is LLMs, but the second step would be to use what they produce as structure for building a deterministic pipeline. This is because the problem isn't that there are ten thousand different pieces of metadata, but that the structure of that metadata is diffuse. The LLM solution will first help identify the main points of what needs to be conformed to the monolithic schema; then I will build more production-ready and deterministic pipelines. At least that is the plan. I'll write a Substack post about it eventually if this plan works, haha.
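
Roughly the shape of that second step, as a toy Python sketch (the source field names here are invented, and in practice the mapping table is what the LLM pass plus a human review produces):

    # Step 1 (offline): the LLM proposes how each source's diffuse metadata
    # maps onto the target schema; the reviewed result is a plain mapping table.
    FIELD_MAP = {
        # source key/path          -> canonical schema field
        "dc:creator":                 "author",
        "meta.owner.name":            "author",
        "created_at":                 "published_date",
        "pubDate":                    "published_date",
    }

    def get_path(record: dict, path: str):
        """Walk a dotted path like 'meta.owner.name' through nested dicts."""
        value = record
        for part in path.split("."):
            if not isinstance(value, dict) or part not in value:
                return None
            value = value[part]
        return value

    # Step 2 (production): apply the reviewed mapping deterministically --
    # no LLM in the hot path, so results are repeatable and testable.
    def conform(record: dict) -> dict:
        out = {}
        for source_path, target_field in FIELD_MAP.items():
            value = get_path(record, source_path)
            if value is not None and target_field not in out:
                out[target_field] = value
        return out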


I don't think you can say that is the problem. It may have exacerbated the issue, but problems exist when summarising full news content too: https://www.bbc.co.uk/news/articles/c0m17d8827ko


Of course XMPP doesn't count... even though it's a standard, allows people to run their own server, have an email-like address of user@server, allows you to communicate from anywhere, and is how I, my family, and many many others chat online? :)


I agree with you that "server-managed" E2EE wouldn't really be E2EE, and I agree about most proprietary platforms lacking the necessary transparency around this.

From the XMPP perspective though, I want to clarify that ejabberd does not have "its own E2EE" and the E2EE that is used in modern XMPP apps (OMEMO) is client-managed and allows you to verify keys using e.g. a QR code.

OTR's limitations are quite significant (lack of file sharing, group chats, offline messages, to name a few). I don't think that helps E2EE adoption. Unless someone picks up the OTRv4 work, but even that had excluded some of those items from its scope IIRC.


> OTR's limitations are quite significant (lack of file sharing, group chats, offline messages, to name a few). I don't think that helps E2EE adoption.

Absolutely fair points. I suppose a part of me was hoping that if it were adopted then work would continue on it with a new set of eyes looking into the limitations.


I admire Dijkstra for many things, but this has always been a weak argument to me. To quote:

"when starting with subscript 1, the subscript range 1 ≤ i < N+1; starting with 0, however, gives the nicer range 0 ≤ i < N"

So it's "nicer", ok! Lua has a numeric for..loop, which doesn't require this kind of range syntax. The loop takes x, y, step, where both x and y are included in the range, i.e. Dijkstra's option (c). Dijkstra doesn't like this because expressing the empty range is awkward. But it's far more natural (if you aren't already used to languages from the 0-indexed lineage) to simply specify the lower and upper bounds of your search.
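
To make the contrast concrete, here's a rough Python sketch of the same idea (not Lua, but Python's range() is exactly the half-open convention Dijkstra prefers):

    n = 5

    # Half-open convention (Dijkstra's option a): lower bound inclusive,
    # upper bound exclusive. The length is simply upper - lower, and the
    # empty range is the natural range(0, 0).
    for i in range(0, n):
        print(i)            # 0, 1, 2, 3, 4

    # Inclusive convention (like Lua's `for i = 1, n`): you state exactly
    # the first and last item you want...
    for i in range(1, n + 1):
        print(i)            # 1, 2, 3, 4, 5

    # ...but the empty range now needs upper = lower - 1, which is the
    # awkward case Dijkstra objects to.
    for i in range(1, 0 + 1):
        print(i)            # prints nothing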

I actually work with Lua all the time, alongside 0-indexed languages such as C and JS. I believe 0 makes sense in C, where an array subscript is really just an offset from a pointer. That still doesn't make the 1st item the 0th item.

Between this and the fact that, regardless of language, I find myself having to add or subtract 1 frequently in different scenarios, I think it's less of a big deal than people make it out to be.


In any language, arrays are inherently regions of memory and indexes are -- whether they start at 0 or 1 -- offsets into that region. When you implement more complicated algorithms in any language, regardless of whether it has pointers or how arrays are syntactically manipulated, you start having to do mathematical operations on both indexes and ranges of indexes, and it feels really important to make these situations easier.

If you then consider even the simple case of nested arrays, I think it becomes really difficult to defend 1-based indexing as being cognitively easier to manipulate, as the unit of "index" doesn't naturally map to a counting number like that... if you use 0-based indexes, all of the math is simple, whereas with 1-based you have to rebalance your 1s depending on "how many" indexes your compound unit now represents.
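
A quick Python sketch of that rebalancing, using a 2-D grid stored row-major in a flat list (names are just illustrative):

    rows, cols = 3, 4
    flat = list(range(rows * cols))   # a 3x4 grid stored row-major in one list

    # 0-based: an index is "how many elements come before this one",
    # so composing indexes is plain multiplication and addition.
    def at0(r, c):
        return flat[r * cols + c]

    # 1-based indexes on the outside: strip the extra 1s off before doing
    # the same arithmetic (the flat list itself is still 0-based).
    def at1(r, c):
        return flat[(r - 1) * cols + (c - 1)]

    assert at0(2, 3) == at1(3, 4) == flat[-1]   # last element either way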


And the reason to dismiss c) and d) is so that the difference between the delimiters is the length. That's not exactly profound either.

If the word-for-word same argument were made by an anonymous blogger, no one would even consider citing it as a definitive argument that ends the discussion.


You should take a look at Fennel ( https://fennel-lang.org/ ) then - it takes Lua in a lisp direction, including macro support.


The protocol isn't really an issue for the use-case you talk about. I founded the Snikket project, which aims squarely at the family-and-friends use case you mention (after all, it was made to scratch my own itch - my family's excessive use of WhatsApp for communicating with each other). I can tell you that my family don't care a bit whether Snikket uses IRC, XMPP or Matrix or some real-time Gemini equivalent.

There may be some scalability differences between different protocols/implementations for the admin of the service, but Snikket fits comfortably on even low-end Raspberry Pi devices, and literally over half of the typical resource usage is by the web dashboard (yay Python).

So what difference does the protocol make? It can make a difference to the developer experience. If all you want to do is exchange text messages, then yeah, XMPP and Matrix are absolutely overkill. But - especially for a family-and-friends use case - people also want file sharing, audio/video calls, and all that stuff. It very quickly gets quite complex to support all of this, especially in a way that allows you to evolve the protocol over time (trust me: what you think of as core messaging features today were not a thing 10+ years ago, and messaging in 10+ years will also involve a new set of features).

There will always be a set of users for whom plain text messaging is enough (90% of my own daily communication is via messaging in a terminal app). However that set does not intersect significantly with the general population, and practically none of my family members would accept such a solution as a replacement for WhatsApp.


Hey, your project looks interesting, thanks for building and sharing it.

One question: are you aware of Jami[1], f.k.a. Ring? If so, how does it compare to Snikket?

I see that Snikket requires a server, whereas Jami is P2P. The benefit of a server is probably that messages can be stored centrally and not on each device. But I can see pros and cons of either approach.

[1]: https://jami.net/


Hey, thanks! I've been into messaging for quite a long time - network protocols and particularly those involving online communication are among my favourite tech topics :) So yeah, I follow various projects.

You're right that there are pros and cons. Obviously, not having to run a server is a big pro for many. However, the first thing to remember when researching messaging solutions - no matter what anyone tells you - is that there are always servers! What differs between projects/platforms is what the servers do, and who runs them.

Jami uses a network of public servers that form a distributed hash table (see https://github.com/savoirfairelinux/opendht ). It's a neat design, and they have done a good job tackling the challenges of P2P messaging. Last time I looked in, it still required both users to be connected at the same time for message delivery/sync to work (the devices use the DHT to discover each other, and then exchange messages). This is a fairly common issue for P2P systems, and can be frustrating in a mobile-dominated world. Their DHT software does support push notifications, which helps with that though.

Another project in this category is Briar, which uses the existing network of Tor servers - and therefore adds IP address masking and a layer of metadata protection (as always, there are limitations, e.g. https://code.briarproject.org/briar/briar/-/wikis/FAQ#does-b... ). Briar built the concept of "mailbox" nodes you can run ( https://briarproject.org/download-briar-mailbox/ ) to overcome some of the problems with P2P messaging.

With Snikket, instead of using existing publicly shared infrastructure, you just run your own server (e.g. VPS or Raspberry Pi or whatever) which is responsible just for your users, and your users connect directly to it, improving (meta)data locality. This makes the design very simple, reliable and efficient (e.g. with battery/bandwidth). It also enables some important (for our use case) convenience/UX features, such as the ability to add restrictions on certain accounts (e.g. for children), and server-managed contact lists so all your family members don't have to manually add each other as contacts one-by-one. Things like that.

No approach is universally better than every other, but I much prefer the Snikket model for the family-and-friends use case. Not that we don't have our own challenges. Our iOS app is probably the weakest part right now (in terms of UX and general polish). Something I'm working hard to get fixed in 2025.


Thanks for your perspective.

Yeah, there are definite challenges of the P2P architecture. But like you say, Jami seems to have done a good job addressing them.

I looked at Briar, but it has a different focus and is more limited in functionality and less polished than Jami. My use case is text messaging and audio/video calls with a close group of contacts, so Jami and your project look like a better fit. I also considered Matrix/Element/FluffyChat, but the Matrix architecture is confusing, and the clients are underwhelming.

Anyway, good luck with Snikket! If Jami doesn't work out for me, I'll definitely give it a try.



> In theory you could have brisk competition in Mastodon clients but we have been that way with IRC, XMPP and RSS where clients have not improved in 20 years.

This is an often-repeated myth.

I have a lot of experience with XMPP, and I can tell you that many of today's clients are unrecognisable compared to equivalents 20 years ago.

Then IRCv3 is a thing in multiple IRC clients and networks.

RSS, well, I'm not sure how much evolution is needed there anyhow.

Measured by longevity, the most successful networks have all been open ones. The proprietary ones come and go.


What I have seen with XMPP is major uptake with law enforcement and military. Lots of custom systems there.

RSS is a great protocol to talk about; it's not such a great protocol to use.

Right now my RSS reader subscribes to about 110 feeds for 10 cents a month per feed through Feedburner, which does all the crawling for me and just hits a websocket when a new article comes in, which posts it to SQS. When I want to add new articles to my database I just suck 'em out.

It's a bargain for feeds like MDPI and arXiv and the Guardian that have many articles per day, but I could not afford to subscribe to the 2000+ indie blogs that I'd like to follow and in which YOShInOn could easily find articles that interest me.

I could write my own RSS crawler and probably will at some point, but it is a hassle. I can crawl frequently and make my home internet unusable, or I can crawl infrequently and have stale results. Worse yet, if a feed changes rapidly I might even miss updates completely.

I can define some loss function that lets me find an optimal combination of "doesn't poll too fast" and "isn't too stale" for each feed individually, but I expect to experience the consequences of both underpolling and overpolling unless I make a definite decision that I am going to underpoll or overpoll by a few orders of magnitude.
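
Something like this crude per-feed heuristic is what I have in mind (a Python sketch, not the full loss-function treatment, and the constants are arbitrary):

    from statistics import median

    def next_poll_interval(item_ages_secs, min_s=15 * 60, max_s=24 * 3600):
        """Pick a per-feed poll interval from the gaps between recent items.

        item_ages_secs: seconds since each of the feed's recent items was
        published, newest first. Polling at roughly half the typical gap
        keeps staleness bounded without hammering quiet feeds.
        """
        if len(item_ages_secs) < 2:
            return max_s                  # not enough history: poll rarely
        gaps = [b - a for a, b in zip(item_ages_secs, item_ages_secs[1:])]
        typical_gap = median(gaps)
        return max(min_s, min(max_s, typical_gap / 2))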

There was so much fighting between RSS and Atom over the exact syntax of the records, but none about the problem of polling and polling and polling and polling. Confront Dave Winer with ActivityPub or the AT Protocol and it's like showing somebody a car: they don't want to hear about the engine and the brakes and all that, they just want to know where to hitch the horse.


Yeah, existing proprietary (e.g. Apple/Google) push notifications are not E2EE, so apps that need to convey sensitive information (secure messaging apps, etc.) tend to take this approach already regardless of push provider.
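
For illustration, the pattern is roughly this (a toy Python sketch using a symmetric key that the app and its own server share out-of-band; real apps wire this into their existing E2EE sessions rather than a standalone Fernet key):

    from cryptography.fernet import Fernet

    # Key negotiated between the app and its own server (e.g. at
    # registration); Apple/Google never see it.
    shared_key = Fernet.generate_key()

    def build_push_payload(message: str) -> bytes:
        # Encrypted on the app's server before anything is handed to the
        # push provider; the provider just relays an opaque blob.
        return Fernet(shared_key).encrypt(message.encode())

    def handle_push(payload: bytes) -> str:
        # Decrypted on the device, inside the app, where the key lives.
        return Fernet(shared_key).decrypt(payload).decode()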

