Back in university I had an experience that was really instructive. For a projec...

skohan · on Jan 28, 2022

Yeah this just hasn't been my experience. If you're working on a house you measure twice and cut once because the cost of reworking physical materials is a lot more expensive than the cost of doing a second measurement.

If you could delete half your house and re-build it at zero cost, it might be more valuable to just go for the first attempt and learn from it rather than trying to do everything in theory up front.

If you find yourself working on a "monstrosity" maybe you haven't seen the signs soon enough that you need to take a step back and refactor. But in my experience, at least starting on the problem with a POC gives you so much more high quality information that even if you have to scrap and re-write part way in, you're going to reach such a better result than if you try to map the whole thing out first without actually having tried to solve the problem.

nthj · on Jan 28, 2022

I’ve found people often overlook that while code can be very quickly deleted, gigabytes or terabytes of production data is a huge pain to ETL later. Investing in the data model upfront has huge payoffs for your code and avoiding ETLs later on.

gefhfffh · on Jan 28, 2022

When can you rebuild at zero cost?

I have made similar avoidable mistakes of not thinking it through enough, could have saved me a lot of rewriting, which was pretty expensive

skohan · on Jan 28, 2022

Rewriting is incredibly cheap! And you learn a lot from the failed attempts.

Again to the house analogy, if you could just build 3 vestibules to see how they fit with just a little typing, that would be far and away preferable to committing to everything on paper before hand.

gefhfffh · on Jan 28, 2022

No, it is not cheap, generally.

Example:

Dendrite[0] was going to be a Rewrite of Synapse (which was a prototype which ended up going into production). The rewrite started more than 5 years ago, had lots of development breaks and it still is nowhere complete or close to replace a existing Synapse instance (which even today is ... Well ... suboptimal software).

The current plans are to support and use both servers long term, because Synapse is already too widespread.

I wouldn't call that cheap.

[0]: https://github.com/matrix-org/dendrite

skohan · on Jan 28, 2022

Well starting from scratch for a rewrite is generally a bad idea. I'm more a fan of incremental re-wrties. But I have code bases I have been working on for years with tens of thousands of lines where basically the entire thing gets rewritten ever 18 months or so, a bit at a time.

Isn't your example proof of the benefit of pragmatic coding? Synapse is actually serving users right now. Dendrite sounds like a "better design" which is stuck in purgatory.

gefhfffh · on Jan 28, 2022

They tried an incremental rewrite before. They failed, and started Dendrite.

https://matrix.org/blog/2020/10/08/dendrite-is-entering-beta

MrPatan · on Jan 30, 2022

Survivor bias.

You don't see all the perfectly designed codebases that never went anywhere.

We only talk about legacy code because it's the one that pays the bills.

nfw2 · on Jan 28, 2022

Are you the only working on the codebase? It may be easy to rewrite your own codebase, but it's certainly not easy to rewrite someone else's. Especially if they haven't been caring about code quality and/or test coverage.

My definition of "good-quality" code is pretty much exactly "how difficult would this codebase be for a new engineer to understand and modify safely."

hinkley · on Jan 28, 2022

You can go really really fast when you don’t give a shit about consequences.

Often the worst code comes from prolific people. There’s just so much if it. And if you touch it you will break it at least 1% of the time, so you have to pick your battles when you are trying to keep the ratio under control.

skohan · on Jan 28, 2022

If you write loosely-coupled code, it shouldn't be that easy to break things unless you're careless.

hinkley · on Jan 28, 2022

Nobody in this whole post is talking about loosely coupled code. If you find someone complaining about how hard it is to modify loosely coupled code, you have my permission to fire them.

For everyone else this is tautological. Good code can continue to be good code.

hinkley · on Jan 28, 2022

If it were loosely coupled it wouldn’t be an anecdote in this conversation. There is no “I”, there is no “you”. There is only “us”. I can only control Us so much, and I don’t have a time machine.

People who only have green field projects as their context are very frustrating in conversations like this. They make suggestions like, well, don’t fuck up in the first place. I don’t know what your history is but that’s the feeling I’m getting.

skohan · on Jan 28, 2022

Not that it matters but I've worked on a mix of green-field projects and mature codebases in various domains with teams of various sizes over more than a decade of professional experience.

I could assume you're throwing shade on "prolific programmers" out of some sense of insecurity, but it wouldn't be fair to generalize about strangers on the internet ;)

hinkley · on Jan 28, 2022

No, It’s an overdeveloped sense of justice.

I just punch up at condescending people. Every discipline has a bunch of armchair people who don’t understand the problem who think “get more exercise” is the response to depressed people or people with chronic fatigue, “eat fewer calories” is the answer to weight issues or diabetes, or “write it right the first time” is a useful response to people trying to solve real world problems.

Kindly let the grownups talk and keep your flash card answers to yourself.

skohan · on Jan 28, 2022

> Kindly let the grownups talk

Well one of us is certainly being condescending ;)

And did you really answer twice to the same comment? My goodness I must have really struck a nerve.

nfw2 · on Jan 29, 2022

Well this escalated quickly

hinkley · on Jan 28, 2022

If everyone is discussing a problem that you don’t see, then why offer your simple solution except to appear smart? And who needs to appear smart? That’s your insecurity. Not mine.

skohan · on Jan 28, 2022

Idk maybe it's because I did freelancing for quite some time and was often hired to clean up somebody's mess, but I think it's not so bad to rewrite someone else's code. What I normally do is quarantine the old code behind a clean interface, or else start fresh with a better code structure, and then copy and paste the good bits of business logic from the old project into the new one.

Code has a lot of really great properties which make it easy to modify in provably safe ways if you know what you're doing.

vault_ · on Jan 28, 2022

That really depends on what you're working on and to what degree it's coupled to the system it's a part of. A form on a web application, or an API endpoint? Sure, rewriting it is probably trivial. A new process scheduler for Linux? The caching system in an HTTP server? Maybe writing the code will be easy (though probably not), but building any confidence that it doesn't break something that's unexpectedly load bearing will be anything but cheap. And if what you're rewriting that started out as "just code and see what happens" it's going to be more expensive still.

Which isn't to say that rewriting can't be cheap, but some intentional design (or at least diligent maintenance and refactoring) must have gone into the system to support that style of development. At which point you're back to targeting "quality," even if it's no longer a focus of on the smaller scale.

DerArzt · on Jan 28, 2022

How are defining cheep, and how big is the metaphorical house? Developer time is not free (unless you are working a personal project) and if an hour or two up front of planning things out stops you from making a lynch pin mistake that needs to be re-written it is better to do the planning.

If the amount of time that it takes to rewrite something is trivial (a week or less) then you aren't working on something all that big/complicated.

runald · on Jan 28, 2022

Probably not literally without cost, but if the code was written with disposability in mind combined with just a little bit of pre-planning, then rewriting or refactoring should be indeed trivial.

gefhfffh · on Jan 28, 2022

When start off with "wrong" higher level design concepts, neither of those is trivial.

If you're writing a function implementation without thinking about design, well you might be right.

sseagull · on Jan 28, 2022

I sometimes wonder if there is a miscommunication. I’m scratching my head sometimes like “how can rewriting/refactoring an entire 10-20k project take negligible time?”

Maybe some people have very small projects compared to what I work on? Or maybe they are talking about the design of a single small component?

I inherited a codebase that needed some refactoring because it was written “to just get it shipped”. If completely fell apart with more users and has taken me a year to get it where it needs to be.

skohan · on Jan 28, 2022

But how often would you actually have to rewrite an entire 10-20k LOC project at once?

As you need to add functionality, you can opportunistically find ways to improve the codebase toward a better/more maintainable structure.

I have several large codebases I work on which probably get totally rewritten every 18 months or 2 years, however incrementally.

If you have a working project and you think you need to throw it all out and start from scratch probably it's worth a second thought.

hinkley · on Jan 28, 2022

We always talk about time when the elephant in the room is energy. People say we don’t have “time” for that and someone else gets out a calendar and tries to disprove them. Followed by a bunch of backpedaling with other excuses and followed up with foot dragging.

The second elephant in the room is job security. People who write baroque code are hard to fire. Nobody wants to invest energy in understanding their private little Bedlam.

runald · on Jan 29, 2022

When starting, yes it's easy to replace code, even with a shitty design. If it had to take more than ten hundred thousands lines of code to realize that the design is wrong, then it means the author lacks enough awareness or foresight to plan ahead, and no amount of planning will fix that. Code should be replaced/rewritten if the earliest signs faulty design show up, which should be trivial if the code was and remains "disposable".

mint2 · on Jan 28, 2022

Not having to pay for physical materials doesn’t mean there are no costs. Throwing hours, days, or weeks of focused work time out the window does not sound cost free to me or Whoever signs paychecks.

skohan · on Jan 28, 2022

It's not throwing it out the window if it's an iteration toward a better solution. There's a reason nobody does waterfall anymore in software - up-front planning is less productive than rapid prototyping in most cases.

mannykannot · on Jan 28, 2022

The development of the relational model of databases is an example where thinking things through led to a radically different and superior solution, going in a very different direction than ad-hoc development had produced, or was ever likely to produce. At the time Codd published his seminal paper, there were no implementations of those ideas.

It is also notoriously difficult to get the design of concurrency primitives correct without thinking things through.

geodel · on Jan 28, 2022

Yeah, but >99% code is about using relational DBs and concurrency primitives not developing them. There are somethings that would depend on solid theoretical understanding but most of the software today is much more amenable to trying out things first and reworking failed parts.

mannykannot · on Jan 28, 2022

In practice, one does not write a single line of code without feeling that it is somehow getting you closer to the desired outcome. If you have the ability to anticipate that it will not contribute to that goal, or that it will create problems on the way, or that there is a better way, even before you have written and tested it, then it would be counterproductive not to do so.

skohan · on Jan 28, 2022

But there's also such a thing as over-planning. If every function you write, you are thinking about 100 different rules about "best practices" you can end up not writing anything at all. Sometimes it's better to accept some level of imperfection first and refine later.

mannykannot · on Jan 28, 2022

Sure, but your reply to Jcbrand was dismissive of the idea that there is any value to thinking ahead. Jcbrand was not advocating 'thinking about 100 different rules about "best practices"', only that it is useful to try to work out the consequences of the choices you make, in advance of those consequences being revealed to you by failed tests (or failures in use.)

I have no idea why there is a large (or at least vocal) community of developers whose dogma seems to be that thinking things through is a waste of time (though maybe it is just an overreaction to the equally dogmatic clean coders and similar prescriptivists.)

skohan · on Jan 29, 2022

I mean it's always a bit of a middle ground isn't it? I'm not suggesting that you should literally just sit down at a keyboard and blindly start typing - of course you want to have at least some concept of how you want to approach the problem.

My point is more that coding itself is an excellent tool for probing for solutions. In many cases I think "software design" is over valued, and time spent prototyping is often more valuable than time spent thinking though the problem if you want to arrive at a high quality answer.

Aqueous · on Jan 28, 2022

As someone who is currently over a year into rewriting a massive system with a fundamental design error by the original designer I can assure you that failing to plan your data model up front can have huge costs not just for you but for anyone who picks up your code in the future, and can hamstring a system so that it is impossible to extend or evolve.

skohan · on Jan 28, 2022

What was the design error?

Aqueous · on Feb 3, 2022

It's a platform for managing a kind of appointment, but it doesn't have an appointments table. The appointments data is combined with a different table, and there was no way to disentangle them easily because the entire system was built around that object model. Case study in failure to normalize.

The hard part is done, but basically we had to switch the engine while the car was running so to speak. In order to switch this out you need to start writing the correct data shape, then switch everything over to reading that shape. When people work strict 9 to 5s this will take you forever especially when managing a large amount of volume, which requires you to be extremely risk-averse and slow.

d0mine · on Jan 28, 2022

I don't know about your experience but refactoring is just not on the menu in most commercial environments I'm familiar with, therefore if you always pick the first solution that cones to find, it is likely you have to leave with the consequences of the hack for a long time (until the system crumbles under its own complexity).

skohan · on Jan 28, 2022

I guess I have been lucky - I've always been able to negotiate for time to refactor if needed, or just find time for it in lulls between tasks. If you're in an environment where engineers don't have the freedom to improve their codebase I would not consider that to be a healthy practice, but that certainly would change the value proposition around upfront design.

deterministic · on Jan 30, 2022

I refactor all the time and have done it for 25+ years. The result is solid code that is easy to maintain and close to bug free (no bugs in production the last 5+ years).

d0mine · on Jan 30, 2022

Could you describe the process: did you need to justify the time spent on refactoring in any way? ("why known bugs, requested new features should wait until the refactoring is done"--I'm playing devils advocate here. I'm interested, how you justified it before the management if you had one)

deterministic · on Jan 31, 2022

I never asked permission to do it. I consider refactoring to be part of my job. Small refactorings I do right away when implementing a new feature. Large ones I split into many small steps and work on for months in-between working on new features. Never underestimate the power of making a small improvement every day. I make sure that the long term refactorings never breaks the system or introduce new bugs. I have solid tests in place making sure changes doesn’t change behaviour.

hinkley · on Jan 28, 2022

This is a very self-centered way to think about things. I don’t mean selfish, I mean thinking as a “me” problem instead of an “us” problem.

If I have to rewrite a bit of my code then them’s the breaks. But I work on a team, sometimes a big team. I don’t have “a house” I have a construction crew that is building many houses and will go on building them. If they’re doing it wrong then I have not only the problem in front of me but five copies elsewhere. And I can’t fix problems N times faster than they are made. And I can’t always sell them on the better technique, even when there are demonstrable problems with theirs.

skohan · on Jan 28, 2022

I want to work with people who I trust to rewrite my code.

And of course I don't mean that you should check in code which is a mess. But from the time you start a feature to the time you open a PR, you can go through several iterations of messy code before arriving at a solution which is fit to share with your colleagues.

hinkley · on Jan 28, 2022

I do too, but I can’t spend my whole life interviewing, and sometimes great teams just run out of money.

nfw2 · on Jan 28, 2022

Usually you can't delete half your codebase and re-build it at zero cost. The cost of developing a codebase is often the primary cost for software companies.

hinkley · on Jan 28, 2022

The cost of the knowledge of that code and the techniques that created is spread in five to fifty other brains. That’s the hard part. You keep finding new copies of patterns you’re trying to remove.

In a couple of notable cases, that didn’t stop until I removed the last copy. My theory is that certain people were cutting and pasting code from one of the three surviving copies.

AlchemistCamp · on Jan 28, 2022

> "If you could delete half your house and re-build it at zero cost, it might..."

I look forward to that day when atoms are commanded like bits!

bschwindHN · on Jan 28, 2022

Weeks of programming can save you hours of planning!

skohan · on Jan 28, 2022

You can also spend weeks trying to solve a problem on paper which can be solved with hours of experimentation

CipherThrowaway · on Jan 28, 2022

I think this is the kind of thing you learn at uni and then potentially unlearn later on. Over the years I became better at using code to explore problem spaces and as a design tool. Nowadays I feel that incremental design delivers better results in less time than upfront design.

marcos100 · on Jan 28, 2022

I think your incremental design delivers better results because you already know or at least have a hunch of what wouldn't work and avoid that. You have an abstract architecture when starting and change accordingly on the fly, while programming, using your own best practices.

Top down and bottom up architecture have their places. Being extreme in favor of one side is usually bad, as almost anything in life.

skohan · on Jan 28, 2022

I'm just having trouble understanding what you're talking about. Like what would be a concrete example of how a poor up-front design decision would paint you into an unrecoverable corner?

kaba0 · on Jan 28, 2022

My experience says that you might not need much design for a typical CRUD app, but try to write a JVM/compiler/database and you will quickly see that a bad design pretty much aborts the given project and you have to start from almost scratch.

There is no incrementel rewrite between different stack/heap handling as those are an absolutely central parts of the design, which are pretty much impossible to try to encapsulate, as opposed to the 34th API endpoint. So what it means is that certain domains have much higher essential complexity and at that point the average encapsulation given by OOP/language tools are not sufficient to contain these parts, complexity will triumph and the whole program has to be viewed as one unified whole. Concurrent applications are a similar can of worms.

CipherThrowaway · on Jan 29, 2022

All the technologies you mentioned do undergo large component rewrites and refactoring very often. It's true that e.g. the JVM is sometimes hemmed in by decisions from the past. But it is a decades old project and it is not clear that more up front design and deliberation would have future proofed the project for the language and VM conventions of the 2020s.

I have applied incremental design to concurrent applications and a compiler + stack VM project that runs in embedded environments. You don't go in blind. You do need domain experience and broad strokes knowledge of the conventions. You make some major architectural decisions up front but these don't involve much planning or design. Contrary to your point about CRUD apps, API design is harder to achieve incrementally since it is an interface and requires cross-team (sometimes cross-organizational) iteration. It's still possible, but your organization needs to be equipped for incremental/agile work.

skohan · on Jan 28, 2022

Yeah I mean for sure it depends on the domain.

But it's interesting you mentioned compilers - I'm in the process of writing one, and I very much used an incremental approach.

The first pass I essentially wrote a parser and a component which walked the AST and produced output. At a certain point it was clear that local knowledge of the AST wasn't sufficient to capture non-local details about the program which were required to produce the correct output. I got away for a short time with dirty tricks, but eventually transitioned to a new design: I kept the lexer and parser, and implemented a data driven IR in the style of an ECS system to be built up before emitting output.

So I threw out the initial output component, but I learned a ton by starting with an end-to-end compiler, however incomplete. If I hadn't taken that step, and tried first to plan the perfect IR on paper, I am certain I would have reached an inferior result.

edit: and even the IR and compiler middleware is loosely coupled. The IR is essentially a set of flat data tables, each of which is built independently by walking the AST. And the compiler is implemented in a series of independent passes: i.e. one pass to build the IR, one pass to derive type information etc. so it's very much grown into a series of independent components, each of which could be independently rewritten without affecting the others very much.

marcos100 · on Jan 29, 2022

If you have infinite time to recover, there is no problem, but you could also design something perfect using that infinite time.

A bit of thinking about design and architecture can save you a lot of time. Start with the wrong data structures and maybe you'll have to patch a lot of thing or just redesign everything from scratch.

Be an architecture astronaut and you may never release whatever you're suppose to develop.

It's all about trade-offs.

rawoke083600 · on Jan 28, 2022

>upfront designing their programs instead of just jumping in. >This has taught me the lesson to always first try and think things through and come up with some kind of initial design, instead of just jumping in and writing code blindly.

100% One of my favorite techniques is "Super Pseudo Code" ! Why SUPER ?

Lol cause the "pseudo code" I write can barley be called "code at all" - It is usually just a text-file with a bunch of loosey-goosey-function-calls and parameters.

You know just to get a "feel" for how different entities(classes,struct,tables or libs - pick your poison) will interact and what might be needed. We not talking any UML-Diagrams here - really just text-files and functions/entities

This also works super-well for any multi-step-processes.

mpartel · on Jan 28, 2022

This is true for early high-level decisions. For every-day coding of smallish components in a large project, it matters much less.

The more code your design decision will affect, the more up-front planning it deserves.

skohan · on Jan 28, 2022

I would argue if “high level design decisions” are getting in the way of coding, this is a sign of over-design or premature abstraction. If you write sufficiently loosely-coupled code, it’s not hard to re-organize later.

mpartel · on Jan 28, 2022

Some examples of high-level decisions: - which web framework? - which database? which ORM? which transaction isolation level by default? - will this game/UI be multiplayer? - what's our testing discipline?

You can't loosely couple around questions like these most of the time, at last not without excessive abstraction.

For "how do I structure this reasonably isolated 0-1kLOC component", I agree, easy to fix later if needed.

skohan · on Jan 28, 2022

I think there are always ways to minimize coupling. For example if most of the code you write is pure functions operating on values, then swapping out your ORM might be a bit laborious, but it's going to be largely just a matter of typing, not tricky problem solving.

And like for example if you want to change web frameworks, that's something you can do incrementally. If you're talking about front-end, just find a way to encapsulate your old code behind a clean interface and start implementing new components in the new framework. If you're talking about back-end, then you can just implement new endpoints in a new language if you want even, and gradually migrate things over when you have to modify them.

agumonkey · on Jan 28, 2022

> As they say "measure twice, cut once".

and also the reuse of the term architecture.. don't put beams at random and see if it works, plan in advance and calculate. They probably had these realizations thousands of years ago, it's a generic economy principle.

I forgot what the saying is but solution unfold themselves when you thought about the problem long enough.

Oh and lastly, Grothendieck said he wasn't in the business of solving problems, but expressing them.

skohan · on Jan 28, 2022

If architects could freely reposition beams in a building, they might take advantage of that fast iteration rather than spending a lot of time calculating up-front.

agumonkey · on Jan 28, 2022

I think it's antithetical, their profession exist only to avoid moving complicated stuff when it's too late. Of course architect can design wrong and then it's costly too.