An important point not mentioned by the article is that of "co-recursion" with inheritance (of implementation).
That is: an instance of a subclass calls a method defined on a parent class, which in turn may call a method that's been overridden by the subclass (or even another sub-subclass in the hierarchy) and that one in turn may call another parent method, and so on. It can easily become a pinball of calls around the hierarchy.
Add to that the fact that "objects" have state, and each class in the hierarchy may add more state, and modify state declared on parents. A perfect combinatorial explosion of state and control-flow complexity.
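To make that concrete, here's a minimal C# sketch of the pinball (all names hypothetical):

class Base
{
    protected int _count;                    // state declared on the parent

    public void Process()                    // client calls the parent...
    {
        _count++;
        Step();                              // ...which dispatches down to the child...
    }

    protected virtual void Step() { }

    protected void Finish() => _count = 0;   // ...and can be called back up again
}

class Derived : Base
{
    private bool _done;                      // child adds its own state

    protected override void Step()
    {
        _done = true;
        Finish();                            // back up into Base, mutating _count
    }
}

One call to Process() already bounces Base -> Derived -> Base, touching state declared at two levels.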
I've seen this scenario way too many times in projects, and the worst thing is: many developers think it's fine... and are even proud of navigating such a mess. Heck, many popular "frameworks" encourage this.
Basically: every time you modify a class, you must review the inner implementation of all the other classes in the hierarchy, and the call paths between them, to ensure your change is safe. That's a horrendous way to write software, against the most basic principles of modularity and low coupling.
This is only the case when the language does not distinguish between methods that can be overridden and those that cannot. C++ gives you the keyword "virtual" to put in front of each member function you want to opt into this behavior, and in my experience people tend to give some thought to which functions should be virtual. So I rarely have this issue in C++. But in languages like Python, where everything is overridable, the issue you mention is very real.
Good point. In Java and many other languages you can opt out instead... which might make a big difference. Is it more of a "cultural" thing?... again, many frameworks encourage it by design, and so do many courses/tutorials... so those devs would be happy to put "virtual" everywhere in C++
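FWIW, C# sits on the C++ side here: methods are non-virtual by default, and overriding is opt-in on both ends. A quick sketch:

class Parent
{
    public void Fixed() { }           // non-virtual: cannot be overridden
    public virtual void Hook() { }    // opt-in: subclasses may override this one
}

class Child : Parent
{
    public override void Hook() { }   // fine
    // public override void Fixed() { }   // compile error: no virtual method to override
}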
While in Python everything is overridable, does this show up in practice outside of (testing) frameworks? I feel like this is way more common in Java. My experience in Python is limited to small microservice-style backends and data science apps.
The virtual keyword in C++ is more of a compiler optimization and less of a design decision. C++ doesn't want everyone paying the overhead of virtual function calls the way other languages do.
I think that's an over-simplification. There was pressure on the language to ensure that data structures were compatible with C structs, so avoiding the vtable with simple classes was a win for moving data between these languages.
Of course these days with LTO the whole performance space is somewhat blurred since de-virtualisation can happen across whole applications at link time, and so the presumed performance cost can disappear (even if it wasn't actually a performance issue in reality). It's tough to create hard and fast rules in this case.
I 100% agree. And even though I use C#, which is kind of OOP heavy, I use inheritance and encapsulation as little as possible. I try to use a more functional workflow, with data separated from functions/methods. I keep data in immutable Records and use methods/functions to transform it, trying to isolate side effects and minimize the state I keep around.
It's a much more pleasant and easier way to work, for me at least.
Trying to follow the flow through a gazillion objects with state changing everywhere is a nightmare, and I'd rather not return to that.
I agree that changing object state and having side effects should be avoided, but you can achieve both immutability and encapsulation very easily with C#:
public record Thing()
{
    // Encapsulated: the field is private, so outside code can't touch it.
    private string _state = "Initial";

    // Immutable: 'with' copies the record and assigns the field on the copy,
    // leaving the original instance untouched.
    public Thing Change() => this with { _state = "Changed" };
}
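Usage then looks like this; the original value never changes:

var a = new Thing();
var b = a.Change();   // b is a modified copy; a still holds "Initial"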
> It can easily become a pinball of calls around the hierarchy.
This is why hierarchies should have limited depth. I'd argue some amount of "co-recursion" is to be expected: after all, the point of the child class is to reuse the logic of the parent but override some of it.
But if the lineage goes too deep, it becomes hard to follow.
> every time you modify a class, you must review the inner implementation of all other classes in the hierarchy, and call paths to ensure your change is safe.
I'd say this is a fact of life for all pieces of code which are reused more than once. This is another reason why low coupling high cohesion is so important: if the parent method does one thing and does it well, when it needs to be changed, it probably needs to be changed for all child classes. If not, then the question arises why they're all using that same piece of code, and if this refactor shouldn't include breaking that apart into separate methods.
This problem also becomes less pressing if the test pyramid is followed properly, because that parent method should be tested in the integration tests too.
> I'd argue some amount of "co-recursion" is to be expected: after all the point of the child class is to reuse logic of the parent
That's the point: You can reuse code without paying that price of inheritance. You DON'T have to expect co-recursion or shared state just for "code-reuse".
And this, I think, is the key point: behavior inheritance is NOT a good technique for code-reuse... Type-inheritance, however, IS good for abstraction, for defining boundaries, and for enabling polymorphism.
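A sketch of that split in C# (hypothetical names): reuse by delegation, polymorphism by interface:

interface IReport                 // type-inheritance: abstraction and polymorphism
{
    string Render();
}

class Formatter                   // reusable behavior, no hierarchy required
{
    public string Wrap(string s) => $"[{s}]";
}

class SalesReport : IReport
{
    private readonly Formatter _fmt = new Formatter();   // reuse by delegation
    public string Render() => _fmt.Wrap("sales");        // no callbacks, no shared state
}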
> I'd say this is a fact of life for all pieces of code which are reused more than once
But you want to minimize that complexity. If you call a pure function, you know it only depends on its arguments... done. If you call a method on a mutable object, you have to read its implementation line by line and navigate a web of possibly polymorphic calls which may even modify shared state.
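To make the contrast concrete (a hypothetical example):

// Pure: the result depends only on the arguments. Done.
static class Pricing
{
    public static decimal ApplyDiscount(decimal price, decimal rate)
        => price * (1 - rate);
}

// Opaque: the result depends on hidden, mutable state, and GetRate()
// might be overridden anywhere down the hierarchy.
class DiscountEngine
{
    private decimal _rate = 0.1m;

    protected virtual decimal GetRate() => _rate;

    public decimal Apply(decimal price)
    {
        _rate += 0.01m;                      // side effect on shared state
        return price * (1 - GetRate());
    }
}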
> This is another reason why low coupling high cohesion is so important
Exactly. Although I would phrase it the other way around: it's precisely because "low coupling, high cohesion is so important" that using inheritance of implementation for code-reuse is often a bad idea.
I actually can't imagine for the life of me why I'm defending OOP implementation hierarchies here. I guess I got so used to them at work that I've changed my strategy from opposing them to "it's okay as long as you use them sparingly". I have found that argument does a lot better with my colleagues...
That's not true. If Outer has a member Inner, Outer always has to invoke `my_inner.foo()` to use Inner::foo, and `foo()` always refers to Outer::foo (and some languages will force you to write `self.foo()`, which is even better).
If Outer extends Inner, though, you can't tell whether `foo()` refers to Inner::foo or Outer::foo without checking to see whether Outer overrides foo or not. And the number of places you have to check scales linearly with the depth of the inheritance hierarchy.
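In C# terms, reusing the same Inner/Outer names:

class Inner
{
    public virtual void Foo() { }
    public void Bar() => Foo();           // under inheritance: which Foo? depends on the runtime type
}

class OuterComposed
{
    private readonly Inner _inner = new Inner();
    public void Use() => _inner.Foo();    // unambiguous: always Inner.Foo
}

class OuterDerived : Inner
{
    public override void Foo() { }        // now Inner.Bar() silently lands here
}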
If object A calls a method of object B (composition), then B cannot call back into A, and neither A nor B can override any behavior of the other. (And this is the original core tenet of OO: being all about "message-passing".)
Of course they can accept and pass other objects/functions as arguments, but that would be explicit and specific, without having to expose their whole state/implementation to each other.
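E.g. (a hypothetical sketch): the collaborator is handed in explicitly, so the callee sees only the function it was given, not the caller's state or overridable surface:

using System;

class Processor
{
    // Explicit and specific: Processor sees a Func, nothing more.
    public int Run(int input, Func<int, int> step) => step(input) * 2;
}

// usage: new Processor().Run(20, x => x + 1)   // -> 42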
> Add to that the fact that "objects" have state, and each class in the hierarchy may add more state, and modify state declared on parents. Perfect combinatory explosion of state and control-flow complexity.
What if you actually are dealing with state and control-flow complexity? I'm curious what the "ideal" way to handle this would be, in your view. I am trying to implement a navigation system, stripped of interface design and all the application logic, and even at this level it can get pretty complicated.
You are always dealing with state and control-flow in software design. The challenge is to minimize state as much as possible, make it immutable as much as possible, and simplify your control-flow as much as possible. OO-style inheritance of implementation (with mutable state dispersed all over the place and pinball-style control-flow) goes against those goals.
Closer to the "ideal": declarative approaches, pure functions, data-oriented pipelines, logic programming.
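For the navigation example above, that might look like a data-oriented pipeline over immutable records (a rough sketch, with an invented domain):

using System.Collections.Generic;
using System.Linq;

record Reading(double Lat, double Lon, double SpeedKmh);

static class Nav
{
    // Each stage is a pure transformation of immutable data.
    public static double AverageMovingSpeed(IEnumerable<Reading> readings) =>
        readings.Where(r => r.SpeedKmh > 1.0)
                .Select(r => r.SpeedKmh)
                .DefaultIfEmpty(0.0)
                .Average();
}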
I tried to contribute a bug fix to a Common Lisp project and found this exact issue. In CL you can trace methods, but if the call hierarchy is several dozen levels deep, with multiple type overrides and several :around, :before and :after combinations, it's just impossible to keep track of what does what. This is not a language issue, though; CLOS is really powerful and can be a life saver in good hands, but when people use it just to try the feature, it creates monstrosities.
If the author intended a function to be overridable and designed the class as such, none of this is a problem. I never need to look inside the parent class, let alone the entire hierarchy.
On the flip side, if the author didn't want to let me do that, I really appreciate having the ability to do it anyways, even if it means tighter coupling for that one part.
I think the fundamental issue with implementation-inheritance is that the class diagram looks nice but hides a ton of method-level complexity once you consider the distinction between the calling and subtyping interfaces -- complexity that is basically impossible to encapsulate and would be better expressed in terms of other design approaches.
With interface-inheritance, each method provides two interfaces with one single possible usage pattern: it is called by client code and implemented by a subclass.
With implementation-inheritance, suddenly, you have any of the following possibilities for how a given method is meant to be used:
(a) called by client code, implemented by subclass (as with interface-inheritance)
(b) called by client code, implemented by superclass (e.g.: template method)
(c) called by subclass, implemented by superclass (e.g.: utility methods)
(d) called by superclass, implemented by subclass (e.g.: template's helper methods)
And these cases inevitably bleed into each other. For example, default methods mix (a) and (b), and mixins frequently combine (c) and (b).
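Cases (b) and (d) together are the classic template method shape, e.g. (a hypothetical sketch):

abstract class Importer
{
    // (b): called by client code, implemented by the superclass.
    public void Import()
    {
        var raw = Load();
        Parse(raw);                              // (d): called by the superclass...
    }

    protected virtual string Load() => "";       // also (d), but with a default

    protected abstract void Parse(string raw);   // ...implemented by the subclass
}

class CsvImporter : Importer
{
    protected override void Parse(string raw) { /* split on commas, etc. */ }
}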
Because of the added complexity, you have to carefully design the relationship between the superclass, the subclass, and the client code, making sure to correctly identify which methods should have what visibility (if your language even allows for that level of granularity!). You must carefully document which methods are intended for overriding and which are intended for use by whom.
But the code structure itself in no way documents that complexity. (If we want to talk SOLID, it flies in the face of the Interface Segregation Principle). All these relationships get implicitly crammed into one class that might be better expressed explicitly. Split out the subclassing interface from the superclass and inject it so it can be delegated to -- that's basically what implementation-inheritance is syntactic sugar for anyway and now the complexity can be seen clearly laid out (and maybe mitigated with refactoring).
There is a trade-off in verbosity to be sure, especially at the call site where you might have to explicitly compose objects, but when considering the system complexity as a whole, I think it's rarely worth it when composition and a tiny factory function provide the same external benefit without the headache.
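Roughly, the refactor looks like this (hypothetical names): the former subclassing interface becomes an explicit, injected dependency, and a tiny factory hides the composition from callers:

interface IParser                    // the former subclassing interface, now explicit
{
    void Parse(string raw);
}

class Importer                       // the former superclass, now a plain collaborator
{
    private readonly IParser _parser;
    public Importer(IParser parser) => _parser = parser;
    public void Import() => _parser.Parse("raw data");
}

class CsvParser : IParser
{
    public void Parse(string raw) { /* ... */ }
}

static class Importers
{
    // The tiny factory: callers get the same convenience a subclass gave them.
    public static Importer Csv() => new Importer(new CsvParser());
}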
These are powerful tools, if used with discipline. But especially in application code interfaces change often and are rarely well-documented. It seems inevitable that if the tool is made available, it will eventually be used to get around some design problem that would have required a more in-depth refactor otherwise -- a refactor more costly in the short-term but resulting in more maintainable code.
Author here. I wrote: “But even a modestly more recent language like Java has visibility attributes that let a class control what its subtypes can view or change, meaning that any modification in a subclass can be designed before we even know that a subtype is needed.” That covers your situation: if you need to ensure that subtypes use the supertype's behaviour in limited ways, use the visibility modifiers and the `final` modifier to impose those limits.
The fact that Java had to add a whole extra set of keywords to control this indicates that this is a site of complexity. Since it isn't needed for composition, it's a site of unnecessary complexity.
What you lose by using composition is that the composing object is no longer a subtype of the constituent object, so you can't use it as a "decoration" of the original object in a program that expects an instance of the original.
It can be, if the composing object re-implements the constituent object's interface. This way, code reuse and polymorphism are orthogonal features, which I think is better. If you want both, you can do both, but inheritance pushes you toward using both even when you only need one.
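E.g., a minimal sketch (invented names): re-implementing the interface and delegating gives you decoration by composition:

using System;

interface ISource
{
    string Read();
}

class FileSource : ISource               // hypothetical concrete type
{
    public string Read() => "data";
}

class LoggingSource : ISource            // re-implements the interface, delegates the rest
{
    private readonly ISource _inner;
    public LoggingSource(ISource inner) => _inner = inner;

    public string Read()
    {
        var data = _inner.Read();
        Console.WriteLine($"read {data.Length} chars");   // the added decoration
        return data;
    }
}

// usage: ISource s = new LoggingSource(new FileSource());
// anything expecting an ISource accepts the decorated version, no subclassing needed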