Something that people may not see immediately is that flatMap is more general than map and filter. Say, for a contrived example, that you'd like to filter out the even numbers in an array, and then double the odd numbers that remain. Instead of:
[1, 2, 3, 4, 5].filter(n => n % 2 === 1).map(n => n * 2)
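The flatMap version the comment is building toward (my reconstruction of the intended example) would be:

```javascript
// single pass: return [] to drop an element, [n * 2] to keep it transformed
const doubledOdds = [1, 2, 3, 4, 5].flatMap(n => n % 2 === 1 ? [n * 2] : []);
// → [2, 6, 10]
```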
For those wondering at home, the reason you shouldn't do this is immediately spelled out in the Mozilla docs for flatMap:
> Note, however, that this is inefficient and should be avoided for large arrays: in each iteration, it creates a new temporary array that must be garbage-collected, and it copies elements from the current accumulator array into a new array instead of just adding the new elements to the existing array.
I assumed you were aware of the issues, which is why I said "those at home": people who might just take it at face value as a neat pattern. :)
I personally do actually use flatMap sometimes as outlined in your example - performance is a later concern, and I'll refactor if the concise code isn't worth the cost.
It still creates a temporary [x, 2*x] array for every element though. This is an unavoidable problem with flatMap, while reduce can easily be changed to reuse the same accumulator array, making it twice as fast as flatMap and almost as fast as the simple for-loop approach.
My preference for when I have to do multiple operations on the same array (like filter then map) is "reduce". It's the most straightforward way other developers will understand what's going on.
Not that I disagree with your preference, in fact I share it. But this:
> It's the most straightforward way other developers will understand what's going on.
… has been the opposite of my experience. Both on the job (where I’ve always conceded to team preference for imperative loops) and observing the community (hating on reduce is a whole meme on JavaScript/TypeScript Twitter, and the contrary meme has never shown up at least on my feed).
And I honestly understand why it’s not very popular. Reduce/fold is a very FP concept which isn’t particularly idiomatic in real world JS. When I learned and embraced it (myself coming from a JS background), it took dozens of real uses before I felt like I had committed to my own memory what’s actually happening. And by then I think I was writing Clojure.
> My preference for when I have to do multiple operations on the same array (like filter then map) is "reduce". It's the most straightforward way other developers will understand what's going on.
There are definitely operations that are more intuitive as reduces, but any sequences consisting of exclusively filter, map, and/or flatMap are, IME and IMO, about the worst candidates.
(That said, because the JS filter, map, etc. operations are eager rather than lazy, sequenced operations produce intermediate arrays that may be undesirable especially with large datasets, so reduce can be desirable even without being more clear in intent.)
In addition to being essentially a combined "filter" and "map", it's also a "better" filter than filter itself in TypeScript, in that it narrows types much more ergonomically[0].
In TypeScript, you might have an array of multiple types (e.g. `Array<A | B>`), and use a `filter` call to only keep the `A`s. However, in many situations TypeScript can't figure this out and the resulting array type is still `Array<A | B>`. However, when you just use `flatMap` to do nothing more than filtering in the same way, TypeScript can determine that the resulting type is just `Array<A>`. It's a bit unfortunate really - `filter` is faster and more readable, but the ergonomics of `flatMap` type-wise are so much nicer! Just some interesting trivia.
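A sketch of that flatMap-as-filter pattern (plain JS here; in TypeScript the same callback lets the compiler infer the narrowed element type, e.g. `number[]`):

```javascript
// flatMap used purely as a filter: return [] to drop, [x] to keep
const mixed = [1, "a", 2, "b", 3];
const onlyNumbers = mixed.flatMap(x => typeof x === "number" ? [x] : []);
// → [1, 2, 3]
```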
I wonder if it is possible to add a feature to TypeScript to help with this:
You could potentially add a syntax for type-guard function types, then add a signature to filter that accepts a type guard and returns an array of the guarded type.
Shouldn't be too much of a stretch given that we have type guards.
The syntax is a bit annoying... should be something like filter<A, B>(cb: A => A is B)
The interesting generalization, once you realize that flatMap lets you map and filter at the same time, is that you can generate arbitrary elements in the output list for each item in the input list. For example,
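A small sketch of what that generality looks like (values chosen for illustration):

```javascript
// each input element can produce zero, one, or many output elements
const expanded = [1, 2, 3, 4].flatMap(n => n % 2 === 0 ? [] : [n, n * 10]);
// → [1, 10, 3, 30]
```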
> What do empty arrays have to do with doubling even integers?
nothing, but they do have some relationship to 0, "" and Promise.resolve() - the array is handling the logic that will make the results be combined, not the doubling part
Filter and then map will iterate the list twice. JS really needs some iterator-based methods like Lodash where it will only go through the list once in this case.
You've discovered transducers! (Though I think the higher-order-function presentation they're given in the language that popularized them, Clojure, is rather horrible and confusing for newcomers, when they could just be lists.) All a transducer is, on this view, is a function `x -> List(x)`, and then you can use other functions such as `flatMap` to apply them (as your example illustrates nicely, this is why map, filter, and their combination can all be described as single transducers).
You do have to make sure that your implementation of list is extremely efficient on zero- and one-element lists (ideally it generates no garbage at all in those cases); otherwise, as other commenters have pointed out, you'll have a lot of GC pressure.
And even though the transducer itself is `x -> List(x)` note that the `List` is only produced as an intermediate step and doesn't need to exist in the final product. You could apply a `x -> List(x)` to a generator for example and just "absorb" the list back into the resulting generator.
I'm not sure it's any more general considering that you have to return an array and also treat an empty array as a 'null' value.
Or to put it another way, if I reviewed code where someone used flatMap for anything other than lists of lists I'd be likely to suggest filter/map or reduce or some other convenient equivalent depending on the purpose of the code.
Something like Ruby's filter_map[0] would do the job, although not with this particular example (because 0 is truthy in Ruby).
/** Return this symbol to skip the current value. */
const SKIP = Symbol("mapFilter.SKIP");

/**
 * @template T, R
 * @param {(SKIP: symbol, currentValue: T, index: number, array: T[]) => R | typeof SKIP} callback return `SKIP` to filter out an element
 * @param {T[]} array
 * @param {number} [begin] defaults to 0
 * @param {number} [end] defaults to `array.length`
 * @returns {R[]}
 */
function mapFilter(array, callback, begin = 0, end = array.length) {
  const ret = [];
  for (let i = begin; i < end; i++) {
    const v = callback(SKIP, array[i], i, array);
    if (v !== SKIP) ret.push(/** @type {R} */ (v));
  }
  return ret;
}
Here. Less than ten lines without JSDoc type annotations, twenty with them. It lets you slice, map and filter all in one call without allocating intermediate arrays like you would when chaining them, making it almost as fast as a plain for-loop. It's also easy to turn it into an in-place version, removing even the array allocation overhead.
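A usage sketch (the helper restated compactly so the snippet is self-contained):

```javascript
// minimal restatement of the mapFilter helper from above
const SKIP = Symbol("mapFilter.SKIP");
function mapFilter(array, callback, begin = 0, end = array.length) {
  const ret = [];
  for (let i = begin; i < end; i++) {
    const v = callback(SKIP, array[i], i, array);
    if (v !== SKIP) ret.push(v);
  }
  return ret;
}

// drop evens and double odds in a single pass, no intermediate arrays
const mapFiltered = mapFilter([1, 2, 3, 4, 5], (SKIP, n) => (n % 2 ? n * 2 : SKIP));
// → [2, 6, 10]
```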
The spread operator looks cool and makes just returning the ternary operator work here, but its performance implications are non-obvious (it's makin' copies). With reduce() you're really wanting something like this:
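The snippet this comment refers to isn't preserved in the thread; a likely sketch of the push-based reduce it describes:

```javascript
// one accumulator array, mutated in place with push, no per-step copies
const reduced = [1, 2, 3, 4, 5].reduce((acc, n) => {
  if (n % 2) acc.push(n * 2);
  return acc;
}, []);
// → [2, 6, 10]
```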
In an expression position, the push statement will be executed, then its return will be discarded, and the final expression will be the result of the expression. Is it “better”? Almost certainly not. But it lets you stay in expression syntax while executing statements. (Much more useful for logging than meaningful runtime side effects IMO, but I think it should be more widely known in general.)
Edit: and I’m glad to see another reference to it down thread!
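A sketch of the comma-operator pattern described above:

```javascript
// comma operator: acc.push(...) runs, its return value is discarded,
// and the parenthesized expression evaluates to acc
const viaComma = [1, 2, 3, 4, 5].reduce(
  (acc, n) => (n % 2 ? (acc.push(n * 2), acc) : acc),
  []
);
// → [2, 6, 10]
```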
Wouldn't recommend doing this - if the original array is of significant length this'll get quite slow because `acc.concat` has to create a brand new array of slightly longer length on each iteration it's called. Better to just use `push` like you suggested before and then return the array if you want to use `reduce`.
function doubleOdds(input) {
  let output = [];
  for (let i = 0; i < input.length; ++i) {
    let n = input[i];
    if (n % 2) output.push(2 * n);
  }
  return output;
}
doubleOdds([1, 2, 3, 4, 5]); // [2, 6, 10]
But in some circumstances the other style can be more convenient / legible. The immediate question was about pushing to an array and then returning the array, for which the comma operator can be handy.
Minority opinion: please `let` your mutable references. I know `const` doesn’t signal immutability, but we as humans with eyeballs and limited attention span certainly benefit from knowing when a value might change at runtime.
Disagree: virtually everything in JS is mutable, so this almost means "never use the `const` keyword". Pretending that the `const` keyword means something that it doesn't makes things harder for my limited human mind to understand, not easier. Plus using `let` inappropriately makes my linter yell at me, and I usually like to just do whatever my linter tells me.
Anyway, I use TypeScript, so if I really want to assert that my array is immutable (as immutable as stuff in JS-land gets anyway) I just write something like `const xs: readonly number[] = [1, 2, 3]`.
I realize I could be clearer in what I’m asking for: please use const when you use reference types as values, and use let when you intend to mutate the reference. Using const and then changing a value is certainly allowed but it’s confusing and it’s missing an opportunity to signal in the code where changes might happen.
I `readonly` and `as const` everything I possibly can. I do know that const doesn’t mean immutable, as I said, but I think it should and I think there’s value in establishing the idiom even if it’s not currently adopted. Because otherwise const basically means nothing unless you’re mutating everything already.
Hilariously enough, I think reduce would be much more palatable to JS devs if it was named fold. Not because the familiarity would jump out but because it’s a much more evocative name that could make visualizing its behavior more intuitive.
I've made it a habit to check Mozilla's JS docs once in a while for functions like flat() and flatMap(). Sometimes if I find myself reaching for underscore/lodash, I'll check the Mozilla docs first to see if there's some new function that lets me omit lodash.
I'm often delighted to find new convenience functions I can just use without adding another dependency.
Not that you have to, but just FYI you can disable "smart" punctuation in iOS Settings: General -> Keyboard -> Smart Punctuation. I don't know why they omitted the quotes around smart there.
There have been a lot of those added to the ES spec lately, and a lot more coming soon. So it's always good to check MDN, because I usually end up learning something new.
If you're chaining, you're still MUCH better off with Lodash. Native functions iterate once per function. Lodash uses an iterator and only loops through the list once.
I've always found `flatMap` to be interesting because it feels like a convenience function, combining .map(fn).flat() into one call. It's interesting because JS doesn't really have a lot of convenience functions that are this shallow (i.e. that provide just minimal cleanup compared to the functions they're wrapping).
Is there some specific reason that `flatMap` made the cut?
Probably because flatMap (also called chain) is an important operation on arrays (and monads in general) in functional programming. If anything, .flat() is the aberration.
In mathematics it would be 'flat' that is the important one, not flatMap. You get 'flatMap' by just composing the monad multiplication 'flat' with the monad functor's mapping of arrows 'map'. That's mainly just to highlight the similarity with regular monoids though.
But in a program it usually makes sense to do the flattening in one go, since it avoids the need to do some possibly expensive intermediate calculations.
For a lot of common cases (where the returned lists from the fn are short, but the complete list is long), a combined flatMap can be optimized to be much faster than doing the functions one by one.
Small mistake, and maybe that's why this function exists, but it actually combines `.map(fn).flat(1)`.
I think conceptually, `flatMap` is a `map` where each iteration can return multiple values (or none). So it feels quite more powerful than just `map` and is a quite common convenience function.
I think the depth argument of flat defaults to 1, so it'd be the same anyway. Not sure why the default is 1, I'd prefer Infinity so it'd always flatten everything by default, but maybe there are some good reasons for it.
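A quick illustration of the depth argument (values chosen for illustration):

```javascript
const nested = [1, [2, [3, [4]]]];
const depth1 = nested.flat();        // default depth is 1 → [1, 2, [3, [4]]]
const depth2 = nested.flat(2);       // → [1, 2, 3, [4]]
const deep = nested.flat(Infinity);  // → [1, 2, 3, 4]
```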
But in the real world, if we must have a default, I think it should be 1, not infinity.
Infinity has a frustrating fail state, in my opinion: you might be flattening arrays that weren't meant to be flattened. And now you've got a cluster!@#$ of data to look at and reason about.
The fail state of 1 is that it doesn't flatten enough but you have the original structure in your 1-level flattened output. So you're far more likely to make heads and tails of how much more flattening you need.
Arguably the same argument works the other way around as well, that in thinking flat recursively flattens all arrays of arrays you get confused when it doesn't. (I've definitely run across that footgun.)
For me, I just find that in almost all cases I've used flat it's been with Infinity as the depth value. It seems to me as well that most of the time you'd probably either want 1 or Infinity, not 5 or 12 or whatever, so perhaps it would've even been better to just have two functions: flat and flatDeep (or some such, naming is hard) the latter of which would default to Infinity but allow different depths as well.
Probably reads better than just flat as well, e.g. `.flatDeep(5)` rather than `.flat(5)`. Oh well, we're stuck with this now so it's all an academic exercise anyway, but I'd be curious to know the rationale of the design. Maybe I'll wade through the spec repo one day to see if there's any discussion about it.
I have very little experience with arrays in JS, but I always have a nagging feeling that, if my algorithm needs me to flatten an array, then there is something wrong with the data structures I am using.
> What is a realistic scenario where an array needs to be flattened?
Concatenating the result of a paginated API.
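A minimal sketch, assuming pages collected from successive API calls (the item shape is illustrative):

```javascript
// each inner array is one page of results
const pages = [[{ id: 1 }, { id: 2 }], [{ id: 3 }]];
const allItems = pages.flat();
// → [{ id: 1 }, { id: 2 }, { id: 3 }]
```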
Showing all the objects two or more 1:N steps away from you in the object graph. The events your friends are attending, the issues your coworkers are working on, the people belonging to any of your same groups. Basically any time you would do a SELECT... JOIN in SQL.
If you’re doing that aren’t you kinda defeating the point of pagination? Paginated APIs to me are a contract that says the size of the results can be arbitrarily large and there be dragons if you try to fit them in memory.
Given how many wrappers around paginated APIs exist to unpaginate them, I must be wrong, but it still bugs me.
It occurs commonly any time you are dealing with nested data.
For example: at work we deal with "roles" (professions), and underneath those there can be different specialisms. In some of our views we have filters that users can use to show a subset of the data. There is a filter for roles, and a filter for specialisms. But the specialisms filter should only show options that are relevant given the roles selected in the roles filter. So the code for generating the options to display in the specialisms filter dropdown is something like:
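The actual snippet isn't preserved; a hedged sketch with assumed field names (`selectedRoles` and its `specialisms` property are illustrative):

```javascript
const selectedRoles = [
  { name: "nurse", specialisms: ["icu", "pediatrics"] },
  { name: "surgeon", specialisms: ["cardiac"] },
];
// gather the specialisms of every selected role, deduplicated for the dropdown
const specialismOptions = [...new Set(selectedRoles.flatMap(role => role.specialisms))];
// → ["icu", "pediatrics", "cardiac"]
```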
When you map with a function that returns an array, like `x => [x * 2]`, but you do not want an array of arrays. Doesn't make much sense with such a trivial lambda, of course.
And of course you could call `flatMap` bind, if you prefer ;)
Added this to the post as well: a case where there's an array inside an array of objects and you want all of those. For example, if we want to extract all roles from the array:
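A sketch with assumed object shapes (only the `roles` field matters here):

```javascript
const users = [
  { name: "alice", roles: ["system-admin", "developer"] },
  { name: "bob", roles: ["basic"] },
];
// one flatMap pulls every user's roles into a single flat array
const allRoles = users.flatMap(user => user.roles);
// → ["system-admin", "developer", "basic"]
```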
You are right - I was trying to format it for HN but somehow screwed up the copy-paste; it'd be ["system-admin", "developer", "basic"] and not ['basic', 'system-admin', 'developer'].
I'm glad JS is adding these methods to Array. It would be nice if iterables had similar methods for working with lazy sequences, similar to Rust, but this isn't practical since "iterable" is a protocol. Array can do it because it's a class. Perhaps JS could also adopt something similar to Rust's traits, such that implementing the protocol would make a set of related methods callable on that object.
This stirs up some memories of Perl, where flattening arrays/lists is the default, and you need explicit references to have nested arrays or hash tables. I remember it used to be a kind of foot-seeking gun for newbies, but I never ran into trouble with it once I understood references (which isn't super hard).