Flatten Arrays in Vanilla JavaScript with Flat() and FlatMap() (wisdomgeek.com)
63 points by saranshk on Jan 5, 2022 | 89 comments



Something that people may not see immediately is that flatMap is more general than map and filter. Say, for a contrived example, that you'd like to filter out the even numbers in an array, and then double the odd numbers that remain. Instead of:

  [1, 2, 3, 4, 5].filter(n => n % 2 === 1).map(n => n * 2)
You can do:

  [1, 2, 3, 4, 5].flatMap(n => n % 2 === 1 ? [n * 2] : [])
Again, this is a contrived example, but I think it's interesting since the generality is not obvious (to me).


For those wondering at home, the reason you shouldn't do this is immediately spelled out in the Mozilla docs for flatMap:

> Note, however, that this is inefficient and should be avoided for large arrays: in each iteration, it creates a new temporary array that must be garbage-collected, and it copies elements from the current accumulator array into a new array instead of just adding the new elements to the existing array.


Sure. I'm not suggesting it be used to this effect; I'm noting the generality as an interesting point.


I assumed you were aware of the issues, which is why I addressed those at home who might just take it at face value as a neat pattern. :)

I personally do actually use flatMap sometimes as outlined in your example - perf is a later concern, and I'll refactor if the concise code isn't worth the cost.


I just filed this issue on the MDN page: https://github.com/mdn/content/issues/11763

That note is misleading.


It still creates a temporary [x, 2*x] array for every element though. This is an unavoidable problem with flatMap, while reduce can easily be changed to reuse the same accumulator array, making it twice as fast as flatMap and almost as fast as the simple for-loop approach.
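
For illustration, a minimal sketch of that accumulator-reusing reduce, on the doubled-odds example from upthread:

  [1, 2, 3, 4, 5].reduce((acc, n) => {
    if (n % 2 === 1) acc.push(n * 2); // reuse one accumulator, no temporaries
    return acc;
  }, [])
  // => [2, 6, 10]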


My preference for when I have to do multiple operations on the same array (like filter then map) is "reduce". It's the most straightforward way other developers will understand what's going on.


Not that I disagree with your preference, in fact I share it. But this:

> It's the most straightforward way other developers will understand what's going on.

… has been the opposite of my experience. Both on the job (where I’ve always conceded to team preference for imperative loops) and observing the community (hating on reduce is a whole meme on JavaScript/TypeScript Twitter, and the contrary meme has never shown up at least on my feed).

And I honestly understand why it’s not very popular. Reduce/fold is a very FP concept which isn’t particularly idiomatic in real world JS. When I learned and embraced it (myself coming from a JS background), it took dozens of real uses before I felt like I had committed to my own memory what’s actually happening. And by then I think I was writing Clojure.


> My preference for when I have to do multiple operations on the same array (like filter then map) is "reduce". It's the most straightforward way other developers will understand what's going on.

There are definitely operations that are more intuitive as reduces, but any sequences consisting of exclusively filter, map, and/or flatMap are, IME and IMO, about the worst candidates.

(That said, because the JS filter, map, etc. operations are eager rather than lazy, sequenced operations produce intermediate arrays that may be undesirable especially with large datasets, so reduce can be desirable even without being more clear in intent.)


In addition to being essentially a combined "filter" and "map", it's also a "better" filter than filter itself in TypeScript, in that it narrows types much more ergonomically[0].

In TypeScript, you might have an array of multiple types (e.g. `Array<A | B>`), and use a `filter` call to only keep the `A`s. However, in many situations TypeScript can't figure this out and the resulting array type is still `Array<A | B>`. However, when you just use `flatMap` to do nothing more than filtering in the same way, TypeScript can determine that the resulting type is just `Array<A>`. It's a bit unfortunate really - `filter` is faster and more readable, but the ergonomics of `flatMap` type-wise are so much nicer! Just some interesting trivia.

[0]: https://github.com/microsoft/TypeScript/issues/16069#issueco...
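
A small sketch of the difference, with hypothetical placeholder types A and B:

  type A = { kind: 'a'; value: number };
  type B = { kind: 'b'; label: string };
  declare const items: Array<A | B>;

  const viaFilter = items.filter(x => x.kind === 'a');
  // viaFilter: (A | B)[] - not narrowed without an explicit type guard

  const viaFlatMap = items.flatMap(x => x.kind === 'a' ? [x] : []);
  // viaFlatMap: A[] - narrowed, since each returned array is A[] or empty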


I wonder if it is possible to add a feature to Typescript to help with this:

You could potentially add a syntax for type-guard function types, then add a signature to filter that accepts a type guard and returns an array of the guarded types.

Shouldn't be too much of a stretch given that we have type guards.

The syntax is a bit annoying... should be something like filter<A, B>(cb: A => A is B)

:/


You can use a type guard[1] as an argument to Array.filter, but the function has to be explicitly typed as such.

I don't know why the type isn't narrowed in Array.filter like it is in if statements without this weird workaround.

  const array: (number | string)[] = [];
  
  const mixedArray = array.filter(value => typeof value === 'string');
  // mixedArray: (number | string)[]
  const arrayOfString = array.filter((value): value is string => typeof value === 'string');
  // arrayOfString: string[]
This example in Typescript playground: https://www.typescriptlang.org/play?#code/MYewdgzgLgBAhgJwXA...

[1]: https://www.typescriptlang.org/docs/handbook/advanced-types....


Oh so there is an overload!

filter<U extends T>(pred: (a: T) => a is U): U[];

Additionally, getting TS better at inferring type guards is an open issue (literally): https://github.com/microsoft/TypeScript/issues/38390


You could do that, but I’d argue using filter then map is more readable. What do empty arrays have to do with doubling odd integers?


The interesting generalization, once you realize that flatMap lets you map and filter at the same time, is that you can generate arbitrary elements in the output list corresponding to each item in the input list. For example,

    ls.flatMap(x => {
      if (x < 0) {
         return [] 
      } else if (x == 0) {
         return [0]
      } else {
         return [Math.sqrt(x), -Math.sqrt(x)]
      }
    })
gives you all the real square roots from the original list, doing the mapping, flattening, and filtering all in one function call.


(just for laughs)

  ls.filter(o=>0<=o).map(o=>o||[Math.sqrt(o), -Math.sqrt(o)])


I agree on the readability. There's a TC39 proposal floating around for a pipeline operator. I don't think it's moved forward, but that would be a game changer.


>What do empty arrays have to do with doubling odd integers

Nothing, but they do have some relationship to 0, "" and Promise.resolve() - the array is handling the logic that will make the results be combined, not the doubling part.


Filter and then map will iterate the list twice. JS really needs some iterator-based methods like Lodash where it will only go through the list once in this case.
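
Until something like that lands, a generator gives you the single pass (a rough sketch; filterMap is a made-up helper, not a standard method):

  function* filterMap(iterable, keep, transform) {
    for (const value of iterable) {
      if (keep(value)) yield transform(value); // one pass, no intermediate array
    }
  }

  [...filterMap([1, 2, 3, 4, 5], n => n % 2 === 1, n => n * 2)]
  // => [2, 6, 10]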


You've discovered transducers (which I think have a rather horrible, newcomer-confusing higher-order-function presentation in the language that popularized them, i.e. Clojure, when they could just be lists)! All a transducer is, is a function `x -> List(x)`, and then you can use other functions such as `flatMap` to apply them (as your example nicely illustrates, this is why map, filter, and their combination can all be described as single transducers).

You do have to make sure that your implementation of list is extremely efficient on zero- and one-element lists (ideally it generates no garbage at all in those cases); otherwise, as other commenters have pointed out, you'll have a lot of GC pressure.

And even though the transducer itself is `x -> List(x)` note that the `List` is only produced as an intermediate step and doesn't need to exist in the final product. You could apply a `x -> List(x)` to a generator for example and just "absorb" the list back into the resulting generator.
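
A rough sketch of that last point, with a made-up applyStep helper - the intermediate arrays are consumed immediately and never build up:

  function* applyStep(iterable, step /* x -> List(x) */) {
    for (const x of iterable) {
      yield* step(x); // the temporary list is absorbed right away
    }
  }

  [...applyStep([4, -1, 0, 9], x => x < 0 ? [] : x === 0 ? [0] : [Math.sqrt(x), -Math.sqrt(x)])]
  // => [2, -2, 0, 3, -3]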


I'm not sure it's any more general considering that you have to return an array and also treat an empty array as a 'null' value.

Or to put it another way, if I reviewed code where someone used flatMap for anything other than lists of lists I'd be likely to suggest filter/map or reduce or some other convenient equivalent depending on the purpose of the code.

Something like Ruby's filter_map[0] would do the job, although not with this particular example (because 0 is truthy in Ruby).

[0] https://ruby-doc.org/core-3.1.0/Enumerable.html#method-i-fil...


    /** Return this symbol to skip the current value. */
    const SKIP = Symbol("mapFilter.SKIP");
    
    /**
     * @template T, R
     * @param {T[]} array
     * @param {(SKIP: Symbol, currentValue: T, index: number, array: T[]) => R|SKIP} callback return `SKIP` to filter out an element
     * @param {number} [begin] defaults to 0
     * @param {number} [end] defaults to `array.length`
     * @returns {R[]}
     */
    function mapFilter(array, callback, begin = 0, end = array.length) {
      const ret = [];
      for (let i = begin; i < end; i++) {
        const v = callback(SKIP, array[i], i, array);
        if (v !== SKIP) ret.push( /** @type {R} */ (v));
      }
      return ret;
    };
Here. Less than ten lines without JSDoc type annotations, twenty with them. It lets you slice, map and filter all in one call without allocating intermediate arrays like you would when chaining them, making it almost as fast as a plain for-loop. It's also easy to turn it into an in-place version, removing even the array allocation overhead.
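
A hypothetical call, doubling the odd numbers among the elements at indices 1 through 3:

  mapFilter([1, 2, 3, 4, 5], (SKIP, n) => n % 2 === 1 ? n * 2 : SKIP, 1, 4)
  // => [6]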


It is also much less readable.


If you are looking for really general and powerful, then there is the mighty reduce:

    [1, 2, 3, 4, 5].reduce((x, y) => y % 2 === 1 ? [...x, y * 2] : x, [])


The spread operator looks cool and makes just returning the ternary operator work here but its performance implications are non-obvious (it's makin' copies). With reduce() you're really wanting something like this:

    [1, 2, 3, 4, 5].reduce((x, y) => { if (y % 2 === 1) x.push(y * 2); return x; }, [])
I've many times wished that push() would just return the array, it would make reduce() far easier for this sort of use case.
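
You can get most of the way there with a tiny helper built on the comma operator (a hypothetical pushed, nothing standard):

  const pushed = (arr, x) => (arr.push(x), arr);

  [1, 2, 3, 4, 5].reduce((acc, n) => n % 2 === 1 ? pushed(acc, n * 2) : acc, [])
  // => [2, 6, 10]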


I guess...

  x.concat([y*2])
would return the array (but makes a duplicate)

Anyway, I find this to be a whole lot more sensible:

  x=[];
  for(y of [1,2,3,4,5]){
    if(y%2===1)x.push(y*2)
  }
Or even!

  y=[1,2,3,4,5];
  x=[];
  // map reduce/flatmap/map/filter etc omg wtf
  for( i=0; i < y.length; i++ ){
    if( y[i]%2 === 1 ){ x.push( y[i] * 2 ); }
  }
I can't even tell what language this is, but there is nothing here that needs fixing.


Or… use reduce.


Yes, but you could also use `fold`^H^H^H^H`reduce`.

[1, 2, 3, 4, 5].reduce((acc, n) => n % 2 === 1 ? acc.push(2*n) : acc, [])


Push returns the length of the array, though, so that won't work.


Comma operator to the rescue:

    (acc.push(2*n), acc)
In an expression position, the push call will be executed, its return value will be discarded, and the final expression will be the result of the whole thing. Is it “better”? Almost certainly not. But it lets you stay in expression syntax while executing statements. (Much more useful for logging than meaningful runtime side effects IMO, but I think it should be more widely known in general.)

Edit: and I’m glad to see another reference to it down thread!


Ah, sorry, so you have to concat the arrays using `concat`.

    [1, 2, 3, 4, 5].reduce((acc, n) => n % 2 === 1 ? acc.concat([2*n]) : acc, [])


Wouldn't recommend doing this - if the original array is of significant length this'll get quite slow because `acc.concat` has to create a brand new array of slightly longer length on each iteration it's called. Better to just use `push` like you suggested before and then return the array if you want to use `reduce`.


Yes, of course, that's why I used `push` at first.


Use the comma operator: (acc.push(2*n), acc) will return acc. Or e.g.

  [1, 2, 3, 4, 5].reduce((acc, n) => (n % 2 ? acc.push(2*n) : null, acc), [])


If you're just iterating through the array and mutating an object on each iteration, just use a for loop.


Obviously you can alternately write:

  let input = [1, 2, 3, 4, 5], output = [];
  for (let i = 0; i < input.length; ++i) {
    let n = input[i];
    if (n % 2) output.push(2*n);
  }
  return output;
But in some circumstances the other style can be more convenient / legible. The immediate question was about pushing to an array and then returning the array, for which the comma operator can be handy.


No argument that the comma operator is a neat trick when you need it.

FWIW, it's 2022:

  const output = [];
  for (const n of [1, 2, 3, 4, 5]) {
    if (n % 2) output.push(2 * n);
  }


Minority opinion: please `let` your mutable references. I know `const` doesn’t signal immutability, but we as humans with eyeballs and limited attention span certainly benefit from knowing when a value might change at runtime.


Disagree: virtually everything in JS is mutable, so this almost means "never use the `const` keyword". Pretending that the `const` keyword means something it doesn't makes things harder for my limited human mind to understand, not easier. Plus, using `let` inappropriately makes my linter yell at me, and I usually like to just do whatever my linter tells me.

Anyway, I use TypeScript, so if I really want to assert that my array is immutable (as immutable as stuff in JS-land gets anyway) I just write:

  const input: readonly number[] = [1, 2, 3, 4, 5];
or even

  const input = [1, 2, 3, 4, 5] as const;


I realize I could be clearer in what I’m asking for: please use const when you use reference types as values, and use let when you intend to mutate the reference. Using const and then changing a value is certainly allowed but it’s confusing and it’s missing an opportunity to signal in the code where changes might happen.


I `readonly` and `as const` everything I possibly can. I do know that const doesn’t mean immutable, as I said, but I think it should and I think there’s value in establishing the idiom even if it’s not currently adopted. Because otherwise const basically means nothing unless you’re mutating everything already.


Surely the spread operator is nicer here?

    [1, 2, 3, 4, 5].reduce((acc, n) => n % 2 === 1 ? [ ...acc, 2 * n ] : acc, [])


This is not efficient. Each iteration creates a new array instance due to the spread operator.


`acc.concat()` also creates a new array instance, so I don't get your point.


Hilariously enough, I think reduce would be much more palatable to JS devs if it was named fold. Not because the familiarity would jump out but because it’s a much more evocative name that could make visualizing its behavior more intuitive.


What is `f`reduce`?


I've made it a habit to check Mozilla's JS docs once in a while for functions like flat() and flatMap(). Sometimes if I find myself reaching for underscore/lodash, I'll check Mozilla docs first to see if there's some new function that can let me avoid using lodash.

I'm often delighted to find new convenience functions I can just use without adding another dependency.


I often use Lodash’s sumBy, groupBy and orderBy. Love the versatility and one-liner aspect.

E.g.

    sumBy(“orders”, “total.amount”)


Which input method are you using that double quotes are ” (U+201C) instead of " (U+0022)?


Wow, no clue. That was on mobile... web app or Octal (native app). Stock iOS English (US) keyboard.


Not that you have to, but just FYI you can disable "smart" punctuation in iOS Settings: General -> Keyboard -> Smart Punctuation. I don't know why they omitted the quotes around smart there.


There have been a lot of those added to the ES spec lately, and a lot more are coming soon. So it's always good to check MDN; I usually end up learning something new.


Lodash includes guards for all its functions, which can help avoid "Cannot read property 'map' of undefined" and similar errors.

So I still prefer using lodash map, reduce, filter, etc for that added defensiveness.


If you're chaining, you're still MUCH better off with Lodash. Native functions iterate once per function. Lodash uses an iterator and only loops through the list once.
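
For example, assuming lodash is loaded (and noting that its lazy, single-pass evaluation applies to wrapped chains, and as implemented only kicks in past a certain input size):

  _([1, 2, 3, 4, 5])
    .filter(n => n % 2 === 1)
    .map(n => n * 2)
    .value()
  // => [2, 6, 10]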


Is there a Google equivalent to MDN?


No.


I've always found `flatMap` to be interesting because it feels like a convenience function, combining .map(fn).flat() into one function. It's interesting because JS doesn't really have a lot of convenience functions that are this shallow (i.e. that provide just minimal cleanup compared to the functions they're wrapping).

Is there some specific reason that `flatMap` made the cut?


Probably because flatMap (also called chain) is an important function on arrays (and monads in general) in functional programming. If anything, .flat() is the aberration.


> If anything, .flat() is the aberration.

Same as `flatmap` (`bind`), `flat` is 'part' of a monad (as `join`).


good point, I didn't think of that equivalence.


In mathematics it would be 'flat' that is the important one, not flatMap. You get 'flatMap' by just composing the monad multiplication 'flat' with the monad functor's mapping of arrows 'map'. That's mainly just to highlight the similarity with regular monoids though.

But in a program it usually makes sense to do the flattening in one go, since it avoids the need to do some possibly expensive intermediate calculations.


For a lot of common cases (where the returned lists from the fn are short, but the complete list is long), a combined flatMap can be optimized to be much faster than doing the functions one by one.


Small mistake, and maybe that's why this function exists, but it actually combines `.map(fn).flat(1)`.

I think conceptually, `flatMap` is a `map` where each iteration can return multiple values (or none). So it feels quite more powerful than just `map` and is a quite common convenience function.


I think the depth argument of flat defaults to 1, so it'd be the same anyway. Not sure why the default is 1, I'd prefer Infinity so it'd always flatten everything by default, but maybe there are some good reasons for it.
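
For reference, the default depth is indeed 1:

  [1, [2, [3, [4]]]].flat();         // => [1, 2, [3, [4]]]
  [1, [2, [3, [4]]]].flat(Infinity); // => [1, 2, 3, 4]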


I think it should be explicit without a default.

But in the real world, if we must have a default, I think it should be 1, not infinity.

Infinity has a frustrating fail state, in my opinion: you might be flattening arrays that weren't meant to be flattened. And now you've got a cluster!@#$ of data to look at and reason about.

The fail state of 1 is that it doesn't flatten enough but you have the original structure in your 1-level flattened output. So you're far more likely to make heads and tails of how much more flattening you need.


Arguably the same argument works the other way around as well: thinking flat recursively flattens all arrays of arrays, you get confused when it doesn't. (I've definitely run across that footgun.)

For me, I just find that in almost all cases I've used flat it's been with Infinity as the depth value. It seems to me as well that most of the time you'd probably either want 1 or Infinity, not 5 or 12 or whatever, so perhaps it would've even been better to just have two functions: flat and flatDeep (or some such, naming is hard) the latter of which would default to Infinity but allow different depths as well.

Probably reads better than just flat as well, e.g. `.flatDeep(5)` rather than `.flat(5)`. Oh well, we're stuck with this now so it's all an academic exercise anyway, but I'd be curious to know the rationale of the design. Maybe I'll wade through the spec repo one day to see if there's any discussion about it.


In what circumstances is this common?

I've been a developer for 15 years and I have rarely found a need for flatMap.


perhaps it’s a case of “names are fun”


I have very little experience with arrays in JS, but I always have a nagging feeling that, if my algorithm needs me to flatten an array, then there is something wrong with the data structures I am using.

Example in the article is meaningless to me:

    array.map(x => [x * 2]);
    // [[2], [4], [6], [8]]

    array.flatMap(x => [x * 2]);
    // [2, 4, 6, 8]
because the right way would be array.map(x => x * 2); anyway.

What am I missing? What is a realistic scenario where an array needs to be flattened?


> What is a realistic scenario where an array needs to be flattened?

Concatenating the result of a paginated API.

Showing all the objects two or more 1:N steps away from you in the object graph. The events your friends are attending, the issues your coworkers are working on, the people belonging to any of your same groups. Basically any time you would do a SELECT... JOIN in SQL.
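
A sketch of the pagination case, with a made-up fetchPage API that returns { items, nextCursor }:

  async function fetchAllItems(fetchPage) {
    const pages = [];
    let cursor;
    do {
      const page = await fetchPage(cursor);
      pages.push(page.items);
      cursor = page.nextCursor;
    } while (cursor);
    return pages.flat(); // array of page arrays -> one array of items
  }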


If you’re doing that aren’t you kinda defeating the point of pagination? Paginated APIs to me are a contract that says the size of the results can be arbitrarily large and there be dragons if you try to fit them in memory.

Given how many wrappers around paginated APIs exist to unpaginate them, I must be wrong, but it still bugs me.


Sometimes the pagination is optional to allow the client to only get as many results as they can handle.

But far more often, the pagination is forced - get 100 results, hit this link for the next 100 - in order to limit the load on the server.

Most trivial scenario, client runs a search that's too generic, you don't want to waste server resources actually preparing 999999 results.


It occurs commonly any time you are dealing with nested data.

For example: at work we deal with "roles" (professions), and underneath those there can be different specialisms. In some of our views we have filters that users can use to show a subset of the data. There is a filter for roles, and a filter for specialisms. But the specialisms filter should only show options that are relevant given the roles selected in the roles filter. So the code for generating the options to display in the specialisms filter dropdown is something like:

    const availableSpecialisms = roles
      .filter(role => selectedRoles.includes(role.id))
      .flatMap(role => role.specialisms)


Consider a parent -> child relation. You have an array `parents`, but want all their children.

    parents.flatMap(parent => parent.children)


Some sort of function that returns a variable number of values.

It's basically the same as the list monad in haskell, and there are examples here https://en.wikibooks.org/wiki/Haskell/Understanding_monads/L... that can be followed even if you don't know any Haskell.


Let's say I have a list of friend objects:

myFriends: { name: string, friends: friend[] }[]

If I wanted to get a list of friends of friends, I could do something like

myFriends.map(prop('friends')).flat()

Or

myFriends.flatMap(prop('friends'))

My data is a list of lists but I really want a list of friends so I flatten.

Maybe the Friend object is poorly created/designed but often times you'll just have to deal with whatever the API gives you.


When `x => [x * 2]` is a function that returns an array, but you do not want an array of arrays. Doesn't make much sense with such a lambda, of course.

And of course you could call `flatmap` bind, if you prefer ;)


I added this to the post as well: a case where there's an array inside an array of objects and you want all of those values. For example, if we want to extract all roles from the array:

[ { name: "Saransh Kataria" , roles: ['system-admin', 'developer'] }, { name: "Wisdom Geek" , roles: ['basic'] }, ].flatMap(x => x.roles);

// Output => ["admin", "system-admin", "developer"]


wouldn't the output be ['basic', 'system-admin', 'developer'] rather than array[0] being 'admin'?


You are right that it should be 'basic', not 'admin'. But the order should be 'system-admin', 'developer', and finally 'basic'.


you are right, I was trying to format it for HN but somehow screwed up the copy-pasting; it'd be ["system-admin", "developer", "basic"] and not ['basic', 'system-admin', 'developer']


I'm glad JS is adding these methods to Array. It would be nice if iterables had similar methods for working with lazy sequences, similar to Rust, but this isn't practical since "iterable" is a protocol. Array can do it because it's a class. Perhaps JS could also adopt something similar to Rust's traits, such that implementing the protocol would make a set of related methods callable on that object.
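
Free functions over iterables do work today, though (and the TC39 iterator-helpers proposal is heading in the method direction). A generator-based sketch, with a made-up lazyMap:

  function* lazyMap(iterable, fn) {
    for (const value of iterable) yield fn(value);
  }

  function* naturals() {
    for (let n = 0; ; n++) yield n; // infinite is fine; nothing is materialized
  }

  const squares = [];
  for (const sq of lazyMap(naturals(), n => n * n)) {
    if (squares.length === 3) break;
    squares.push(sq);
  }
  // squares: [0, 1, 4]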



More like ; than ,


This stirs up some memories of Perl, where flattening arrays/lists is the default, and you need explicit references to have nested arrays or hash tables. I remember it used to be a kind of foot-seeking gun for newbies, but I never ran into trouble with it once I understood references (which isn't super hard).


A lot of functional primitives have really counterintuitive names for me. flatMap is one of them.


agreed, though I doubt there's anything we can do about it.


In F# we call it 'collect' instead, partially for this reason.


When I tried to come up with a better name for flatMap, I got gather.

So maybe collect is a better idea.



