Erik Meijer: The World According to LINQ

smhinsey · on Sept 4, 2011

Syntactically LINQ is a breath of fresh air, but it's totally the opposite when it comes to debugging. Right now the experience of debugging a nested loop (which typically could be converted to some form of LINQ) is way better than if you look at the equivalent LINQ.

It's totally possible I am missing some magical IDE feature, but if you want to do something like take a look at the 4th item in some intermediary part of a LINQ query, you're out of luck unless you can either break it up into smaller units of syntax or convert it to a loop.

politician · on Sept 4, 2011

Put your cursor on the part of the expression that you want to debug and press F9 to set a breakpoint. Setting breakpoints by clicking in the gutter doesn't work for LINQ like you might expect - you have to use the hotkey.

I frequently hear complaints about debugging LINQ, but Visual Studio supports it well enough. Microsoft could have done a better job informing people about it.

smhinsey · on Sept 4, 2011

Wow, that is probably going to help a lot, thanks. I've been using VS since the 90s and I still feel like I don't know half of what it can do.

I'm not sure how to fix this, but I am pretty sure the increasing tendency to release documentation as videos is not the answer.

sausagefeet · on Sept 4, 2011

Does anyone know what kind of optimizations LINQ gets in the VM? Does it do something like stream fusion? I wish MS had decided to make .NET multiplatform. There are a lot of really great things on there, but Windows just isn't my cup of tea.

MichaelGG · on Sept 4, 2011

I don't believe the VM is aware of LINQ at all. As for optimization, there's no guarantee of purity with CLR code, so the compiler/JIT cannot do very much. (That is, a query composed of multiple parts cannot be turned into a nice simple loop.) The generated code creates enumerable objects, and possibly closure objects which are called via delegates.

On the topic of multiplatform, Mono has been working very well for many people. http://www.mono-project.com/

papaf · on Sept 3, 2011

His example of counting unique words is not impressive compared to the unix command line:

     sed -e 's/ \+/\n/' words.txt | uniq -c

wanorris · on Sept 3, 2011

Unix shell scripts are fantastic for what they do. But most people find that it's easier (and more performant) to build, say, web applications using a programming language than by using shell scripts over CGI. Assuming you commit to a programming language, you then either commit to multiparadigmatic programming and call out into shell whenever you need to do work like that, or you give up shell inside your program in favor of using the API tools inside the language. Outside of possibly Perl, calling out into shell for things like that isn't very common anywhere I've ever worked, and with good reason.

Once you commit to a programming language instead of shell, you have various types of IO to worry about. What you want might be in an array or list, a file, a database table, an XML file, or the response from a web service. Typically, you need to write a decent chunk of code for each of these sources, either for abstraction or for marshaling.

With LINQ, all you need is a change in the upfront handling, and you can reuse the same comprehension logic across all the different data sources. This is a pretty nice deal, all things considered.

And really, one way of looking at adding comprehensions like this to languages is to make programming in languages more like chaining together code in a shell script -- something that programming in traditional Algol-descended languages is not normally like at all.

joblessjunkie · on Sept 3, 2011

You're either trolling, or you've completely missed the point.

This is not an article about trivial word counting. This is an article about the mathematical abstractions behind all batch data processing (unix pipes included).

Just because we already know how to count small numbers of words in bash doesn't mean we can't learn anything from this article.

epistasis · on Sept 3, 2011

My apologies for being pedantic, but that's not quite right. To do something similar you'd need

  sed 's/ \+/\n/g' words.txt | tr A-Z a-z | sort | uniq -c

And to get the top five, you'd need another sort:

  sed 's/ \+/\n/g' words.txt | tr A-Z a-z | sort | uniq -c | sort -gr | head -5

Where the delimiters of interest go where the space is in the sed expression. Of course, I can never remember what strange regex syntax sed uses, so I have to spend 5 minutes puzzling through the man page.
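
For what it's worth, here is a rough sketch of the same pipeline that sidesteps the sed regex question by letting tr do the splitting. The function name top_words and the sample text are made up for illustration, and it treats any run of non-letters as a delimiter, which may or may not be what you want:

```shell
# top_words is a hypothetical helper name, not something from the article.
# Reads text on stdin and prints the N most frequent words (default 5).
top_words() {
    tr -cs 'A-Za-z' '\n' |   # split: every run of non-letters becomes one newline
        tr 'A-Z' 'a-z' |     # fold case so "The" and "the" count together
        sort |               # uniq only merges adjacent duplicates, so sort first
        uniq -c |            # prefix each distinct word with its count
        sort -rn |           # highest counts first
        head -n "${1:-5}"    # keep the top N lines
}

# Usage with a made-up sample instead of words.txt:
printf 'the cat The dog\nthe bird a cat\n' | top_words 2
```

With that sample, "the" (3 occurrences) comes out first and "cat" (2) second. The order among words with equal counts depends on the locale, so don't rely on it.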

Jacob4u2 · on Sept 3, 2011

That's not exactly apples to apples. The PowerShell equivalent is probably more comparable.

Also, it all depends on what you value; cryptic one liners will definitely get you geek cred, but working with people not as smart as you will not be fun when you have to explain and maintain code like that.

Of course, you may never have to worry about that.

michaelcampbell · on Sept 3, 2011

Mentioning powershell and denigrating "cryptic one liners" in the same post almost made me spit milk out my nose.

fhars · on Sept 4, 2011

You may not realize it, but your funny little quip actually relies on a deep mathematical equivalence: LINQ is basically syntax for monads, and unix shell programming can be seen as monadic, too: http://okmij.org/ftp/Computation/monadic-shell.html

So, in a way, yes, it is the same, but that is exactly the point.

prototype56 · on Sept 3, 2011

Now use the same syntax to query Yahoo Weather. Get the point?

epistasis · on Sept 3, 2011

I don't really get the point. Unix has been fantastic at scraping and munging text for decades.

  curl http://weather.yahoo.com/united-states/california/san-jose-2488042/ | sed -n '/Current conditions/s/.*id="yw-temp">\([0-9]\+\).*/\1/p'

It may be fragile, but any method of extracting data out of HTML is going to be fragile when the provider changes design or layout.

A tiny bit of knowledge of grep, sed, and awk, and other simple unix text utilities such as join, comm, cut, paste, goes a long long way.
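
As a small illustration of how far those utilities go, here is a sketch that joins two keyed text files and averages a column, roughly what a join plus a group-by would do. The file names, airport codes, city names, and readings are all made up for the example:

```shell
# Made-up sample data: city codes and temperature readings.
# join expects both inputs sorted on the key (column 1), hence the sort.
dir=$(mktemp -d)
printf 'sjc San_Jose\nsfo San_Francisco\n' | sort > "$dir/cities"
printf 'sjc 21\nsfo 16\nsjc 23\n' | sort > "$dir/temps"

# join pairs up lines that share the first field;
# awk then averages the readings per city name.
report=$(join "$dir/cities" "$dir/temps" |
    awk '{ sum[$2] += $3; n[$2]++ } END { for (c in sum) print c, sum[c] / n[c] }' |
    sort)
echo "$report"

# Clean up the temporary directory.
rm -r "$dir"
```

It falls over as soon as a field contains whitespace, which is the usual trade-off with line-oriented tools.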

Jacob4u2 · on Sept 3, 2011

The example happens to use "munging" text, but I think the GP is trying to make the point that you can't use sed (effectively) to parse, for instance, a collection of database entries from an SQL server in the same way that LINQ would be able to do so.

The tl;dr I got from the article was that LINQ is effective at working with sets of data, not just sets of text data from a text file.

_delirium · on Sept 3, 2011

True, although Plan9 pushed that part of the Unix philosophy even further, towards where it arguably handles some of those more general cases as well, with "structural regexes" that work on things other than collections of lines: http://doc.cat-v.org/bell_labs/structural_regexps/

Jacob4u2 · on Sept 3, 2011

Isn't the general consensus that regexes are hard to maintain and debug? I'm not sure that "structural regexes" are solving the right problem.

I think maybe we're not shooting at the same baskets though (basketball reference, apologies if you're not from the US); I'm trying to write software applications, not shell scripts.

papaf · on Sept 3, 2011

For illustration (probably wrong again):

    curl 'http://weather.yahooapis.com/forecastrss?w=615702&u=c' | \
        xmlstarlet sel -N 'y=http://xml.weather.yahoo.com/ns/rss/1.0' \
              -t -m '//y:forecast' -v '@text'

edit: corrected xml line

moomin · on Sept 4, 2011

I'm sorry, but you're still missing the point. The point is that the syntax with LINQ is actually the same. Saying you have a program to do X in bash is equivalent to saying you've got a library to do X in C# or Java.

LINQ, on the other hand, can work with arbitrary runtime objects and arbitrary backend implementations. Haskell's still cooler, though.