Jach's personal blog

(Largely containing a mind-dump to myselves: past, present, and future)
Current favorite quote: "Supposedly smart people are weirdly ignorant of Bayes' Rule." William B Vogt, 2010

My preference for dynamic typing

Alternatively: why I lost all excitement about Clojure's core.typed library after learning it makes no speed improvements.

There's a little-known paper published a few years ago called "An Experiment About Static and Dynamic Type Systems: Doubts About the Positive Impact of Static Type Systems on Development Time".

Go read the paper, it's short. But since I know you're lazy, I'll give a brief overview of the experimental setup and its results in my own biased words. The authors made a programming language and IDE and had two versions of it: one with static typing and one with dynamic typing. They took a group of students, split them into a dynamic group and a static group, and trained each group on its version -- the training for the static version took a little longer since they had to cover the type system. Then the students were asked to implement a scanner and a parser.

For the scanner task, the dynamic group finished faster. This surprised the researchers, because they figured that if nothing else the static group would have an advantage in development time from the better, self-documenting code that types supposedly give you (one of the claims static typing believers often make). So they attempted to estimate time spent debugging, where they thought types should win out for sure. Their estimate does show static typers spending less time, but the statistical test they used does not let them conclude there is any meaningful difference. In other words, there's no evidence here that the type system affects debugging time, and in any case the total time to completion was shorter for the dynamic group.

For the parser task, the researchers were interested in the quality of students' solutions, and so they had a bunch of test cases. Notably, only one student (in the static typing camp) had 100% success in passing the test cases. However, they found no statistical difference in the percentage of passed test cases between the two groups, even after removing those students who had a 50% (the lowest) success rate, or when comparing the under-performing and outperforming students in each group. So in terms of software quality, the study found no benefit or disadvantage to using either type system. Finally they checked debugging times for the parser task, and were blown away: the dynamic group debugged faster.

In summary, this paper shows two tasks where dynamic typing means the task is finished sooner than with static typing, and in one of the two tasks debugging time was also faster for dynamic typing while in the other there was no difference. It also shows that there was no difference in end-product quality for the parser task.

There are some problems with the paper. It's important to have replication and to my knowledge (admittedly I haven't looked hard) there haven't been numerous follow-up studies (the paper actually contradicts the results of a previous paper) -- there ought to be more studies, with the same setup and with different setups. I'd like to see a version with a static language including generics, and a dynamic language including a REPL. I think that the wins in development and debugging time for dynamic languages become even more impressive when a REPL is used. If you're using your REPL properly, you'll rarely have to run your program and wait for a piece of code to run before discovering a no-such-function or cannot-use-this-type-here error that a static system would have caught at compile time. (Plus it's not like dynamic languages don't have linters and other means of static analysis available.)
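To make the REPL point concrete, here's a minimal sketch in Python (my pick for illustration; the paper used its own language). Every name in it, including the `try_call` helper, is made up for this example. It shows a no-such-function error and a wrong-type error surfacing the moment you poke at a function the way you would in a REPL, and also that a code path you never exercise stays silent:

```python
def total_length(tokens):
    # Bug on purpose: `lenght` is a typo, so no function with this name exists.
    return sum(lenght(t) for t in tokens)


def join_tokens(tokens):
    # str.join only accepts strings, so passing ints is a type error.
    return "+".join(tokens)


def try_call(fn, *args):
    """Call fn as you would at a REPL; return its result or the error's class name."""
    try:
        return fn(*args)
    except (NameError, TypeError) as exc:
        return type(exc).__name__


# Defining the functions raised nothing; the errors only show up when called.
print(try_call(total_length, ["if", "x"]))
print(try_call(join_tokens, [1, 2]))

# An empty input never reaches the typo, so the bug hides on this path.
print(try_call(total_length, []))
```

The last call is the catch: the REPL only protects you on the paths you actually try. A linter such as pyflakes would flag the undefined name without running anything.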

The paper is also littered with qualifications on claims. I agree with their title that the amount of evidence is only sufficient to inspire "doubts" about static typing's positive claims to a true believer, and that a default position of "typing doesn't matter to development time" is weakly shifted to favor dynamic typing being faster. But going with a different conservative position that a static type system offers no benefits in development time or quality of the finished work, one starts to wonder: why the hell would I want to use one?

The paper only tests single-programmer efficiency and quality -- perhaps static typing shows benefits on one or the other or both when working on a large project involving several programmers. It may also be that certain tasks are better suited to static or dynamic typing in their implementations. More powerful type systems like the one found in Haskell can also do more work for you and function as more than just ceremony. Adding in user-defined types can make intent and restrictions more evident and cut back on assert() tests. This all needs more research. My bias is that there's little or no benefit with static typing in these cases and there may even be detriments, especially if mutability is the default and enforcing immutability requires type declarations.
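Here's a rough sketch of the user-defined-types point, again in Python and with invented names (`TokenKind`, `Token`, and both `describe` functions are hypothetical). One caveat up front: Python does not enforce annotations at runtime, so the "static" half only pays off if you run an external checker such as mypy over it.

```python
from dataclasses import dataclass
from enum import Enum


class TokenKind(Enum):
    IDENT = "ident"
    NUMBER = "number"


@dataclass(frozen=True)  # frozen: instances cannot be mutated after creation
class Token:
    kind: TokenKind
    text: str


def describe_untyped(kind, text):
    # Without a dedicated type, each consumer re-checks its inputs by hand.
    assert kind in ("ident", "number"), "unknown token kind"
    assert isinstance(text, str), "text must be a string"
    return f"{kind}:{text}"


def describe(token: Token) -> str:
    # The signature carries the intent, so the asserts above are not repeated.
    return f"{token.kind.value}:{token.text}"


print(describe_untyped("ident", "x"))
print(describe(Token(TokenKind.IDENT, "x")))
```

The restriction on token kinds now lives in one place: `TokenKind("bogus")` raises a `ValueError`, so an unknown kind can't sneak into a `Token` through the enum. Whether that's worth the extra ceremony is exactly the open question.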

To me, there's only one clear benefit of static typing that matters: efficiency of execution on a real machine. Sure, dynamic languages have made great progress on improving their execution efficiency (specifically with JIT or AOT compilers or just by being a super-simple on-the-metal language such as Forth), but in general you're going to find that a solution in C or C++ is going to beat a pure Python or Ruby or Lisp solution (but not always, especially in the case of Lisp), and that a pure Java solution is generally going to beat a Clojure or Groovy or Jython solution.

But for most programming cases, is that reduction of execution time worth the cost of extra development time? I don't think so; hardware is fast (so you may not need the speedup) and more hardware is cheap (possibly cheaper than paying an engineer the extra time to do the statically typed version). I also think that for most popular languages the means to talk to compiled C binaries aren't that hard or time consuming, with the majority of the time being writing the C itself, so in the cases where you do need that extra performance a smaller statically typed version of the program is waiting for you. This is in some sense a duplication of effort and will cost development time, but the total development time will still be less than what it would be if you used the static language for everything to begin with.
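As a small illustration of how little glue the "talk to compiled C" step can take, here's a sketch using Python's standard ctypes module to call `strlen` from the C standard library. It assumes a POSIX-like system (Linux or macOS) where the C library can be found or is already loaded into the process; the `c_strlen` wrapper is a name invented for this example. In a real project you'd point it at your own compiled library instead.

```python
import ctypes
import ctypes.util

# Assumption: POSIX-like system. find_library may return None, in which case
# CDLL(None) falls back to the symbols already loaded into this process.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(text):
    """Return the length in bytes of text (UTF-8 encoded), computed by C's strlen."""
    return libc.strlen(text.encode("utf-8"))


print(c_strlen("hello"))
```

Note that the only typed part is the two-line signature declaration at the boundary; everything on the Python side stays dynamic.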

This brings me back to core.typed, which as a project and as an implementation of optional types is pretty impressive, and if its obviously smart devs are enjoying their work on it I have nothing against them. I'm left with just one angry question, which comes from my existing bias toward dynamic typing (a bias supported by research such as the paper mentioned in this post): why the fuck would anyone use this when there are no gains in speed?! Speed is the only real reason to deal with that shit. Bleh.

Posted on 2014-05-20 by Jach

Tags: personal, programming

