TheJach.com

Jach's personal blog


Ramblings on data structure literals

I've been stewing a bit about Clojure's pretty data structure literals, and finding myself wanting them in other languages. In Common Lisp, maybe enough to use one of the various reader macros for maps. But what is it I really want?

In PHP, you used to have to do array() to make a new array or list. You could have data literals though: array(1, 2) for lists/arrays, array('foo' => 3, 'bar' => 4) for maps. It worked for me. I didn't need literal syntax for sets or anything else.

Later on I learned Python, and I prefer its syntax a bit more: (1, 2) for a "tuple" (an immutable list), [1, 2] for an array/list/stack/queue/... (very versatile), and {'a': 3, 'b': 4} for a map. In JavaScript it can be about the same for arrays and maps. You can leave off the key quotes for maps, but keeping them (and using double quotes) has the nice side benefit of making the literal more likely to be valid JSON. It depends on whether your values are serializable, of course.
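To make the JSON point concrete, here's a small Python sketch. The variable names are made up for illustration. A dict literal with string keys and plain values goes through the standard json module unchanged, while a value like a set does not serialize.

```python
import json

# Literal syntax for the three structures mentioned above.
point = (1, 2)                 # tuple: fixed-size, immutable
items = [1, 2]                 # list: doubles as array/stack/queue
config = {"a": 3, "b": 4}      # dict: the map literal

# With string keys and plain values, the dict serializes to JSON as-is.
encoded = json.dumps(config)
decoded = json.loads(encoded)

# A value the json module doesn't know how to encode raises TypeError.
try:
    json.dumps({"tags": {1, 2}})   # a set is not JSON-serializable
    set_is_serializable = True
except TypeError:
    set_is_serializable = False
```

Note that json.dumps always emits double-quoted keys, whichever quote style the Python literal used.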

In Java, you get nothing. OK, that's not quite true (anymore). In modern Java you have things like new String[]{"a", "b"} for a classic native array of strings, but usually you want a Collection, and since Java 9 there are static factory methods that let you write things like List.of("a", "b"). The Map interface even has one that lets you write Map.of("key1", "val1", "key2", "val2").

Clojure has some really clean syntax for these that feels very Lispy in a certain sense. When introducing Lisp, a common freak-out is that fn(a, b) turns into (fn a b). But all you're doing is moving the opening paren and dropping the redundant commas (like Python dropped the redundant curlies and semicolons). Clojure does let you add the commas back if you really like them (similarly, Python lets you have semicolons): (fn a, b) works in Clojure since commas are whitespace. In a similar spirit, Clojure drops the unnecessary commas and other syntax characters, and you're left with the essence of the data: '(1 2), [1 2 3], {"a" 2 :b 3}. The addition of symbols and keywords that is common to Lisps makes for particularly nice mappings. It's hard to really describe why "somekey" looks worse than :somekey (and both look worse than SOME_KEY to me, though I have coworkers who disagree) even though it's just one extra character, but it does.

In Scheme and Lisp, you get things like '(1 2 3) aka (list 1 2 3) and #(1 2 3) aka (vector 1 2 3). The lists here are a bit odd, built on cons pairs. It's not that odd if you've ever written a linked list in C before, but the nested-cons can still look weird.

It gets kind of sad when you want a map. Immediately you're down to Java levels of expressiveness! (setf m (make-hash-table)) (setf (gethash :key m) 2) (setf (gethash "b" m) 4) ... Ugh, reminds me of the static Java blocks of making a new HashMap and putting several things in it. And that's probably not even what you want anyway, the default Lisp hash table that is: its default eql test means a fresh "b" string won't find that second entry. Now you try printing out m, and ugh again! It shows you a Java-ism of the pointer where the object lives, not the data literals inside that you want to see.

You can easily make a reader macro to have whatever syntax you want, including Clojure's, or define a (make-my-hash k1 v1 k2 v2) function, but will anyone else use it? What if they don't like your defaults for the underlying implementation?

Lisp provides something about as good that avoids all these questions by not using a hash table under the hood. Instead it uses an "association list", which is just a simple linked list of key/value pairs. You also get some fancy extra features like shadowing that might be useful. Anyway, is '((:key . 2) ("b" . 4)) really so much worse to type than the buttery Clojure version? You can even leave off the dots if you're willing to give up simple cdr to access the value (you'd use cadr or second instead). Another flexible syntax is (pairlis '(:key "b") '(2 4)), where you have a list of keys followed by a list of vals.
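For anyone who hasn't played with alists, here's a rough Python sketch of the idea. It is not how Lisp implements them. The function names acons, assoc, and pairlis are borrowed from Common Lisp, but these are toy versions on Python lists of tuples, they compare keys with ==, and this pairlis picks forward order (the Lisp one is allowed to go either way).

```python
def acons(key, value, alist):
    """Return a new alist with (key, value) pushed on the front."""
    return [(key, value)] + alist


def assoc(key, alist):
    """Return the first (key, value) pair whose key matches, else None."""
    for pair in alist:
        if pair[0] == key:
            return pair
    return None


def pairlis(keys, values):
    """Zip a list of keys and a list of values into an alist."""
    return list(zip(keys, values))


m = pairlis(["key", "b"], [2, 4])         # [("key", 2), ("b", 4)]
shadowed = acons("key", 99, m)            # new binding sits in front of the old one

first = assoc("key", shadowed)            # lookup stops at the first match
original = assoc("key", m)                # the old list is untouched
unshadowed = assoc("key", shadowed[1:])   # drop the front pair, old value is back
```

That's all shadowing is: the newest pair for a key wins because lookup walks from the front, and the older pair is still sitting there if you want it back.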

The bigger objection is probably what happens if you want to use strings for your keys, which you will if you're just trying to slurp a JSON file... Then your lookup has to be (cdr (assoc "bar" '((:key . 3) ("bar" . 4)) :test #'equal)). Maybe there's some way to override the default tests but that would probably break things... Anyway it's an ugh moment.
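The closest Python analogy I can think of for the :test business is identity versus equality. This is a loose sketch, not a faithful model of eql: the assoc function here is a made-up toy that takes a test argument and defaults to identity (operator.is_), and the strings are built at runtime so that, at least in CPython, they are distinct objects with the same characters.

```python
import operator


def assoc(key, alist, test=operator.is_):
    """Return the first pair whose key satisfies test(key, pair_key), else None.

    The default test is object identity, a rough stand-in for how
    Lisp's default test treats two separately created strings.
    """
    for pair in alist:
        if test(key, pair[0]):
            return pair
    return None


stored_key = "".join(["b", "ar"])     # built at runtime
alist = [("key", 3), (stored_key, 4)]
lookup_key = "".join(["ba", "r"])     # same characters, separately built

# Identity test: only matches if lookup_key is the very same object.
by_identity = assoc(lookup_key, alist)

# Equality test: compares contents, like passing :test #'equal.
by_equality = assoc(lookup_key, alist, test=operator.eq)
```

Keyword keys don't hit this problem in Lisp for the same reason the identity test works when you pass the very same object: there is only one :key.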

After stewing, I came to the conclusion that the final ugh moment doesn't matter that much. In most cases where I'm using a map literal, I'm not using string keys, the alist syntax is fine, and there is enough support in the language to manipulate alists. If I really care, I'll just use the FSet library to have syntax like (map (:a 1) (:b 2)) that really creates an immutable map, or if I really really care, the folio2 library on top, where I can have my Clojure-like macro of {:a 2 :b 4}, and call it a day. Then the only thing I think I miss is the fact that in Clojure keys are callable functions of maps, and maps are callable functions of keys.
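To show what that last bit means for anyone who hasn't used Clojure, here's a Python imitation. Keyword and CallableMap are hypothetical classes invented for this sketch. They only mimic the calling convention of (:a m) and (m :a), including the optional not-found default.

```python
class Keyword(str):
    """A string-like key that can be called on a mapping, like (:a m)."""

    def __call__(self, mapping, default=None):
        return mapping.get(self, default)


class CallableMap(dict):
    """A dict that can be called with a key, like (m :a)."""

    def __call__(self, key, default=None):
        return self.get(key, default)


a, b = Keyword("a"), Keyword("b")
m = CallableMap({a: 1, b: 2})

via_key = a(m)                  # key applied to map
via_map = m(b)                  # map applied to key
missing = m(Keyword("c"), 0)    # not-found default

# Because both are plain callables, they drop straight into
# higher-order functions with no lambda wrapper.
values = list(map(m, [a, b]))
```

The payoff is the last line: something like (map :name people) in Clojure needs no wrapper function, which is the part I'd miss.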


Posted on 2018-08-03 by Jach

Tags: clojure, lisp, programming

Permalink: https://www.thejach.com/view/id/353

Trackback URL: https://www.thejach.com/view/2018/8/ramblings_on_data_structure_literals
