Monday, October 15, 2012

Functional Query Functions

So, five months ago I said I’d cover functional query utility functions “next time”. Well, next time is finally here. Sorry for the wait; hope it was worth it.

Before I begin, though, two quick notes:
  1. Functional queries have no performance advantage. They are identical to qsql queries in speed and memory usage.
  2. Funcitonal queries are hard to maintain. Even using the techniques described here, functional queries are still harder to read and understand than qsql queries.
Therefore, functional queries should only be used when qsql is incapable of accomplishing the task at hand. (Typically, this involves dynamically choosing columns.)

As mentioned previously, getting the parse trees of the c, b, and a arguments to functional queries right is quite tricky. One technique I frequently use to make this easier is to use utility functions to abstract away as much of that complexity as possible. In their most basic form, they simply generate (composable) elements of the relevant structures:
q)c:{parse["select from t",$[count x;" where ",x;""]]. 2 0}
q)b:{parse["select",$[count x;" by ",x;""]," from t"]3}
q)a:{parse["select ",x," from t"]4}
For example:
q)t:([]x:1 2 3 4;y:1 1 2 3;z:7 8 9 10)
q)select sum x by y from t where x<>1
y| x
-| -
1| 2
2| 3
3| 4
q)?[t;c"x<>1";b"y";a"sum x"]
y| x
-| -
1| 2
2| 3
3| 4
q)
In practice, of course, these functions should be used to express in qsql whatever parts of a query are so expressible, while reserving the actual parse trees for the parts of the query that need them:
q)show each t{?[x;c"x<>1";b"y";(enlist y)!enlist(sum y)]}/:`x`z;
y| x
-| -
1| 2
2| 3
3| 4
y| z
-| --
1| 8
2| 9
3| 10
q)
More specialized utilities can also be written, once their need becomes apparent. One particularly difficult thing to express in functional form is a multi-column fby:
q)t2::update k:`a`a`b`c from t
q)select from t2 where i=(last;i)fby([]y;k)
x y z  k
--------
2 1 8  a
3 2 9  b
4 3 10 c
q)parse"select from t2 where i=(last;i)fby([]y;k)"
?
`t2
,,(=;`i;(k){@[(#y)#x[0]0#x 1;g;:;x[0]'x[1]g:.=y]};(enlist;last;`i);(+:;(!;,`y`k;(enlist;`y;`k)))))
0b
()
q)
The primary problem is that the parse of this query contains the instructions for assembling a table literal out of its individual pieces. Here’s the functional equivalent, written all the way out:
q)?[t2;enlist(=;`i;(fby;(enlist;last;`i);(flip;(!;enlist`y`k;(enlist;`y;`k)))));0b;()]
x y z  k
--------
2 1 8  a
3 2 9  b
4 3 10 c
q)
Using a utility function, it can instead be written much more simply:
q)fbyx:{.[parse["select from t where ",x,"fby c"]. 2 0 0;2 2;:;(flip;(!;enlist(),y;(enlist,y)))]}
q)?[t2;enlist fbyx["i=(last;i)";`y`k];0b;()]
x y z  k
--------
2 1 8  a
3 2 9  b
4 3 10 c
q)
Judicious application of these techniques (not to mention judicious application of functional querying in the first place) can make code much more readable.

Labels: