### Quick Tip: Statistics on Booleans

Due to `q`

’s type promotion rules, it’s entirely legal to use statistical functions on boolean vectors. `avg`

tends to be the most useful, but all of them should work as expected.

A typical use case: average `null`

ness (handy while developing ETL code). We will simulate with a vector of `float`

s mistakenly parsed as `int`

s (due to mostly looking like `int`

s):

q)v:@[string 1000?1000;-10?1000;,[;".123"]] q)t:flip(enlist`v)!(enlist"I";" ")0:v q)t v --- 468 959 221 694 934 865 344 997 314 580 45 745 898 935 64 177 238 361 850 241 .. q)avg null t v| 0.01 q)

So 1~~0~~% of `t.v`

is null, which (unless this is expected) should cause us to ask ourselves whether `"I"`

was really the right parse.

Labels: tip

## 2 Comments:

Don't you mean 1% of t.v is null? But yeah, avg null is very useful (just used it to see how many nulls in a table's columns)

Oops, yeah, thanks, I'll fix that.

Another useful one is "sum null", which is the same thing as "count where null" but shorter, and

muchshorter, and simpler, when used in a query.## Post a Comment

Subscribe to Post Comments [Atom]

<< Home