Friday 1 December 2017

Cast your NULLs in case you use CASE!

You might tolerate a programming environment which is unable to cast (for example) numbers in string "containers" to numbers, but not being able to cast a NULL value from one datatype to another could drive any Buddhist monk crazy!

So lock your Buddhist monks away, because that can happen in [Oracle] SQL!

However the issue could manifest itself in situations not worthy of being known as "best practice!"

(Wow! Three exclamation marks already! Well, five including this sentence. This post better be good.)

You stored your Dates as WHAT?!


You might remember Richard Foote's articles (1, 2, and 3) on the consequences of storing Date values as Numbers or Varchars.

This is another example of the consequences of such bad practices. I recently came across this example, which made this post happen.

SQL>WITH RND AS
(
    SELECT
        CASE
            WHEN DBMS_RANDOM.VALUE(1, 100) <= 50
                THEN 0
            ELSE
                20171009
        END NUM
    FROM DUAL
    CONNECT BY LEVEL < 10
)
SELECT
    CASE
        WHEN NUM = 0 THEN DECODE(NUM,0,NULL,TO_DATE(TO_CHAR(NUM), 'YYYYMMDD'))
        ELSE
            SYSDATE
    END N
FROM RND;
The "RND" factored subquery represents a horizontal-and-vertical slice of a staging table. Dates were stored as Numbers and if not present, stored as 0!

During the ETL the "date" value was merged into a table considering an existing "Date" value from another table (presented by SYSDATE), using a CASE expression. The main SELECT statement represents an equivalent of that MERGE.

Note: Do not take the "WHEN NUM = 0" literally. It's just a simple/silly replacement for a meaningful predicate.

This is the result:
SQL Error: ORA-00932: inconsistent datatypes: expected CHAR got DATE
00932. 00000 -  "inconsistent datatypes: expected %s got %s"

It's too confusing for Oracle, too!


It takes a couple of re-writes to understand that the "DECODE" is not the culprit here.
SQL>WITH RND AS
(
    SELECT
        CASE
            WHEN DBMS_RANDOM.VALUE(1, 100) <= 50
                THEN 0
            ELSE
                20171009
        END NUM
    FROM DUAL
    CONNECT BY LEVEL < 10
)
SELECT
    DECODE(NUM,0,NULL,TO_DATE(TO_CHAR(NUM), 'YYYYMMDD')) AS N
FROM RND;

N                
------------------
09-OCT-17

09-OCT-17

09-OCT-17
09-OCT-17
09-OCT-17

09-OCT-17

9 rows selected.
As you can see the DECODE function has no problems handling the NULL value alongside a value of type Date.

"In case you use CASE"


You don't need to be Bryn Llewelyn to infer that the CASE expression seems to be trying to make sure that the result of all of its conditions have matching data types.

In this case, it's really easy for Oracle to find a reference point. SYSDATE is easily revealing the expected datatype, but if "NUM = 0" the output of the DECODE function would be NULL.

What is the intrinsic data type of NULL?


Just to expose mysel to humour, I'd say: NULL!

My completely uneducated (and somewhat silly) guess is that at some point during interpretation of the CASE expression Oracle must be doing a comparison between the output datatypes of different conditions in the CASE expression and (if you're living in fantasy-land and imagine that Oracle will still be using SQL for that!) it will run a statement like this:

SQL>SELECT
        CASE
            WHEN 'DATE' = NULL THEN 'DATE'
            ELSE 'NOT DATE'
        END
    "DATE = NULL?"
FROM
    dual;

DATE = N?
--------
NOT DATE

Why is DECODE not guilty?


Because of the ridiculous "TO_DATE(NULL)" in the statement below:
SQL>WITH RND AS
(
    SELECT
        CASE
            WHEN DBMS_RANDOM.VALUE(1, 100) <= 50
                THEN 0
            ELSE
                20171009
        END NUM
    FROM DUAL
    CONNECT BY LEVEL < 10
)
SELECT
    CASE
        WHEN NUM = 0 THEN DECODE(NUM,0,TO_DATE(NULL),TO_DATE(TO_CHAR(NUM), 'YYYYMMDD'))
        ELSE
            SYSDATE
    END AS N
FROM RND;

N       
---------
30-NOV-17
30-NOV-17
30-NOV-17




30-NOV-17
30-NOV-17

9 rows selected.

DECODE is not the same as CASE


This is one of the two points which makes me not fully agree with a statement Steven Feuerstein made in one of his recent articles (go to "Hey, What about DECODE?" at the bottom).

DECODE behaves differently than CASE expression, specifically when it comes to dealing with NULL values.
SQL>WITH test_values AS
(
    SELECT A, b
    FROM
    (  
        SELECT 1 A FROM dual
        UNION
        SELECT NULL A FROM dual
    )
    CROSS JOIN
    (
        SELECT 1 b FROM dual
        UNION
        SELECT NULL b FROM dual
    )
)
SELECT
    A,
    b,
        CASE
            WHEN A = b THEN 'EQUAL'
            ELSE 'NOT EQUAL'
        END AS "CASE equality test",
       DECODE(A,b,'EQUAL','NOT EQUAL') AS "DECODE equality test"
FROM
    test_values;

A     B     CASE equality test    DECODE equality test
----- ----- --------------------- --------------------
1     1     EQUAL                 EQUAL             
1     null  NOT EQUAL             NOT EQUAL         
null  1     NOT EQUAL             NOT EQUAL         
null  null  NOT EQUAL             EQUAL            
What we saw in this post had to do with casting data types, but another thing which makes me (sometimes - especially when the number of conditions are one or two) choose the geeky, yet concise DECODE function over the human-readable CASE expression is the fact that DECODE is capable of comparing NULL values (and not only doing an equality test) and inferring TRUE.