How to handle NULL in Polybase – Quick Tip

How to prepare a file for ingestion into Azure SQL Data Warehouse using Polybase, when there are NULLs in the data?

There are only a few web pages that talk about how Polybase handles NULL values… And most of them are misleading. For example:

So here is the answer:

I’ll talk in the context of delimited files. To indicate a NULL in a column you simply put nothing between the delimiters. i.e. a missing value.

E.g. A row with three values, where the second value is NULL would look like this: “A”,,”B”

Simple right? If only those other resources were a bit clearer!

There is one more thing that you need to do, when defining your External File Format you should set the Format Option USE_TYPE_DEFAULT = False. e.g.

CREATE EXTERNAL FILE FORMAT Fmt_psg_SV_Service_335f7_SV_Service
WITH (FORMAT_TYPE = DELIMITEDTEXT,
FORMAT_OPTIONS(
FIELD_TERMINATOR = ‘,’,
STRING_DELIMITER = ‘”‘,
DATE_FORMAT = ”,
USE_TYPE_DEFAULT = False)
)

As per this CREATE EXTERNAL FILE FORMAT (Transact-SQL) article, which explains that, if USE_TYPE_DEFAULT = False, then missing values are treated as NULL. It’s a good explanation, you just need to know to look there !

I’ve tested this, and it works for int, varchar and DateTime type columns.

 

[box type=”download”] If you found this useful please like (via one of the share buttons below) or link to it, to help others find it![/box]

Similar Posts

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.