Including File Properties and Metadata in a U-SQL Script
When working on big data systems, it can be very helpful to include file properties and other metadata directly within the data results. Capturing data lineage comes in handy, especially when reconciling or troubleshooting issues (for instance, if retry logic kicked in somewhere in the data stream and you now have duplicate rows to handle).
I just learned that U-SQL has some new syntax which exposes the following file properties:
- URI (uniform resource identifier)
- Modified date
- Created date
- Length (file size in bytes)
In the following example, I’m using U-SQL (Azure Data Lake Analytics) to iterate over files stored in date-partitioned subfolders under Raw Data in Azure Data Lake Store. Within the schema-on-read definition of the source files (that is, the EXTRACT statement), the new file properties are highlighted in yellow:
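For reference, here is a minimal text sketch of an EXTRACT statement of this kind. It is not the exact script from the screenshot: the folder layout, file path pattern, column names, and data columns are all hypothetical, and the `FILE.URI()`, `FILE.MODIFIED()`, `FILE.CREATED()`, and `FILE.LENGTH()` assignments reflect my understanding of the new syntax, so check them against the current U-SQL documentation before relying on them.

```usql
// Assumed folder layout: /RawData/yyyy/MM/dd/<file>.csv
// All paths, column names, and data columns are illustrative only.
@rawData =
    EXTRACT
        customer_id   int,
        order_amount  decimal,
        // Virtual columns populated from the file set pattern in the FROM clause
        file_date     DateTime,
        file_name     string,
        // Computed columns populated from the properties of each source file
        file_uri      = FILE.URI(),
        file_modified = FILE.MODIFIED(),
        file_created  = FILE.CREATED(),
        file_length   = FILE.LENGTH()
    FROM "/RawData/{file_date:yyyy}/{file_date:MM}/{file_date:dd}/{file_name}.csv"
    USING Extractors.Csv(skipFirstNRows: 1);

// Write the rows, including the file metadata columns, to a single output file.
OUTPUT @rawData
TO "/Output/RawDataWithFileProperties.csv"
USING Outputters.Csv(outputHeader: true);
```

Note that the file property columns take no declared type in the schema; they are assigned from the function call instead. Every row read from a given file carries that file's URI, dates, and length, which is what makes it possible to trace a duplicate row back to the file it came from.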

The output for the virtual columns looks like this:
