Serverless SQL query over files
SELECT TOP 100 *
FROM OPENROWSET(
BULK 'https://storageaccount.dfs.core.windows.net/raw/sales/*.parquet',
FORMAT = 'PARQUET'
) AS sales;
SQL and Spark examples
Azure Synapse supports multiple execution models. The examples below show common SQL and Spark patterns used for querying, loading, and transforming data.
SELECT TOP 100 *
FROM OPENROWSET(
BULK 'https://storageaccount.dfs.core.windows.net/raw/sales/*.parquet',
FORMAT = 'PARQUET'
) AS sales;
CREATE EXTERNAL TABLE dbo.SalesExternal
WITH (
LOCATION = 'curated/sales/',
DATA_SOURCE = DataLakeSource,
FILE_FORMAT = ParquetFormat
)
AS SELECT * FROM dbo.SalesStaging;
CREATE TABLE dbo.FactSales
(
SaleId bigint NOT NULL,
CustomerId int NOT NULL,
SaleDate date NOT NULL,
Amount decimal(18,2) NOT NULL
)
WITH
(
DISTRIBUTION = HASH(CustomerId),
CLUSTERED COLUMNSTORE INDEX
);
df = spark.read.parquet('/mnt/raw/sales')
clean = df.dropDuplicates(['SaleId']) \
.filter('Amount > 0')
clean.write.mode('overwrite') \
.parquet('/mnt/curated/sales')
Exact syntax can vary depending on whether you use serverless SQL pools, dedicated SQL pools, Spark notebooks, or pipeline activities. Always match the command to the compute engine and data format in your Synapse workspace.