动态查询
概览
Wren 引擎提供了多种方式来定义您的计算,例如计算字段或指标。然而,默认情况下,Wren 引擎中的所有数据集都是静态的。无论您是否使用它们,它们都会自动生成。
在本节中,我们将介绍一种称为动态查询的查询模式,它将使 Wren 引擎更智能,因为它只会生成您已使用的内容。
启用动态查询模式
只需在配置文件中启用特定配置,即可查询您想要的内容。
wren.experimental-enable-dynamic-fields=true
模型中的计算字段动态查询
对于模型,Wren 引擎将仅生成您在查询中使用的计算字段。让我们快速看一个例子。给定两个模型,Orders
和 Customer
。
{
"name": "Customer",
"refSql": "select * from tpch.customer",
"columns": [
{
"name": "custkey",
"type": "integer",
"expression": "c_orderkey"
},
{
"name": "name",
"type": "varchar",
"expression": "c_name"
},
{..}
{
"name": "count_orders",
"type": "int",
"isCalculated": true,
"expression": "sum(orders.orderkey)"
},
],
"primaryKey": "custkey"
},
{
"name": "Orders",
"refSql": "select * from tpch.orders",
"columns": [
{
"name": "orderkey",
"type": "integer",
"expression": "o_orderkey"
},
{
"name": "custkey",
"type": "integer",
"expression": "o_custkey"
},
{
"name": "totalprice",
"type": "integer",
"expression": "o_totalprice"
},
{
"name": "customer",
"type": "Customer",
"relationship": "CustomerOrders"
},
{
"name": "customer_name",
"type": "varchar",
"isCalculated": true,
"expression": "customer.name"
},
{
"name": "customer_name_lowercase",
"type": "varchar",
"isCalculated": true,
"expression": "lower(customer.name)"
}
],
"primaryKey": "orderkey"
}
默认情况下,如果您查询 Orders
并选择特定列,例如
select customer_name from Orders
您将获得如下 SQL
WITH
"Customer" AS (
SELECT
"Customer"."custkey" "custkey"
, "Customer"."name" "name"',count_orders
FROM
(
SELECT
"Customer"."custkey" "custkey"
, "Customer"."name" "name"
FROM
(
SELECT
c_orderkey "custkey"
, c_name "name"
FROM
(
SELECT *
FROM
tpch.customer
) "Customer"
) "Customer"
) "Customer"
)
, "Orders" AS (
SELECT
"Orders"."orderkey" "orderkey"
, "Orders"."custkey" "custkey"
, "Orders"."totalprice" "totalprice"
, "Orders_relationsub"."customer_name" "customer_name"
, "Orders_relationsub"."customer_name_lowercase" "customer_name_lowercase"
FROM
((
SELECT
"Orders"."orderkey" "orderkey"
, "Orders"."custkey" "custkey"
, "Orders"."totalprice" "totalprice"
FROM
(
SELECT
o_orderkey "orderkey"
, o_custkey "custkey"
, o_totalprice "totalprice"
FROM
(
SELECT *
FROM
tpch.orders
) "Orders"
) "Orders"
) "Orders"
LEFT JOIN (
SELECT
"Orders"."orderkey"
, "Customer"."name" "customer_name"
, lower("Customer"."name") "customer_name_lowercase"
FROM
((
SELECT
o_orderkey "orderkey"
, o_custkey "custkey"
, o_totalprice "totalprice"
FROM
(
SELECT *
FROM
tpch.orders
) "Orders"
) "Orders"
LEFT JOIN "Customer" ON ("Customer"."custkey" = "Orders"."custkey"))
) "Orders_relationsub" ON ("Orders"."orderkey" = "Orders_relationsub"."orderkey"))
)
SELECT customer_name
FROM
Orders
您可以在生成的 SQL 中找到 Orders 模型的输出
"Orders" AS (
SELECT
"Orders"."orderkey" "orderkey"
, "Orders"."custkey" "custkey"
, "Orders"."totalprice" "totalprice"
, "Orders_relationsub"."customer_name" "customer_name"
, "Orders_relationsub"."customer_name_lowercase" "customer_name_lowercase"
FROM
它包含所有列,无论您是否使用过它们。
这可能会导致一些问题
- 容易遇到循环依赖错误。
- 查询引擎需要额外的优化工作。
- 如果数据源无法很好地优化未使用的投影,可能会产生额外的计算成本。
如何启用动态查询?
设置配置后,提交相同的查询
select customer_name from Orders
我们可以得到生成的 SQL
WITH
"Customer" AS (
SELECT
"Customer"."custkey" "custkey"
, "Customer"."name" "name"
FROM
(
SELECT
"Customer"."custkey" "custkey"
, "Customer"."name" "name"
FROM
(
SELECT
c_orderkey "custkey"
, c_name "name"
FROM
(
SELECT *
FROM
tpch.customer
) "Customer"
) "Customer"
) "Customer"
)
, "Orders" AS (
SELECT
"Orders"."orderkey" "orderkey"
, "Orders"."custkey" "custkey"
, "Orders"."totalprice" "totalprice"
, "Orders_relationsub"."customer_name" "customer_name"
FROM
((
SELECT
"Orders"."orderkey" "orderkey"
, "Orders"."custkey" "custkey"
, "Orders"."totalprice" "totalprice"
FROM
(
SELECT
o_orderkey "orderkey"
, o_custkey "custkey"
, o_totalprice "totalprice"
FROM
(
SELECT *
FROM
tpch.orders
) "Orders"
) "Orders"
) "Orders"
LEFT JOIN (
SELECT
"Orders"."orderkey"
, "Customer"."name" "customer_name"
FROM
((
SELECT
o_orderkey "orderkey"
, o_custkey "custkey"
, o_totalprice "totalprice"
FROM
(
SELECT *
FROM
tpch.orders
) "Orders"
) "Orders"
LEFT JOIN "Customer" ON ("Customer"."custkey" = "Orders"."custkey"))
) "Orders_relationsub" ON ("Orders"."orderkey" = "Orders_relationsub"."orderkey"))
)
SELECT customer_name
FROM
Orders
您会发现 Orders 的输出是不同的。
"Orders" AS (
SELECT
"Orders"."orderkey" "orderkey"
, "Orders"."custkey" "custkey"
, "Orders"."totalprice" "totalprice"
, "Orders_relationsub"."customer_name" "customer_name"
FROM
它只包含您需要的列。这解决了当左右两侧都使用关系时无法查询模型的问题。
动态查询在视图中也有效
在动态查询模式下,视图中存储的 SQL 也遵循此行为,无论您创建此视图时是否启用了动态查询。视图的行为取决于运行时配置。您可以期望它与直接查询视图语句相同。