跳到主要内容

动态查询

概览

Wren 引擎提供了多种方式来定义您的计算,例如计算字段或指标。然而,默认情况下,Wren 引擎中的所有数据集都是静态的。无论您是否使用它们,它们都会自动生成。

在本节中,我们将介绍一种称为动态查询的查询模式,它将使 Wren 引擎更智能,因为它只会生成您已使用的内容。

启用动态查询模式

只需在配置文件中启用特定配置,即可查询您想要的内容。

wren.experimental-enable-dynamic-fields=true

模型中的计算字段动态查询

对于模型,Wren 引擎将仅生成您在查询中使用的计算字段。让我们快速看一个例子。给定两个模型,OrdersCustomer

{
"name": "Customer",
"refSql": "select * from tpch.customer",
"columns": [
{
"name": "custkey",
"type": "integer",
"expression": "c_orderkey"
},
{
"name": "name",
"type": "varchar",
"expression": "c_name"
},
{..}
{
"name": "count_orders",
"type": "int",
"isCalculated": true,
"expression": "sum(orders.orderkey)"
},
],
"primaryKey": "custkey"
},
{
"name": "Orders",
"refSql": "select * from tpch.orders",
"columns": [
{
"name": "orderkey",
"type": "integer",
"expression": "o_orderkey"
},
{
"name": "custkey",
"type": "integer",
"expression": "o_custkey"
},
{
"name": "totalprice",
"type": "integer",
"expression": "o_totalprice"
},
{
"name": "customer",
"type": "Customer",
"relationship": "CustomerOrders"
},
{
"name": "customer_name",
"type": "varchar",
"isCalculated": true,
"expression": "customer.name"
},
{
"name": "customer_name_lowercase",
"type": "varchar",
"isCalculated": true,
"expression": "lower(customer.name)"
}
],
"primaryKey": "orderkey"
}

默认情况下,如果您查询 Orders 并选择特定列,例如

select customer_name from Orders

您将获得如下 SQL

WITH
"Customer" AS (
SELECT
"Customer"."custkey" "custkey"
, "Customer"."name" "name"',count_orders
FROM
(
SELECT
"Customer"."custkey" "custkey"
, "Customer"."name" "name"
FROM
(
SELECT
c_orderkey "custkey"
, c_name "name"
FROM
(
SELECT *
FROM
tpch.customer
) "Customer"
) "Customer"
) "Customer"
)
, "Orders" AS (
SELECT
"Orders"."orderkey" "orderkey"
, "Orders"."custkey" "custkey"
, "Orders"."totalprice" "totalprice"
, "Orders_relationsub"."customer_name" "customer_name"
, "Orders_relationsub"."customer_name_lowercase" "customer_name_lowercase"
FROM
((
SELECT
"Orders"."orderkey" "orderkey"
, "Orders"."custkey" "custkey"
, "Orders"."totalprice" "totalprice"
FROM
(
SELECT
o_orderkey "orderkey"
, o_custkey "custkey"
, o_totalprice "totalprice"
FROM
(
SELECT *
FROM
tpch.orders
) "Orders"
) "Orders"
) "Orders"
LEFT JOIN (
SELECT
"Orders"."orderkey"
, "Customer"."name" "customer_name"
, lower("Customer"."name") "customer_name_lowercase"
FROM
((
SELECT
o_orderkey "orderkey"
, o_custkey "custkey"
, o_totalprice "totalprice"
FROM
(
SELECT *
FROM
tpch.orders
) "Orders"
) "Orders"
LEFT JOIN "Customer" ON ("Customer"."custkey" = "Orders"."custkey"))
) "Orders_relationsub" ON ("Orders"."orderkey" = "Orders_relationsub"."orderkey"))
)
SELECT customer_name
FROM
Orders

您可以在生成的 SQL 中找到 Orders 模型的输出

"Orders" AS (
SELECT
"Orders"."orderkey" "orderkey"
, "Orders"."custkey" "custkey"
, "Orders"."totalprice" "totalprice"
, "Orders_relationsub"."customer_name" "customer_name"
, "Orders_relationsub"."customer_name_lowercase" "customer_name_lowercase"
FROM

它包含所有列,无论您是否使用过它们。

这可能会导致一些问题

  • 容易遇到循环依赖错误。
  • 查询引擎需要额外的优化工作。
    • 如果数据源无法很好地优化未使用的投影,可能会产生额外的计算成本。

如何启用动态查询?

设置配置后,提交相同的查询

select customer_name from Orders

我们可以得到生成的 SQL

WITH
"Customer" AS (
SELECT
"Customer"."custkey" "custkey"
, "Customer"."name" "name"
FROM
(
SELECT
"Customer"."custkey" "custkey"
, "Customer"."name" "name"
FROM
(
SELECT
c_orderkey "custkey"
, c_name "name"
FROM
(
SELECT *
FROM
tpch.customer
) "Customer"
) "Customer"
) "Customer"
)
, "Orders" AS (
SELECT
"Orders"."orderkey" "orderkey"
, "Orders"."custkey" "custkey"
, "Orders"."totalprice" "totalprice"
, "Orders_relationsub"."customer_name" "customer_name"
FROM
((
SELECT
"Orders"."orderkey" "orderkey"
, "Orders"."custkey" "custkey"
, "Orders"."totalprice" "totalprice"
FROM
(
SELECT
o_orderkey "orderkey"
, o_custkey "custkey"
, o_totalprice "totalprice"
FROM
(
SELECT *
FROM
tpch.orders
) "Orders"
) "Orders"
) "Orders"
LEFT JOIN (
SELECT
"Orders"."orderkey"
, "Customer"."name" "customer_name"
FROM
((
SELECT
o_orderkey "orderkey"
, o_custkey "custkey"
, o_totalprice "totalprice"
FROM
(
SELECT *
FROM
tpch.orders
) "Orders"
) "Orders"
LEFT JOIN "Customer" ON ("Customer"."custkey" = "Orders"."custkey"))
) "Orders_relationsub" ON ("Orders"."orderkey" = "Orders_relationsub"."orderkey"))
)
SELECT customer_name
FROM
Orders

您会发现 Orders 的输出是不同的。

"Orders" AS (
SELECT
"Orders"."orderkey" "orderkey"
, "Orders"."custkey" "custkey"
, "Orders"."totalprice" "totalprice"
, "Orders_relationsub"."customer_name" "customer_name"
FROM

它只包含您需要的列。这解决了当左右两侧都使用关系时无法查询模型的问题。

动态查询在视图中也有效

在动态查询模式下,视图中存储的 SQL 也遵循此行为,无论您创建此视图时是否启用了动态查询。视图的行为取决于运行时配置。您可以期望它与直接查询视图语句相同。