From 23865365e89536feb816b321c9bcba40d44155e1 Mon Sep 17 00:00:00 2001
From: Alan Cai <caialan@amazon.com>
Date: Mon, 27 Nov 2023 18:45:50 -0500
Subject: [PATCH] Apply John's feedback - struct -> tuple - list -> array -
 consistent variable definitions - empty subsumption rule - other feedback

---
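Reviewer note (not part of the RFC text): the subsumption rules touched by this
patch (Rules 1.a through 1.e) can be sketched as a small recursive check. This is
only an illustration under stated assumptions, not the reference implementation.
The names `Step`, `Path`, `attr`, `index` and `subsumes` are hypothetical. Roots
are compared by plain string equality, and case-insensitive attribute names are
normalized to lower case. Rule 1.b is read as in the prose: `p` has at least as
many steps as `q` and the steps agree up to the length of `q`.

```python
from typing import NamedTuple, Tuple


class Step(NamedTuple):
    kind: str                     # "attr", "tuple_wildcard", "index" or "collection_wildcard"
    value: object = None          # attribute name or collection index, if any
    case_sensitive: bool = False  # only meaningful for "attr" steps


class Path(NamedTuple):
    root: str
    steps: Tuple[Step, ...] = ()


def attr(name: str, case_sensitive: bool = False) -> Step:
    # Assumption: case-insensitive names are lower-cased so `.a` and `.A` compare equal.
    return Step("attr", name if case_sensitive else name.lower(), case_sensitive)


def index(i: int) -> Step:
    return Step("index", i)


TUPLE_WILDCARD = Step("tuple_wildcard")
COLLECTION_WILDCARD = Step("collection_wildcard")


def _widens(t: Step, s: Step) -> bool:
    """True if step `t` (from q) covers the diverging step `s` (from p)."""
    if s.kind == "attr" and t.kind == "tuple_wildcard":        # Rule 1.c
        return True
    if s.kind == "index" and t.kind == "collection_wildcard":  # Rule 1.d
        return True
    return (                                                   # Rule 1.e
        s.kind == "attr" and s.case_sensitive
        and t.kind == "attr" and not t.case_sensitive
        and s.value.lower() == t.value  # assumption: names must match ignoring case
    )


def _steps_subsume(q: Tuple[Step, ...], p: Tuple[Step, ...]) -> bool:
    if not q:                                  # Rule 1.a: q has no steps
        return True
    if len(p) >= len(q) and p[:len(q)] == q:   # Rule 1.b: q is a prefix of p
        return True
    # Otherwise find the first index at which the two paths diverge.
    i = 0
    while i < len(p) and i < len(q) and p[i] == q[i]:
        i += 1
    if i >= len(p):                            # p ran out of steps first
        return False
    # Rules 1.c to 1.e: q must widen p at step i and subsume the remaining steps.
    return _widens(q[i], p[i]) and _steps_subsume(q[i + 1:], p[i + 1:])


def subsumes(q: Path, p: Path) -> bool:
    """True if exclude path `q` subsumes exclude path `p`."""
    return q.root == p.root and _steps_subsume(q.steps, p.steps)
```

For example, `subsumes(Path("t"), Path("t", (attr("a"),)))` corresponds to the new
`t.a` / `t` row of the examples table and holds by Rule 1.a.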
 RFCs/0051-exclude-operator.adoc | 203 ++++++++++++++++----------------
 1 file changed, 103 insertions(+), 100 deletions(-)
diff --git a/RFCs/0051-exclude-operator.adoc b/RFCs/0051-exclude-operator.adoc
index 13ad1c5..9ab2076 100644
--- a/RFCs/0051-exclude-operator.adoc
+++ b/RFCs/0051-exclude-operator.adoc
@@ -4,7 +4,7 @@
 
 
 * Start Date: 2023-11-07
-* PartiQL Issue: https://github.com/partiql/partiql-spec/issues/39
+* PartiQL Issue: https://github.com/partiql/partiql-lang/issues/27
 * RFC PR: https://github.com/partiql/partiql-docs/pull/51
 
 == Summary
@@ -13,13 +13,13 @@ This doc defines the `EXCLUDE` binding tuple operator used to omit nested values
 
 == Motivation
 
-SQL users often use `SELECT *` to project all of the columns of a table. There is frequently a use case in which a user would like to project all the columns from a table other than a subset of the columns (see https://stackoverflow.com/q/729197[slack overflow question]). There are workarounds in some database systems that are somewhat inefficient (e.g. creating a new table and dropping a specific column), but it can be helpful to have a dedicated syntax to filter out certain columns. <<Prior art>> lists out a few databases that provide some version of this column filtering.
+SQL users often use `SELECT *` to project all of the columns of a table. There is frequently a use case in which a user would like to project all the columns from a table other than a subset of the columns (see https://stackoverflow.com/q/729197[Stack Overflow question]). There are workarounds in some database systems that are somewhat inefficient (e.g. creating a new table and dropping a specific column), but it can be helpful to have a dedicated syntax to filter out certain columns. <<Prior art>> lists out a few databases that provide some version of this column filtering.
 
 There is a similar need among PartiQL users to exclude certain nested fields from semi-structured data. PartiQL supports `SELECT *` to project all of the fields of a binding tuple. If a user wanted to omit one field from this projection, they would need to list out all of the projection fields or perform some intricate combination of `PIVOT` and ``UNPIVOT``s.
 
 [source,partiql,subs="+{markup-in-source}"]
 ----
--- Suppose `tbl` is a collection of structs that have `n` fields, `field~1~,...,field~n~`.
+-- Suppose `tbl` is a collection of tuples that have `n` fields, `field~1~,...,field~n~`.
 -- To filter out `field~i~`, we would have to list out all fields other than `field~i~`.
 SELECT
     field~1~, ..., field~i-1~, field~i+1~, ..., field~n~ -- omit `field~i~` from tbl
@@ -95,22 +95,24 @@ PartiQL should support s-expression types and values since PartiQL's type system
 ==== Step 1: subsumption of `EXCLUDE` paths
 We perform the following step to ensure that there are no redundant `EXCLUDE` paths. That is, there is no path such that all of its excluded binding tuple values are excluded by another exclude path. footnote:[This subsumption step is included to make the subsequent rewrite steps easier to reason about. In a query without redundant exclude paths, this step is not necessary.]
 
-For each `<exclude path>` `p=root~p~s~1~...s~m~`, we compare it with all other ``<exclude path>``s. `<exclude path>` `p` is said to be subsumed by another path `q=root~q~t~1~...t~n~` and not included in the rewritten `EXCLUDE` clause if any of the following rules apply:
+For each `<exclude path>` `p=root~p~s~1~...s~x~`, we compare it with all other ``<exclude path>``s. `<exclude path>` `p` is said to be subsumed by another path `q=root~q~t~1~...t~y~` and not included in the rewritten `EXCLUDE` clause if any of the following rules apply:
 
 NOTE: The following rules assume `root~p~=root~q~`.
 
 .Subsumption rules
 [[anchor-1a]] Rule 1.a::
-    If `m ≥ n` and `s~1~...s~m~=t~1~...t~m~`, `q` subsumes `p`. Put another way if `p` has at least as many steps as `q` and the steps up to ``q``'s length are equivalent, `q` subsumes `p`.
+    If `y = 0` (i.e. `q` has no steps), `q` subsumes `p`.
+[[anchor-1b]] Rule 1.b::
+    If `x ≥ y` and `s~1~...s~y~=t~1~...t~y~`, `q` subsumes `p`. Put another way, if `p` has at least as many steps as `q` and the steps up to ``q``'s length are equivalent, `q` subsumes `p`.
 
 Otherwise, there must be some step at which `p` and `q` diverge. Let's call this step's index `i`.
 
-[[anchor-1b]] Rule 1.b::
-    If `s~i~` is a tuple attribute and `t~i~` is a tuple wildcard and `t~i+1~...t~n~` subsumes `s~i+1~...s~m~` (i.e. the steps following `t~i~` subsumes the steps following `s~i~`), then `q` subsumes `p`.
 [[anchor-1c]] Rule 1.c::
-    If `s~i~` is a collection index and `t~i~` is a collection wildcard and `t~i+1~...t~n~` subsumes `s~i+1~...s~m~` (i.e. the steps following `t~i~` subsumes the steps following `s~i~`), then `q` subsumes `p`.
+    If `s~i~` is a tuple attribute and `t~i~` is a tuple wildcard and `t~i+1~...t~y~` subsumes `s~i+1~...s~x~` (i.e. the steps following `t~i~` subsume the steps following `s~i~`), then `q` subsumes `p`.
 [[anchor-1d]] Rule 1.d::
-    If `s~i~` is a case-sensitive tuple attribute and `t~i~` is a case-insensitive tuple attribute and `t~i+1~...t~n~` subsumes `s~i+1~...s~m~` (i.e. the steps following `t~i~` subsumes the steps following `s~i~`), then `q` subsumes `p`.
+    If `s~i~` is a collection index and `t~i~` is a collection wildcard and `t~i+1~...t~y~` subsumes `s~i+1~...s~x~` (i.e. the steps following `t~i~` subsume the steps following `s~i~`), then `q` subsumes `p`.
+[[anchor-1e]] Rule 1.e::
+    If `s~i~` is a case-sensitive tuple attribute and `t~i~` is a case-insensitive tuple attribute and `t~i+1~...t~y~` subsumes `s~i+1~...s~x~` (i.e. the steps following `t~i~` subsume the steps following `s~i~`), then `q` subsumes `p`.
 
 .Subsumption Examples
 [options="header,footer"]
@@ -119,17 +121,18 @@ Otherwise, there must be some step at which `p` and `q` diverge. Let's call this
 |`s.a`        |`t.a`       |No subsumption rules apply (roots differ)
 |`t.a`        |`t.b`       |No subsumption rules apply
 |`t.a.b.c`    |`t.a.*.d`   |No subsumption rules apply
-|`t.a.b.c`    |`t.a.b.c`   |`q` subsumes `p` (by <<anchor-1a, 1.a>>)
-|`t.a.b.c`    |`t.a.b`     |`q` subsumes `p` (by <<anchor-1a, 1.a>>)
-|`t.a.b.c`    |`t.a.b.*`   |`q` subsumes `p` (by <<anchor-1b, 1.b>> then  <<anchor-1a, 1.a>>)
-|`t.a.b.c`    |`t.a.*.c`   |`q` subsumes `p` (by <<anchor-1b, 1.b>> then <<anchor-1a, 1.a>>)
-|`t.a.b[1]`   |`t.a.b`     |`q` subsumes `p` (by <<anchor-1a, 1.a>>)
-|`t.a.b[1]`   |`t.a.b[*]`  |`q` subsumes `p` (by <<anchor-1c, 1.c>> then <<anchor-1a, 1.a>>)
-|`t.a.b[1].c` |`t.a.b[1]`  |`q` subsumes `p` (by <<anchor-1a, 1.a>>)
-|`t.a.b[1].c` |`t.a.b[*].c`|`q` subsumes `p` (by <<anchor-1c, 1.c>> then <<anchor-1a, 1.a>>)
-|`t.a.b[1].c` |`t.a.b[*]`  |`q` subsumes `p` (by <<anchor-1c, 1.c>> then <<anchor-1a, 1.a>>)
-|`t.a."b"`    |`t.a.b`     |`q` subsumes `p` (by <<anchor-1d, 1.d>> then <<anchor-1a, 1.a>>)
-|`t.a."b".c`  |`t.a.b.c`   |`q` subsumes `p` (by <<anchor-1d, 1.d>> then <<anchor-1a, 1.a>>)
+|`t.a`        |`t`         |`q` subsumes `p` (by <<anchor-1a, 1.a>>)
+|`t.a.b.c`    |`t.a.b.c`   |`q` subsumes `p` (by <<anchor-1b, 1.b>>)
+|`t.a.b.c`    |`t.a.b`     |`q` subsumes `p` (by <<anchor-1b, 1.b>>)
+|`t.a.b.c`    |`t.a.b.*`   |`q` subsumes `p` (by <<anchor-1c, 1.c>> then <<anchor-1a, 1.a>>)
+|`t.a.b.c`    |`t.a.*.c`   |`q` subsumes `p` (by <<anchor-1c, 1.c>> then <<anchor-1b, 1.b>>)
+|`t.a.b[1]`   |`t.a.b`     |`q` subsumes `p` (by <<anchor-1b, 1.b>>)
+|`t.a.b[1]`   |`t.a.b[*]`  |`q` subsumes `p` (by <<anchor-1d, 1.d>> then <<anchor-1a, 1.a>>)
+|`t.a.b[1].c` |`t.a.b[1]`  |`q` subsumes `p` (by <<anchor-1b, 1.b>>)
+|`t.a.b[1].c` |`t.a.b[*]`  |`q` subsumes `p` (by <<anchor-1d, 1.d>> then <<anchor-1a, 1.a>>)
+|`t.a.b[1].c` |`t.a.b[*].c`|`q` subsumes `p` (by <<anchor-1d, 1.d>> then <<anchor-1b, 1.b>>)
+|`t.a."b"`    |`t.a.b`     |`q` subsumes `p` (by <<anchor-1e, 1.e>> then <<anchor-1a, 1.a>>)
+|`t.a."b".c`  |`t.a.b.c`   |`q` subsumes `p` (by <<anchor-1e, 1.e>> then <<anchor-1b, 1.b>>)
 |=======================
 
 ---
@@ -137,9 +140,9 @@ We first illustrate the rewrite rule for a single `EXCLUDE` path and then explai
 
 ==== Step 2 (single): rewrite a single `EXCLUDE` path
 
-To rewrite a single `EXCLUDE` path with `n` steps, `p=r.s~1~...s~n~`, we move the clauses other than the `SELECT`/`PIVOT` into a subquery, which will `EXCLUDE` the binding tuple values at the path `p`. This subquery essentially reconstructs the binding tuple of the other clauses using a `SELECT VALUE` struct to project back the binding tuple variables. All of the variables created from the other clauses not matching the `EXCLUDE` root `r` will use the identity function (e.g. binding tuple variable `foo` will have attribute `'foo'` and value `foo` in the `SELECT VALUE` struct). For the variable matching the `EXCLUDE` path root `r`, we apply the following rewrite rules to define ``r``'s value within the `SELECT VALUE` struct. If there is no such variable matching `EXCLUDE` path root `r`, the `EXCLUDE` path will not alter any of the binding tuple values. Hence, no rewrite rule is applied.
+To rewrite a single `EXCLUDE` path with `n` steps, `p=r.s~1~...s~n~`, we move the clauses other than the `SELECT`/`PIVOT` into a subquery, which will `EXCLUDE` the binding tuple values at the path `p`. This subquery essentially reconstructs the binding tuple of the other clauses using a `SELECT VALUE` tuple to project back the binding tuple variables. All of the variables created from the other clauses not matching the `EXCLUDE` root `r` will use the identity function (e.g. binding tuple variable `foo` will have attribute `'foo'` and value `foo` in the `SELECT VALUE` tuple). For the variable matching the `EXCLUDE` path root `r`, we apply the following rewrite rules to define ``r``'s value within the `SELECT VALUE` tuple. If there is no such variable matching `EXCLUDE` path root `r`, the `EXCLUDE` path will not alter any of the binding tuple values. Hence, no rewrite rule is applied.
 
-If the other clauses include an `ORDER BY`, we convert the top-level query back into a list by adding a position variable (i.e. `AT` clause) along with an `ORDER BY` over the position variable.
+If the other clauses include an `ORDER BY`, we convert the top-level query back into an array by adding a position variable (i.e. `AT` clause) along with an `ORDER BY` over the position variable.
 
 [source,partiql,subs="+{markup-in-source}"]
 ----
@@ -159,7 +162,7 @@ FROM (
     <from clause>
     <other clauses>
 )
-[   -- Include conversion back to list if `ORDER BY` present in `<other clauses>`
+[   -- Include conversion back to array if `ORDER BY` present in `<other clauses>`
     -- Assume `<topLevelTbl>` and `<idx>` are fresh variables
     AS <topLevelTbl> AT <idx>
     ORDER BY <idx>
@@ -167,11 +170,11 @@ FROM (
 ----
 
 
-The main idea for rewriting the `EXCLUDE` steps `s~1~,...,s~n~` is to create a nested `CASE` expression for each step, whereby the nested `CASE` expressions for `s~1~,...,s~n-1~` unnest the input binding tuple and the final `CASE` expression for `s~n~` (i.e. the final step) filters out the desired struct field(s) or collection index(es). Every exclude step has an expected type to process during evaluation. Tuple attribute and wildcard exclude steps expect a struct. Whereas a collection index expects a list and a collection wildcard expects a list or bag. The `CASE` expression at each level `i` recreates this expected type by including a `WHEN` branch based on the expected type. Each `CASE` expression will include an `ELSE` branch which outputs the previous level's identifier. This set of branches ensures that at evaluation time, if there is a type mismatch (e.g. evaluation value is a list while the exclude step is a tuple attribute), there is no evaluation error and the previous level's value is returned through the `ELSE` branch. This behavior applies to both the permissive and strict typing modes.
+The main idea for rewriting the `EXCLUDE` steps `s~1~,...,s~n~` is to create a nested `CASE` expression for each step, whereby the nested `CASE` expressions for `s~1~,...,s~n-1~` unnest the input binding tuple and the final `CASE` expression for `s~n~` (i.e. the final step) filters out the desired tuple field(s) or collection index(es). Every exclude step has an expected type to process during evaluation. Tuple attribute and wildcard exclude steps expect a tuple, whereas a collection index step expects an array and a collection wildcard step expects an array or bag. The `CASE` expression at each level `i` recreates this expected type by including a `WHEN` branch based on the expected type. Each `CASE` expression will include an `ELSE` branch which outputs the previous level's identifier. This set of branches ensures that at evaluation time, if there is a type mismatch (e.g. the evaluated value is an array while the exclude step is a tuple attribute), there is no evaluation error and the previous level's value is returned through the `ELSE` branch. This behavior applies to both the permissive and strict typing modes.
 
 [source,partiql,subs="+{markup-in-source}"]
 ----
--- For the value `r` in our `SELECT VALUE` struct:
+-- For the value `r` in our `SELECT VALUE` tuple:
 -- Assuming `<v~n-1~>` is the identifier created from the previous exclude step, `s~n-1~`
 SELECT VALUE {
     'r':
@@ -195,7 +198,7 @@ For this rewrite rule definition, let `<v~i-1~>` be the identifier created from
     If `s~i~` is a case-sensitive tuple attribute exclude step (e.g. `."foo"` or `['foo']`), where `<v~i~>` and `<attr~i~>` are fresh variables, add the following `WHEN` branch to the `i`^th^ nested `CASE`.
 [source,partiql,subs="+{markup-in-source}"]
 ----
-WHEN <v~i-1~> IS STRUCT THEN (
+WHEN <v~i-1~> IS TUPLE THEN (
     PIVOT (
         CASE 
             WHEN <attr~i~> = <s~i~> THEN
@@ -211,7 +214,7 @@ WHEN <v~i-1~> IS STRUCT THEN (
     If `s~i~` is a case-insensitive tuple attribute exclude step (e.g. `.foo`), where `<v~i~>` and `<attr~i~>` are fresh variables, add the following `WHEN` branch to the the `i`^th^ nested `CASE`.
 [source,partiql,subs="+{markup-in-source}"]
 ----
-WHEN <v~i-1~> IS STRUCT THEN (
+WHEN <v~i-1~> IS TUPLE THEN (
     PIVOT (
         CASE 
             WHEN LOWER(<attr~i~>) = LOWER(<s~i~>) THEN
@@ -229,7 +232,7 @@ NOTE: This is essentially the same as <<anchor-2ai>> but wraps the inner `CASE W
     If `s~i~` is a tuple wildcard exclude step, where `<v~i~>` and `<attr~i~>` are fresh variables, add the following `WHEN` branch to the `i`^th^ nested `CASE`.
 [source,partiql,subs="+{markup-in-source}"]
 ----
-WHEN <v~i-1~> IS STRUCT THEN (
+WHEN <v~i-1~> IS TUPLE THEN (
     PIVOT 
         -- Apply rewrite rules on remaining exclude steps `s~i+1~,...,s~n~`
     AT <attr~i~>
@@ -240,7 +243,7 @@ WHEN <v~i-1~> IS STRUCT THEN (
     If `s~i~` is a collection index exclude step, where `<v~i~>` and `<idx~i~>` are fresh variables, add the following `WHEN` branch to the `i`^th^ nested `CASE`.
 [source,partiql,subs="+{markup-in-source}"]
 ----
-WHEN <v~i-1~> IS LIST THEN (
+WHEN <v~i-1~> IS ARRAY THEN (
     SELECT VALUE
         CASE 
             WHEN <idx~i~> = <s~i~> THEN
@@ -255,7 +258,7 @@ WHEN <v~i-1~> IS LIST THEN (
     If `s~i~` is a collection wildcard exclude step, where `<v~i~>` and `<idx~i~>` are fresh variables, add the following `WHEN` branches to the `i`^th^ nested `CASE`.
 [source,partiql,subs="+{markup-in-source}"]
 ----
-WHEN <v~i-1~> IS LIST THEN (
+WHEN <v~i-1~> IS ARRAY THEN (
     SELECT VALUE
         -- Apply rewrite rules on remaining exclude steps `s~i+1~,...,s~n~`
     FROM <v~i-1~> AS <v~i~> AT <idx~i~>
@@ -285,7 +288,7 @@ Similar to <<anchor-2>>, we case on the type of exclude step to determine which
     If the last step, `s~n~`, is a case-sensitive tuple attribute exclude step, where `<v~n~>` and `<attr~n~>` are fresh variables, we add the following `WHEN` branch:
 [source,partiql,subs="+{markup-in-source}"]
 ----
-WHEN <v~n-1~> IS STRUCT THEN (
+WHEN <v~n-1~> IS TUPLE THEN (
     PIVOT <v~n~> AT <attr~n~>
     FROM UNPIVOT <v~n-1~> AS <v~n~> AT <attr~n~>
     WHERE <attr~n~> NOT IN [ <s~n~> ]
@@ -295,7 +298,7 @@ WHEN <v~n-1~> IS STRUCT THEN (
     If the last step, `s~n~`, is a case-insensitive tuple attribute exclude step, where `<v~n~>` and `<attr~n~>` are fresh variables, we add the following `WHEN` branch:
 [source,partiql,subs="+{markup-in-source}"]
 ----
-WHEN <v~n-1~> IS STRUCT THEN (
+WHEN <v~n-1~> IS TUPLE THEN (
     PIVOT <v~n~> AT <attr~n~>
     FROM UNPIVOT <v~n-1~> AS <v~n~> AT <attr~n~>
     WHERE LOWER( <attr~n~> ) NOT IN [ LOWER(<s~n~>) ] -- difference w/ 3.a.i is `LOWER` call on `<attr~n~>` and `<s~n~>`
@@ -305,14 +308,14 @@ WHEN <v~n-1~> IS STRUCT THEN (
     If the last step, `s~n~`, is a tuple wildcard exclude step, we add the following `WHEN` branch:
 [source,partiql,subs="+{markup-in-source}"]
 ----
-WHEN <v~n-1~> IS STRUCT THEN
-    { }     -- empty struct
+WHEN <v~n-1~> IS TUPLE THEN
+    { }     -- empty tuple
 ----
 [[anchor-3c]] Rule 3.c::
     If the last step is a collection index exclude step, where `<v~n~>` and `<idx~i~>` are fresh variables, we add the following `WHEN` branch:
 [source,partiql,subs="+{markup-in-source}"]
 ----
-WHEN <v~n-1~> IS LIST THEN
+WHEN <v~n-1~> IS ARRAY THEN
     SELECT VALUE <v~n~>
     FROM <v~n-1~> AS <v~n~> AT <idx~i~>
     WHERE <idx~i~> NOT IN [<s~n~>]
@@ -322,8 +325,8 @@ WHEN <v~n-1~> IS LIST THEN
     If the last step, `s~n~`, is a collection wildcard exclude step, we add the following two `WHEN` branches:
 [source,partiql,subs="+{markup-in-source}"]
 ----
-WHEN <v~n-1~> IS LIST THEN
-    []      -- empty list
+WHEN <v~n-1~> IS ARRAY THEN
+    []      -- empty array
 WHEN <v~n-1~> IS BAG THEN
     <<>>    -- empty bag
 ----
@@ -332,45 +335,45 @@ Based on the defined rules for single `EXCLUDE` path rewrites, we will now cover
 
 ==== Step 2 (multiple): rewriting multiple `EXCLUDE` paths
 
-For multiple `EXCLUDE` paths, we employ a similar idea as the rewrite for a single path. The clauses other than the `SELECT`/`PIVOT` are moved to a subquery that will be ranged over. This subquery contains a `SELECT VALUE` struct which will reconstruct the binding tuple of the other clauses with the exclude paths' rewrite. Variables created from the other clauses without a matching exclude path root will be included in the struct with the identity function. Every binding tuple variable matching one or more exclude path roots will have a struct value defined using the below rewrites.
+For multiple `EXCLUDE` paths, we employ a similar idea as the rewrite for a single path. The clauses other than the `SELECT`/`PIVOT` are moved to a subquery that will be ranged over. This subquery contains a `SELECT VALUE` tuple which will reconstruct the binding tuple of the other clauses with the exclude paths' rewrite. Variables created from the other clauses without a matching exclude path root will be included in the tuple with the identity function. Every binding tuple variable matching one or more exclude path roots will have a tuple value defined using the below rewrites.
 
 [source,partiql,subs="+{markup-in-source}"]
 ----
--- Let `n` represent the number of `EXCLUDE` paths
+-- Let `M` represent the number of `EXCLUDE` paths
 
 -- Original query:
 <select clause>
-EXCLUDE p~1~,...,p~n~
+EXCLUDE p~1~,...,p~M~
 <from clause>
 <other clauses>
 
--- Let `m` represent the number of unique `EXCLUDE` path roots
+-- Let `R` represent the number of unique `EXCLUDE` path roots
 -- Rewritten to:
 <select clause>
 FROM (
     SELECT VALUE {
         'r~1~': -- apply rewrite rules on exclude paths that have root `r~1~`
           ⋮
-        'r~m~': -- apply rewrite rules on exclude paths that have root `r~m~`
+        'r~R~': -- apply rewrite rules on exclude paths that have root `r~R~`
         ...   -- other variables created from the other clauses
     }
     <from clause>
     <other clauses>
 )
-[   -- Include conversion back to list if `ORDER BY` present in `<other clauses>`
+[   -- Include conversion back to array if `ORDER BY` present in `<other clauses>`
     -- Assume `<topLevelTbl>` and `<idx>` are fresh variables
     AS <topLevelTbl> AT <idx>
     ORDER BY <idx>
 ]
 ----
-Like single path rewriting, we create a nested `CASE` expression for each step. However, for multiple paths, we look at all the paths in parallel and process the steps at the same level. For the following, let `i=1,...,z` where `z` is the length of the longest exclude path. The nested `CASE` expressions for all `i` are created as before. For the following, let `<v~i-1~>` be the identifier from the previous level (or the root identifier if `i = 1`).
+Like single path rewriting, we create a nested `CASE` expression for each step. However, for multiple paths, we look at all the applicable paths in parallel and process the steps at the same level. Applicable paths are the subset of paths that have the same root and the same tuple attributes/collection indexes at previous levels. Let `z` be the length of the longest exclude path. The nested `CASE` expressions for all levels `i=1,...,z` are created as before. For the following, let `<v~i-1~>` be the identifier from the previous level (or the root identifier if `i = 1`).
 
 [source,partiql,subs="+{markup-in-source}"]
 ----
 CASE
-    WHEN <v~i-1~> IS STRUCT THEN
+    WHEN <v~i-1~> IS TUPLE THEN
         ... -- apply tuple attr and wildcard path rewrite (rule 4.a)
-    WHEN <v~i-1~> IS LIST THEN
+    WHEN <v~i-1~> IS ARRAY THEN
         ... -- apply collection index and wildcard path rewrite (rule 4.b)
     WHEN <v~i-1~> IS BAG THEN
         ... -- apply collection wildcard path rewrite (rule 4.b)
@@ -391,21 +394,21 @@ If there are any `EXCLUDE` paths of length `i`, then similar to <<anchor-3ai, Ru
 If there are any `EXCLUDE` paths of length greater than `i`, then similar to <<anchor-2ai, Rule 2.a.i>> and <<anchor-2aii, Rule 2.a.ii>>, we add a `CASE` expression within the `PIVOT`. This `CASE` expression within the `PIVOT` will define a `WHEN` branch for each of the unique tuple attribute steps. Each of these `WHEN` branches will apply the rewrite rules for the exclude paths that have additional steps and equivalent tuple attribute or tuple wildcard. An `ELSE` branch will be added to this `CASE` expression which will apply the rewrite rules for the exclude paths with a tuple wildcard at level `i` and additional steps.
 [source,partiql,subs="+{markup-in-source}"]
 ----
--- Let `k` represent the number of unique exclude tuple attrs for paths of length
+-- Let `T` represent the number of unique exclude tuple attrs for paths of length
 -- greater than `i`.
 -- `<v~i~>` and `<attr~i~>` are fresh variables
-WHEN <v~i-1~> IS STRUCT THEN (
+WHEN <v~i-1~> IS TUPLE THEN (
     PIVOT (
         CASE
-            WHEN <attr~i~> = <exclude path tuple attr~1~> THEN
+            WHEN <attr~i~> = <exclude path tuple attr~unique1~> THEN
                 -- Apply rewrite rules for exclude paths with
                 -- length > i AND
-                -- tuple attr~1~ or tuple wildcard at ith step
+                -- tuple attr~unique1~ or tuple wildcard at ith step
               ⋮
-            WHEN <attr~i~> = <exclude path tuple attr~k~> THEN
+            WHEN <attr~i~> = <exclude path tuple attr~uniqueT~> THEN
                 -- Apply rewrite rules for exclude paths with
                 -- length > i AND
-                -- tuple attr~k~ or tuple wildcard at ith step
+                -- tuple attr~uniqueT~ or tuple wildcard at ith step
             ELSE
                 -- Apply rewrite rules for exclude paths with
                 -- length > i AND
@@ -414,23 +417,23 @@ WHEN <v~i-1~> IS STRUCT THEN (
     ) AT <attr~n~>
     FROM UNPIVOT <v~i-1~> AS <v~i~> AT <attr~i~>
     WHERE 
-        <attr~i~> NOT IN [<case-sensitive tuple attrs with last step i>]
+        <attr~i~> NOT IN [<case-sensitive tuple attrs with last step at i>]
         AND
-        LOWER(<attr~i~>) NOT IN [<case-insensitive tuple attrs with last step i>] -- call `LOWER` on each of the case-insensitive tuple attrs
+        LOWER(<attr~i~>) NOT IN [<case-insensitive tuple attrs with last step at i>] -- call `LOWER` on each of the case-insensitive tuple attrs
 )
 ----
 
 =====
-NOTE: If the only applicable path at level `i` is a tuple wildcard and this path is of length `i`, we know there are no other applicable tuple paths by the subsumption rules. In this case, we can just return an empty struct for the `ith` nested `CASE` like <<anchor-3b, rule 3.b>>:
+NOTE: If the only applicable path at level `i` is a tuple wildcard and this path is of length `i`, we know there are no other applicable tuple paths by the subsumption rules. In this case, we can just return an empty tuple for the `i`^th^ nested `CASE` like <<anchor-3b, rule 3.b>>:
 [source,partiql,subs="+{markup-in-source}"]
 ----
-WHEN <v~i-1~> IS STRUCT THEN
+WHEN <v~i-1~> IS TUPLE THEN
     { }
 ----
 =====
 ---
 
-If any of the applicable `EXCLUDE` paths at level `i` have a collection index or wildcard exclude step, then we add the following `WHEN` branches to the `i`^th^ nested `CASE` expression. If the exclude paths at level `i` are all collection index steps, only a `WHEN` branch casing on if the previous level's value `<v~i-1~>` was a list will be added. Otherwise, a `WHEN` branch casing on if `<v~i-1~>` is a bag will also be added. Alike the collection exclude rules defined for single `EXCLUDE` paths, we add a `SELECT VALUE ... FROM` over `<v~i-1~>`.
+If any of the applicable `EXCLUDE` paths at level `i` have a collection index or wildcard exclude step, then we add the following `WHEN` branches to the `i`^th^ nested `CASE` expression. If the exclude paths at level `i` are all collection index steps, only a `WHEN` branch casing on whether the previous level's value `<v~i-1~>` is an array will be added. Otherwise, a `WHEN` branch casing on whether `<v~i-1~>` is a bag will also be added. Like the collection exclude rules defined for single `EXCLUDE` paths, we add a `SELECT VALUE ... FROM` over `<v~i-1~>`.
 
 Rule 4.b::
 We divide the set of applicable `EXCLUDE` paths into two subsets:
@@ -438,35 +441,35 @@ We divide the set of applicable `EXCLUDE` paths into two subsets:
 1. paths of length `i` (i.e. final step is `i`)
 2. paths of length greater than `i` (i.e. have additional steps)
 
-If there are any `EXCLUDE` paths of length `i`, then similar to <<anchor-3c, Rule 3.c>>, we add a `WHERE` clause to filter out those fields. The fields to exclude will be grouped together within a list.
+If there are any `EXCLUDE` paths of length `i`, then similar to <<anchor-3c, Rule 3.c>>, we add a `WHERE` clause to filter out those fields. The fields to exclude will be grouped together within an array.
 
-(Within the `WHEN IS LIST` branch) If there are any `EXCLUDE` paths of length greater than `i`, then similar to <<anchor-2c, Rule 2.c>>, we add a `CASE` expression within the `SELECT VALUE ... AT ... ORDER BY`. This `CASE` expression within the `SELECT VALUE` will define a `WHEN` branch for each of the unique collection index steps. Each of these `WHEN` branches will apply the rewrite rules for the exclude paths that have additional steps and equivalent collection indexes or collection wildcard. An `ELSE` branch will be added to this `CASE` expression which will apply the rewrite rules for the exclude paths with additional steps and collection wildcard.
+(Within the `WHEN IS ARRAY` branch) If there are any `EXCLUDE` paths of length greater than `i`, then similar to <<anchor-2c, Rule 2.c>>, we add a `CASE` expression within the `SELECT VALUE ... AT ... ORDER BY`. This `CASE` expression within the `SELECT VALUE` will define a `WHEN` branch for each of the unique collection index steps. Each of these `WHEN` branches will apply the rewrite rules for the exclude paths that have additional steps and equivalent collection indexes or collection wildcard. An `ELSE` branch will be added to this `CASE` expression which will apply the rewrite rules for the exclude paths with additional steps and collection wildcard.
 
 (Within the `WHEN IS BAG` branch, if applicable) We simply have a `FROM` over `<v~i-1~>` with a `SELECT VALUE` that applies the rewrite rules for exclude paths that have additional steps and collection wildcard at level `i`.
 [source,partiql,subs="+{markup-in-source}"]
 ----
--- Let `k` represent the number of unique exclude collection indexes for exclude paths of length
+-- Let `C` represent the number of unique exclude collection indexes for exclude paths of length
 -- greater than `i`.
 -- `<v~i~>` and `<idx~i~>` are fresh variables
-WHEN <v~i-1~> IS LIST THEN (
+WHEN <v~i-1~> IS ARRAY THEN (
     SELECT VALUE
         CASE 
-            WHEN <idx~i~> = <exclude path collection idx~1~> THEN
+            WHEN <idx~i~> = <exclude path collection idx~unique1~> THEN
                 -- Apply rewrite rules for exclude paths with
                 -- length > i AND
-                -- collection index idx~1~ or wildcard at ith step
+                -- collection index idx~unique1~ or wildcard at ith step
               ⋮
-            WHEN <idx~i~> = <exclude path collection idx~k~> THEN
+            WHEN <idx~i~> = <exclude path collection idx~uniqueC~> THEN
                 -- Apply rewrite rules for exclude paths with
                 -- length > i AND
-                -- collection index idx~k~ or wildcard at ith step
+                -- collection index idx~uniqueC~ or wildcard at ith step
             ELSE 
                 -- Apply rewrite rules for exclude paths with
                 -- length > i AND
                 -- collection wildcard at ith step 
         END
     FROM <v~i-1~> AS <v~i~> AT <idx~i~>
-    WHERE <idx~i~> NOT IN [<exclude indexes with last step i>]
+    WHERE <idx~i~> NOT IN [<exclude indexes with last step at i>]
     ORDER BY <idx~i~>
 )
 WHEN <v~i-1~> IS BAG THEN (
@@ -477,11 +480,11 @@ WHEN <v~i-1~> IS BAG THEN (
 ----
 
 =====
-NOTE: If the only applicable path at level `i` is a collection wildcard and this path is of length `i`, we know there are no other applicable collection paths by the subsumption rules. In this case, we can just return an empty list or bag for the `ith` nested `CASE` like <<anchor-3d, rule 3.d>>:
+NOTE: If the only applicable path at level `i` is a collection wildcard and this path is of length `i`, we know there are no other applicable collection paths by the subsumption rules. In this case, we can just return an empty array or bag for the `i`^th^ nested `CASE` like <<anchor-3d, rule 3.d>>:
 [source,partiql,subs="+{markup-in-source}"]
 ----
-WHEN <v~i-1~> IS LIST THEN
-    []      -- empty list
+WHEN <v~i-1~> IS ARRAY THEN
+    []      -- empty array
 WHEN <v~i-1~> IS BAG THEN
     <<>>    -- empty bag
 ----
@@ -510,12 +513,12 @@ FROM (
     SELECT VALUE {
         't': 
             CASE 
-                WHEN t IS STRUCT THEN (
+                WHEN t IS TUPLE THEN (
                     PIVOT (
                         CASE 
                             WHEN LOWER(attr_1) = LOWER('a') THEN
                                 CASE 
-                                    WHEN v_1 IS STRUCT THEN (
+                                    WHEN v_1 IS TUPLE THEN (
                                         PIVOT v_2 AT attr_2
                                         FROM UNPIVOT v_1 AS v_2 AT attr_2
                                         WHERE LOWER(attr_2) NOT IN [LOWER('field_x')]
@@ -581,12 +584,12 @@ FROM (
     SELECT VALUE {
         't': 
             CASE 
-                WHEN t IS STRUCT THEN (
+                WHEN t IS TUPLE THEN (
                     PIVOT (
                         CASE 
                             WHEN LOWER(attr_1) = LOWER('a') THEN
                                 CASE 
-                                    WHEN v_1 IS STRUCT THEN
+                                    WHEN v_1 IS TUPLE THEN
                                         {}
                                     ELSE v_1
                                 END
@@ -648,10 +651,10 @@ FROM (
     SELECT VALUE {
         't': 
             CASE 
-                WHEN t IS STRUCT THEN (
+                WHEN t IS TUPLE THEN (
                     PIVOT (
                         CASE 
-                            WHEN v_1 IS STRUCT THEN (
+                            WHEN v_1 IS TUPLE THEN (
                                 PIVOT v_2 AT attr_2 
                                 FROM UNPIVOT v_1 AS v_2 AT attr_2
                                 WHERE LOWER(attr_2) NOT IN [LOWER('field_x')]
@@ -716,12 +719,12 @@ FROM (
     SELECT VALUE {
         't': 
             CASE 
-                WHEN t IS STRUCT THEN (
+                WHEN t IS TUPLE THEN (
                     PIVOT (
                         CASE 
                             WHEN LOWER(attr_1) = LOWER('a') THEN 
                                 CASE 
-                                    WHEN v_1 IS LIST THEN (
+                                    WHEN v_1 IS ARRAY THEN (
                                         SELECT VALUE v_2
                                         FROM v_1 AS v_2 AT idx_2 
                                         WHERE idx_2 NOT IN [1] 
@@ -797,12 +800,12 @@ FROM (
     SELECT VALUE {
         't': 
             CASE 
-                WHEN t IS STRUCT THEN (
+                WHEN t IS TUPLE THEN (
                     PIVOT (
                         CASE 
                             WHEN LOWER(attr_1) = LOWER('a') THEN 
                                 CASE 
-                                    WHEN v_1 IS LIST THEN
+                                    WHEN v_1 IS ARRAY THEN
                                         []
                                     WHEN v_1 IS BAG THEN
                                         <<>>
@@ -865,13 +868,13 @@ Rewritten query:
 SELECT t.*
 FROM (
     SELECT VALUE {
-        't': CASE WHEN t IS STRUCT THEN (
+        't': CASE WHEN t IS TUPLE THEN (
             PIVOT (
                 CASE WHEN LOWER(attr_1) = LOWER('a') THEN 
-                    CASE WHEN v_1 IS LIST THEN (
+                    CASE WHEN v_1 IS ARRAY THEN (
                         SELECT VALUE
                             CASE WHEN idx_2 = 1 THEN
-                                CASE WHEN v_2 IS STRUCT THEN (
+                                CASE WHEN v_2 IS TUPLE THEN (
                                     PIVOT v_3 AT attr_3
                                     FROM UNPIVOT v_2 AS v_3 AT attr_3
                                     WHERE LOWER(attr_3) NOT IN [LOWER('field_x')]
@@ -953,12 +956,12 @@ Rewritten query:
 SELECT t.*
 FROM (
     SELECT VALUE {
-        't': CASE WHEN t IS STRUCT THEN (
+        't': CASE WHEN t IS TUPLE THEN (
             PIVOT (
                 CASE WHEN LOWER(attr_1) = LOWER('a') THEN 
-                    CASE WHEN v_1 IS LIST THEN (
+                    CASE WHEN v_1 IS ARRAY THEN (
                         SELECT VALUE 
-                            CASE WHEN v_2 IS STRUCT THEN (
+                            CASE WHEN v_2 IS TUPLE THEN (
                                 PIVOT v_3 AT attr_3 
                                 FROM UNPIVOT v_2 AS v_3 AT attr_3
                                 WHERE LOWER(attr_3) NOT IN [LOWER('field_x')]
@@ -970,7 +973,7 @@ FROM (
                     )
                     WHEN v_1 IS BAG THEN (
                         SELECT VALUE 
-                            CASE WHEN v_2 IS STRUCT THEN (
+                            CASE WHEN v_2 IS TUPLE THEN (
                                 PIVOT v_3 AT attr_3 
                                 FROM UNPIVOT v_2 AS v_3 AT attr_3
                                 WHERE LOWER(attr_3) NOT IN [LOWER('field_x')]
@@ -1046,7 +1049,7 @@ FROM (
     SELECT VALUE {
         'foo': foo,
         'bar': 
-            CASE WHEN bar is STRUCT THEN (
+            CASE WHEN bar IS TUPLE THEN (
                 PIVOT v AT attr
                 FROM UNPIVOT bar AS v AT attr
                 WHERE LOWER(attr) NOT IN [LOWER('d')]
@@ -1113,7 +1116,7 @@ SELECT v, attr
 FROM (
     SELECT VALUE {
         'v': 
-            CASE WHEN v IS STRUCT THEN (
+            CASE WHEN v IS TUPLE THEN (
                 PIVOT v_v AT attr_v
                 FROM UNPIVOT v AS v_v AT attr_v
                 WHERE LOWER(attr_v) NOT IN [LOWER('foo')]
@@ -1182,7 +1185,7 @@ FROM (
     SELECT VALUE {
         't': 
             CASE 
-                WHEN t IS STRUCT THEN (
+                WHEN t IS TUPLE THEN (
                     PIVOT v AT attr
                     FROM UNPIVOT t AS v AT attr
                     WHERE LOWER(attr) NOT IN [LOWER('a')]
@@ -1242,7 +1245,7 @@ FROM (
     SELECT VALUE {
         't': 
             CASE
-                WHEN t IS STRUCT THEN (
+                WHEN t IS TUPLE THEN (
                     PIVOT v_1 AT attr_1
                     FROM UNPIVOT t AS v_1 AT attr_1
                     WHERE
@@ -1300,12 +1303,12 @@ FROM (
     SELECT VALUE {
         't': 
             CASE 
-                WHEN t IS STRUCT THEN (
+                WHEN t IS TUPLE THEN (
                     PIVOT (
                         CASE 
                             WHEN LOWER(attr_1) = LOWER('a') THEN
                                 CASE 
-                                    WHEN v_1 IS STRUCT THEN (
+                                    WHEN v_1 IS TUPLE THEN (
                                         PIVOT v_2 AT attr_2 
                                         FROM UNPIVOT v_1 AS v_2 AT attr_2
                                         WHERE LOWER(attr_2) NOT IN [LOWER('a1')]
@@ -1371,12 +1374,12 @@ SELECT t.*
 FROM (
     SELECT VALUE {
         't': 
-            CASE WHEN t IS STRUCT THEN (
+            CASE WHEN t IS TUPLE THEN (
                 PIVOT (
                     CASE WHEN LOWER(attr_1) = LOWER('a') THEN
-                        CASE WHEN v_1 IS STRUCT THEN (
+                        CASE WHEN v_1 IS TUPLE THEN (
                             PIVOT (
-                                CASE WHEN v_2 IS STRUCT THEN (
+                                CASE WHEN v_2 IS TUPLE THEN (
                                     PIVOT v_3 AT attr_3
                                     FROM UNPIVOT v_2 AS v_3 AT attr_3
                                     WHERE LOWER(attr_3) NOT IN [LOWER('bar')]
@@ -1387,9 +1390,9 @@ FROM (
                             FROM UNPIVOT v_1 AS v_2 AT attr_2
                             WHERE LOWER(attr_2) NOT IN [LOWER('bar')]
                         )
-                        WHEN v_1 IS LIST THEN (
+                        WHEN v_1 IS ARRAY THEN (
                             SELECT VALUE 
-                                CASE WHEN v_2 IS STRUCT THEN (
+                                CASE WHEN v_2 IS TUPLE THEN (
                                     PIVOT v_3 AT attr_3
                                     FROM UNPIVOT v_2 AS v_3 AT attr_3
                                     WHERE LOWER(attr_3) NOT IN [LOWER('bar')]
@@ -1400,7 +1403,7 @@ FROM (
                             ORDER BY idx_2
                         )
                         -- WHEN v_1 IS BAG THEN ... 
-                        -- same as for LIST but remove `AT` and `ORDER BY`
+                        -- same as for ARRAY but remove `AT` and `ORDER BY`
                         ELSE v_1
                         END
                     ELSE v_1
@@ -1494,7 +1497,7 @@ We choose to model `EXCLUDE` as a syntactic rewrite over existing clauses (e.g.
 
 Why does `EXCLUDE` not give an evaluation error when an exclude path does not remove anything? Or on data type mismatch (e.g. tuple attribute exclude step on collection)?::
 
-We have opted to not error at evaluation time when `EXCLUDE` does not omit any values or in data type mismatch cases.  It is very possible in the schemaless, semi-structured data domain that our data is missing some fields or has different structures. The idea here is that `EXCLUDE` will guarantee that all values at the exclude path will be omitted from the output binding tuple. This can enable use cases such as <<Example: EXCLUDE with different FROM source bindings>> in which the data we wish to exclude is nested within a heterogeneous set of structs and containers.
+We have opted not to raise an error at evaluation time when `EXCLUDE` does not omit any values or when a data type mismatch occurs. In the schemaless, semi-structured data domain, data may well be missing some fields or have differing structures. The idea here is that `EXCLUDE` will guarantee that all values at the exclude path will be omitted from the output binding tuple. This can enable use cases such as <<Example: EXCLUDE with different FROM source bindings>> in which the data we wish to exclude is nested within a heterogeneous set of tuples and collections.
 +
 A future RFC could opt to give a warning/error in these cases when schema is present and we know at static time that an `EXCLUDE` path will not omit values. See <<Unresolved questions>> for more discussion on schema.
 
@@ -1511,7 +1514,7 @@ PartiQL users have frequently asked us for this capability to omit certain neste
 * Some helpful discussion on the issue of `EXCLUDE` being added to AsterixDB: https://issues.apache.org/jira/browse/ASTERIXDB-3059
 * More info on AsterixDB: https://dbdb.io/db/asterixdb
 
-AsterixDB, an implementation of SQL++, has defined an `EXCLUDE` clause to operate on semi-structured data to omit certain nested struct fields; however, AsterixDB's definition is limited and does not cover other common use cases involving collections and multi-struct field exclusions.
+AsterixDB, an implementation of SQL++, has defined an `EXCLUDE` clause to operate on semi-structured data to omit certain nested tuple fields; however, AsterixDB's definition is limited and does not cover other common use cases involving collections and multi-tuple field exclusions.
 
 Another key difference is that the `EXCLUDE` clause is evaluated on the output of the `SELECT` projection.