7 changes: 7 additions & 0 deletions _data/navigation.yml
Original file line number Diff line number Diff line change
@@ -623,6 +623,13 @@ items:

- url: /transformations/snowflake-plain/
  title: Snowflake Transformations
  items:
    - url: /transformations/snowflake-plain/how-to/
      title: How do I run a Snowflake transformation?
    - url: /transformations/snowflake-plain/reference/
      title: Reference
    - url: /transformations/snowflake-plain/explanation/
      title: When to use it

- url: /transformations/bigquery/
  title: BigQuery Transformations
154 changes: 154 additions & 0 deletions revamp/diataxis-page-template.md
@@ -0,0 +1,154 @@
# Diátaxis page template

Copy the block for the page type you are writing into a new Markdown file under
`src/content/docs/…`, then fill it in. One page serves **exactly one** reader
need. If you find yourself writing two of these on one page, split it.

Frontmatter is identical across types except `type:`. Required keys: `title`,
`slug`, `description`, `keywords`, `type`. Add `redirect_from` only on a page
that takes over an old URL (usually the hub).
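
The required keys above can be checked mechanically before a page goes to review. The following is a minimal sketch, assuming the frontmatter has already been parsed into a plain object; `checkFrontmatter` and `isNonEmptyString` are hypothetical names, not helpers that exist in this repo.

```typescript
// Hypothetical helper, not part of the repo: checks the required keys
// listed above (title, slug, description, keywords, type).
interface Frontmatter {
  title?: unknown;
  slug?: unknown;
  description?: unknown;
  keywords?: unknown;
  type?: unknown;
  redirect_from?: unknown;
}

// The three page types named in this template.
const PAGE_TYPES: readonly string[] = ['how-to', 'reference', 'explanation'];

function isNonEmptyString(value: unknown): value is string {
  return typeof value === 'string' && value.trim().length > 0;
}

// Returns human-readable problems; an empty list means the frontmatter passes.
function checkFrontmatter(fm: Frontmatter): string[] {
  const problems: string[] = [];
  for (const key of ['title', 'slug', 'description'] as const) {
    if (!isNonEmptyString(fm[key])) {
      problems.push(`missing or empty "${key}"`);
    }
  }
  const keywords = fm.keywords;
  if (!Array.isArray(keywords) || keywords.length === 0 || !keywords.every(isNonEmptyString)) {
    problems.push('"keywords" must be a non-empty list of strings');
  }
  if (typeof fm.type !== 'string' || PAGE_TYPES.indexOf(fm.type) === -1) {
    problems.push('"type" must be one of: ' + PAGE_TYPES.join(', '));
  }
  return problems;
}
```

This sketch is deliberately stricter than a schema that marks `keywords` and `type` as optional: it enforces the template's rule for revamped pages only, so it should not be run against pages that have not been split yet.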

Conventions to match (see existing pages):
- **Title + `description` in the user's words / symptom vocabulary**, not feature
labels. Cover singular/plural + obvious synonyms in `keywords`.
- **The text must carry all the meaning.** If every screenshot were removed, the
page must still be fully doable. Keep a screenshot only if it genuinely helps a
human locate something in the UI, and give it real alt text.
- **All code/config in fenced blocks**, never as a screenshot.
- Root-relative links (`/transformations/…/`).
- Anything you can't verify against the component code / config schema →
`<!-- TODO(human-review): … -->` inline. Never silently add/rename/remove fields.

---

## How-to

```markdown
---
title: How do I <do the task>?
slug: '<section>/<page>/how-to'
description: <One sentence: the task, the tool, and the end state, in the user's words.>
keywords:
- <task phrase>
- <synonym / plural>
- <product term the user might search>
type: how-to
---

<1–2 sentences: the situation the reader is in and what this page gets them to.
Link to the matching explanation and reference pages.>

**Time:** ~N minutes · **You will need:** <prerequisites in one line.>

## Before you start

<Bulleted prerequisites: access, credentials, an existing resource, etc.>

## Step 1 — <imperative>

1. <Literal control + navigation path, e.g. **Components → Transformations**.>
2. <Copy-pasteable value or config.>

## Step 2 — <imperative>


## Step N — Run it and confirm it worked

1. <Run / save action.>
2. <The explicit, observable success check — what the reader should SEE.>

## Troubleshooting

| Symptom | Likely cause | Fix |
|---|---|---|
| <error/symptom> | <cause> | <fix, with a link if relevant> |

## Related

- [<reference page>](…)
- [<explanation page>](…)
```

---

## Reference

```markdown
---
title: <Thing> reference
slug: '<section>/<page>/reference'
description: <One sentence: lookup reference for X — list the main things covered.>
keywords:
- <field / parameter names>
- <limits / types the user searches>
type: reference
---

<One line: what this is reference for, with links to the how-to and explanation.>

<!-- TODO(human-review): state which facts below are unverified against the
component code / config schema, if any. -->

## <Limits / Parameters / Types …>

| <Field> | <Value/Type> | Notes |
|---|---|---|
| … | … | … |

<Reference is lookup-only: no narrative, no steps. Tables and fenced examples.
Written once and shared across user/dev audiences.>
```

---

## Explanation

```markdown
---
title: When should I use <thing>? (or: Understanding <thing>)
slug: '<section>/<page>/explanation'
description: <One sentence: what the reader will understand and the decision it helps them make.>
keywords:
- <concept>
- when to use <thing>
- <thing> vs <alternative>
type: explanation
---

<What it is, in plain terms, with links to the how-to and reference.>

## What it is

## Why / how it fits

## When to use it (and when not to)

<Explanation is conceptual: no step-by-step, no exhaustive field lists. Written
once and shared across user/dev audiences.>
```

---

## The hub (thin link page left at the OLD url)

```markdown
---
title: <Existing page title>
slug: '<existing/old/slug>'
description: <One sentence pointing readers onward.>
keywords:
- <existing search terms>
type: explanation
redirect_from:
- <older path(s) that already pointed here>
---

<1–2 sentences of what the thing is.>

This page is split by what you need:

- **[How do I …?](…/how-to/)** — …
- **[… reference](…/reference/)** — …
- **[When should I use …?](…/explanation/)** — …
```
71 changes: 71 additions & 0 deletions revamp/diataxis-split-checklist.md
@@ -0,0 +1,71 @@
# Diátaxis split checklist

Standard checklist for splitting one "frankenstein" page into how-to / reference
/ explanation + hub. Use it per page for Issue B (the remaining ~15). The
Snowflake transformation split is the validated reference example.

## 0. Classify (Block 0)

- [ ] Find the page's row in the Block 0 classification (Linear PRDCT-354): its
Diátaxis type(s), frankenstein flag, audience, and **machine
source-of-truth** (the component repo / config schema to verify against).
- [ ] Confirm it is actually a frankenstein (mixes ≥2 of how-to / reference /
explanation). If it is a clean single type, it does not need splitting.

## 1. Inventory the source

- [ ] Read the source page top to bottom; list every section.
- [ ] Tag each section: how-to (a task), reference (lookup), or explanation
(why/when). This mapping is the split plan — get it reviewed before writing.
- [ ] Note every screenshot and decide: does the text already carry it? Drop it
unless it helps locate something in the UI.
- [ ] Note every inbound link/anchor you'll need to preserve.

## 2. Verify facts against code (not the UI, not the old text)

- [ ] For each field / parameter / type / limit, check the component code or
config schema (the Block 0 source-of-truth).
- [ ] Anything you cannot verify → `<!-- TODO(human-review): … -->` inline.
- [ ] Do **not** add, rename, or remove config fields — flag for a human instead.

## 3. Write the three pages (use the template)

- [ ] How-to, reference, explanation each created from the template.
- [ ] Frontmatter on every page: `title`, `slug`, `description`, `keywords`,
`type` (+ `redirect_from` where relevant).
- [ ] Titles + descriptions in the user's words / symptom vocabulary; keywords
cover singular/plural + synonyms.
- [ ] How-to has: literal control names + nav paths, copy-pasteable config, an
explicit "confirm it worked" step, and a troubleshooting section.
- [ ] Reference is lookup-only; explanation is conceptual-only.
- [ ] All code/config in fenced blocks. No code in screenshots.
- [ ] Clean up migration leftovers on the pages you create (e.g. `{: width}`,
stale anchors).

## 4. Hub + redirect (don't break the old URL)

- [ ] Replace the old page at its existing slug with a thin hub linking to the
three new pages.
- [ ] Keep existing `redirect_from` entries; the old URL must still resolve.
- [ ] Cross-link: each new page links to the other two.

## 5. Wire navigation

- [ ] Add the three pages under the hub in `_data/navigation.yml`.
- [ ] Run `npm run gen:sidebar` to regenerate the sidebar (don't hand-edit `src/sidebar.mjs`).
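
The navigation entry for a split page always has the same shape: the hub with three children. A minimal sketch of that shape follows, assuming the `url` / `title` / `items` keys used in `_data/navigation.yml`; `splitNavEntry` is a hypothetical name, and the sketch only builds the object, it does not write YAML.

```typescript
// Shape of one navigation node, mirroring the url/title/items keys.
interface NavItem {
  url: string;
  title: string;
  items?: NavItem[];
}

interface SplitTitles {
  howTo: string;
  reference: string;
  explanation: string;
}

// Hypothetical helper: builds the hub entry with its three child pages.
function splitNavEntry(hubUrl: string, hubTitle: string, titles: SplitTitles): NavItem {
  // Normalise to a trailing slash so child URLs join cleanly.
  const base = hubUrl.slice(-1) === '/' ? hubUrl : hubUrl + '/';
  return {
    url: base,
    title: hubTitle,
    items: [
      { url: base + 'how-to/', title: titles.howTo },
      { url: base + 'reference/', title: titles.reference },
      { url: base + 'explanation/', title: titles.explanation },
    ],
  };
}
```

Comparing the generated object with the entry you add by hand is a quick way to catch a missing trailing slash or a child page placed at the wrong level.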

## 6. Verify the build

- [ ] `npm run build` is clean.
- [ ] `node scripts/audit-phase2.mjs` shows no new issues on the pages you touched.
- [ ] Every cross-page anchor resolves (new heading IDs + the ones you link to on
existing pages).
- [ ] The old URL serves the hub, and every `redirect_from` path 301-redirects to it.
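
The anchor check above can be approximated with a short script. This is a sketch under stated assumptions: `headingSlug` is a simplified slug rule (lower-case, punctuation dropped, spaces to hyphens) that may not match the site's real heading-ID generator in every case, and all function names are hypothetical.

```typescript
// Simplified, assumed slug rule; the site's generator may differ on edge cases.
function headingSlug(heading: string): string {
  return heading
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9 -]/g, '') // drop punctuation such as parentheses
    .replace(/ /g, '-');         // each space becomes a hyphen
}

// Collects the IDs of all Markdown headings (# to ######) in a page.
// Limitation: lines starting with '#' inside fenced code blocks are also counted.
function headingIds(markdown: string): string[] {
  const ids: string[] = [];
  for (const line of markdown.split('\n')) {
    const match = /^#{1,6}\s+(.*)$/.exec(line);
    if (match) {
      ids.push(headingSlug(match[1]));
    }
  }
  return ids;
}

// Returns the anchors that do not correspond to any heading on the page.
function missingAnchors(markdown: string, anchors: string[]): string[] {
  const ids = headingIds(markdown);
  return anchors.filter((anchor) => ids.indexOf(anchor) === -1);
}
```

For example, a heading written as `Backend sizes (dynamic backends)` (an assumed heading text) yields `backend-sizes-dynamic-backends` under this rule. Treat a reported miss as a prompt to check the built page, not as proof that the link is broken.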

## 7. Ship

- [ ] Branch name and PR title carry the Linear id (e.g. `PRDCT-354: …`).
- [ ] Touch only the page you are splitting and its split outputs, nothing else.
- [ ] PR body lists: BEFORE → AFTER (what went where) and the human-review queue
(every `TODO(human-review)`).
- [ ] Share a preview link for review.
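
The human-review queue for the PR body can be collected rather than assembled by hand. A minimal sketch, assuming the markers follow the `<!-- TODO(human-review): … -->` form exactly; `humanReviewQueue` and `ReviewItem` are hypothetical names.

```typescript
interface ReviewItem {
  line: number; // 1-based line where the comment starts
  note: string; // comment text with whitespace collapsed
}

// Hypothetical helper: finds every TODO(human-review) comment in a page.
function humanReviewQueue(markdown: string): ReviewItem[] {
  const items: ReviewItem[] = [];
  // [\s\S] lets a note span several lines inside one HTML comment.
  const pattern = /<!--\s*TODO\(human-review\):\s*([\s\S]*?)\s*-->/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(markdown)) !== null) {
    // Count the newlines before the match to get its line number.
    const line = markdown.slice(0, match.index).split('\n').length;
    items.push({ line: line, note: match[1].replace(/\s+/g, ' ') });
  }
  return items;
}
```

Running this over each split output and pasting the result into the PR body keeps the queue in step with the markers actually present in the files.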
4 changes: 4 additions & 0 deletions src/content.config.ts
@@ -13,6 +13,10 @@ export const collections = {
icon: z.string().optional(),
section: z.string().optional(),
beacon: z.boolean().optional(),
// Docs revamp (Diátaxis) — every revamped page declares the single
// reader need it serves, plus user-vocabulary keywords for search/RAG.
keywords: z.array(z.string()).optional(),
type: z.enum(['how-to', 'reference', 'explanation']).optional(),
}),
}),
}),
59 changes: 59 additions & 0 deletions src/content/docs/transformations/snowflake-plain/explanation.md
@@ -0,0 +1,59 @@
---
title: When should I use a Snowflake transformation?
slug: 'transformations/snowflake-plain/explanation'
description: Understand what a Snowflake SQL transformation is in Keboola, why and when to choose it over Python, R, BigQuery, or DuckDB, and how it fits into the input-mapping → script → output-mapping flow.
keywords:
- Snowflake transformation
- Snowflake transformations
- when to use Snowflake transformation
- SQL transformation Keboola
- Snowflake vs Python transformation
- Snowflake backend
type: explanation
---

A **Snowflake transformation** runs your SQL against a [Snowflake](https://www.snowflake.com/) database that Keboola manages for you. You write `SELECT` / `CREATE TABLE` statements; Keboola takes care of the warehouse, the staging area, and moving results back to [Storage](/storage/tables/). This page explains what that means and when it is the right choice. To build one, follow the [how-to](/transformations/snowflake-plain/how-to/); for exact limits and syntax rules, see the [reference](/transformations/snowflake-plain/reference/).

## What it is

Like every [transformation](/transformations/), a Snowflake transformation operates on an isolated copy of your data, not on Storage directly:

1. **Input mapping** copies the Storage tables you name into a temporary staging schema.
2. Your **SQL script** runs against that staging schema.
3. **Output mapping** writes the resulting tables back to Storage.

Because it works on a copy, you can rename or restructure Storage tables without breaking the script, and a failed run never corrupts your source data.
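
The three stages can be pictured with a toy in-memory model. This is only an illustration of the isolation idea, not how Keboola implements it; `runTransformation`, `copyTable`, and the table names in the usage note are hypothetical.

```typescript
type Row = { [column: string]: string };
type Tables = { [name: string]: Row[] };

// Deep-copies a table so later edits cannot reach the original rows.
function copyTable(rows: Row[]): Row[] {
  return rows.map((row) => {
    const copy: Row = {};
    for (const key of Object.keys(row)) {
      copy[key] = row[key];
    }
    return copy;
  });
}

// Toy model: each mapping is "staging table name -> Storage table name".
function runTransformation(
  storage: Tables,
  inputMapping: { [stagingName: string]: string },
  script: (staging: Tables) => void,
  outputMapping: { [stagingName: string]: string }
): Tables {
  // 1. Input mapping: copy the named Storage tables into staging.
  const staging: Tables = {};
  for (const stagingName of Object.keys(inputMapping)) {
    const source = storage[inputMapping[stagingName]];
    if (!source) {
      throw new Error('unknown Storage table: ' + inputMapping[stagingName]);
    }
    staging[stagingName] = copyTable(source);
  }
  // 2. The script sees staging only; if it throws, Storage is untouched.
  script(staging);
  // 3. Output mapping: return a new Storage snapshot with the results added.
  const result: Tables = {};
  for (const name of Object.keys(storage)) {
    result[name] = storage[name];
  }
  for (const stagingName of Object.keys(outputMapping)) {
    const produced = staging[stagingName];
    if (!produced) {
      throw new Error('script did not create table: ' + stagingName);
    }
    result[outputMapping[stagingName]] = copyTable(produced);
  }
  return result;
}
```

In this model, a script that overwrites rows in a staging table named `orders` leaves the Storage table it was copied from (say `in.c-main.orders`) unchanged, and a script that throws never reaches step 3.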

## Why Snowflake

Snowflake is a cloud data warehouse, which removes most of the operational burden of traditional databases:

- **No database administration** — no servers, vacuuming, or patching to manage.
- **No indexes, sort keys, distribution styles, or column compression** to design and tune.
- **Easy scaling** — increase the [backend size](/transformations/snowflake-plain/reference/#backend-sizes-dynamic-backends) when a job needs more power, without rewriting anything.
- **Simple data types** and a familiar SQL dialect.
- **Strong processing power and throughput** for large joins and aggregations.

Being a managed cloud service, Snowflake also ships continuous updates; occasionally that means behavioral changes worth tracking in the [release notes](https://docs.snowflake.com/en/release-notes/overview).

## When to use it (and when not to)

Choose a Snowflake transformation when:

- Your logic is naturally expressed in **SQL** — joins, aggregations, filtering, denormalizing, integrity checks.
- Your data is **tabular** and you want set-based processing close to where the data already lives.
- You want to scale up heavy jobs simply by [changing the backend size](/transformations/snowflake-plain/how-to/#make-it-faster-backend-size).

Consider a different backend when:

- You need **procedural code**, custom libraries, or ML — use a [Python](/transformations/python-plain/) or [R](/transformations/r-plain/) transformation.
- Your project runs on a different warehouse — Keboola also offers [BigQuery](/transformations/bigquery/), [DuckDB](/transformations/duckdb/), and [Oracle](/transformations/oracle/) transformations. The concepts on this page are the same; the SQL dialect and limits differ.

## Things to understand up front

Two Snowflake behaviors trip people up; both are detailed in the [reference](/transformations/snowflake-plain/reference/):

- **Case sensitivity.** Snowflake folds unquoted identifiers to upper case, but Keboola creates tables and columns in their original case. Quote your identifiers (`"my_column"`) so they match — see [identifier case sensitivity](/transformations/snowflake-plain/reference/#identifier-case-sensitivity).
- **Everything lands as character data.** Storage stores columns as character types, so values are cast to char on output — and `ARRAY`, `OBJECT`, and `VARIANT` must be cast explicitly. See [working with data types](/transformations/snowflake-plain/reference/#working-with-data-types).

Knowing these two points up front avoids two common sources of confusing errors in Snowflake transformations.
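
The case-sensitivity point can be made concrete with a small model of how an identifier in a script is resolved. This is a sketch of the folding rule only, assuming the standard Snowflake behavior described above; `resolveIdentifier` and `matchesColumn` are hypothetical names, and the model ignores any session-level settings that change identifier handling.

```typescript
// Models identifier resolution: unquoted names fold to upper case,
// double-quoted names keep their exact case.
function resolveIdentifier(identifier: string): string {
  const quoted = /^"(.*)"$/.exec(identifier);
  if (quoted) {
    // Inside quotes, a doubled quote ("") stands for one literal quote.
    return quoted[1].replace(/""/g, '"');
  }
  return identifier.toUpperCase();
}

// True when the identifier as written in SQL refers to the stored column name.
function matchesColumn(identifier: string, storedName: string): boolean {
  return resolveIdentifier(identifier) === storedName;
}
```

With a column stored as `my_column`, `matchesColumn('my_column', 'my_column')` is `false` because the unquoted name resolves to `MY_COLUMN`, while `matchesColumn('"my_column"', 'my_column')` is `true`.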