docs: docs for dwh cost estimates and savings #3554

Status: Open (wants to merge 8 commits into base: main)
60 changes: 60 additions & 0 deletions docs/cloud/features/costs_savings.md
@@ -0,0 +1,60 @@
# Data Warehouse Costs and Savings with Tobiko

Cloud data warehouses can be expensive, and finance departments are paying attention. When they come asking about your cloud budget, the best preparation is a granular understanding of how your project's data pipelines map to those costs.

Cloud warehouse costs are determined by how much computation occurs. Clouds report computation/cost for each executed query, but it is challenging to align every query with a specific model in your project.

Tobiko Cloud solves this problem for you. It tracks data warehouse cost estimates per model for BigQuery and Snowflake projects that use a supported pricing plan.

This granular cost information allows you to directly explain/justify your cloud warehouse spend. It also uncovers the models that could most benefit from efficiency improvements.
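
As a rough illustration of what per-model cost information makes possible, the sketch below totals query costs by model and ranks the models from most to least expensive. It assumes every query has already been matched to a model, which is the hard part described above. The record layout, model names, usage figures, and the `cost_by_model` helper are all hypothetical and do not describe how Tobiko Cloud does this internally.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical per-query records: (model name, terabytes processed).
# Matching each query to a model is assumed to have been done already.
QUERY_LOG = [
    ("sales.orders", Decimal("1.2")),
    ("sales.customers", Decimal("0.4")),
    ("sales.orders", Decimal("0.8")),
    ("marketing.campaigns", Decimal("0.2")),
]

# Illustrative rate in dollars per terabyte processed (an assumption).
COST_PER_TB = Decimal("6.25")


def cost_by_model(query_log, cost_per_tb):
    """Sum the estimated cost per model, most expensive model first."""
    totals = defaultdict(Decimal)  # Decimal() starts each total at 0
    for model, terabytes in query_log:
        totals[model] += terabytes * cost_per_tb
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


for model, cost in cost_by_model(QUERY_LOG, COST_PER_TB):
    print(f"{model}: ${cost:.2f}")
```

The models at the top of the ranking are the natural first candidates for efficiency work.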

Beyond tracking your warehouse costs, Tobiko Cloud estimates how much money it saved you! Its advanced column-level impact analysis saves you even more money than open-source SQLMesh.

## Supported Data Warehouse Pricing Plans

Tobiko Cloud supports these cloud pricing plans:

- BigQuery [On Demand](https://cloud.google.com/bigquery/pricing#on_demand_pricing)
- Snowflake [Credits](https://docs.snowflake.com/en/user-guide/cost-understanding-compute#label-what-are-credits)

## Data Warehouse Cost Configuration

Configure Tobiko Cloud to show cost information by navigating to the Settings page.

![Image highlighting location of the Settings link in the left site navigation](./costs_savings/costs-navigation.png)

On the General settings page (1), select your pricing plan (2), enter your costs, and then save (3).
caiters marked this conversation as resolved.

![Annotated image showing locations of the general settings link, pricing plan form fields, and save button](./costs_savings/costs-steps.png)

For the BigQuery On Demand plan, we supply BigQuery's default cost per terabyte processed of $6.25\*:
![Costs form for BigQuery On-Demand](./costs_savings/costs-bigquery-on-demand.png)

For the Snowflake Credits plan, we supply Snowflake's default cost per credit of $3.00\*:
![Costs form for Snowflake Credits](./costs_savings/costs-snowflake-credits.png)

If your organization has negotiated a different cost, please enter that cost into the "Cost per" field instead so your organization's data warehouse cost estimates will be more accurate.

\*These are the current default costs as of January 2025, and will be updated in Tobiko Cloud as warehouse cost defaults change.
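
To make the arithmetic behind the "Cost per" field concrete, here is a minimal sketch that multiplies usage by the configured rate. Only the $6.25 per terabyte and $3.00 per credit defaults come from the text above. The helper names, the usage figures, and the negotiated rate of $5.00 are hypothetical, and the real estimates may account for details this sketch ignores.

```python
from decimal import Decimal

# Default rates quoted above (as of January 2025).
BIGQUERY_COST_PER_TB = Decimal("6.25")
SNOWFLAKE_COST_PER_CREDIT = Decimal("3.00")

CENTS = Decimal("0.01")


def estimate_bigquery_cost(terabytes_processed, cost_per_tb=BIGQUERY_COST_PER_TB):
    """Hypothetical helper: terabytes processed x cost per terabyte."""
    return (Decimal(str(terabytes_processed)) * cost_per_tb).quantize(CENTS)


def estimate_snowflake_cost(credits_used, cost_per_credit=SNOWFLAKE_COST_PER_CREDIT):
    """Hypothetical helper: credits consumed x cost per credit."""
    return (Decimal(str(credits_used)) * cost_per_credit).quantize(CENTS)


# Made-up usage figures at the default rates.
print(estimate_bigquery_cost(2.4))    # 15.00
print(estimate_snowflake_cost(12.5))  # 37.50

# The same BigQuery usage at a hypothetical negotiated rate of $5.00 per terabyte.
print(estimate_bigquery_cost(2.4, cost_per_tb=Decimal("5.00")))  # 12.00
```

The gap between the first and last result shows why entering a negotiated rate matters: the same usage produces a different estimate.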

## Where to find cost and savings estimates information

Contributor

I suggest introducing savings before where to find them.


Estimated costs and savings are displayed on the Tobiko Cloud homepage, production environment page, runs and plans pages, and individual model pages.

Contributor

I believe this is the first time "estimated" is included. Should we add a paragraph to the first section with the disclaimer that the figures may not 100% match, but are based on the data retrieved from the data warehouse itself? And then remove "estimated" from here and elsewhere.


Cost information on each page will look similar to:

![Example of costs and savings data as seen on the Tobiko Cloud homepage](./costs_savings/costs-example.png)

## Savings Categories

When calculating your data warehouse costs, Tobiko Cloud also estimates how much money it saved you!

Contributor

Nit: it isn't clear who "it" is here. It could be the dwh or tcloud.


Cost savings are broken up into three main categories:
caiters marked this conversation as resolved.

Contributor

There are three types of cost savings tracked by Tobiko Cloud:


- **Prevented Reruns**: If SQLMesh has already run an execution for a change in one environment, we won't need to rerun it in another environment (such as when a model change is backfilled on a development environment and then applied to prod).

Contributor @crericha, Jan 9, 2025

If SQLMesh has already executed the model for the interval(s) requested, it knows that it doesn't need to execute the interval(s) again (such as...)

- **Unaffected Downstream**: Because SQLMesh understands SQL, we know if a downstream model is not affected by an upstream change and can avoid re-execution.

Contributor

we can detect when a downstream model is not affected by an upstream change and avoid a re-execution that we know would produce the same data

- **Virtual Environments**: Using the Virtual data environments feature means new environments can be created without backfilling models within the new environment.
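
One way to picture how the three categories roll up into a single savings figure is the sketch below, which sums the estimated cost of each avoided execution by category. The category names come from the list above. The `SavingsCategory` enum, the `savings_by_category` helper, and the dollar amounts are hypothetical and are not Tobiko Cloud's data model.

```python
from decimal import Decimal
from enum import Enum


class SavingsCategory(Enum):
    PREVENTED_RERUNS = "Prevented Reruns"
    UNAFFECTED_DOWNSTREAM = "Unaffected Downstream"
    VIRTUAL_ENVIRONMENTS = "Virtual Environments"


# Hypothetical avoided executions: (category, estimated cost avoided in dollars).
AVOIDED_EXECUTIONS = [
    (SavingsCategory.PREVENTED_RERUNS, Decimal("12.50")),
    (SavingsCategory.UNAFFECTED_DOWNSTREAM, Decimal("4.25")),
    (SavingsCategory.PREVENTED_RERUNS, Decimal("3.00")),
    (SavingsCategory.VIRTUAL_ENVIRONMENTS, Decimal("20.00")),
]


def savings_by_category(records):
    """Sum the avoided cost per category, listing categories with no savings as 0.00."""
    totals = {category: Decimal("0.00") for category in SavingsCategory}
    for category, amount in records:
        totals[category] += amount
    return totals


totals = savings_by_category(AVOIDED_EXECUTIONS)
for category, amount in totals.items():
    print(f"{category.value}: ${amount}")
print(f"Total: ${sum(totals.values())}")
```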

### Where to find cost savings information

Cost savings are shown in most of the places data warehouse costs are displayed. You can find how much Tobiko Cloud has saved you by viewing the homepage, production environment page, or individual model pages.

Contributor

Do we not show on the plan page yet?

Binary file added docs/cloud/features/costs_savings/costs-steps.png
3 changes: 2 additions & 1 deletion mkdocs.yml
@@ -106,9 +106,10 @@ nav:
  - Cloud Features:
  # - "Alerts & Notifications": cloud/features/alerts_notifications.md
  - cloud/features/debugger_view.md
- # - cloud/features/data_catalog.md
+ - cloud/features/data_catalog.md
  # - cloud/features/model_freshness.md
  # - cloud/features/scheduler.md
+ - "Cost & Savings Estimates": cloud/features/costs_savings.md
  # - Observability:
  # - cloud/observability/monitoring.md
  # - "Measures & Dashboards": cloud/observability/measures_dashboards.md