Feature/chunked database dumps #76

Open · wants to merge 5 commits into main (showing changes from 4 commits)
62 changes: 62 additions & 0 deletions README.md
@@ -230,6 +230,8 @@ window.onload = connectWebSocket
<h3>SQL Dump</h3>
You can request a `database_dump.sql` file that exports your database schema and data into a single file.

For small databases (< 100MB), you can use the direct download endpoint:

<pre>
<code>
curl --location 'https://starbasedb.YOUR-ID-HERE.workers.dev/export/dump' \
@@ -238,6 +240,66 @@ curl --location 'https://starbasedb.YOUR-ID-HERE.workers.dev/export/dump' \
</code>
</pre>

For large databases, use the chunked dump endpoint, which processes the dump in the background (a script that automates the three steps below is sketched after step 3):

1. Start the dump:
<pre>
<code>
curl --location --request POST 'https://starbasedb.YOUR-ID-HERE.workers.dev/export/dump/chunked' \
--header 'Authorization: Bearer ABC123'
</code>
</pre>

Response:

```json
{
"message": "Database dump started",
"dumpId": "123e4567-e89b-12d3-a456-426614174000",
"status": "in_progress",
"downloadUrl": "https://starbasedb.YOUR-ID-HERE.workers.dev/export/dump/123e4567-e89b-12d3-a456-426614174000"
}
```

2. Check dump status:
<pre>
<code>
curl --location 'https://starbasedb.YOUR-ID-HERE.workers.dev/export/dump/123e4567-e89b-12d3-a456-426614174000/status' \
--header 'Authorization: Bearer ABC123'
</code>
</pre>

Response:

```json
{
"status": "in_progress",
"progress": {
"currentTable": "users",
"processedTables": 2,
"totalTables": 5
}
}
```

3. Download the completed dump:
<pre>
<code>
curl --location 'https://starbasedb.YOUR-ID-HERE.workers.dev/export/dump/123e4567-e89b-12d3-a456-426614174000' \
--header 'Authorization: Bearer ABC123' \
--output database_dump.sql
</code>
</pre>
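
The three calls above can be scripted end to end. Below is a minimal TypeScript (Node 18+) sketch of that flow; the worker URL, token, polling interval, and the assumption that any status other than `in_progress` means the dump has finished are placeholders rather than part of this PR.

```typescript
import { writeFile } from 'node:fs/promises'

// Placeholder values: substitute your own worker URL and bearer token.
const BASE_URL = 'https://starbasedb.YOUR-ID-HERE.workers.dev'
const headers = { Authorization: 'Bearer ABC123' }

async function downloadChunkedDump(): Promise<void> {
    // 1. Start the background dump.
    const startRes = await fetch(`${BASE_URL}/export/dump/chunked`, {
        method: 'POST',
        headers,
    })
    const { dumpId, downloadUrl } = (await startRes.json()) as {
        dumpId: string
        downloadUrl: string
    }

    // 2. Poll the status endpoint. The terminal status string is not shown
    //    above, so anything other than "in_progress" is treated as done here.
    while (true) {
        const statusRes = await fetch(`${BASE_URL}/export/dump/${dumpId}/status`, { headers })
        const body = (await statusRes.json()) as {
            status: string
            progress?: { currentTable: string; processedTables: number; totalTables: number }
        }
        if (body.status !== 'in_progress') break
        console.log(
            `dumping ${body.progress?.currentTable} ` +
                `(${body.progress?.processedTables}/${body.progress?.totalTables} tables)`
        )
        await new Promise((resolve) => setTimeout(resolve, 5_000))
    }

    // 3. Download the finished dump and write it to disk.
    const fileRes = await fetch(downloadUrl, { headers })
    await writeFile('database_dump.sql', new Uint8Array(await fileRes.arrayBuffer()))
}

downloadChunkedDump().catch(console.error)
```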

The chunked dump endpoint:

- Processes large databases in chunks to avoid memory issues
- Stores the dump file in R2 storage
- Takes "breathing intervals" to prevent database locking
- Supports databases up to 10GB in size
- Provides progress tracking
- Returns a download URL when complete
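
The chunking logic itself lives in `./export/chunked-dump` and is not shown in this diff. Purely as an illustration of the idea behind the bullets above, here is a hedged sketch of a chunk-by-chunk loop with breathing intervals; the batch size, the `saveChunk` callback, and every other name in it are hypothetical, not the PR's actual API.

```typescript
// Hypothetical progress record persisted (e.g. in Durable Object storage) between passes.
interface DumpProgress {
    dumpId: string
    tables: string[]
    tableIndex: number
    rowOffset: number
}

const ROWS_PER_CHUNK = 1000 // rows exported per pass (assumed)
const BREATHING_INTERVAL_MS = 100 // pause between passes so other queries can run (assumed)

async function runChunkedDump(
    sql: (query: string, ...params: unknown[]) => Promise<Record<string, unknown>[]>,
    saveChunk: (dumpId: string, text: string) => Promise<void>, // e.g. append to the R2 object
    progress: DumpProgress
): Promise<void> {
    while (progress.tableIndex < progress.tables.length) {
        const table = progress.tables[progress.tableIndex]
        const rows = await sql(
            `SELECT * FROM "${table}" LIMIT ? OFFSET ?`,
            ROWS_PER_CHUNK,
            progress.rowOffset
        )

        if (rows.length === 0) {
            // Finished this table; move on to the next one.
            progress.tableIndex++
            progress.rowOffset = 0
            continue
        }

        // Turn the batch into INSERT statements (naive quoting, sketch only) and persist it.
        const inserts = rows
            .map((row) => {
                const values = Object.values(row)
                    .map((v) => (v === null ? 'NULL' : `'${String(v).replace(/'/g, "''")}'`))
                    .join(', ')
                return `INSERT INTO "${table}" VALUES (${values});`
            })
            .join('\n')
        await saveChunk(progress.dumpId, inserts + '\n')
        progress.rowOffset += rows.length

        // "Breathing interval": yield between chunks so the database is not locked for the whole dump.
        await new Promise((resolve) => setTimeout(resolve, BREATHING_INTERVAL_MS))
    }
}
```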

<h3>JSON Data Export</h3>
<pre>
<code>
72 changes: 67 additions & 5 deletions src/do.ts
@@ -1,4 +1,7 @@
import { DurableObject } from 'cloudflare:workers'
import { processDumpChunk } from './export/chunked-dump'
import { StarbaseDBConfiguration } from './handler'
import { DataSource } from './types'

export class StarbaseDBDurableObject extends DurableObject {
// Durable storage for the SQL database
@@ -9,14 +12,14 @@ export class StarbaseDBDurableObject extends DurableObject {
public connections = new Map<string, WebSocket>()
// Store the client auth token for requests back to our Worker
private clientAuthToken: string

/**
* The constructor is invoked once upon creation of the Durable Object, i.e. the first call to
* `DurableObjectStub::get` for a given identifier (no-op constructors can be omitted)
*
* @param ctx - The interface for interacting with Durable Object state
* @param env - The interface to reference bindings declared in wrangler.toml
*/

constructor(ctx: DurableObjectState, env: Env) {
super(ctx, env)
this.clientAuthToken = env.CLIENT_AUTHORIZATION_TOKEN
@@ -59,10 +62,8 @@ export class StarbaseDBDurableObject extends DurableObject {
"operator" TEXT DEFAULT '='
)`

this.executeQuery({ sql: cacheStatement })
this.executeQuery({ sql: allowlistStatement })
this.executeQuery({ sql: allowlistRejectedStatement })
this.executeQuery({ sql: rlsStatement })
// Initialize tables
this.initializeTables()
}

init() {
@@ -72,6 +73,11 @@
deleteAlarm: this.deleteAlarm.bind(this),
getStatistics: this.getStatistics.bind(this),
executeQuery: this.executeQuery.bind(this),
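// Expose bound storage primitives (get/put/delete) to callers outside this class
// (presumably for the chunked dump logic imported above).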
storage: {
get: this.storage.get.bind(this.storage),
put: this.storage.put.bind(this.storage),
delete: this.storage.delete.bind(this.storage),
},
}
}

@@ -324,4 +330,60 @@ export class StarbaseDBDurableObject extends DurableObject {
throw error
}
}

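// Wraps an ArrayBuffer in a plain object exposing byteLength and an async slice(),
// presumably so large buffers can be read in pieces across the Durable Object RPC boundary.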
private convertToStubArrayBuffer(value: ArrayBuffer): {
byteLength: number
slice: (begin: number, end?: number) => Promise<ArrayBuffer>
[Symbol.toStringTag]: string
} {
return {
byteLength: value.byteLength,
slice: async (begin: number, end?: number) =>
value.slice(begin, end),
[Symbol.toStringTag]: 'ArrayBuffer',
}
}

private async initializeTables() {
// Install default necessary `tmp_` tables for various features here.
const cacheStatement = `
CREATE TABLE IF NOT EXISTS tmp_cache (
"id" INTEGER PRIMARY KEY AUTOINCREMENT,
"timestamp" REAL NOT NULL,
"ttl" INTEGER NOT NULL,
"query" TEXT UNIQUE NOT NULL,
"results" TEXT
);`

const allowlistStatement = `
CREATE TABLE IF NOT EXISTS tmp_allowlist_queries (
id INTEGER PRIMARY KEY AUTOINCREMENT,
sql_statement TEXT NOT NULL,
source TEXT DEFAULT 'external'
)`
const allowlistRejectedStatement = `
CREATE TABLE IF NOT EXISTS tmp_allowlist_rejections (
id INTEGER PRIMARY KEY AUTOINCREMENT,
sql_statement TEXT NOT NULL,
source TEXT DEFAULT 'external',
created_at TEXT DEFAULT (datetime('now'))
)`

const rlsStatement = `
CREATE TABLE IF NOT EXISTS tmp_rls_policies (
"id" INTEGER PRIMARY KEY AUTOINCREMENT,
"actions" TEXT NOT NULL CHECK(actions IN ('SELECT', 'UPDATE', 'INSERT', 'DELETE')),
"schema" TEXT,
"table" TEXT NOT NULL,
"column" TEXT NOT NULL,
"value" TEXT NOT NULL,
"value_type" TEXT NOT NULL DEFAULT 'string',
"operator" TEXT DEFAULT '='
)`

await this.executeQuery({ sql: cacheStatement })
await this.executeQuery({ sql: allowlistStatement })
await this.executeQuery({ sql: allowlistRejectedStatement })
await this.executeQuery({ sql: rlsStatement })
}
}