Custom GCS Storage

Configure Slateo to use your own Google Cloud Storage (GCS) buckets for query results and file storage.

Overview

By default, Slateo stores query results and uploaded files in Slateo-managed storage. For organizations with data residency requirements or compliance needs, you can configure Slateo to use GCS buckets in your own Google Cloud project.

Slateo accesses your GCS buckets using a GCP service account that you provide. You can either supply a dedicated service account JSON key for storage, or reuse the service account credentials from your BigQuery datasource configuration.

Key Benefits:

  • Data never stored outside your Google Cloud project
  • Full control over encryption, retention, and access policies
  • Audit trail in your own Cloud Audit Logs
  • Option to reuse existing BigQuery service account credentials

Prerequisites

Before starting, ensure you have:

  • A Google Cloud project with permissions to create GCS buckets and manage IAM
  • A service account with a JSON key (or an existing BigQuery datasource configured in Slateo with service account credentials)

Setup steps

Step 1: Create GCS buckets

Create two GCS buckets in your Google Cloud project:

| Bucket Purpose | Recommended Naming | Description |
|---|---|---|
| Uploads | {company}-slateo-uploads | User file uploads, CSVs, exports |
| Cache | {company}-slateo-cache | Query result caching |

Both buckets should be created in the same project and location as your BigQuery datasets for lowest latency.
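The commands below use {company} and {your-project-id} placeholders throughout. To avoid editing each command by hand, you can capture your values as shell variables once and derive the bucket names from the naming convention above; a minimal sketch (the COMPANY, PROJECT_ID, and LOCATION values are illustrative assumptions you replace with your own):

```shell
# Replace these example values with your own (assumptions for illustration)
COMPANY="acme"
PROJECT_ID="acme-analytics"
LOCATION="us-central1"

# Derive the bucket names from the recommended naming convention
UPLOADS_BUCKET="${COMPANY}-slateo-uploads"
CACHE_BUCKET="${COMPANY}-slateo-cache"

echo "Uploads bucket: gs://${UPLOADS_BUCKET}"
echo "Cache bucket:   gs://${CACHE_BUCKET}"
```

Later commands can then reference gs://${UPLOADS_BUCKET} and gs://${CACHE_BUCKET} instead of repeating the placeholders.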

Using gcloud CLI:

# Create uploads bucket
gcloud storage buckets create gs://{company}-slateo-uploads \
  --project={your-project-id} \
  --location=us-central1 \
  --uniform-bucket-level-access

# Create cache bucket
gcloud storage buckets create gs://{company}-slateo-cache \
  --project={your-project-id} \
  --location=us-central1 \
  --uniform-bucket-level-access

Enforce public access prevention (required for both buckets):

gcloud storage buckets update gs://{company}-slateo-uploads \
  --public-access-prevention=enforced

gcloud storage buckets update gs://{company}-slateo-cache \
  --public-access-prevention=enforced
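If bucket creation fails with an invalid-name error, you can sanity-check a candidate name locally before retrying. This sketch checks the basic GCS naming rules (lowercase letters, digits, hyphens, and underscores; 3 to 63 characters; a letter or digit at both ends — names containing dots have additional rules not covered here):

```shell
# Check a candidate bucket name against basic GCS naming rules:
# lowercase letters, digits, - and _; 3-63 chars; alphanumeric at both ends
valid_bucket_name() {
  local name="$1"
  [[ "$name" =~ ^[a-z0-9][a-z0-9_-]{1,61}[a-z0-9]$ ]]
}

if valid_bucket_name "acme-slateo-uploads"; then
  echo "valid"
else
  echo "invalid"
fi
```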

Step 2: Grant service account permissions

The service account used by Slateo needs access to both buckets.

Option A: Predefined role (simplest)

Grant roles/storage.objectAdmin on each bucket:

gsutil iam ch serviceAccount:SERVICE_ACCOUNT_EMAIL:objectAdmin \
  gs://{company}-slateo-uploads

gsutil iam ch serviceAccount:SERVICE_ACCOUNT_EMAIL:objectAdmin \
  gs://{company}-slateo-cache

Option B: Custom role (least privilege)

Create a custom role with these permissions and bind it to the service account on each bucket:

  • storage.objects.create
  • storage.objects.get
  • storage.objects.delete
  • storage.objects.list
  • storage.buckets.get
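Under this least-privilege option, the role is created once at the project level and then bound to the service account on each bucket. A sketch that assembles the corresponding gcloud commands from the permission list above (the role ID slateoStorage, the service account email, and the other variable values are assumptions; remove the echo prefixes to actually execute the commands):

```shell
PROJECT_ID="acme-analytics"  # assumption: your project ID
SA_EMAIL="slateo-storage@${PROJECT_ID}.iam.gserviceaccount.com"  # assumption
PERMS="storage.objects.create,storage.objects.get,storage.objects.delete,storage.objects.list,storage.buckets.get"

# Create the custom role once at the project level
echo gcloud iam roles create slateoStorage \
  --project="${PROJECT_ID}" \
  --title="Slateo Storage" \
  --permissions="${PERMS}"

# Bind it to the service account on each bucket
for bucket in acme-slateo-uploads acme-slateo-cache; do
  echo gcloud storage buckets add-iam-policy-binding "gs://${bucket}" \
    --member="serviceAccount:${SA_EMAIL}" \
    --role="projects/${PROJECT_ID}/roles/slateoStorage"
done
```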

Step 3: Configure CORS on cache bucket

Add CORS configuration to the cache bucket to allow browser-based downloads of query results.

Save this as cors.json:

[
  {
    "origin": [
      "https://*.slateo.ai",
      "https://slateo.ai"
    ],
    "method": ["GET", "HEAD"],
    "responseHeader": ["ETag"],
    "maxAgeSeconds": 3600
  }
]

Apply it:

gsutil cors set cors.json gs://{company}-slateo-cache

To verify:

gsutil cors get gs://{company}-slateo-cache
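A common failure mode is malformed JSON in cors.json, which gsutil rejects. A small script that writes the file and validates it locally before applying it (assumes python3 is available on your machine):

```shell
# Write the CORS policy and confirm it parses before applying it with gsutil
cat > cors.json <<'EOF'
[
  {
    "origin": ["https://*.slateo.ai", "https://slateo.ai"],
    "method": ["GET", "HEAD"],
    "responseHeader": ["ETag"],
    "maxAgeSeconds": 3600
  }
]
EOF

# Prints a parse error and exits nonzero if the JSON is invalid
python3 -m json.tool cors.json > /dev/null && echo "cors.json is valid JSON"
```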

Step 4: (Optional) Configure CMEK encryption

If your buckets use Customer-Managed Encryption Keys (CMEK), the service account must have roles/cloudkms.cryptoKeyEncrypterDecrypter on the key:

gcloud kms keys add-iam-policy-binding {key-name} \
  --project={your-project-id} \
  --location={location} \
  --keyring={keyring-name} \
  --member=serviceAccount:SERVICE_ACCOUNT_EMAIL \
  --role=roles/cloudkms.cryptoKeyEncrypterDecrypter
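When you set the key as a bucket's default encryption key, it is referenced by its full resource name. A sketch composing that name from its components (the variable values are assumptions; remove the echo prefix to execute the update):

```shell
PROJECT_ID="acme-analytics"  # assumption: your project ID
LOCATION="us-central1"       # assumption: must match the bucket's location
KEYRING="slateo-keyring"     # assumption
KEY="slateo-key"             # assumption

# Full Cloud KMS key resource name
KEY_RESOURCE="projects/${PROJECT_ID}/locations/${LOCATION}/keyRings/${KEYRING}/cryptoKeys/${KEY}"
echo "${KEY_RESOURCE}"

# Set it as the bucket's default encryption key
echo gcloud storage buckets update gs://acme-slateo-cache \
  --default-encryption-key="${KEY_RESOURCE}"
```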

Step 5: Configure buckets in Slateo

After configuring your buckets and permissions, update your database configuration in Slateo:

  1. Go to Admin → Databases
  2. Click the settings icon (gear) on your database connection
  3. Select GCS as the storage provider
  4. Enter the following fields:
    • GCS Upload Bucket: Your uploads bucket name
    • GCS Cache Bucket: Your cache bucket name
    • GCP Project ID: The project containing your buckets
    • Service Account JSON (optional): A service account JSON key with storage access. If omitted, Slateo uses the credentials from your BigQuery datasource configuration.
  5. Click Save to apply the configuration

Security considerations

| Practice | Description |
|---|---|
| Uniform bucket-level access | Enable on both buckets for consistent IAM-based access control |
| Public access prevention | Enforce on both buckets to block any public access |
| Dedicated service account | Use a separate service account for Slateo storage (not the same SA used for BigQuery queries) |
| Cloud Audit Logs | Enable data access logging for monitoring |
| CMEK encryption | Use customer-managed keys for additional control over encryption at rest |

Troubleshooting

Access denied errors

  1. Check GCS storage status - Ensure the status shows Ready in Admin → Databases
  2. Verify service account permissions - The service account must have roles/storage.objectAdmin (or equivalent custom role) on both buckets
  3. Check bucket names match exactly (case-sensitive)
  4. Verify service account JSON key in Slateo matches the account with permissions

Browser download failures (query results)

If you see CORS errors or TypeError: Failed to fetch when viewing query results:

  1. Verify CORS configuration is applied to the cache bucket (not uploads)
  2. Check origin values include https://*.slateo.ai and https://slateo.ai
  3. Verify methods include GET and HEAD
  4. Clear browser cache - The browser may have cached a failed CORS preflight. Try a hard refresh (Cmd+Shift+R / Ctrl+Shift+R) or incognito window.

Run gsutil cors get gs://YOUR-CACHE-BUCKET to verify the configuration.

Signed URL errors

The service account must have the iam.serviceAccounts.signBlob permission, or the JSON key must include the private key. If you use Workload Identity Federation instead of a JSON key, grant roles/iam.serviceAccountTokenCreator (which includes iam.serviceAccounts.signBlob) on the service account.

CMEK errors

  1. Verify the key exists and is enabled (not pending deletion)
  2. Ensure the key is in the same location as the bucket
  3. Check the service account has roles/cloudkms.cryptoKeyEncrypterDecrypter on the key

General issues

  1. Confirm buckets exist in the specified project
  2. Check Cloud Audit Logs for detailed error information
  3. Contact support with your organization slug and error details

Frequently asked questions

Can I use a single bucket for both uploads and cache?

No. Both upload and cache bucket fields must be specified, and they should be separate buckets for proper isolation and different lifecycle policies.

Can I reuse my BigQuery service account credentials?

Yes. If your BigQuery datasource is configured with a service account JSON key, you can leave the storage service account field empty and Slateo will use the BigQuery credentials. The BigQuery service account must have the required storage permissions on both buckets.

How do I verify the integration is working?

After configuration, run a query and check that the results are cached in your bucket. Query cache files are stored with the prefix {org-id}/query-cache/parquet/. You can also upload a CSV file and verify it appears under {org-id}/uploads/.
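To check both prefixes from the command line, the listing commands can be composed from your organization ID (the ORG_ID and bucket name values are assumptions; remove the echo prefixes to execute):

```shell
ORG_ID="your-org-id"                  # assumption: your Slateo organization ID
CACHE_BUCKET="acme-slateo-cache"      # assumption: your cache bucket
UPLOADS_BUCKET="acme-slateo-uploads"  # assumption: your uploads bucket

# List cached query results and uploaded files under the documented prefixes
echo gcloud storage ls "gs://${CACHE_BUCKET}/${ORG_ID}/query-cache/parquet/"
echo gcloud storage ls "gs://${UPLOADS_BUCKET}/${ORG_ID}/uploads/"
```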

Can I migrate existing data to my buckets?

When you configure custom buckets, new data will automatically go to your buckets. Existing cached query results will remain in their previous location and become inaccessible through Slateo (you can re-run queries to regenerate them). Previously uploaded files will also need to be re-uploaded. Contact support if you need to discuss migration of historical data.

What locations are supported?

Any GCS location is supported. For lowest latency, place your buckets in the same location as your BigQuery datasets.

How does this affect my Google Cloud bill?

You will see GCS storage and operation charges in your Google Cloud project for data stored in your buckets. Standard GCS pricing applies.

