Custom GCS Storage
Configure Slateo to use your own Google Cloud Storage (GCS) buckets for query results and file storage.
Overview
By default, Slateo stores query results and uploaded files in Slateo-managed storage. For organizations with data residency requirements or compliance needs, you can configure Slateo to use GCS buckets in your own Google Cloud project.
Slateo accesses your GCS buckets using a GCP service account that you provide. You can either supply a dedicated service account JSON key for storage, or reuse the service account credentials from your BigQuery datasource configuration.
Key Benefits:
- Data never stored outside your Google Cloud project
- Full control over encryption, retention, and access policies
- Audit trail in your own Cloud Audit Logs
- Option to reuse existing BigQuery service account credentials
Prerequisites
Before starting, ensure you have:
- Google Cloud project with permissions to create GCS buckets and manage IAM
- A service account with a JSON key (or an existing BigQuery datasource configured in Slateo with service account credentials)
Setup steps
Step 1: Create GCS buckets
Create two GCS buckets in your Google Cloud project:
| Bucket Purpose | Recommended Naming | Description |
|---|---|---|
| Uploads | {company}-slateo-uploads | User file uploads, CSVs, exports |
| Cache | {company}-slateo-cache | Query result caching |
Both buckets should be created in the same project and location as your BigQuery datasets for lowest latency.
Using gcloud CLI:
# Create uploads bucket
gcloud storage buckets create gs://{company}-slateo-uploads \
--project={your-project-id} \
--location=us-central1 \
--uniform-bucket-level-access
# Create cache bucket
gcloud storage buckets create gs://{company}-slateo-cache \
--project={your-project-id} \
--location=us-central1 \
--uniform-bucket-level-access
Enforce public access prevention (required for both buckets):
gcloud storage buckets update gs://{company}-slateo-uploads \
--public-access-prevention=enforced
gcloud storage buckets update gs://{company}-slateo-cache \
--public-access-prevention=enforced
Bucket naming rules: 3-63 characters, lowercase letters, numbers, hyphens, underscores, and periods only. Cannot start with goog or contain google.
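Before moving on, you can confirm the settings took effect by describing each bucket and checking that public access prevention is enforced and uniform bucket-level access is enabled (exact output field names vary slightly by gcloud version):
# Inspect bucket settings
gcloud storage buckets describe gs://{company}-slateo-uploads
gcloud storage buckets describe gs://{company}-slateo-cache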
Step 2: Grant service account permissions
The service account used by Slateo needs access to both buckets.
Option A: Predefined role (simplest)
Grant roles/storage.objectAdmin on each bucket:
gsutil iam ch serviceAccount:SERVICE_ACCOUNT_EMAIL:objectAdmin \
gs://{company}-slateo-uploads
gsutil iam ch serviceAccount:SERVICE_ACCOUNT_EMAIL:objectAdmin \
gs://{company}-slateo-cache
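If you prefer the newer gcloud storage command surface over gsutil, an equivalent binding looks like this (same role, scoped per bucket):
# Grant objectAdmin on the uploads bucket; repeat for the cache bucket
gcloud storage buckets add-iam-policy-binding gs://{company}-slateo-uploads \
--member=serviceAccount:SERVICE_ACCOUNT_EMAIL \
--role=roles/storage.objectAdmin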
Option B: Custom role (least privilege)
Create a custom role with these permissions and bind it to the service account on each bucket (a sketch follows this list):
- storage.objects.create
- storage.objects.get
- storage.objects.delete
- storage.objects.list
- storage.buckets.get
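A minimal sketch of creating and binding such a role; the role ID slateoStorageAccess is an arbitrary example name:
# Create the custom role at the project level
gcloud iam roles create slateoStorageAccess \
--project={your-project-id} \
--title="Slateo Storage Access" \
--permissions=storage.objects.create,storage.objects.get,storage.objects.delete,storage.objects.list,storage.buckets.get
# Bind it on the uploads bucket; repeat for the cache bucket
gcloud storage buckets add-iam-policy-binding gs://{company}-slateo-uploads \
--member=serviceAccount:SERVICE_ACCOUNT_EMAIL \
--role=projects/{your-project-id}/roles/slateoStorageAccess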
If you plan to reuse your BigQuery datasource credentials (instead of providing a separate JSON key), the BigQuery service account must also have these storage permissions.
Step 3: Configure CORS on cache bucket
Add CORS configuration to the cache bucket to allow browser-based downloads of query results.
Save this as cors.json:
[
{
"origin": [
"https://*.slateo.ai",
"https://slateo.ai"
],
"method": ["GET", "HEAD"],
"responseHeader": ["ETag"],
"maxAgeSeconds": 3600
}
]
Apply it:
gsutil cors set cors.json gs://{company}-slateo-cache
To verify:
gsutil cors get gs://{company}-slateo-cache
CORS is required on the cache bucket because query results are downloaded directly by the browser. The uploads bucket does not need CORS since file uploads go through server-generated signed URLs.
Step 4: (Optional) Configure CMEK encryption
If using Customer-Managed Encryption Keys (CMEK) for encryption at rest, the service account must have roles/cloudkms.cryptoKeyEncrypterDecrypter on the relevant key:
gcloud kms keys add-iam-policy-binding {key-name} \
--project={your-project-id} \
--location={location} \
--keyring={keyring-name} \
--member=serviceAccount:SERVICE_ACCOUNT_EMAIL \
--role=roles/cloudkms.cryptoKeyEncrypterDecrypter
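You can then set the key as each bucket's default encryption key so new objects are encrypted with it automatically. Note that the Cloud Storage service agent (service-{project-number}@gs-project-accounts.iam.gserviceaccount.com) also needs the encrypter/decrypter role for default bucket encryption to work:
# Set the default CMEK key on both buckets
gcloud storage buckets update gs://{company}-slateo-uploads \
--default-encryption-key=projects/{your-project-id}/locations/{location}/keyRings/{keyring-name}/cryptoKeys/{key-name}
gcloud storage buckets update gs://{company}-slateo-cache \
--default-encryption-key=projects/{your-project-id}/locations/{location}/keyRings/{keyring-name}/cryptoKeys/{key-name}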
Step 5: Configure buckets in Slateo
After configuring your buckets and permissions, update your database configuration in Slateo:
- Go to Admin → Databases
- Click the settings icon (gear) on your database connection
- Select GCS as the storage provider
- Enter the following fields:
- GCS Upload Bucket: Your uploads bucket name
- GCS Cache Bucket: Your cache bucket name
- GCP Project ID: The project containing your buckets
- Service Account JSON (optional): A service account JSON key with storage access. If omitted, Slateo uses the credentials from your BigQuery datasource configuration.
- Click Save to apply the configuration
After saving, the GCS storage status will show as Pending. Wait until the status changes to Ready before testing the integration.
All three bucket/project fields must be specified together. You cannot configure only one bucket or omit the project ID.
Only organization admins can access database configuration settings.
Security considerations
| Practice | Description |
|---|---|
| Uniform bucket-level access | Enable on both buckets for consistent IAM-based access control |
| Public access prevention | Enforce on both buckets to block any public access |
| Dedicated service account | Use a separate service account for Slateo storage (not the same SA used for BigQuery queries) |
| Cloud Audit Logs | Enable data access logging for monitoring |
| CMEK encryption | Use customer-managed keys for additional control over encryption at rest |
Troubleshooting
Access denied errors
- Check GCS storage status - Ensure the status shows Ready in Admin → Databases
- Verify service account permissions - The service account must have roles/storage.objectAdmin (or equivalent custom role) on both buckets
- Check bucket names match exactly (case-sensitive)
- Verify service account JSON key in Slateo matches the account with permissions
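To see which bindings are actually present on a bucket:
# List IAM bindings on the uploads bucket; repeat for the cache bucket
gcloud storage buckets get-iam-policy gs://{company}-slateo-uploads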
Browser download failures (query results)
If you see CORS errors or TypeError: Failed to fetch when viewing query results:
- Verify CORS configuration is applied to the cache bucket (not uploads)
- Check origin values include https://*.slateo.ai and https://slateo.ai
- Verify methods include GET and HEAD
- Clear browser cache - The browser may have cached a failed CORS preflight. Try a hard refresh (Cmd+Shift+R / Ctrl+Shift+R) or an incognito window.
Run gsutil cors get gs://YOUR-CACHE-BUCKET to verify the configuration.
Signed URL errors
The service account must have the iam.serviceAccounts.signBlob permission, or the JSON key must include the private key so signing can happen locally. If using Workload Identity Federation instead of a JSON key, ensure the roles/iam.serviceAccountTokenCreator role is granted.
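If you need to grant the token creator role, one common pattern is a self-binding that lets the service account sign blobs as itself:
# Allow the service account to sign blobs via the IAM Credentials API
gcloud iam service-accounts add-iam-policy-binding SERVICE_ACCOUNT_EMAIL \
--member=serviceAccount:SERVICE_ACCOUNT_EMAIL \
--role=roles/iam.serviceAccountTokenCreator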
CMEK errors
- Verify the key exists and is enabled (not pending deletion)
- Ensure the key is in the same location as the bucket
- Check the service account has roles/cloudkms.cryptoKeyEncrypterDecrypter on the key
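To check the key's state and location from the CLI:
# Describe the key; confirm the primary version is ENABLED
gcloud kms keys describe {key-name} \
--project={your-project-id} \
--location={location} \
--keyring={keyring-name}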
General issues
- Confirm buckets exist in the specified project
- Check Cloud Audit Logs for detailed error information
- Contact support with your organization slug and error details
Frequently asked questions
Can I use a single bucket for both uploads and cache?
No. Both upload and cache bucket fields must be specified, and they should be separate buckets for proper isolation and different lifecycle policies.
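For example, you might attach a lifecycle rule that deletes stale cached results automatically; the 30-day age below is an arbitrary example, so choose a value that matches your retention policy. Save this as lifecycle.json:
{
  "rule": [
    {
      "action": {"type": "Delete"},
      "condition": {"age": 30}
    }
  ]
}
Apply it to the cache bucket:
gcloud storage buckets update gs://{company}-slateo-cache \
--lifecycle-file=lifecycle.json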
Can I reuse my BigQuery service account credentials?
Yes. If your BigQuery datasource is configured with a service account JSON key, you can leave the storage service account field empty and Slateo will use the BigQuery credentials. The BigQuery service account must have the required storage permissions on both buckets.
How do I verify the integration is working?
After configuration, run a query and check that the results are cached in your bucket. Query cache files are stored with the prefix {org-id}/query-cache/parquet/. You can also upload a CSV file and verify it appears under {org-id}/uploads/.
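Assuming each prefix lives in its corresponding bucket, a quick CLI spot-check looks like:
# List cached query results and uploaded files
gcloud storage ls gs://{company}-slateo-cache/{org-id}/query-cache/parquet/
gcloud storage ls gs://{company}-slateo-uploads/{org-id}/uploads/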
Can I migrate existing data to my buckets?
When you configure custom buckets, new data will automatically go to your buckets. Existing cached query results will remain in their previous location and become inaccessible through Slateo (you can re-run queries to regenerate them). Previously uploaded files will also need to be re-uploaded. Contact support if you need to discuss migration of historical data.
What locations are supported?
Any GCS location is supported. For lowest latency, place your buckets in the same location as your BigQuery datasets.
How does this affect my Google Cloud bill?
You will see GCS storage and operation charges in your Google Cloud project for data stored in your buckets. Standard GCS pricing applies.