Google Cloud Platform (GCP) Connector
Integrate your workflows with Google Cloud Platform and leverage services like Cloud Storage, Cloud Functions, Pub/Sub, Firestore, and BigQuery—all the tools for building scalable cloud applications.
Overview
The GCP connector provides 17 operations across Google's cloud ecosystem. Whether you're storing data in Cloud Storage, running BigQuery analytics, triggering serverless functions, or managing real-time messaging, we've got you covered.
Authentication
GCP uses service accounts—think of them as robot users that can authenticate your workflows securely without needing human login credentials.
Option 1: Service Account (Recommended)
Create a service account in your GCP project and download the JSON key:
auth_type: service_account
project_id: "your-project-id"
credentials_json: "{{ credentials.gcp.service_account }}"
How to set it up:
- Go to Google Cloud Console > Service Accounts
- Click Create Service Account
- Give it a name like "DeepChain Connector"
- Grant necessary roles (Cloud Storage Admin, Cloud Functions Developer, etc.)
- Create a JSON key and paste it into the credentials_json field
Tip: Use the principle of least privilege—grant only the specific roles your workflows need, not "Owner" or "Editor".
Option 2: OAuth 2.0 (For User Impersonation)
If you need to access user-owned resources, use OAuth 2.0:
auth_type: oauth2
client_id: "your-client-id"
client_secret: "your-client-secret"
project_id: "your-project-id"
Available Operations
Cloud Storage (File Storage)
Store and retrieve files from Google's object storage:
| Operation | What It Does |
|---|---|
| listBuckets | List all storage buckets in your project |
| listObjects | List objects (files) in a bucket |
| getObject | Download a file |
| uploadObject | Upload a file |
| deleteObject | Delete a file |
Cloud Functions (Serverless Compute)
Trigger custom code without managing servers:
| Operation | What It Does |
|---|---|
| invokeFunction | Call a Cloud Function |
| listFunctions | List functions in your project |
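If you want to see what's deployed before invoking anything, listFunctions is a quick sanity check. A minimal sketch, assuming the connector takes a region parameter (the field name is an assumption; verify it against the operation reference):
- id: list_available_functions
  type: gcp_connector
  config:
    operation: listFunctions
    region: "us-central1"  # assumed parameter; use the region your functions are deployed in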
Pub/Sub (Real-Time Messaging)
Build scalable event-driven systems:
| Operation | What It Does |
|---|---|
| publish | Send a message to a topic |
| pull | Retrieve messages from a subscription |
| acknowledge | Mark messages as processed |
Firestore (NoSQL Database)
Store and query semi-structured data:
| Operation | What It Does |
|---|---|
| getDocument | Fetch a document by path |
| createDocument | Create a new document |
| updateDocument | Update a document |
| deleteDocument | Delete a document |
| query | Query a collection with filters |
BigQuery (Data Analytics)
Run SQL queries against massive datasets:
| Operation | What It Does |
|---|---|
| query | Execute a BigQuery SQL query |
| insertRows | Stream rows into a table |
Practical Examples
Example 1: Upload Exports to Cloud Storage
Archive daily reports with organized naming:
- id: export_to_gcs
  type: gcp_connector
  config:
    operation: uploadObject
    bucket: "company-exports"
    objectName: "reports/{{ formatDate(now(), 'yyyy/MM/dd') }}/report-{{ formatDate(now(), 'HH-mm-ss') }}.csv"
    content: "{{ json_stringify(input.data) }}"
    contentType: "text/csv"
Access your file later at:
gs://company-exports/reports/2025/02/10/report-14-30-45.csv
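To read that export back in a later workflow, you can pair listObjects with getObject. A minimal sketch, assuming the connector supports a prefix filter and that step outputs are referenced via steps.<id>.output (both are assumptions; adapt to how your workflow passes data between nodes):
- id: find_todays_reports
  type: gcp_connector
  config:
    operation: listObjects
    bucket: "company-exports"
    prefix: "reports/{{ formatDate(now(), 'yyyy/MM/dd') }}/"  # assumed parameter for prefix filtering
- id: download_report
  type: gcp_connector
  config:
    operation: getObject
    bucket: "company-exports"
    objectName: "{{ steps.find_todays_reports.output.objects[0].name }}"  # assumed output shape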
Example 2: Run Complex Analytics with BigQuery
Query your data warehouse for insights:
- id: monthly_analytics
  type: gcp_connector
  config:
    operation: query
    sql: |
      SELECT
        DATE(event_timestamp) as date,
        event_name,
        COUNT(*) as total_events,
        COUNT(DISTINCT user_id) as unique_users
      FROM `{{ input.project_id }}.analytics.events`
      WHERE DATE(event_timestamp) = @query_date
      GROUP BY date, event_name
      ORDER BY total_events DESC
    parameters:
      query_date: "{{ input.date }}"
Example 3: Trigger a Cloud Function
Call a function to do heavy lifting like image processing:
- id: process_image
  type: gcp_connector
  config:
    operation: invokeFunction
    functionName: "image-thumbnails"
    data:
      imageUrl: "{{ input.source_image }}"
      sizes: [120, 240, 480]
      format: "webp"
Example 4: Publish Events to Pub/Sub
Send real-time events for other services to consume:
- id: notify_event
  type: gcp_connector
  config:
    operation: publish
    topic: "projects/{{ input.project_id }}/topics/user-events"
    message:
      data: "{{ base64_encode(json_stringify({
          eventType: 'order.created',
          orderId: input.order_id,
          customerId: input.customer_id,
          timestamp: now()
        })) }}"
      attributes:
        event_type: "order"
        priority: "high"
Pub/Sub expects the message body to be base64-encoded, which is why we wrap it in base64_encode().
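On the consuming side, pull and acknowledge complete the loop. A minimal sketch, assuming the connector exposes subscription, maxMessages, and ackIds fields and that step outputs are referenced via steps.<id>.output (all assumptions; check the operation reference for the exact names):
- id: pull_events
  type: gcp_connector
  config:
    operation: pull
    subscription: "projects/{{ input.project_id }}/subscriptions/user-events-sub"  # assumed subscription name
    maxMessages: 10  # assumed parameter
- id: ack_events
  type: gcp_connector
  config:
    operation: acknowledge
    subscription: "projects/{{ input.project_id }}/subscriptions/user-events-sub"
    ackIds: "{{ steps.pull_events.output.ackIds }}"  # assumed output shape
Pulled message bodies arrive base64-encoded, so decode each message's data field before parsing it as JSON.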
Example 5: Store Documents in Firestore
Save structured data in Firestore:
- id: save_user_profile
  type: gcp_connector
  config:
    operation: createDocument
    collection: "users"
    documentId: "{{ input.user_id }}"
    data:
      name: "{{ input.full_name }}"
      email: "{{ input.email }}"
      createdAt: "{{ now() }}"
      metadata:
        source: "{{ input.source }}"
        tags: "{{ input.tags }}"
Later, you can fetch the document back by ID:
- id: fetch_user
  type: gcp_connector
  config:
    operation: getDocument
    collection: "users"
    documentId: "{{ input.user_id }}"
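When you don't know the document ID, the query operation filters a collection by field values instead. A minimal sketch, assuming filters are expressed as field/operator/value triples and a limit parameter is supported (both assumed; the exact syntax may differ in the operation reference):
- id: find_users_by_source
  type: gcp_connector
  config:
    operation: query
    collection: "users"
    filters:  # assumed filter format
      - field: "metadata.source"
        op: "=="
        value: "{{ input.source }}"
    limit: 50  # assumed parameter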
Rate Limits
Google Cloud services scale automatically, but keep these default quotas in mind (check the quotas page in the GCP console for the current values in your project):
- Cloud Storage: 1 write or metadata update per second to any single object; read throughput is effectively unlimited
- Pub/Sub: 10,000 messages per second per topic (can be increased)
- Firestore: 10,000 writes per second per database
- BigQuery: 100 concurrent queries, 1 TB per query
Note: DeepChain handles retries and backoff automatically, so you typically won't hit these limits.
Error Handling
Common GCP Errors
| Error | What It Means | How to Fix |
|---|---|---|
| PERMISSION_DENIED | Your service account lacks permissions | Check IAM roles; add Cloud Storage Admin, Cloud Functions Developer, etc. |
| NOT_FOUND | Resource doesn't exist (bucket, function, etc.) | Verify the resource exists and is in the right project |
| ALREADY_EXISTS | You're trying to create a duplicate | Use a different name or update the existing resource |
| INVALID_ARGUMENT | Bad parameters or data format | Check field names and data types in the operation docs |
| RESOURCE_EXHAUSTED | You've hit a quota limit | Wait a bit or request a quota increase from GCP |
Debugging
Enable debug logging:
Node Configuration:
debug: true
logRequest: true
logResponse: true
Check the execution logs for the full request/response.
Best Practices
1. Organize Cloud Storage Like a Pro
GCS doesn't have real folders, but you can use naming conventions:
# Good organization by date and type
objectName: "exports/reports/{{ formatDate(now(), 'yyyy/MM/dd') }}/report.json"
objectName: "logs/{{ input.service }}/{{ formatDate(now(), 'yyyy-MM-dd') }}.log"
objectName: "backups/database/{{ formatDate(now(), 'yyyy-MM-dd-HHmmss') }}.sql"
2. Use Proper IAM Roles
Don't grant Editor or Owner roles. Use specific roles:
- Cloud Storage: roles/storage.objectAdmin (specific to your buckets)
- Cloud Functions: roles/cloudfunctions.developer
- BigQuery: roles/bigquery.dataEditor + roles/bigquery.jobUser
- Firestore: roles/datastore.user
3. Set Query Timeouts for BigQuery
BigQuery queries can take time. Set reasonable timeouts:
- id: big_query
  type: gcp_connector
  config:
    operation: query
    sql: "SELECT COUNT(*) FROM `project.dataset.huge_table`"
    timeoutMs: 300000  # 5 minutes
4. Handle Pub/Sub Message Format
Pub/Sub messages require base64 encoding. Always encode:
# Correct
message:
  data: "{{ base64_encode(json_stringify(input)) }}"

# Wrong—will fail
message:
  data: "{{ json_stringify(input) }}"
5. Batch BigQuery Inserts
Instead of inserting rows one by one, batch them:
- id: batch_insert
  type: gcp_connector
  config:
    operation: insertRows
    table: "project.dataset.events"
    rows:
      - event_id: "123"
        user_id: "456"
        timestamp: "{{ now() }}"
      - event_id: "789"
        user_id: "012"
        timestamp: "{{ now() }}"
This is much faster than individual inserts.
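In real workflows the rows usually come from upstream data rather than being hard-coded. If the connector accepts a templated array for rows (an assumption worth verifying against the operation reference), you can pass the whole batch through in one call:
- id: batch_insert_dynamic
  type: gcp_connector
  config:
    operation: insertRows
    table: "project.dataset.events"
    rows: "{{ input.events }}"  # assumed: the entire array is streamed as a single batch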
Next Steps
- Connectors Overview — Explore all connectors
- AWS Connector — Compare with AWS
- BigQuery Documentation — Master BigQuery SQL
- Cloud Functions Guide — Learn serverless on GCP
- Pub/Sub Best Practices — Design event-driven systems