Deploy Containerized Apps on Google Cloud Run
Google Cloud Run lets you run containers without managing servers. You deploy a Docker image, and Cloud Run handles scaling, HTTPS, and infrastructure automatically. In this tutorial, we'll build a Python Flask REST API, containerize it, and deploy it to Cloud Run.
Prerequisites
- Google Cloud account with billing enabled
- gcloud CLI installed and authenticated
- Docker installed locally
- Python 3.11+
Step 1: Create the Flask Application
Create a project directory and set up the application:
mkdir cloud-run-demo && cd cloud-run-demo
python -m venv venv && source venv/bin/activate
pip install flask gunicorn google-cloud-firestore
Create app.py:
import os

from flask import Flask, jsonify, request
from google.cloud import firestore

app = Flask(__name__)

# Initialize Firestore client (auto-detects credentials on Cloud Run)
db = firestore.Client()

@app.route('/health')
def health():
    return jsonify({'status': 'healthy', 'service': 'cloud-run-demo'})

@app.route('/api/items', methods=['GET'])
def list_items():
    items_ref = db.collection('items')
    docs = items_ref.stream()
    items = [{'id': doc.id, **doc.to_dict()} for doc in docs]
    return jsonify({'items': items, 'count': len(items)})

@app.route('/api/items', methods=['POST'])
def create_item():
    # silent=True returns None for a missing or malformed JSON body,
    # so the check below answers with a 400 instead of Flask's default error
    data = request.get_json(silent=True)
    if not data or 'name' not in data:
        return jsonify({'error': 'name is required'}), 400
    item = {
        'name': data['name'],
        'description': data.get('description', ''),
        'created_at': firestore.SERVER_TIMESTAMP
    }
    # add() returns an (update_time, document_reference) tuple
    _, doc_ref = db.collection('items').add(item)
    return jsonify({'id': doc_ref.id, 'message': 'created'}), 201

@app.route('/api/items/<item_id>', methods=['DELETE'])
def delete_item(item_id):
    db.collection('items').document(item_id).delete()
    return jsonify({'message': 'deleted'}), 200

if __name__ == '__main__':
    port = int(os.environ.get('PORT', 8080))
    app.run(host='0.0.0.0', port=port, debug=False)
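The payload check in create_item can be pulled out into a pure function so it can be unit-tested without a Firestore connection. The sketch below is one way to do that; `validate_item` is a hypothetical helper that is not part of the app above, and the rule that the name must be a non-empty string is an added assumption.

```python
from typing import Any, Optional, Tuple


def validate_item(data: Any) -> Tuple[Optional[dict], Optional[str]]:
    """Return (item, None) for a valid payload, or (None, error message).

    Hypothetical helper mirroring the 'name is required' check in create_item.
    """
    if not isinstance(data, dict) or 'name' not in data:
        return None, 'name is required'
    name = data['name']
    # Assumption beyond the original check: name must be a non-empty string.
    if not isinstance(name, str) or not name.strip():
        return None, 'name must be a non-empty string'
    item = {
        'name': name.strip(),
        'description': data.get('description', ''),
    }
    return item, None
```

Inside create_item this could be called as `item, error = validate_item(request.get_json(silent=True))`, returning a 400 response when `error` is set and adding `created_at` before the write otherwise.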
Create requirements.txt:
flask==3.0.0
gunicorn==21.2.0
google-cloud-firestore==2.14.0
Step 2: Write the Dockerfile
Create a multi-stage Dockerfile for a lean production image:
# Build stage
FROM python:3.11-slim AS builder
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir --prefix=/install -r requirements.txt
# Production stage
FROM python:3.11-slim
WORKDIR /app
# Copy installed packages from builder
COPY --from=builder /install /usr/local
COPY . .
# Cloud Run sets PORT env var
ENV PORT=8080
EXPOSE 8080
# Use gunicorn for production
CMD exec gunicorn --bind :$PORT --workers 2 --threads 4 --timeout 120 app:app
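The gunicorn flags in the CMD line can also live in a Python config file, which keeps the Dockerfile short. The following is a minimal sketch, assuming a file named gunicorn.conf.py in the working directory (gunicorn reads a file of that name by default). The `port_from_env` helper is hypothetical; the other names are standard gunicorn settings with the same values as the flags above.

```python
# gunicorn.conf.py (assumed filename, placed next to app.py)
import os


def port_from_env(env=os.environ, default=8080):
    """Read the PORT variable that Cloud Run injects, falling back to 8080."""
    return int(env.get('PORT', default))


# Same values as the --bind, --workers, --threads and --timeout flags.
bind = f":{port_from_env()}"
workers = 2
threads = 4
timeout = 120
```

With this file in place the command could shrink to `CMD exec gunicorn app:app`. Note that 2 workers with 4 threads each means one container processes at most 8 requests at a time.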
Create .dockerignore:
venv/
__pycache__/
*.pyc
.git/
.env
Step 3: Test Locally with Docker
# Build the image
docker build -t cloud-run-demo .
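# Run locally. firestore.Client() looks for credentials at startup, so mount
# your Application Default Credentials (gcloud auth application-default login)
docker run -p 8080:8080 -e PORT=8080 \
  -e GOOGLE_CLOUD_PROJECT=your-project-id \
  -e GOOGLE_APPLICATION_CREDENTIALS=/tmp/adc.json \
  -v "$HOME/.config/gcloud/application_default_credentials.json:/tmp/adc.json:ro" \
  cloud-run-demo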
# Test the health endpoint
curl http://localhost:8080/health
# {"service":"cloud-run-demo","status":"healthy"}
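The curl check can also be scripted, which is handy once the same check needs to run against more than one environment. The sketch below uses only the standard library; the script name smoke_test.py and both function names are hypothetical.

```python
import json
import sys
import urllib.request


def is_healthy(body: bytes) -> bool:
    """Return True if the response body is JSON reporting status 'healthy'."""
    try:
        payload = json.loads(body)
    except ValueError:
        return False
    return isinstance(payload, dict) and payload.get('status') == 'healthy'


def check_health(base_url: str, timeout: float = 5.0) -> bool:
    """GET <base_url>/health and check both the status code and the body."""
    url = base_url.rstrip('/') + '/health'
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.status == 200 and is_healthy(resp.read())


if __name__ == '__main__' and len(sys.argv) > 1:
    # Example: python smoke_test.py http://localhost:8080
    ok = check_health(sys.argv[1])
    print('healthy' if ok else 'unhealthy')
    sys.exit(0 if ok else 1)
```

Keeping the body parsing in `is_healthy` separate from the network call means the parsing logic can be tested without a running container.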
Step 4: Deploy to Google Cloud Run
Set up your GCP project and deploy:
# Set project
export PROJECT_ID=your-project-id
gcloud config set project $PROJECT_ID
# Enable required APIs
gcloud services enable run.googleapis.com \
  containerregistry.googleapis.com \
  cloudbuild.googleapis.com \
  firestore.googleapis.com
# Build and push using Cloud Build (no local Docker needed)
gcloud builds submit --tag gcr.io/$PROJECT_ID/cloud-run-demo
# Deploy to Cloud Run
gcloud run deploy cloud-run-demo \
  --image gcr.io/$PROJECT_ID/cloud-run-demo \
  --platform managed \
  --region asia-northeast1 \
  --allow-unauthenticated \
  --memory 256Mi \
  --cpu 1 \
  --min-instances 0 \
  --max-instances 10 \
  --set-env-vars "GOOGLE_CLOUD_PROJECT=$PROJECT_ID"
Cloud Run outputs the service URL:
Service [cloud-run-demo] revision [cloud-run-demo-00001] has been deployed
Service URL: https://cloud-run-demo-xxxxx-an.a.run.app
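Once the service is live, the items endpoint can be exercised from Python as well as curl. The sketch below only builds the POST request and leaves sending it as a commented-out step; `build_create_request` is a hypothetical helper, and the URL is the placeholder from the sample output above, not a real service.

```python
import json
import urllib.request


def build_create_request(base_url: str, name: str,
                         description: str = '') -> urllib.request.Request:
    """Build (but do not send) a POST /api/items request with a JSON body."""
    body = json.dumps({'name': name, 'description': description}).encode('utf-8')
    return urllib.request.Request(
        url=base_url.rstrip('/') + '/api/items',
        data=body,
        headers={'Content-Type': 'application/json'},
        method='POST',
    )


# Placeholder URL from the sample output; substitute your own service URL.
req = build_create_request('https://cloud-run-demo-xxxxx-an.a.run.app', 'widget')
# To send it: urllib.request.urlopen(req, timeout=10)
```

The Content-Type header matters here because the Flask handler reads the body with `request.get_json()`, which expects a JSON content type.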
Step 5: Configure Auto-Scaling
Cloud Run scales based on concurrent requests. Fine-tune with:
gcloud run services update cloud-run-demo \
  --region asia-northeast1 \
  --concurrency 80 \
  --min-instances 1 \
  --max-instances 50 \
  --cpu-throttling
Key scaling parameters:
| Parameter | Description | Default |
|-----------|-------------|---------|
| --concurrency | Max requests per instance | 80 |
| --min-instances | Minimum warm instances | 0 |
| --max-instances | Maximum instances | 100 |
| --cpu-throttling | Allocate CPU only while requests are being processed | Enabled |
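To get a feel for how these parameters interact, the number of instances a given load needs can be estimated from the number of in-flight requests (request rate times average latency) divided by the per-instance concurrency. This is a back-of-the-envelope sketch only: the real autoscaler weighs other signals too, and `estimate_instances` and the traffic figures are hypothetical.

```python
import math


def estimate_instances(requests_per_second: float,
                       avg_latency_seconds: float,
                       concurrency: int = 80,
                       min_instances: int = 0,
                       max_instances: int = 100) -> int:
    """Rough instance count: in-flight requests / per-instance concurrency."""
    # Little's law: average in-flight requests = arrival rate * latency.
    in_flight = requests_per_second * avg_latency_seconds
    needed = math.ceil(in_flight / concurrency)
    # Clamp to the configured --min-instances / --max-instances bounds.
    return max(min_instances, min(needed, max_instances))


# Hypothetical load: 1000 requests/s at 250 ms average latency.
print(estimate_instances(1000, 0.25, concurrency=80, min_instances=1, max_instances=50))  # 4
print(estimate_instances(1000, 0.25, concurrency=1, min_instances=1, max_instances=50))   # 50 (capped)
```

The second call shows why a low concurrency setting is expensive: the same load would need 250 instances and is cut off at the configured maximum.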
Step 6: Set Up Custom Domain
# Map custom domain
gcloud run domain-mappings create \
  --service cloud-run-demo \
  --domain api.yourdomain.com \
  --region asia-northeast1
# Follow DNS instructions output by the command
# Add CNAME record: api.yourdomain.com -> ghs.googlehosted.com
Step 7: Add CI/CD with Cloud Build
Create cloudbuild.yaml for automatic deployments:
steps:
  # Build the container image
  - name: 'gcr.io/cloud-builders/docker'
    args: ['build', '-t', 'gcr.io/$PROJECT_ID/cloud-run-demo:$COMMIT_SHA', '.']
  # Push the container image to Container Registry
  - name: 'gcr.io/cloud-builders/docker'
    args: ['push', 'gcr.io/$PROJECT_ID/cloud-run-demo:$COMMIT_SHA']
  # Deploy to Cloud Run
  - name: 'gcr.io/google.com/cloudsdktool/cloud-sdk'
    entrypoint: gcloud
    args:
      - 'run'
      - 'deploy'
      - 'cloud-run-demo'
      - '--image'
      - 'gcr.io/$PROJECT_ID/cloud-run-demo:$COMMIT_SHA'
      - '--region'
      - 'asia-northeast1'
      - '--platform'
      - 'managed'
images:
  - 'gcr.io/$PROJECT_ID/cloud-run-demo:$COMMIT_SHA'
Connect to a GitHub repository:
gcloud builds triggers create github \
  --repo-name=cloud-run-demo \
  --repo-owner=your-org \
  --branch-pattern="^main$" \
  --build-config=cloudbuild.yaml
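The branch pattern is a regular expression, and the `^` and `$` anchors are what stop similarly named branches from triggering a deploy. The sketch below illustrates that with Python's `re` module. Cloud Build evaluates the pattern with its own regex engine, so this is an illustration of the anchoring only, and `triggers_build` is a hypothetical name.

```python
import re

# Same pattern as the --branch-pattern flag above.
BRANCH_PATTERN = re.compile(r'^main$')


def triggers_build(branch: str) -> bool:
    """Return True if the branch name matches the anchored pattern exactly."""
    return BRANCH_PATTERN.fullmatch(branch) is not None


for branch in ('main', 'main-hotfix', 'feature/main'):
    print(branch, triggers_build(branch))
```

Only `main` matches. Without the anchors, a pattern of plain `main` would match anywhere in the name and could also pick up `main-hotfix` and `feature/main`.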
Step 8: Monitoring and Logging
Cloud Run sends logs to Cloud Logging and metrics to Cloud Monitoring out of the box:
# View recent logs
gcloud logging read "resource.type=cloud_run_revision \
  AND resource.labels.service_name=cloud-run-demo" \
  --limit 50 --format json
# Inspect the service's configuration and status
gcloud run services describe cloud-run-demo \
  --region asia-northeast1 --format yaml
Add structured logging in your app:
import json
import sys

def log(severity, message, **kwargs):
    entry = {
        'severity': severity,
        'message': message,
        **kwargs
    }
    print(json.dumps(entry), file=sys.stdout if severity != 'ERROR' else sys.stderr)

# Usage
log('INFO', 'Item created', item_id='abc123', user='anonymous')
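The same JSON shape can be produced through the standard `logging` module, so that existing `logger.info(...)` calls emit structured entries without further changes. This is a minimal sketch under that assumption; `JsonFormatter` and `get_logger` are hypothetical names, and the `logger` field is an extra added for illustration.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Format each log record as a single-line JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            'severity': record.levelname,  # e.g. INFO, WARNING, ERROR
            'message': record.getMessage(),
            'logger': record.name,
        }
        if record.exc_info:
            entry['exception'] = self.formatException(record.exc_info)
        return json.dumps(entry)


def get_logger(name: str = 'cloud-run-demo') -> logging.Logger:
    """Return a logger that writes JSON lines to stdout."""
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid duplicate handlers on repeated calls
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
        logger.setLevel(logging.INFO)
        logger.propagate = False
    return logger


logger = get_logger()
logger.info('Item created: %s', 'abc123')
# {"severity": "INFO", "message": "Item created: abc123", "logger": "cloud-run-demo"}
```

Python's level names (DEBUG, INFO, WARNING, ERROR, CRITICAL) line up with the severity values used in the `log` helper above, so they can be passed through without a mapping.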
Cost Optimization Tips
1. Set min-instances to 0 for dev/staging — you only pay when requests come in
2. Use CPU throttling to reduce costs during idle periods
3. Right-size memory — start with 256Mi and increase only if needed
4. Use --cpu-boost for faster cold starts instead of keeping instances warm
5. Set concurrency high (80-100) to maximize requests per instance
Cloud Run vs. Other Options
| Feature | Cloud Run | Cloud Functions | GKE |
|---------|-----------|----------------|-----|
| Container support | ✅ Any | Limited | ✅ Any |
| Cold start | ~1-2s | ~1-5s | None |
| Max timeout | 60 min | 9 min (gen1) | Unlimited |
| Auto-scale to zero | ✅ | ✅ | ❌ |
| Pricing | Per request, plus CPU and memory time | Per invocation, plus compute time | Per node |
Summary
Google Cloud Run sits between serverless functions and full Kubernetes. You get container flexibility with serverless simplicity: no cluster management, automatic HTTPS, and pay-per-use pricing. It is a good fit for APIs, microservices, and web applications that need to scale efficiently.