Deploy Containerized Apps on Google Cloud Run

Google Cloud Run lets you run containers without managing servers. You deploy a Docker image, and Cloud Run handles scaling, HTTPS, and infrastructure automatically. In this tutorial, we'll build a Python Flask REST API, containerize it, and deploy it to Cloud Run.

Prerequisites

  • Google Cloud account with billing enabled
  • gcloud CLI installed and authenticated
  • Docker installed locally
  • Python 3.11+

Step 1: Create the Flask Application

Create a project directory and set up the application:

```shell
mkdir cloud-run-demo && cd cloud-run-demo
python -m venv venv && source venv/bin/activate
pip install flask gunicorn google-cloud-firestore
```

Create app.py:

```python
import os

from flask import Flask, jsonify, request
from google.cloud import firestore

app = Flask(__name__)

# Initialize Firestore client (auto-detects credentials on Cloud Run)
db = firestore.Client()


@app.route('/health')
def health():
    return jsonify({'status': 'healthy', 'service': 'cloud-run-demo'})


@app.route('/api/items', methods=['GET'])
def list_items():
    items_ref = db.collection('items')
    docs = items_ref.stream()
    items = [{'id': doc.id, **doc.to_dict()} for doc in docs]
    return jsonify({'items': items, 'count': len(items)})


@app.route('/api/items', methods=['POST'])
def create_item():
    data = request.get_json()
    if not data or 'name' not in data:
        return jsonify({'error': 'name is required'}), 400
    item = {
        'name': data['name'],
        'description': data.get('description', ''),
        'created_at': firestore.SERVER_TIMESTAMP
    }
    # add() returns an (update_time, DocumentReference) tuple
    _, doc_ref = db.collection('items').add(item)
    return jsonify({'id': doc_ref.id, 'message': 'created'}), 201


@app.route('/api/items/<item_id>', methods=['DELETE'])
def delete_item(item_id):
    db.collection('items').document(item_id).delete()
    return jsonify({'message': 'deleted'}), 200


if __name__ == '__main__':
    port = int(os.environ.get('PORT', 8080))
    app.run(host='0.0.0.0', port=port, debug=False)
```

Create requirements.txt:

```text
flask==3.0.0
gunicorn==21.2.0
google-cloud-firestore==2.14.0
```

Step 2: Write the Dockerfile

Create a multi-stage Dockerfile for a lean production image:

```dockerfile
# Build stage
FROM python:3.11-slim AS builder
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir --prefix=/install -r requirements.txt

# Production stage
FROM python:3.11-slim
WORKDIR /app

# Copy installed packages from builder
COPY --from=builder /install /usr/local
COPY . .

# Cloud Run sets PORT env var
ENV PORT=8080
EXPOSE 8080

# Use gunicorn for production
CMD exec gunicorn --bind :$PORT --workers 2 --threads 4 --timeout 120 app:app
```
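One sizing detail worth checking: gunicorn can serve at most workers × threads requests at once, and that ceiling should line up with whatever `--concurrency` you later set on the Cloud Run service, or requests will queue inside the container. A quick sanity check, using the worker and thread counts from the Dockerfile above:

```python
# Rough capacity check: gunicorn handles at most workers * threads
# simultaneous requests. Cloud Run's --concurrency should not exceed
# this number, or excess requests queue inside the container.
def gunicorn_capacity(workers: int, threads: int) -> int:
    """Maximum simultaneous requests one container instance can serve."""
    return workers * threads


capacity = gunicorn_capacity(workers=2, threads=4)
print(capacity)  # 8 simultaneous requests per instance

# With the Dockerfile's defaults, keep --concurrency at 8 or below,
# or raise --workers/--threads to match a higher concurrency target.
```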

Create .dockerignore:

```text
venv/
__pycache__/
*.pyc
.git/
.env
```

Step 3: Test Locally with Docker

```shell
# Build the image
docker build -t cloud-run-demo .

# Run locally. Note: firestore.Client() needs Application Default
# Credentials at startup; run `gcloud auth application-default login`
# and mount ~/.config/gcloud into the container if you want the
# Firestore-backed endpoints to work locally.
docker run -p 8080:8080 -e PORT=8080 cloud-run-demo

# Test the health endpoint
curl http://localhost:8080/health
# {"service":"cloud-run-demo","status":"healthy"}
```

Step 4: Deploy to Google Cloud Run

Set up your GCP project and deploy:

```shell
# Set project
export PROJECT_ID=your-project-id
gcloud config set project $PROJECT_ID

# Enable required APIs
gcloud services enable run.googleapis.com \
  containerregistry.googleapis.com \
  cloudbuild.googleapis.com \
  firestore.googleapis.com

# Build and push using Cloud Build (no local Docker needed)
gcloud builds submit --tag gcr.io/$PROJECT_ID/cloud-run-demo

# Deploy to Cloud Run
gcloud run deploy cloud-run-demo \
  --image gcr.io/$PROJECT_ID/cloud-run-demo \
  --platform managed \
  --region asia-northeast1 \
  --allow-unauthenticated \
  --memory 256Mi \
  --cpu 1 \
  --min-instances 0 \
  --max-instances 10 \
  --set-env-vars "GOOGLE_CLOUD_PROJECT=$PROJECT_ID"
```

Cloud Run outputs the service URL:

```text
Service [cloud-run-demo] revision [cloud-run-demo-00001] has been deployed
Service URL: https://cloud-run-demo-xxxxx-an.a.run.app
```

Step 5: Configure Auto-Scaling

Cloud Run scales based on concurrent requests. Fine-tune with:

```shell
gcloud run services update cloud-run-demo \
  --region asia-northeast1 \
  --concurrency 80 \
  --min-instances 1 \
  --max-instances 50 \
  --cpu-throttling
```

Key scaling parameters:

| Parameter | Description | Default |
|-----------|-------------|---------|
| --concurrency | Max requests per instance | 80 |
| --min-instances | Minimum warm instances | 0 |
| --max-instances | Maximum instances | 100 |
| --cpu-throttling | Reduce CPU when idle | Enabled |
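Because scaling is driven by in-flight requests, you can estimate how many instances a given load will need with Little's law: in-flight requests ≈ request rate × average latency, divided by per-instance concurrency. A sketch with hypothetical traffic numbers (not from any benchmark):

```python
import math


def instances_needed(requests_per_second: float,
                     avg_latency_seconds: float,
                     concurrency: int) -> int:
    """Little's law estimate: in-flight requests / requests per instance."""
    in_flight = requests_per_second * avg_latency_seconds
    return math.ceil(in_flight / concurrency)


# Hypothetical load: 2000 RPS at 200 ms average latency
print(instances_needed(2000, 0.2, concurrency=80))  # 5 instances
print(instances_needed(2000, 0.2, concurrency=10))  # 40 instances
```

Lowering concurrency multiplies the instance count (and cost) for the same traffic, so only reduce it when each request is CPU- or memory-heavy.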

Step 6: Set Up Custom Domain

```shell
# Map custom domain
gcloud run domain-mappings create \
  --service cloud-run-demo \
  --domain api.yourdomain.com \
  --region asia-northeast1

# Follow DNS instructions output by the command
# Add CNAME record: api.yourdomain.com -> ghs.googlehosted.com
```

Step 7: Add CI/CD with Cloud Build

Create cloudbuild.yaml for automatic deployments:

```yaml
steps:
  # Build the container image
  - name: 'gcr.io/cloud-builders/docker'
    args: ['build', '-t', 'gcr.io/$PROJECT_ID/cloud-run-demo:$COMMIT_SHA', '.']

  # Push the container image to Container Registry
  - name: 'gcr.io/cloud-builders/docker'
    args: ['push', 'gcr.io/$PROJECT_ID/cloud-run-demo:$COMMIT_SHA']

  # Deploy to Cloud Run
  - name: 'gcr.io/google.com/cloudsdktool/cloud-sdk'
    entrypoint: gcloud
    args:
      - 'run'
      - 'deploy'
      - 'cloud-run-demo'
      - '--image'
      - 'gcr.io/$PROJECT_ID/cloud-run-demo:$COMMIT_SHA'
      - '--region'
      - 'asia-northeast1'
      - '--platform'
      - 'managed'

images:
  - 'gcr.io/$PROJECT_ID/cloud-run-demo:$COMMIT_SHA'
```

Connect to a GitHub repository:

```shell
gcloud builds triggers create github \
  --repo-name=cloud-run-demo \
  --repo-owner=your-org \
  --branch-pattern="^main$" \
  --build-config=cloudbuild.yaml
```

Step 8: Monitoring and Logging

Cloud Run integrates with Cloud Monitoring out of the box:

```shell
# View recent logs
gcloud logging read "resource.type=cloud_run_revision \
  AND resource.labels.service_name=cloud-run-demo" \
  --limit 50 --format json

# Check metrics
gcloud run services describe cloud-run-demo \
  --region asia-northeast1 --format yaml
```

Add structured logging in your app:

```python
import json
import sys


def log(severity, message, **kwargs):
    entry = {
        'severity': severity,
        'message': message,
        **kwargs
    }
    print(json.dumps(entry), file=sys.stdout if severity != 'ERROR' else sys.stderr)


# Usage
log('INFO', 'Item created', item_id='abc123', user='anonymous')
```
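Because each line is a standalone JSON object, Cloud Logging promotes the severity field and keeps the extra keys as structured payload fields you can filter on. A quick way to verify the format locally is to capture a log line and parse it back:

```python
import contextlib
import io
import json
import sys


def log(severity, message, **kwargs):
    # Same helper as above: one JSON object per line, errors to stderr
    entry = {'severity': severity, 'message': message, **kwargs}
    print(json.dumps(entry), file=sys.stdout if severity != 'ERROR' else sys.stderr)


# Capture stdout and parse the line back, as Cloud Logging would
buf = io.StringIO()
with contextlib.redirect_stdout(buf):
    log('INFO', 'Item created', item_id='abc123')

entry = json.loads(buf.getvalue())
print(entry['severity'], entry['item_id'])  # INFO abc123
```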

Cost Optimization Tips

1. Set --min-instances to 0 for dev/staging — you only pay when requests come in
2. Use CPU throttling to reduce costs during idle periods
3. Right-size memory — start with 256Mi and increase only if needed
4. Use --cpu-boost for faster cold starts instead of keeping instances warm
5. Set --concurrency high (80-100) to maximize requests per instance
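The scale-to-zero point in tip 1 is easy to quantify: with min-instances at 0 you pay only for compute while requests are being handled. A back-of-the-envelope model, using placeholder per-second rates (the real rates are on the Cloud Run pricing page and vary by region and tier):

```python
# ASSUMED unit prices for illustration only -- check the Cloud Run
# pricing page for current per-region rates before relying on this.
CPU_PRICE_PER_VCPU_SECOND = 0.000024   # USD, placeholder
MEM_PRICE_PER_GIB_SECOND = 0.0000025   # USD, placeholder


def monthly_compute_cost(requests: int, avg_seconds: float,
                         vcpu: float, memory_gib: float) -> float:
    """Compute cost when billed only for request-handling time."""
    billable_seconds = requests * avg_seconds
    cpu_cost = billable_seconds * vcpu * CPU_PRICE_PER_VCPU_SECOND
    mem_cost = billable_seconds * memory_gib * MEM_PRICE_PER_GIB_SECOND
    return cpu_cost + mem_cost


# Hypothetical: 1M requests/month, 200 ms each, 1 vCPU, 256 MiB
print(monthly_compute_cost(1_000_000, 0.2, vcpu=1, memory_gib=0.25))
```

At this scale the bill stays in the single-digit-dollar range, which is why keeping min-instances at 0 for low-traffic environments matters more than any other tuning knob.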

Cloud Run vs. Other Options

| Feature | Cloud Run | Cloud Functions | GKE |
|---------|-----------|-----------------|-----|
| Container support | ✅ Any | Limited | ✅ Any |
| Cold start | ~1-2s | ~1-5s | None |
| Max timeout | 60 min | 9 min (gen1) | Unlimited |
| Auto-scale to zero | ✅ | ✅ | ❌ |
| Pricing | Per request | Per invocation | Per node |

Summary

Google Cloud Run provides a powerful middle ground between serverless functions and full Kubernetes. You get container flexibility with serverless simplicity — no cluster management, automatic HTTPS, and pay-per-use pricing. Perfect for APIs, microservices, and web applications that need to scale efficiently.