# BaoLife Deployment Validation Checklist

Use this checklist to validate a BaoLife deployment on Google Cloud (Cloud Run, Cloud SQL, and Secret Manager) before, during, and after running `deploy-gcp.sh`.

## Pre-Deployment Checklist

### Prerequisites
- [ ] gcloud CLI installed and authenticated
- [ ] GCP project created with billing enabled
- [ ] Required IAM roles assigned (Cloud Run Admin, Cloud SQL Admin, Secret Manager Admin)
- [ ] Database password generated (strong, 16+ characters)
- [ ] `.env` files not committed to git
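
The last item can be checked mechanically by scanning the list of tracked files. The following is a minimal sketch: the helper name `find_tracked_env_files` is hypothetical, and the pattern only covers files named `.env` or `.env.*`.

```bash
# Hypothetical helper: reads file paths on stdin (one per line) and prints
# any that look like dotenv files (.env, .env.local, config/.env.prod, ...).
find_tracked_env_files() {
  # grep exits non-zero when nothing matches; "|| true" keeps that quiet.
  grep -E '(^|/)\.env(\.[^/]*)?$' || true
}

# Usage (assumes the current directory is inside the git repository):
#   git ls-files | find_tracked_env_files
# Empty output means no .env files are tracked.
```

Note that a deliberately committed template such as `.env.example` would also be listed, so review the output rather than treating every hit as a leak.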

### Code Preparation
- [ ] All tests passing locally
- [ ] Code reviewed and approved
- [ ] Version/tag created (for production)
- [ ] Dockerfile present in `ws/` directory
- [ ] No hardcoded secrets in code

## Deployment Execution Checklist

### Initial Deployment

**Development Environment:**
```bash
./deploy-gcp.sh \
  --project YOUR_PROJECT_ID \
  --environment dev \
  --db-password "YOUR_DEV_PASSWORD"
```

- [ ] Script completes without errors
- [ ] All APIs enabled successfully
- [ ] Cloud SQL instance created (`baolife-db-dev`)
- [ ] Database and user created
- [ ] Secrets stored in Secret Manager (`dev-db-password`, `dev-jwt-secret`)
- [ ] Docker image built and pushed
- [ ] Cloud Run service deployed (`baolife-backend-dev`)
- [ ] Service URL displayed in output

**Production Environment:**
```bash
./deploy-gcp.sh \
  --project YOUR_PROJECT_ID \
  --environment production \
  --db-password "YOUR_PROD_PASSWORD"
```

- [ ] Script completes without errors
- [ ] All APIs enabled successfully
- [ ] Cloud SQL instance created (`baolife-db-prod`)
- [ ] Database and user created
- [ ] Secrets stored in Secret Manager (`production-db-password`, `production-jwt-secret`)
- [ ] Docker image built and pushed
- [ ] Cloud Run service deployed (`baolife-backend-prod`)
- [ ] Service URL displayed in output
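
The two lists above imply a naming convention: the Cloud SQL instance and Cloud Run service use a short suffix (`dev`, `prod`), while the secrets use the full `--environment` value as a prefix. The sketch below makes that mapping explicit. The function name is hypothetical, and the mapping is inferred from the names in this checklist, not read from `deploy-gcp.sh`.

```bash
# Hypothetical helper: prints the resource names this checklist expects for
# an environment. Inferred from this document, not from deploy-gcp.sh.
expected_resource_names() {
  local env_name="$1" suffix
  case "$env_name" in
    dev)        suffix="dev" ;;
    production) suffix="prod" ;;
    *) echo "unknown environment: $env_name" >&2; return 1 ;;
  esac
  echo "sql_instance=baolife-db-${suffix}"
  echo "run_service=baolife-backend-${suffix}"
  echo "db_secret=${env_name}-db-password"
  echo "jwt_secret=${env_name}-jwt-secret"
}

# Usage:
#   expected_resource_names production
```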

### Update Deployment

**Development:**
```bash
./deploy-gcp.sh \
  --project YOUR_PROJECT_ID \
  --environment dev \
  --skip-db \
  --skip-secrets
```

- [ ] Script completes without errors
- [ ] Docker image rebuilt with latest code
- [ ] New Cloud Run revision created
- [ ] Traffic switched to new revision
- [ ] Old revision still available for rollback

**Production:**
```bash
./deploy-gcp.sh \
  --project YOUR_PROJECT_ID \
  --environment production \
  --skip-db \
  --skip-secrets
```

- [ ] Script completes without errors
- [ ] Docker image rebuilt with latest code
- [ ] New Cloud Run revision created
- [ ] Traffic switched to new revision
- [ ] Old revision still available for rollback
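
To confirm that an update produced a working revision, one option is to compare the service's latest created revision with its latest ready revision. This is a sketch with a hypothetical helper name; it checks readiness only, so the traffic split still needs to be confirmed separately (for example in the `status.traffic` section of the describe output).

```bash
# Hypothetical helper: succeeds when the most recently created revision of a
# Cloud Run service is also the latest one that became ready.
# Arguments: service name, region (defaults to us-central1 as in this checklist).
new_revision_is_ready() {
  local service="$1" region="${2:-us-central1}" created ready
  created=$(gcloud run services describe "$service" --region="$region" \
    --format="value(status.latestCreatedRevisionName)") || return 1
  ready=$(gcloud run services describe "$service" --region="$region" \
    --format="value(status.latestReadyRevisionName)") || return 1
  echo "latest created: $created, latest ready: $ready"
  [ -n "$created" ] && [ "$created" = "$ready" ]
}

# Usage:
#   new_revision_is_ready baolife-backend-dev
```

If the two names differ, the newest revision failed to start and Cloud Run is still serving an older one.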

## Post-Deployment Validation

### Service Health Checks

**Development:**
```bash
# Check service status
gcloud run services describe baolife-backend-dev --region=us-central1

# Check health (-I sends a HEAD request; use -i instead if /health only answers GET)
curl -I https://DEVELOPMENT_SERVICE_URL/health

# View recent logs
gcloud run services logs read baolife-backend-dev --region=us-central1 --limit=50
```

- [ ] Service status shows "Ready"
- [ ] Health endpoint returns 200 OK
- [ ] No error logs in recent output
- [ ] Service responding within acceptable latency

**Production:**
```bash
# Check service status
gcloud run services describe baolife-backend-prod --region=us-central1

# Check health (-I sends a HEAD request; use -i instead if /health only answers GET)
curl -I https://PRODUCTION_SERVICE_URL/health

# View recent logs
gcloud run services logs read baolife-backend-prod --region=us-central1 --limit=50
```

- [ ] Service status shows "Ready"
- [ ] Health endpoint returns 200 OK
- [ ] No error logs in recent output
- [ ] Service responding within acceptable latency
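
Because the dev service scales to zero, the first request after an idle period may hit a cold start. The sketch below retries the health check a few times before reporting failure. The helper name and the `HEALTH_RETRY_DELAY` variable are hypothetical; the `/health` path comes from the commands above.

```bash
# Hypothetical helper: requests a URL with GET and succeeds only on HTTP 200.
# Arguments: URL, number of attempts (default 5).
# HEALTH_RETRY_DELAY (seconds between attempts, default 2) is an assumed knob.
check_health() {
  local url="$1" attempts="${2:-5}" delay="${HEALTH_RETRY_DELAY:-2}" i=1 code
  while [ "$i" -le "$attempts" ]; do
    # -s: quiet, -o /dev/null: discard the body, -w: print only the status code
    code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$url") || true
    if [ "$code" = "200" ]; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    echo "attempt $i: HTTP ${code:-none}" >&2
    i=$((i + 1))
    if [ "$i" -le "$attempts" ]; then
      sleep "$delay"
    fi
  done
  return 1
}

# Usage:
#   check_health "https://DEVELOPMENT_SERVICE_URL/health"
```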

### Database Connectivity

**Development:**
```bash
# Connect to database
gcloud sql connect baolife-db-dev --user=baolife

# Then, at the mysql> prompt, run these test queries
SHOW DATABASES;
USE lifesim;
SHOW TABLES;
```

- [ ] Connection successful
- [ ] `lifesim` database exists
- [ ] Expected tables present
- [ ] User has correct permissions

**Production:**
```bash
# Connect to database
gcloud sql connect baolife-db-prod --user=baolife

# Then, at the mysql> prompt, run these test queries
SHOW DATABASES;
USE lifesim;
SHOW TABLES;
```

- [ ] Connection successful
- [ ] `lifesim` database exists
- [ ] Expected tables present
- [ ] User has correct permissions
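
The "expected tables present" item can be scripted by comparing the `SHOW TABLES` output against a list of required names. This is only a sketch: the helper name is hypothetical, the table names in the usage line are placeholders and not the real BaoLife schema, and it assumes a `mysql` client that can reach the instance non-interactively (for example through the Cloud SQL Auth Proxy).

```bash
# Hypothetical helper: reads table names on stdin (one per line) and prints
# each expected table (given as arguments) that is absent.
# Returns non-zero if any table is missing.
missing_tables() {
  local actual missing=0 table
  actual=$(cat)
  for table in "$@"; do
    # -x: whole-line match, -F: literal string, so names are compared exactly
    if ! printf '%s\n' "$actual" | grep -qxF -- "$table"; then
      echo "missing table: $table"
      missing=1
    fi
  done
  return "$missing"
}

# Usage (placeholder table names):
#   mysql -N -e 'SHOW TABLES' lifesim | missing_tables users sessions
```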

### Secret Manager Verification

**Development:**
```bash
# List secrets
gcloud secrets list --filter="name:dev-" --project=YOUR_PROJECT_ID

# Verify secret exists (don't view value)
gcloud secrets describe dev-db-password --project=YOUR_PROJECT_ID
gcloud secrets describe dev-jwt-secret --project=YOUR_PROJECT_ID
```

- [ ] `dev-db-password` secret exists
- [ ] `dev-jwt-secret` secret exists
- [ ] Each secret has at least one enabled version (so `latest` resolves)

**Production:**
```bash
# List secrets
gcloud secrets list --filter="name:production-" --project=YOUR_PROJECT_ID

# Verify secret exists (don't view value)
gcloud secrets describe production-db-password --project=YOUR_PROJECT_ID
gcloud secrets describe production-jwt-secret --project=YOUR_PROJECT_ID
```

- [ ] `production-db-password` secret exists
- [ ] `production-jwt-secret` secret exists
- [ ] Each secret has at least one enabled version (so `latest` resolves)
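
The per-secret `describe` calls above can be wrapped in a loop that covers both secrets for an environment. The helper name below is hypothetical; it relies only on `gcloud secrets describe`, so no secret value is ever printed.

```bash
# Hypothetical helper: confirms both secrets for an environment prefix exist.
# Arguments: environment prefix (dev or production), GCP project ID.
check_env_secrets() {
  local prefix="$1" project="$2" failed=0 name
  for name in db-password jwt-secret; do
    if gcloud secrets describe "${prefix}-${name}" --project="$project" >/dev/null 2>&1; then
      echo "ok: ${prefix}-${name}"
    else
      echo "MISSING: ${prefix}-${name}"
      failed=1
    fi
  done
  return "$failed"
}

# Usage:
#   check_env_secrets dev YOUR_PROJECT_ID
```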

### WebSocket Functionality

**Development:**
```bash
# Test WebSocket connection (using websocat or similar)
websocat wss://DEVELOPMENT_SERVICE_URL/

# Then type this test message into the open session and press Enter
{"type": "init", "userID": "test-user"}
```

- [ ] WebSocket connection established
- [ ] Server responds to init message
- [ ] Connection remains stable
- [ ] No unexpected disconnections

**Production:**
```bash
# Test WebSocket connection
websocat wss://PRODUCTION_SERVICE_URL/

# Then type this test message into the open session and press Enter
{"type": "init", "userID": "test-user"}
```

- [ ] WebSocket connection established
- [ ] Server responds to init message
- [ ] Connection remains stable
- [ ] No unexpected disconnections

### Performance Verification

**Development:**
```bash
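# Check resource limits, scaling bounds (minScale/maxScale annotations), and concurrency
gcloud run services describe baolife-backend-dev \
  --region=us-central1 \
  --format="yaml(spec.template.metadata.annotations, spec.template.spec.containerConcurrency, spec.template.spec.containers[0].resources.limits)"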
```

- [ ] Memory limit correct (512Mi for dev)
- [ ] CPU limit correct (1)
- [ ] Min instances correct (0 for dev)
- [ ] Max instances correct (3 for dev)
- [ ] Concurrency correct (40 for dev)

**Production:**
```bash
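# Check resource limits, scaling bounds (minScale/maxScale annotations), and concurrency
gcloud run services describe baolife-backend-prod \
  --region=us-central1 \
  --format="yaml(spec.template.metadata.annotations, spec.template.spec.containerConcurrency, spec.template.spec.containers[0].resources.limits)"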
```

- [ ] Memory limit correct (1Gi for prod)
- [ ] CPU limit correct (1)
- [ ] Min instances correct (1 for prod)
- [ ] Max instances correct (10 for prod)
- [ ] Concurrency correct (80 for prod)

### Environment Variables

**Development:**
```bash
# Check environment variables (secret-backed values appear as secretKeyRef references, not plaintext)
gcloud run services describe baolife-backend-dev \
  --region=us-central1 \
  --format="yaml(spec.template.spec.containers[0].env)"
```

- [ ] `ENVIRONMENT=dev` set correctly
- [ ] `DB_NAME=lifesim` set correctly
- [ ] `DB_USER=baolife` set correctly
- [ ] `DB_HOST` points to correct Cloud SQL socket
- [ ] `DB_PASSWORD` from `dev-db-password` secret
- [ ] `JWT_SECRET` from `dev-jwt-secret` secret

**Production:**
```bash
# Check environment variables (secret-backed values appear as secretKeyRef references, not plaintext)
gcloud run services describe baolife-backend-prod \
  --region=us-central1 \
  --format="yaml(spec.template.spec.containers[0].env)"
```

- [ ] `ENVIRONMENT=production` set correctly
- [ ] `DB_NAME=lifesim` set correctly
- [ ] `DB_USER=baolife` set correctly
- [ ] `DB_HOST` points to correct Cloud SQL socket
- [ ] `DB_PASSWORD` from `production-db-password` secret
- [ ] `JWT_SECRET` from `production-jwt-secret` secret
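
To verify that `DB_PASSWORD` and `JWT_SECRET` really come from Secret Manager, the YAML from the describe command can be piped through a small filter. This is a sketch with a hypothetical helper name; it assumes each environment entry begins with a `- name: VAR` line, which is how gcloud normally prints the list.

```bash
# Hypothetical helper: reads the env YAML on stdin and reports whether each
# variable named in the arguments is backed by a secretKeyRef.
# Assumes every env entry starts with "- name: VAR".
check_secret_backed() {
  awk -v wanted="$*" '
    BEGIN { n = split(wanted, names, " "); bad = 0 }
    /^[ \t]*- name:/ { current = $NF }          # start of a new env entry
    /secretKeyRef:/  { backed[current] = 1 }    # entry reads from a secret
    END {
      for (i = 1; i <= n; i++) {
        if (backed[names[i]]) print "ok: " names[i]
        else { print "NOT secret-backed: " names[i]; bad = 1 }
      }
      exit bad
    }'
}

# Usage:
#   gcloud run services describe baolife-backend-dev --region=us-central1 \
#     --format="yaml(spec.template.spec.containers[0].env)" \
#     | check_secret_backed DB_PASSWORD JWT_SECRET
```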

## Security Validation

### Access Control
- [ ] Cloud Run service allows unauthenticated access only if that is intended (otherwise authentication is configured)
- [ ] Cloud SQL only accessible via Cloud Run (no public IP if not needed)
- [ ] Secrets only accessible by Cloud Run service account
- [ ] IAM roles follow principle of least privilege

### Secret Management
- [ ] Database passwords are strong (16+ characters, mixed case, numbers, symbols)
- [ ] JWT secrets are randomly generated (32+ bytes)
- [ ] Secrets stored only in Secret Manager (not in git or plaintext environment variables)
- [ ] Different passwords used for dev and production
- [ ] No plaintext secrets in Cloud Run environment variables (sensitive values are referenced from Secret Manager)
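
The password and JWT criteria above can be checked or satisfied with a few lines of shell. Both helper names are hypothetical, and this is one possible approach, not necessarily what `deploy-gcp.sh` does.

```bash
# Hypothetical helper: succeeds when the password meets the criteria listed
# above (16+ characters, lower case, upper case, digit, symbol).
is_strong_password() {
  local pw="$1"
  [ "${#pw}" -ge 16 ] || return 1
  case "$pw" in *[[:lower:]]*) ;; *) return 1 ;; esac
  case "$pw" in *[[:upper:]]*) ;; *) return 1 ;; esac
  case "$pw" in *[[:digit:]]*) ;; *) return 1 ;; esac
  case "$pw" in *[![:alnum:]]*) ;; *) return 1 ;; esac
  return 0
}

# Hypothetical helper: prints 32 random bytes from /dev/urandom,
# base64-encoded (44 characters, no trailing newline).
generate_jwt_secret() {
  head -c 32 /dev/urandom | base64 | tr -d '\n'
}

# Usage:
#   is_strong_password "$DB_PASSWORD" || echo "password too weak" >&2
#   JWT_SECRET=$(generate_jwt_secret)
```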

### Network Security
- [ ] Cloud SQL private IP configured (if applicable)
- [ ] Cloud Run connects via Unix socket
- [ ] SSL/TLS enforced for all connections
- [ ] No unnecessary ports exposed

## Rollback Verification

### Rollback Capability Test

**Development:**
```bash
# List revisions
gcloud run revisions list --service=baolife-backend-dev --region=us-central1

# Note current and previous revision names
# Test rollback to previous revision (if needed)
gcloud run services update-traffic baolife-backend-dev \
  --to-revisions=PREVIOUS_REVISION=100 \
  --region=us-central1
```

- [ ] Previous revisions available
- [ ] Can roll back to previous revision
- [ ] Service remains healthy after rollback
- [ ] Can roll forward again

**Production:**
```bash
# List revisions
gcloud run revisions list --service=baolife-backend-prod --region=us-central1
```

- [ ] Previous revisions available
- [ ] Rollback procedure documented and tested in dev
- [ ] Production rollback only performed if critical issue found
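
The rollback and roll-forward steps from the development test can be wrapped in small helpers so the procedure is repeatable. This is a sketch: the function names are hypothetical, and the `--sort-by` key is an assumption about the revision resource layout, so confirm the ordering against the plain `revisions list` output before relying on it.

```bash
# Hypothetical helper: prints the name of the second-newest revision.
# The sort key is an assumption; verify the ordering on first use.
previous_revision() {
  local service="$1" region="${2:-us-central1}"
  gcloud run revisions list --service="$service" --region="$region" \
    --sort-by="~metadata.creationTimestamp" \
    --format="value(metadata.name)" --limit=2 | sed -n '2p'
}

# Sends 100% of traffic to the given revision (rollback).
rollback_to() {
  local service="$1" revision="$2" region="${3:-us-central1}"
  gcloud run services update-traffic "$service" \
    --to-revisions="${revision}=100" --region="$region"
}

# Sends 100% of traffic back to the latest ready revision (roll forward).
roll_forward() {
  local service="$1" region="${2:-us-central1}"
  gcloud run services update-traffic "$service" --to-latest --region="$region"
}

# Usage (dev only):
#   rollback_to baolife-backend-dev "$(previous_revision baolife-backend-dev)"
#   roll_forward baolife-backend-dev
```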

## Monitoring Setup

### Logging
- [ ] Cloud Logging enabled
- [ ] Logs accessible via console
- [ ] Log levels appropriate (INFO for prod, DEBUG for dev)
- [ ] No sensitive data in logs
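
For a quick look at recent errors without opening the console, Cloud Logging can be queried from the CLI. The helper name is hypothetical. The sketch prints `textPayload` only, so services that emit structured (JSON) logs will show empty message columns and need a different format expression.

```bash
# Hypothetical helper: prints recent ERROR-or-worse log entries for a Cloud Run
# service. Empty output means none were found in the time window.
# Arguments: service name, freshness window (default 1h).
recent_errors() {
  local service="$1" freshness="${2:-1h}"
  gcloud logging read \
    "resource.type=\"cloud_run_revision\" AND resource.labels.service_name=\"${service}\" AND severity>=ERROR" \
    --freshness="$freshness" --limit=20 \
    --format="value(timestamp, severity, textPayload)"
}

# Usage:
#   recent_errors baolife-backend-prod 30m
```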

### Metrics
- [ ] Cloud Monitoring enabled
- [ ] Request count tracked
- [ ] Request latency tracked
- [ ] Error rate tracked
- [ ] Instance count tracked

### Alerting (Production Only)
- [ ] Alert policy for error rate threshold
- [ ] Alert policy for latency threshold
- [ ] Alert policy for instance scaling issues
- [ ] Notification channels configured

## Cost Verification

### Development
```bash
# Check current resource configuration
gcloud run services describe baolife-backend-dev --region=us-central1
gcloud sql instances describe baolife-db-dev
```

- [ ] Resources match expected configuration
- [ ] Min instances set to 0 (scales to zero when idle)
- [ ] Database tier is db-f1-micro
- [ ] Storage size appropriate (10GB)
- [ ] Expected monthly cost: ~$7

### Production
```bash
# Check current resource configuration
gcloud run services describe baolife-backend-prod --region=us-central1
gcloud sql instances describe baolife-db-prod
```

- [ ] Resources match expected configuration
- [ ] Min instances set to 1 (always on)
- [ ] Database tier is db-g1-small or higher
- [ ] Storage size appropriate (20GB)
- [ ] Expected monthly cost: ~$40+ (depending on traffic)
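
The database tier and storage items can be read directly from the instance description. The helper name below is hypothetical. It does an exact match, so for production, where the checklist allows "`db-g1-small` or higher", pass whichever tier was actually chosen.

```bash
# Hypothetical helper: compares a Cloud SQL instance's tier and disk size
# with the expected values.
# Arguments: instance name, expected tier, expected storage in GB.
check_sql_instance() {
  local instance="$1" want_tier="$2" want_gb="$3" tier size
  tier=$(gcloud sql instances describe "$instance" \
    --format="value(settings.tier)") || return 1
  size=$(gcloud sql instances describe "$instance" \
    --format="value(settings.dataDiskSizeGb)") || return 1
  echo "$instance: tier=$tier storage=${size}GB"
  [ "$tier" = "$want_tier" ] && [ "$size" = "$want_gb" ]
}

# Usage:
#   check_sql_instance baolife-db-dev db-f1-micro 10
#   check_sql_instance baolife-db-prod db-g1-small 20
```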

## Documentation

- [ ] Deployment parameters documented
- [ ] Service URLs recorded
- [ ] Database connection details recorded
- [ ] Custom domain configured (if applicable)
- [ ] Team notified of deployment
- [ ] Runbook updated with any changes

## Sign-Off

**Deployed By:** _________________

**Date:** _________________

**Environment:** [ ] Development [ ] Staging [ ] Production

**Version/Tag:** _________________

**Service URL:** _________________

**Database Instance:** _________________

**Notes:**
_____________________________________________________________________________
_____________________________________________________________________________
_____________________________________________________________________________

## Troubleshooting Reference

If any checks fail, refer to the [Troubleshooting section in DEPLOYMENT.md](./DEPLOYMENT.md#troubleshooting).

Common issues:
- **Service not responding**: Check logs for errors, verify database connectivity
- **Database connection failed**: Verify Cloud SQL instance is running, check credentials
- **Secrets not found**: Ensure secrets created with correct environment prefix
- **Deployment timeout**: Check Cloud Build logs, verify Dockerfile is correct
- **High costs**: Review instance count, check for unexpected traffic, verify auto-scaling settings
