The Database Decision That Shaped CT Vision’s Operational Architecture

12 min read•June 08, 2026

Why Enterprise Video Intelligence Is Harder Than It Looks
Why the 'Just Use RDS' Answer Didn't Work for Us
The Schema Evolution Problem
What Changes When the Database Is Inside the Workspace
Inside the Schema: Twelve Tables, One Boundary
The Column Type That Lets AI Models Evolve Without Breaking the Database
What 'Deploy in Seconds' Actually Means
What Enterprise Teams Actually Get
Industry Case Studies: Where CT Vision Creates Value
Governance Isn't a Feature. It's the Architecture

Why Enterprise Video Intelligence Is Harder Than It Looks

Large enterprises are sitting on vast libraries of unstructured video footage: retail floor recordings, industrial inspection feeds, traffic monitoring, and field operations. The challenge is no longer capturing video. It is extracting consistent, auditable, and actionable intelligence from that footage at scale, in a way that meets enterprise compliance requirements.

Enterprises that attempt to build these pipelines internally run into three compounding problems:

Fragmented infrastructure

AI compute, storage, and operational databases live in different environments with different security perimeters, different credential lifecycles, and different governance models. Every additional system is another thing to provision, monitor, and secure.

Schema brittleness

AI models evolve continuously. KPI definitions change. Confidence scoring methodologies mature. Every model update risks breaking the data schema and triggering migration work before insights can resume.

Compliance friction

Enterprise contracts increasingly require data residency guarantees and end-to-end auditability. A pipeline that spans multiple cloud services and network boundaries makes those guarantees difficult to provide, and even harder to prove.

"The question was never whether we could extract insights from video. It was whether we could do it in a way that an enterprise security team would sign off on, and that an operations team could actually run."

Why the "Just Use RDS" Answer Didn't Work for Us

When we started designing CT Vision on Databricks, the first question we had to answer was: where does operational state live?

The default industry answer is an external managed database such as RDS, Cloud SQL, or Azure Database for PostgreSQL. For most applications, that is a perfectly reasonable choice. For CT Vision, it created a structural contradiction.

CT Vision is built entirely inside the Databricks workspace:

Unity Catalog for governed storage
Model Serving for AI inference
Vector Search for semantic retrieval
Jobs API for ML training workflows

Routing operational state through an external database would punch a hole through that boundary. Specifically, it would introduce:

A separate network boundary requiring VPC peering or private link configuration
A separate credential lifecycle, with connection strings stored as secrets and rotated on their own schedule
Cross-network latency on every transactional read and write
A second governance perimeter that Unity Catalog cannot see into
Infrastructure provisioning work that blocks every new customer deployment

For a platform where data residency and auditability are contractual requirements, this was architecturally unacceptable. The AI system needed to be secure end-to-end, not secure everywhere except the operational database.

The Schema Evolution Problem

There is a second, less obvious challenge specific to AI platforms: output schema instability.

When a language model extracts KPIs from video, the structure of that output changes over time as prompts improve, as new KPI types are added, and as confidence scoring methods evolve.

A normalized relational schema cannot absorb these changes without migration scripts. In a production environment serving multiple enterprise clients, every migration script is an operational risk event: a coordinated release with real downtime implications.

We needed a schema strategy that could keep pace with the AI layer without creating friction every time a model improved.

What Changes When the Database Is Inside the Workspace

Databricks Lakebase is a fully managed Postgres service that runs inside the Databricks workspace boundary. It is not a data warehouse. It is not an analytics layer. It is a fully transactional operational database, purpose-built for applications that need fast structured reads and writes, and that already live inside Databricks.

For CT Vision, this means every piece of operational state lives in a Lakebase instance that is co-resident with the compute, storage, and AI services the platform depends on. CT Vision composes its database connection from Databricks-provided Postgres environment variables at runtime. No external operational database is required.

The Structural Shift, Made Concrete

Lakebase eliminates the structural contradiction of the external database. Because it sits inside the workspace boundary:

Minimal network configuration by default

In the standard deployment path, there is no VPC peering, no private link setup, and no firewall rules to configure. The database is reachable from application compute the same way Unity Catalog storage is reachable. Enterprise teams that require additional private connectivity can still apply PrivateLink, IP allowlists, or network policies at the workspace layer.

No external database credential lifecycle

CT Vision composes its Postgres connection from Databricks-provided environment variables at runtime. There is no external database password to store or rotate. Authentication uses Databricks OAuth with managed token refresh, so the credential lifecycle stays within the Databricks-native model.
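As a rough illustration of composing a connection string at runtime, the sketch below builds a Postgres URL from libpq-style variables. It assumes the environment exposes `PGHOST`, `PGUSER`, `PGDATABASE`, and optionally `PGPORT` and `PGSSLMODE`, and that a short-lived OAuth token stands in for the password; the function name `build_database_url` and those variable names are illustrative assumptions, not CT Vision's actual code.

```python
from urllib.parse import quote


def build_database_url(env, token):
    """Compose a Postgres URL from libpq-style variables (names are assumptions)."""
    host = env["PGHOST"]
    user = env["PGUSER"]
    database = env["PGDATABASE"]
    port = env.get("PGPORT", "5432")
    sslmode = env.get("PGSSLMODE", "require")
    # User names and tokens may contain characters such as "@" or "/",
    # so both are percent-encoded before being placed in the URL.
    return (
        f"postgresql://{quote(user, safe='')}:{quote(token, safe='')}"
        f"@{host}:{port}/{database}?sslmode={sslmode}"
    )
```

In a real deployment the first argument would typically be `os.environ`, and the token would be refreshed before it expires rather than stored.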

No second governance perimeter

Security controls are Unity Catalog permissions and OAuth authentication, the same controls that govern the rest of the platform.

No infrastructure to provision per deployment

A new customer environment is a Databricks Apps deploy command. The database bootstraps itself.

Inside the Schema: Twelve Tables, One Boundary

CT Vision's Lakebase instance carries twelve operational tables, organized around the platform's core workflows:

Identity & Access

• users: Application identity, role (Admin / User / Viewer), bcrypt-hashed credentials. No external identity provider dependency in the base deployment.

Video Processing Pipeline

• bundles: Logical groupings of videos with JSONB KPI definitions. A kpi_version integer column enables staleness detection: a single comparison flags whether a video's analysis is current against the latest KPI set.

• videos: Upload metadata, UC Volume path, processing status (UPLOADING → PROCESSING → COMPLETED / FAILED), and uploader reference.

• processed_videos: AI pipeline output: summary text, structured events (JSONB array), KPI results (JSONB map of name → value + confidence + frame samples). JSONB absorbs prompt evolution without schema changes.

• notifications: Per-user processing events, progress percentages, read flags. Polled over REST, no message broker required.
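The staleness check described for bundles can be sketched as a single integer comparison. The example assumes that each processed video records the `kpi_version` it was analysed against; that field, and the class and function names here, are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass


@dataclass
class Bundle:
    id: int
    kpi_version: int  # bumped whenever the bundle's KPI definitions change


@dataclass
class ProcessedVideo:
    video_id: int
    bundle_id: int
    kpi_version: int  # assumption: version the analysis was produced against


def is_stale(bundle, processed):
    """An analysis is stale when it predates the bundle's current KPI set."""
    return processed.kpi_version < bundle.kpi_version


def stale_video_ids(bundle, processed_videos):
    """Return the videos in this bundle whose analysis should be re-run."""
    return [
        p.video_id
        for p in processed_videos
        if p.bundle_id == bundle.id and is_stale(bundle, p)
    ]
```

Because the comparison involves two integers only, the same check translates directly into a SQL predicate without inspecting the JSONB KPI definitions themselves.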

ML Training Lifecycle

• training_jobs: Full lifecycle tracking: Databricks run ID, state machine (QUEUED → RUNNING → COMPLETED / FAILED / CANCELLED), training configuration as JSONB, per-epoch metrics as JSONB array, full transition history. Every state change is auditable.
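A minimal sketch of the training job state machine with an auditable transition history follows. The states come from the list above; the exact set of allowed transitions (for example, cancelling a job that is still queued) and the dictionary layout are assumptions made for illustration.

```python
from datetime import datetime, timezone

# Assumed transition table; terminal states allow no further moves.
ALLOWED_TRANSITIONS = {
    "QUEUED": {"RUNNING", "FAILED", "CANCELLED"},
    "RUNNING": {"COMPLETED", "FAILED", "CANCELLED"},
    "COMPLETED": set(),
    "FAILED": set(),
    "CANCELLED": set(),
}


def transition(job, new_state, now=None):
    """Return a new job record with the state changed and the move logged."""
    current = job["state"]
    if new_state not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current} -> {new_state}")
    stamp = (now or datetime.now(timezone.utc)).isoformat()
    # The history list is what would be serialised into a JSONB column.
    history = job["history"] + [{"from": current, "to": new_state, "at": stamp}]
    return {**job, "state": new_state, "history": history}
```

Returning a new record rather than mutating the old one keeps each transition explicit, which is convenient when the history is written back in the same transaction as the state change.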

Annotation Studio

• datasets, dataset_images, annotations: Annotation shapes stored as JSONB arrays (bounding boxes, polygons), linked to images in UC Volumes.

• model_projects, rate_limits: Project groupings and per-user rate limit windows.

The Column Type That Lets AI Models Evolve Without Breaking the Database

The decision to store AI output in JSONB columns rather than normalized relational tables is one of the most consequential design choices in CT Vision's schema, and worth explaining carefully.

A bundle may define five KPIs today and eight next month. Each KPI has a name, a numeric or boolean value, a confidence score, and a per-frame sample array. Normalizing this into a kpi_values table would require a schema migration every time a client changes their KPI definitions. In a multi-tenant production environment, that is a coordinated release event with real operational risk.

With JSONB, the KPI structure is owned by the application layer. Clients add KPIs; the database absorbs the change without a migration. Crucially, this flexibility does not come at the cost of queryability.
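When the application layer owns the KPI structure, it also has to validate it. The sketch below checks a KPI results payload before it is written to a JSONB column. The key names `value`, `confidence`, and `frame_samples`, and the assumption that confidence lies between 0 and 1, are illustrative rather than taken from CT Vision's code.

```python
def validate_kpi_results(results):
    """Return a list of problems; an empty list means the payload is well-formed."""
    problems = []
    for name, entry in results.items():
        value = entry.get("value")
        if not isinstance(value, (bool, int, float)):
            problems.append(f"{name}: value must be numeric or boolean")
        confidence = entry.get("confidence")
        confidence_ok = (
            isinstance(confidence, (int, float))
            and not isinstance(confidence, bool)
            and 0.0 <= confidence <= 1.0
        )
        if not confidence_ok:
            problems.append(f"{name}: confidence must be a number in [0, 1]")
        if not isinstance(entry.get("frame_samples"), list):
            problems.append(f"{name}: frame_samples must be a list")
    return problems
```

Adding a new KPI is then a matter of adding one more key to the payload: the validator accepts it as long as the entry has the expected shape, and no table definition changes.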

PostgreSQL's JSONB indexing and path extraction operators mean CT Vision can query KPI results by key, filter events by type, and aggregate training metrics by epoch, all against JSONB columns, all through SQLAlchemy's parameterised query interface.
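To make the querying claim concrete, here is a sketch of a parameterised query that reads one KPI out of a JSONB column using PostgreSQL's `->` and `->>` operators. The column names `video_id` and `kpi_results`, the keys `value` and `confidence`, and the helper `kpi_query` are assumptions for illustration; the actual schema may differ.

```python
def kpi_query(kpi_name, min_confidence):
    """Build SQL and bind parameters for one KPI, filtered by confidence.

    The KPI name is passed as a bind parameter (:kpi), never interpolated
    into the SQL string.
    """
    sql = (
        "SELECT video_id, "
        "(kpi_results -> :kpi) ->> 'value' AS value, "
        "CAST((kpi_results -> :kpi) ->> 'confidence' AS numeric) AS confidence "
        "FROM processed_videos "
        "WHERE (kpi_results -> :kpi) IS NOT NULL "
        "AND CAST((kpi_results -> :kpi) ->> 'confidence' AS numeric) >= :min_confidence"
    )
    return sql, {"kpi": kpi_name, "min_confidence": min_confidence}


# With SQLAlchemy, the pair can be executed as:
#   from sqlalchemy import text
#   rows = connection.execute(text(sql), params).all()
```

Here `->` returns the nested JSONB object for the KPI and `->>` extracts a field as text, which is why the confidence is cast to `numeric` before comparison.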

With the right JSONB indexes and query design, CT Vision gets schema flexibility without forcing migrations and without giving up queryability. PostgreSQL makes both possible at the same time.

What "Deploy in Seconds" Actually Means

CT Vision's startup sequence shows what a workspace-native architecture looks like in practice. Every new environment bootstraps itself without manual intervention:

Uvicorn starts on DATABRICKS_APP_PORT
FastAPI lifespan initialises the SQLAlchemy engine, pointed at DATABASE_URL
create_all() bootstraps any missing tables on first run, no manual schema setup required
OAuth M2M token is fetched and cached
StorageProvider, SearchProvider, and VisionProcessor initialise
Background workers start and the application becomes ready

The Lakebase connection is a hard dependency at step two. If it fails, nothing else starts. This is intentional; the operational state is not optional. Because Lakebase is inside the workspace, the connection either works or it does not. There is no firewall rule to check, no peering connection to validate, no cross-region latency to troubleshoot.
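The fail-fast ordering can be sketched with an async context manager of the same shape that FastAPI accepts as its `lifespan` argument. To stay self-contained, this example uses only the standard library: `connect_database` is a hypothetical stand-in for creating the SQLAlchemy engine and calling `create_all()`, and the later steps are represented by names only.

```python
import asyncio
import os
from contextlib import asynccontextmanager
from types import SimpleNamespace


def connect_database(url):
    """Hypothetical stand-in for engine creation plus create_all()."""
    if not url:
        raise RuntimeError("DATABASE_URL is not set; refusing to start")
    return SimpleNamespace(url=url)


@asynccontextmanager
async def lifespan(app):
    app.state.started = []
    # Hard dependency: if this raises, none of the later steps run.
    app.state.engine = connect_database(os.environ.get("DATABASE_URL"))
    app.state.started.append("database")
    # Later steps (token fetch, providers, workers) are placeholders here.
    for step in ("oauth_token", "providers", "workers"):
        app.state.started.append(step)
    try:
        yield
    finally:
        app.state.started.clear()  # shutdown: release in one place


def boot(app):
    """Run startup once and report which steps completed."""
    async def _run():
        async with lifespan(app):
            return list(app.state.started)
    return asyncio.run(_run())
```

If the database step raises, the exception propagates out of startup and the list of completed steps stays empty, which mirrors the behaviour described above: without operational state, nothing else starts.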

What Enterprise Teams Actually Get

For Enterprise Buyers

The three concerns that most often slow or block enterprise AI procurement are data residency, auditability, and security posture. The Lakebase architecture addresses all three directly.

Data residency

All operational state, including user data, video metadata, extracted intelligence, and training histories, stays within the Databricks workspace. There is no external database endpoint to scope into a data processing agreement.

Auditability

Every video processing event, KPI extraction, training job transition, and user notification is a PostgreSQL row with timestamps and foreign keys. The audit trail is complete, queryable, and in the same database as the application data.

Security posture

One workspace, one security perimeter, one credential model. No external endpoints to scope into penetration testing or SOC 2 audits.

For Platform Operations Teams

Deployment speed

New environments deploy with a single command. No database infrastructure to provision, no VPC to configure, no firewall rules to open.

Operational simplicity

No external database to monitor, patch, or scale independently. Lakebase is managed infrastructure within the workspace.

Schema agility

As AI models improve and KPI definitions evolve, JSONB storage absorbs those changes at the application layer, significantly reducing migration pressure for AI output fields.

The Architecture Compared

Challenge	Traditional Approach	With Lakebase
Database deployment	VPC peering + firewall rules	No peering or firewall rules in the standard deployment path
Credential management	Secrets manager + rotation policy	Single DATABASE_URL at runtime
AI schema evolution	Migration scripts per release	JSONB reduces migration pressure for AI output fields
New environment setup	Hours of infrastructure provisioning	Fast startup via create_all() bootstrap
Audit & governance	Fragmented logging infrastructure	Timestamps + FK trail in the same DB

Industry Case Studies: Where CT Vision Creates Value

CT Vision is industry-agnostic at the platform layer and industry-specific at the agent layer. The same Lakebase-backed, Unity Catalog-governed architecture powers very different vision problems depending on the sector; only the agent pool changes. Here is where it lands, and the Celebal Databricks footprint that backs each.

Energy & Utilities: Asset inspection and grid reliability

Transmission and distribution operators inspect transformers, substations, and feeders that are geographically dispersed, safety-critical, and historically inspected by eye. Manual inspection is slow, subjective, and leaves blind spots, exactly the Industrial AI and critical-infrastructure monitoring problem Databricks frames for the sector.

CT Vision automates transformer and asset inspection: it detects rust, oil leaks, vegetation encroachment, and structural damage, applies a standardized A–E severity grade, auto-completes inspection forms, and pushes them to the system of record.

Celebal has already delivered automated transformer inspection and work-order automation for a North American electricity utility, and its wider energy-and-utilities Databricks footprint spans integrated oil & gas majors, national oil companies, power generators, and water-and-electricity utilities.

Manufacturing: Defect detection and quality forensics

On high-throughput lines, defects and safety events are caught late; quality assessment is inconsistent across shifts, and video and sensor data sits unused.

CT Vision's vision agents run defect detection, production monitoring, and quality-event forensics against live camera and sensor feeds, mapping to Databricks' Manufacturing Industrial AI use cases, and turn raw video into structured, searchable evidence.

Celebal has delivered computer-vision and industrial-AI engagements in manufacturing, including automated-guided vehicle vision automation at a bottled-beverage manufacturer and anomaly detection for a power-grid technology maker.

Public Sector & Transportation: Critical-infrastructure monitoring

Agencies and grid operators must monitor critical infrastructure and public spaces continuously but cannot staff a human on every feed.

CT Vision provides automated monitoring, blind-spot and anomaly detection, and intelligent alerting across infrastructure and transportation video, mapped to the Databricks Public Sector use cases for critical-infrastructure monitoring, predictive maintenance, and smart transportation.

Celebal's public-sector Databricks footprint includes grid operators and government giga-project authorities whose mandate is exactly this kind of monitoring.

Retail: Store, safety, and operations intelligence

Retailers capture floor and back-of-house footage but rarely convert it into operational insight on safety, compliance, or process.

CT Vision turns store and distribution-center video into operational intelligence: safety and compliance monitoring, process adherence, and incident detection. Those insights feed the same governed customer-and-operations estate retailers already run on Databricks.

Celebal operates large governed Databricks estates for grocery, fashion, and sportswear retailers; these estates are the customer-data backbone that CT Vision's operational insights extend.

Life Sciences & Healthcare: Inspection and imaging

Regulated manufacturing and clinical environments demand consistent visual inspection and auditable evidence.

CT Vision applies the same governed inspection pipeline to GxP manufacturing lines and image-classification workflows, with Unity Catalog lineage from frame to finding.

Celebal's life-sciences Databricks footprint spans global pharma, medical-device, and healthcare-provider organizations operating under strict audit requirements, where consistency and traceability are the whole point.

Across all five sectors the platform is identical: agentic vision on Databricks, Lakebase as the operational record, and Unity Catalog governing every frame and every finding. Only the agent pool changes.

Governance Isn't a Feature. It's the Architecture

The deepest principle behind CT Vision's use of Lakebase is not technical; it is organizational.

Enterprise AI platforms are judged not only on the quality of their insights, but on the credibility of their governance. A platform that punches holes in its own security boundary for database access has a governance story that is genuinely harder to tell, harder to document, harder to audit, and harder to sell.

Lakebase closes that gap. With compute, storage, AI inference, vector search, model training, and operational state all inside a single Databricks workspace, CT Vision can make a claim that matters to enterprise buyers: your data does not leave the perimeter.

For platform teams evaluating CT Vision: there is no database infrastructure to provision, configure, or secure outside Databricks. The platform is the workspace.

One workspace. One deployment boundary. Zero external infrastructure.

Interested in how CT Vision can work for your enterprise? Reach out at enterprisesales@celebaltech.com
