The Deployment Reality: Cloud vs. VPC vs. On-Premise Document AI

Uri Merhav

Updated Mar 31st, 2026 · 12 min read

Table of Contents

  • Why Multi-Tenant SaaS Fails Government and Healthcare Audits
  • Defining True Air-Gapped AI and DoD IL6 Requirements
  • The DocuPipe Data Residency Decision Matrix
  • Conclusion
The sales pitch for document AI is simple: upload documents to our API, get structured data back. What the pitch omits is where that processing actually happens, who has access during processing, and whether the deployment model survives a security audit.
For enterprise buyers in government, healthcare, and financial services, deployment architecture is not a technical detail. It is the primary evaluation criterion. A document AI solution with 99% accuracy is worthless if it cannot be deployed in a compliant environment.
This article examines why multi-tenant SaaS fails enterprise security assessments, what true air-gapped deployment requires, and how to map document types to appropriate deployment models using the DocuPipe Data Residency Decision Matrix.
For the broader context on enterprise document AI infrastructure, see the Enterprise Document AI Infrastructure hub article.

Why Multi-Tenant SaaS Fails Government and Healthcare Audits

Multi-tenant SaaS is the default deployment model for most document AI vendors. Multiple customers share the same infrastructure, the same processing queues, and often the same model instances. This architecture optimizes vendor economics at the expense of customer security.

Shared Infrastructure Creates Audit Failures

When a government agency or healthcare system conducts a vendor security assessment, they ask specific questions about data isolation:
  • Where are documents stored during processing?
  • Which other customers share that storage?
  • How is network traffic segmented between tenants?
  • Who at the vendor can access documents during support operations?
Multi-tenant vendors struggle to answer these questions satisfactorily. The honest answer is that documents from many customers flow through shared components. Logical isolation exists, but physical isolation does not.
For FedRAMP authorization, this matters enormously. The shared responsibility model requires clear boundaries between customer and vendor responsibilities. When infrastructure is shared, those boundaries blur. Auditors see risk.

Telemetry Aggregation Exposes Metadata

Even when document contents are encrypted, multi-tenant architectures leak metadata through shared telemetry. Logging systems capture processing events across all customers. Metrics dashboards aggregate performance data from all tenants. Alerting systems trigger on patterns that span customer boundaries.
This metadata exposure creates risks that encryption does not address:
  • Processing patterns reveal business cycles and document volumes
  • Error rates indicate document complexity and potential data quality issues
  • Schema usage exposes what types of data customers are extracting
  • API call patterns reveal integration architectures
Sophisticated adversaries do not need document contents. Metadata provides sufficient intelligence for competitive analysis, regulatory arbitrage, or targeted attacks.
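To make the metadata risk concrete, here is a minimal sketch of how shared telemetry alone, with no document contents, can reveal one tenant's monthly volume and error rate. The event fields (`tenant`, `timestamp`, `status`), the tenant names, and the sample values are all hypothetical; they do not represent any vendor's actual log schema.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical telemetry events from a shared logging pipeline.
# No document contents appear here, only processing metadata.
EVENTS = [
    {"tenant": "acme", "timestamp": "2026-01-03T09:15:00", "status": "ok"},
    {"tenant": "acme", "timestamp": "2026-01-28T17:40:00", "status": "error"},
    {"tenant": "acme", "timestamp": "2026-03-30T11:05:00", "status": "ok"},
    {"tenant": "acme", "timestamp": "2026-03-31T08:00:00", "status": "ok"},
    {"tenant": "acme", "timestamp": "2026-03-31T16:20:00", "status": "ok"},
    {"tenant": "globex", "timestamp": "2026-01-10T10:00:00", "status": "ok"},
]


def profile_tenant(events, tenant):
    """Summarize monthly volume and error rate for one tenant from metadata alone."""
    volume = defaultdict(int)
    errors = defaultdict(int)
    for event in events:
        if event["tenant"] != tenant:
            continue
        month = datetime.fromisoformat(event["timestamp"]).strftime("%Y-%m")
        volume[month] += 1
        if event["status"] == "error":
            errors[month] += 1
    return {
        month: {"documents": volume[month], "error_rate": errors[month] / volume[month]}
        for month in sorted(volume)
    }


if __name__ == "__main__":
    for month, stats in profile_tenant(EVENTS, "acme").items():
        print(month, stats)
```

Anyone with read access to the aggregated logs could run an equivalent query, which is why per-tenant telemetry isolation matters even when payloads are encrypted.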

Compliance Certification Gaps

Multi-tenant SaaS vendors typically hold baseline certifications like SOC 2 Type II and ISO 27001. These certifications validate that the vendor has security controls in place. They do not validate that those controls meet every customer's specific requirements.
Government agencies need FedRAMP authorization at Moderate or High baselines. Healthcare organizations need HIPAA Business Associate Agreements with specific technical safeguards. Financial institutions need controls aligned with FFIEC guidance and state banking regulations.
When a vendor's multi-tenant architecture cannot demonstrate the required isolation, certification gaps emerge. The vendor may be "working toward" FedRAMP, but authorization takes years. The vendor may sign a BAA, but their architecture does not support the required access controls.
These gaps force enterprise buyers to either accept risk or seek alternative solutions with compliant deployment models.

Defining True Air-Gapped AI and DoD IL6 Requirements

Air-gapping is the most misunderstood concept in secure deployment. Vendors claim air-gapped capabilities when they mean VPC isolation or private endpoints. These are not the same thing.

What Air-Gapping Actually Means

A true air-gapped system has no network path to external systems. This is physical isolation, not logical isolation. No firewall rules can create an air gap. No VPN can bridge an air gap without compromising it.
Air-gapped requirements include:
Physical network separation:
  • Dedicated network infrastructure with no connections to other networks
  • No routing tables that include external destinations
  • Physical verification that cables do not connect to external systems
Update mechanisms:
  • Software updates delivered via removable media
  • Cross-domain transfer procedures with content inspection
  • Manual verification of update integrity before installation
Operational constraints:
  • No wireless capabilities on any system component
  • No Bluetooth, WiFi, or cellular radios
  • USB ports disabled or physically removed except for approved transfers
Personnel requirements:
  • Cleared operators for classified environments
  • Multi-person integrity for sensitive operations
  • Access logging with physical verification
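The update-integrity step in the list above can be supported by a small script. This is a sketch that assumes the expected SHA-256 digest reaches the operator through a separate trusted channel, such as a signed manifest on different media. The function names are illustrative, and a script like this supplements the manual review rather than replacing it.

```python
import hashlib
import hmac


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large update bundles do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_update(bundle_path, expected_sha256):
    """Return True only if the bundle matches the digest obtained out of band."""
    actual = sha256_of(bundle_path)
    # Normalize case and whitespace, then compare in constant time.
    return hmac.compare_digest(actual, expected_sha256.strip().lower())
```

A bundle that fails this check should be rejected before it is mounted inside the isolated environment.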
For document AI, air-gapping means the entire processing stack runs within the isolated environment. OCR models, extraction models, validation logic, databases, and user interfaces must all operate without any external dependencies.
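One way to catch hidden external dependencies before deployment is to audit every endpoint the stack is configured to call. The sketch below makes several assumptions: the configuration is a flat mapping of service names to URLs, and internal hostnames share a known suffix (`.enclave.local` is a made-up placeholder). A static check like this does not prove an air gap; physical verification is still required.

```python
import ipaddress
from urllib.parse import urlparse

# Hypothetical: hostname suffixes that resolve only inside the isolated enclave.
INTERNAL_SUFFIXES = (".enclave.local",)


def is_internal(endpoint):
    """True if the host is a private or loopback IP, or an allowlisted internal name."""
    host = urlparse(endpoint).hostname
    if host is None:
        return False
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal, so fall back to the hostname allowlist.
        return host.endswith(INTERNAL_SUFFIXES)
    return address.is_private or address.is_loopback


def external_dependencies(config):
    """Return the config keys whose endpoints would require leaving the enclave."""
    return sorted(key for key, endpoint in config.items() if not is_internal(endpoint))
```

For example, a config that points the extraction model at a public API hostname would be flagged, while a database at `db.enclave.local` or an OCR service at `10.0.4.12` would pass.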

DoD Impact Level Requirements

The Department of Defense defines Impact Levels (IL) that specify security requirements for cloud systems handling different data classifications.
IL4: Controlled Unclassified Information (CUI)
  • Deployment in FedRAMP Moderate authorized environments
  • Logical separation from commercial workloads
  • U.S. data centers with access limited to U.S. persons
  • Suitable for most unclassified DoD workloads
IL5: Higher Sensitivity CUI and Mission Data
  • Deployment in FedRAMP High authorized environments
  • Physically separated infrastructure from commercial cloud
  • National security background checks for all personnel
  • Dedicated government cloud regions (AWS GovCloud, Azure Government, Google Distributed Cloud)
IL6: Classified Information up to SECRET
  • Deployment in government-owned facilities or approved contractor facilities
  • Air-gapped networks with no external connectivity
  • Personnel with active SECRET clearances or higher
  • Continuous monitoring by government security operations
The path from IL4 to IL6 brings a steep increase in complexity and cost at each level. Organizations processing classified documents cannot use standard commercial cloud regions, regardless of the vendor's other certifications.

GovCloud Is Not Air-Gapped

A common misconception is that deploying in AWS GovCloud or Azure Government provides air-gapped security. This is incorrect.
GovCloud regions provide:
  • Physical separation from commercial cloud regions
  • Access restricted to vetted U.S. entities
  • Infrastructure operated by vetted U.S. persons
  • FedRAMP High authorization for the underlying platform
GovCloud regions do not provide:
  • Network isolation from the internet
  • Freedom from external dependencies
  • Air-gapped operational characteristics
  • Automatic IL6 authorization
GovCloud is appropriate for IL4 and IL5 workloads. IL6 workloads require true air-gapped deployment in government-owned or approved contractor facilities.
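The Impact Level summary above can be encoded as a small lookup so that intake tooling rejects a mismatched environment early. This is a sketch: the environment labels paraphrase this article's summary and are not official DoD wording, and the function names are illustrative.

```python
from enum import IntEnum


class ImpactLevel(IntEnum):
    IL4 = 4
    IL5 = 5
    IL6 = 6


# Labels paraphrase the article's summary; they are not official DoD terms.
REQUIRED_ENVIRONMENT = {
    ImpactLevel.IL4: "FedRAMP Moderate authorized cloud",
    ImpactLevel.IL5: "FedRAMP High authorized government cloud region",
    ImpactLevel.IL6: "air-gapped deployment in a government or approved contractor facility",
}


def required_environment(level):
    """Look up the minimum environment for an Impact Level."""
    return REQUIRED_ENVIRONMENT[ImpactLevel(level)]


def govcloud_is_sufficient(level):
    """GovCloud regions cover IL4 and IL5 but never IL6."""
    return ImpactLevel(level) <= ImpactLevel.IL5
```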

The DocuPipe Data Residency Decision Matrix

[Figure: Cloud vs on-premise deployment comparison]
Not every document requires air-gapped processing. Over-securing low-risk documents wastes resources and creates operational friction. Under-securing high-risk documents creates compliance violations and potential breaches.
The Data Residency Decision Matrix maps document characteristics to appropriate deployment models.

Document Classification Criteria

Evaluate each document type against these criteria:
Data sensitivity:
  • Public: Information available to anyone
  • Internal: Information restricted to organization members
  • Confidential: Information restricted to specific roles
  • Classified: Information requiring government security clearances
Regulatory scope:
  • Unregulated: No specific compliance requirements
  • Industry regulated: Subject to HIPAA, GLBA, PCI-DSS, or similar
  • Government regulated: Subject to FedRAMP, ITAR, EAR, or similar
  • Classified: Subject to national security classification guidelines
Breach impact:
  • Low: Embarrassment but no material harm
  • Medium: Financial penalties or operational disruption
  • High: Regulatory action, litigation, or significant financial loss
  • Critical: National security impact or existential business risk

Deployment Model Mapping

Based on classification criteria, map documents to deployment models:
Cloud SaaS (multi-tenant):
  • Public data with no regulatory requirements
  • Internal data with low breach impact
  • Documents where processing speed and cost efficiency are primary concerns
  • Example: Marketing materials, public filings, general correspondence
Dedicated Cloud (single-tenant):
  • Confidential data with industry regulation
  • Internal data with medium breach impact
  • Documents requiring audit trails but not physical isolation
  • Example: Customer contracts, employee records, financial transactions
VPC Deployment (customer-controlled):
  • Confidential data with government regulation
  • Any data with high breach impact
  • Documents requiring customer-managed encryption keys
  • Example: Protected health information, personally identifiable information, trade secrets
On-Premise Deployment:
  • Classified or controlled data
  • Data subject to data sovereignty requirements
  • Documents that cannot leave organizational control under any circumstances
  • Example: Government contracts, defense-related technical data, classified intelligence
Air-Gapped Deployment:
  • Classified information at SECRET or above
  • Documents requiring IL6 processing
  • Materials subject to SCIF requirements
  • Example: Intelligence reports, weapons system documentation, compartmented programs
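The mapping above can be expressed as a small rule function. This is a sketch rather than DocuPipe's implementation: the type names, the string labels, and the two boolean flags are assumptions. Where the matrix leaves a combination unspecified (for example, internal data under government regulation), the code picks the stricter model, which is also an assumption.

```python
from dataclasses import dataclass
from enum import IntEnum


class Sensitivity(IntEnum):
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    CLASSIFIED = 3


class Regulation(IntEnum):
    UNREGULATED = 0
    INDUSTRY = 1
    GOVERNMENT = 2
    CLASSIFIED = 3


class Impact(IntEnum):
    LOW = 0
    MEDIUM = 1
    HIGH = 2
    CRITICAL = 3


@dataclass(frozen=True)
class DocumentProfile:
    sensitivity: Sensitivity
    regulation: Regulation
    impact: Impact
    secret_or_above: bool = False    # SECRET classification, IL6, or SCIF handling
    sovereignty_bound: bool = False  # may never leave organizational control


def deployment_model(doc):
    """Return the least restrictive model that satisfies every criterion."""
    if doc.secret_or_above:
        return "air-gapped"
    if (doc.sensitivity == Sensitivity.CLASSIFIED
            or doc.regulation == Regulation.CLASSIFIED
            or doc.sovereignty_bound):
        return "on-premise"
    # Assumption: any government-regulated data is treated as VPC-grade.
    if doc.impact >= Impact.HIGH or doc.regulation >= Regulation.GOVERNMENT:
        return "vpc"
    if (doc.sensitivity >= Sensitivity.CONFIDENTIAL
            or doc.impact >= Impact.MEDIUM
            or doc.regulation >= Regulation.INDUSTRY):
        return "dedicated-cloud"
    return "cloud-saas"
```

Under these rules, marketing material (public, unregulated, low impact) maps to `cloud-saas`, while a customer contract (confidential, industry regulated, medium impact) maps to `dedicated-cloud`.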

DocuPipe Deployment Options

DocuPipe supports the full spectrum of deployment models:
Cloud processing for organizations comfortable with managed infrastructure and baseline certifications.
VPC deployment for organizations requiring customer-controlled environments with private endpoints and customer-managed keys.
On-premise deployment for organizations requiring complete infrastructure ownership with no external dependencies.
Air-gapped deployment for government and defense organizations requiring true network isolation with containerized delivery of the complete processing stack.
Each deployment model provides the same extraction capabilities. The differences are in security controls, operational procedures, and infrastructure ownership.

Conclusion

Deployment architecture determines whether document AI can be used at all in enterprise environments. Accuracy benchmarks and feature comparisons are irrelevant if the solution cannot be deployed in a compliant configuration.
Organizations should evaluate document AI vendors on deployment capabilities before evaluating extraction quality. A vendor who cannot meet deployment requirements is not a viable option, regardless of other attributes.

VPC deployment provides network isolation through private endpoints and customer-managed infrastructure within a cloud environment, which suits IL4/IL5 workloads. Air-gapped deployment requires true physical network separation with no routable path to external systems, software updates via removable media, and often cleared personnel. It is required for IL6 and classified information processing.

GovCloud is not air-gapped. It provides physical separation from commercial regions and restricts access to U.S. entities, but GovCloud systems still have internet connectivity and external dependencies. True air-gapping requires government-owned or approved contractor facilities with no network path to external systems.

To choose a deployment model, use the Data Residency Decision Matrix. Evaluate each document type by data sensitivity (public to classified), regulatory scope (unregulated to classified), and breach impact (low to critical). Then match these classifications to deployment models: cloud SaaS for low-risk documents, VPC for confidential data, and on-premise or air-gapped for classified or sovereignty-constrained documents.
