Build a document storage entity

Step-by-step checklist for entityType: documentStorage — file libraries with metadata normalization, sync ingestion, and governed document capabilities.

Prerequisites

  • External system JSON (often type: openapi for SharePoint, file APIs, or DMS)
  • Resource type document (or domain-specific content key) in catalog
  • Document provider credentials in env.template (kv:// paths)

Where it lives

integration/<systemKey>/<systemKey>-datasource-documents.json (one entity type per file).

Document storage uses the same root required fields as records, with entityType: documentStorage.

How to set

  1. Copy a document-storage fixture (library/list API) and trim it. Make sure systemKey and the application manifest file list line up with the new datasource file.

  2. Set root identity:

{
  "key": "example-documents",
  "displayName": "Documents",
  "systemKey": "example-files",
  "entityType": "documentStorage",
  "resourceType": "document",
  "primaryKey": ["externalId"],
  "metadataSchema": {
    "type": "object",
    "properties": {
      "externalId": { "type": "string", "index": true },
      "name": { "type": "string", "index": true },
      "mimeType": { "type": "string", "index": true },
      "folderPath": { "type": "string", "index": true }
    }
  }
}
  3. Map file metadata in fieldMappings.attributes: include paths for web URL, modified time, owner email, and sensitivity labels. Materialize ABAC dimensions on indexed properties.

  4. Configure document sync, the ingestion pipeline (list/enumerate, download, metadata extract). Document sync differs from CRM record sync, so validate early:

aifabrix datasource validate <datasourceKey>

  5. Configure exposed with the document capabilities (document:read, document:search, and upload/delete when allowed).

  6. Optional vector path: if you need semantic search, consider entityType: vectorStore instead; see Build vector store business entity.

  7. Repair and validate the system:

aifabrix repair <systemKey> --expose
aifabrix validate <systemKey>
aifabrix upload <systemKey> --probe
aifabrix test-integration <systemKey>
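
The repair and validate sequence above can be scripted so that a failure stops the run before the next command. The following is a minimal sketch, not an official wrapper: it assumes the aifabrix CLI is on PATH and returns a non-zero exit code on failure (an assumption, not confirmed here). The function names are hypothetical.

```python
import subprocess
from typing import Callable, List, Sequence


def system_check_commands(system_key: str) -> List[List[str]]:
    """Return the repair/validate/upload/test-integration invocations, in order."""
    return [
        ["aifabrix", "repair", system_key, "--expose"],
        ["aifabrix", "validate", system_key],
        ["aifabrix", "upload", system_key, "--probe"],
        ["aifabrix", "test-integration", system_key],
    ]


def _run(cmd: Sequence[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(list(cmd)).returncode


def run_system_checks(
    system_key: str, run: Callable[[Sequence[str]], int] = _run
) -> List[str]:
    """Run the commands in order and stop at the first non-zero exit code.

    Assumption: a non-zero exit code means the step failed.
    Returns the commands (as strings) that completed successfully.
    """
    completed: List[str] = []
    for cmd in system_check_commands(system_key):
        if run(cmd) != 0:
            break
        completed.append(" ".join(cmd))
    return completed
```

Calling run_system_checks("example-files") runs the four commands against the example system key; passing a custom run callable makes the sequence testable without the CLI installed.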

Defaults and examples

Concern        documentStorage guidance
Join identity  metadataSchema.properties.externalId required
ABAC           Index folder path, site, owner, sensitivity on metadata properties
Foreign keys   Link documents to deals/customers via foreignKeys[] + protection
Sync           Document ingestion, not the same defaults as record pull
Capabilities   Prefer document:* permission prefix matching resourceType
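
To illustrate the ABAC row, here is a small sketch that lists which metadataSchema properties are indexed and reports the dimensions you expected but did not index. It is a local convenience check, not part of the aifabrix CLI. The function names and the ownerEmail property in the usage note are hypothetical.

```python
from typing import Any, Dict, List, Sequence


def indexed_properties(config: Dict[str, Any]) -> List[str]:
    """Names of metadataSchema properties that set "index": true, in file order."""
    schema = config.get("metadataSchema")
    props = schema.get("properties") if isinstance(schema, dict) else None
    if not isinstance(props, dict):
        return []
    return [
        name
        for name, spec in props.items()
        if isinstance(spec, dict) and spec.get("index") is True
    ]


def missing_dimensions(config: Dict[str, Any], wanted: Sequence[str]) -> List[str]:
    """Return the wanted ABAC dimensions that are not indexed properties."""
    indexed = set(indexed_properties(config))
    return [name for name in wanted if name not in indexed]
```

For the root identity example earlier on this page, missing_dimensions(config, ["folderPath", "ownerEmail"]) would report ownerEmail, because only externalId, name, mimeType, and folderPath are indexed there.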

Use documentStorage when files stay in the external system and metadata is normalized in Enterprise Knowledge. Use vectorStore when embeddings and RAG search are first-class.

Root identity band (same rules as recordStorage):

Field         documentStorage rule
key           Business Entity key
displayName   Operator label
systemKey     Parent system
entityType    documentStorage
resourceType  Usually document
externalId    Indexed join field in metadataSchema
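
The root identity rules in this table can be pre-checked locally before running the CLI validator. This is a rough sketch under stated assumptions: it checks only the fields listed in the table, the real validator may require more, and root_identity_problems is a hypothetical helper name.

```python
from typing import Any, Dict, List

# Root fields taken from the table above; the real schema may require others.
ROOT_FIELDS = ("key", "displayName", "systemKey", "entityType", "resourceType")


def root_identity_problems(config: Dict[str, Any]) -> List[str]:
    """Return human-readable problems with the root identity band."""
    problems: List[str] = []
    for field in ROOT_FIELDS:
        value = config.get(field)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"{field}: missing or empty")

    entity_type = config.get("entityType")
    if isinstance(entity_type, str) and entity_type.strip():
        if entity_type != "documentStorage":
            problems.append(
                f"entityType: expected documentStorage, got {entity_type}"
            )

    # externalId must exist in metadataSchema.properties and be indexed.
    schema = config.get("metadataSchema")
    props = schema.get("properties") if isinstance(schema, dict) else None
    external_id = props.get("externalId") if isinstance(props, dict) else None
    if not isinstance(external_id, dict):
        problems.append("metadataSchema.properties.externalId: missing")
    elif external_id.get("index") is not True:
        problems.append("metadataSchema.properties.externalId: not indexed")
    return problems
```

Load the datasource file with json.load and pass the resulting dict; an empty list means the table's rules hold for that file.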

Example sync band for document ingestion:

{
  "sync": {
    "enabled": true,
    "mode": "incremental",
    "operations": {
      "pull": {
        "enabled": true,
        "operationId": "listDocuments"
      }
    }
  }
}
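
Because a missing or incomplete sync block is a common cause of empty catalogs, a quick local check of the band above may help. The sketch below only inspects the keys shown in the example (enabled, operations.pull.enabled, operations.pull.operationId). It does not verify that the operationId exists in the external system definition, and sync_problems is a hypothetical name.

```python
from typing import Any, Dict, List


def sync_problems(config: Dict[str, Any]) -> List[str]:
    """Return problems found in the sync band of a datasource config."""
    sync = config.get("sync")
    if not isinstance(sync, dict):
        return ["sync: block is missing"]

    problems: List[str] = []
    if sync.get("enabled") is not True:
        problems.append("sync.enabled: not true")

    operations = sync.get("operations")
    pull = operations.get("pull") if isinstance(operations, dict) else None
    if not isinstance(pull, dict):
        problems.append("sync.operations.pull: missing")
        return problems

    if pull.get("enabled") is not True:
        problems.append("sync.operations.pull.enabled: not true")
    operation_id = pull.get("operationId")
    if not isinstance(operation_id, str) or not operation_id.strip():
        problems.append("sync.operations.pull.operationId: missing or empty")
    return problems
```

For the example band above, sync_problems returns an empty list; for a config with no sync key it returns a single "block is missing" entry.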

Validate

aifabrix datasource validate <datasourceKey>
aifabrix validate <systemKey>
aifabrix datasource test-e2e <datasourceKey> --app <systemKey> --verbose

Limits

  • Sync failures often surface as empty catalogs or metadata shape errors, not as auth failures. An empty catalog usually points to a wrong sync operationId or path mapping rather than RBAC.
  • Document displayName and systemKey must match the integration bundle.
  • primaryKey and externalId rules match recordStorage: document libraries still require an indexed join identity even when binary content stays external.
  • For large libraries, tune sync schedules incrementally, and validate metadata shape against a single folder or site before enabling full tenant pulls.
  • Provider-specific path envelopes are out of scope here; start from a known-good document fixture.
  • Before E2E, run test-integration and confirm that sync populated at least one row with an indexed externalId.
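
The "at least one row with externalId" check before E2E can be expressed as a few lines of code. This sketch assumes you have already obtained the synced rows as a list of dicts by whatever export or query you use. That row shape is hypothetical and not defined on this page.

```python
from typing import Any, Iterable, List, Mapping


def catalog_row_problems(rows: Iterable[Mapping[str, Any]]) -> List[str]:
    """Check that sync produced rows and that every row carries an externalId.

    Assumption: each row is a mapping with an "externalId" key.
    """
    materialized = list(rows)
    if not materialized:
        # Per the guidance above, look at sync configuration before RBAC.
        return ["catalog is empty: check sync operationId and path mappings first"]
    missing = sum(1 for row in materialized if not row.get("externalId"))
    if missing:
        return [f"{missing} of {len(materialized)} rows have no externalId"]
    return []
```

An empty result means the catalog has rows and each one has a join identity, which is the precondition the E2E test relies on.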

Common mistakes

Mistake                               Fix
recordStorage for file-only API       Switch to documentStorage
Missing sync block                    Add document sync configuration
No indexed ABAC fields                Materialize dimensions on metadataSchema properties
vectorStore without vector config     Use documentStorage or add vector blocks
Skipping test-integration before E2E  Run integration test after upload