Step-by-step checklist for entityType: documentStorage — file libraries with metadata normalization, sync ingestion, and governed document capabilities.
Prerequisites
- External system JSON (often type: openapi for SharePoint, file APIs, or DMS)
- Resource type document (or domain-specific content key) in catalog
- Document provider credentials in env.template (kv:// paths)
Where it lives
integration/<systemKey>/<systemKey>-datasource-documents.json (one entity type per file).
Document storage uses the same root required fields as records, with entityType: documentStorage.
How to set
- Copy a document-storage fixture (library/list API) and trim. Align systemKey and application manifest file list.
- Set root identity:
{
  "key": "example-documents",
  "displayName": "Documents",
  "systemKey": "example-files",
  "entityType": "documentStorage",
  "resourceType": "document",
  "primaryKey": ["externalId"],
  "metadataSchema": {
    "type": "object",
    "properties": {
      "externalId": { "type": "string", "index": true },
      "name": { "type": "string", "index": true },
      "mimeType": { "type": "string", "index": true },
      "folderPath": { "type": "string", "index": true }
    }
  }
}
- Map file metadata in fieldMappings.attributes: include paths for web URL, modified time, owner email, sensitivity labels. Materialize ABAC dimensions on indexed properties.
- Configure document sync, the ingestion pipeline (list/enumerate, download, metadata extract). Document sync differs from record CRM sync, so validate early:
aifabrix datasource validate <datasourceKey>
- Configure exposed: document capabilities (document:read, document:search, upload/delete when allowed).
- Optional vector path: if you need semantic search, consider entityType: vectorStore instead; see Build vector store business entity.
- Repair and validate system:
aifabrix repair <systemKey> --expose
aifabrix validate <systemKey>
aifabrix upload <systemKey> --probe
aifabrix test-integration <systemKey>
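The repair-and-validate sequence above can be scripted so that it stops at the first failing command. This is a minimal sketch, assuming the aifabrix CLI is on PATH and reports failure through a non-zero exit code (an assumption, not something this page confirms). The function names and the injectable run parameter are hypothetical and exist only for illustration.

```python
import subprocess
from typing import Callable, List, Sequence


def build_commands(system_key: str) -> List[List[str]]:
    """Return the four checklist commands, in the order listed above."""
    return [
        ["aifabrix", "repair", system_key, "--expose"],
        ["aifabrix", "validate", system_key],
        ["aifabrix", "upload", system_key, "--probe"],
        ["aifabrix", "test-integration", system_key],
    ]


def _run(cmd: Sequence[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(list(cmd)).returncode


def run_checklist(
    system_key: str, run: Callable[[Sequence[str]], int] = _run
) -> List[str]:
    """Run each command in order and stop at the first non-zero exit code.

    Returns the commands that completed successfully, joined as strings.
    """
    done = []
    for cmd in build_commands(system_key):
        if run(cmd) != 0:  # assumption: non-zero exit code means failure
            break
        done.append(" ".join(cmd))
    return done
```

Calling run_checklist("example-files") would execute the real commands. Passing a stub as run lets the ordering be exercised without the CLI installed.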
Defaults and examples
| Concern | documentStorage guidance |
|---|---|
| Join identity | metadataSchema.properties.externalId required |
| ABAC | Index folder path, site, owner, sensitivity on metadata properties |
| Foreign keys | Link documents to deals/customers via foreignKeys[] + protection |
| Sync | Document ingestion — not the same defaults as record pull |
| Capabilities | Prefer document:* permission prefix matching resourceType |
Use documentStorage when files stay in the external system and metadata is normalized in Enterprise Knowledge. Use vectorStore when embeddings and RAG search are first-class.
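Two rows of the guidance table, ABAC indexing and the capability prefix, lend themselves to a quick local check before upload. The sketch below is illustrative only: it assumes metadataSchema has the shape shown in the root identity example, and it takes capability names as a plain list because the layout of the exposed block is not described here. The function names and the ownerEmail, sensitivity, and file:delete values are hypothetical.

```python
from typing import Dict, List


def unindexed_dimensions(metadata_schema: Dict, dimensions: List[str]) -> List[str]:
    """Return ABAC dimension names that are missing or lack "index": true."""
    props = metadata_schema.get("properties", {})
    return [d for d in dimensions if props.get(d, {}).get("index") is not True]


def mismatched_capabilities(resource_type: str, capabilities: List[str]) -> List[str]:
    """Return capability names that do not start with "<resourceType>:"."""
    prefix = resource_type + ":"
    return [c for c in capabilities if not c.startswith(prefix)]


schema = {
    "type": "object",
    "properties": {
        "externalId": {"type": "string", "index": True},
        "folderPath": {"type": "string", "index": True},
        "ownerEmail": {"type": "string"},  # hypothetical property, not indexed
    },
}

# ownerEmail is present but not indexed; sensitivity is absent altogether.
print(unindexed_dimensions(schema, ["folderPath", "ownerEmail", "sensitivity"]))

# file:delete does not use the document: prefix.
print(mismatched_capabilities("document", ["document:read", "document:search", "file:delete"]))
```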
Root identity band (same rules as recordStorage):
| Field | documentStorage rule |
|---|---|
| key | Business Entity key |
| displayName | Operator label |
| systemKey | Parent system |
| entityType | documentStorage |
| resourceType | Usually document |
| externalId | Indexed join field in metadataSchema |
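The root identity rules can be checked against a datasource file before running the CLI. The following is a rough sketch, not a substitute for aifabrix datasource validate: the check_root_identity function and its messages are hypothetical, and it covers only the fields named in the table.

```python
from typing import Dict, List

# Root fields named in the table; externalId is checked separately because it
# lives under metadataSchema.properties rather than at the root.
ROOT_FIELDS = ["key", "displayName", "systemKey", "entityType", "resourceType"]


def check_root_identity(datasource: Dict) -> List[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    for field in ROOT_FIELDS:
        if not datasource.get(field):
            problems.append(f"missing root field: {field}")

    entity_type = datasource.get("entityType")
    if entity_type and entity_type != "documentStorage":
        problems.append("entityType must be documentStorage")

    props = datasource.get("metadataSchema", {}).get("properties", {})
    if props.get("externalId", {}).get("index") is not True:
        problems.append("metadataSchema.properties.externalId must be indexed")
    return problems


example = {
    "key": "example-documents",
    "displayName": "Documents",
    "systemKey": "example-files",
    "entityType": "documentStorage",
    "resourceType": "document",
    "primaryKey": ["externalId"],
    "metadataSchema": {
        "type": "object",
        "properties": {"externalId": {"type": "string", "index": True}},
    },
}

print(check_root_identity(example))
```

In practice the dictionary would come from json.load on the datasource file rather than an inline literal.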
Example sync band for document ingestion:
{
  "sync": {
    "enabled": true,
    "mode": "incremental",
    "operations": {
      "pull": {
        "enabled": true,
        "operationId": "listDocuments"
      }
    }
  }
}
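Because a wrong operationId tends to show up later as an empty catalog, it can help to confirm that the pull operationId exists in the provider's OpenAPI document. The sketch below assumes the external system is described by a standard OpenAPI document in which operations sit under paths, and that the platform resolves operationId against it; that resolution behavior is an assumption. The function names and the /documents path are hypothetical.

```python
from typing import Dict, List, Set

HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def openapi_operation_ids(spec: Dict) -> Set[str]:
    """Collect every operationId declared under an OpenAPI document's paths."""
    ids = set()
    for path_item in spec.get("paths", {}).values():
        for method, operation in path_item.items():
            # Skip non-operation keys such as "parameters" or "summary".
            if method in HTTP_METHODS and isinstance(operation, dict):
                if "operationId" in operation:
                    ids.add(operation["operationId"])
    return ids


def check_sync_pull(datasource: Dict, spec: Dict) -> List[str]:
    """Return problems found in the sync band; an empty list means none found."""
    problems = []
    sync = datasource.get("sync", {})
    pull = sync.get("operations", {}).get("pull", {})
    if not sync.get("enabled"):
        problems.append("sync.enabled is not true")
    if not pull.get("enabled"):
        problems.append("sync.operations.pull.enabled is not true")
    op_id = pull.get("operationId")
    if not op_id:
        problems.append("sync.operations.pull.operationId is missing")
    elif op_id not in openapi_operation_ids(spec):
        problems.append(f"operationId {op_id!r} not found in the OpenAPI document")
    return problems


datasource = {
    "sync": {
        "enabled": True,
        "mode": "incremental",
        "operations": {"pull": {"enabled": True, "operationId": "listDocuments"}},
    }
}
spec = {"paths": {"/documents": {"get": {"operationId": "listDocuments"}}}}

print(check_sync_pull(datasource, spec))
```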
Validate
aifabrix datasource validate <datasourceKey>
aifabrix validate <systemKey>
aifabrix datasource test-e2e <datasourceKey> --app <systemKey> --verbose
Limits
- Sync failures often surface as empty catalogs or metadata shape errors, not auth failures.
- Document displayName and systemKey must match the integration bundle.
- primaryKey and externalId rules match recordStorage: document libraries still require indexed join identity even when binary content stays external.
- Large libraries should tune sync schedules incrementally: validate metadata shape with a single folder or site before enabling full tenant pulls.
- Provider-specific path envelopes are out of scope here; start from a known-good document fixture.
- Before E2E, run test-integration and confirm sync populated at least one row with indexed externalId. Empty catalogs usually mean sync operationId or path mapping errors, not RBAC.
Common mistakes
| Mistake | Fix |
|---|---|
| recordStorage for file-only API | Switch to documentStorage |
| Missing sync block | Add document sync configuration |
| No indexed ABAC fields | Materialize dimensions on metadataSchema properties |
| vectorStore without vector config | Use documentStorage or add vector blocks |
| Skipping test-integration before E2E | Run integration test after upload |