Configure entityType: vectorStore when semantic search or RAG runs against a governed vector index — typically backed by a vendor vector API or platform embedding pipeline — not when you only need file catalog metadata.
Model note
vectorStore is not a substitute for documentStorage. Use documentStorage when files and metadata live in the vendor system. Use vectorStore when embeddings, chunking, and semantic retrieval are first-class and declared on the Business Entity manifest (often alongside a document source datasource via foreign keys).
Prerequisites
- Document source modeled or shared system JSON
- Decision: documentStorage vs vectorStore (Business Entity entity types)
- Resource type registered (often document or domain content key)
- Platform LLM/embedding credentials resolved in env.template when required by your environment
Where it lives
Separate datasource JSON under integration/<systemKey>/ with root "entityType": "vectorStore".
Vector pipelines add embedding and chunk configuration beyond documentStorage — keep one entityType per file.
How to set
- Confirm vector semantics are required: if you only list/download files with metadata, use documentStorage.
- Scaffold or copy a fixture: start from an org vectorStore JSON when available; trim key and systemKey.
- Set root identity band:
{
"key": "example-documents-vector",
"displayName": "Document search",
"systemKey": "example-files",
"entityType": "vectorStore",
"resourceType": "document",
"primaryKey": ["externalId"],
"metadataSchema": {
"type": "object",
"properties": {
"externalId": { "type": "string", "index": true },
"title": { "type": "string", "index": true, "filter": true }
}
}
}
- Add vector-specific blocks: embedding model references, chunking, index linkage per schema for your provider. Validate after each edit.
- Map fieldMappings.attributes: title, path, and content fields agents need for citations.
- Configure exposed: search/read capabilities align with resourceType permissions (Capabilities).
- Validate and test:
aifabrix datasource validate <datasourceKey>
aifabrix validate <systemKey>
aifabrix test-integration <systemKey>
aifabrix datasource test-e2e <datasourceKey> --app <systemKey>
Defaults and examples
| Choose vectorStore when… | Stay on documentStorage when… |
|---|---|
| Workers need semantic search | File metadata + download is enough |
| Embeddings stored or referenced | No embedding pipeline |
| RAG certification required | Catalog-only document governance |
Illustrative capabilities and exposed search contract:
{
"capabilities": [
{ "key": "search", "description": "Semantic document search", "riskLevel": "medium" }
],
"exposed": {
"filterable": ["title"],
"schema": {
"externalId": "metadata.externalId",
"title": "metadata.title"
}
}
}
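The contract above can be cross-checked against the metadataSchema from the root identity band before running the validators. The following is a minimal Python sketch, not platform tooling. It assumes that a path such as metadata.title points at a property of metadataSchema and that every field listed in exposed.filterable should carry "filter": true. Both rules are inferred from the examples on this page, and the helper name check_exposed is hypothetical.

```python
from __future__ import annotations

import json

# Example manifest assembled from the fragments shown on this page.
MANIFEST = json.loads("""
{
  "metadataSchema": {
    "type": "object",
    "properties": {
      "externalId": { "type": "string", "index": true },
      "title": { "type": "string", "index": true, "filter": true }
    }
  },
  "exposed": {
    "filterable": ["title"],
    "schema": {
      "externalId": "metadata.externalId",
      "title": "metadata.title"
    }
  }
}
""")


def check_exposed(manifest: dict) -> list[str]:
    """Return the problems found; an empty list means none were detected.

    Hypothetical helper. The rules are assumptions drawn from the examples,
    not the platform's actual validation logic.
    """
    problems = []
    props = manifest.get("metadataSchema", {}).get("properties", {})
    exposed = manifest.get("exposed", {})
    schema = exposed.get("schema", {})

    # Assumption: "metadata.<name>" refers to metadataSchema.properties.<name>.
    for field, path in schema.items():
        prefix, _, name = path.partition(".")
        if prefix != "metadata" or name not in props:
            problems.append(f"exposed.schema.{field}: '{path}' not in metadataSchema")

    # Assumption: filterable fields should be flagged "filter": true.
    for field in exposed.get("filterable", []):
        name = schema.get(field, f"metadata.{field}").partition(".")[2]
        if not props.get(name, {}).get("filter"):
            problems.append(f"exposed.filterable: '{field}' is not marked filter: true")

    return problems


print(check_exposed(MANIFEST))
```

A check like this only catches mismatches between the two fragments; it does not replace aifabrix datasource validate, which remains the authority on the schema.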
Pair vectorStore with governed dimensions when document libraries span regions or projects — top-level dimensions behave the same as recordStorage datasources.
Validate
aifabrix resource-type list
aifabrix datasource validate <datasourceKey>
aifabrix validate <systemKey>
aifabrix test-integration <systemKey>
aifabrix datasource test-e2e <datasourceKey> --app <systemKey>
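While iterating, the commands above can be wrapped so that they run in the listed order and stop at the first failure. This is a minimal Python sketch under two assumptions: the aifabrix CLI is on PATH, and it signals failure with a non-zero exit code. The function names validation_commands and run_validation are hypothetical and not part of the CLI.

```python
from __future__ import annotations

import subprocess


def validation_commands(datasource_key: str, system_key: str) -> list[list[str]]:
    """Build the commands from this section, in the order they are listed."""
    return [
        ["aifabrix", "resource-type", "list"],
        ["aifabrix", "datasource", "validate", datasource_key],
        ["aifabrix", "validate", system_key],
        ["aifabrix", "test-integration", system_key],
        ["aifabrix", "datasource", "test-e2e", datasource_key, "--app", system_key],
    ]


def run_validation(datasource_key: str, system_key: str, runner=subprocess.run) -> bool:
    """Run each command in turn; return False at the first non-zero exit code.

    Assumption: the CLI reports failure through its exit code.
    """
    for cmd in validation_commands(datasource_key, system_key):
        print("$", " ".join(cmd))
        result = runner(cmd)
        if result.returncode != 0:
            print("failed:", " ".join(cmd))
            return False
    return True
```

For the example manifest on this page, the call would be run_validation("example-documents-vector", "example-files"). The runner parameter exists only so the sequence can be exercised without the CLI installed.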
Empty semantic search at runtime usually means missing vector blocks or exposed search operations — not vendor auth failures.
Common mistakes
| Mistake | Fix |
|---|---|
| vectorStore without vector config | Add required blocks or use documentStorage |
| Same JSON as recordStorage | Split files by entityType |
| Missing exposed search capability | Add governed search operations |
| Skipping datasource validate | Run after every JSON edit |
| resourceType not in catalog | Register before upload |
Limits
Embedding providers, chunk policies, and cost controls evolve quickly — confirm against your environment’s LLM/vector configuration. Detailed vector pipeline tuning lives in platform KB articles not yet fully ported to this public path.
This page covers manifest shape, not every embedding provider dialect. After the first green datasource validate run, keep running it after every JSON edit; it is faster than a full system validate while iterating on vector blocks or fieldMappings.
Before first upload, confirm entityType is vectorStore, root resourceType appears in resource-type list, and indexed externalId exists — these checks prevent most probe-upload failures on semantic search datasources.
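The three pre-upload checks can be expressed as a short script. This is a minimal Python sketch, assuming the manifest has already been loaded as a dictionary and that the registered resource type keys were copied by hand from the output of aifabrix resource-type list (that output format is not described here, so nothing is parsed). The function name pre_upload_problems is hypothetical.

```python
from __future__ import annotations


def pre_upload_problems(manifest: dict, registered_resource_types: set[str]) -> list[str]:
    """Return the failed pre-upload checks; an empty list means all three passed."""
    problems = []

    # Check 1: the file declares the vectorStore entity type.
    if manifest.get("entityType") != "vectorStore":
        problems.append("entityType is not vectorStore")

    # Check 2: the root resourceType is registered.
    if manifest.get("resourceType") not in registered_resource_types:
        problems.append("resourceType is not in the registered list")

    # Check 3: externalId exists in metadataSchema and is indexed.
    props = manifest.get("metadataSchema", {}).get("properties", {})
    if not props.get("externalId", {}).get("index"):
        problems.append("externalId is missing or not indexed")

    return problems


# Manifest values taken from the root identity band example on this page.
manifest = {
    "key": "example-documents-vector",
    "systemKey": "example-files",
    "entityType": "vectorStore",
    "resourceType": "document",
    "primaryKey": ["externalId"],
    "metadataSchema": {
        "type": "object",
        "properties": {"externalId": {"type": "string", "index": True}},
    },
}

# Assumption: keys copied manually from `aifabrix resource-type list`.
registered = {"document"}

print(pre_upload_problems(manifest, registered))
```

An empty list means the manifest passes these three checks only; the validators still decide whether the upload is safe.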
Upload only when both system and business entity validators pass without warnings you cannot explain. Treat displayName changes as operator-facing — they do not replace stable key or externalId semantics used by sync and search.
Semantic search costs and embedding quotas are environment-specific — validate vector blocks in a sandbox tenant before enabling exposed search capabilities in production catalogs.
Chunk size and embedding model choices affect verify-trust outcomes when agents cite document titles, so re-run the verify-trust check after vector pipeline edits even when validate stays green.