Build a vector store entity

Configure entityType: vectorStore when semantic search or RAG runs against a governed vector index, typically one backed by a vendor vector API or a platform embedding pipeline. Do not use it when you only need file catalog metadata.

Model note

vectorStore is not a substitute for documentStorage. Use documentStorage when files and metadata live in the vendor system. Use vectorStore when embeddings, chunking, and semantic retrieval are first-class and declared on the Business Entity manifest (often alongside a document source datasource via foreign keys).

Prerequisites

Where it lives

Separate datasource JSON under integration/<systemKey>/ with root "entityType": "vectorStore".

Vector pipelines add embedding and chunk configuration beyond documentStorage — keep one entityType per file.
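
To see at a glance which entityType each datasource file declares, a small script can read the root of every JSON file in the integration folder. This is a minimal sketch, not platform tooling: the function name entity_types_by_file is hypothetical, and it assumes the datasource files sit directly under the folder you pass in.

```python
import json
from pathlib import Path


def entity_types_by_file(folder):
    """Map each datasource JSON file name in `folder` to its root entityType.

    Hypothetical helper: assumes one JSON object per file, as described above.
    """
    result = {}
    for path in sorted(Path(folder).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            manifest = json.load(handle)
        # A missing or non-string entityType is reported as None so it stands out.
        entity_type = manifest.get("entityType") if isinstance(manifest, dict) else None
        result[path.name] = entity_type if isinstance(entity_type, str) else None
    return result
```

Any file that maps to None, or to a type you did not expect, is worth opening before you run the validators.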

How to set

  1. Confirm vector semantics are required — if you only list/download files with metadata, use documentStorage.

  2. Scaffold or copy a fixture — start from an org vectorStore JSON when available; trim key and systemKey.

  3. Set root identity band:

{
  "key": "example-documents-vector",
  "displayName": "Document search",
  "systemKey": "example-files",
  "entityType": "vectorStore",
  "resourceType": "document",
  "primaryKey": ["externalId"],
  "metadataSchema": {
    "type": "object",
    "properties": {
      "externalId": { "type": "string", "index": true },
      "title": { "type": "string", "index": true, "filter": true }
    }
  }
}
  4. Add vector-specific blocks: embedding model references, chunking, index linkage per schema for your provider. Validate after each edit.

  5. Map fieldMappings.attributes: title, path, and content fields agents need for citations.

  6. Configure exposed: search/read capabilities align with resourceType permissions (Capabilities).

  7. Validate and test:

aifabrix datasource validate <datasourceKey>
aifabrix validate <systemKey>
aifabrix test-integration <systemKey>
aifabrix datasource test-e2e <datasourceKey> --app <systemKey>
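
The validate-and-test commands above can be chained so that iteration stops at the first failing stage. The sketch below only reuses the four invocations listed in this section. It assumes the aifabrix CLI is on PATH and signals failure with a non-zero exit code, which this page does not confirm; validation_commands and run_until_failure are hypothetical helper names.

```python
import subprocess


def validation_commands(datasource_key, system_key):
    """Return the four CLI invocations from this section, in the order listed."""
    return [
        ["aifabrix", "datasource", "validate", datasource_key],
        ["aifabrix", "validate", system_key],
        ["aifabrix", "test-integration", system_key],
        ["aifabrix", "datasource", "test-e2e", datasource_key, "--app", system_key],
    ]


def run_until_failure(commands, run=subprocess.run):
    """Run commands in order; return the first failing command, or None.

    Assumption: a non-zero exit code means the stage failed.
    `run` is injectable so the loop can be exercised without the CLI.
    """
    for command in commands:
        completed = run(command)
        if completed.returncode != 0:
            return command
    return None
```

For example, run_until_failure(validation_commands("example-documents-vector", "example-files")) returns None when every stage exits with 0, and otherwise returns the command that failed.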

Defaults and examples

| Choose vectorStore when… | Stay on documentStorage when… |
| --- | --- |
| Workers need semantic search | File metadata + download is enough |
| Embeddings stored or referenced | No embedding pipeline |
| RAG certification required | Catalog-only document governance |

Illustrative capabilities and exposed search contract:

{
  "capabilities": [
    { "key": "search", "description": "Semantic document search", "riskLevel": "medium" }
  ],
  "exposed": {
    "filterable": ["title"],
    "schema": {
      "externalId": "metadata.externalId",
      "title": "metadata.title"
    }
  }
}
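
A quick local cross-check between exposed and metadataSchema can catch mismatches before the validators run. The sketch below rests on two assumptions inferred from the examples on this page, not from a published schema: that every exposed.filterable field should carry "filter": true in metadataSchema, and that exposed.schema values of the form metadata.<name> refer to metadataSchema.properties. The function name exposed_issues is hypothetical.

```python
def exposed_issues(metadata_schema, exposed):
    """List mismatches between an exposed block and metadataSchema properties.

    Assumptions (inferred from the examples, not from a published schema):
    - filterable fields must be marked "filter": true in metadataSchema
    - exposed.schema paths look like "metadata.<propertyName>"
    """
    properties = metadata_schema.get("properties", {})
    issues = []
    for name in exposed.get("filterable", []):
        if not properties.get(name, {}).get("filter"):
            issues.append(f"filterable field '{name}' is not marked filter: true")
    for name, path in exposed.get("schema", {}).items():
        prefix, _, prop = path.partition(".")
        if prefix != "metadata" or prop not in properties:
            issues.append(f"exposed field '{name}' points at unknown path '{path}'")
    return issues
```

Run against the two example fragments on this page, the function returns an empty list. The aifabrix validators remain the authority; this only shortens the feedback loop.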

Pair vectorStore with governed dimensions when document libraries span regions or projects — top-level dimensions behave the same as recordStorage datasources.

Validate

aifabrix resource-type list
aifabrix datasource validate <datasourceKey>
aifabrix validate <systemKey>
aifabrix test-integration <systemKey>
aifabrix datasource test-e2e <datasourceKey> --app <systemKey>

Empty semantic search results at runtime usually point to missing vector blocks or missing exposed search operations, not to vendor auth failures.

Common mistakes

| Mistake | Fix |
| --- | --- |
| vectorStore without vector config | Add required blocks or use documentStorage |
| Same JSON as recordStorage | Split files by entityType |
| Missing exposed search capability | Add governed search operations |
| Skipping datasource validate | Run after every JSON edit |
| resourceType not in catalog | Register before upload |

Limits

Embedding providers, chunk policies, and cost controls evolve quickly — confirm against your environment’s LLM/vector configuration. Detailed vector pipeline tuning lives in platform KB articles not yet fully ported to this public path.

This page covers manifest shape, not every embedding provider dialect. After the first green datasource validate, keep running it after every JSON edit; it is faster than a full system validate while you iterate on vector blocks or fieldMappings.

Before the first upload, confirm that entityType is vectorStore, that the root resourceType appears in resource-type list, and that an indexed externalId exists. These checks prevent most probe-upload failures on semantic search datasources.
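
The three pre-upload checks can be sketched as a local function over the parsed manifest. This is illustrative only: pre_upload_issues is a hypothetical name, and catalog_resource_types is assumed to be a collection of resource type names that you have copied from the output of aifabrix resource-type list, whose output format this page does not describe.

```python
def pre_upload_issues(manifest, catalog_resource_types):
    """Return the pre-upload checks that fail for a parsed datasource manifest.

    `catalog_resource_types` is an assumption: names taken by hand from
    `aifabrix resource-type list`.
    """
    issues = []
    if manifest.get("entityType") != "vectorStore":
        issues.append("entityType is not vectorStore")
    if manifest.get("resourceType") not in catalog_resource_types:
        issues.append("resourceType is not in the resource-type catalog")
    properties = manifest.get("metadataSchema", {}).get("properties", {})
    # externalId must exist and carry "index": true, as in the identity example.
    if not properties.get("externalId", {}).get("index"):
        issues.append("externalId is missing or not indexed")
    return issues
```

An empty list means the manifest passes these three checks. It does not replace datasource validate.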

Upload only when both system and business entity validators pass without warnings you cannot explain. Treat displayName changes as operator-facing — they do not replace stable key or externalId semantics used by sync and search.

Semantic search costs and embedding quotas are environment-specific — validate vector blocks in a sandbox tenant before enabling exposed search capabilities in production catalogs.

Chunk size and embedding model choices affect verify-trust outcomes when agents cite document titles. Re-run verify-trust after vector pipeline edits, even when validate stays green.