Documents

The Documents page is where you upload and manage the files that form your agent's primary knowledge base. Supported formats are PDF, Markdown (.md), and plain text (.txt). Each uploaded document is parsed, split into searchable chunks, and indexed — after which the agent can draw on its content when answering user queries.

Navigate to Resources → Documents in the left sidebar.

Managing the Documents List

The list displays documents as cards (grid view) or rows (list view). Use the view toggle in the top-right to switch between them. Each card shows the document's title, code, author, status badge, and processing progress.

Document statuses: Draft → Pending → Processing → Processed (ready) or Failed (error). A processed document that has been deactivated shows the status Suspended.

Use Search Documents to filter by code, title, author, or status. Use the Sort By controls in the top-right to change the sort order.

Creating a New Document

Click New Document in the top-left of the page. This opens the document form where you can:

Enter a title, code, and description
Upload the file
Assign an author
Set the document's language and content type
Submit for processing

Ingestion Modes

When you upload a document, you choose an Ingestion Mode that controls how the system splits the document into searchable chunks. The right mode depends on the type and length of your document.

Quick Comparison

Mode	Best For	AI Processing	Speed
Semantic	General documents — prose, reports, articles	None	Fast
Skimming	Long structured documents — books, manuals, multi-chapter reports	Low–Medium	Slow
Regulatory Context	Legal documents — laws, bylaws, regulations	Minimal	Slow
Fact Sheet	Short documents — product sheets, fund fact sheets (1–3 pages)	Medium	Medium
Adaptive	Complex reference books, documents with heavy cross-references	High	Slowest

Not sure which to pick? Choose Skimming for documents with clear headings, since it preserves the document's natural structure. For anything else, Semantic is the safe default.
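The comparison table can be read as a rough decision order. The sketch below is only one way to encode it: the function name, its arguments, and the order in which the checks run are assumptions made for illustration, not part of the product.

```python
def suggest_mode(is_legal=False, page_count=None,
                 dense_cross_references=False, has_clear_headings=False):
    """Suggest an Ingestion Mode from the 'Best For' column of the table.

    Hypothetical helper: the argument names and the check order are assumptions.
    """
    if is_legal:
        return "Regulatory Context"   # laws, bylaws, regulations
    if page_count is not None and page_count <= 3:
        return "Fact Sheet"           # short documents of 1-3 pages
    if dense_cross_references:
        return "Adaptive"             # heavy cross-references
    if has_clear_headings:
        return "Skimming"             # chapters, sections, subsections
    return "Semantic"                 # general prose


print(suggest_mode(page_count=2))
print(suggest_mode(page_count=300, has_clear_headings=True))
```

Treat the result as a starting point; the "Best for" lists under each mode give the fuller picture.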

Semantic

The simplest and most reliable mode. The document is split at natural sentence boundaries into evenly sized chunks. Each chunk overlaps slightly with the previous one so context is not lost at the edges.

Best for

General-purpose documents where structure does not matter much
Reports, articles, FAQs, knowledge base files
Any document when you are unsure which mode to use

Parameters

Parameter	Description
Content Size	Maximum number of tokens per chunk. Larger values produce fewer, longer chunks. Default: 500.
Overlapping Ratio	Percentage of sentences from the previous chunk that are repeated at the start of the next chunk. Higher values preserve more context across boundaries. Range: 1–80%. Default: 20%.
Start Page	First page to process. Leave blank to start from page 1.
Finish Page	Last page to process. Leave blank to process to the end.
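To make Content Size and Overlapping Ratio concrete, here is a minimal sketch of sentence-boundary chunking with overlap. It is not the platform's implementation: the regex sentence splitter, the whitespace word count used in place of real tokens, and the function name are all assumptions, and the ratio is passed as a fraction (0.20) rather than a percentage.

```python
import re


def semantic_chunks(text, content_size=500, overlap_ratio=0.20):
    """Split text at sentence boundaries into overlapping chunks.

    Chunks stay within content_size "tokens", except that a single sentence
    longer than content_size still becomes its own oversized chunk.
    """
    # Stand-ins: a regex sentence splitter and whitespace-separated "tokens".
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

    def tokens(parts):
        return sum(len(p.split()) for p in parts)

    chunks, current = [], []
    for sentence in sentences:
        if current and tokens(current + [sentence]) > content_size:
            chunks.append(" ".join(current))
            # Repeat the trailing share of sentences at the start of the next chunk.
            keep = max(1, round(len(current) * overlap_ratio))
            current = current[-keep:]
            # Shed carried sentences if they would push the new chunk over the cap.
            while current and tokens(current + [sentence]) > content_size:
                current.pop(0)
        current.append(sentence)
    if current:
        chunks.append(" ".join(current))
    return chunks


sample = "One two three. Four five six. Seven eight nine. Ten eleven twelve."
for chunk in semantic_chunks(sample, content_size=6, overlap_ratio=0.5):
    print(chunk)
```

In the sample run each chunk begins with the last sentence of the previous one, which is the effect a higher Overlapping Ratio amplifies.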

Advanced Parameters (optional)

Parameter	Description
Table as Plain Text	When on, tables are extracted as plain text inside regular chunks. When off, each table becomes its own separate chunk.
Skip Image Analysis	When on, images are ignored. When off, an AI model generates a description for each image and includes it as a chunk.

Skimming

Detects the document's chapter and section structure using AI, then creates one chunk per section — each chunk is titled with the full path of headings (e.g., Chapter 1 › Section 1.2 › Subsection 1.2.3). Sections that are too small are automatically merged with their neighbours; sections that are too large are split at semantic boundaries.
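The titling and merging behaviour described above can be sketched roughly as follows. This is an illustration under assumptions: sections are modelled as (heading path, text) pairs, size is measured in whitespace-separated words rather than real tokens, a small section is merged forward into the next one, and the function names are hypothetical. Splitting oversized sections is left out.

```python
def section_title(heading_path):
    """Join the heading path into a chunk title."""
    return " › ".join(heading_path)


def merge_small_sections(sections, min_tokens):
    """Merge each undersized section with the section that follows it.

    sections: list of (heading_path, text) tuples in document order.
    Assumption: the merged chunk keeps the heading path of the earlier section.
    A small final section is left as it is in this sketch.
    """
    merged = []
    for path, text in sections:
        if merged and len(merged[-1][1].split()) < min_tokens:
            prev_path, prev_text = merged.pop()
            merged.append((prev_path, prev_text + "\n" + text))
        else:
            merged.append((path, text))
    return merged


sections = [
    (["Chapter 1"], "Intro"),
    (["Chapter 1", "Section 1.1"], "First section body with several words"),
    (["Chapter 1", "Section 1.2"], "Second section body with several words"),
]
for path, text in merge_small_sections(sections, min_tokens=3):
    print(section_title(path), "|", len(text.split()), "words")
```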

Best for

Books, technical manuals, training materials
Any document with clear chapters, sections, and subsections
Documents where preserving heading structure in search results matters

Parameters

Parameter	Description
Content Training AI Model	The AI model used to analyse the document structure. A more capable model may detect structure more accurately on complex documents.
Use Summarization	When on, the system generates an AI summary for each parent section (e.g., a chapter summary derived from its subsections). Improves context for high-level queries but increases processing time and cost.

Advanced Parameters (optional)

Parameter	Description
Page Overlap	Number of pages shared between adjacent processing segments when the document is too large to analyse in one pass. Helps the AI detect section boundaries that span segment edges. Default: 2.
Keep Small Sections Separate	When on, small sections are not merged with their neighbours and are kept exactly as the AI detected them. Useful if you want strict per-section chunks regardless of size.
Minimum Section Size	Sections with fewer tokens than this value are merged with an adjacent section. Only applies when Keep Small Sections Separate is off.
Table as Plain Text	Same as in Semantic mode.
Skip Image Analysis	Same as in Semantic mode.
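Page Overlap is easiest to picture as a sliding window over page numbers. The sketch below assumes a fixed segment length; `segment_size` and the function name are hypothetical, since the documentation does not say how large a processing segment is.

```python
def page_segments(total_pages, segment_size, page_overlap=2):
    """Return 1-based (first_page, last_page) windows that share page_overlap pages."""
    if segment_size <= page_overlap:
        raise ValueError("segment_size must be larger than page_overlap")
    if total_pages < 1:
        return []
    segments, start = [], 1
    while True:
        end = min(start + segment_size - 1, total_pages)
        segments.append((start, end))
        if end == total_pages:
            return segments
        # The next window starts page_overlap pages before the current one ends.
        start = end - page_overlap + 1


print(page_segments(total_pages=10, segment_size=5, page_overlap=2))
```

With ten pages, five-page segments, and an overlap of 2, pages 4–5 and 7–8 are each seen by two segments, so a section boundary falling there is visible in full to at least one pass.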

Regulatory Context

A dedicated parser built specifically for legal documents. It extracts each article individually — number, title, legal text, notes, and cross-references — and converts each article directly into one chunk. The article text is preserved verbatim; nothing is rewritten or summarised.

Best for

Laws, government regulations, bylaws
Compliance documents, legal codes
Any document where the exact wording of each article must be preserved

Parameters

No additional parameters are required. The parser handles structure detection automatically based on standard legal document patterns.

Fact Sheet

Processes the document one page at a time. Each page — including its layout, images, and tables — is sent to an AI model that decides the chunk boundaries, writes the content in clean markdown, and extracts key metadata (product name, product type, document type, version).

Best for

Fund fact sheets, product brochures, one-pagers
Short documents of 1–3 pages with mixed text, tables, and charts
Documents where preserving the visual layout and chart context matters

Parameters

No additional parameters are required. The AI model handles layout analysis and chunk boundaries automatically.

Adaptive

The most sophisticated mode. A two-phase AI pipeline first analyses the document's full structure using a reasoning model, then assembles each chunk with additional context: cross-referenced content is embedded inline, and ambiguous pronouns (e.g., "he", "it", "they") are resolved to their actual referents. The result is chunks that are independently understandable — each one makes sense on its own without needing surrounding context.

Best for

Large reference books and encyclopaedias
Documents with dense cross-references (e.g., "see Article 3" or "as described in Chapter 7")
Academic or technical documents where out-of-context sentences lose their meaning

Parameters

Parameter	Description
Document Type Hint	Optional free-text description of the document type (e.g., `"educational textbook"`, `"regulatory filing"`, `"product manual"`). Helps the AI understand context and improve structure detection.
Image Description AI Model	The AI model used to generate descriptions for images found in the document.
Excluded Pages	Page types to exclude from chunking. By default the system skips: cover pages, table of contents, references, glossary, copyright, and acknowledgments. You can override this list here.

Advanced Parameters (optional)

Parameter	Description
Force Full LLM Enrichment	When on, every section is processed through the AI enrichment pipeline, even sections that would normally be assembled programmatically. Produces richer context but increases processing time significantly.
LLM Enrichment Sections Threshold	If the number of sections eligible for AI enrichment exceeds this value, the system falls back to fully programmatic assembly for all sections. Increase this threshold to allow more AI enrichment on large documents.
Minimum Chunk Size	Minimum number of tokens per chunk. Chunks smaller than this are merged with adjacent sections.
Maximum Chunk Size	Maximum number of tokens per chunk. Chunks exceeding this are split at semantic boundaries.
Start Page	First page to process. Leave blank to start from page 1.
Finish Page	Last page to process. Leave blank to process to the end.
Page Overlap	Number of pages shared between adjacent processing segments. Same as in Skimming mode.
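The interaction between Force Full LLM Enrichment and LLM Enrichment Sections Threshold can be sketched as a small decision function. The names are hypothetical, and one point is an assumption the documentation does not confirm: that Force Full LLM Enrichment takes precedence over the threshold.

```python
def plan_assembly(eligible_sections, total_sections, threshold, force_full=False):
    """Return how many sections go through AI enrichment vs programmatic assembly."""
    if force_full:
        # Assumption: forcing full enrichment bypasses the threshold check.
        return {"llm": total_sections, "programmatic": 0}
    if eligible_sections > threshold:
        # Too many eligible sections: fall back to programmatic assembly for all.
        return {"llm": 0, "programmatic": total_sections}
    return {"llm": eligible_sections,
            "programmatic": total_sections - eligible_sections}


print(plan_assembly(eligible_sections=3, total_sections=10, threshold=5))
print(plan_assembly(eligible_sections=8, total_sections=10, threshold=5))
```

The second call shows why raising the threshold matters on large documents: once the eligible count passes it, no section receives AI enrichment at all.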

Deactivating and Activating a Document

A document that has been fully processed (status: Processed) can be deactivated to prevent the agent from using it in responses, without permanently deleting it. A deactivated document can be reactivated at any time.

Deactivate a document

Open the document detail page.
Click Deactivate in the top-right action bar.
The document status changes to Suspended. The agent will no longer reference this document.

Activate a document

Open the detail page of a suspended document.
Click Activate in the top-right action bar.
The document status returns to Processed, and the document becomes available to the agent again.

Delete a document permanently

Only a suspended document can be permanently deleted.

Open the detail page of a suspended document.
Click Delete in the top-right action bar.
Confirm the deletion. The document and all its indexed chunks are permanently removed and cannot be recovered.

To delete a document that is still Processed, deactivate it first, then delete it.
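The status rules on this page can be summarised as a small state machine. The sketch is a reading of the text above, not the product's code: the names are hypothetical, and Failed is given no outgoing transitions only because this page does not describe any.

```python
from enum import Enum


class Status(Enum):
    DRAFT = "Draft"
    PENDING = "Pending"
    PROCESSING = "Processing"
    PROCESSED = "Processed"
    FAILED = "Failed"
    SUSPENDED = "Suspended"


# Allowed status changes as described on this page.
TRANSITIONS = {
    Status.DRAFT: {Status.PENDING},
    Status.PENDING: {Status.PROCESSING},
    Status.PROCESSING: {Status.PROCESSED, Status.FAILED},
    Status.PROCESSED: {Status.SUSPENDED},   # Deactivate
    Status.SUSPENDED: {Status.PROCESSED},   # Activate
    Status.FAILED: set(),                   # assumption: nothing documented here
}


def can_transition(current, target):
    return target in TRANSITIONS[current]


def can_delete(status):
    # Deletion is only offered for suspended documents.
    return status is Status.SUSPENDED


print(can_transition(Status.PROCESSED, Status.SUSPENDED))
print(can_delete(Status.PROCESSED))
```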

Related pages

Videos: add video content as a knowledge source
Search Chunks: inspect the text units the agent retrieves from your documents
Citations: manage authors and citation display settings
Advanced: tune how many document chunks the agent uses per query
