Monitoring Logs

The Monitoring page gives you visibility into what the agent's backend services are doing at a technical level. Where the Dashboard shows what is happening and Feedback shows how users feel, Monitoring logs show why — the exact sequence of events, service calls, and errors behind every conversation.

Use Monitoring to diagnose integration failures, confirm that configuration changes took effect, and build an audit trail for compliance or incident response.


Opening the Monitoring Page

In the sidebar, click Monitoring.

The Monitoring page loads with a predictable layout:

  1. A filter bar at the top.
  2. A results table in the middle.
  3. Row-level action buttons on the right side of each log entry.

If the table appears empty on first load, do not assume logging is broken. First check whether your current filters (especially Log Source, Level, and DateTime range) are too narrow.


Log Levels

Each log entry is labelled with one of six severity levels. The colour of the chip lets you triage at a glance.

| Level | Colour | Meaning | Action required |
| --- | --- | --- | --- |
| Trace | Grey | Very granular internal steps | Development and deep debugging only |
| Debug | Blue | Diagnostic information useful during development | Development environments; rarely needed in production |
| Information | Green | Normal operation events | Review periodically to understand flow |
| Warning | Amber | Unexpected situation; service continued | Investigate; may escalate to Error if unresolved |
| Error | Red | Operation failed; service continued | Investigate promptly |
| Critical | Dark red | Service or system failure | Investigate immediately; contact support if unresolved |

In day-to-day monitoring, filter to Warning and above. This removes the noise of normal operation and surfaces only the entries that require attention.

Reserve Information for targeted investigations — for example, reviewing the complete request/response sequence for a specific integration after a configuration change.

Use Trace and Debug only when you need to trace a very specific execution path and cannot identify the problem from higher-level logs.
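
The "Warning and above" rule can be pictured as an ordering over the six levels. The sketch below is illustrative only: it assumes log rows have been copied out of the table into Python dictionaries, and the `level` and `message` field names are hypothetical rather than part of any documented export format.

```python
from enum import IntEnum

class Level(IntEnum):
    # Ordered from least to most severe, matching the table above.
    TRACE = 0
    DEBUG = 1
    INFORMATION = 2
    WARNING = 3
    ERROR = 4
    CRITICAL = 5

def at_or_above(entries, minimum):
    """Keep only entries whose level is at least `minimum`."""
    return [e for e in entries if Level[e["level"].upper()] >= minimum]

# Hypothetical rows; the field names are assumptions for illustration.
entries = [
    {"level": "Information", "message": "Conversation started"},
    {"level": "Warning", "message": "Approaching token limit"},
    {"level": "Error", "message": "Custom API call failed"},
    {"level": "Debug", "message": "Cache lookup"},
]

triage = at_or_above(entries, Level.WARNING)
print([e["level"] for e in triage])  # ['Warning', 'Error']
```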


Applying Filters

The filter bar above the log table lets you narrow results across three dimensions.

Use this operating sequence every time to avoid false negatives:

  1. Set Log Source first (Published or Draft).
  2. Set Log Level second (start at Warning and above for triage).
  3. Set Start DateTime and End DateTime last.
  4. Click Apply Filter.
  5. Confirm the table refreshes and that row timestamps match your intended incident window.

Log Source — Published Agent vs Draft Agent

This is one of the most important filters and is frequently overlooked.

  • Published Agent — Logs from the version your end users are talking to. Use this for all production monitoring and incident investigation.
  • Draft Agent — Logs from the configuration currently under development. Use this immediately after saving a change to verify the new settings are working before you publish.

Always check Draft Agent logs before publishing any configuration change. An error visible in Draft that you choose to ignore will become a production error the moment you publish.

Log Level

Select a minimum severity or choose All to see everything. The default is Information.

Changing the level filter does not reload the page. Results update in place, so you can move quickly between Warning, Error, and All while chasing a specific incident.

Start / End DateTime

The date range defaults to the last 24 hours. When investigating a specific incident, narrow it to the window in which the incident occurred. A shorter range returns far fewer rows, which makes the relevant entries much easier to find.

Millisecond precision timestamps are available in the log rows themselves once you find the relevant event.

Click Apply Filter to reload the log table with your selections.

If no rows appear after applying filters, broaden in this order:

  1. Expand DateTime range (for example from 15 minutes to 24 hours).
  2. Lower Log Level from Warning to Information.
  3. Verify you are on the correct Log Source (Draft vs Published).

This order helps you recover signal quickly without immediately flooding the table with low-value events.
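
The broadening order above can be sketched as a loop that widens one filter at a time and stops as soon as rows appear. This is a minimal sketch under stated assumptions: `query` only imitates the filter bar over an in-memory list, and the `source`, `level`, `timestamp`, and `message` field names are hypothetical.

```python
from datetime import datetime, timedelta

LEVELS = ["Trace", "Debug", "Information", "Warning", "Error", "Critical"]

def query(rows, source, min_level, start, end):
    """Imitate the filter bar: source, minimum level, and DateTime range."""
    floor = LEVELS.index(min_level)
    return [
        r for r in rows
        if r["source"] == source
        and LEVELS.index(r["level"]) >= floor
        and start <= r["timestamp"] <= end
    ]

def broaden(rows, source, start, end):
    """Widen one filter at a time; stop at the first step that returns rows."""
    day_start = end - timedelta(hours=24)
    other = "Draft" if source == "Published" else "Published"
    attempts = [
        ("original filters", source, "Warning", start, end),
        ("1. wider DateTime range", source, "Warning", day_start, end),
        ("2. lower Log Level", source, "Information", day_start, end),
        # The text says to verify the source; trying the other one is this sketch's reading.
        ("3. other Log Source", other, "Information", day_start, end),
    ]
    for label, src, level, lo, hi in attempts:
        found = query(rows, src, level, lo, hi)
        if found:
            return label, found
    return "no rows found", []

# Hypothetical data: one Information row, 2.5 hours before the end of the window.
end = datetime(2024, 5, 1, 12, 0)
rows = [
    {"source": "Published", "level": "Information",
     "timestamp": datetime(2024, 5, 1, 9, 30), "message": "Custom API call completed"},
]
step, found = broaden(rows, "Published", end - timedelta(minutes=15), end)
print(step, len(found))  # 2. lower Log Level 1
```

Widening the time range before lowering the level keeps the result set small for as long as possible, which is the point of the order given above.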


Reading Log Entries

Each row in the log table contains:

| Column | What it shows |
| --- | --- |
| Timestamp | Exact date and time (millisecond precision) |
| Level | Severity chip (colour-coded) |
| Service | Which internal service produced the log |
| Message | Summary of the event (hover for full text) |

Read rows from left to right to avoid misdiagnosis:

  1. Start with Timestamp to verify event ordering.
  2. Check Level to decide urgency.
  3. Use Service to identify the owner layer (AI, retrieval, function tools, channel, auth).
  4. Use Message only as a summary, then open row actions for full evidence.

When several errors look similar, sort mentally by timestamp and inspect the earliest one first. The first failure often contains the most complete causal context, while later rows can be secondary effects.
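
The "earliest failure first" habit can be illustrated with a few rows copied out of the table. The row structure and field names below are hypothetical, and the timestamps are assumed to be ISO 8601 strings with millisecond precision.

```python
from datetime import datetime

# Hypothetical rows, deliberately listed out of order.
rows = [
    {"timestamp": "2024-05-01T10:15:02.480", "level": "Error",
     "service": "Channel service", "message": "Reply could not be delivered"},
    {"timestamp": "2024-05-01T10:15:02.112", "level": "Error",
     "service": "Function tool service", "message": "Custom API call failed"},
    {"timestamp": "2024-05-01T10:15:01.950", "level": "Information",
     "service": "AI service", "message": "Completion requested"},
]

def earliest_error(rows):
    """Return the first Error or Critical row by timestamp, or None if there is none."""
    errors = [r for r in rows if r["level"] in ("Error", "Critical")]
    return min(errors, key=lambda r: datetime.fromisoformat(r["timestamp"]), default=None)

first = earliest_error(rows)
print(first["service"])  # Function tool service
```

In this made-up example the channel error arrives 368 ms after the function tool error, so it is more likely a secondary effect than the root cause.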

Interpreting Service Names

Different services appear in the log depending on which part of the system is active:

  • AI service — Logs related to model completions, token usage, and context window management
  • Knowledge service — Retrieval operations against the vector database
  • Function tool service — Custom API calls and plugin executions
  • Channel service — Incoming and outgoing messages on connected channels (WhatsApp, widget, etc.)
  • Auth service — Authentication and authorisation events

When an error appears, the service name tells you which layer to investigate first.


Action Buttons

Every log row has three action buttons on the right side.

View Conversation

Opens the full conversation thread associated with this log event. Use this to trace a technical error back to the exact user interaction that caused it. Seeing the conversation alongside the log entry almost always reveals whether the failure was an integration problem, a content problem, or a model behaviour problem.

View User

Opens a dialog with details about the user who triggered this event. Useful for identifying whether an error pattern is isolated to one user (possibly a configuration or permission issue specific to that user) or affecting many users (a systemic integration failure).

View Context

Opens a panel showing structured key-value metadata attached to the log event.

The context panel is the richest source of debug information available in the Monitoring page. It typically includes:

  • The full request payload sent to an external API
  • The HTTP status code and response body returned by the external API
  • Timing data (how long each step took)
  • Token counts and model parameters for AI completions
  • Error detail from any failed operation

Use this quick reading pattern inside View Context:

  1. Find the request payload that was sent to the external API and validate its required fields.
  2. Find the status code of the response that came back.
  3. Read the response body for concrete error text.
  4. Compare timestamps/duration to spot timeout patterns.
  5. Correlate with Service and Message columns in the table row you opened.

This prevents focusing on the table summary while missing the actual root-cause payload details.

Tip: When debugging a failing Custom API call, always open View Context on the Error log. The context will contain the exact HTTP status code and response body from your external API — far more actionable than the summarised message in the table row.
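
The View Context reading pattern can be sketched as a small checklist function. This is a sketch under stated assumptions: the context is represented as a plain dictionary, the keys `request`, `status_code`, `response_body`, and `duration_ms` are hypothetical, and the 10-second slowness threshold is an arbitrary illustrative value.

```python
def read_context(context, required_fields, slow_ms=10_000):
    """Walk a context dictionary in the order described above and collect findings."""
    notes = []

    # Step 1: validate required fields in the request payload.
    payload = context.get("request", {})
    missing = [name for name in required_fields if payload.get(name) in (None, "")]
    if missing:
        notes.append(f"request is missing: {', '.join(missing)}")

    # Steps 2 and 3: status code, then the response body for concrete error text.
    status = context.get("status_code")
    if status is not None and status >= 400:
        notes.append(f"external API returned HTTP {status}: {context.get('response_body', '')}")

    # Step 4: duration, to spot timeout patterns.
    duration = context.get("duration_ms")
    if duration is not None and duration >= slow_ms:
        notes.append(f"call took {duration} ms, possible timeout")

    return notes

# Hypothetical context copied from a failing Custom API call.
context = {
    "request": {"order_id": "A-1001", "customer_email": ""},
    "status_code": 422,
    "response_body": "customer_email is required",
    "duration_ms": 184,
}
for note in read_context(context, ["order_id", "customer_email"]):
    print(note)
```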


Patterns to Act On

The table below lists the most common log patterns and what to do when you encounter them.

| What you see in logs | Likely cause | Next step |
| --- | --- | --- |
| Repeated Error from the same Service | Integration misconfiguration or credential expiry | Open View Context on one of the errors; check the API response details |
| Warning about token limits | Responses approaching the model context window | Shorten knowledge chunks or trim Interaction instructions |
| Warning about rate limits | High traffic hitting API quota | Review Custom API configuration or upgrade your external API plan |
| Error immediately after a config save | New configuration invalid | Switch Log Source to Draft Agent; check the specific error message |
| Error immediately after a knowledge upload | Document parsing or chunking issue | Check the resource status in Knowledge → Resources |
| Same Error from many different users at the same time | External API or downstream service outage | Check the external service status page; not a configuration issue |
| Error for one specific user only | User-level permission or account issue | Use View User to check their profile; check resource permissions |
| Critical level entry | Service-level failure | Note the Timestamp and Service name; contact support with these details |
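
The "one user versus many users" distinction in the table can be checked by counting distinct users per service. The sketch below assumes the rows have already been narrowed to the incident window and that each row carries a `user_id`, which is a hypothetical field standing in for whatever View User shows.

```python
from collections import defaultdict

def error_scope(rows):
    """Label each service by how many distinct users hit an Error or Critical entry."""
    users_by_service = defaultdict(set)
    for row in rows:
        if row["level"] in ("Error", "Critical"):
            users_by_service[row["service"]].add(row["user_id"])
    return {
        service: "one user" if len(users) == 1 else "many users"
        for service, users in users_by_service.items()
    }

# Hypothetical rows from a single incident window.
rows = [
    {"level": "Error", "service": "Function tool service", "user_id": "u1"},
    {"level": "Error", "service": "Function tool service", "user_id": "u2"},
    {"level": "Error", "service": "Function tool service", "user_id": "u3"},
    {"level": "Error", "service": "Auth service", "user_id": "u7"},
    {"level": "Error", "service": "Auth service", "user_id": "u7"},
    {"level": "Warning", "service": "AI service", "user_id": "u2"},
]
print(error_scope(rows))
# {'Function tool service': 'many users', 'Auth service': 'one user'}
```

Here "many users" points towards an external outage, while "one user" points towards a permission or account issue for that user.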

The Difference Between Error and Warning

An Error means an operation failed and the user likely received a degraded or incorrect response, so it requires prompt investigation.

A Warning means the system noticed something unexpected but recovered. Warnings that persist day after day should be investigated even if they have not escalated to Errors — they are often early indicators of an impending failure.


Using Monitoring for Post-Change Validation

Before you publish a configuration change, open Monitoring and set these filters:

  1. Log Source: Draft Agent (check before publishing)
  2. Log Level: Warning and above
  3. Start DateTime: The time of the config save

Scroll through the entries to confirm no new errors appear. Then publish. Switch to Published Agent and check again 15 minutes after publishing to confirm production is behaving as expected.

This two-step validation (Draft first, Published second) prevents a large proportion of configuration errors from reaching users.


Building a Debug Workflow

When a user reports that the agent gave a wrong or unexpected answer, follow this sequence in Monitoring:

  1. Identify the time window — ask the user for the approximate time, or look it up in the conversation thread.
  2. Filter to that window — set Start/End DateTime to ±5 minutes around the event.
  3. Filter to Warning and above — if nothing appears, widen to Information.
  4. Find the relevant row — look for entries from the function tool service or AI service that correspond to the conversation.
  5. Click View Conversation — confirm this is the right conversation.
  6. Click View Context — read the full request, response, and error detail.
  7. Identify the failure point — determine whether the failure is in the AI (model hallucination, context window truncation), the knowledge retrieval (no relevant chunks found), or a function tool (external API error).
  8. Act on the root cause — the failure point determines which part of the configuration to change.
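
Steps 2 and 7 of the workflow above can be sketched as two small helpers: one computes the ±5 minute window around the reported time, and the other maps the Service of the failing row to the failure point to investigate. The function names are hypothetical, and the mapping simply restates the three cases from step 7.

```python
from datetime import datetime, timedelta

def incident_window(reported_at, margin_minutes=5):
    """Return (start, end) values for the Start/End DateTime filters."""
    margin = timedelta(minutes=margin_minutes)
    return reported_at - margin, reported_at + margin

# Failure points as listed in step 7, keyed by the Service column.
FAILURE_POINTS = {
    "AI service": "AI: model hallucination or context window truncation",
    "Knowledge service": "knowledge retrieval: no relevant chunks found",
    "Function tool service": "function tool: external API error",
}

def failure_point(service):
    """Suggest where to look first; services outside step 7 need manual review."""
    return FAILURE_POINTS.get(service, "not covered by step 7: review manually")

start, end = incident_window(datetime(2024, 5, 1, 10, 15))
print(start.isoformat(), end.isoformat())
# 2024-05-01T10:10:00 2024-05-01T10:20:00
print(failure_point("Knowledge service"))
# knowledge retrieval: no relevant chunks found
```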