Webscraping

Scraping Controls

Enable Web Scraping: Global switch that allows page-content extraction for agent use.
Include Images: Controls whether images are retained in the scraped markdown output.
Timeout Setting: Sets the scraping timeout; the UI takes the value in seconds.

Image Filtering

Minimum Image Size Filter: Excludes images smaller than the configured threshold.
Maximum Image Size Filter: Excludes images larger than the configured threshold.
Size Unit Conversion: Converts UI units (KB and MB) to backend byte values when settings are persisted.
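
The size filters and unit conversion described above can be sketched as follows. This is a minimal illustration, not the product's implementation: the function names `to_bytes` and `keep_image` are hypothetical, and the sketch assumes binary multiples (1 KB = 1024 bytes), which the backend may or may not use.

```python
from typing import Optional

# Assumption: binary multiples (1 KB = 1024 bytes). The product may use 1000 instead.
UNIT_FACTORS = {"KB": 1024, "MB": 1024 * 1024}


def to_bytes(value: float, unit: str) -> int:
    """Convert a UI size value in KB or MB to a whole number of bytes."""
    try:
        factor = UNIT_FACTORS[unit.upper()]
    except KeyError:
        raise ValueError(f"unsupported unit: {unit!r}") from None
    if value < 0:
        raise ValueError("size must be non-negative")
    return int(round(value * factor))


def keep_image(size_bytes: int,
               min_bytes: Optional[int],
               max_bytes: Optional[int]) -> bool:
    """Return True when the image size lies within the configured thresholds.

    A threshold of None means that filter is not set (hypothetical convention).
    """
    if min_bytes is not None and size_bytes < min_bytes:
        return False
    if max_bytes is not None and size_bytes > max_bytes:
        return False
    return True
```

For example, with a 5 KB minimum and a 1.5 MB maximum, `keep_image(6000, to_bytes(5, "KB"), to_bytes(1.5, "MB"))` keeps the image, while a 100-byte image would be excluded.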

Path Exclusion Controls

Ignored Path Patterns: Configures path patterns to exclude from scraping.
Pattern List Management: Supports adding and removing ignored path entries.
Pattern Limit Guard: Enforces a maximum size for the ignored path pattern list.
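
A rough sketch of how add/remove operations and the limit guard could fit together is shown below. The class name `IgnoredPathPatterns`, the limit of 20, and the rejection of blank or duplicate entries are all assumptions made for illustration; the actual limit and validation rules are not stated here.

```python
from typing import List

# Hypothetical limit; the actual maximum enforced by the product is not stated.
MAX_IGNORED_PATTERNS = 20


class IgnoredPathPatterns:
    """Ordered list of ignored path patterns with a size guard."""

    def __init__(self, max_patterns: int = MAX_IGNORED_PATTERNS) -> None:
        self._max = max_patterns
        self._patterns: List[str] = []

    def add(self, pattern: str) -> bool:
        """Add a pattern; return False if it is blank, a duplicate, or over the limit."""
        pattern = pattern.strip()
        if not pattern or pattern in self._patterns:
            return False
        if len(self._patterns) >= self._max:
            return False  # limit guard: the list is already full
        self._patterns.append(pattern)
        return True

    def remove(self, pattern: str) -> bool:
        """Remove a pattern; return False if it was not in the list."""
        try:
            self._patterns.remove(pattern)
        except ValueError:
            return False
        return True

    def as_list(self) -> List[str]:
        """Return a copy so callers cannot bypass the guard."""
        return list(self._patterns)
```

Returning a boolean from `add` lets a UI disable the add control or show a message once the limit is reached, instead of raising an error.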

Persistence and Data Mapping

UI-to-API Unit Mapping: Converts timeout and image-size values between the units shown in the UI and the units used by the API.
Save Confirmation Flow: Requires explicit confirmation before scraping configuration updates are persisted.
Agent Profile Sync: Writes the saved scraping settings back into the active agent profile state.
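
The save path described above (map units, confirm, persist, then sync the profile) might look roughly like this. Everything here is a sketch under assumptions: the field names, the `confirm` and `persist` callbacks, and in particular the idea that the API takes the timeout in milliseconds are hypothetical and not confirmed by this page.

```python
from typing import Any, Callable, Dict


def to_api_payload(ui: Dict[str, Any]) -> Dict[str, Any]:
    """Map UI values to a hypothetical API representation."""
    return {
        "enabled": ui["enabled"],
        "include_images": ui["include_images"],
        # Assumption: UI seconds become API milliseconds.
        "timeout_ms": int(ui["timeout_seconds"] * 1000),
    }


def save_settings(ui: Dict[str, Any],
                  profile: Dict[str, Any],
                  confirm: Callable[[], bool],
                  persist: Callable[[Dict[str, Any]], None]) -> bool:
    """Persist settings only after confirmation, then sync the agent profile."""
    if not confirm():
        return False  # user declined; nothing is written
    payload = to_api_payload(ui)
    persist(payload)
    # Sync: store a copy of what was saved in the active profile state.
    profile["scraping"] = dict(payload)
    return True
```

Updating the profile only after `persist` returns keeps the in-memory profile from drifting ahead of what was actually saved.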