
# Knowledge Bases (RAG)

> Manage knowledge bases, upload files, scrape websites, and export/import RAG data.

<div style={{ display: 'flex', gap: '8px', marginBottom: '16px' }}>
  <span style={{ background: '#6366F1', color: '#fff', padding: '2px 10px', borderRadius: '12px', fontSize: '13px', fontWeight: 500 }}>Upcoming Mid April 2026</span>
</div>

**Base URL:** `https://api-staging.internal-aui.io/knowledge-base-manager/v1`

**Auth:** `x-api-key` header
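As a rough illustration of the auth scheme, the sketch below builds (but does not send) a request carrying the `x-api-key` header, using only Python's standard library. The helper name `kbm_request` is hypothetical, and sending JSON bodies with `Content-Type: application/json` is an assumption; this page only states the content type for file uploads.

```python
import json
import urllib.request

BASE_URL = "https://api-staging.internal-aui.io/knowledge-base-manager/v1"


def kbm_request(path, api_key, body=None):
    """Build a request with the x-api-key header (hypothetical helper)."""
    headers = {"x-api-key": api_key}
    data = None
    if body is not None:
        # Assumption: JSON endpoints accept application/json bodies.
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    method = "GET" if body is None else "POST"
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


# Prepare the list call; pass the result to urllib.request.urlopen to send it.
request = kbm_request("/knowledge-bases", "your-kbm-api-key")
```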

***

## List Knowledge Bases

### GET `/knowledge-bases`

#### Query Parameters

<ParamField query="scope_type" type="string">
  Scope type filter.
</ParamField>

<ParamField query="network_id" type="string">
  Network ID.
</ParamField>

<ParamField query="account_id" type="string">
  Account ID.
</ParamField>

<ParamField query="seed_id" type="string">
  Seed ID.
</ParamField>

<ParamField query="include_hierarchy" type="boolean">
  When `true`, also returns knowledge bases inherited from parent scopes.
</ParamField>

```bash cURL theme={null}
curl "https://api-staging.internal-aui.io/knowledge-base-manager/v1/knowledge-bases?network_id=net-id&account_id=acc-id&seed_id=seed-id&include_hierarchy=true" \
  -H "x-api-key: your-kbm-api-key"
```

<ResponseExample>
  ```json 200 theme={null}
  [
    {
      "id": "string",
      "name": "string",
      "scope_id": "string",
      "scope": {
        "type": "string",
        "network_id": "string",
        "account_id": "string",
        "seed_id": "string",
        "network_category_id": "string"
      }
    }
  ]
  ```
</ResponseExample>

### Best Practices

* **Use `include_hierarchy=true` for full visibility** — This returns knowledge bases inherited from parent scopes (account, organization), not just network-level ones, which is essential for understanding what data the agent can access.
* **Filter by scope to avoid confusion** — Always pass `network_id` and `account_id` to scope the results. Without filters, you may see knowledge bases from other agents.
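The filtering advice above could be sketched as a small URL builder that always requires `network_id` and `account_id`. The function name is illustrative, and encoding a false flag as the string `false` is an assumption; the cURL example only shows `true`.

```python
from urllib.parse import urlencode

BASE_URL = "https://api-staging.internal-aui.io/knowledge-base-manager/v1"


def list_knowledge_bases_url(network_id, account_id, seed_id=None,
                             include_hierarchy=True):
    """Return the GET /knowledge-bases URL with scope filters applied."""
    params = {"network_id": network_id, "account_id": account_id}
    if seed_id is not None:
        params["seed_id"] = seed_id
    # The cURL example passes the flag as the lowercase string "true".
    params["include_hierarchy"] = "true" if include_hierarchy else "false"
    return f"{BASE_URL}/knowledge-bases?{urlencode(params)}"


url = list_knowledge_bases_url("net-id", "acc-id", seed_id="seed-id")
```

Making the two scope IDs positional arguments means an unfiltered call cannot be built by accident.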

***

## Upload Files (Bulk)

Upload files to a knowledge base for RAG processing.

### POST `/bulk/files`

**Content-Type:** `multipart/form-data`

| Field                        | Type    | Description                   |
| ---------------------------- | ------- | ----------------------------- |
| `files`                      | File\[] | Files to upload               |
| `created_by`                 | string  | User email                    |
| `scope_type`                 | string  | Scope type                    |
| `network_id`                 | string  | Network ID                    |
| `account_id`                 | string  | Account ID                    |
| `seed_id`                    | string  | Seed ID                       |
| `knowledge_base_id`          | string  | Existing KB ID (optional)     |
| `knowledge_base_name`        | string  | New KB name (optional)        |
| `knowledge_base_description` | string  | New KB description (optional) |

<ResponseExample>
  ```json 200 theme={null}
  {
    "knowledge_base_id": "string",
    "knowledge_base_name": "string",
    "knowledge_base_created": true,
    "results": [
      { "resource_id": "string", "resource_name": "string", "status": "string" }
    ],
    "total_files": 3,
    "files_accepted": 3,
    "files_failed": 0
  }
  ```
</ResponseExample>
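One way to inspect this response is sketched below. It assumes the response has already been parsed into a Python dict with the fields shown above; `upload_report` is a hypothetical helper name. The possible `status` values are not listed on this page, so the sketch reports them verbatim rather than interpreting them.

```python
def upload_report(response):
    """Summarize a POST /bulk/files response as human-readable lines."""
    lines = [
        f"{response['files_accepted']} of {response['total_files']} files "
        f"accepted into knowledge base {response['knowledge_base_id']}"
    ]
    if response["files_failed"] > 0:
        # Status values are undocumented here, so print each one as-is.
        for item in response["results"]:
            lines.append(f"  {item['resource_name']}: {item['status']}")
    return lines
```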

### Best Practices

* **Supported formats** — Upload PDF, DOCX, CSV, TXT, and other document formats. Check `files_failed` in the response to catch unsupported or corrupted files.
* **Use descriptive `knowledge_base_name`** — Name knowledge bases by content domain (e.g., `product-catalog-2026`, `return-policy`). This makes them easier to manage and audit.
* **Upload to an existing KB when possible** — Pass `knowledge_base_id` to add files to an existing knowledge base rather than creating a new one for every upload. This keeps related content together.
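* **Batch uploads** — Send multiple files in a single request rather than one file per request. This reduces overhead and returns every file's status in one response. The response counts `files_accepted` and `files_failed` separately, so do not assume a batch is all-or-nothing.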
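The choice between an existing and a new knowledge base could be expressed as a small form-field builder, sketched here. The function name is hypothetical, and dropping `knowledge_base_name` and `knowledge_base_description` when `knowledge_base_id` is given is an assumption; this page only marks all three as optional.

```python
def upload_form_fields(created_by, scope_type, network_id, account_id,
                       seed_id=None, knowledge_base_id=None,
                       knowledge_base_name=None,
                       knowledge_base_description=None):
    """Return the non-file form fields for POST /bulk/files."""
    fields = {
        "created_by": created_by,
        "scope_type": scope_type,
        "network_id": network_id,
        "account_id": account_id,
        "seed_id": seed_id,
    }
    if knowledge_base_id is not None:
        # Prefer adding to an existing knowledge base.
        fields["knowledge_base_id"] = knowledge_base_id
    else:
        fields["knowledge_base_name"] = knowledge_base_name
        fields["knowledge_base_description"] = knowledge_base_description
    # Omit anything the caller did not supply.
    return {key: value for key, value in fields.items() if value is not None}
```

The files themselves would be attached as repeated `files` parts by whichever multipart client you use.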

***

## Upload Websites (Bulk Scrape)

Scrape websites and add them to a knowledge base.

### POST `/bulk/websites`

```json theme={null}
{
  "urls": [
    { "url": "https://example.com", "resource_name": "string", "description": "string" }
  ],
  "scope": {
    "type": "string",
    "network_id": "string",
    "account_id": "string",
    "seed_id": "string"
  },
  "created_by": "string",
  "knowledge_base_id": "string",
  "knowledge_base_name": "string",
  "knowledge_base_description": "string"
}
```

<ResponseExample>
  ```json 200 theme={null}
  {
    "knowledge_base_id": "string",
    "knowledge_base_name": "string",
    "knowledge_base_created": true,
    "results": [],
    "total_urls": 2,
    "urls_accepted": 2,
    "urls_failed": 0
  }
  ```
</ResponseExample>

### Best Practices

* **Provide meaningful `resource_name`** — Name each URL resource clearly (e.g., `FAQ Page`, `Pricing`) rather than using the URL as the name. This improves RAG retrieval quality.
* **Scrape sparingly** — Website scraping is resource-intensive. Avoid scraping entire sites — target specific pages with high-quality content relevant to the agent's domain.
* **Check `urls_failed` in the response** — Pages behind authentication or CAPTCHAs, as well as JavaScript-heavy SPAs, may fail to scrape. Verify each URL is accessible before submitting.
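A minimal sketch of assembling the request body with an explicit name per page follows. The helper name and the `pages` input shape are assumptions made for illustration, and rejecting a `resource_name` equal to the URL is a local check, not API behavior.

```python
def website_payload(pages, scope, created_by, knowledge_base_id=None,
                    knowledge_base_name=None):
    """Build the JSON body for POST /bulk/websites.

    `pages` is a list of dicts with "url", "resource_name" and an optional
    "description" (an input shape chosen for this sketch).
    """
    urls = []
    for page in pages:
        if page["resource_name"] == page["url"]:
            raise ValueError(f"Give {page['url']} a descriptive resource_name")
        entry = {"url": page["url"], "resource_name": page["resource_name"]}
        if "description" in page:
            entry["description"] = page["description"]
        urls.append(entry)
    payload = {"urls": urls, "scope": scope, "created_by": created_by}
    if knowledge_base_id is not None:
        payload["knowledge_base_id"] = knowledge_base_id
    elif knowledge_base_name is not None:
        payload["knowledge_base_name"] = knowledge_base_name
    return payload
```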

***

## Export Knowledge Bases

Export all knowledge bases for a scope, including vectors.

### POST `/bulk/export`

```json theme={null}
{
  "scope": {
    "type": "string",
    "network_id": "string",
    "account_id": "string",
    "seed_id": "string",
    "network_category_id": "string"
  }
}
```

<ResponseExample>
  ```json 200 theme={null}
  {
    "scope_levels": [
      {
        "scope_level": "string",
        "scope": {},
        "knowledge_bases": [
          {
            "name": "string",
            "description": "string",
            "resources": [
              {
                "file_name": "string",
                "resource_type": "string",
                "vectors": [
                  {
                    "title": "string",
                    "content": "string",
                    "category": "string",
                    "sub_category": "string",
                    "tags": [],
                    "source_url": "string"
                  }
                ]
              }
            ]
          }
        ]
      }
    ]
  }
  ```
</ResponseExample>
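To get a quick sense of what an export contains, the nested structure above can be walked to count vectors per knowledge base. This is a sketch that assumes the response has been parsed into a dict; `count_vectors` is an illustrative name.

```python
def count_vectors(export):
    """Map (scope_level, knowledge base name) to its total vector count."""
    counts = {}
    for level in export.get("scope_levels", []):
        for kb in level.get("knowledge_bases", []):
            key = (level.get("scope_level"), kb.get("name"))
            total = sum(
                len(resource.get("vectors", []))
                for resource in kb.get("resources", [])
            )
            counts[key] = counts.get(key, 0) + total
    return counts
```

Keying by scope level as well as name keeps same-named knowledge bases at different levels separate.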

### Best Practices

* **Export before destructive changes** — Always export knowledge bases before deleting resources, migrating agents, or making bulk vector changes. The export includes all vectors and can be re-imported.
* **Use exports for environment promotion** — Export from staging and import into production to ensure the same knowledge base content is available in both environments.
* **Store exports in version control** — Save export JSON files alongside your `.aui.json` agent configs. This gives you a full backup and change history of your agent's knowledge.
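For version control, it helps if the same export always serializes to the same bytes, so diffs only show real changes. The sketch below writes an export with sorted keys and fixed indentation; the function name and file path are hypothetical.

```python
import json
from pathlib import Path


def save_export(export, path):
    """Write an export dict as stable, diff-friendly JSON and return the path."""
    text = json.dumps(export, indent=2, sort_keys=True, ensure_ascii=False)
    target = Path(path)
    # A trailing newline keeps the file friendly to most diff tools.
    target.write_text(text + "\n", encoding="utf-8")
    return target
```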

***

## Import Knowledge Bases

Import previously exported knowledge base data.

### POST `/bulk/import`

```json theme={null}
{
  "scope_levels": [],
  "created_by": "string"
}
```

<ResponseExample>
  ```json 200 theme={null}
  {
    "results": [
      {
        "scope_level": "string",
        "knowledge_bases_created": 1,
        "knowledge_bases_found": 0,
        "resources_created": 3,
        "resources_updated": 0,
        "vectors_created": 15,
        "vectors_updated": 0,
        "vectors_deleted": 0,
        "vectors_unchanged": 0
      }
    ]
  }
  ```
</ResponseExample>
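The counters in this response lend themselves to an automated check after each import. The sketch below, with a hypothetical helper name, flags any scope level that reports deleted vectors, assuming the response has been parsed into a dict.

```python
def import_warnings(response, allow_deletes=False):
    """Return one warning per scope level that reports deleted vectors."""
    warnings = []
    for result in response.get("results", []):
        deleted = result.get("vectors_deleted", 0)
        if deleted > 0 and not allow_deletes:
            warnings.append(
                f"{result.get('scope_level')}: {deleted} vectors deleted"
            )
    return warnings
```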

### Best Practices

* **Review the results summary** — Check `vectors_created`, `vectors_updated`, and `vectors_deleted` to understand exactly what changed. Unexpected deletes may indicate a scope mismatch.
* **Import is idempotent** — Re-importing the same data will update existing vectors rather than creating duplicates. Use this for safe re-syncs.
* **Match scope carefully** — The import uses scope levels from the export. Ensure the target environment has matching network, account, and organization IDs, or remap them before importing.
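Remapping IDs before import might look like the sketch below. It assumes each exported `scope` object carries string ID fields like those in the export request (`network_id`, `account_id`, and so on), which this page does not show explicitly, and that the import body reuses the export's `scope_levels` unchanged otherwise. Both helper names are hypothetical.

```python
import copy


def remap_scope_ids(scope_levels, id_map):
    """Return a copy of scope_levels with scope IDs replaced via id_map."""
    remapped = copy.deepcopy(scope_levels)
    for level in remapped:
        scope = level.get("scope") or {}
        for key, value in scope.items():
            # Only string values that appear in id_map are rewritten.
            if isinstance(value, str) and value in id_map:
                scope[key] = id_map[value]
    return remapped


def import_payload(export, created_by, id_map=None):
    """Build the POST /bulk/import body from a previous export."""
    levels = remap_scope_ids(export["scope_levels"], id_map or {})
    return {"scope_levels": levels, "created_by": created_by}
```

The deep copy leaves the original export untouched, so the same file can be imported into several environments with different ID maps.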
