Indexing Data Sources

Learn how to index data sources for better searchability and access.

Once a data source is added to the AI/Run CodeMie platform, you can proceed with indexing it. Indexing enables assistants and workflows to interact with the data within the data source. Assistants can only access data that has been indexed, so you need to re-index the data source each time it is updated. Otherwise, AI/Run CodeMie will not recognize the new data.

Automatic Indexing

By default, AI/Run CodeMie automatically indexes new data sources as soon as they are added to the platform:

Automatic Indexing

info

You don't have to wait till a data source is fully indexed. You can already add this data source into your assistants.

Partial Data Access

If you add a data source before indexing completes, assistants will only access already-indexed content. Wait for full indexing for complete coverage.

Manual Reindexing

To manually trigger data source reindexing, click the actions button and select Full Reindex or Incremental Index:

Reindex Options

Indexing Types

Full Reindex

A complete reindexing of the data source. This involves scanning and updating all data from the source, regardless of any changes.

When to use Full Reindex:

After major updates to the data source
When data integrity issues are suspected
For initial indexing of large data sources
When switching indexing strategies or configurations

Incremental Index

This option is currently only supported for Jira data sources. With this option only new or changed data from the source will be updated. It only reindexes the data that has been modified since the last reindex.

When to use Incremental Index:

For frequent, small updates to Jira data sources
To save time and resources on large Jira projects
For regular synchronization of Jira issues
When only recent changes need to be reflected

Jira-Specific Feature

Incremental indexing is currently only available for Jira. Other data sources require full reindexing. We're working on expanding this feature.

Alternative Access

Full reindex is also available on Data Source Details page: Data Source tab → Selected data source → 3 dots → View → scroll the page to the bottom

Automatic Reindexing with Scheduler

Instead of manually triggering reindexing, you can configure automatic reindexing schedules directly in the data source settings. The built-in scheduler is available for all data source types except Provider and File data sources.

Setting Up Automatic Reindexing

The scheduler can be configured when creating a new data source or when updating an existing one:

Navigate to your data source creation or update form
Locate the Reindex Type section
Select your preferred schedule from the Scheduler dropdown

Available Schedules

No schedule (manual only) - Default, manual reindexing only
Every hour - Automatic reindexing every hour
Daily at midnight - Once per day at midnight
Weekly on Sunday at midnight - Once per week
Monthly on the 1st at midnight - Once per month
Custom cron expression - Enter your own cron expression for custom timing

When to Use Scheduled Reindexing

Use automatic scheduling when:

Data source updates regularly (Git repos, Confluence, Jira)
You want assistants to always have access to latest content
You prefer a "set it and forget it" approach

Use manual reindexing when:

Data source rarely changes
You have full control over update timing
Working in development/testing environments
Data source is very large and updates infrequently

For detailed scheduler configuration, see the Scheduler documentation.

Resuming Indexing

If indexing is interrupted or paused, you can resume the process by clicking the Resume Indexing button:

Resume Indexing

Indexing Best Practices

Data Sources with Scheduler Support

The built-in automatic reindexing scheduler availability by data source type:

Data Source Type	Automatic Scheduler
Git Repositories	✅ Available
Confluence	✅ Available
Jira	✅ Available
Google Docs	✅ Available
AWS Knowledge Bases	✅ Available
X-ray	✅ Available
File	❌ Not available
Provider	❌ Not available

Monitoring Indexing Status

Keep track of your data source indexing status:

Check the STATUS column in the data sources list
Monitor indexing progress for large data sources
Verify successful completion of indexing operations
Review any indexing errors or warnings

Optimizing Indexing Performance

Schedule Large Reindexes: Perform full reindexes during off-peak hours
Use Incremental Indexing: Leverage incremental indexing for Jira when possible
Batch Updates: Group multiple changes before triggering reindex
Clean Up First: Remove outdated or unnecessary data before reindexing

Data Loss Warning

Full reindex replaces all existing indexed data. If your data source has changed significantly (e.g., repository deleted, Confluence pages removed), those items will be removed from search results.

Indexing Status Indicators

Data sources display different statuses during the indexing process:

Indexing: Currently being indexed
Indexed: Successfully indexed and ready to use
Failed: Indexing encountered an error
Pending: Waiting to start indexing
Paused: Indexing process is paused

Now your data sources are indexed and ready to be used by assistants and workflows for enhanced AI-powered operations.

Automatic Indexing​

Manual Reindexing​

Indexing Types​

Full Reindex​

Incremental Index​

Automatic Reindexing with Scheduler​

Setting Up Automatic Reindexing​

Available Schedules​

When to Use Scheduled Reindexing​

Resuming Indexing​

Indexing Best Practices​

Data Sources with Scheduler Support​

Monitoring Indexing Status​

Optimizing Indexing Performance​

Indexing Status Indicators​

Automatic Indexing

Manual Reindexing

Indexing Types

Full Reindex

Incremental Index

Automatic Reindexing with Scheduler

Setting Up Automatic Reindexing

Available Schedules

When to Use Scheduled Reindexing

Resuming Indexing

Indexing Best Practices

Data Sources with Scheduler Support

Monitoring Indexing Status

Optimizing Indexing Performance

Indexing Status Indicators