Semantic Search
Simili Bot uses semantic search to find related issues across your repositories based on meaning, not just keywords. This approach allows the bot to identify relationships that traditional keyword-based search would miss.
How It Works
Theory vs Reality
In a traditional search system, an issue titled “Login button doesn’t work” might be missed if you search for “authentication failures”. Semantic search bridges this gap.
- Traditional Search: Only finds issues with exact word matches.
- Semantic Search: AI understands that “Can’t authenticate” is semantically related to “Sign-in issues” and “Login broken”.
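The contrast above can be sketched in a few lines of Python. This is only an illustration: the three-dimensional vectors are hand-made stand-ins for real embeddings, and the helper names `keyword_overlap` and `cosine_similarity` are hypothetical, not part of Simili Bot.

```python
import math
import re


def keyword_overlap(a: str, b: str) -> set:
    """Return the lowercase words that appear in both strings."""
    def tokenize(text: str) -> set:
        return set(re.findall(r"[a-z]+", text.lower()))
    return tokenize(a) & tokenize(b)


def cosine_similarity(u: list, v: list) -> float:
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


# Toy vectors invented for this example; real embeddings come from a model.
toy_vectors = {
    "Login button doesn't work": [0.9, 0.1, 0.2],
    "authentication failures": [0.8, 0.2, 0.3],
    "Update README badges": [0.1, 0.9, 0.1],
}

issue = "Login button doesn't work"
query = "authentication failures"
unrelated = "Update README badges"

# Keyword matching finds no shared words between the issue and the query.
shared_words = keyword_overlap(issue, query)

# Vector comparison still places the issue close to the query.
related_score = cosine_similarity(toy_vectors[issue], toy_vectors[query])
unrelated_score = cosine_similarity(toy_vectors[issue], toy_vectors[unrelated])

print(shared_words, round(related_score, 2), round(unrelated_score, 2))
```

With these made-up vectors, the issue and the query share no keywords yet score far higher against each other than against the unrelated title, which is the behavior semantic search relies on.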
The Process
Simili Bot follows a three-step process to enable semantic discovery:
Embedding
Text from the issue title, body, and comments is converted into a 768-dimensional vector using Google’s text-embedding-004 model.
Indexing
These vectors are stored in the Qdrant vector database along with the issue’s metadata (repository, labels, author).
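To make the shape of an indexed record concrete, here is a minimal in-memory sketch. It is a stand-in for illustration only and does not use Qdrant's API or call the embedding model: `IssuePoint`, `InMemoryIndex`, and `build_issue_text` are hypothetical names, the zero vector is a placeholder, and joining the title, body, and comments with blank lines is an assumption about how the text might be combined.

```python
from dataclasses import dataclass, field

EMBEDDING_DIM = 768  # vector size stated in the Embedding step


@dataclass
class IssuePoint:
    """One indexed issue: an id, its embedding, and its metadata."""
    issue_id: int
    vector: list
    payload: dict = field(default_factory=dict)


class InMemoryIndex:
    """Tiny stand-in for a vector database collection."""

    def __init__(self, dim: int = EMBEDDING_DIM):
        self.dim = dim
        self.points = {}

    def upsert(self, point: IssuePoint) -> None:
        # Reject vectors whose size does not match the collection.
        if len(point.vector) != self.dim:
            raise ValueError(f"expected {self.dim} dimensions, got {len(point.vector)}")
        self.points[point.issue_id] = point


def build_issue_text(title: str, body: str, comments: list) -> str:
    """Join the non-empty parts of an issue into one string (assumed format)."""
    return "\n\n".join(part for part in [title, body, *comments] if part)


index = InMemoryIndex()
text = build_issue_text("Login broken", "Clicking Sign in does nothing.", ["Same here."])
placeholder_vector = [0.0] * EMBEDDING_DIM  # a real vector would come from the model
index.upsert(
    IssuePoint(
        issue_id=1,
        vector=placeholder_vector,
        payload={"repository": "org/repo", "labels": ["bug"], "author": "octocat"},
    )
)
```

The payload keys mirror the metadata listed above (repository, labels, author); the values are invented for the example.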
Configuration
Tuning the search sensitivity is crucial for balancing noise and discovery.
Tuning Thresholds
The similarity_threshold determines how strict the bot is when suggesting related issues.
| Level | Value | Effect |
|---|---|---|
| Conservative | 0.80 | Only returns issues that are nearly identical in meaning. |
| Recommended | 0.70 | Provides a good balance of accuracy and broad discovery. |
| Permissive | 0.60 | Returns loosely related issues; higher chance of false positives. |
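The effect of each level can be sketched by filtering one list of candidates at the three thresholds. The titles and scores below are made up for illustration, `filter_related` is a hypothetical helper, and treating the threshold as an inclusive lower bound (`>=`) is an assumption rather than documented behavior.

```python
def filter_related(candidates, similarity_threshold=0.70):
    """Keep candidates scoring at or above the threshold, best match first."""
    kept = [(title, score) for title, score in candidates if score >= similarity_threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Invented similarity scores for a new issue titled "Can't authenticate".
candidates = [
    ("Sign-in issues", 0.84),
    ("Login broken", 0.73),
    ("Session timeout too short", 0.64),
    ("Update README badges", 0.21),
]

for level, threshold in [("Conservative", 0.80), ("Recommended", 0.70), ("Permissive", 0.60)]:
    titles = [title for title, _ in filter_related(candidates, threshold)]
    print(f"{level} ({threshold:.2f}): {titles}")

# Conservative (0.80): ['Sign-in issues']
# Recommended (0.70): ['Sign-in issues', 'Login broken']
# Permissive (0.60): ['Sign-in issues', 'Login broken', 'Session timeout too short']
```

Lowering the threshold only ever adds suggestions, so a practical way to tune it is to start at the recommended value and lower it only if relevant issues are being missed.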
Configuration Example
Add these settings to your simili.yaml or workflow environment: