Gemini Integration

How Simili Bot uses Google Gemini for AI analysis.

Services

Simili Bot uses Gemini for:
  1. Text Embeddings - Convert text to vectors for semantic search
  2. LLM Analysis - AI reasoning, duplicate detection, routing, and triage

Embeddings

gemini-embedding-001

Default embedding model in v0.2.0:
Input: Text string
Output: 3072-dimensional vector
Speed: 100-500ms per request
Use Cases:
  • Convert issue text to vector for similarity search
  • Generate embeddings for all issues during bulk indexing
  • Embed PR content (title + body + changed files) for PR duplicate detection
Cost: Refer to the Google AI pricing page for current rates.
v0.1.0 used text-embedding-004 (768 dimensions); v0.2.0 uses gemini-embedding-001 (3072 dimensions). If you’re migrating from v0.1.0, you must re-index your collection after updating.
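Once issues are embedded, similarity search reduces to comparing vectors. A minimal sketch of cosine similarity, the standard comparison for embeddings (the toy 3-dimensional vectors below stand in for real 3072-dimensional Gemini embeddings):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embedding dimensions must match")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional stand-ins for real 3072-dimensional Gemini embeddings.
issue_a = [0.1, 0.9, 0.2]
issue_b = [0.1, 0.8, 0.3]
score = cosine_similarity(issue_a, issue_b)  # close to 1.0 for similar text
```

Scores near 1.0 indicate semantically similar text; this is what backs the "similar issues" candidates passed to the LLM for duplicate detection.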

LLM analysis

Default LLM model: gemini-2.5-flash

1. Duplicate detection

Input: Current issue + similar issues
Task: Determine whether the current issue duplicates any of them
Output: Boolean + confidence (0.0-1.0) + reasoning
Speed: 2-5 seconds

2. Quality assessment

Input: Issue title + body
Task: Evaluate description quality
Output: Score (0-100) + suggestions
Speed: 1-3 seconds

3. Issue routing

Input: Issue + repository descriptions
Task: Determine correct repository
Output: Target repo name + reasoning
Speed: 2-5 seconds

4. Auto triage

Input: Issue content + available labels
Task: Suggest appropriate labels
Output: Labels + confidence scores
Speed: 1-2 seconds
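Each of the four tasks follows the same shape: build a prompt, call the model, parse structured output. A hedged sketch for the duplicate-detection step, using a simulated reply in place of a real Gemini call (the prompt wording and JSON schema here are illustrative, not Simili Bot's actual prompts):

```python
import json

def build_duplicate_prompt(issue: str, candidates: list[str]) -> str:
    """Assemble an illustrative prompt asking the LLM for a JSON verdict."""
    numbered = "\n".join(f"{i + 1}. {c}" for i, c in enumerate(candidates))
    return (
        "Is the new issue a duplicate of any candidate?\n"
        f"New issue: {issue}\n"
        f"Candidates:\n{numbered}\n"
        'Reply as JSON: {"duplicate": bool, "confidence": 0.0-1.0, "reasoning": str}'
    )

def parse_verdict(raw: str) -> tuple[bool, float, str]:
    """Parse the model's JSON reply into (duplicate, confidence, reasoning)."""
    data = json.loads(raw)
    confidence = min(max(float(data["confidence"]), 0.0), 1.0)  # clamp to 0.0-1.0
    return bool(data["duplicate"]), confidence, str(data["reasoning"])

# Simulated model reply; a real call would go through the Gemini API.
reply = '{"duplicate": true, "confidence": 0.92, "reasoning": "Same stack trace."}'
verdict = parse_verdict(reply)
```

Clamping the confidence guards against the model returning an out-of-range number; the other three tasks differ only in prompt and output schema.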

Models

Model                   Type        Default
gemini-embedding-001    Embeddings  Yes
gemini-2.5-flash        LLM         Yes
gemini-2.0-flash-lite   LLM         No (previous default)

Configuration

embedding:
  provider: "gemini"
  api_key: "${GEMINI_API_KEY}"
  model: "gemini-embedding-001"
  dimensions: 3072

llm:
  provider: "gemini"
  api_key: "${GEMINI_API_KEY}"
  model: "gemini-2.5-flash"

API quotas

Free Tier:
  • Embeddings: 50 requests/minute
  • LLM Calls: 15 requests/minute
  • Generous monthly limits
Paid:
  • Pay-as-you-go
  • Higher rate limits available
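Staying under the free-tier limits client-side means spacing requests out: 50 embeddings/minute works out to one request every 1.2 seconds. A minimal fixed-interval limiter sketch (an illustration, not Simili Bot's actual throttling):

```python
import time

class IntervalLimiter:
    """Enforces a minimum gap between requests (e.g. 60 / 50 = 1.2 s)."""

    def __init__(self, requests_per_minute: int,
                 clock=time.monotonic, sleep=time.sleep):
        self.min_gap = 60.0 / requests_per_minute
        self._clock = clock
        self._sleep = sleep
        self._last = None

    def wait(self) -> None:
        """Block until it is safe to send the next request."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_gap - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
        self._last = self._clock()

limiter = IntervalLimiter(requests_per_minute=50)
limiter.wait()  # first call returns immediately; later calls pace to 1.2 s apart
```

Injecting `clock` and `sleep` keeps the limiter testable without real waiting.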

Error handling

Graceful degradation if Gemini is unavailable:
If embeddings fail:
  → Step skipped
  → No similarity search
  → Continue with other analysis

If LLM fails:
  → That analysis step skipped
  → Other steps continue
  → Error logged
Simili Bot retries transient failures with exponential backoff and surfaces them as typed errors.
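The retry behaviour can be sketched as backoff that only retries errors marked transient (the `TransientError` type and delay values below are illustrative, not Simili Bot's actual error types):

```python
import time

class TransientError(Exception):
    """An error worth retrying, e.g. a rate limit or timeout."""

def with_retries(call, attempts=4, base_delay=0.5, sleep=time.sleep):
    """Run `call`, retrying transient failures with exponential backoff."""
    for attempt in range(attempts):
        try:
            return call()
        except TransientError:
            if attempt == attempts - 1:
                raise  # out of retries: surface the typed error
            sleep(base_delay * (2 ** attempt))  # 0.5 s, 1 s, 2 s, ...

# Demo: fail twice with a transient error, then succeed.
state = {"failures": 2}
def flaky_call():
    if state["failures"] > 0:
        state["failures"] -= 1
        raise TransientError("rate limited")
    return "ok"

result = with_retries(flaky_call, sleep=lambda d: None)  # "ok" after two retries
```

Non-transient errors propagate immediately, which is what lets the surrounding pipeline skip the failed step and continue.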

Performance

Typical latencies:
  • Embedding request: 200-500ms
  • LLM analysis: 2-5 seconds
  • Batch embedding: 500ms-2s
The bottleneck is usually the external API call rather than local processing.
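Because the API call dominates latency, issuing embedding requests concurrently (within rate limits) shrinks bulk-indexing wall-clock time roughly in proportion to the concurrency. A sketch with asyncio, using a simulated request in place of a real API call:

```python
import asyncio

async def embed(text: str) -> list[float]:
    """Stand-in for one embedding API call (~0.2-0.5 s in practice)."""
    await asyncio.sleep(0.01)  # simulated network latency
    return [float(len(text))]  # dummy vector

async def embed_batch(texts: list[str], concurrency: int = 5) -> list[list[float]]:
    """Embed texts concurrently, capped by a semaphore to respect quotas."""
    sem = asyncio.Semaphore(concurrency)
    async def bounded(text: str) -> list[float]:
        async with sem:
            return await embed(text)
    return await asyncio.gather(*(bounded(t) for t in texts))

vectors = asyncio.run(embed_batch(["bug report", "feature request", "question"]))
```

The semaphore keeps concurrency from blowing through the per-minute quota; `gather` preserves input order, so results line up with the source issues.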

Migration from v0.1.0

If upgrading from v0.1.0, update your simili.yaml:
# Before (v0.1.0)
embedding:
  model: "text-embedding-004"
  dimensions: 768

# After (v0.2.0)
embedding:
  model: "gemini-embedding-001"
  dimensions: 3072
Then re-index your collection:
simili index --repo owner/repo --since 2020-01-01
The Qdrant collection dimension cannot be changed in-place. Delete and recreate the collection, or create a new one with a different name.
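The delete-and-recreate flow can be sketched as a pre-flight check: compare the stored vector size with the configured one and rebuild only on mismatch. The `FakeQdrant` client below is an in-memory stand-in; with the real qdrant-client these steps map onto fetching collection info, `delete_collection`, and `create_collection`:

```python
def migrate_collection(client, name: str, new_dim: int) -> str:
    """Recreate a collection when its vector size no longer matches the model."""
    if client.get_dim(name) == new_dim:
        return "up-to-date"
    client.delete(name)           # dimension cannot be changed in place
    client.create(name, new_dim)  # fresh collection; re-index afterwards
    return "recreated"

class FakeQdrant:
    """In-memory stand-in for a Qdrant client (illustration only)."""
    def __init__(self):
        self.collections = {"simili": 768}  # name -> vector size
    def get_dim(self, name): return self.collections[name]
    def delete(self, name): del self.collections[name]
    def create(self, name, dim): self.collections[name] = dim

client = FakeQdrant()
status = migrate_collection(client, "simili", 3072)  # "recreated"
```

After recreating, the `simili index` command above repopulates the collection with 3072-dimensional vectors.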

Next steps

Gemini configuration

Setup Gemini for Simili Bot

OpenAI integration

Use OpenAI as an alternative