Batch Command

Process multiple GitHub issues at once and export the results as JSON or CSV. Batch mode always runs as a dry run: no changes are posted to GitHub.

Syntax

simili batch [OPTIONS]

Options

| Option | Short | Type | Description | Default |
|---|---|---|---|---|
| --file | -f | file | JSON file containing array of issues | Required |
| --out-file | -o | file | Output file path (stdout if omitted) | - |
| --format | | string | Output format: json or csv | json |
| --workers | -w | number | Concurrent workers | 1 |
| --workflow | | string | Workflow preset | issue-triage |
| --collection | | string | Override Qdrant collection name | - |
| --threshold | | number | Override similarity threshold | - |
| --duplicate-threshold | | number | Override duplicate confidence threshold | - |
| --top-k | | number | Override max similar issues to show | - |
| --config | -c | file | Path to configuration file | .github/simili.yaml |
| --help | -h | bool | Show help message | - |

Batch mode always runs in dry-run. No comments, labels, or transfers are posted to GitHub.

Input format

Provide a JSON file containing an array of issue objects:
[
  {
    "org": "my-org",
    "repo": "backend",
    "number": 123,
    "title": "Login times out after 30 seconds",
    "body": "Users are getting logged out unexpectedly...",
    "state": "open",
    "labels": ["bug"],
    "author": "johndoe",
    "created_at": "2026-02-10T10:00:00Z"
  },
  {
    "org": "my-org",
    "repo": "frontend",
    "number": 456,
    "title": "Button click does nothing on Safari",
    "body": "The submit button has no effect in Safari 17...",
    "state": "open",
    "labels": [],
    "author": "janedoe",
    "created_at": "2026-02-11T14:30:00Z"
  }
]
Required fields: org, repo, number, title
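
Because a missing required field is easy to overlook in a large export, it can help to check the input file before running the batch. The following is a minimal sketch of such a check, not part of simili itself: the helper names `load_issues` and `find_invalid_issues` are hypothetical, and the only facts taken from this page are the array-of-objects shape and the four required field names.

```python
import json

# Required field names as listed in the documentation above.
REQUIRED_FIELDS = ("org", "repo", "number", "title")


def load_issues(path):
    """Load the input file and confirm it holds a JSON array (hypothetical helper)."""
    with open(path, encoding="utf-8") as handle:
        issues = json.load(handle)
    if not isinstance(issues, list):
        raise ValueError("expected a JSON array of issue objects")
    return issues


def find_invalid_issues(issues):
    """Return (index, missing_fields) pairs for issues lacking required fields."""
    problems = []
    for index, issue in enumerate(issues):
        missing = [name for name in REQUIRED_FIELDS if name not in issue]
        if missing:
            problems.append((index, missing))
    return problems


# In-memory example: the second issue has no "number" field.
sample = [
    {"org": "my-org", "repo": "backend", "number": 123,
     "title": "Login times out after 30 seconds"},
    {"org": "my-org", "repo": "frontend",
     "title": "Button click does nothing on Safari"},
]
print(find_invalid_issues(sample))
```

For a real file, `find_invalid_issues(load_issues("issues.json"))` would report the positions of any incomplete entries before `simili batch --file issues.json` is run.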

Examples

Basic JSON output

simili batch --file issues.json

CSV export

simili batch --file issues.json --format csv --out-file results.csv

Parallel processing

simili batch --file issues.json --workers 5 --out-file results.json

Override thresholds

simili batch --file issues.json \
  --threshold 0.75 \
  --duplicate-threshold 0.90 \
  --top-k 3

Use a custom collection

simili batch --file issues.json --collection my-staging-collection

Output

JSON output

[
  {
    "issue": {
      "org": "my-org",
      "repo": "backend",
      "number": 123,
      "title": "Login times out after 30 seconds"
    },
    "similar": [
      { "number": 99, "title": "Session expires early", "score": 0.88 }
    ],
    "duplicate_detected": true,
    "duplicate_of": 99,
    "duplicate_confidence": 0.91,
    "quality_score": 0.72,
    "suggested_labels": ["bug", "auth"],
    "transfer_target": null
  }
]
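
As an illustration of consuming this output, the sketch below lists the issues flagged as duplicates. It assumes the result objects have exactly the shape shown above; the function name `list_duplicates` and the `min_confidence` parameter are hypothetical and only serve the example.

```python
import json


def list_duplicates(results, min_confidence=0.0):
    """Return (issue_ref, duplicate_of, confidence) for each flagged result."""
    found = []
    for result in results:
        if not result.get("duplicate_detected"):
            continue
        confidence = result.get("duplicate_confidence") or 0.0
        if confidence < min_confidence:
            continue
        issue = result["issue"]
        issue_ref = f'{issue["org"]}/{issue["repo"]}#{issue["number"]}'
        found.append((issue_ref, result.get("duplicate_of"), confidence))
    return found


# Sample data copied from the JSON output shown above.
sample_results = json.loads("""
[
  {
    "issue": {"org": "my-org", "repo": "backend", "number": 123,
              "title": "Login times out after 30 seconds"},
    "similar": [{"number": 99, "title": "Session expires early", "score": 0.88}],
    "duplicate_detected": true,
    "duplicate_of": 99,
    "duplicate_confidence": 0.91,
    "quality_score": 0.72,
    "suggested_labels": ["bug", "auth"],
    "transfer_target": null
  }
]
""")

for issue_ref, duplicate_of, confidence in list_duplicates(sample_results):
    print(f"{issue_ref} looks like a duplicate of #{duplicate_of} ({confidence:.2f})")
```

With a results file written via `--out-file results.json`, the same function could be applied to `json.load(open("results.json"))`.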

CSV output

The CSV format flattens each result into 20+ fields, including:

| Field | Description |
|---|---|
| org | Organization |
| repo | Repository |
| number | Issue number |
| title | Issue title |
| duplicate_detected | true/false |
| duplicate_of | Issue number of duplicate |
| duplicate_confidence | Confidence score (0.0-1.0) |
| quality_score | Quality score (0.0-1.0) |
| suggested_labels | Comma-separated labels |
| similar_1_number | Top similar issue number |
| similar_1_score | Top similar issue score |
| transfer_target | Target repo if routing suggested |
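
A CSV export can be filtered with Python's standard `csv` module. The sketch below is an assumption-laden example rather than a description of the real file: the sample header uses only a subset of the field names listed above, the column order is made up, and the helper name `duplicate_rows` is hypothetical. It relies on `duplicate_detected` being written as `true` or `false` and on `suggested_labels` being a comma-separated string, as the field list states.

```python
import csv
import io


def duplicate_rows(csv_text):
    """Return the rows whose duplicate_detected column reads 'true'."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row for row in reader
        if row.get("duplicate_detected", "").strip().lower() == "true"
    ]


# Hypothetical sample using a subset of the documented columns.
SAMPLE_CSV = (
    "org,repo,number,title,duplicate_detected,duplicate_of,"
    "duplicate_confidence,suggested_labels\n"
    'my-org,backend,123,Login times out after 30 seconds,true,99,0.91,"bug,auth"\n'
    "my-org,frontend,456,Button click does nothing on Safari,false,,,\n"
)

rows = duplicate_rows(SAMPLE_CSV)
for row in rows:
    # suggested_labels is a single comma-separated cell, so split it manually.
    labels = row["suggested_labels"].split(",") if row["suggested_labels"] else []
    print(row["number"], row["duplicate_of"], labels)
```

All values come back as strings, so numeric columns such as `duplicate_confidence` need an explicit `float(...)` conversion before comparison.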

Performance

  • Default 1 worker for safe rate-limit behaviour
  • Increase workers to speed up large batches
  • Each issue takes ~3-8 seconds depending on features enabled
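
The figures above allow a rough run-time estimate. The sketch below assumes that workers process issues independently and that throughput scales linearly with the worker count; in practice rate limits may prevent that, so treat the result as an optimistic bound. The function name `estimate_batch_seconds` is hypothetical.

```python
import math


def estimate_batch_seconds(issue_count, workers=1, per_issue_range=(3.0, 8.0)):
    """Estimate (low, high) wall-clock seconds for a batch.

    Assumes ideal linear scaling: each worker handles one issue at a time
    and every issue takes between 3 and 8 seconds, as stated above.
    """
    if issue_count < 0 or workers < 1:
        raise ValueError("issue_count must be >= 0 and workers >= 1")
    rounds = math.ceil(issue_count / workers)  # issues handled per worker
    low, high = per_issue_range
    return rounds * low, rounds * high


# 500 issues with the default single worker, then with --workers 5.
print(estimate_batch_seconds(500))
print(estimate_batch_seconds(500, workers=5))
```

Under these assumptions, 500 issues take roughly 25 to 67 minutes with one worker and roughly 5 to 13 minutes with five.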

Next steps