Feat/phase 13 nemotron retrieval by DanielDeshmukh · Pull Request #14 · DanielDeshmukh/Hector

DanielDeshmukh · 2026-06-28T13:45:20Z

No description provided.

…motron support

Create core/embedding_provider.py with a unified interface for embedding generation, supporting two backends:

LocalEmbedder (default, 384-dim):
- Uses sentence-transformers all-MiniLM-L6-v2 via ChromaDB
- No API key required, runs offline after model download
- HF_HUB_OFFLINE=1 for offline operation

NemotronEmbedder (2048-dim, requires NVIDIA_API_KEY):
- Uses NVIDIA Nemotron embed API (nemotron-embed-4b-v1)
- Better semantic understanding for legal text
- API health check before use
- Graceful fallback to local on failure

NemotronChromaEmbedFn:
- Adapter wrapping NemotronEmbedder for ChromaDB's embedding API
- Enables drop-in replacement of SentenceTransformerEmbeddingFunction

Factory function:
- get_embedding_provider() reads HECTOR_EMBEDDING_PROVIDER env var
- Falls back to local if Nemotron unavailable
- Supports: 'local' | 'nemotron'

Configuration:
- HECTOR_EMBEDDING_PROVIDER: 'local' | 'nemotron'
- HECTOR_NEMOTRON_EMBED_MODEL: model ID
- HECTOR_NEMOTRON_API_KEY: NVIDIA API key
- HECTOR_EMBEDDING_DIM: 384 (local) | 2048 (Nemotron)
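The factory behaviour described in this commit can be sketched roughly as follows. This is an illustration, not the code in core/embedding_provider.py: the class bodies are stubs, `is_available()` is a hypothetical stand-in for the API health check, and reading the key from either HECTOR_NEMOTRON_API_KEY or NVIDIA_API_KEY is an assumption, since the commit message mentions both names.

```python
import os
from dataclasses import dataclass


@dataclass
class LocalEmbedder:
    # Defaults taken from the commit message.
    model: str = "all-MiniLM-L6-v2"
    dim: int = 384


@dataclass
class NemotronEmbedder:
    # Defaults taken from the commit message.
    model: str = "nemotron-embed-4b-v1"
    dim: int = 2048
    api_key: str = ""

    def is_available(self) -> bool:
        # Hypothetical health check. A real one would call the API;
        # this stub only verifies that a key is present.
        return bool(self.api_key)


def get_embedding_provider(env=None):
    """Return the configured embedder, falling back to LocalEmbedder."""
    env = os.environ if env is None else env
    choice = env.get("HECTOR_EMBEDDING_PROVIDER", "local").strip().lower()
    if choice == "nemotron":
        # Assumption: accept either key name mentioned in the commit message.
        key = env.get("HECTOR_NEMOTRON_API_KEY") or env.get("NVIDIA_API_KEY", "")
        candidate = NemotronEmbedder(
            model=env.get("HECTOR_NEMOTRON_EMBED_MODEL", "nemotron-embed-4b-v1"),
            api_key=key,
        )
        if candidate.is_available():
            return candidate
    # Default path, and the fallback when Nemotron is unavailable.
    return LocalEmbedder()
```

Passing `env` explicitly keeps the sketch easy to exercise without mutating the process environment; calling `get_embedding_provider()` with no argument reads `os.environ`.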

…ron support

Create core/rerank_provider.py with a unified interface for document reranking, supporting two backends:

LocalReranker (default):
- Uses cross-encoder/ms-marco-MiniLM-L-6-v2 via sentence-transformers
- Sigmoid normalization of raw scores to 0-1 range
- No API key required, runs offline after model download

NemotronReranker (requires NVIDIA_API_KEY):
- Uses NVIDIA Nemotron rerank API (nemotron-rerank-v1)
- Better semantic understanding for legal text ranking
- API health check before use
- Handles both rankings[] and scores[] response formats
- Graceful fallback to local on failure

Both providers:
- Accept (query, documents) pairs
- Add 'reranker_score' to each document dict
- Sort by score descending
- Append reason string ('cross-encoder-reranked' or 'nemotron-reranked')

Factory function:
- get_rerank_provider() reads HECTOR_RERANK_PROVIDER env var
- Falls back to local if Nemotron unavailable
- Supports: 'local' | 'nemotron'

Configuration:
- HECTOR_RERANK_PROVIDER: 'local' | 'nemotron'
- HECTOR_NEMOTRON_RERANK_MODEL: model ID
- HECTOR_NEMOTRON_API_KEY: NVIDIA API key
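The shared contract ("add 'reranker_score', sort descending, append a reason string") together with sigmoid normalization might look like the sketch below. The scoring model is replaced by an injected `score_fn`, and the document keys `"text"` and `"reasons"` are assumptions made for illustration; the actual dict layout in core/rerank_provider.py is not shown in this PR page.

```python
import math


def sigmoid(x: float) -> float:
    """Map a raw score to (0, 1) without overflowing for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def rerank(query, documents, score_fn, reason="cross-encoder-reranked"):
    """Score each document against the query and return them best-first.

    score_fn(query, text) is a hypothetical stand-in for the cross-encoder
    (or the remote rerank API) and returns an unbounded raw score.
    """
    if not documents:
        return []
    ranked = []
    for doc in documents:
        updated = dict(doc)  # copy so the caller's dicts stay untouched
        updated["reranker_score"] = sigmoid(score_fn(query, doc["text"]))
        updated["reasons"] = list(doc.get("reasons", [])) + [reason]
        ranked.append(updated)
    ranked.sort(key=lambda d: d["reranker_score"], reverse=True)
    return ranked
```

Copying each document rather than mutating it is a design choice of this sketch only; the commit message does not say whether the real providers modify documents in place.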

…ver and ingestor

Modify data/hybrid_retriever.py:
- Import core.embedding_provider and core.rerank_provider
- _get_embedding_function(): check HECTOR_EMBEDDING_PROVIDER env var
  * If 'nemotron', use NemotronEmbedder via provider abstraction
  * Fall back to local SentenceTransformerEmbeddingFunction
- _rerank_with_cross_encoder(): check HECTOR_RERANK_PROVIDER env var
  * If 'nemotron', use NemotronReranker via provider abstraction
  * Fall back to local CrossEncoder

Modify utils/enhanced_ingestor.py:
- __init__(): try core.embedding_provider.get_embedding_provider()
  * Respects HECTOR_EMBEDDING_PROVIDER env var
  * Falls back to local SentenceTransformerEmbeddingFunction on error

Both components now switch between local and Nemotron backends based on environment variables, with automatic fallback to local models when the Nemotron API is unavailable.
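The "try the provider, fall back to local on error" pattern used by both the retriever and the ingestor could be factored as in this sketch. `select_backend`, `build_remote`, and `build_local` are hypothetical names introduced here for illustration; the PR itself places this logic inside `_get_embedding_function()`, `_rerank_with_cross_encoder()`, and the ingestor's `__init__()`.

```python
import logging
import os

logger = logging.getLogger(__name__)


def select_backend(env_var, build_remote, build_local, env=None):
    """Build the Nemotron backend if requested, else the local one.

    Any exception raised while building the remote backend is logged
    and the local backend is returned instead.
    """
    env = os.environ if env is None else env
    if env.get(env_var, "local").strip().lower() == "nemotron":
        try:
            return build_remote()
        except Exception as exc:  # broad on purpose: any failure means fallback
            logger.warning("Nemotron backend unavailable (%s); using local", exc)
    return build_local()
```

A caller would pass zero-argument callables, for example one that constructs the Nemotron adapter and one that constructs the local sentence-transformers function, so that neither is created unless it is actually needed.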

Tests for embedding and rerank provider abstractions validating:

Embedding provider:
- LocalEmbedder defaults to all-MiniLM-L6-v2 with 384d dimension
- NemotronEmbedder defaults to nemotron-embed-4b-v1 with 2048d
- Factory returns LocalEmbedder by default and on fallback
- Factory returns NemotronEmbedder when NVIDIA_API_KEY is set
- Factory falls back to local when Nemotron API is unreachable
- NemotronChromaEmbedFn adapter wraps NemotronEmbedder correctly

Rerank provider:
- LocalReranker defaults to ms-marco-MiniLM-L-6-v2
- Sigmoid normalization produces bounded [0,1] values
- Empty document list returns empty result for both providers
- Factory returns LocalReranker by default and on fallback
- Factory returns NemotronReranker when NVIDIA_API_KEY is set
- Factory falls back to local when Nemotron API is unreachable

Integration:
- get_embedding_provider and get_rerank_provider are callable
- hybrid_retriever and enhanced_ingestor modules load correctly

All tests use mocked API calls and explicit env var control to avoid interference from concurrent test execution (933 tests total).
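"Explicit env var control" can be achieved with `unittest.mock.patch.dict`, which restores `os.environ` when the block exits. The sketch below shows the idea against a stand-in function; `resolve_provider_name` is hypothetical and is not part of tests/test_providers.py.

```python
import os
import unittest
from unittest import mock


def resolve_provider_name() -> str:
    # Hypothetical stand-in for the env lookup done by the real factories.
    return os.environ.get("HECTOR_EMBEDDING_PROVIDER", "local").strip().lower()


class ProviderEnvTest(unittest.TestCase):
    def test_defaults_to_local(self):
        # clear=True removes any value inherited from the outer environment.
        with mock.patch.dict(os.environ, {}, clear=True):
            self.assertEqual(resolve_provider_name(), "local")

    def test_nemotron_selected(self):
        with mock.patch.dict(os.environ, {"HECTOR_EMBEDDING_PROVIDER": "nemotron"}):
            self.assertEqual(resolve_provider_name(), "nemotron")

    def test_environment_is_restored(self):
        before = os.environ.get("HECTOR_EMBEDDING_PROVIDER")
        with mock.patch.dict(os.environ, {"HECTOR_EMBEDDING_PROVIDER": "nemotron"}):
            pass
        self.assertEqual(os.environ.get("HECTOR_EMBEDDING_PROVIDER"), before)
```

Saved in a test module, this runs with `python -m unittest`. Because each test sets or clears the variable itself, the outcome does not depend on what other tests left behind.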

…afety

Problem:
Switching between local (384d) and Nemotron (2048d) embeddings on the same ChromaDB collection causes dimension mismatch errors. ChromaDB collections are bound to a specific embedding dimension at creation time.

Solution:
Both the ingestor and hybrid retriever now append the provider name to the collection name when a non-local provider is configured:

  local → indian_law_bns
  nemotron → indian_law_bns_nemotron

This ensures local and Nemotron embeddings never share a collection, preventing dimension conflicts while remaining backward-compatible with existing 384d databases.

Modified files:
- utils/enhanced_ingestor.py: collection name includes provider suffix
- data/hybrid_retriever.py: collection name includes provider suffix

Env vars:
- HECTOR_EMBEDDING_PROVIDER: 'local' (default) or 'nemotron'
- HECTOR_EMBEDDING_DIM: reserved for future explicit dimension control
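The naming rule above is small enough to sketch in a few lines. The function name `collection_name_for` and the constant `BASE_COLLECTION` are assumptions made for illustration; only the two resulting collection names come from the commit message.

```python
BASE_COLLECTION = "indian_law_bns"


def collection_name_for(provider, base=BASE_COLLECTION):
    """Return the collection name for a given embedding provider.

    The local provider keeps the bare name, so existing 384d databases
    keep working. Any other provider gets its name appended as a suffix.
    """
    provider = (provider or "local").strip().lower()
    return base if provider == "local" else f"{base}_{provider}"
```

For example, `collection_name_for("nemotron")` yields `indian_law_bns_nemotron`, while `collection_name_for("local")` and `collection_name_for(None)` both yield `indian_law_bns`.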

coderabbitai · 2026-06-28T13:45:28Z

Warning

Review limit reached

@DanielDeshmukh, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 32 minutes and 6 seconds. Learn how PR review limits work.

📥 Commits

Reviewing files that changed from the base of the PR and between 87c0ed1 and 64c4c09.

📒 Files selected for processing (5)

core/embedding_provider.py
core/rerank_provider.py
data/hybrid_retriever.py
tests/test_providers.py
utils/enhanced_ingestor.py

DanielDeshmukh added 5 commits June 28, 2026 18:53

DanielDeshmukh merged commit d4ec3c8 into main Jun 28, 2026
3 of 4 checks passed

DanielDeshmukh merged 5 commits into main from feat/phase-13-nemotron-retrieval
