MCP Bug Fixes #1

Draft
apetti1920 wants to merge 23 commits into ap_mcp_neptune_support from ap_mcp_bug_fixes

@apetti1920
Collaborator

No description provided.

@apetti1920 changed the base branch from main to ap_mcp_neptune_support on December 14, 2025 00:23
@wiz-inc-faae60d47d

wiz-inc-faae60d47d bot commented Dec 15, 2025

Wiz Scan Summary

| Scanner | Findings |
| --- | --- |
| Vulnerabilities | - |
| Sensitive Data | - |
| Secrets | - |
| IaC Misconfigurations | - |
| SAST Findings | - |
| Software Management Findings | - |
| Total | - |


e.valid_at = $edge_data.valid_at,
e.invalid_at = $edge_data.invalid_at,
e.fact_embedding = join([x IN coalesce($edge_data.fact_embedding, []) | toString(x) ], ","),
e.episodes = join($edge_data.episodes, ",")

Neptune queries silently drop custom attributes during save

High Severity

The new Neptune queries use explicit property lists instead of the previous SET e = removeKeyFromMap(...) pattern, but this omits custom attributes. Both EntityEdge and EntityNode have an attributes: dict[str, Any] field. The save() methods spread these attributes into the data dictionary via edge_data.update(self.attributes or {}). The old approach copied all properties including these spread attributes. The new explicit lists only include built-in fields, causing any custom attributes to be silently dropped on save. The same issue affects both edge queries and node queries for Neptune.

continue
elif isinstance(ts_value, int | float):
# Unix timestamp
return datetime.fromtimestamp(ts_value, tz=timezone.utc)

Missing OSError handling for invalid Unix timestamps

Low Severity

The _extract_reference_time_from_json method handles string timestamp parsing failures with a try/except that continues to the next field, but the numeric timestamp branch at lines 127-129 has no exception handling. datetime.fromtimestamp() raises OSError for out-of-range timestamps (e.g., very large numbers like millisecond timestamps, or negative values on some platforms). The outer except only catches json.JSONDecodeError, TypeError, and ValueError, so OSError propagates up and causes the episode to fail instead of gracefully falling back to the current time.

@apetti1920 force-pushed the ap_mcp_neptune_support branch from 8df1283 to fb49a25 on February 3, 2026 14:35
# Conflicts:
#	mcp_server/pyproject.toml
#	mcp_server/uv.lock
#	pyproject.toml
#	uv.lock

diff --git c/mcp_server/pyproject.toml i/mcp_server/pyproject.toml
index 886b6f9..499d72a 100644
--- c/mcp_server/pyproject.toml
+++ i/mcp_server/pyproject.toml
@@ -7,7 +7,7 @@ requires-python = ">=3.10,<4"
 dependencies = [
     "mcp>=1.9.4",
     "openai>=1.91.0",
-    "graphiti-core[falkordb]==0.26.3",
+    "graphiti-core[falkordb,neptune]==0.26.3",
     "pydantic-settings>=2.0.0",
     "pyyaml>=6.0",
     "typing-extensions>=4.0.0",
diff --git c/mcp_server/uv.lock i/mcp_server/uv.lock
index 87c8651..0a689a0 100644
--- c/mcp_server/uv.lock
+++ i/mcp_server/uv.lock
@@ -718,7 +718,7 @@ wheels = [
 [[package]]
 name = "graphiti-core"
 version = "0.26.3"
-source = { registry = "https://pypi.org/simple" }
+source = { editable = "../" }
 dependencies = [
     { name = "diskcache" },
     { name = "neo4j" },
This reverts commit b9d7dcf.
diff --git c/graphiti_core/driver/neptune_driver.py i/graphiti_core/driver/neptune_driver.py
index 0a57ab3..9a02037 100644
--- c/graphiti_core/driver/neptune_driver.py
+++ i/graphiti_core/driver/neptune_driver.py
@@ -196,6 +196,7 @@ class NeptuneDriver(GraphDriver):
             timeout=30,  # 30 second timeout to prevent hanging connections
             max_retries=5,  # Enable retry logic with exponential backoff
             retry_on_timeout=True,  # Retry when timeout occurs
+            retry_on_status=[502, 503, 504],  # Retry on gateway errors and service unavailable
         )

     def _sanitize_parameters(self, query, params: dict):
diff --git c/graphiti_core/prompts/dedupe_edges.py i/graphiti_core/prompts/dedupe_edges.py
index c6e4359..8687712 100644
--- c/graphiti_core/prompts/dedupe_edges.py
+++ i/graphiti_core/prompts/dedupe_edges.py
@@ -41,6 +41,12 @@ class Versions(TypedDict):

 def resolve_edge(context: dict[str, Any]) -> list[Message]:
+    existing_facts_count = len(context.get('existing_edges', []))
+    invalidation_candidates_count = len(context.get('edge_invalidation_candidates', []))
+
+    existing_range = f'0 to {existing_facts_count - 1}' if existing_facts_count > 0 else 'none (empty list)'
+    invalidation_range = f'0 to {invalidation_candidates_count - 1}' if invalidation_candidates_count > 0 else 'none (empty list)'
+
     return [
         Message(
             role='system',
@@ -50,16 +56,25 @@ def resolve_edge(context: dict[str, Any]) -> list[Message]:
         Message(
             role='user',
             content=f"""
-        Task:
-        You will receive TWO separate lists of facts. Each list uses 'idx' as its index field, starting from 0.
-
+        You will analyze a NEW FACT against two separate lists of existing facts.
+
+        ═══════════════════════════════════════════════════════════════
+        LIST A: EXISTING FACTS (for duplicate detection)
+        ═══════════════════════════════════════════════════════════════
+        Count: {existing_facts_count} facts
+        Valid idx range: {existing_range}
+
         1. DUPLICATE DETECTION:
            - If the NEW FACT represents identical factual information as any fact in EXISTING FACTS, return those idx values in duplicate_facts.
            - Facts with similar information that contain key differences should NOT be marked as duplicates.
            - Return idx values from EXISTING FACTS.
            - If no duplicates, return an empty list for duplicate_facts.

-        2. CONTRADICTION DETECTION:
+        2. FACT TYPE CLASSIFICATION:
+           - Given the predefined FACT TYPES, determine if the NEW FACT should be classified as one of these types.
+           - Return the fact type as fact_type or DEFAULT if NEW FACT is not one of the FACT TYPES.
+
+        3. CONTRADICTION DETECTION:
            - Based on FACT INVALIDATION CANDIDATES and NEW FACT, determine which facts the new fact contradicts.
            - Return idx values from FACT INVALIDATION CANDIDATES.
            - If no contradictions, return an empty list for contradicted_facts.
@@ -73,17 +88,63 @@ def resolve_edge(context: dict[str, Any]) -> list[Message]:
         1. Some facts may be very similar but will have key differences, particularly around numeric values in the facts.
             Do not mark these facts as duplicates.

+        <FACT TYPES>
+        {context['edge_types']}
+        </FACT TYPES>
+
         <EXISTING FACTS>
         {context['existing_edges']}
-        </EXISTING FACTS>

-        <FACT INVALIDATION CANDIDATES>
+        ═══════════════════════════════════════════════════════════════
+        LIST B: FACT INVALIDATION CANDIDATES (for contradiction detection)
+        ═══════════════════════════════════════════════════════════════
+        Count: {invalidation_candidates_count} facts
+        Valid idx range: {invalidation_range}
+
         {context['edge_invalidation_candidates']}
-        </FACT INVALIDATION CANDIDATES>

-        <NEW FACT>
+        ═══════════════════════════════════════════════════════════════
+        NEW FACT TO ANALYZE
+        ═══════════════════════════════════════════════════════════════
         {context['new_edge']}
-        </NEW FACT>
+
+        ═══════════════════════════════════════════════════════════════
+        FACT TYPES FOR CLASSIFICATION
+        ═══════════════════════════════════════════════════════════════
+        {context['edge_types']}
+
+        ═══════════════════════════════════════════════════════════════
+        YOUR RESPONSE MUST INCLUDE THREE FIELDS
+        ═══════════════════════════════════════════════════════════════
+
+        1. duplicate_facts (list of integers)
+           SOURCE: Use idx values ONLY from LIST A (EXISTING FACTS)
+           VALID RANGE: {existing_range}
+           PURPOSE: Identify which facts in LIST A are duplicates of the NEW FACT
+           CRITERIA: Facts must represent identical factual information (minor wording differences OK)
+           NOTE: Facts with key differences (especially numeric values) are NOT duplicates
+           IF NO DUPLICATES: Return empty list []
+
+        2. contradicted_facts (list of integers)
+           SOURCE: Use idx values ONLY from LIST B (FACT INVALIDATION CANDIDATES)
+           VALID RANGE: {invalidation_range}
+           PURPOSE: Identify which facts in LIST B are contradicted by the NEW FACT
+           CRITERIA: Facts that are logically incompatible with the NEW FACT
+           IF NO CONTRADICTIONS: Return empty list []
+
+        3. fact_type (string)
+           SOURCE: Choose from FACT TYPES listed above
+           PURPOSE: Classify the NEW FACT's type
+           DEFAULT: Return 'DEFAULT' if NEW FACT doesn't match any predefined FACT TYPES
+
+        ═══════════════════════════════════════════════════════════════
+        CRITICAL WARNINGS
+        ═══════════════════════════════════════════════════════════════
+        - LIST A and LIST B are COMPLETELY SEPARATE with INDEPENDENT indexing
+        - Do NOT use idx values from LIST B in duplicate_facts field
+        - Do NOT use idx values from LIST A in contradicted_facts field
+        - Each list starts indexing from 0 independently
+        - Verify your idx values are within the valid ranges specified above
         """,
         ),
     ]
diff --git c/graphiti_core/utils/maintenance/edge_operations.py i/graphiti_core/utils/maintenance/edge_operations.py
index 9fc356a..3fb19c7 100644
--- c/graphiti_core/utils/maintenance/edge_operations.py
+++ i/graphiti_core/utils/maintenance/edge_operations.py
@@ -444,8 +444,7 @@ async def resolve_extracted_edges(
     resolved_edges: list[EntityEdge] = []
     invalidated_edges: list[EntityEdge] = []
     for result in results:
-        resolved_edge = result[0]
-        invalidated_edge_chunk = result[1]
+        resolved_edge, invalidated_edge_chunk, _ = result  # Third value (duplicate_edges) not needed here

         resolved_edges.append(resolved_edge)
         invalidated_edges.extend(invalidated_edge_chunk)
diff --git c/mcp_server/src/graphiti_mcp_server.py i/mcp_server/src/graphiti_mcp_server.py
index 2b9a855..b264c40 100644
--- c/mcp_server/src/graphiti_mcp_server.py
+++ i/mcp_server/src/graphiti_mcp_server.py
@@ -92,6 +92,12 @@ logging.getLogger('uvicorn.access').setLevel(logging.WARNING)  # Reduce access l
 logging.getLogger('mcp.server.streamable_http_manager').setLevel(
     logging.WARNING
 )  # Reduce MCP noise
+logging.getLogger('opensearch').setLevel(
+    logging.ERROR
+)  # Only log actual errors, not transient warnings
+logging.getLogger('urllib3.connectionpool').setLevel(
+    logging.ERROR
+)  # Suppress retry warnings

 # Patch uvicorn's logging config to use our format

<FACT TYPES>
{context['edge_types']}
</FACT TYPES>

Missing edge_types in context causes KeyError

High Severity

The prompt template accesses context['edge_types'] at two locations, but the context dictionary built in resolve_extracted_edge function (in edge_operations.py lines 555-559) only includes existing_edges, new_edge, and edge_invalidation_candidates. When this prompt is called, it will raise a KeyError for the missing edge_types key, causing the edge deduplication to fail.
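The `KeyError` can be avoided either by populating `edge_types` in the caller or by reading the key defensively in the prompt. The sketch below illustrates the second option only; `render_fact_types` and the fallback text `none provided` are assumptions for illustration, not part of the PR.

```python
from typing import Any


def render_fact_types(context: dict[str, Any]) -> str:
    """Render the FACT TYPES block without assuming the key is present."""
    # .get() avoids the KeyError that context['edge_types'] raises when the
    # caller never populated the key; the fallback text is an assumption.
    edge_types = context.get("edge_types") or "none provided"
    return f"<FACT TYPES>\n{edge_types}\n</FACT TYPES>"


# Context shaped like the one described in the comment: no 'edge_types' key.
context = {"existing_edges": [], "new_edge": {}, "edge_invalidation_candidates": []}
print(render_fact_types(context))
```

A defensive read hides the omission rather than fixing it, so passing `edge_types` from the caller is probably still needed if classification is meant to work.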

-        2. CONTRADICTION DETECTION:
+        2. FACT TYPE CLASSIFICATION:
+           - Given the predefined FACT TYPES, determine if the NEW FACT should be classified as one of these types.
+           - Return the fact type as fact_type or DEFAULT if NEW FACT is not one of the FACT TYPES.

Response model missing required fact_type field

Medium Severity

The prompt instructs the LLM to return a fact_type field (documented at lines 73-75 and 135-138), but the EdgeDuplicate response model only defines duplicate_facts and contradicted_facts fields. The LLM-generated fact_type value will either be silently discarded or cause validation errors, making the fact type classification feature non-functional.

n.group_id = $entity_data.group_id,
n.created_at = $entity_data.created_at,
n.summary = $entity_data.summary,
n.name_embedding = join([x IN coalesce($entity_data.name_embedding, []) | toString(x) ], ",")

Neptune entity nodes lose custom attributes on save

High Severity

The Neptune query now explicitly sets only name, group_id, created_at, summary, and name_embedding. The previous query used removeKeyFromMap to set all properties from $entity_data, which included custom attributes merged via entity_data.update(self.attributes). Custom entity attributes are now silently dropped on save.

+ group_filter_query
+ """
RETURN DISTINCT id(n) as id, n.name_embedding as embedding
LIMIT $batch_limit

Neptune query uses n but filter references c

High Severity

The new Neptune-specific query in community_similarity_search uses MATCH (n:Community) but the filter at line 1090 references c.group_id. When group_ids is provided, the query will fail with an undefined variable error because c is never defined in the Neptune code path.

- Do NOT use idx values from LIST B in duplicate_facts field
- Do NOT use idx values from LIST A in contradicted_facts field
- Each list starts indexing from 0 independently
- Verify your idx values are within the valid ranges specified above

Prompt contradicts data indexing and consuming code

High Severity

The new prompt has contradictory indexing instructions. The data passed uses continuous idx numbering (invalidation candidates start where existing facts end), and the consuming code in edge_operations.py expects this. However, the "CRITICAL WARNINGS" section says lists have "INDEPENDENT indexing" starting from 0, and contradicted_facts must only use LIST B indices. The invalidation_range (line 48) also computes a 0-based range that doesn't match the actual offset-based idx values in the data. This will cause the LLM to return incorrect idx values, leading to wrong edges being invalidated.
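If the data really does use continuous numbering, as the comment above states, the ranges advertised in the prompt would need an offset for the second list. A minimal sketch under that assumption (`idx_ranges` is a hypothetical helper, not part of the PR, and the numbering scheme has not been verified against the code that builds the context):

```python
def idx_ranges(existing_count: int, candidate_count: int) -> tuple[str, str]:
    """Describe valid idx ranges when candidates continue where existing facts end."""

    def describe(start: int, count: int) -> str:
        return f"{start} to {start + count - 1}" if count > 0 else "none (empty list)"

    # Existing facts start at 0; invalidation candidates start at existing_count.
    return describe(0, existing_count), describe(existing_count, candidate_count)


existing_range, invalidation_range = idx_ranges(3, 2)
print(existing_range)       # 0 to 2
print(invalidation_range)   # 3 to 4
```

The alternative is to renumber each list from 0 in the data and update the consuming code to match. Either way, the prompt text, the data, and the consumer have to agree on one scheme.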

n.group_id = $entity_data.group_id,
n.created_at = $entity_data.created_at,
n.summary = $entity_data.summary,
n.name_embedding = join([x IN coalesce($entity_data.name_embedding, []) | toString(x) ], ",")

Neptune save queries silently drop entity/edge attributes

High Severity

The Neptune save queries were changed from SET n/e = removeKeyFromMap(...) (which wrote all properties from the data dict, including dynamically flattened attributes) to explicit field-by-field SET statements. The explicit statements omit attributes entirely. Since the save methods in nodes.py and edges.py call entity_data.update(self.attributes or {}) to merge attributes as top-level keys, the old approach saved them as graph properties. Now those attribute properties are silently dropped, causing data loss for any Neptune users with custom entity or edge types that define additional attributes.

@cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 2 potential issues.

api_key=api_key,
base_url=base_url,
default_headers=custom_headers if custom_headers else None,
)

Duplicated custom headers builder logic across factories

Low Severity

The custom headers building logic (reading from config extra_headers, OPENAI_EXTRA_HEADERS env var, and X_SESSION_ID env var, plus creating an AsyncOpenAI client) is duplicated nearly identically between LLMClientFactory.create and EmbedderFactory.create. This increases maintenance burden and risks inconsistent bug fixes if one copy is updated but not the other.
