feat: Excel search improvements (multi-keyword + row context) by k-ibaraki · Pull Request #59 · ncdcdev/sharepoint-docs-mcp

k-ibaraki · 2026-02-10T14:53:19Z

Summary

✅ Multi-keyword OR search with comma-separated keywords
✅ include_surrounding_cells parameter for getting row context in one API call
✅ Comprehensive test coverage and documentation

Changes

Add include_surrounding_cells: bool = False parameter to search
Implement comma-separated multi-keyword OR search
Update tool description for better LLM usability
Add English and Japanese documentation
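
The comma-separated OR matching listed above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `parse_keywords` and `matches_any` are hypothetical helper names, and case-sensitive substring matching is an assumption.

```python
def parse_keywords(query: str) -> list[str]:
    """Split a comma-separated query into trimmed, non-empty keywords."""
    return [kw.strip() for kw in query.split(",") if kw.strip()]


def matches_any(cell_text: str, keywords: list[str]) -> bool:
    """OR logic: a cell matches if it contains at least one keyword."""
    # Assumption: plain case-sensitive substring matching.
    return any(kw in cell_text for kw in keywords)


keywords = parse_keywords("budget, forecast")
cells = ["Q1 budget", "sales forecast", "headcount"]
hits = [text for text in cells if matches_any(text, keywords)]
print(hits)  # ['Q1 budget', 'sales forecast']
```

A query with no usable keyword (for example `","`) parses to an empty list, so `any()` is false for every cell and the search quietly returns nothing.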

Test Plan

All existing tests pass (backward compatibility)
New test methods for include_surrounding_cells feature
New test methods for multi-keyword search
Quality checks pass (ruff, ty)

Performance Impact

API calls: N+1 → 1 (96% reduction with include_surrounding_cells=True)
Default behavior unchanged (backward compatible)

Closes #55
Closes #56

🤖 Generated with Claude Code

- Add include_surrounding_cells parameter to get row context in search
  - Reduces API calls from N+1 to 1 when row data needed
  - Default False for backward compatibility
- Add comma-separated multi-keyword OR search
  - Example: query='budget,forecast'
  - Backward compatible with single keyword
- Update tool description for LLM usability
- Add comprehensive test coverage
- Fix dictionary iteration issue in _scan_sheet

Resolves #55, #56

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

- Document multi-keyword OR search usage
- Document include_surrounding_cells parameter
- Add search best practices and examples
- Update both English and Japanese documentation

Related to #55, #56

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

gemini-code-assist · 2026-02-10T14:53:52Z

Summary of Changes

Hello @k-ibaraki, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the Excel search capabilities by introducing multi-keyword OR search functionality and an option to retrieve full row context for matching cells. These improvements aim to make the sharepoint_excel tool more powerful and efficient, particularly for LLM interactions, by reducing the number of API calls needed to gather comprehensive data around search results.

Highlights

Multi-keyword OR Search: Implemented the ability to search for multiple keywords using a comma-separated string, applying OR logic.
Row Context Retrieval: Added a new include_surrounding_cells parameter to the search function, allowing users to retrieve the entire row data for each matching cell in a single API call.
API Efficiency Improvement: The include_surrounding_cells feature significantly reduces API calls from N+1 to 1 when row context is required, improving performance.
Enhanced Documentation: Updated both English and Japanese documentation, including READMEs and usage guides, to reflect the new search capabilities and parameters.
Comprehensive Testing: Introduced new test cases to ensure the correct functionality and backward compatibility of both multi-keyword search and row context retrieval.

Changelog

README.md
- Updated search mode description to detail multi-keyword OR search and row context.
README_ja.md
- Updated search mode description in Japanese to detail multi-keyword OR search and row context.
docs/usage.md
- Added include_surrounding_cells parameter to the sharepoint_excel tool documentation.
- Updated query parameter description to mention comma-separated OR search.
- Added new sections demonstrating advanced search features, including multiple keyword search, search with row context, and combining both.
docs/usage_ja.md
- Added include_surrounding_cells parameter to the sharepoint_excel tool documentation in Japanese.
- Updated query parameter description in Japanese to mention comma-separated OR search.
- Added new sections in Japanese demonstrating advanced search features.
src/server.py
- Added include_surrounding_cells as a new boolean parameter to the sharepoint_excel function.
- Modified the query parameter's docstring to clarify multi-keyword OR search support.
- Passed the new include_surrounding_cells parameter to the parser.search_cells method.
- Updated the tool's description to reflect the new search features and their performance benefits.
src/sharepoint_excel.py
- Added include_surrounding_cells parameter to the search_cells method signature and its docstring.
- Modified the _scan_sheet method to parse comma-separated queries into multiple keywords for OR logic.
- Implemented logic within _scan_sheet to retrieve and include full row_data for matching cells when include_surrounding_cells is true.
- Converted sheet._cells.values() to a list before iteration to prevent dictionary modification errors during row data retrieval.
tests/test_server.py
- Updated the test_excel_search_mode test to explicitly pass include_surrounding_cells=False to the mocked search_cells method.
tests/test_sharepoint_excel.py
- Added private helper methods _create_test_data_excel and _create_search_test_excel for generating test Excel files.
- Introduced new test cases for test_search_with_surrounding_cells_disabled, test_search_with_surrounding_cells_enabled, test_search_with_surrounding_cells_multiple_matches, and test_search_with_surrounding_cells_empty_cells.
- Added new test cases for multi-keyword search: test_search_single_keyword_backward_compatible, test_search_multiple_keywords_comma_separated, test_search_multiple_keywords_with_spaces, test_search_multiple_keywords_no_match, test_search_multiple_keywords_across_sheets, and test_search_multiple_keywords_with_surrounding_cells.

Activity

All existing tests passed, ensuring backward compatibility.
New test methods were added to validate the include_surrounding_cells feature.
New test methods were added to verify the multi-keyword search functionality.
Quality checks, including ruff and ty, passed successfully.
The changes address and close issues #55 ("Add include_row_data parameter to Excel search for efficient data retrieval") and #56 ("Improve search functionality and clarify usage guidelines").
The pull request was generated using Claude Code.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

Copilot

Pull request overview

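This PR improves both the usability and the retrieval efficiency of Excel search, and the direction of reducing operational overhead (extra API calls) is very good. It keeps the existing search mode while adding multi-keyword support and row-context retrieval as opt-in features.

Changes:

Add comma-separated multi-keyword OR search to Excel search
Add include_surrounding_cells to include the cells of the same row in search results
Update the tool description, README/usage docs (English and Japanese), and tests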

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.

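File	Description
`src/sharepoint_excel.py`	Implements the search API extension (multi-keyword search / attaching row data)
`src/server.py`	Extends the MCP tool function's arguments, call, and description
`tests/test_sharepoint_excel.py`	Adds unit tests covering the new features (row context / multi-keyword search)
`tests/test_server.py`	Updates the test for the arguments passed from the server layer (new parameter)
`docs/usage.md`	Adds usage notes for the new search options (English)
`docs/usage_ja.md`	Adds usage notes for the new search options (Japanese)
`README.md`	Updates the description of search mode features (English)
`README_ja.md`	Updates the description of search mode features (Japanese)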

Copilot · 2026-02-10T14:58:55Z

+        # First match (D2: "在庫あり")
+        match1 = result["matches"][0]
+        assert match1["coordinate"] == "D2"
+        assert match1["value"] == "在庫あり"
+        assert "row_data" in match1
+        assert len(match1["row_data"]) == 4
+        assert match1["row_data"][1]["value"] == "商品A"
+
+        # Second match (D4: "在庫わずか")
+        match2 = result["matches"][1]
+        assert match2["coordinate"] == "D4"
+        assert match2["value"] == "在庫わずか"
+        assert "row_data" in match2
+        assert len(match2["row_data"]) == 4
+        assert match2["row_data"][1]["value"] == "商品C"
+


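This test assumes that result["matches"][0] / [1] come back in D2→D4 order, but the order of search results depends on implementation details (openpyxl's internal dictionary order and the scan order) and can change. If the API does not guarantee ordering, the test is more stable when it looks up each match by coordinate (or sorts by coordinate before comparing).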

Suggested change

-        # First match (D2: "在庫あり")
-        match1 = result["matches"][0]
-        assert match1["coordinate"] == "D2"
-        assert match1["value"] == "在庫あり"
-        assert "row_data" in match1
-        assert len(match1["row_data"]) == 4
-        assert match1["row_data"][1]["value"] == "商品A"
-        # Second match (D4: "在庫わずか")
-        match2 = result["matches"][1]
-        assert match2["coordinate"] == "D4"
-        assert match2["value"] == "在庫わずか"
-        assert "row_data" in match2
-        assert len(match2["row_data"]) == 4
-        assert match2["row_data"][1]["value"] == "商品C"
+        # Index the matches by coordinate so each one can be looked up directly
+        matches_by_coord = {m["coordinate"]: m for m in result["matches"]}
+        # Match at D2 ("在庫あり")
+        match_d2 = matches_by_coord["D2"]
+        assert match_d2["value"] == "在庫あり"
+        assert "row_data" in match_d2
+        assert len(match_d2["row_data"]) == 4
+        assert match_d2["row_data"][1]["value"] == "商品A"
+        # Match at D4 ("在庫わずか")
+        match_d4 = matches_by_coord["D4"]
+        assert match_d4["value"] == "在庫わずか"
+        assert "row_data" in match_d4
+        assert len(match_d4["row_data"]) == 4
+        assert match_d4["row_data"][1]["value"] == "商品C"

Copilot · 2026-02-10T14:58:56Z

+                            if include_surrounding_cells:
+                                row_cells = sheet[cell.row]
+                                match_entry["row_data"] = [
+                                    {
+                                        "value": self._serialize_value(c.value),
+                                        "coordinate": c.coordinate,
+                                    }
+                                    for c in row_cells
+                                ]


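When include_surrounding_cells=True, the whole row is fetched with row_cells = sheet[cell.row], but in openpyxl this access can materialize every cell up to max_column (empty cells are turned into Cell objects too), which can sharply increase memory and CPU usage. On a large sheet, or a wide range that only carries formatting, even a single match can produce a huge row_data. It is safer to add safeguards such as (1) building row_data only from existing cells, or restricting it to the column range actually needed, and (2) truncating at a cap equivalent to the existing config.excel_max_data_cols.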
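
A rough sketch of safeguard (1) combined with (2): build the row context only from cells that already exist, and stop at a column cap. To stay self-contained it uses a plain dict keyed by `(row, column)` as a stand-in for the worksheet's cell store rather than openpyxl itself; `build_row_data`, `column_letter`, and `MAX_DATA_COLS` are hypothetical names, and the cap value is an assumption.

```python
from string import ascii_uppercase

# Hypothetical cap standing in for config.excel_max_data_cols.
MAX_DATA_COLS = 100


def column_letter(col: int) -> str:
    """Convert a 1-based column index to an Excel column label (A..Z, AA, ...)."""
    letters = ""
    while col > 0:
        col, rem = divmod(col - 1, 26)
        letters = ascii_uppercase[rem] + letters
    return letters


def build_row_data(cells: dict, row: int, max_cols: int = MAX_DATA_COLS) -> list[dict]:
    """Collect only populated cells of `row`, ignoring columns past `max_cols`."""
    populated_cols = sorted(c for (r, c) in cells if r == row and c <= max_cols)
    return [
        {"value": cells[(row, c)], "coordinate": f"{column_letter(c)}{row}"}
        for c in populated_cols
    ]


# Stand-in for a sheet: only populated cells are stored.
cells = {
    (2, 1): 1,
    (2, 2): "商品A",
    (2, 4): "在庫あり",
    (2, 2000): "stray value far to the right",
    (3, 2): "商品B",
}
row_data = build_row_data(cells, row=2)
print([entry["coordinate"] for entry in row_data])  # ['A2', 'B2', 'D2']
```

This variant omits empty cells, so a row with a gap yields fewer entries than the `sheet[cell.row]` approach. Tests that assert a fixed `len(row_data)` would need adjusting if this trade-off were adopted.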

Copilot · 2026-02-10T14:58:56Z

+            include_surrounding_cells: If True, include all cells in the same row as the matched cell
        """
+        # Parse the comma-separated query into multiple keywords
+        keywords = [kw.strip() for kw in query.split(",") if kw.strip()]


keywords = [kw.strip() for kw in query.split(",") if kw.strip()] can produce an empty list (e.g. query="," or " , "). In that case any(...) is always False, so even a mistyped query silently succeeds with 0 hits. At minimum, raising ValueError when keywords is empty, or adding warnings to the JSON returned by search_cells, would make an invalid query explicit and the API easier to understand.

Suggested change

-        keywords = [kw.strip() for kw in query.split(",") if kw.strip()]
+        keywords = [kw.strip() for kw in query.split(",") if kw.strip()]
+        # If no keyword can be extracted, treat the query as invalid (e.g. a typo)
+        if not keywords:
+            raise ValueError(
+                "Invalid search query. Specify at least one valid keyword other than commas."
+            )

gemini-code-assist

Code Review

This pull request introduces valuable improvements to the Excel search functionality, including multi-keyword OR search and the ability to fetch row context for matches. However, the implementation of the include_surrounding_cells feature creates a significant Denial of Service (DoS) vulnerability due to a lack of resource limits and deduplication logic, potentially leading to excessive memory consumption and data duplication. The search_cells method also lacks crucial data size limits. Additionally, there is an opportunity to refactor duplicated code to improve maintainability.

gemini-code-assist · 2026-02-10T14:59:53Z

+                            if include_surrounding_cells:
+                                row_cells = sheet[cell.row]
+                                match_entry["row_data"] = [
+                                    {
+                                        "value": self._serialize_value(c.value),
+                                        "coordinate": c.coordinate,
+                                    }
+                                    for c in row_cells
+                                ]
+
+                            matches.append(match_entry)


The implementation of include_surrounding_cells can lead to significant memory exhaustion. For every matching cell, the entire row's data is fetched and added to the response. If multiple cells in the same row match the query, the entire row is duplicated in the response for each match. In a wide sheet (e.g., 2000 columns), a few hundred matches can result in a massive JSON response that could crash the server or the client.

Remediation:

Deduplicate row data so that each row is only included once per search result, even if multiple cells in that row match.

Implement a hard limit on the total number of matches or cells returned in a single search request.

Apply the excel_max_data_cols limit to the row context fetching.
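
The deduplication step in the remediation list could look roughly like this: each row is fetched and stored once, and every match points at it by key. This is only a sketch under assumptions; the `rows` / `row_ref` response shape and the `attach_rows_once` and `fetch_row` names are hypothetical and not part of the PR.

```python
def attach_rows_once(matches: list[dict], fetch_row) -> dict:
    """Store each matched row once and let matches refer to it by key."""
    rows: dict[str, list] = {}
    linked = []
    for match in matches:
        key = f'{match["sheet"]}!{match["row"]}'
        if key not in rows:
            # Fetch the row only the first time it is seen.
            rows[key] = fetch_row(match["sheet"], match["row"])
        linked.append({**match, "row_ref": key})
    return {"matches": linked, "rows": rows}


calls = []


def fake_fetch_row(sheet: str, row: int) -> list[str]:
    """Stand-in for reading a row; records how often it is called."""
    calls.append((sheet, row))
    return [f"{sheet}-r{row}-c{col}" for col in range(1, 4)]


matches = [
    {"sheet": "Sheet1", "row": 2, "coordinate": "B2"},
    {"sheet": "Sheet1", "row": 2, "coordinate": "D2"},
    {"sheet": "Sheet1", "row": 4, "coordinate": "D4"},
]
result = attach_rows_once(matches, fake_fetch_row)
print(len(result["matches"]), len(result["rows"]), len(calls))  # 3 2 2
```

Two matches in row 2 share a single stored row, so response size grows with the number of distinct rows rather than the number of matching cells.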

gemini-code-assist · 2026-02-10T14:59:53Z

+                                if include_surrounding_cells:
+                                    row_cells = sheet[cell.row]
+                                    match_entry["row_data"] = [
+                                        {
+                                            "value": self._serialize_value(c.value),
+                                            "coordinate": c.coordinate,
+                                        }
+                                        for c in row_cells
+                                    ]
+
+                                matches.append(match_entry)


This fallback loop for search_cells also suffers from the same resource exhaustion vulnerability as the primary loop. It duplicates entire row data for every match when include_surrounding_cells is True, without any limits on the number of matches or row width.

gemini-code-assist · 2026-02-10T14:59:53Z

-                for cell in sheet._cells.values():
+                # Note: converting to a list before iterating avoids errors caused by
+                # the sheet's internal dict being modified while row_data is fetched
+                for cell in list(sheet._cells.values()):


The search_cells method lacks crucial data size limits (excel_max_data_rows, excel_max_data_cols) that are enforced in parse_to_json, creating a Denial of Service (DoS) vulnerability with large workbooks. It is recommended to apply these limits to prevent excessive resource consumption. Additionally, the logic for creating a match entry is duplicated across if hasattr(sheet, "_cells") and its else block, which could be refactored into a helper function to improve maintainability.

Based on real-world usage feedback, OR search was causing poor results (23 hits → 7 attempts → 26,600 tokens). AND search provides better precision for narrowing down results.

Changes:
- Replace comma-separated OR search with space-separated AND search
- Update keyword parsing: split(",") → split()
- Change match logic: any() → all()
- Update all tests and documentation

Example:
- Before: query="予算,利益" (OR - finds cells with either keyword)
- After: query="予算 報告" (AND - finds cells with both keywords)

Use case: "簾舞 連絡先" finds cells containing both keywords in one call, instead of getting too many irrelevant results with OR search.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Update all documentation to reflect the change from OR to AND search:
- README.md / README_ja.md: Update feature description
- docs/usage.md / docs/usage_ja.md: Update examples and best practices

Examples changed:
- "budget,forecast" (OR) → "budget report" (AND)
- "予算,見積" (OR) → "予算 報告" (AND)

Best practice guidance added:
- Start with single keyword
- Add more keywords to narrow down if needed

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

k-ibaraki · 2026-02-10T15:08:24Z

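Revised based on feedback

After feedback from real agent usage that OR search was hard to work with, I changed it to AND search.

Problems

OR search: 23 hits → 7 attempts → 26,600 tokens consumed
Combining unrelated concepts (停電,対応業者,連絡先: "power outage", "contractor", "contact information") made the results explode

Fix

OR search (removed) → AND search (implemented)

# Before (OR search)
query="予算,利益"  # contains either keyword → too many results

# After (AND search)
query="簾舞 連絡先"  # contains both keywords → narrowed down in one call

Implementation changes

Keyword parsing: split(",") → split()
Match logic: any() → all()
All tests and documentation updated

Best practices

Start with a single-keyword search
If there are too many results, add keywords to narrow them down with AND search

✅ All tests pass
✅ Quality checks pass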
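
The switch from `split(",")` + `any()` to `split()` + `all()` described above can be sketched as below. The helper names are hypothetical and substring matching is an assumption. One detail worth noting is that `all()` over an empty keyword list is `True`, so an AND search needs an explicit empty-query check or a blank query would match every cell.

```python
def parse_keywords(query: str) -> list[str]:
    """Split a whitespace-separated query into keywords (AND search)."""
    keywords = query.split()
    if not keywords:
        # Without this guard, all() over an empty list would match every cell.
        raise ValueError("query must contain at least one keyword")
    return keywords


def matches_all(cell_text: str, keywords: list[str]) -> bool:
    """AND logic: a cell matches only if it contains every keyword."""
    return all(kw in cell_text for kw in keywords)


cells = ["簾舞 連絡先 一覧", "簾舞 地図", "本社 連絡先"]
keywords = parse_keywords("簾舞 連絡先")
hits = [text for text in cells if matches_all(text, keywords)]
print(hits)  # ['簾舞 連絡先 一覧']
```

Because `str.split()` with no argument splits on any Unicode whitespace, a query typed with a full-width space (U+3000) is separated into keywords in the same way.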

- Add empty query validation (ValueError on empty/whitespace-only)
- Add error handling for include_surrounding_cells row_data retrieval
- Add match count limits (1000 max, 500 warning threshold)
- Add warnings array to search response for actionable feedback
- Add comprehensive test coverage for edge cases

Improves production robustness based on agent feedback

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

- Document empty query restriction
- Document 1000 match limit and 500 warning threshold
- Add error handling documentation (row_data_error field)
- Add warnings array documentation
- Update workflow recommendations for large result sets

Helps users write effective search queries

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

- Add query guidelines (empty query, specific keywords)
- Document performance considerations (1000 limit, 500 warning)
- Add error handling examples
- Add good/bad query examples
- Add warning handling examples

Helps users write effective search queries

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
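
The match limits and warnings array mentioned in these commits might behave roughly as sketched here. The numbers 1000 and 500 come from the commit messages; everything else (the function name, the response keys, the warning wording, and whether the thresholds are strict) is an assumption made for illustration.

```python
MAX_MATCHES = 1000       # hard cap, from the commit message
WARNING_THRESHOLD = 500  # warning threshold, from the commit message


def build_search_response(matches: list[dict]) -> dict:
    """Cap the number of returned matches and report it via a warnings list."""
    warnings: list[str] = []
    total = len(matches)
    if total > MAX_MATCHES:
        warnings.append(
            f"Found {total} matches; only the first {MAX_MATCHES} are returned. "
            "Add keywords to narrow the search."
        )
    elif total > WARNING_THRESHOLD:
        warnings.append(
            f"Found {total} matches. Consider adding keywords to narrow the search."
        )
    limited = matches[:MAX_MATCHES]
    return {"match_count": len(limited), "matches": limited, "warnings": warnings}


small = build_search_response([{"coordinate": f"A{i}"} for i in range(1, 11)])
large = build_search_response([{"coordinate": f"A{i}"} for i in range(1, 1501)])
print(small["match_count"], small["warnings"])  # 10 []
print(large["match_count"], len(large["warnings"]))  # 1000 1
```

This sketch truncates after all matches are collected. An implementation concerned with memory would more likely stop scanning once the cap is reached.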

k-ibaraki · 2026-02-10T15:27:43Z

Precision actually got worse, so I am discarding this.

k-ibaraki and others added 2 commits February 10, 2026 23:52

Copilot AI review requested due to automatic review settings February 10, 2026 14:53

Copilot started reviewing on behalf of k-ibaraki February 10, 2026 14:53

Copilot AI reviewed Feb 10, 2026

gemini-code-assist Bot reviewed Feb 10, 2026

k-ibaraki and others added 2 commits February 11, 2026 00:07

k-ibaraki and others added 3 commits February 11, 2026 00:21

k-ibaraki closed this Feb 10, 2026

k-ibaraki deleted the feat/issues-55-56-search-improvements branch February 10, 2026 15:28

k-ibaraki wants to merge 7 commits into main from feat/issues-55-56-search-improvements

2 participants
