7 changes: 7 additions & 0 deletions .env.rag.example
@@ -66,3 +66,10 @@ S3_ACCOUNT1_SCHEDULES=
#SERPAPI_KEY=your-serpapi-api-key
#SERPAPI_QUERIES="OpenAI news, Bitcoin price, Tesla updates"
#SERPAPI_SCHEDULES=60

# SHAREPOINT CONNECTORS (optional):

#SHAREPOINT1_CLIENT_ID=your-azure-app-client-id
#SHAREPOINT1_CLIENT_SECRET=your-azure-app-client-secret
#SHAREPOINT1_TENANT_ID=your-azure-tenant-id
#SHAREPOINT1_SCHEDULES=3600
52 changes: 52 additions & 0 deletions README.md
@@ -228,6 +228,58 @@ SERPAPI1_QUERIES=aaa
SERPAPI1_SCHEDULES=3600
````

### SharePoint Connector

The SharePoint connector ingests files from SharePoint document libraries or site pages.
Authentication is via Microsoft Entra ID (client credentials / app registration).
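
For orientation, the client-credentials flow mentioned above boils down to one POST against the tenant's token endpoint. The sketch below only builds that request and sends nothing. It is not the connector's implementation, which this section does not show; the function name `build_token_request` is hypothetical, and the Microsoft Graph `.default` scope is an assumption about which API the connector targets.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_token_request(tenant_id: str, client_id: str, client_secret: str) -> Request:
    """Build (but do not send) an OAuth 2.0 client-credentials token request.

    Hypothetical helper for illustration only.
    """
    # Microsoft identity platform v2.0 token endpoint for the given tenant.
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    form = {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        # Assumption: the connector talks to Microsoft Graph.
        "scope": "https://graph.microsoft.com/.default",
    }
    body = urlencode(form).encode("ascii")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return Request(url, data=body, headers=headers, method="POST")


if __name__ == "__main__":
    req = build_token_request("your-azure-tenant-id", "your-azure-app-client-id", "secret")
    print(req.get_method(), req.full_url)
```

The three arguments correspond to the `tenant_id`, `client_id`, and `client_secret` settings in the configuration.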

Supports two modes:
- **File mode** (`sharepoint_type: file`, default): load files from a drive/folder
- **Page mode** (`sharepoint_type: page`): load SharePoint site pages

```yaml
# config.yaml

sources:
  # Loading files from a SharePoint drive
  - type: "sharepoint"
    name: "sharepoint1"
    config:
      client_id: "${SHAREPOINT1_CLIENT_ID}"
      client_secret: "${SHAREPOINT1_CLIENT_SECRET}"
      tenant_id: "${SHAREPOINT1_TENANT_ID}"
      # sharepoint_site_id can be provided instead of sharepoint_site_name
      sharepoint_site_name: "MySite"
      # optionally, instead of site_name:
      # sharepoint_host_name: "contoso.sharepoint.com"
      # sharepoint_relative_url: "sites/YourSiteName"
      # sharepoint_folder_id can be provided instead of sharepoint_folder_path
      sharepoint_folder_path: "Documents/Reports"
      sharepoint_type: "file" # "file" (default) or "page"
      recursive: true
      schedules: "${SHAREPOINT1_SCHEDULES}"

  # Loading SharePoint site pages
  - type: "sharepoint"
    name: "sharepoint2"
    config:
      client_id: "${SHAREPOINT2_CLIENT_ID}"
      client_secret: "${SHAREPOINT2_CLIENT_SECRET}"
      tenant_id: "${SHAREPOINT2_TENANT_ID}"
      sharepoint_site_name: "TeamSite"
      sharepoint_type: "page"
      schedules: "${SHAREPOINT2_SCHEDULES}"
```

```dotenv
# .env.rag

SHAREPOINT1_CLIENT_ID=your-azure-app-client-id
SHAREPOINT1_CLIENT_SECRET=your-azure-app-client-secret
SHAREPOINT1_TENANT_ID=your-azure-tenant-id
SHAREPOINT1_SCHEDULES=3600
```
Comment on lines +262 to +281

⚠️ Potential issue | 🟡 Minor

Align env snippet with the second connector example.

This section introduces sharepoint2 config but the .env.rag snippet only shows SHAREPOINT1_*. Please either add SHAREPOINT2_* placeholders or explicitly state the snippet is single-connector only.

📘 Suggested doc patch
 SHAREPOINT1_CLIENT_ID=your-azure-app-client-id
 SHAREPOINT1_CLIENT_SECRET=your-azure-app-client-secret
 SHAREPOINT1_TENANT_ID=your-azure-tenant-id
 SHAREPOINT1_SCHEDULES=3600
+# Optional second SharePoint connector
+# SHAREPOINT2_CLIENT_ID=your-azure-app-client-id-2
+# SHAREPOINT2_CLIENT_SECRET=your-azure-app-client-secret-2
+# SHAREPOINT2_TENANT_ID=your-azure-tenant-id-2
+# SHAREPOINT2_SCHEDULES=3600
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
  # Loading SharePoint site pages
  - type: "sharepoint"
    name: "sharepoint2"
    config:
      client_id: "${SHAREPOINT2_CLIENT_ID}"
      client_secret: "${SHAREPOINT2_CLIENT_SECRET}"
      tenant_id: "${SHAREPOINT2_TENANT_ID}"
      sharepoint_site_name: "TeamSite"
      sharepoint_type: "page"
      schedules: "${SHAREPOINT2_SCHEDULES}"
```
```dotenv
# .env.rag
SHAREPOINT1_CLIENT_ID=your-azure-app-client-id
SHAREPOINT1_CLIENT_SECRET=your-azure-app-client-secret
SHAREPOINT1_TENANT_ID=your-azure-tenant-id
SHAREPOINT1_SCHEDULES=3600
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@README.md` around lines 262 - 281, The README shows a second connector named
"sharepoint2" but the .env.rag snippet only defines SHAREPOINT1_* variables;
update the docs to either add matching SHAREPOINT2_* placeholders
(SHAREPOINT2_CLIENT_ID, SHAREPOINT2_CLIENT_SECRET, SHAREPOINT2_TENANT_ID,
SHAREPOINT2_SCHEDULES) in the env example to align with the sharepoint2 config,
or add a clear note near the .env.rag snippet stating it demonstrates only
SHAREPOINT1 and additional connectors like sharepoint2 require analogous
SHAREPOINT2_* entries; reference the "sharepoint2" connector name and the
SHAREPOINT2_* variable names so readers can find and fix the mismatch.
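
A mismatch like this can also be caught mechanically. The following is a minimal sketch, assuming placeholders use the `${NAME}` syntax shown in the config examples and that the env file holds plain `NAME=value` lines. The function names are hypothetical and not part of this repository.

```python
import re

# Matches ${VAR_NAME} placeholders as written in the config examples.
PLACEHOLDER = re.compile(r"\$\{([A-Z0-9_]+)\}")


def referenced_vars(config_text: str) -> set[str]:
    """Collect every ${VAR} name in the config text (commented-out lines included)."""
    return set(PLACEHOLDER.findall(config_text))


def defined_vars(env_text: str) -> set[str]:
    """Collect variable names from NAME=value lines, skipping blanks and comments."""
    names = set()
    for line in env_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        names.add(line.split("=", 1)[0].strip())
    return names


def missing_vars(config_text: str, env_text: str) -> list[str]:
    """Names referenced in the config but not defined in the env file, sorted."""
    return sorted(referenced_vars(config_text) - defined_vars(env_text))


if __name__ == "__main__":
    config = 'client_id: "${SHAREPOINT1_CLIENT_ID}"\nclient_id: "${SHAREPOINT2_CLIENT_ID}"\n'
    env = "SHAREPOINT1_CLIENT_ID=your-azure-app-client-id\n"
    print(missing_vars(config, env))
```

Run against the README example, a check of this kind would flag the four `SHAREPOINT2_*` names as undefined, which is the gap the comment above describes.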


## Embeddings and Inference

### Embeddings support
17 changes: 17 additions & 0 deletions config.yaml.example
@@ -40,6 +40,23 @@ sources:
# queries: "${SERPAPI_QUERIES}"
# schedules: "${SERPAPI_SCHEDULES}"

#- type: "sharepoint"
#  name: "sharepoint1"
#  config:
#    client_id: "${SHAREPOINT1_CLIENT_ID}"
#    client_secret: "${SHAREPOINT1_CLIENT_SECRET}"
#    tenant_id: "${SHAREPOINT1_TENANT_ID}"
#    # sharepoint_site_id can be provided instead of sharepoint_site_name
#    sharepoint_site_name: "MySite"
#    # optionally, instead of site_name:
#    # sharepoint_host_name: "contoso.sharepoint.com"
#    # sharepoint_relative_url: "sites/YourSiteName"
#    # sharepoint_folder_id can be provided instead of sharepoint_folder_path
#    sharepoint_folder_path: "Documents/Reports"
#    sharepoint_type: "file" # "file" (default) or "page"
#    recursive: true
#    schedules: "${SHAREPOINT1_SCHEDULES}"

embedding:
# can be `local` or `openrouter`/`openai`
provider: local