54 changes: 54 additions & 0 deletions .dockerignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,54 @@
# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
*.egg-info/
dist/
build/
*.egg
.pytest_cache/
.mypy_cache/
.ruff_cache/

# Virtual environments
.venv/
venv/
ENV/

# IDE
.idea/
.vscode/
*.swp
*.swo
*~

# Git
.gitignore
.gitattributes

# Documentation
docs/
*.md
!README.md

# Tests
tests/
.pytest_cache/

# CI/CD
.github/
.pre-commit-config.yaml

# Environment files (will be passed as build args or mounted)
.env

# Chainlit specific
.chainlit/
.files/

# Misc
*.log
.DS_Store
literature/
3 changes: 3 additions & 0 deletions .gitignore
@@ -312,3 +312,6 @@ tests/example2.*

# Client data
src/paperqa/clients/client_data/retractions.csv
/src/.chainlit/*
.chainlit/*
chainlit.md
102 changes: 102 additions & 0 deletions DOCKER_README.md
@@ -0,0 +1,102 @@
# Docker Setup for Chainlit UI

This guide explains how to run the DESTINY repo search agent UI in a Docker container.

## Quick Start

### Using Docker Compose (Recommended)

1. **Build and run the container:**
```bash
docker compose up --build
```

2. **Access the UI:**
Open your browser to http://localhost:8000

3. **Stop the container:**
```bash
docker compose down
```

### Using Docker Directly (Untested)

1. **Build the image:**
```bash
docker build -t paper-qa-chainlit .
```

2. **Run the container:**
```bash
docker run -p 8000:8000 -p 42071:42071 --env-file .env paper-qa-chainlit
```

## Environment Variables

The application requires several environment variables for API access. When you use Docker Compose, they are loaded automatically from the `.env` file in the project's root directory.

If you haven't created one yet, make sure to do so!

**Required variables:**
- `AZURE_API_BASE` - Azure OpenAI endpoint
- `AZURE_API_KEY` - Azure OpenAI API key
- `DESTINY_API_URL` - DESTINY API URL
- `DESTINY_CLIENT_ID` - DESTINY OAuth client ID
- `DESTINY_AUTHORITY` - DESTINY OAuth authority
- `DESTINY_LOGIN_HINT` - User email for DESTINY
- `DESTINY_SCOPES` - DESTINY API scopes
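
A quick pre-flight check can catch a missing variable before the container starts. The following is a minimal sketch, not part of this repository: the helper name `missing_env_vars` is hypothetical, and it only inspects the environment it is given (by default the current process environment), so it does not parse `.env` itself.

```python
import os

# Variable names copied from the "Required variables" list above.
REQUIRED_VARS = (
    "AZURE_API_BASE",
    "AZURE_API_KEY",
    "DESTINY_API_URL",
    "DESTINY_CLIENT_ID",
    "DESTINY_AUTHORITY",
    "DESTINY_LOGIN_HINT",
    "DESTINY_SCOPES",
)


def missing_env_vars(environ=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing variables:", ", ".join(missing))
    else:
        print("All required variables are set.")
```

Passing a dictionary instead of relying on `os.environ` keeps the check easy to reuse, for example against values read from a `.env` file by another tool.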

## Data Persistence

Docker Compose sets up volumes for persistent data:
- `chainlit-data` - Chainlit configuration and session data
- `chainlit-files` - Uploaded files
- `chainlit-public` - Public assets

To view or back up this data:
```bash
docker volume ls
docker volume inspect paper-qa-chainlit-data
```

## Troubleshooting

### Port already in use
If port 8000 is already in use, change it in docker-compose.yml:
```yaml
ports:
- "8080:8000" # Use port 8080 on host
```
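
To find out which host ports are already taken before editing the mapping, a small check along these lines may help. This is a sketch using only the Python standard library; the function name `port_is_free` is an assumption for illustration and it only tests TCP on the loopback interface.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if nothing is accepting TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 when a listener accepted the connection.
        return sock.connect_ex((host, port)) != 0


if __name__ == "__main__":
    # 8000 serves the UI; 42071 is used for the authentication callback.
    for port in (8000, 42071):
        state = "free" if port_is_free(port) else "in use"
        print(f"Port {port} is {state}")
```

Run it on the host while the container is stopped; a port reported as in use needs a different host-side mapping.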

### Authentication issues with DESTINY
The DESTINY OAuth flow requires interactive authentication using MSAL. The container exposes port 42071 for the authentication callback.

**How it works:**
1. When you start the container and visit `localhost:8000` in your browser, MSAL prints authentication URLs in your terminal.
2. Copy the second URL and open it in your **host machine's browser** (not inside the container).
3. Complete the Microsoft login.
4. The callback is sent to `localhost:42071`, which is forwarded to the container.
5. Authentication completes and the token is cached.

**If authentication fails:**
- Ensure port 42071 is not blocked by your firewall
- Check that the port mapping is correct: `docker compose ps`
- Ensure you're visiting the **second** authentication URL from your terminal
- Ensure you visit `localhost:8000` in your browser before looking for the auth link in your terminal
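
For orientation, an interactive MSAL login that listens on a fixed callback port can be sketched as follows. This is an illustration under assumptions, not the repository's code: the function name `acquire_destiny_token` is hypothetical, and the project's own `get_access_token` may be implemented differently. It relies on the third-party `msal` package.

```python
def acquire_destiny_token(client_id, authority, scopes, login_hint, port=42071):
    """Sketch of an interactive MSAL login using a fixed callback port."""
    import msal  # third-party dependency: pip install msal

    app = msal.PublicClientApplication(client_id, authority=authority)

    # Reuse a cached token when one exists for a known account.
    accounts = app.get_accounts(username=login_hint)
    result = None
    if accounts:
        result = app.acquire_token_silent(scopes, account=accounts[0])

    if not result:
        # Starts a local HTTP listener on `port` and waits for the redirect.
        result = app.acquire_token_interactive(
            scopes=scopes, login_hint=login_hint, port=port
        )

    if "access_token" not in result:
        raise RuntimeError(result.get("error_description", "authentication failed"))
    return result["access_token"]
```

Fixing the port is what makes the Docker port mapping possible: without it, MSAL picks a free port at random and the callback from the host browser would have nowhere to land.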

### View logs
```bash
docker compose logs -f chainlit-ui
```

## Cleanup

Remove all containers and volumes:
```bash
docker compose down -v
```

Remove the image:
```bash
docker rmi paper-qa-chainlit
```
26 changes: 26 additions & 0 deletions Dockerfile
@@ -0,0 +1,26 @@
FROM python:3.12-slim-trixie
COPY --from=ghcr.io/astral-sh/uv:latest /uv /uvx /bin/

# Install git (required by setuptools-scm for version detection)
RUN apt-get update && apt-get install -y git && rm -rf /var/lib/apt/lists/*

# Copy the project into the image
COPY . /app

# Set working directory
WORKDIR /app

# Install Python dependencies (skip dev dependencies)
RUN uv sync --locked --no-dev

# Add the virtual environment to PATH so we use the installed packages
ENV PATH="/app/.venv/bin:$PATH"

# Expose Chainlit default port
EXPOSE 8000

# Expose MSAL authentication callback port
EXPOSE 42071

# Run the Chainlit app directly from the venv
CMD ["chainlit", "run", "app.py", "--host", "0.0.0.0", "--port", "8000"]
83 changes: 83 additions & 0 deletions README.md
@@ -1,5 +1,9 @@
# PaperQA2

## To run the project locally

In line with the existing [CONTRIBUTING.md](CONTRIBUTING.md) file, executing `uv sync` in the project root is sufficient to start editing and running the project code locally.

## To run on our infrastructure

There is a basic `azure.json` configuration file in `src/paperqa/configs` that provides a simple configuration for `paperqa`'s `Settings` object.
@@ -14,6 +18,85 @@ For it to work, it requires a `.env` file in the project root directory populate
- `OPENALEX_MAILTO`

To make use of the configuration, simply create a `Settings` object using its `from_name` class method, passing the stem of the json config as a string, i.e. `Settings.from_name("azure")`.

## To run the DESTINY repo paper helper

The following additional environment variables are required:

- `DESTINY_API_URL` (at the time of writing, https://destiny-repository-stag-app.proudmeadow-2a76e8ac.swedencentral.azurecontainerapps.io)
- `DESTINY_CLIENT_ID` (at the time of writing, 96ed941e-15dc-4ec0-b9e7-e4eda99efd2e)
- `DESTINY_AUTHORITY` (at the time of writing, https://login.microsoftonline.com/f870e5ae-5521-4a94-b9ff-cdde7d36dd35)
- `DESTINY_SCOPES` (at the time of writing, api://14e3f6c0-b8aa-46c6-98d9-29b0dd2a0f7c/.default, written as a list, i.e. enclosed in double quotes and ending with a comma)
- `DESTINY_LOGIN_HINT` (your UCL email address; see Lena's authentication notebook in the Teams channel for more)
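
Because `DESTINY_SCOPES` is written with a trailing comma, a plain `split(",")` leaves an empty string at the end of the list. A tolerant parser might look like the sketch below; the helper name `parse_scopes` is hypothetical, and stripping double quotes is an assumption, since whether they survive depends on how the `.env` file is loaded.

```python
def parse_scopes(raw):
    """Split a comma-separated scope string, dropping blanks and stray quotes."""
    parts = (part.strip().strip('"') for part in (raw or "").split(","))
    return [part for part in parts if part]


if __name__ == "__main__":
    print(parse_scopes('"api://example-app-id/.default",'))
```

The `raw or ""` guard also covers the case where the variable is unset and `os.getenv` returns `None`.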

See `test_contribs.py` for an example of running the paper helper.

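Alternatively, using this forked version of paper-qa as a local package or dependency should work. Note that the snippet uses top-level `await`, so run it in a notebook or inside an async function: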

```python
import os
from dotenv import load_dotenv
from paperqa import Settings
from paperqa.contrib.destiny_paper_helper import DESTINYPaperHelper
from paperqa.settings import IndexSettings

load_dotenv()

paper_directory = "~/some-directory"

settings = Settings.from_name("azure").model_copy(
    update={
        "paper_directory": paper_directory,
        "index": IndexSettings(paper_directory=paper_directory)
    }
)
helper = DESTINYPaperHelper(
    settings,
    api_url=os.getenv("DESTINY_API_URL"),
    client_id=os.getenv("DESTINY_CLIENT_ID"),
    authority=os.getenv("DESTINY_AUTHORITY"),
    login_hint=os.getenv("DESTINY_LOGIN_HINT"),
    scopes=os.getenv("DESTINY_SCOPES").split(","),
)

question = "What is the progress on climate change intervention research?"

papers = await helper.fetch_relevant_papers(question)

docs = await helper.aadd_docs(papers)

session = await docs.aquery(question, settings=helper.settings)

print(session.answer)
```

## To run the agent with a DESTINY repo search tool

```python
from dotenv import load_dotenv
from paperqa import Settings, agent_query

load_dotenv() # load your environment variables

paper_directory = "~/some-directory"

settings = Settings.from_name("search_only_destiny").model_copy(
    update={
        "paper_directory": paper_directory,
        "verbosity": 0  # to reduce output
    }
)

query = "What are the greatest health risks brought about by climate change?"

answer_response = await agent_query(
    query=query,
    settings=settings
)

print(answer_response.session.answer) # show the agent's response
```

<!-- pyml disable-num-lines 6 line-length -->

[![GitHub](https://img.shields.io/badge/GitHub-black?logo=github&logoColor=white)](https://github.com/Future-House/paper-qa)
99 changes: 99 additions & 0 deletions app.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,99 @@
import tempfile

import chainlit as cl
from lmi.utils import update_litellm_max_callbacks

from paperqa import agent_query, Settings
from paperqa.sources.destiny_repo import get_access_token

# Suppress LiteLLM callback warnings
update_litellm_max_callbacks()


@cl.on_chat_start
async def on_chat_start():
    print("Starting new chat session.")
    print("Attempting to retrieve DESTINY access token...")
    get_access_token()
    print("Retrieved access token successfully.")
    print("Creating paperqa settings...")
    tempdir = tempfile.TemporaryDirectory()
    cl.user_session.set("tempdir", tempdir)
    settings = Settings.from_name("search_only_destiny")
    settings.agent.index.paper_directory = tempdir.name
    settings.verbosity = 0
    cl.user_session.set("paperqa-settings", settings)


@cl.on_chat_end
def on_chat_end():
    tempdir = cl.user_session.get("tempdir")
    # The session may have ended before on_chat_start stored a directory
    if tempdir is not None:
        tempdir.cleanup()


@cl.on_message
async def main(message: cl.Message):
    # Track the currently open step so the callbacks can update it
    current_step = None
    step_count = 0

    async def on_agent_action_callback(action, _state):
        """Called when agent takes an action (tool call)."""
        nonlocal current_step, step_count
        step_count += 1

        # Extract tool names and convert them to title case for display
        tool_names = [tc.function.name for tc in action.tool_calls]
        display_names = [name.replace("_", " ").title() for name in tool_names]
        step_name = f"{', '.join(display_names)} Tool - Step {step_count}"

        # Build detailed output for each tool call
        tool_details = []
        for tc in action.tool_calls:
            tool_name = tc.function.name

            # Check if this is a DESTINY search and extract the query
            if tool_name == "destiny_search":
                query = tc.function.arguments.get("query", "N/A")
                tool_details.append(f"**DESTINY API Search Query**: `{query}`")
                current_step = cl.Step(name=step_name, show_input=True)
            else:
                # For other tools, just show the name
                tool_details.append(f"**{tool_name}**")
                current_step = cl.Step(name=step_name, show_input=False)

        # Open the step and show the tool details as its input
        await current_step.__aenter__()
        current_step.input = "\n".join(tool_details)
        await current_step.update()

    async def on_env_step_callback(obs, _reward, _done, _truncated):
        """Called after environment processes the action."""
        nonlocal current_step

        if current_step is not None:
            # Format the observations (tool results), keeping every tool message
            result_output = ""
            for msg in obs:
                if hasattr(msg, 'role') and msg.role == 'tool':
                    result_output += "\n\n" + str(msg.content)

            # Show a completion marker followed by the tool results
            current_step.output = f"\n\n**Completed**{result_output}"
            await current_step.update()
            await current_step.__aexit__(None, None, None)
            current_step = None

    # Create a final answer step
    async with cl.Step(name="Generating Answer") as answer_step:
        answer_response = await agent_query(
            query=str(message.content),
            settings=cl.user_session.get("paperqa-settings"),
            on_agent_action_callback=on_agent_action_callback,
            on_env_step_callback=on_env_step_callback
        )
        answer_step.output = "Agent completed processing"

    await cl.Message(
        content=answer_response.session.answer
    ).send()