Open
42 commits
b8f050b
Merge pull request #113 from buerokratt/dev
nuwangeek Nov 1, 2025
670b56a
Merge pull request #120 from buerokratt/dev
erangi-ar Nov 10, 2025
fae859f
Merge pull request #121 from rootcodelabs/dev
erangi-ar Nov 10, 2025
e11a1ef
bug fixes in gc
Nov 10, 2025
fe08371
Fixed bugs and added separate endpoint to handle mock ckb update with…
nuwangeek Nov 11, 2025
7e321f6
fixed ruff lint issues and fix env issue
nuwangeek Nov 11, 2025
6611816
removed ollama container
nuwangeek Nov 12, 2025
24584e1
fixed requested changes
nuwangeek Nov 12, 2025
38bc160
Merge pull request #122 from rootcodelabs/deployment-bug-fixes-Bimsara
nuwangeek Nov 12, 2025
fdb047a
Merge branch 'docker-fixes-Bimsara' of https://github.com/rootcodelab…
nuwangeek Nov 12, 2025
a0f46e8
removed env
nuwangeek Nov 12, 2025
5ba8c05
fixed uv path issue
nuwangeek Nov 12, 2025
cecaa0a
Merge pull request #123 from rootcodelabs/docker-fixes-Bimsara
nuwangeek Nov 12, 2025
416a41f
Merge pull request #124 from buerokratt/dev
erangi-ar Nov 13, 2025
80eef14
Fixed issues in model training progress update and error handling
nuwangeek Nov 13, 2025
0b61232
Merge pull request #125 from rootcodelabs/docker-fixes-Bimsara
nuwangeek Nov 13, 2025
407f540
fixed env issue
nuwangeek Nov 13, 2025
3c664f7
Merge pull request #127 from rootcodelabs/docker-fixes-Bimsara
nuwangeek Nov 13, 2025
2b839c6
Merge branch 'deployment-bug-fixes-Bimsara' into dev
nuwangeek Nov 13, 2025
01c1b5a
Merge pull request #128 from rootcodelabs/dev
nuwangeek Nov 13, 2025
4e2aa13
Merge pull request #129 from buerokratt/dev
nuwangeek Nov 13, 2025
2e5a02e
Merge pull request #130 from rootcodelabs/dev
nuwangeek Nov 13, 2025
cc310d0
clean env and docker compose files
nuwangeek Nov 14, 2025
c7b26f7
feat: Update Vite configuration and environment settings for global c…
Nov 15, 2025
c365c98
feat: Add Grafana server configuration to environment variables
Nov 15, 2025
9f4240e
fix error handling issue
nuwangeek Nov 16, 2025
e0f1876
feat: Integrate dataset version handling in DataModel forms and queries
Nov 16, 2025
b004860
fix: Correct SQL queries to ensure proper referencing of dataset_vers…
Nov 16, 2025
80d878a
Merge pull request #131 from rootcodelabs/deployment-est-gpu-Bimsara
erangi-ar Nov 16, 2025
8750fb8
feat: Add comprehensive Docker Compose configuration for application …
Nov 16, 2025
3681829
Merge branch 'deployment-est-gpu' of https://github.com/rootcodelabs/…
Nov 16, 2025
0913705
feat: Add agency insertion functionality with data extraction and err…
Nov 17, 2025
c1a56a4
fix: Update insert agency data result handling to improve error checking
Nov 17, 2025
c286ad6
feat: Implement agency existence check before insertion in add.yml
Nov 17, 2025
8087816
feat: Include results in dataset generation callback logging
Nov 17, 2025
ea50354
fix: Adjust dataset ID check in process_callback_background function …
Nov 17, 2025
1dd5ce6
fixed callback payload issue
nuwangeek Nov 17, 2025
1359c0d
Merge pull request #132 from rootcodelabs/deployment-est-gpu
nuwangeek Nov 17, 2025
6c11424
Merge branch 'deployment-est-gpu-Bimsara' of https://github.com/rootc…
nuwangeek Nov 17, 2025
527cad0
fix: Update domain references from global-classifier-dev.rootcode.sof…
Nov 17, 2025
1ae59b9
Merge branch 'deployment-est-gpu-Bimsara' of https://github.com/rootc…
Nov 17, 2025
230192d
Change DOMAIN from dev to localhost
erangi-ar Nov 21, 2025
35 changes: 32 additions & 3 deletions .env
Original file line number Diff line number Diff line change
@@ -1,7 +1,36 @@
AWS_ACCESS_KEY_ID=your_aws_access_key_id
AWS_SECRET_ACCESS_KEY=your_aws_secret_access_key
API_CORS_ORIGIN=*
API_DOCUMENTATION_ENABLED=true
S3_REGION=eu-west-1
S3_ENDPOINT_URL=http://minio:9000
S3_ENDPOINT_NAME=minio:9000
S3_DATA_BUCKET_PATH=resources
S3_DATA_BUCKET_NAME=global-classifier
FS_DATA_DIRECTORY_PATH=/app
S3_SECRET_ACCESS_KEY=minioadmin
S3_ACCESS_KEY_ID=minioadmin
S3_HEALTH_ENDPOINT=http://minio:9000/minio/health/live
MINIO_BROWSER_REDIRECT_URL=http://localhost:9001/minio-console
GF_SECURITY_ADMIN_USER=admin
GF_SECURITY_ADMIN_PASSWORD=admin123
GF_USERS_ALLOW_SIGN_UP=false
GF_SERVER_ROOT_URL=https://dev-gloclf.buerokratt.ee/grafana
GF_SERVER_SERVE_FROM_SUB_PATH=true
PORT=3000
AWS_BEDROCK_ACCESS_KEY_ID=your_aws_access_key_id
AWS_BEDROCK_SECRET_ACCESS_KEY=your_aws_secret_access_key
BEDROCK_AWS_REGION=eu-west-1
AZURE_OPENAI_API_KEY=your_openai_api_key
AZURE_OPENAI_ENDPOINT=your_openai_endpoint
AZURE_OPENAI_DEPLOYMENT_NAME=gpt-4o-mini
PROVIDER_NAME=azure-openai
PROVIDER_NAME=azure-openai
MLFLOW_TRACKING_USERNAME=mlflowadmin
MLFLOW_TRACKING_PASSWORD=value
MLFLOW_HOST_PORT=5000
MLFLOW_CONT_PORT=5000
MLFLOW_HOST=0.0.0.0
MLFLOW_PORT=${MLFLOW_CONT_PORT}
MLFLOW_BACKEND_STORE_URI=sqlite:////mlflow/mlflow_data/mlflow.db
MLFLOW_DEFAULT_ARTIFACT_ROOT=file:///mlflow/mlflow_artifacts
MLFLOW_HOST_CONFIG_PATH=./mlflow/config
MLFLOW_CONT_CONFIG_PATH=/mlflow/config
MLFLOW_FLASK_SERVER_SECRET_KEY=byk-mlflow-secret
2 changes: 1 addition & 1 deletion DSL/CronManager/DSL/callback_formatter.yml
@@ -2,4 +2,4 @@ callback_format:
trigger: off
type: exec
command: "../app/scripts/callback_format.sh"
allowedEnvs: ['filePath', 'results', 'taskId']
allowedEnvs: ['filePath', 'results', 'taskId', 'metricsFile']
5 changes: 3 additions & 2 deletions DSL/CronManager/script/callback_format.sh
@@ -3,8 +3,8 @@
echo "Started Shell Script for Dataset Generation Callback Processing"

# Check if environment variables are set
if [ -z "$filePath" ] || [ -z "$results" ] || [ -z "$taskId" ]; then
echo "Please set the filePath, results, and taskId environment variables."
if [ -z "$filePath" ] || [ -z "$results" ] || [ -z "$taskId" ] || [ -z "$metricsFile" ]; then
echo "Please set the filePath, results, taskId, and metricsFile environment variables."
exit 1
fi

@@ -62,6 +62,7 @@ python3 "$CALLBACK_SCRIPT" \
--encoded-results "$results" \
--output-json "$temp_response" \
--session-id "$taskId" \
--metrics-file "$metricsFile" \
> /tmp/callback_stdout.log 2> /tmp/callback_stderr.log
exit_code=$?
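
The four-way `-z` check above grows by one clause for every new variable. A minimal sketch of a loop-based alternative follows; `require_env` is a hypothetical helper that is not part of this PR, and the sample values are placeholders.

```shell
# Hypothetical helper (not in the PR): returns 0 when every named variable is
# set and non-empty, otherwise prints the missing names and returns 1.
require_env() {
    missing=""
    for name in "$@"; do
        # Indirect lookup through eval; the names are fixed literals supplied
        # by the script itself, never user input.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            missing="$missing $name"
        fi
    done
    if [ -n "$missing" ]; then
        echo "Please set the following environment variables:$missing"
        return 1
    fi
    return 0
}

# Placeholder values for illustration only; metricsFile is left empty on purpose.
filePath="/tmp/in.json"
results="e30="
taskId="42"
metricsFile=""
require_env filePath results taskId metricsFile || echo "check failed"
```

With the placeholder values above, the helper reports `metricsFile` as missing. Reporting the specific names is a design choice; the script in the diff prints one fixed message instead.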

43 changes: 42 additions & 1 deletion DSL/CronManager/script/dataset_pipeline_s3.sh
@@ -379,17 +379,58 @@ EOF
else
log "S3 download failed - success status: $success_status"
log "Response: $response_body"

# Update progress status to indicate failure
progress_update_payload=$(cat <<EOF
{
"sessionId": "$sessionId",
"generationStatus": "Fail",
"generationMessage": "Generation Failed",
"progressPercentage": 100,
"processComplete": true
}
EOF
)

progress_update_response=$(curl -s -X POST "$PROGRESS_UPDATE_URL" \
-H "Content-Type: application/json" \
-d "$progress_update_payload")
log "Progress status updated to failed: $progress_update_response"

send_failure_status_update "S3 download and extraction failed" "$CURRENT_DATASET_ID" "$response_body" "extraction_failure"
rm -f /tmp/download_response.json
exit 1
fi

else
log "Python script execution failed with exit code: $exit_code"

# Update progress status to indicate failure
progress_update_payload=$(cat <<EOF
{
"sessionId": "$sessionId",
"generationStatus": "Fail",
"generationMessage": "Generation Failed",
"progressPercentage": 100,
"processComplete": true
}
EOF
)

progress_update_response=$(curl -s -X POST "$PROGRESS_UPDATE_URL" \
-H "Content-Type: application/json" \
-d "$progress_update_payload")
log "Progress status updated to failed: $progress_update_response"

if [ -f "$temp_response" ]; then
log "Error response: $(cat $temp_response)"
rm -f /tmp/download_response.json
response_body=$(cat "$temp_response")
send_failure_status_update "Python script execution failed" "$CURRENT_DATASET_ID" "$response_body" "extraction_failure"
else
send_failure_status_update "Python script execution failed - no response data" "$CURRENT_DATASET_ID" "" "extraction_failure"
fi

rm -f /tmp/download_response.json
exit 1
fi
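
Both failure branches above build the same progress payload inline. A sketch of how that could be factored into one place, assuming the same `PROGRESS_UPDATE_URL` variable; `build_failure_payload` and `report_generation_failure` are hypothetical names, not functions in this PR.

```shell
# Hypothetical helper: prints the failure payload that the script builds inline.
# $1 is the session id.
build_failure_payload() {
    cat <<EOF
{
  "sessionId": "$1",
  "generationStatus": "Fail",
  "generationMessage": "Generation Failed",
  "progressPercentage": 100,
  "processComplete": true
}
EOF
}

# Hypothetical wrapper: posts the payload to the progress endpoint.
# PROGRESS_UPDATE_URL is assumed to be set by the caller, as in the script.
report_generation_failure() {
    payload=$(build_failure_payload "$1")
    curl -s -X POST "$PROGRESS_UPDATE_URL" \
        -H "Content-Type: application/json" \
        -d "$payload"
}

# Print the payload for a placeholder session id without sending anything.
build_failure_payload "session-123"
```

Keeping the payload in one function means a field change (for example the wording of `generationMessage`) only has to be made once.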

106 changes: 56 additions & 50 deletions DSL/CronManager/script/train_script_starter.sh
@@ -7,6 +7,48 @@ GET_FIRST_COME_TRAINING_JOB_SQL="http://resql:8082/global-classifier/get-queued-
GET_DATA_MODEL_BY_MODEL_ID_SQL="http://resql:8082/global-classifier/get-data-model-info-by-given-model-id"
UPDATE_JOB_STATUS="http://resql:8082/global-classifier/update-training-job-status"

# Centralized error handling function
handle_training_failure() {
local error_message="$1"
echo "[FAILED] $error_message"

# Only proceed with status updates if we have the required variables
if [ -n "$job_id" ] && [ -n "$model_id" ] && [ -n "$session_id" ]; then
echo "[UPDATE] Updating job status to training-failed..."
response_update_job_status=$(curl -s -X POST "$UPDATE_JOB_STATUS" \
-H "Content-Type: application/json" \
-d "{\"jobId\": $job_id, \"jobStatus\": \"training-failed\"}")

echo "[MODEL] Updating model training status to failed..."
UPDATE_MODEL_TRAINING_STATUS_FAILED="http://resql:8082/global-classifier/update-training_status-failed"
response_update_model_status=$(curl -s -X POST "$UPDATE_MODEL_TRAINING_STATUS_FAILED" \
-H "Content-Type: application/json" \
-d "{\"model_id\": $model_id}")

echo "[PROGRESS] Updating progress session to show training failure..."
UPDATE_PROGRESS_SESSION_ENDPOINT="http://ruuter-public:8086/global-classifier/datamodels/progress/update"
response_update_progress_failure=$(curl -s -X POST "$UPDATE_PROGRESS_SESSION_ENDPOINT" \
-H "Content-Type: application/json" \
-d "{
\"sessionId\": $session_id,
\"trainingStatus\": \"Training Failed\",
\"trainingMessage\": \"Training Failed\",
\"progressPercentage\": 100,
\"processComplete\": false
}")

if [ -z "$response_update_progress_failure" ]; then
echo "[WARNING] Failed to update progress session with failure status"
else
echo "[PROGRESS] Progress session updated with failure status successfully"
fi
else
echo "[WARNING] Cannot update training status - missing required variables (job_id, model_id, or session_id)"
fi

exit 1
}

echo "[START] Training script starter"

# Check if training is in progress
@@ -102,8 +144,7 @@ echo "[DEBUG] Create session response: '$response_create_session'"

# Extract session ID from response
if [ -z "$response_create_session" ]; then
echo "[ERROR] Failed to create training progress session - empty response"
exit 1
handle_training_failure "Failed to create training progress session - empty response"
fi

# Check if session creation was successful
@@ -113,14 +154,14 @@ if echo "$response_create_session" | grep -q '"operationSuccessful":true'; then
if [ -z "$session_id" ] || [ "$session_id" = "$response_create_session" ]; then
echo "[ERROR] Failed to extract session ID from response"
echo "[DEBUG] Raw response: '$response_create_session'"
exit 1
handle_training_failure "Failed to extract session ID from response"
fi

echo "[SESSION] Training progress session created successfully with ID: $session_id"
else
echo "[ERROR] Training progress session creation failed"
echo "[DEBUG] Raw response: '$response_create_session'"
exit 1
handle_training_failure "Training progress session creation failed"
fi

# Update initial training progress
@@ -154,16 +195,15 @@ echo "[DEBUG] Dataset ID response: '$response_get_dataset_id'"

# Handle empty response
if [ -z "$response_get_dataset_id" ] || [ "$response_get_dataset_id" = "[]" ]; then
echo "[ERROR] No dataset information found for model ID: $model_id"
exit 1
handle_training_failure "No dataset information found for model ID: $model_id"
fi

dataset_id=$(echo "$response_get_dataset_id" | sed -E 's/.*"connectedDsId":([0-9]+).*/\1/')

if [ -z "$dataset_id" ] || [ "$dataset_id" = "$response_get_dataset_id" ]; then
echo "[ERROR] Connected Dataset ID not found in response"
echo "[DEBUG] Raw response: '$response_get_dataset_id'"
exit 1
handle_training_failure "Connected Dataset ID not found in response"
fi

echo "[DATASET] Dataset ID: $dataset_id"
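
The `[ "$dataset_id" = "$response_get_dataset_id" ]` test in this hunk relies on a property of `sed`: when the substitution pattern does not match, the input line is passed through unchanged. The sketch below isolates that check; `extract_dataset_id` is a hypothetical wrapper and the JSON samples are made up for illustration (only `connectedDsId` comes from the diff).

```shell
# Hypothetical wrapper around the sed expression used in the diff.
# Prints the numeric id and returns 0, or returns 1 when the field is absent.
extract_dataset_id() {
    response="$1"
    id=$(printf '%s' "$response" | sed -E 's/.*"connectedDsId":([0-9]+).*/\1/')
    # On a non-match sed echoes its input, so a result equal to the input
    # means nothing was extracted.
    if [ -z "$id" ] || [ "$id" = "$response" ]; then
        return 1
    fi
    printf '%s\n' "$id"
}

# Made-up responses for illustration.
extract_dataset_id '[{"modelId":7,"connectedDsId":15,"baseModels":["bert"]}]'
extract_dataset_id '[{"modelId":7}]' || echo "no connectedDsId in response"
```

The first call prints `15`; the second falls through to the message. This approach assumes the response fits on a single line and that the value is an unquoted integer; a JSON parser would be more robust if one is available in the container.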
@@ -177,12 +217,12 @@ else
echo "[ERROR] Failed to extract base models from response"
echo "[ERROR] Raw response: $response_get_dataset_id"
echo "[ERROR] Extracted base_models: $base_models_json"
exit 1
handle_training_failure "Failed to extract base models from response"
fi

# Activate existing virtualenv
echo "[INFO] Activating existing virtualenv at /app/python_virtual_env"
source /app/python_virtual_env/bin/activate || { echo "[ERROR] Failed to activate virtualenv"; exit 1; }
source /app/python_virtual_env/bin/activate || { echo "[ERROR] Failed to activate virtualenv"; handle_training_failure "Failed to activate Python virtual environment"; }
export PYTHONPATH="/app:/app/src:/app/src/training:/app/src/s3_dataset_processor:$PYTHONPATH"
echo "[DEBUG] PYTHONPATH set to: $PYTHONPATH"
# Add these debug commands
@@ -224,41 +264,41 @@ if [ ${#missing_pkgs[@]} -ne 0 ]; then
# Create installation directory
mkdir -p "$UV_INSTALL_DIR" || {
echo "[ERROR] Failed to create UV installation directory"
exit 1
handle_training_failure "Failed to create UV installation directory"
}

# Use unmanaged installation to avoid root directory modifications
curl -LsSf https://astral.sh/uv/install.sh | env UV_UNMANAGED_INSTALL="$UV_INSTALL_DIR" sh || {
echo "[ERROR] Failed to install uv"
exit 1
handle_training_failure "Failed to install UV package manager"
}

# Verify installation
if [ ! -x "$UV_BIN" ]; then
echo "[ERROR] UV installation failed or not executable"
exit 1
handle_training_failure "UV installation failed or not executable"
fi

# Verify functionality
"$UV_BIN" --version || {
echo "[ERROR] UV installation corrupted"
exit 1
handle_training_failure "UV installation corrupted"
}

echo "[UV] Successfully installed uv (unmanaged) to $UV_INSTALL_DIR"
fi

if [ ! -f /app/src/training/requirements-gpu.txt ]; then
echo "/app/src/training/requirements-gpu.txt not found!"
exit 1
handle_training_failure "Training requirements file not found"
fi

echo "[INSTALL] Installing from /app/src/training/requirements-gpu.txt using secure uv..."
"$UV_BIN" pip install --python "$VIRTUAL_ENV/bin/python3" -r /app/src/training/requirements-gpu.txt || {
echo "[WARNING] uv install failed — trying pip as fallback..."
pip install -r /app/src/training/requirements-gpu.txt || {
echo "[ERROR] Both uv and pip install failed inside virtualenv"
exit 1
handle_training_failure "Failed to install required Python packages"
}
}
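
The install step above tries `uv` first and falls back to `pip`, failing only when both fail. A minimal sketch of that control flow with stand-in commands; `run_with_fallback` is a hypothetical helper, and `false`/`true` merely simulate a failing and a succeeding installer.

```shell
# Hypothetical helper: runs the primary command string, then the fallback if
# the primary fails. Arguments are evaluated by the shell, so pass only
# trusted literals.
run_with_fallback() {
    primary="$1"
    fallback="$2"
    if sh -c "$primary"; then
        echo "[INSTALL] primary installer succeeded"
        return 0
    fi
    echo "[WARNING] primary installer failed, trying fallback..."
    if sh -c "$fallback"; then
        echo "[INSTALL] fallback installer succeeded"
        return 0
    fi
    echo "[ERROR] both installers failed"
    return 1
}

# Stand-ins: `false` plays a failing uv install, `true` a working pip install.
run_with_fallback "false" "true"
```

Only the final non-zero return would need to reach `handle_training_failure`, which matches how the nested `|| { ... }` blocks in the diff are arranged.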

@@ -321,41 +361,7 @@ if [ $training_exit_code -eq 0 ]; then

echo "[DEBUG] Update job status to trained response: '$response_update_job_status_trained'"
else
echo "[FAILED] Training failed with exit code: $training_exit_code"

echo "[UPDATE] Updating job status to training-failed..."
response_update_job_status=$(curl -s -X POST "$UPDATE_JOB_STATUS" \
-H "Content-Type: application/json" \
-d "{\"jobId\": $job_id, \"jobStatus\": \"training-failed\"}")

echo "[MODEL] Updating model training status to failed..."
UPDATE_MODEL_TRAINING_STATUS_FAILED="http://resql:8082/global-classifier/update-training_status-failed"
response_update_model_status=$(curl -s -X POST "$UPDATE_MODEL_TRAINING_STATUS_FAILED" \
-H "Content-Type: application/json" \
-d "{\"model_id\": $model_id}")

echo "[DEBUG] Update model training status response: '$response_update_model_status'"

echo "[PROGRESS] Updating progress session to show training failure..."
response_update_progress_failure=$(curl -s -X POST "$UPDATE_PROGRESS_SESSION_ENDPOINT" \
-H "Content-Type: application/json" \
-d "{
\"sessionId\": $session_id,
\"trainingStatus\": \"Training Failed\",
\"trainingMessage\": \"Model training has failed\",
\"progressPercentage\": 100,
\"processComplete\": false
}")

echo "[DEBUG] Update progress failure response: '$response_update_progress_failure'"

if [ -z "$response_update_progress_failure" ]; then
echo "[WARNING] Failed to update progress session with failure status"
else
echo "[PROGRESS] Progress session updated with failure status successfully"
fi

exit 1
handle_training_failure "Model training script failed with exit code: $training_exit_code"
fi

echo "[DONE] Training script starter completed"
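
The guard inside `handle_training_failure` sends status updates only when `job_id`, `model_id`, and `session_id` are all non-empty. The sketch below reproduces that guard with the `curl` calls replaced by a printing stub, to show what happens when the handler runs before a session id exists. `notify` and `handle_failure` are hypothetical names, and the sketch returns instead of calling `exit 1` so it can be run interactively.

```shell
# Hypothetical stand-in for the curl calls in the diff; it only prints what
# would be sent.
notify() {
    echo "POST $1 $2"
}

# Simplified version of the handler's guard logic.
handle_failure() {
    echo "[FAILED] $1"
    # Status updates need all three identifiers.
    if [ -n "${job_id:-}" ] && [ -n "${model_id:-}" ] && [ -n "${session_id:-}" ]; then
        notify "job-status" "{\"jobId\": $job_id, \"jobStatus\": \"training-failed\"}"
        notify "model-status" "{\"model_id\": $model_id}"
        notify "progress" "{\"sessionId\": $session_id, \"trainingStatus\": \"Training Failed\"}"
    else
        echo "[WARNING] Cannot update training status - missing job_id, model_id, or session_id"
    fi
    return 1
}

# Early failure: session_id is not known yet, so no updates are sent.
job_id=3
model_id=9
session_id=""
handle_failure "Failed to create training progress session" || echo "exit status: $?"
```

With an empty `session_id` the sketch prints the warning and sends nothing. If the real script behaves the same way, the session-creation failure paths would leave the job and model status untouched; whether that is intended is a question for the PR authors.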
4 changes: 4 additions & 0 deletions DSL/Resql/global-classifier/POST/get-agency-centops.sql
@@ -0,0 +1,4 @@
-- Check if agency exists in mock_centops table
SELECT agency_id
FROM public.mock_centops
WHERE agency_id = :agencyId;
Original file line number Diff line number Diff line change
@@ -1,3 +1,4 @@
SELECT id, major, minor
FROM public.dataset_versions
WHERE generation_status = 'Generation_Success'
ORDER BY id;
42 changes: 22 additions & 20 deletions DSL/Resql/global-classifier/POST/get-datasets.sql
@@ -1,27 +1,29 @@
SELECT
id,
major,
minor,
created_at,
generation_status,
last_model_trained,
last_trained,
dv.id,
dv.major,
dv.minor,
dv.created_at,
dv.generation_status,
COALESCE(dm.model_name, dv.last_model_trained) AS last_model_trained,
dv.last_trained,
CEIL(COUNT(*) OVER() / :page_size::DECIMAL) AS total_pages
FROM
dataset_versions
dataset_versions dv
LEFT JOIN
data_models dm ON dv.last_model_trained = dm.model_id::text
WHERE
(:generation_status = 'all' OR generation_status ILIKE '%' || :generation_status || '%')
(:generation_status = 'all' OR dv.generation_status ILIKE '%' || :generation_status || '%')
AND (:dataset_name = 'all'
OR POSITION(LOWER(:dataset_name) IN LOWER(CONCAT('v', major, '.', minor))) > 0
OR POSITION(LOWER(:dataset_name) IN LOWER(CONCAT(major, '.', minor))) > 0
OR POSITION(LOWER(:dataset_name) IN LOWER(major::text)) > 0
OR POSITION(LOWER(:dataset_name) IN LOWER(minor::text)) > 0)
OR POSITION(LOWER(:dataset_name) IN LOWER(CONCAT('v', dv.major, '.', dv.minor))) > 0
OR POSITION(LOWER(:dataset_name) IN LOWER(CONCAT(dv.major, '.', dv.minor))) > 0
OR POSITION(LOWER(:dataset_name) IN LOWER(dv.major::text)) > 0
OR POSITION(LOWER(:dataset_name) IN LOWER(dv.minor::text)) > 0)
ORDER BY
CASE WHEN :sort_by = 'created_at' AND :sort_type = 'asc' THEN created_at END ASC,
CASE WHEN :sort_by = 'created_at' AND :sort_type = 'desc' THEN created_at END DESC,
-- CASE WHEN :sort_by = 'major' AND :sort_type = 'asc' THEN major END ASC,
-- CASE WHEN :sort_by = 'major' AND :sort_type = 'desc' THEN major END DESC,
-- CASE WHEN :sort_by = 'minor' AND :sort_type = 'asc' THEN minor END ASC,
-- CASE WHEN :sort_by = 'minor' AND :sort_type = 'desc' THEN minor END DESC,
CASE WHEN :sort_by IS NULL OR :sort_by = '' THEN created_at END DESC
CASE WHEN :sort_by = 'created_at' AND :sort_type = 'asc' THEN dv.created_at END ASC,
CASE WHEN :sort_by = 'created_at' AND :sort_type = 'desc' THEN dv.created_at END DESC,
-- CASE WHEN :sort_by = 'major' AND :sort_type = 'asc' THEN dv.major END ASC,
-- CASE WHEN :sort_by = 'major' AND :sort_type = 'desc' THEN dv.major END DESC,
-- CASE WHEN :sort_by = 'minor' AND :sort_type = 'asc' THEN dv.minor END ASC,
-- CASE WHEN :sort_by = 'minor' AND :sort_type = 'desc' THEN dv.minor END DESC,
CASE WHEN :sort_by IS NULL OR :sort_by = '' THEN dv.created_at END DESC
OFFSET ((GREATEST(:page, 1) - 1) * :page_size) LIMIT :page_size;
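
The paging arithmetic in this query is `total_pages = CEIL(count / page_size)` and `offset = (GREATEST(page, 1) - 1) * page_size`. The sketch below works the same formulas with integer arithmetic; `total_pages` and `page_offset` are hypothetical helper names and the row counts are example values.

```shell
# total_pages = ceil(total_rows / page_size), computed with integer division.
# $1 = total rows, $2 = page size.
total_pages() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# offset = (max(page, 1) - 1) * page_size, mirroring GREATEST(:page, 1).
# $1 = requested page, $2 = page size.
page_offset() {
    page=$1
    if [ "$page" -lt 1 ]; then
        page=1
    fi
    echo $(( (page - 1) * $2 ))
}

total_pages 23 10   # prints 3
page_offset 3 10    # prints 20
page_offset 0 10    # prints 0
```

Clamping the page to at least 1 keeps the offset from going negative when a client sends `page=0`, which is what `GREATEST` does in the SQL.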
3 changes: 3 additions & 0 deletions DSL/Resql/global-classifier/POST/insert-agency-centops.sql
@@ -0,0 +1,3 @@
-- Insert new agency into mock_centops table
INSERT INTO public.mock_centops (agency_id, agency_name, created_at)
VALUES (:agencyId, :agencyName, NOW());