feat(cpex-scraper): add file logger for better debugging and tracing#4376
feat(cpex-scraper): add file logger for better debugging and tracing#4376faizkhairi wants to merge 3 commits into
Conversation
Add a lightweight file logger that writes timestamped lines to both stdout and a log file in the logs/ directory. The log directory is created automatically at runtime (already gitignored by root .gitignore). This matches the logging pattern used by the nus-v2 scraper, making it easier to debug and trace CPEx scraper runs. - Add src/logger.ts with createFileLogger() utility - Update src/index.ts to use the file logger instead of console
|
@faizkhairi is attempting to deploy a commit to the modsbot's projects Team on Vercel. A member of the Team first needs to authorize it. |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #4376 +/- ##
==========================================
+ Coverage 54.52% 56.39% +1.86%
==========================================
Files 274 317 +43
Lines 6076 6962 +886
Branches 1455 1679 +224
==========================================
+ Hits 3313 3926 +613
- Misses 2763 3036 +273 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
- Fix log directory path: '../logs' resolved to build/logs/ at runtime (inside gitignored build dir). Changed to '../../logs' to resolve to scrapers/cpex-scraper/logs/ (the scraper root) - Make close() return Promise to ensure all buffered data is flushed to disk before process exits - Await close() in both success and error paths in index.ts - Add error handler on write stream to prevent unhandled crash
|
Note on testing: I was unable to run the full CI suite locally (
The existing Let me know if you'd like me to add these tests. |
|
@greptile |
Confidence Score: 3/5Safe to merge for normal runs, but the promise chain in index.ts has a structural flaw that could produce a misleading exit code and an unhandled rejection when the write stream itself errors. The logger logic is straightforward and the happy path works correctly. The concern is in index.ts: attaching .catch() after .then() means any failure inside the .then() close call is caught by the same handler that handles scrape failures — the stream close error is reported as 'Failed to scrape', the exit code is set to 1 even though the scrape succeeded, and a second close() call on an already-destroyed stream can leave an unhandled rejection. scrapers/cpex-scraper/src/index.ts — the .then()/.catch() promise chain needs attention for the logger close handling.
|
| Filename | Overview |
|---|---|
| scrapers/cpex-scraper/src/index.ts | Wires the new FileLogger into the scrape entry point; the .then()/.catch() structure double-closes the logger stream and can produce an unhandled rejection + wrong exit code on stream failure. |
| scrapers/cpex-scraper/src/logger.ts | New file adding timestamped dual-output (stdout + file) logger; minor: redundant existsSync guard before mkdirSync({ recursive: true }), and pad2 is duplicated from scraper.ts. |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[index.ts starts] --> B[createFileLogger]
B --> C{logs/ dir exists?}
C -- No --> D[mkdirSync recursive]
C -- Yes --> E[createWriteStream cpex-timestamp.log]
D --> E
E --> F[scrapeCPEx runs]
F --> G{resolves?}
G -- Yes --> H[logger.close in .then]
H --> I{close succeeds?}
I -- Yes --> J[Done]
I -- No --> K[.catch fires with stream error]
K --> L[logger.log misleading error]
K --> M[logger.close double-close]
M --> N{close succeeds?}
N -- No --> O[Unhandled rejection]
N -- Yes --> P[process.exitCode = 1 wrong]
G -- No --> Q[.catch fires with scrape error]
Q --> R[logger.log error]
R --> S[logger.close]
S --> T[process.exitCode = 1]
style K fill:#ffcccc
style O fill:#ff6666
style P fill:#ffcccc
Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 2
scrapers/cpex-scraper/src/index.ts:23-29
**Double-close on stream failure, wrong exit code**
If `logger.close()` in the `.then()` block rejects (e.g. the write stream was already destroyed due to a disk error), the `.catch()` handler fires with that stream error as `error`. This causes three problems at once: `logger.log(...)` writes to the already-errored stream (silently dropped), `logger.close()` is called a second time on a destroyed stream (which can reject again with no further handler → unhandled promise rejection), and `process.exitCode = 1` is set even though the scrape itself succeeded.
The standard fix is to use `.finally()` so `close()` runs exactly once regardless of outcome.
### Issue 2 of 2
scrapers/cpex-scraper/src/logger.ts:39-41
The `existsSync` guard is redundant here because `mkdirSync` with `{ recursive: true }` is a no-op when the directory already exists — it never throws in that case. The extra check adds a TOCTOU window (another process could remove the directory between the check and the mkdir) without buying anything.
```suggestion
fs.mkdirSync(LOG_DIR, { recursive: true });
```
Reviews (1): Last reviewed commit: "Merge branch 'master' into feat/cpex-scr..." | Re-trigger Greptile
| }).then(async () => { | ||
| await logger.close(); | ||
| }).catch(async (error) => { | ||
| logger.log(`Failed to scrape: ${error}`); | ||
| await logger.close(); | ||
| process.exitCode = 1; | ||
| }); |
There was a problem hiding this comment.
Double-close on stream failure, wrong exit code
If logger.close() in the .then() block rejects (e.g. the write stream was already destroyed due to a disk error), the .catch() handler fires with that stream error as error. This causes three problems at once: logger.log(...) writes to the already-errored stream (silently dropped), logger.close() is called a second time on a destroyed stream (which can reject again with no further handler → unhandled promise rejection), and process.exitCode = 1 is set even though the scrape itself succeeded.
The standard fix is to use .finally() so close() runs exactly once regardless of outcome.
Prompt To Fix With AI
This is a comment left during a code review.
Path: scrapers/cpex-scraper/src/index.ts
Line: 23-29
Comment:
**Double-close on stream failure, wrong exit code**
If `logger.close()` in the `.then()` block rejects (e.g. the write stream was already destroyed due to a disk error), the `.catch()` handler fires with that stream error as `error`. This causes three problems at once: `logger.log(...)` writes to the already-errored stream (silently dropped), `logger.close()` is called a second time on a destroyed stream (which can reject again with no further handler → unhandled promise rejection), and `process.exitCode = 1` is set even though the scrape itself succeeded.
The standard fix is to use `.finally()` so `close()` runs exactly once regardless of outcome.
How can I resolve this? If you propose a fix, please make it concise.| if (!fs.existsSync(LOG_DIR)) { | ||
| fs.mkdirSync(LOG_DIR, { recursive: true }); | ||
| } |
There was a problem hiding this comment.
The
existsSync guard is redundant here because mkdirSync with { recursive: true } is a no-op when the directory already exists — it never throws in that case. The extra check adds a TOCTOU window (another process could remove the directory between the check and the mkdir) without buying anything.
| if (!fs.existsSync(LOG_DIR)) { | |
| fs.mkdirSync(LOG_DIR, { recursive: true }); | |
| } | |
| fs.mkdirSync(LOG_DIR, { recursive: true }); |
Prompt To Fix With AI
This is a comment left during a code review.
Path: scrapers/cpex-scraper/src/logger.ts
Line: 39-41
Comment:
The `existsSync` guard is redundant here because `mkdirSync` with `{ recursive: true }` is a no-op when the directory already exists — it never throws in that case. The extra check adds a TOCTOU window (another process could remove the directory between the check and the mkdir) without buying anything.
```suggestion
fs.mkdirSync(LOG_DIR, { recursive: true });
```
How can I resolve this? If you propose a fix, please make it concise.
Summary
src/logger.ts) that writes timestamped lines to both stdout and a log filescrapers/cpex-scraper/logs/with formatcpex-YYYY-MM-DD.HH-mm-ss.log.gitignore)src/index.tsto use the file logger instead of the defaultconsolefsmodulePick<Console, 'log'>interface, soscraper.tsis unchangedRelated Issue
Closes #4371
Test Plan
pnpm --filter nusmods-cpex-scraper buildand verify compilation succeedslogs/directorypnpm --filter nusmods-cpex-scraper test)