Support PDF 2.0 dates and tighten date validation #631
Support the PDF 2.0 spec (ISO 32000-2) date format, which omits the trailing apostrophe that was required in PDF 1.7. Per section 7.9.4, continue accepting the older format. The regex also tolerates a missing splitting apostrophe for additional leniency (matching pdf.js behavior).

The regex-based implementation replaces the string position tracking and fixes several existing bugs:

- Date strings with optional fields omitted (e.g., no seconds) now parse correctly. The old parser incorrectly interpreted the timezone sign as part of the seconds field when seconds were absent.
- Add strict validation for month (0-11 range, was not checked for < 0) and day (1-31 range, now checked for < 1).
- UTC indicator 'Z' now correctly rejects non-zero offsets, as per spec.

Remove dependency on android.text.TextUtils by replacing isDigitsOnly() checks with regex validation.

Fixes GrapheneOS#629
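For readers following along, here is a minimal sketch of what regex-based parsing along these lines could look like. It is not the code from this PR: the class name `PdfDateSketch`, the pattern, and the returned array layout are all hypothetical, and the sketch assumes the `D:` prefix is required. Unlike the PR, which validates a zero-based month (0-11), the sketch checks the month as written in the string (1-12).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration only; not the parser from this PR. */
final class PdfDateSketch {
    // D:YYYY[MM[DD[HH[mm[SS]]]]] then optionally Z, or a sign with HH['][mm]['].
    // Both apostrophes are optional, so the PDF 1.7 and PDF 2.0 forms both match.
    private static final Pattern PDF_DATE = Pattern.compile(
            "^D:(\\d{4})(?:(\\d{2})(?:(\\d{2})(?:(\\d{2})(?:(\\d{2})(\\d{2})?)?)?)?)?"
            + "(?:([Z+-])(?:(\\d{2})'?(?:(\\d{2})'?)?)?)?$");

    private static int num(String group, int fallback) {
        return group == null ? fallback : Integer.parseInt(group);
    }

    /**
     * Returns {year, month (1-12), day, hour, minute, second, offsetMinutes},
     * or null if the string is not a valid PDF date.
     */
    static int[] parse(String s) {
        Matcher m = PDF_DATE.matcher(s);
        if (!m.matches()) {
            return null;
        }
        int year = Integer.parseInt(m.group(1));
        int month = num(m.group(2), 1);   // omitted fields take their defaults
        int day = num(m.group(3), 1);
        int hour = num(m.group(4), 0);
        int minute = num(m.group(5), 0);
        int second = num(m.group(6), 0);
        String sign = m.group(7);
        int offsetHours = num(m.group(8), 0);
        int offsetMinutes = num(m.group(9), 0);

        if (month < 1 || month > 12 || day < 1 || day > 31) {
            return null;
        }
        if (hour > 23 || minute > 59 || second > 59
                || offsetHours > 23 || offsetMinutes > 59) {
            return null;
        }
        int offset = offsetHours * 60 + offsetMinutes;
        if (sign == null || sign.equals("Z")) {
            if (offset != 0) {
                return null; // 'Z' must not carry a non-zero offset
            }
        } else if (sign.equals("-")) {
            offset = -offset;
        }
        return new int[] {year, month, day, hour, minute, second, offset};
    }
}
```

Because the seconds group is optional and only matches digits, an input such as `D:202401151030+05'30` leaves seconds at 0 and reads `+` as the offset sign, which is the case the old position-based parser got wrong.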
I noticed a couple of preexisting issues:

1. UTC offset handling is semantically inverted. The current arithmetic, preserved from the previous implementation, is:

    case "-":
        hours -= offsetHours;
        minutes -= offsetMinutes;
        break;
    case "+":
        hours += offsetHours;
        minutes += offsetMinutes;
        break;

Per ISO 32000-2 §7.9.4, a '+' offset means local time is later than UT and a '-' offset means it is earlier, so normalizing to UTC should subtract a '+' offset and add a '-' offset. pdf.js does it that way:
https://github.com/mozilla/pdf.js/blob/v5.6.205/src/display/display_utils.js#L551-L561

However, just flipping the signs would not fully fix it, because of how the calendar is constructed and formatted:

    final Calendar calendar = Calendar.getInstance(TimeZone.getTimeZone("UTC"));
    // set fields and apply offset to normalize to UTC
    return DateFormat.getDateTimeInstance(...).format(calendar.getTime());

2.

    calendar.setLenient(false);
    calendar.getTime();
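To make the sign convention concrete, here is a rough sketch of one way to normalize to UTC. It is an illustration under assumptions, not a proposed patch: the class `PdfOffsetSketch` and the method `toUtcMillis` are hypothetical names, and it assumes the `setLenient(false)` snippet is meant to reject impossible calendar dates. The offset is applied to the epoch milliseconds rather than to the hour and minute fields, so a non-lenient calendar never sees out-of-range field values.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;

/** Hypothetical helper for illustration only; not code from this PR. */
final class PdfOffsetSketch {
    /**
     * Converts a PDF local date-time plus its offset to UTC epoch milliseconds.
     * month is 1-12; offsetSign is '+', '-', or 'Z'.
     * Throws IllegalArgumentException for impossible dates such as February 30.
     */
    static long toUtcMillis(int year, int month, int day,
                            int hour, int minute, int second,
                            char offsetSign, int offsetHours, int offsetMinutes) {
        Calendar calendar = new GregorianCalendar(TimeZone.getTimeZone("UTC"));
        calendar.clear();
        calendar.setLenient(false);
        // Calendar months are zero-based, hence month - 1.
        calendar.set(year, month - 1, day, hour, minute, second);
        // In non-lenient mode this call validates the fields and throws on bad input.
        long localAsUtc = calendar.getTimeInMillis();

        long offsetMillis = (offsetHours * 60L + offsetMinutes) * 60_000L;
        switch (offsetSign) {
            case '+':
                // Local time is later than UT, so subtract to reach UTC.
                return localAsUtc - offsetMillis;
            case '-':
                // Local time is earlier than UT, so add to reach UTC.
                return localAsUtc + offsetMillis;
            default:
                return localAsUtc;
        }
    }
}
```

With this convention, 10:30 at `+05'30` and 05:00 at `Z` on the same day map to the same instant. How that instant is then formatted for display (which time zone the `DateFormat` uses) is a separate question that the sketch leaves open.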
closes #629