1.0
These notes cover the changes made before 2025-01-13, prior to major version 2.0.
Features
Added a configurable whitelist for hyperlinks using regular expressions, allowing precise control over which links remain in sanitized documents.
Introduced an option to strip comments from Excel workbooks, removing hidden annotations and preventing potential information leakage.
Added decryption support for CDFV2‐encrypted Microsoft Office files so that previously locked documents can now be fully sanitized.
Expanded the result payload to include granular details—such as file size before and after sanitization, exact modifications applied, and relevant warnings—enabling more thorough auditing and debugging.
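As an illustration of the hyperlink whitelist feature, here is a minimal sketch of regex-based link filtering. The pattern list, the function names, and the example.com domain are hypothetical and chosen only for this example; these notes do not specify the actual option name or configuration format.

```python
import re

# Hypothetical whitelist patterns (assumption: the real option name and
# format are not given in these notes).
ALLOWED_LINK_PATTERNS = [
    r"https://([a-z0-9-]+\.)*example\.com(/.*)?",
    r"mailto:[^@\s]+@example\.com",
]

# Compile once so repeated checks stay cheap.
_COMPILED = [re.compile(p, re.IGNORECASE) for p in ALLOWED_LINK_PATTERNS]


def is_link_allowed(url: str) -> bool:
    """Return True if the whole URL matches at least one whitelist pattern."""
    return any(p.fullmatch(url) for p in _COMPILED)


def filter_links(urls):
    """Keep only whitelisted links, in their original order."""
    return [u for u in urls if is_link_allowed(u)]
```

Matching the whole URL with fullmatch rather than search is a deliberate choice in this sketch: it keeps a lookalike host such as example.com.evil.net from passing just because it contains an allowed substring.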
Changes
Refactored the CDR module for improved code organization, readability, and maintainability across the sanitization pipeline.
Updated the file‐type checker to recognize a broader range of legacy and modern formats, reducing false positives and false negatives during detection.
Restructured the cdr configuration schema in config.yaml, consolidating related settings and clarifying default values for easier management.
Fixes
Improved PDF image sanitization by detecting and deduplicating identical images, preventing redundant processing and reducing output size.
Fixed APK files being misdetected as ZIP archives, ensuring accurate file‐type classification during scanning.
Optimized LSB‐based image sanitization for significantly faster execution without sacrificing accuracy.
Addressed OpenDocument (ODT/ODS) ZIP compression corruption so that documents remain intact after sanitization.
Resolved issues in PPT image sanitization by correctly detecting EMF magic bytes, ensuring all embedded images are processed properly.
Fixed the temporary‐file naming logic to guarantee unique names and prevent runtime errors caused by collisions.
Improved hyperlink sanitization in Office files to remove only unsafe or broken links, preserving valid relationships.
Fixed macro handling so that macro object metadata is appended to the result payload after sanitization.
Updated the OLE‐object macro‐removal routine to support a wider range of embedded objects.
Applied minor bug fixes to eliminate various logical errors discovered during testing.
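Two of the detection fixes above (APK versus ZIP, and EMF magic bytes) can be sketched roughly as follows. This is not the product's detection code; the function names are hypothetical. The APK check assumes the common heuristic that an APK is a ZIP archive containing AndroidManifest.xml and at least one .dex file, and the EMF check relies on the EMF header layout (record type 1 at offset 0 and the " EMF" signature at offset 40).

```python
import io
import zipfile


def classify_zip_container(data: bytes) -> str:
    """Return 'apk', 'zip', or 'unknown' for the given bytes (hypothetical helper)."""
    if not zipfile.is_zipfile(io.BytesIO(data)):
        return "unknown"
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        names = set(zf.namelist())
    # Heuristic: an APK is a ZIP that carries a manifest and compiled DEX code.
    if "AndroidManifest.xml" in names and any(n.endswith(".dex") for n in names):
        return "apk"
    return "zip"


def looks_like_emf(data: bytes) -> bool:
    """Check the EMF header: record type 1 at offset 0, ' EMF' signature at offset 40."""
    return (
        len(data) >= 44
        and data[0:4] == b"\x01\x00\x00\x00"
        and data[40:44] == b" EMF"
    )
```

Inspecting archive entries instead of trusting the leading PK bytes is what separates an APK from a generic ZIP here, since both share the same container signature.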