Deduplication via Parameter Hashing

Deduplication via Parameter Hashing is a mechanism for identifying and preventing redundant report generation requests. It computes a cryptographic hash of the request parameters, so that requests with identical parameters map to the same compact key and can be matched without comparing the full parameter set^[001-TODO__code_copy.md].

Implementation

In the provided system, the deduplication logic is implemented within the ReportDownloadRecordServiceImpl class using the duplicateQry method^[001-TODO__code_copy.md].

This method queries the report_download_record table to check whether the same user already has a record with the same report type and parameter hash, created within a defined time window, that either completed successfully or is still running^[001-TODO__code_copy.md].
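
The check described above can be sketched in memory as follows. This is a minimal illustration, not the actual duplicateQry implementation: the class DuplicateCheckSketch, the DownloadRecord fields, the Status values, and the findDuplicate signature are all hypothetical, and a list filter stands in for the database query.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical names throughout; only the overall idea comes from the text above.
class DuplicateCheckSketch {

    // Assumed status values; the real entity may model state differently.
    enum Status { RUNNING, SUCCESS, FAILED }

    // Simplified stand-in for one row of report_download_record.
    static final class DownloadRecord {
        final long userId;
        final String reportType;
        final String searchParamHash;
        final Status status;
        final Instant createdAt;

        DownloadRecord(long userId, String reportType, String searchParamHash,
                       Status status, Instant createdAt) {
            this.userId = userId;
            this.reportType = reportType;
            this.searchParamHash = searchParamHash;
            this.status = status;
            this.createdAt = createdAt;
        }
    }

    // Returns the most recent matching record that is running or succeeded
    // and was created inside the time window, if there is one.
    static Optional<DownloadRecord> findDuplicate(List<DownloadRecord> records,
                                                  long userId,
                                                  String reportType,
                                                  String searchParamHash,
                                                  Instant now,
                                                  Duration window) {
        Instant cutoff = now.minus(window);
        return records.stream()
                .filter(r -> r.userId == userId)
                .filter(r -> r.reportType.equals(reportType))
                .filter(r -> r.searchParamHash.equals(searchParamHash))
                .filter(r -> r.status == Status.RUNNING || r.status == Status.SUCCESS)
                .filter(r -> !r.createdAt.isBefore(cutoff))
                .max(Comparator.comparing((DownloadRecord r) -> r.createdAt));
    }
}
```

A caller would pass something like Duration.ofMinutes(5) as the window. Failed records are excluded here on the assumption that a failed run should not block a retry.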

Key Components

  • Hash Generation: The system calculates a hash (referred to as searchParamHash) of the input parameters^[001-TODO__code_copy.md].
  • Database Storage: The ReportDownloadRecordEntity stores this hash alongside the record, allowing for efficient lookup of previous requests^[001-TODO__code_copy.md].
  • Time Window: The check includes a duration constraint (e.g., 5 minutes), typically configured via REPORT_CACHE_MINUTE, to ensure that only recent duplicate queries are considered valid cache hits^[001-TODO__code_copy.md].

Benefits

  • Resource Efficiency: Prevents the system from re-processing identical queries that have already been completed recently, saving computational resources.
  • User Experience: Allows the system to immediately serve an existing file or link if a request is duplicated, rather than forcing the user to wait for a new generation process.
  • Integrity: Ensures that multiple identical requests initiated simultaneously (e.g., double-clicks) do not result in conflicting processing states^[001-TODO__code_copy.md].

Related

  • [[Memoization]]
  • [[Idempotency]]
  • [[Caching strategies]]
  • [[Report workflow]]

Sources

  • 001-TODO__code_copy.md