Asynchronous Report Generation Architecture
The Asynchronous Report Generation Architecture is a system designed to handle long-running data export tasks (such as large CSV or PDF reports) without blocking the primary application thread.^[001-TODO__code_copy.md, 001-todo-code.md] It achieves this by decoupling the request initiation from the actual file generation and data retrieval processes, utilizing a Message Queue (RabbitMQ) for state management and retry logic, and storing generated files in Google Cloud Storage (GCS).^[001-TODO__code_copy.md, 001-todo-code.md]
Core Workflow
The report generation lifecycle follows a specific state transition model managed by a ReportDownloadRecord entity.^[001-TODO__code_copy.md, 001-todo-code.md]
- Initiation: A user requests a report via the `ReportManageController`.^[001-TODO__code_copy.md, 001-todo-code.md] The system creates a database record with a `RUNNING` status and sends a message (`ReportQryVo`) to a RabbitMQ topic exchange.^[001-TODO__code_copy.md, 001-todo-code.md]
- Processing: An external service (or listener) consumes the query message.^[001-TODO__code_copy.md, 001-todo-code.md] This service fetches the data and sends the results back in chunks (pages) via a `PUT` request to `makeReportDocument`.^[001-TODO__code_copy.md, 001-todo-code.md] Each chunk is converted to a byte array (e.g., CSV) and temporarily stored in GCS.^[001-TODO__code_copy.md, 001-todo-code.md]
- Completion: Once all pages are received, a `ReportDocumentCombineVo` message is sent to a `combineFileQ` queue.^[001-TODO__code_copy.md, 001-todo-code.md] A listener merges the temporary files into a single document and updates the database status to `SUCCESS`.^[001-TODO__code_copy.md, 001-todo-code.md]
- Download: The user polls or triggers a download via the `/download` endpoint.^[001-TODO__code_copy.md, 001-todo-code.md] If the status is `SUCCESS`, the file is served from GCS.^[001-TODO__code_copy.md, 001-todo-code.md]
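The hand-off from the Processing step to the Completion step depends on knowing when the last page has arrived. The source does not say how this is detected, so the following is only a minimal sketch under the assumption that each chunk carries a page number and a total page count. `ReportPageTracker`, `acceptPage`, and `totalPages` are hypothetical names, and a real service would likely keep this state in the database or a cache rather than in memory.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper: decides when every page of a report has arrived. */
class ReportPageTracker {
    // reportId -> page numbers received so far
    private final Map<Long, Set<Integer>> received = new HashMap<>();

    /**
     * Records one page and returns true once all pages are present, which is
     * the point where a combine message would be published.
     * totalPages is an assumed field, not something the source documents.
     */
    boolean acceptPage(long reportId, int pageNo, int totalPages) {
        if (pageNo < 1 || pageNo > totalPages) {
            throw new IllegalArgumentException("page out of range: " + pageNo);
        }
        Set<Integer> pages = received.computeIfAbsent(reportId, id -> new HashSet<>());
        pages.add(pageNo); // a Set makes a redelivered page harmless
        if (pages.size() == totalPages) {
            received.remove(reportId); // nothing more is expected for this report
            return true;
        }
        return false;
    }
}
```

Tracking page numbers in a set rather than counting deliveries means a retried `PUT` for the same page does not trigger the combine step early.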
Key Components
Message Queue Topology
The architecture relies on RabbitMQ to manage the asynchronous flow and implement a "Retry with Delay" pattern to prevent blocking on failures.^[001-TODO__code_copy.md, 001-todo-code.md]
- Topic Exchange (`plt.basic.report.topic.ex`): Receives the initial query request.^[001-TODO__code_copy.md, 001-todo-code.md]
- Delay Queue (`plt.basic.report.delay.q`): A queue with a TTL (Time To Live) that forwards messages to a Dead Letter Exchange after a set delay (5 minutes).^[001-TODO__code_copy.md, 001-todo-code.md] This allows the system to retry processing if the initial attempt times out or fails.^[001-TODO__code_copy.md, 001-todo-code.md]
- Dead Letter Queue (`plt.basic.report.dead.q`): If the message expires from the Delay Queue without successful processing, it lands here.^[001-TODO__code_copy.md, 001-todo-code.md] A listener on this queue marks the report status as `FAIL` in the database.^[001-TODO__code_copy.md, 001-todo-code.md]
- Combine Queue (`plt.basic.report.combine.file.q`): Dedicated queue for triggering the final merging of temporary file chunks.^[001-TODO__code_copy.md, 001-todo-code.md]
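In RabbitMQ, a delay queue of this kind is usually an ordinary queue declared with the `x-message-ttl`, `x-dead-letter-exchange`, and `x-dead-letter-routing-key` arguments. The sketch below builds that argument map for the 5-minute delay described above. The queue names come from the list; the dead letter exchange name and routing key are hypothetical, since the source does not give them.

```java
import java.time.Duration;
import java.util.HashMap;
import java.util.Map;

/** Sketch of the declaration arguments for the delay queue. */
class DelayQueueArguments {
    static final String DELAY_QUEUE = "plt.basic.report.delay.q";
    static final String DEAD_QUEUE = "plt.basic.report.dead.q";
    // Hypothetical: the source names the queues but not this exchange or key.
    static final String DEAD_EXCHANGE = "plt.basic.report.dead.ex";
    static final String DEAD_ROUTING_KEY = "report.dead";

    static Map<String, Object> build(Duration delay) {
        Map<String, Object> args = new HashMap<>();
        // Messages older than this many milliseconds are dead-lettered.
        args.put("x-message-ttl", Math.toIntExact(delay.toMillis()));
        // Where expired messages are republished.
        args.put("x-dead-letter-exchange", DEAD_EXCHANGE);
        args.put("x-dead-letter-routing-key", DEAD_ROUTING_KEY);
        return args;
    }
}
```

With the RabbitMQ Java client, a map like this would be passed as the `arguments` parameter of `Channel.queueDeclare(queue, durable, exclusive, autoDelete, arguments)`. The pattern only produces a delay if nothing consumes from the delay queue directly, so that messages leave it through expiry.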
File Management Strategy
The system uses Google Cloud Storage (GCS) to handle raw file data, structured around specific prefixes and temporary directories.^[001-TODO__code_copy.md, 001-todo-code.md]
- Temporary Storage: During generation, partial files are stored in a temporary directory (e.g., `/doc/report/csv/{id}-temp/`).^[001-TODO__code_copy.md, 001-todo-code.md]
- Final Storage: Once combined, the final document is moved to a permanent path (e.g., `/doc/report/csv/{id}/{id}.csv`).^[001-TODO__code_copy.md, 001-todo-code.md]
- Cleanup: After successful merging, the system automatically deletes the temporary directory to save storage costs.^[001-TODO__code_copy.md, 001-todo-code.md]
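The path conventions above can be captured in a small helper. This is a sketch only: the directory layouts are taken from the examples in the list, while the class name `ReportPaths` and the zero-padded chunk file name are assumptions made for illustration.

```java
import java.util.Locale;

/** Sketch of the object-name conventions for CSV reports. */
class ReportPaths {
    private static final String CSV_ROOT = "/doc/report/csv/";

    /** Temporary directory that holds the chunks of one report. */
    static String tempDir(long id) {
        return CSV_ROOT + id + "-temp/";
    }

    /**
     * Hypothetical chunk name. Zero padding keeps lexicographic order equal
     * to page order, which helps if chunks are listed before merging.
     */
    static String tempChunk(long id, int pageNo) {
        return tempDir(id) + String.format(Locale.ROOT, "%06d.csv", pageNo);
    }

    /** Permanent location of the combined document. */
    static String finalPath(long id) {
        return CSV_ROOT + id + "/" + id + ".csv";
    }
}
```

Because every chunk of a report shares the `{id}-temp/` prefix, cleanup can be expressed as deleting all objects under that prefix once the merge has succeeded.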
Database State Machine
The report_download_record table tracks the progress of every report.^[001-TODO__code_copy.md, 001-todo-code.md]
- `RUNNING`: The initial state. The system also checks for "duplicate queries" within a 5-minute cache window using `searchParamHash` to prevent redundant report generation.^[001-TODO__code_copy.md, 001-todo-code.md]
- `SUCCESS`: Indicates the file path is populated and ready for download.^[001-TODO__code_copy.md, 001-todo-code.md]
- `FAIL`: Indicates an error occurred (timeout or generation failure). The `error_message` column stores the reason.^[001-TODO__code_copy.md, 001-todo-code.md]
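The three states and the duplicate-query check can be sketched as follows. Several details here are assumptions rather than facts from the source: that `SUCCESS` and `FAIL` are terminal, that `searchParamHash` is a SHA-256 digest of a canonical parameter string, and that the 5-minute window is measured from the first request. `DuplicateQueryGuard` is a hypothetical in-memory stand-in for whatever cache the system uses.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.time.Duration;
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

enum ReportStatus {
    RUNNING, SUCCESS, FAIL;

    /** Assumption: only RUNNING may move on; SUCCESS and FAIL are terminal. */
    boolean canTransitionTo(ReportStatus next) {
        return this == RUNNING && next != RUNNING;
    }
}

/** Hypothetical in-memory duplicate check keyed by a hash of the query. */
class DuplicateQueryGuard {
    private static final Duration WINDOW = Duration.ofMinutes(5);
    private final Map<String, Instant> firstSeen = new HashMap<>();

    /** Assumed hashing scheme: hex-encoded SHA-256 of the parameter string. */
    static String searchParamHash(String canonicalParams) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(canonicalParams.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 support is mandatory for every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }

    /** True if the same parameters were accepted less than 5 minutes ago. */
    boolean isDuplicate(String canonicalParams, Instant now) {
        String hash = searchParamHash(canonicalParams);
        Instant previous = firstSeen.get(hash);
        if (previous != null && Duration.between(previous, now).compareTo(WINDOW) < 0) {
            return true;
        }
        firstSeen.put(hash, now); // start a new window for this query
        return false;
    }
}
```

Passing `now` in explicitly keeps the window logic deterministic and easy to test. The hash is only as reliable as the canonical form of the parameters, so equivalent queries need to serialize identically.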
Implementation Details
- Document Strategy: The system uses a Strategy Pattern for Document Generation via the `ReportDocumentService` interface.^[001-TODO__code_copy.md, 001-todo-code.md] Concrete implementations like `CSVDocumentServiceImpl` handle specific format logic (e.g., joining rows with commas, removing headers during chunk merging).^[001-TODO__code_copy.md, 001-todo-code.md]
- Status Verification: The `makeReportDocument` endpoint checks `isRunningStatus` to ensure it only accepts data for records currently in the `RUNNING` state.^[001-TODO__code_copy.md, 001-todo-code.md]
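A minimal sketch of the document strategy is shown below. The interface and class names come from the source, but the method names `toBytes` and `combine` and their signatures are assumptions made for illustration. The CSV handling is deliberately naive: it joins cells with commas and does not quote values that themselves contain commas or newlines.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;

/** Strategy interface; method signatures are hypothetical. */
interface ReportDocumentService {
    /** Converts one page of data into the bytes of a chunk file. */
    byte[] toBytes(List<String> header, List<List<String>> rows);

    /** Merges chunk files, in page order, into one document. */
    byte[] combine(List<byte[]> chunks);
}

class CSVDocumentServiceImpl implements ReportDocumentService {
    @Override
    public byte[] toBytes(List<String> header, List<List<String>> rows) {
        StringBuilder sb = new StringBuilder(String.join(",", header)).append('\n');
        for (List<String> row : rows) {
            sb.append(String.join(",", row)).append('\n');
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    @Override
    public byte[] combine(List<byte[]> chunks) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < chunks.size(); i++) {
            String text = new String(chunks.get(i), StandardCharsets.UTF_8);
            if (i > 0) {
                // Every chunk repeats the header; keep it only from the first.
                int firstNewline = text.indexOf('\n');
                text = firstNewline < 0 ? "" : text.substring(firstNewline + 1);
            }
            out.append(text);
        }
        return out.toString().getBytes(StandardCharsets.UTF_8);
    }
}
```

Adding another format, such as PDF, would then mean supplying a second implementation of `ReportDocumentService` without touching the code that receives chunks or triggers the merge.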
Related Concepts
- Message Queue
- [[Object Storage]]
- [[Async Pattern]]
Sources
- 001-TODO__code_copy.md
- 001-todo-code.md