Celery Tasks For HR & Recruitment Reports
Hey everyone! Let's dive into implementing Celery tasks for robust HR and recruitment report aggregation in the mvl-erp-backend project. This involves creating event-driven and scheduled tasks to ensure our data is accurate, consistent, and resilient. We'll be using Django signals for event-driven tasks and Celery beat for scheduled batch tasks. Let's break down the requirements and implementation details step by step.
1. HR Reports Aggregation with Celery
Models and Objectives
First off, we're dealing with a couple of key models:
- StaffGrowthReport: This guy tracks monthly or weekly growth, resignations, and transfers within the company. Think of it as a pulse check on our workforce.
- EmployeeStatusBreakdownReport: This one provides a snapshot of the current status of all employees.
The main goal here is to create two types of Celery tasks to aggregate this data:
- Event-Driven Task: This task springs into action whenever something changes in the EmployeeWorkHistory model: new entries, edits, or deletions. We'll be using Django signals for this. The signal approach ensures that every change triggers an update to our reports immediately, and signals are preferred over save() overrides.
- Scheduled Batch Task: This task runs every night at midnight and aggregates all HR reporting data for the entire previous day. This is super important because it helps us catch any missed or failed events. It's like a safety net for our data.
Triggering the Tasks
So, how do these tasks get triggered?
- Event-Driven: We use post_save and post_delete signals for the EmployeeWorkHistory model. This is the preferred method, as it provides real-time updates. The relevant references are apps/hrm/models/employee_work_history.py and apps/hrm/services/employee/work_history.py.
- Scheduled Batch: We use Celery beat to schedule the midnight batch task.
Implementation Insights
Here are some implementation suggestions:
- Prioritize Signals: Always lean towards using signals rather than overriding the model's save() method. Signals give us a cleaner and more efficient way to trigger the tasks.
- Service Layer: Use a service layer to aggregate data by period and organizational unit. This keeps the logic organized and maintainable.
- Batch Task: Make sure the batch task can re-process all the changes from the previous day. This is crucial for handling any missed or failed events and keeps our data consistent.
- Retry Mechanisms: Both tasks must support retry mechanisms. This is essential for handling transient errors. If a task fails due to a temporary issue, the retry mechanism will automatically try again, ensuring that the data is eventually processed.
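As a sketch of what that retry support could look like (the task name and error types below are illustrative, not the project's actual ones), Celery's autoretry_for option with exponential backoff keeps the task body free of retry boilerplate:

```python
from celery import shared_task


@shared_task(
    bind=True,
    autoretry_for=(ConnectionError, TimeoutError),  # retry only transient errors
    retry_backoff=True,        # exponential delay between attempts: 1s, 2s, 4s, ...
    retry_backoff_max=600,     # never wait more than 10 minutes between attempts
    retry_jitter=True,         # randomize delays to avoid thundering herds
    max_retries=5,             # then give up and surface the failure
)
def aggregate_hr_reports_for_change(self, work_history_id, action):
    """Re-aggregate the reports affected by one work-history change."""
    # the service-layer call would go here; any ConnectionError/TimeoutError
    # raised inside it triggers an automatic retry with backoff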
2. Recruitment Reports Aggregation with Celery
Models and Objectives
Next up, we're looking at recruitment reports. Here are the models involved:
- RecruitmentSourceReport
- RecruitmentChannelReport
- RecruitmentCostReport
- HiredCandidateReport
We need to create two Celery tasks:
- Event-Driven Task: This task is triggered when a
RecruitmentCandidateis created, edited, when its status changes (like being hired or approved), or when it's deleted. Again, we're using signals here. - Scheduled Batch Task: This task runs at midnight to aggregate all daily recruitment report activity. This ensures we don't miss any data and that our reports are up to date.
Triggering the Tasks
How do these recruitment tasks get triggered?
- Event-Driven: By signals on model changes, especially status transitions. Refer to
apps/hrm/api/views/recruitment_candidate.py, serializers, and similar recruitment report APIs. - Scheduled Batch: We use Celery beat to schedule the midnight batch task.
Implementation Suggestions
- Prioritize Signals: Signals should be prioritized, especially for the HIRED status, where a candidate becomes an employee. When this happens, we want to trigger both the recruitment and HR Celery tasks.
- Scheduled Task: The scheduled task should cover all of the day's events to ensure data resilience.
- Retry Logic: Always include robust retry logic.
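That HIRED dual-dispatch rule is easy to isolate into a small, testable helper that the signal handler can call before enqueueing anything. The status names and task labels below are assumptions, not the project's actual identifiers:

```python
RECRUITMENT_TASK = "recruitment"  # hypothetical labels for the two pipelines
HR_TASK = "hr"


def tasks_for_transition(old_status, new_status):
    """Given a candidate's old and new status, return which aggregation
    pipelines the event-driven signal handler should enqueue."""
    if old_status == new_status:
        return []                        # no change, nothing to re-aggregate
    tasks = [RECRUITMENT_TASK]           # any change touches recruitment reports
    if new_status == "HIRED":
        tasks.append(HR_TASK)            # a hire also creates an employee record
    return tasks


print(tasks_for_transition("INTERVIEW", "HIRED"))  # ['recruitment', 'hr']
print(tasks_for_transition("NEW", "APPROVED"))     # ['recruitment']
print(tasks_for_transition("NEW", "NEW"))          # []
```

Keeping this decision in a pure function means the dual-dispatch rule can be unit-tested without a broker, a database, or signals.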
3. General Recommendations and Robustness
Enhancing Resiliency
To make our tasks more robust, here are some general recommendations:
- Retries: All Celery tasks should support retries. This helps handle transient errors gracefully.
- Split Logic: Consider splitting the handler logic. One part should handle single events (fast, signal-based), and the other should handle the midnight batch (all updates of the day).
- Batch Task: The scheduled batch is crucial for reconciling any missed events and preventing orphaned aggregations.
- Logging: Tasks should log details about their execution, including the trigger source, affected records, and a summary of the aggregation. This will help with debugging and monitoring.
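Here's a hedged sketch of such a log line carrying the three suggested fields (the logger name and field names are illustrative); a structured JSON payload keeps it easy to grep and to feed into monitoring:

```python
import json
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("hrm.reports")


def log_aggregation(trigger, record_ids, summary):
    """Emit one structured log line per task run: trigger source,
    affected records, and a summary of the aggregation."""
    payload = json.dumps(
        {"trigger": trigger, "affected_records": record_ids, "summary": summary},
        sort_keys=True,
    )
    logger.info(payload)
    return payload


log_aggregation("signal", [42], {"reports_updated": 2})
log_aggregation("beat", [7, 8, 9], {"reports_updated": 9, "day": "2024-01-01"})
```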
Detailed Implementation Steps
Here's a detailed list of what needs to be done:
- Define Celery Task Signatures: Create the Celery task signatures for both event-driven and batch aggregation.
- Wire Up Django Signals: Connect the Django signals for model mutations. This is the preferred method. Ensure signals are correctly configured for post_save and post_delete events.
- Implement Batch Scheduler: Set up the batch scheduler using Celery beat to run at midnight.
- Implement Aggregation Logic: Write the aggregation logic. Ensure that it's transactionally safe to avoid data inconsistencies. Utilize the service layer to encapsulate the business logic.
- Add Retry Logic, Logging, Error Handling, and Documentation: Implement robust retry logic to handle potential failures. Include comprehensive logging for each task to track its execution. Add proper error handling to manage exceptions and provide clear documentation for the implemented tasks and their usage.
- Example API Responses and Documentation: Provide example API responses to show how to use the implemented Celery tasks and detailed documentation for developers and users.
Conclusion
Implementing these Celery tasks will significantly improve the accuracy, consistency, and reliability of our HR and recruitment reports. By prioritizing Django signals, incorporating retry mechanisms, and using a scheduled batch task, we'll create a resilient system that can handle any issues that may arise. This will provide valuable insights for our HR and recruitment teams. Let's get to work, and happy coding, guys!