Incident Management Plan
This document describes the incident management plan for Obscreen and complements the Information Security Policy, the Systems Security and Integrity Policy, the Customer Data Encryption Policy, the Risk Analysis, the DPO Designation, the CISO Designation, the Privacy Policy, and the Subprocessors page.
1. Introduction
Effective incident management is essential to ensure the continuity of Obscreen services and to protect the personal data of our customers and users. This document describes the incident management plan, including the procedures, roles, and responsibilities for detecting, responding to, and resolving security and operational incidents.
A centralized alerting and on-call platform (PagerDuty), combined with the native detection capabilities of our cloud providers and application-level error reporting, plays a key role in this strategy.
2. Objectives
- Detect incidents quickly through proactive monitoring.
- Respond effectively to incidents to minimize their impact.
- Communicate clearly with all stakeholders during an incident.
- Document and analyze incidents to improve future processes.
- Ensure compliance with applicable legal and regulatory requirements (in particular GDPR Articles 33 and 34).
3. Scope
This plan applies to:
- All types of incidents affecting the systems, services, or data operated by Obscreen, including the Obscreen Cloud platform, the license issuance services (lic.obscreen.io), the public websites (obscreen.io, motd.obscreen.io, updates.obscreen.io, support.obscreen.io, docs.obscreen.io), and the supporting payment integrations. This includes all derived .com domains.
- All employees, contractors, and partners involved in incident management.
- All information assets, including cloud infrastructure (Hetzner, Cloudflare, AWS, and other subprocessors listed on the Subprocessors page), applications, and data.
For self-hosted deployments, the management of incidents on the customer's infrastructure is the responsibility of the customer, in accordance with the Terms of Service. Obscreen will, however, provide reasonable assistance and clear communication when an incident on Obscreen-managed services (such as license verification or update servers) impacts self-hosted customers.
4. Definitions
- Incident: An unplanned event that disrupts or reduces the quality of a service or compromises the security of information.
- Alerting and On-Call Platform: PagerDuty, used to route alerts, manage on-call schedules, and trigger escalations.
- Monitoring Sources: The combination of tools that produce alerts and observability data, including the native security and monitoring services of our cloud providers (Cloudflare, Hetzner, AWS), application error reporting via Sentry, and internal log aggregation.
- Incident Response Team (IRT): The group of people responsible for managing incidents.
5. Roles and Responsibilities
5.a Management
- Strategic Decisions: Make critical decisions during major incidents.
- External Communication: Approve communications addressed to customers and the media.
5.b Jessym Reziga (Founder, CISO, and DPO)
- General Coordination: Supervise incident management.
- Operational Decisions: Authorize the actions necessary to contain and resolve the incident.
- Communication: Serve as the primary point of contact for internal and external communications.
- Regulatory Notifications: As DPO, ensure that personal data breach notifications are sent to the CNIL within 72 hours when applicable, and to affected data subjects without undue delay.
5.c Incident Response Team (IRT)
- Detection and Analysis: Use the monitoring platform to supervise systems and analyze alerts.
- Immediate Action: Implement the measures necessary to contain and resolve the incident.
- Documentation: Record all actions taken and observations made.
5.d Employees and Contractors
- Reporting: Immediately report any incident or anomaly detected, including suspicious emails, unauthorized access, malware, or accidental data disclosure.
- Cooperation: Collaborate with the IRT during incident management.
6. Detection and Reporting of Incidents
6.a Detection Sources
Detection of incidents on the services operated by Obscreen relies on several layers:
- Provider-Native Detection: Native provider security services (such as Cloudflare WAF, bot management, and DDoS protection, as well as the equivalent capabilities of Hetzner and any other provider used).
- Application Error Reporting: Anonymous error and crash reports collected via Sentry feed back into the detection workflow when appropriate.
- Application Health Checks: Custom checks on critical endpoints (Obscreen Cloud Studio, license issuance, update / status APIs, public websites) verify availability and key metrics (request latency, error rates, abuse signals, license issuance anomalies).
- Manual Observations: Suspicious activity reported by employees, contractors, or customers.
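The application health-check layer can be pictured as a small evaluation function over observations of a critical endpoint. The sketch below is illustrative only: the `HealthSample` type, the `evaluate` function, and both threshold values are assumptions made for the example, not Obscreen's actual configuration.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical thresholds; this plan does not publish real values.
MAX_LATENCY_MS = 1500.0
MAX_ERROR_RATE = 0.05


@dataclass(frozen=True)
class HealthSample:
    """One observation of a critical endpoint (field names are illustrative)."""
    endpoint: str
    reachable: bool
    latency_ms: float
    error_rate: float  # fraction of failed requests, 0.0 to 1.0


def evaluate(sample: HealthSample) -> list[str]:
    """Return the problems found; an empty list means the endpoint looks healthy."""
    if not sample.reachable:
        # Latency and error rate carry no meaning for an unreachable endpoint.
        return [f"{sample.endpoint}: unreachable"]
    problems = []
    if sample.latency_ms > MAX_LATENCY_MS:
        problems.append(f"{sample.endpoint}: latency {sample.latency_ms:.0f} ms")
    if sample.error_rate > MAX_ERROR_RATE:
        problems.append(f"{sample.endpoint}: error rate {sample.error_rate:.1%}")
    return problems
```

In a setup like this, any non-empty result would be turned into an alert and handed to the alerting platform described in section 6.b.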
6.b Alerting and On-Call (PagerDuty)
- Alert Routing: Alerts produced by the detection sources above are routed to PagerDuty, which dispatches them to the on-call IRT members.
- Notification Channels: PagerDuty notifies on-call responders via email, SMS, push notifications, and phone calls, depending on the severity and acknowledgment status.
- On-Call Schedules: PagerDuty manages the on-call rotations and ensures that an alert always reaches an available responder.
- Escalation Policies: If an alert is not acknowledged within a defined time, PagerDuty automatically escalates it to a higher level.
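The escalation behavior described above can be modeled in a few lines. This is a sketch of the logic only, not PagerDuty's implementation: the 15-minute acknowledgment timeout and the names in `ESCALATION_CHAIN` are hypothetical, since this plan only refers to "a defined time".

```python
from datetime import datetime, timedelta, timezone

# Hypothetical values chosen for the example.
ACK_TIMEOUT = timedelta(minutes=15)
ESCALATION_CHAIN = ["primary on-call", "secondary on-call", "CISO"]


def responder_to_notify(triggered_at: datetime, now: datetime) -> str:
    """Return who should hold an alert that is still unacknowledged at `now`."""
    elapsed = now - triggered_at
    # Each full timeout period without acknowledgment moves one step up the chain.
    steps = elapsed // ACK_TIMEOUT
    # Clamp to the chain bounds: never below the first or above the last level.
    level = min(max(steps, 0), len(ESCALATION_CHAIN) - 1)
    return ESCALATION_CHAIN[level]
```

Once a responder acknowledges the alert, escalation stops, so the function is only meaningful for alerts that remain unacknowledged.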
6.c Reporting Process
- Automatic Reporting: Alerts produced by detection sources are automatically dispatched to the IRT through PagerDuty.
- Manual Reporting: Employees and contractors can report incidents through a dedicated channel (email, internal messaging, ticketing tool).
- Customer Reporting: Customers can report security incidents or vulnerabilities via [email protected] or through the support channels.
- Recording: All reported incidents are recorded in an incident management system.
7. Incident Classification
7.a Severity Levels
- Level 1 (Critical): Major incidents affecting all services or compromising the security of sensitive data (for example, a confirmed breach of customer personal data, a global outage of Obscreen Cloud, or a compromise of the license issuance service).
- Level 2 (High): Incidents affecting important services or presenting a high risk for security (for example, a partial outage, a confirmed vulnerability without exploitation, abuse of Customer Content on Obscreen Cloud).
- Level 3 (Moderate): Incidents affecting non-critical components or with limited impact.
- Level 4 (Minor): Incidents with no significant impact on services or security.
7.b Classification Criteria
- Service Impact: Extent of the disruption of services.
- Data Security: Risk of compromise of personal data.
- Number of Affected Users: Scope of the incident in terms of users impacted.
- Estimated Recovery Time: Expected duration to resolve the incident.
8. Incident Response
8.a General Procedure
- Incident Validation
  - Confirm that the reported incident is real.
  - Quickly evaluate the severity and impact.
- Notification of the Relevant Team
  - Notify the appropriate IRT members based on the classification.
  - When necessary, notify Management for Level 1 or Level 2 incidents.
- Containment
  - Isolate affected systems to prevent propagation.
  - Apply temporary mitigations if possible (rate limiting, IP blocking, service isolation, license revocation in case of a license-related compromise).
- Investigation
  - Identify the root cause of the incident.
  - Collect evidence and event logs for analysis.
- Recovery
  - Restore affected services to normal operation.
  - Verify the integrity of data and systems, restoring from validated daily backups if needed, as defined in the Systems Security and Integrity Policy.
- Communication
  - Provide regular updates to internal stakeholders.
  - Inform customers when necessary, in accordance with the communication plan and contractual commitments.
- Incident Closure
  - Document all actions taken.
  - Update the incident management system with the final details.
8.b Target Response Times
- Level 1: Immediate response, resolution within 4 hours.
- Level 2: Response within 1 hour, resolution within 8 hours.
- Level 3: Response within 2 hours, resolution within 48 hours.
- Level 4: Response within 8 hours, resolution within 96 hours.
These targets are best-effort objectives and may vary depending on the complexity of the incident and on the response capabilities of our subprocessors.
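The targets in section 8.b translate directly into deadlines relative to the moment an incident is detected. The following is a minimal sketch; the `TARGETS` table restates the figures above, while the function name and the choice of detection time as the reference point are assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone

# Targets from section 8.b: (time to first response, time to resolution).
# "Immediate response" for Level 1 is modeled as a zero delay.
TARGETS = {
    1: (timedelta(0), timedelta(hours=4)),
    2: (timedelta(hours=1), timedelta(hours=8)),
    3: (timedelta(hours=2), timedelta(hours=48)),
    4: (timedelta(hours=8), timedelta(hours=96)),
}


def target_deadlines(level: int, detected_at: datetime) -> tuple:
    """Return (respond_by, resolve_by) for an incident of the given level."""
    respond_within, resolve_within = TARGETS[level]
    return detected_at + respond_within, detected_at + resolve_within
```

For example, a Level 2 incident detected at 09:00 UTC has a response target of 10:00 UTC and a resolution target of 17:00 UTC the same day.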
9. Communication During Incidents
9.a Internal
- Channels Used: Email, instant messaging, phone calls.
- Frequency: Updates every 2 hours for Level 1 and Level 2 incidents, or as needed.
- Content: Incident status, actions taken, impact on services.
9.b External
- Customers: Notification in case of impact on services or personal data, including for self-hosted customers when an Obscreen-managed service (license, updates) is involved.
- Regulatory Authorities: If the incident involves a personal data breach, notification to the CNIL (Commission nationale de l'informatique et des libertés) within 72 hours, in accordance with Article 33 of the GDPR.
- Affected Data Subjects: Communication without undue delay when the breach is likely to result in a high risk to their rights and freedoms, in accordance with Article 34 of the GDPR.
- Media and Public: Coordinated communication with Management for major incidents.
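Because the 72-hour window for notifying the CNIL runs from the moment the controller becomes aware of the breach, it can help to compute the deadline explicitly when an incident is opened. The sketch below is illustrative; the function names are hypothetical and the code is not a substitute for the DPO's legal assessment.

```python
from datetime import datetime, timedelta, timezone

CNIL_NOTIFICATION_WINDOW = timedelta(hours=72)  # GDPR Article 33(1)


def cnil_deadline(aware_at: datetime) -> datetime:
    """Latest time for notifying the CNIL, counted from awareness of the breach."""
    if aware_at.tzinfo is None:
        # Naive timestamps make a legal deadline ambiguous, so reject them.
        raise ValueError("aware_at must be timezone-aware")
    return aware_at + CNIL_NOTIFICATION_WINDOW


def hours_remaining(aware_at: datetime, now: datetime) -> float:
    """Hours left before the deadline; negative once it has passed."""
    return (cnil_deadline(aware_at) - now) / timedelta(hours=1)
```

A breach identified on 10 May at 14:30 UTC would therefore need to be notified by 13 May at 14:30 UTC.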
10. Documentation and Reporting
- Incident Report: A detailed report is written for each incident, including:
  - Description of the incident.
  - Timeline of events.
  - Actions taken.
  - Impact on services and data.
  - Lessons learned.
- Secure Storage: Reports are stored in a secure repository and accessible only to authorized people.
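One way to keep incident reports consistent is to give them a fixed structure that mirrors the five items listed above. The following is a sketch under that assumption; the `IncidentReport` class and its field names are hypothetical and do not describe the incident management system actually in use.

```python
from dataclasses import dataclass, field


@dataclass
class IncidentReport:
    """Structured report with one field per required section."""
    description: str = ""
    timeline: list = field(default_factory=list)       # (timestamp, event) entries
    actions_taken: list = field(default_factory=list)  # free-text action entries
    impact: str = ""
    lessons_learned: str = ""

    def missing_sections(self) -> list:
        """Names of sections that are still empty, in declaration order."""
        return [name for name, value in vars(self).items() if not value]
```

A check such as `missing_sections()` could be run before an incident is closed, so that no report is filed without its timeline or lessons learned.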
11. Post-Incident Review and Continuous Improvement
- Post-Mortem Meeting: Organized by the CISO within 15 days of the closure of a major incident.
- Root Cause Analysis: Identification of root causes and weaknesses in the system.
- Action Plan: Definition of corrective measures to prevent recurrence.
- Procedure Updates: Adapt the policies and procedures based on the lessons learned.
12. Training and Awareness
- Training Sessions: Regular training of the IRT on new threats and best practices.
- Incident Simulations: Periodic exercises to test the team's responsiveness and the effectiveness of the procedures.
- General Awareness: Inform all employees and contractors about the importance of reporting incidents quickly.
13. Plan Maintenance
- Periodic Review: This incident management plan is reviewed at least once every three years.
- Updates: Updated according to technological, organizational, or regulatory changes.
- Validation: Any major change must be approved by Management.
14. Additional Notes
14.a Use of PagerDuty
PagerDuty is a central element of our alerting and on-call strategy. It is integrated into the incident management plan as follows:
- Centralized Alert Routing: Alerts produced by the various detection sources (provider-native security services on Cloudflare, Hetzner, AWS; application error reporting via Sentry; application health checks) are routed to PagerDuty, providing a single point of dispatch.
- Multi-Channel Notifications: PagerDuty notifies on-call responders via email, SMS, push notifications, and phone calls, depending on the severity and acknowledgment status.
- On-Call Schedules: PagerDuty manages the on-call rotations and ensures that an alert always reaches an available responder.
- Severity Levels: Severity levels (see section 7) are reflected in PagerDuty as urgency settings, controlling the notification behavior.
14.b Escalation and Acknowledgment
- Acknowledgment: On-call responders acknowledge alerts in PagerDuty when they begin investigation.
- Automatic Escalation: If an alert is not acknowledged within a defined time, PagerDuty automatically escalates it to the next on-call responder according to the escalation policy.
- Alert Enrichment: Alerts include the contextual information provided by the detection source (relevant metrics, affected service, related runbook link) to facilitate a fast response.
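As an illustration of alert enrichment, the sketch below builds a trigger event in the shape accepted by the PagerDuty Events API v2 (`routing_key`, `event_action`, `payload`, `links`). The mapping from the severity levels in section 7 to PagerDuty severities, the function name, and the example values are assumptions; the code only builds the payload and does not send it.

```python
# Assumed mapping from Obscreen severity levels (section 7) to the four
# severity values accepted by the PagerDuty Events API v2.
PAGERDUTY_SEVERITY = {1: "critical", 2: "error", 3: "warning", 4: "info"}


def build_trigger_event(routing_key: str, level: int, summary: str,
                        source: str, runbook_url: str, metrics: dict) -> dict:
    """Build an Events API v2 trigger event enriched with context."""
    return {
        "routing_key": routing_key,       # integration key of the target service
        "event_action": "trigger",
        "payload": {
            "summary": summary,           # short human-readable description
            "source": source,             # affected service or host
            "severity": PAGERDUTY_SEVERITY[level],
            "custom_details": metrics,    # relevant metrics from the detection source
        },
        "links": [{"href": runbook_url, "text": "Runbook"}],
    }
```

A detection source would serialize this dictionary as JSON and post it to the Events API endpoint; the runbook link then appears alongside the alert for the on-call responder.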
14.c Integration with Other Incident Management Tools
- Ticketing System: Alerts and incidents handled in PagerDuty are linked to our ticketing system to ensure follow-up and documentation.
- Runbooks: Predefined procedures (runbooks) are associated with alert types to guide responders during resolution.
- Post-Mortem Workflow: Incident timelines collected through PagerDuty feed into the post-mortem documentation described in section 11.
By adopting this incident management plan, Obscreen strengthens its ability to manage incidents effectively, minimizing the impact on services and on the security of customer data. The combination of clear procedures, well-defined roles, layered detection sources, and a centralized alerting platform (PagerDuty) allows us to maintain a high level of operational resilience.
15. Contact
For any question regarding this Incident Management Plan, or to report a security incident or vulnerability, please contact us at [email protected].
