Dec 17, 17:33 EST
Postmortem - On 12/16, between 11:10 a.m. and 12:25 p.m. EST, a subset of Pardot customers experienced a disruption in the portion of our background job processing that schedules new jobs. This disruption delayed processing for jobs scheduled to begin during this timeframe, including email sending, imports, exports, and CRM syncing.
We determined the root cause to be a spike in job process metadata, which in turn led to low-memory errors that caused the system to stop scheduling jobs. While we have metrics and alerting in place to proactively detect similar issues, in investigating this incident we found a misconfigured monitoring metric that, had it been configured correctly, would have brought the problem to light sooner.
To prevent similar issues in the future, we will be taking two steps. First, we will improve our handling of job metadata to prevent the memory issues that result when a spike occurs. Second, we will correct the alerting so that it provides better and more visible notifications should this class of error occur again.
We appreciate your patience and continued trust as we worked to resolve this situation.
Zach Bailey, Sr. Director of Engineering
Dec 16, 13:17 EST
Resolved - Email sending has returned to normal. Please contact our Support Team at pardot.com/help with any further issues. A full postmortem will follow.
Dec 16, 12:36 EST
Monitoring - We've identified the issue and emails are now sending. Our Engineers are continuing to monitor.
Dec 16, 12:19 EST
Investigating - We're currently investigating reports of delayed emails as a top priority with our Engineers. Updates will be provided here as we learn more.