Affected components

No components marked as affected

Updates

Write-up published

Resolved

Find below our postmortem on the OMS incident.

http://help.vtex.com/en/announcement/post-mortem-oms-november-2017

Mon, Jul 30, 2018, 07:29 PM
8 months earlier...

Resolved

We've completed phase two of our recovery process.

All the orders placed in 2017 are now available in the admin dashboard or through the APIs.

We continue to work on the system's stability and will follow up with a postmortem.

Sat, Nov 11, 2017, 04:22 PM
5h earlier...

Monitoring

Phase two of the recovery process is still ongoing.

We anticipate that the recovery will be completed early this evening.

Sat, Nov 11, 2017, 10:24 AM
7h earlier...

Monitoring

We are still working on phase two of our recovery process.

More updates in the morning.

Sat, Nov 11, 2017, 02:35 AM
1h earlier...

Monitoring

We are still working on phase two of our recovery process.

Sat, Nov 11, 2017, 01:07 AM
55m earlier...

Monitoring

We are still working on phase two of our recovery process.

Sat, Nov 11, 2017, 12:11 AM
1h earlier...

Monitoring

We've completed the first phase of our recovery process.

All the orders placed/updated in the last 24 hours are now available in the admin dashboard or through the APIs.

We are starting the second phase right now and we will continue to post updates here.

Fri, Nov 10, 2017, 10:53 PM
1h earlier...

Monitoring

The fixes we’ve rolled out are showing positive results, though we’re not there yet.

Our OMS started to recover and you might be able to start seeing some orders via admin or through our APIs.

We are phasing the recovery process in the following order:

  • Orders placed/updated in the last 24 hours

  • Orders from 2017

There was no data loss as a consequence of this incident.

We will keep posting updates as the recovery process unfolds.

Fri, Nov 10, 2017, 09:04 PM
1h earlier...

Monitoring

Dear Customers,

We’d like to share some additional details about this incident.

It is important to reinforce that all storefronts are receiving orders and there is no impact on sales. All data is being securely stored, and there is no data loss as a consequence of this incident.

The issue lies within the boundaries of our administrative dashboard and its respective APIs, but we understand that this creates challenges for the operation.

One of our core values is transparency. As such, we are making continuous efforts to improve our incident communication protocols, and your continued feedback is always appreciated.

Right now, all of our senior engineers are committed to the recovery of the affected systems.

Fri, Nov 10, 2017, 07:26 PM
1h earlier...

Monitoring

We are still working on the root cause of the increased error rates in our OMS service.

Fri, Nov 10, 2017, 06:01 PM
1h earlier...

Monitoring

We are still working on the root cause of the increased error rates in our OMS service.

Fri, Nov 10, 2017, 04:54 PM
2h earlier...

Monitoring

We are still working on the root cause of the increased error rates in our OMS service.

Fri, Nov 10, 2017, 02:45 PM
1h earlier...

Monitoring

We are still working on the root cause of the increased error rates in our OMS service.

Fri, Nov 10, 2017, 12:52 PM
1h earlier...

Monitoring

We can confirm an improvement in the elevated error rates in OMS. We are monitoring the result of our actions.

Fri, Nov 10, 2017, 11:02 AM
37m earlier...

Investigating

We are investigating increased error rates in our OMS service.

Fri, Nov 10, 2017, 10:25 AM