Data Sync & Contact Data Error
Incident Report for HEXONET
Postmortem

Incident Report

Summary

This report covers an incident affecting HEXONET customers that caused issues with domain contact management for domains in specific TLDs. For a subset of these domains, general domain management actions were also affected. A copy of this report has been sent by email to the affected customers.

Impact

The initial impact was on a subset of domains that could not be managed in HEXONET customer accounts. Following this, outbound transfer event notifications were issued for a small subset of those domains (note: at no point were domains transferred out of HEXONET).

While the incident was being resolved, customers with domains in specific TLDs experienced errors when updating domain contact handles, until the final resolution was reached.

The list of impacted TLDs can be found below on this incident status page.

At no point during this incident was there any risk to the resolution, ownership, or lifecycle management (including renewals) of affected domain names.

Root Cause

During a routine configuration change of an internal system account, a human-driven step was not followed entirely according to process, which broke synchronisation between the HEXONET and registry systems.

During the subsequent re-synchronisation process, the HEXONET system created read-only AUTO contacts for select domains, which in turn left customers unable to update the contacts associated with those domains.

Remediation & Prevention

Remediation steps taken during this incident:

  • All domains moved to same internal system account
  • Re-synchronisation between HEXONET and registry systems
  • Database backup retrieval and contact data restoration
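
The contact restoration step can be pictured with a minimal sketch. It is illustrative only and is not the script HEXONET used: the data layout (a role-to-handle mapping per domain), the function names, and the assumption that auto-generated handles are recognisable by an "AUTO" prefix are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: role -> contact handle, e.g. {"owner": "P-ABC123"}
Contacts = Dict[str, str]


def is_auto_handle(handle: str) -> bool:
    # Assumption: auto-generated handles start with "AUTO".
    return handle.upper().startswith("AUTO")


def plan_contact_restore(
    current: Dict[str, Contacts],
    backup: Dict[str, Contacts],
) -> Tuple[Dict[str, Contacts], List[str]]:
    """Return (restore plan, domains that cannot be fully restored)."""
    plan: Dict[str, Contacts] = {}
    unresolved: List[str] = []
    for domain, contacts in current.items():
        # Roles on this domain that currently point at an AUTO handle.
        auto_roles = [r for r, h in contacts.items() if is_auto_handle(h)]
        if not auto_roles:
            continue
        saved = backup.get(domain, {})
        # Only restore roles for which the backup holds a real handle.
        fixes = {
            role: saved[role]
            for role in auto_roles
            if role in saved and not is_auto_handle(saved[role])
        }
        if fixes:
            plan[domain] = fixes
        if len(fixes) < len(auto_roles):
            unresolved.append(domain)
    return plan, sorted(unresolved)


current = {
    "example.xyz": {"owner": "AUTO-1", "tech": "P-T1"},
    "shop.store": {"owner": "AUTO-2"},
    "ok.site": {"owner": "P-O3"},
}
backup = {"example.xyz": {"owner": "P-O1", "tech": "P-T1"}}
plan, unresolved = plan_contact_restore(current, backup)
```

Building a plan first, rather than writing changes directly, leaves room to review the proposed handle assignments and to follow up separately on any domain the backup cannot cover.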

To prevent a recurrence, the documentation for this process will be expanded with a greater level of detail. Future changes of this kind will also require additional review steps before sign-off. The scope for automating this process will be investigated.
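
As an illustration of what an automated check for this kind of change might look like, the sketch below compares the domains that were meant to be moved with the domains actually present in the target account afterwards. The function name and inputs are hypothetical and do not describe HEXONET's internal tooling.

```python
from typing import Dict, Iterable, List, Union

Report = Dict[str, Union[List[str], bool]]


def verify_account_move(
    expected_domains: Iterable[str],
    target_account_domains: Iterable[str],
) -> Report:
    """Compare the planned move list with the target account's contents."""
    # Domain names are case-insensitive, so normalise before comparing.
    expected = {d.strip().lower() for d in expected_domains}
    present = {d.strip().lower() for d in target_account_domains}
    missing = sorted(expected - present)      # planned but not moved
    unexpected = sorted(present - expected)   # present but not planned
    return {
        "missing": missing,
        "unexpected": unexpected,
        "complete": not missing,
    }


report = verify_account_move(
    ["a.xyz", "B.store", "c.site"],
    ["a.xyz", "c.site", "d.fun"],
)
```

A non-empty "missing" list would flag a partial move of the kind described in the timeline before any customer is affected by it.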

Timeline

31 Jan 13:00 UTC

As part of an internal system account configuration change, domains were moved between two internal system accounts.

A subset of domains was not moved to the new account. Domains that were not moved could no longer be managed through customer HEXONET accounts.

31 Jan

First customer contact notification of issue received by HEXONET Support.

1 Feb

Following validation of the customer report, internal investigations into the issue began.

5 Feb

Investigations escalated to Engineering and issue identified as caused by steps to change internal account configuration.

6 Feb

Synchronisation of domains initiated between the HEXONET and Registry systems to resolve the management issue. The synchronisation completed but, as a side effect, created AUTO contact handles on all domains in the affected TLDs. This sent ‘UPDATE=OK’ events to affected customers.

These AUTO handles left the HEXONET system unable to manage domain contacts within the Registry System.

Initial mechanisms to reassign original contacts failed.

7 Feb

Identified original domain management issue as being linked to specific statuses of domains.

8 Feb

Transfer of the domains affected by the management issue to the updated internal system account completed. Management capability reinstated for these domains.

Contact data issue escalated internally – status page created for incident. https://status.hexonet.net/incidents/gx124dcdbm2n

9 Feb

Customer incident notification sent to all affected Resellers. Hourly status updates through status page.

9 Feb

Identified database tables needed for restore of contact data. Started restore process.

9 Feb

Prepared script to restore original contact handles from the database backup. Executed script for all affected domains.

9 Feb

Validated contacts updated on domains and AUTO handles removed. Contact management capability restored at HEXONET.

9 Feb 17:19 UTC

Incident confirmed resolved via email and status page.

As always, we remain dedicated to your success and hope this background information supports your understanding of all that transpired. If you have any questions or would like to connect with our team, please contact us at help@hexonet.support.

Your HEXONET Team

Posted Feb 12, 2024 - 17:44 UTC

Resolved
We are pleased to announce that all work related to the data synchronisation issue has now been completed. Our team has worked diligently to ensure a thorough resolution, and we'll continue to monitor our support channels over the weekend - please contact us if you have any questions or concerns at help@hexonet.support.

We will provide an additional update on Monday, 12th of February, with more details on the work completed. Moreover, direct communications will be sent to all affected Resellers, including a complete breakdown and an incident report.

We want to express thanks for the patience and understanding throughout this process.
Posted Feb 09, 2024 - 17:19 UTC
Update
Our database restore activities have been completed and we are now in the process of restoring contact handles to ensure that all information is accurately aligned with each domain.

As we proceed with the synchronisation, please be aware that there will be an additional event per domain. This event will either be made available as a poll event or sent directly to you via email, depending on the settings of your account.

Next update:
17:00 UTC
Posted Feb 09, 2024 - 16:00 UTC
Update
We are currently in the final stages of our review process for the data synchronisation restores. Our team is making the last few necessary checks to ensure a complete resolution.

We anticipate providing you with an update on our progress before or on the next hourly update.

Next Update:
16:00 UTC
Posted Feb 09, 2024 - 15:00 UTC
Update
A database restore is currently underway and is 80% complete. This will enable the remaining remediation tasks. Our estimated time to complete resolution is 18:00 UTC.

Next update will be at 15:00 UTC.
Posted Feb 09, 2024 - 14:00 UTC
Update
The current progress is on track and we'll continue to update on the hour throughout today.

Next Update 14:00 UTC
Posted Feb 09, 2024 - 13:00 UTC
Update
There continue to be no unexpected challenges while rectifying this incident and progress is ongoing.

Next update: 13:00 UTC
Posted Feb 09, 2024 - 12:12 UTC
Update
Progress is ongoing & we continue to have no unexpected challenges while rectifying this incident.

Next update: 12:00 UTC
Posted Feb 09, 2024 - 11:00 UTC
Update
Progress continues on rectifying the data synchronisation issue, with no unexpected challenges encountered so far.

Next update 11:00 UTC
Posted Feb 09, 2024 - 10:00 UTC
Update
We are initiating the process to resolve the data synchronisation issue that has impacted a select group of TLDs. Our technical team has begun the necessary work to address and rectify this matter efficiently.

We anticipate that this work will be completed within the next 6 hours. To keep you fully informed, we will be posting updates on our progress every hour. Our team is committed to resolving this issue as swiftly as possible and restoring normal service.

We appreciate everyone's patience and understanding as we work to rectify this situation. Stay tuned for our hourly updates.
Posted Feb 09, 2024 - 09:00 UTC
Update
We are committed to providing transparent and timely updates regarding the data synchronisation issue previously reported. Our team has identified the Top-Level Domains (TLDs) affected by this incident.

The impacted TLDs are as follows:

.ae.org, .art, .audio, .auto, .autos, .baby, .bar, .beauty, .best, .blog, .boats, .br.com, .build, .cam, .car, .cars, .ceo, .cfd, .christmas, .cn.com, .co.com, .co.gl, .co.nl, .college, .com.de, .com.gl, .com.se, .de.com, .diet, .eu.com, .fans, .flowers, .fm, .forum, .fun, .game, .gb.net, .gd, .gl, .gr.com, .guitars, .hair, .help, .homes, .host, .hosting, .hu.net, .icu, .in.net, .inc, .jp.net, .jpn.com, .lat, .lol, .london, .luxury, .makeup, .mex.com, .mom, .motorcycles, .online, .pics, .press, .protection, .pw, .qpon, .quest, .radio.am, .radio.fm, .rent, .rest, .ru.com, .ruhr, .sa.com, .sbs, .se.net, .security, .site, .skin, .space, .storage, .store, .tech, .theatre, .tickets, .uk.com, .uk.net, .uno, .us.com, .us.org, .vg, .website, .xyz, .yachts, .za.com

Rectification Plan & Schedule

Our technical team has a comprehensive plan to fully rectify the issue, scheduled to commence at 09:00 UTC on Friday, 9th February 2024. We will post an update when the work starts and will continue to update this status page hourly throughout the day. We aim to have all affected TLDs corrected as swiftly as possible.

We appreciate your patience and understanding as we work to rectify this matter and apologise for any inconvenience caused.
Posted Feb 08, 2024 - 20:08 UTC
Identified
Our technical team has identified an incident pertaining to a data synchronisation error that affected a small batch of domain names within our system.

It is imperative to emphasise that, subsequent to a thorough investigation, we have ascertained no domain names are at risk due to this incident.

However, we have recognised that some customers may experience discrepancies, including the absence of contact data for certain domain names, inaccurate contact details for others, and, in a very limited number of instances, domain names may not be visible within their accounts.

Our team is diligently working to rectify these anomalies and restore full accuracy and visibility to the impacted accounts. We sincerely apologise for any inconvenience this may cause and are committed to implementing measures to prevent such occurrences in the future.
Posted Feb 08, 2024 - 18:39 UTC
This incident affected: Data Sync.