Decommissioning data centers in a cloud system

Sam, a storage administrator, manages a private cloud system with data centers in Germany and the U.S.A. For the past few months, he has observed that data movement between these data centers is consistently slow. Customers who use this system are furious about the delay. Because most customers are in the U.S.A., he has been ordered to move all the data from Germany to the U.S.A. and shut down (decommission) the German data center.

How can Sam move all the data safely? 

Methods used

User interviews

Affinity mapping

User flow explorations

Rapid prototyping

Usability testing

SUS score

Understanding the problem

To understand the complete picture, I interviewed eight storage administrators who handle data centers of various sizes. I also spoke with subject matter experts to dissect data movement policies and map the relationships between servers, data centers, and the cloud system.

 

These are some quotes and high-level themes derived from user interviews.

Exploring user flows to solve the problem

With the user and business goals in mind, I collaborated with subject matter experts to explore nine user flows to solve the problem. Multiple scrum teams were developing the decommissioning process, each focusing on a different part of the cloud system. Presenting an end-to-end flow helped all stakeholders align towards the same goal. We then converged on one user flow based on cost-benefit analysis, technical feasibility, and time to release the product. The final user flow had the following six steps:

 

  1. Select the data center to be decommissioned
  2. View servers of the data center to be decommissioned and details of other data centers in the cloud system
  3. Revise the active data management policy to remove any dependencies on the site to be decommissioned
  4. Remove any other references to the site to be decommissioned from other data management rules
  5. Resolve any server conflicts and references in high-availability groups
  6. Monitor decommissioning

Aligning with user expectations

After choosing a feasible user flow based on user expectations, I started designing interfaces that align with the aforementioned high-level themes. 

Guidance

  • Included guardrails to prevent the user from starting a futile decommission.
  • Evaluated multiple explorations of how to summarize the situation and guide the user when the data from the site being decommissioned cannot be accommodated by the other data centers.
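
The capacity guardrail can be illustrated with a minimal sketch. It assumes each data center reports its used and total capacity in terabytes, and it ignores replication and placement policies that a real system would have to consider. The names DataCenter and check_capacity and all numbers are hypothetical and are not taken from the product.

```python
from dataclasses import dataclass


@dataclass
class DataCenter:
    # Hypothetical model: capacity figures in terabytes.
    name: str
    used_tb: float
    capacity_tb: float

    @property
    def free_tb(self) -> float:
        return self.capacity_tb - self.used_tb


def check_capacity(target: DataCenter, others: list[DataCenter]) -> float:
    """Return the shortfall in TB; 0.0 means the data fits elsewhere."""
    free = sum(dc.free_tb for dc in others)
    return max(0.0, target.used_tb - free)


# Illustrative numbers only.
germany = DataCenter("Germany", used_tb=40.0, capacity_tb=100.0)
usa = DataCenter("U.S.A.", used_tb=70.0, capacity_tb=100.0)

shortfall = check_capacity(germany, [usa])
if shortfall > 0:
    # Guardrail: block the flow and tell the user how much space is missing.
    print(f"Cannot start: {shortfall:.1f} TB would not fit in the remaining data centers")
```

Returning the shortfall rather than a plain yes/no gives the interface something concrete to summarize when the decommission cannot start.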

Control

  • Identified data management rules that refer to the site to be decommissioned so that the user can edit them. Provided suggestions to create a future-proof policy.
  • Ensured that the user removes unused rules and policies within the flow to prevent any confusion.
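
As a rough sketch of how the rules that need attention might be identified, the example below separates rules that reference the site into those in the active policy (to be revised) and the rest (to be cleaned up). It assumes a rule is simply a name plus a list of sites; Rule, split_rules, and the sample rule names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    # Hypothetical model of a data management rule.
    name: str
    sites: list[str]
    in_active_policy: bool


def split_rules(site: str, rules: list[Rule]) -> tuple[list[Rule], list[Rule]]:
    """Return (active-policy rules, other rules) that reference the site."""
    referencing = [r for r in rules if site in r.sites]
    active = [r for r in referencing if r.in_active_policy]
    other = [r for r in referencing if not r.in_active_policy]
    return active, other


# Illustrative rules only.
rules = [
    Rule("two-copies", ["Germany", "U.S.A."], in_active_policy=True),
    Rule("archive", ["Germany"], in_active_policy=False),
    Rule("local-only", ["U.S.A."], in_active_policy=True),
]

active, other = split_rules("Germany", rules)
print("Revise in active policy:", [r.name for r in active])
print("Remove references from:", [r.name for r in other])
```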

Flexibility

  • Recognized servers in other data centers that are either disconnected or administratively down. The user can fix these issues at any point and come back to the flow to continue the process. 

Transparency

  • The user is shown a visualization that depicts data moving out of the site being decommissioned and the increase in data in other data centers.
  • The decommissioning progress can be optionally seen both at a server level and at a system level to help the user dive into granular details.
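
The two levels of progress can be sketched as follows, assuming progress is measured as data moved off each server and that the system-level figure is weighted by data size. The function names, server names, and numbers are hypothetical.

```python
def server_progress(moved_gb: float, total_gb: float) -> float:
    """Percentage of one server's data that has been moved."""
    return 100.0 if total_gb == 0 else 100.0 * moved_gb / total_gb


def system_progress(servers: dict[str, tuple[float, float]]) -> float:
    """Percentage across all servers, weighted by the amount of data."""
    moved = sum(m for m, _ in servers.values())
    total = sum(t for _, t in servers.values())
    return 100.0 if total == 0 else 100.0 * moved / total


# Illustrative values: server name -> (moved GB, total GB).
servers = {"server-1": (50.0, 100.0), "server-2": (300.0, 300.0)}

for name, (moved, total) in servers.items():
    print(f"{name}: {server_progress(moved, total):.1f}%")
print(f"system: {system_progress(servers):.1f}%")
```

Weighting by data size keeps the system-level number honest when one large server lags behind several small ones.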

Testing and iteration

Tested the user flow with customers by providing a prototype connected to a dummy private cloud infrastructure. Iterated quickly based on user feedback to include a breakdown of database update progress. Overall, users found the flow intuitive, as reflected in the qualitative user research and the average SUS (System Usability Scale) score of 85.
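
For readers unfamiliar with SUS, the sketch below shows the standard scoring: ten items rated 1 to 5, where odd-numbered items contribute (rating - 1), even-numbered items contribute (5 - rating), and the sum is multiplied by 2.5 to give a score from 0 to 100. The responses are made up for illustration and are not the data from this study.

```python
def sus_score(responses: list[int]) -> float:
    """Score one participant's ten SUS ratings (each from 1 to 5)."""
    if len(responses) != 10 or any(r < 1 or r > 5 for r in responses):
        raise ValueError("SUS needs exactly ten ratings between 1 and 5")
    total = 0
    for index, rating in enumerate(responses):
        if index % 2 == 0:
            # Items 1, 3, 5, 7, 9 are positively worded.
            total += rating - 1
        else:
            # Items 2, 4, 6, 8, 10 are negatively worded.
            total += 5 - rating
    return total * 2.5


# Hypothetical participants, not the study's responses.
participants = [
    [5, 1, 5, 2, 4, 1, 5, 1, 4, 2],
    [4, 2, 4, 2, 4, 2, 4, 2, 4, 2],
]
scores = [sus_score(p) for p in participants]
average = sum(scores) / len(scores)
print(scores, average)
```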

Final thoughts

Although storage admins were nervous about losing data during decommissioning, they regained their confidence after using the aforementioned user flow. This is primarily because the user flow broke a complex procedure into parts and progressively disclosed bite-sized information.

 

My favorite part of this project was the process of exploring multiple user flows. We leveraged these flow charts to align stakeholders across multiple teams towards the same goal.