Redesigning the alert system (IMS)
Revamped the product by redesigning the workflow of the incident management system, a core feature, and introducing a new onboarding process.




Overview
A brief intro about Cliff.ai and the project
Cliff.ai (B2B SaaS) is a business reliability tool that helps companies actively monitor their important business metrics, automatically catch unexpected spikes or dips in those metrics, and alert users. However, users encountered challenges with this core feature.
Recognizing this complexity, I took responsibility for simplifying the process so that our users would find the alert system easy to use.
I led and mentored the team through the end-to-end design process, from conception to launch.
Goal & Impact
What impact did we aim for?
Our Goal:
Optimizing for Users’ Time-Saving
Boosting Business Growth for both Users and Cliff.ai
The Impact:
64%
We reduced task completion time by 64%, which had a strong positive impact on business growth for both Cliff.ai and its users.
Problem & solution in a nutshell
A synopsis of the problem and solution
Users were not receiving alert notifications even after a metric crossed its threshold
During a festive season, a client's alert set at 65K visitors on their e-commerce platform failed to notify them, causing frustration. After this issue was reported, two more complaints followed, which prompted further scrutiny.
Introducing Monitors, Incidents, and Escalation Policies as the Solution
The old alert system involved setting threshold values and KPIs. Based on our subsequent research, we introduced Monitors, Incidents, and Escalation Policies:
Streamlining Monitors: Automated and customizable ‘Alert Rule Setup’
Old


New








Efficient Incident Management: Centralizing Insights with Date-Wise Incident Page
Old


New








Efficient Notification Workflow: Introducing Customizable Escalation Policies at Cliff.ai
New








Research & Analysis
Understanding the problem
We began our research by evaluating the existing feature's feasibility, diving into its fundamental aspects to understand the 'whats,' 'whys,' and 'hows.' We used the following research techniques:
Think-aloud sessions with two participant groups, one familiar with the product and the other unfamiliar
To gather diverse perspectives and valuable insights, we collaborated with the technical team and asked participants to create a new account, set an alert threshold value, and analyze the root cause of anomalies. Here are the outcomes:
85%
Well-acquainted participants were able to complete the task.
70%
Unfamiliar participants failed to complete the task.
User interviews with 7 participants including business owners and newer platform users
The key insights from the interviews:
The most common problem was that, even after setting up monitoring criteria, clients did not receive timely alerts.
Identifying whom to notify, and when, during incidents was a user pain point.
The intricate process of setting "Alert Rules" was unclear to users.
Users highlighted challenges in extracting crucial insights and conducting root cause analysis from the metrics.
Competitive analysis to learn from our competitors' successful features and identify opportunities to enhance our platform.




Changes we made after competitive analysis:
Nomenclature: Naming conventions are crucial for accessibility. Our team understood the existing names, but we realized they needed to be more user-friendly.
Hierarchy: We improved the system's overall hierarchy, restructured the information after brainstorming feature feasibility, and created a new user flow with the key steps (included later in this case study).
Incident List: We compiled all incidents into a single list to make them easier to manage.
HMW statement
How might we optimize our platform for efficient anomaly analysis, streamlined alert setup, and timely notifications, fostering business growth for both users and Cliff.ai?
Target users
The users we are aiming for
Our platform is designed for business and operations teams within organizations of all sizes. It caters to a wide range of users, including data analysts, site reliability engineers, product managers, executive leaders, and customer success managers.
Ideation
Creating the User Flow
The old user flow lacked a clear process for setting up thresholds and the alert system:




In the new user flow, users set the threshold value and then configure the alert system, step by step in sequence:




Mid-fi & User-testing
Mid-fi prototypes for user testing and evaluation
Based on research and competitive analysis outcomes, we divided our alert system into three parts: Monitors, Incidents & Escalation Policies. Here’s the first iteration we prototyped to share with the participants for user testing:








Monitor list & Monitor detail page








Incident list & Incident detail page








Escalation policy list & adding new policy page
User test results:
Primary feedback on our initial iterations included:
The incident list we designed would result in endless scrolling, because incidents occur continuously.
The list does not show enough detail for users to understand an incident.
The monitor list should include the type of each monitor.
Users want to access the details of incidents attributed to a specific monitor.
Users want to view the details of an escalation policy, including the assigned users or teams, chosen platforms, and specifics about streams and measures.
Solution in detail
The solution - Introducing ‘Incident Management System’
Based on the user-testing feedback, we started working on the screens. Let’s go deeper and view incidents, monitors and escalation policies in detail.
Monitors
We automated the Alert Rule setup process (Monitors) to ensure users receive notifications for anomalies, even without setting up a monitor themselves. Users can select rules, add streams, measures, and dimensions with thresholds for metrics.
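To make the idea of a monitor concrete, here is a minimal sketch of a threshold rule on one measure. It is an illustration only, not Cliff.ai's implementation: the `Monitor` class, its field names, the `check` function, and the lower bound of 1,000 are all assumptions, and dimensions are left out for brevity. The 65K upper bound echoes the visitor alert mentioned earlier.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Monitor:
    # All field names here are hypothetical, chosen for illustration.
    stream: str    # data source being watched
    measure: str   # metric within that stream
    upper: float   # flag a spike when the value rises above this
    lower: float   # flag a dip when the value falls below this


def check(monitor: Monitor, value: float) -> Optional[str]:
    """Return "spike", "dip", or None for a single observed value."""
    if value > monitor.upper:
        return "spike"
    if value < monitor.lower:
        return "dip"
    return None


# Assumed example values; only the 65K figure comes from the case study.
visitors = Monitor(stream="web_traffic", measure="visitors", upper=65_000, lower=1_000)
print(check(visitors, 70_000))
print(check(visitors, 30_000))
```

A value that trips either bound would become an incident in the flow described in this case study; anything in between is ignored.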




Monitor list page - list of all the monitors created
Now, opening a monitor displays comprehensive details, aiding analysis of top measures and dimensions with assigned responders. The heatmap visualizes past incidents on this monitor.




Monitor detail page
Incidents
At Cliff.ai, we lacked a central incident repository for users to quickly gather insights from metric incidents. Thus, we created a date-wise incident page to streamline this process.




Incident list page
The 'Incidents' screen displays a list of incidents based on monitor thresholds, organized by date. It serves as the central ‘hub’ where users can manage alerts and access insights, making incident information easier to reach. The heatmap illustrates resolved incidents over time:




Incident detail page
Escalation policy
Cliff.ai lacked prompt anomaly notifications, so we introduced an escalation policy for users to select recipients and specify details.
An escalation policy describes the following three things:
1. Who to deliver the notifications to.
2. In what order or at what interval the notifications should be delivered.
3. On which platform to deliver the notifications.




Escalation policies list page
Now comes the fun part: when an incident is generated, notifications are automatically sent to the assigned responders through the Default Escalation Policy. Users benefit from a hassle-free experience without any manual setup.
Users can customize the Escalation Policy for efficient incident management, defining rules for intervals and delivery platforms. If no response is received in the escalation queue, the actions can be repeated n times, where n is chosen by the user.
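The three parts of an escalation policy (who, order or interval, platform) plus the repeat count can be sketched as a small data structure. This is a hedged illustration, not the product's actual model: the class names, field names, recipients, and platforms below are hypothetical, and the sketch assumes that `repeat` counts extra passes through the steps and that a repeat pass starts right after the last step.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class EscalationStep:
    recipient: str       # who to notify (hypothetical user or team name)
    platform: str        # where to deliver, e.g. "email" or "slack" (assumed values)
    delay_minutes: int   # wait after the previous step before notifying


@dataclass
class EscalationPolicy:
    name: str
    steps: List[EscalationStep] = field(default_factory=list)
    repeat: int = 0      # assumed meaning: extra passes if nobody responds


def notification_schedule(policy: EscalationPolicy) -> List[Tuple[int, str, str]]:
    """List (minutes after incident, recipient, platform) for every
    notification sent when no one ever responds."""
    schedule = []
    elapsed = 0
    for _ in range(policy.repeat + 1):
        for step in policy.steps:
            elapsed += step.delay_minutes
            schedule.append((elapsed, step.recipient, step.platform))
    return schedule


policy = EscalationPolicy(
    name="Default Escalation Policy",
    steps=[
        EscalationStep("on-call analyst", "email", 0),
        EscalationStep("team lead", "slack", 15),
    ],
    repeat=1,
)
for entry in notification_schedule(policy):
    print(entry)
```

With these assumed values, the analyst is notified immediately, the team lead 15 minutes later, and the same two steps run once more because `repeat` is 1.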




Adding/Editing an escalation policy
Key learnings
My learnings after leading the project
Planning the roadmap: Running the design process efficiently was a big challenge, especially because I had been given ownership of such a large project. I ensured that everyone involved, including project managers, stakeholders, interns, and engineers, was looped into each design decision I intended to make.
Mentoring: Involving the junior team members in every design decision, to maximize their participation and learning, was one of my top priorities on this project. It sharpened my leadership skills.
Giving feedback: Taking feedback is easier than giving feedback on someone else’s designs. This project taught me to give feedback without overwhelming the other person's thought process.
Audits: Just as we iterate while designing, developers should iterate after audits, since nothing can be built perfectly in one go.
That's a wrap! I hope you enjoyed it ✨
More case studies


Building a Design System for a SaaS Product
Design system case study • From scratch • Research • Strategy • 2022 • SaaS
Unveiling the journey of building a robust design system for a SaaS product for Quantive.


The Modern Data Stack
Product design case study • Research • 2021 • Technology
Curated repository of invaluable resources to empower individuals in mastering data stack best practices, and making informed choices for their specific use cases.


IMENCO
UX/UI Design case study • Research • 2022 • Sustainability
Encouraging and helping individuals to live a more sustainable life by consuming energy mindfully, saving on bills, and lowering their carbon footprint.
Like what you see? Let’s chat!
Contact me: