Let’s discuss the first phase of CRISP-DM: Business Understanding. Recall that CRISP-DM stands for the “CRoss Industry Standard Process for Data Mining” and it’s a six-phase process for organizing and iterating through a data project. Feel free to check out my previous posts where we discuss Why CRISP-DM is a Data Scientist’s Secret Weapon and What is CRISP-DM, Anyway?
Table of Contents
- What is the Business Understanding phase of CRISP-DM?
- 1. Assess the situation
- 2. Understand the business objectives
- 3. Determine how you are going to measure success
- 4. Establish specific data mining goals
- 5. Write a project plan
- Risks if Business Understanding step is skipped or rushed
- Benefits of a Well-Executed Business Understanding Phase
- Conclusion
What is the Business Understanding phase of CRISP-DM?
Before doing any modeling or analytics work, the first and most crucial part of a data project is developing business understanding. You need to be able to define the business goals and expected impact of the data project. You need to be able to answer questions like “What is the current state of the problem and solution?”, “What is the goal?”, “Who is involved?”, “Who is affected by the outcome and how?”, “What is the expected future state?”, “What is the expected ROI and how will it be measured?”, among others.
This first phase of the project includes indispensable research that will guide your data prep, modeling, evaluation, and deployment efforts.
The image below shows the original diagram for CRISP-DM with subtasks and outputs. It consisted of a 4-step process: Determine Business Objectives, Assess the Situation, Determine Data Mining Goals, and Produce a Project Plan.
I have separated the first subtask, Determine Business Objectives, into two tasks: Understand the business objectives, and Determine how you are going to measure success.
With that in mind, the Business Understanding phase can be broken down into a 5-step process:
- 1. Assess the situation
- 2. Understand the business objectives
- 3. Determine how you are going to measure success
- 4. Establish data mining goals
- 5. Write a project plan
Keep reading for ideas on what actions should be taken during each of these steps to get the project off on the right foot.
1. Assess the situation
Basically, by understanding the bigger picture, the lay of the land, you can find answers that have real business impact. But you need to understand the business first!
How do you assess the situation?
If you are new to a team or organization, it may be daunting to begin a new project.
You don’t know what you don’t know
Some wise person
In English, we often say to remember the 5 W’s (and an H): Who?, What?, Why?, When?, Where?, and How? But we can get a little more specific. To gain the information we need, we should have several conversations to lay the groundwork at the beginning of a project. Some teams will have kick-off meetings run by a project manager (PM). In an ideal world, your PM will already understand what the data team needs to know to be successful. However, because data projects are new to some organizations, the data team may need to advocate for itself to develop a culture around this. With that in mind, the data team should take the lead in the following ways:
- Have conversations with stakeholders
- Determine what success looks like
- Ask high-level technical questions
- Address some logistics
Developing situational awareness of the business problem also involves understanding the people and roles involved, so get familiar with the business structure through organizational charts. Learn the project groups and business units that will be involved or affected by the data project.
Additionally, begin documenting what you discover. This information will be the beginning of your project documentation and will be a reference for others on the data team. You may come back to many of these questions again. Don’t worry about making changes and updating things as you go. CRISP-DM is meant to be flexible, and your understanding of the business should evolve as you learn! Likewise, the business may evolve its vision and goals over time.
2. Understand the business objectives
Critically, you must understand the business problem. What area of the business does it affect, and what are the motivations? Has there been any other data mining effort for this business problem? How familiar is the business unit with Data Science?
Also, it’s important to understand the current state of the solution to the problem. Is there already a process in place to address the problem? What are its advantages and disadvantages? Who uses it? Is it automated? If not, how many hours per week or month are spent on it?
You may consider doing a literature review or industry review regarding how others in the same industry are solving this problem. Is there a standard developed? Is there machine learning research being done in this area–what methods are well-accepted versus state-of-the-art?
Are there compliance requirements, industry standards, or laws (e.g. GDPR) to keep in mind? This is an excellent time to discuss data ethics as well.
At this step in the first phase, it is crucial to get a more detailed inventory of the resources: hardware, software, and data. This is an additional logistical step and should be well documented. However, I believe it is important to understand the business problem before talking about data sources, because at this stage you need to be able to discuss whether you will need additional data, either purchased or generated. Will you need any other datasets from within the organization? Without understanding the business problem, you may not know what data will be necessary.
3. Determine how you are going to measure success
Finally, a crucial part of the Business Understanding phase is to establish how each business objective will be measured. What does “done” look like? What are the key performance indicators and/or metrics that will be used to evaluate the success or effect of the data mining effort? Are there objective and subjective measures? Document the metrics that will be used for every business objective.
This would be a great time in the project to discuss measuring ROI. If the goal is to reduce customer churn, for example, what is the dollar value of retaining x customers who would otherwise have churned? If the goal is to reduce downtime of vehicles in a fleet, determine whether this value is currently measured and how it can be measured so that the metric can be tracked.
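To make the churn example concrete, here is a minimal sketch of a back-of-the-envelope ROI estimate. The function name, the simple ROI formula (net benefit divided by cost), and every number below are illustrative assumptions, not figures from a real project; your organization may define ROI differently.

```python
def churn_reduction_roi(customers_retained, annual_value_per_customer, project_cost):
    """Estimate the gross benefit and simple ROI of a churn-reduction project.

    All inputs are assumptions that stakeholders would need to agree on.
    """
    # Dollar value of the customers who would otherwise have churned
    benefit = customers_retained * annual_value_per_customer
    # Simple ROI: net benefit relative to what the project cost
    roi = (benefit - project_cost) / project_cost
    return benefit, roi


# Hypothetical inputs for illustration only
benefit, roi = churn_reduction_roi(
    customers_retained=200,
    annual_value_per_customer=1_200.0,
    project_cost=150_000.0,
)
print(f"Benefit: ${benefit:,.0f}, ROI: {roi:.0%}")  # Benefit: $240,000, ROI: 60%
```

Even a rough calculation like this is useful in stakeholder conversations, because it forces agreement on which quantities are actually measured today and which still need to be tracked.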
Explore what your organization expects to gain from data mining. Try to involve as many key people as possible in these discussions and document the results.
IBM SPSS CRISP-DM Documentation
4. Establish specific data mining goals
Before moving on to the next phase, the data team should have a clear idea of what type of problem they are solving, such as clustering, prediction, or classification. Including a clear numerical goal is also helpful. For example, the team wants to predict component failure at least 1 week before catastrophic failure occurs.
These types of prediction horizons and thresholds can be related to risk and business processes, so the stakeholders are integral to deciding where the thresholds should be.
For example, in predictive maintenance, the risk tolerance of the organization and the standard operating procedures will affect how much warning time is needed before a predicted failure. If a component on a long-haul truck is predicted to fail next week, is 1 week enough lead time to make sure that truck is off the road and getting repaired before the failure? Or does the fleet manager actually need 2 weeks’ notice in order to plan the repair and not disrupt operations?
Additionally, during this phase, the data team should be discussing the metrics and benchmarks that will be used to assess the model. The team should also consider what deployment will look like for this application.
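The lead-time question can be written down as a simple check, which helps turn the stakeholder conversation into an explicit requirement. This is only a sketch: the function name and the dates are hypothetical, and the required notice is exactly the threshold the stakeholders would need to decide.

```python
from datetime import date, timedelta


def has_enough_lead_time(predicted_on, predicted_failure, required_notice):
    """Return True if the warning arrives at least `required_notice` before the failure."""
    return (predicted_failure - predicted_on) >= required_notice


# Hypothetical dates: the model raises a warning one week before the predicted failure
predicted_on = date(2024, 3, 4)
predicted_failure = date(2024, 3, 11)

# The same prediction passes a 1-week requirement but fails a 2-week requirement
print(has_enough_lead_time(predicted_on, predicted_failure, timedelta(weeks=1)))  # True
print(has_enough_lead_time(predicted_on, predicted_failure, timedelta(weeks=2)))  # False
```

The same model output is acceptable under one business rule and unacceptable under another, which is why the threshold belongs in the data mining goals rather than being left to the data team alone.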
5. Write a project plan
After working through steps 1 through 4 of this first phase of the data mining process, you should have enough information to create an initial project plan. Some good things to include in a data project plan:
Risks if Business Understanding step is skipped or rushed
Benefits of a Well-Executed Business Understanding Phase
Getting to know the business reasons for your data mining effort helps to ensure that everyone is on the same page before expending valuable resources.
IBM SPSS CRISP-DM Documentation
Clearly defining a team’s purpose and then setting goals that move the team forward in accomplishing that purpose can be a powerful elixir for strengthening the bond of a team. The real value comes in though once the goals are set and the team steadily works together to achieve them. Solving problems, working through conflict and experiencing progress as a team is a great morale builder
Conclusion
There is clearly a lot of ground to cover during the Business Understanding phase of a data project. This step is crucial for giving direction to the data team, for deciding how results will be evaluated, and for planning a good deployment of the final product, whatever that might look like.
This step is also the easiest to skip in practice. Data teams are often eager to dive into the data, to begin exploring, modeling, and gleaning insights. But without the due diligence of the Business Understanding step, the project runs the risks listed above.
Hopefully, this guide has given you a clear idea on the conversations that need to happen and many questions that should be asked at the beginning of a data project.