We don’t often talk about estimating data science services and projects, and there is a good reason for that: it’s very difficult and depends on many factors. In many instances, you simply cannot foresee how much your AI or data science project will cost or how long it will take to finish. Still, we decided to tell you more about data science project estimation so that you have a general idea of what to expect.
Generally speaking, data science project estimation is so difficult because no two data science projects are identical. Each one is unique or has some unique features. Of course, this doesn’t mean that you shouldn’t look for benchmarks. You should, and that is the first thing you have to think about:
Data science project estimation: Benchmarks
If this is not your first data science project, your situation is much more straightforward. You can simply analyze your past projects and see how they turned out in terms of time and money. Things are a bit more complicated if this is your first go at AI or data science. In such a situation, you have a couple of options:
• You can ask the agency you’re working with for benchmarks or references so that you can verify how other projects went.
• You can look for other entrepreneurs who had some data science projects in the past and talk to them. The easiest way to find them is through LinkedIn, but you can also look for data science associations.
This way, you should be able to get some useful information. Now, remember a simple yet important rule: it’s easier to estimate smaller projects. If it’s your first attempt to adopt data science, start small. A small project is one planned for two to three months, not longer. Validating and estimating extensive data science projects upfront is close to impossible.
Data science project estimation: Divide your project into stages
Here’s the next important thing. No matter how big your data science project is, it can be divided into at least four fundamental stages. Estimating each stage’s cost and time will be much easier because you start every step with the knowledge from the previous one.
These four stages that we talk about here are as follows:
STAGE 1: RESEARCH AND DISCOVERY
Typically, this stage takes up to two weeks. Here, you primarily gather information. You need to know what you already have at your disposal, verify your data repositories and IT infrastructure, and compare them with your needs and requirements. You’ll most likely have to discuss many questions with your data science team, but it’s necessary for the project’s success. The most important question is always the same: “Can we solve our problem(s) with data science?” Only when the answer is yes can you move on to further arrangements.
STAGE 2: EXPLORATION/POC
You can expect this phase to take at least a month. If your data science project involves some AI/ML elements, this is a good moment for a POC (proof-of-concept). You have to make sure that your idea will work in the real world in the way you expect. Otherwise, it could turn out that you’ve spent a lot of time and money on a project that doesn’t work the way you expected.
In short, the second stage is all about ensuring that your vision, although initially confirmed in the first stage, can really be executed with the means and resources at your disposal.
THE ROLE OF GROUNDWORK
Every data science project starts with data; that’s obvious. But we have to emphasize that, in many situations, the client’s data sets and data sources leave much to be desired. If your data is not cleaned and organized properly, its usefulness for data science is limited. That’s why data science companies spend the vast majority of their time fixing and improving datasets. This part of the work is frequently overlooked in data science project estimations, and that’s a huge mistake! After all, this is where the real work happens!
Every data science project needs enough clean data in order to work accurately. In other words, you have to “feed” your algorithms with decent datasets so that they have enough material to work on. And here, insufficient data quality is just one of many problems. More often than not, companies we work with simply don’t have enough useful data. In such a situation, the project immediately becomes far more complicated (and therefore more costly) because we have to start by organizing everything so that data science algorithms can do their job. And this groundwork can take weeks.
Your data science team has to get satisfactory answers to several questions:
• What relevant data is available? Can we get it easily?
• How complete is the client’s data? Is it accurate and reliable?
• What can and can’t we do with it?
• What kind of data do we have access to? How is it stored, and in what format(s)?
These questions matter not just for the project itself but also for estimating the time and cost needed to finish it. If this is your first data science project, we will most likely have to spend some time getting answers to these questions. And that’s the whole problem with data science: the vast majority of the work happens behind the scenes, and it’s often difficult to explain and justify everything that’s “invisible” to the client’s eyes.
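As a rough illustration of the completeness question above, here is a minimal Python sketch that reports the share of missing values per column. The `missing_share` helper and the sample records are hypothetical and invented for this example; a real check would run against your own data sources and would likely cover accuracy and format as well.

```python
from typing import Any


def missing_share(rows: list[dict[str, Any]]) -> dict[str, float]:
    """Return the share of missing values (absent, None, or empty string) per column."""
    columns = {key for row in rows for key in row}
    total = len(rows)
    report: dict[str, float] = {}
    for column in sorted(columns):
        # A value counts as missing if the key is absent, None, or an empty string.
        missing = sum(1 for row in rows if row.get(column) in (None, ""))
        report[column] = missing / total if total else 0.0
    return report


# Hypothetical sample records, invented for illustration only.
sample = [
    {"customer_id": 1, "country": "PL", "revenue": 120.0},
    {"customer_id": 2, "country": "", "revenue": None},
    {"customer_id": 3, "country": "DE", "revenue": 80.0},
    {"customer_id": 4, "revenue": 45.5},
]

report = missing_share(sample)
for column, share in report.items():
    print(f"{column}: {share:.0%} missing")
```

A report like this does not replace the discovery work, but it can give both sides an early, shared picture of how much groundwork the data may need.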
STAGE 3: DEVELOPMENT
Once we have all the yeses we need, we can start working on your project. Medium-sized projects will take at least three months. At this stage, we set everything up and get it running. This involves cleaning and organizing your data, providing the necessary infrastructure (e.g., data warehouses), and organizing all the data-related processes and procedures. At this point, we also provide clients with a dashboard that gives them quick and easy access to their data so they can analyze it freely.
In this stage, many data science teams work in so-called sprints. These are usually two-week-long blocks of work. After each sprint, the team and the client assess the results and decide where to go next.
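To get a feel for how many review points a development stage involves, the sprint arithmetic can be sketched in a few lines of Python. This is only an illustration: the `sprints_needed` helper is hypothetical, and treating three months as roughly 13 weeks is an assumption made for the example.

```python
import math

SPRINT_WEEKS = 2  # from the text: sprints are usually two weeks long


def sprints_needed(stage_weeks: int, sprint_weeks: int = SPRINT_WEEKS) -> int:
    """Number of sprints needed to cover a stage, rounding a partial sprint up."""
    if stage_weeks < 0 or sprint_weeks <= 0:
        raise ValueError("stage_weeks must be >= 0 and sprint_weeks must be > 0")
    return math.ceil(stage_weeks / sprint_weeks)


# Assumption: a three-month development stage is treated as roughly 13 weeks.
development_weeks = 13
review_points = sprints_needed(development_weeks)
print(f"{development_weeks} weeks -> {review_points} sprints, each ending with a review")
```

Each sprint review is also a natural moment to revisit the estimate for the remaining work.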
STAGE 4: IMPROVEMENTS AND MAINTENANCE
For obvious reasons, this stage is not limited in time. It all depends on how long you’re going to use your data science solution, and it can last for the whole life of your company. This stage is vital primarily because your company’s needs will change over time. Once you start with data science and see how beneficial it is, you will likely want to achieve more and go further. Professional data science companies such as Addepto will provide you with the necessary support all the way. As a result, you will be able to keep your data science application working exactly the way you want, even years after deploying its first version.
Data science project estimation: Decide what you really need
You have to be aware that in data science, a lot depends on your expectations. What do we mean? Let’s use a simple example. You want to build a predictive algorithm. The first version hits 95% accuracy. Will that be sufficient for you? If yes, your project is done, case closed. If not, though, data scientists and machine learning specialists will have to spend more time working on this project.
And perhaps they will achieve higher accuracy (maybe they will try to get better data or find new data sources), but you have to understand that it will cost more money and take more time. Here, your point of view is the crucial one. You, as a client, have to be satisfied with your data science project. If reducing the remaining 5% error isn’t worth the extra time and money, you can decide to close the project at this point. It all really depends on what you are going to use your project for.
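One simple way to frame this decision is to compare the value you expect from extra accuracy with the cost of getting it. The Python sketch below is a deliberately simplified illustration: the `net_benefit` function and all the figures are hypothetical, and it ignores the risk that the target accuracy is never reached.

```python
def net_benefit(
    current_accuracy_pct: float,
    target_accuracy_pct: float,
    value_per_point: float,
    extra_cost: float,
) -> float:
    """Expected value of the accuracy gain minus the cost of achieving it."""
    gain_points = target_accuracy_pct - current_accuracy_pct
    return gain_points * value_per_point - extra_cost


# Hypothetical figures: each extra percentage point is assumed to be worth 10,000,
# and the additional work is assumed to cost 30,000.
result = net_benefit(95.0, 97.0, value_per_point=10_000, extra_cost=30_000)
decision = "continue" if result > 0 else "close the project"
print(f"Net benefit: {result:,.0f} -> {decision}")
```

With these made-up numbers the extra work would not pay off, but the same comparison can come out the other way when each accuracy point is worth more to your business.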
Do you really understand your future project?
More often than not, an incomplete understanding of the project is a source of serious misunderstandings that end in the client’s dissatisfaction. Make sure you thoroughly understand every element of your future data science project, especially the expected outcome. If you need to, get more knowledge and do some digging. Don’t take someone else’s word for it. You have to understand what you’re paying for, and you have to be satisfied with the end result. Every decent data science company fully understands that and pays a lot of attention to answering clients’ questions and dispelling doubts.
As you can see, estimating a data science project is far trickier than people would think. That’s why AI and data science companies typically have no “pricing” sections on their websites (the only exception is SaaS companies that sell ready-made cloud-based solutions). It’s important to do thorough research and assess your current situation and needs before you invest more money or time in the project.
If you’re looking for a company that’s transparent about time and money related to data science services, reach out! At Addepto, we help companies all over the world implement data science and AI-related projects. Let’s see what we can do together!