Client: ClevAir

Data classification in energy technologies

Case study details


ClevAir delivers a smart system that manages a building’s energy consumption, automates its maintenance and operations, revitalizes its climate, and offers all the insights needed to run it even better. The system reduces energy costs and minimizes the building’s negative impact on the environment.



Challenge


Let’s consider data cleaning, the task of generating clean data by using transformation and validation rules. To automate it, we have to detect the data types of a given dataset and infer the most appropriate rules to apply.

For instance, knowing that the first two columns of the table contain country names and their capitals significantly simplifies the job of correcting any errors or detecting and filling in missing values in these columns.
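
To make the country and capital example concrete, here is a minimal sketch of what becomes possible once the semantic types of the two columns are known. The `CAPITALS` lookup and the `clean_row` helper are hypothetical names introduced only for illustration; a real system would rely on a complete reference dataset.

```python
from difflib import get_close_matches

# Hypothetical reference data; a real system would use a full gazetteer.
CAPITALS = {"Norway": "Oslo", "Poland": "Warsaw", "France": "Paris"}


def clean_row(country, capital):
    """Repair a (country, capital) pair once the column types are known."""
    # Snap a misspelled country to the closest known name, if one is close enough.
    if country is not None and country not in CAPITALS:
        match = get_close_matches(country, list(CAPITALS), n=1, cutoff=0.8)
        country = match[0] if match else country
    # Fill a missing capital, or replace one that disagrees with the reference.
    expected = CAPITALS.get(country)
    if expected is not None and capital != expected:
        capital = expected
    return country, capital


rows = [("Norway", None), ("Polnd", "Warsaw"), ("France", "Pariss")]
cleaned = [clean_row(country, capital) for country, capital in rows]
```

Without the knowledge that the columns hold countries and capitals, none of these repairs could be chosen automatically, which is why type detection comes first.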



Approach


The Addepto team built and trained a model on a sufficient number of real-world semantic type examples. This kind of model is robust to dirty data and scales well. Its predictive performance exceeds dictionary and regular expression benchmarks and allows companies to detect data types in the source data and transform them accordingly.



Goal


ClevAir was looking for a way to automate and speed up data cleaning and labeling processes.



Outcome


Addepto delivered semantic data type detection for the datasets.


Challenge

Develop and Deploy a Framework for Data Cleansing and Labeling of Acquired Data


The company was looking for a partner able to build and implement a system to clean and label collected data.

ClevAir operates in the growing “smart” sector and delivers an intelligent system that manages a building’s energy consumption and automates its maintenance and operations. The company collects and analyzes vast amounts of varied data to optimize energy usage.

The optimization part was already covered and worked excellently. The problem was that the data arrived disordered and unlabeled, and the labeling process was handled manually.


Labeling process was handled manually


The lack of automation in this area was significantly slowing down the company’s customer acquisition. All of the data, and there is a lot of it, first had to be gathered and then labeled. The process had to be repeated from scratch for every new client, which delayed the actual optimization work.


Approach

Creating Semantic Data Type Detection for Datasets


During project development, our team took the following steps:


  • To automate the data cleaning process, the Addepto team had to detect the data type of each data point in a given dataset and infer the most appropriate rules to apply.
  • With that approach in mind, the team built and trained a model on a sufficient number of real-world semantic type examples. This kind of model is robust to dirty data and scales well.
  • The predictive performance of the model exceeds a benchmark of tailor-made decision rules and allows companies to detect data types in the source data and transform them accordingly.
  • Traditional models such as decision trees are faster. However, that speed comes at the cost of predictive performance and storage requirements, so the Addepto team chose a different approach and implemented it in an AWS testing environment.
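
The idea of learning semantic types from labeled examples, rather than hand-writing rules, can be sketched roughly as follows. This is not Addepto's model: it is a deliberately tiny nearest-centroid classifier over character statistics, and the type names, example columns, and the `featurize`, `train`, and `predict` helpers are all assumptions made for illustration.

```python
import math


def featurize(values):
    """Summarize a column as a few simple character statistics."""
    text = [str(v) for v in values if v is not None]
    chars = "".join(text)
    n = max(len(chars), 1)
    return [
        sum(c.isdigit() for c in chars) / n,      # share of digits
        sum(c.isalpha() for c in chars) / n,      # share of letters
        sum(not c.isalnum() for c in chars) / n,  # share of punctuation and spaces
        len(chars) / max(len(text), 1) / 20.0,    # mean value length, scaled
    ]


def train(labeled_columns):
    """Average the feature vectors of each semantic type into one centroid."""
    sums, counts = {}, {}
    for label, values in labeled_columns:
        feats = featurize(values)
        acc = sums.setdefault(label, [0.0] * len(feats))
        for i, f in enumerate(feats):
            acc[i] += f
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(centroids, values):
    """Return the semantic type whose centroid is closest to the column."""
    feats = featurize(values)
    return min(centroids, key=lambda label: math.dist(feats, centroids[label]))


# Hypothetical labeled training columns.
examples = [
    ("temperature", ["21.5", "19.8", "22.1"]),
    ("timestamp", ["2021-03-01 12:00", "2021-03-01 12:15"]),
    ("room_name", ["Meeting room", "Lobby", "Office"]),
]
centroids = train(examples)

# A dirty column: one value is a placeholder instead of a number.
dirty_guess = predict(centroids, ["20.4", "n/a", "18.9", "21.0"])
```

Because the prediction is based on statistics of the whole column, a single malformed value such as `n/a` shifts the features only slightly, which hints at why learned detectors can tolerate dirty data better than a strict per-value rule.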

Goal

ClevAir was looking for a way to automate and speed up data cleaning and labeling processes


Addepto delivered semantic data type detection in the datasets, which is robust to dirty data and scales well.

The predictive performance allows companies to detect data types in the source data and transform them accordingly.



Outcome

Automated Data Type Detection and Transformation - Outcome


Now the company’s software is able to automatically detect data types and transform data accordingly, for example using a map to visualize the value pairs.
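
As a rough illustration of transforming data according to its detected type, the sketch below dispatches each column to a converter chosen by its semantic type. The type names, the timestamp format, and the `TRANSFORMS` table are assumptions for the example, not details of ClevAir's software.

```python
from datetime import datetime


def to_float(value):
    """Parse a numeric reading; unparseable values become None."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def to_timestamp(value):
    """Parse a timestamp in an assumed 'YYYY-MM-DD HH:MM' format."""
    try:
        return datetime.strptime(value, "%Y-%m-%d %H:%M")
    except (TypeError, ValueError):
        return None


# Hypothetical mapping from detected semantic type to its transformation.
TRANSFORMS = {"temperature": to_float, "timestamp": to_timestamp}


def transform_column(semantic_type, values):
    """Apply the converter for the detected type; unknown types pass through."""
    convert = TRANSFORMS.get(semantic_type, lambda v: v)
    return [convert(v) for v in values]


readings = transform_column("temperature", ["20.4", "n/a", "18.9"])
```

Keeping the converters in a lookup table means a newly supported type only needs one new entry, without touching the code that walks the dataset.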

With automation, the company can speed up its client acquisition process, because the pre-optimization tasks of collecting and preparing data run smoothly and without errors. A new client doesn’t have to wait for business insights and knows from the very beginning the value they can get.



Before


  • Manual data cleaning and labeling process
  • Slow business growth and lack of scalability


After


  • Automatic data cleaning process
  • Faster acquisition of new customers
  • Business scalability

About Addepto


Addepto is a fast-paced, growing company focused on innovations in AI-related and data-oriented areas.





We support businesses in the energy technology sector in their digital transformation, helping them find ways to use their data with technologies such as machine learning and data classification.


About us


We are recognized as one of the best AI, BI, and Big Data consultants


We have helped multiple companies achieve their goals but, instead of making hollow marketing claims here, we encourage you to check our Clutch score.










