A leading multinational IT provider serving the air transport industry partnered with the European Union to explore new market opportunities beyond traditional aviation, recognizing that passenger journeys don’t end at the airport gate. Working with the Client, we developed an Intermodal Data Platform that unifies aviation, maritime, and rail data into a single system.
Built on Databricks as a scalable data mesh architecture, the platform enables proactive disruption management across entire multi-modal journeys while establishing the infrastructure for future AI and machine learning applications.
The platform now powers real-time operational decisions for transportation operators in Athens, with the technical foundation ready to expand across European cities and support increasingly sophisticated intelligent capabilities.
Meet Our Client
Our client is a multinational information technology company providing comprehensive IT and telecommunications services to the air transport industry. Serving airlines, airports, ground handlers, and governments worldwide, they deliver solutions for passenger processing, baggage handling, aircraft operations, and more.
Challenge
Fragmented Data Across Transportation Modes
Aviation, maritime, and rail data existed in completely separate systems with no integration. Airports managed their terminals effectively but had zero visibility into external factors – train delays, port congestion, strikes blocking access routes. Connected journeys combining multiple modes (flight + train, flight + cruise) were invisible to operators, despite being increasingly common. Each transportation provider operated in a data silo.
Reactive Instead of Proactive Disruption Management
Operators learned about disruptions only after passengers had already missed connections. A delayed flight meant a missed cruise departure, but port operators had no advance warning to coordinate. Without cross-modal communication, each operator worked in isolation, unable to prevent cascading disruptions or take preventive action like deploying additional shuttles or opening extra gates.
No Foundation for Scalable Intelligence
Each transportation mode used different data standards, update frequencies, and even conflicting definitions of basic concepts like “delay” or “cancellation.” There was no existing infrastructure capable of unifying this heterogeneous data landscape, maintaining quality at scale, expanding to multiple cities, or supporting future AI/ML capabilities. A fundamentally new architectural approach was required.
Goal
The primary objective was to build a scalable data engineering foundation that unifies incompatible transportation data ecosystems while supporting both immediate operational needs and future AI/ML applications. The platform needed to process real-time streaming and batch data simultaneously, normalize diverse schemas while preserving source integrity, and deliver sub-minute disruption detection across transportation modes.
Data mesh implementation on Databricks: Design a federated architecture ingesting 10+ disparate sources (FIDS, AMS via CDC, AIS maritime tracking, GTFS rail feeds, weather APIs, event data) while maintaining domain autonomy and enabling cross-domain analytics
Medallion architecture with SCD2 historization: Implement Bronze/Silver/Gold layers that preserve complete lineage, track temporal changes through Slowly Changing Dimensions Type 2, and support both real-time queries and historical analytics
Near real-time intelligence layers: Build specialized detection engines for delays, cancellations, and diversions that process streaming data, apply mode-specific business rules, and generate alerts with sub-5-minute latency
API-ready data synchronization: Establish automated pipelines from Gold tables to Cosmos DB, exposing normalized datasets through REST APIs while separating analytical and operational workloads
Horizontal and vertical scalability: Create standardized data contracts and modular ingestion patterns for rapid city onboarding, while establishing infrastructure for future ML models, predictive analytics, and LLM-powered data interaction
Data quality automation: Implement monitoring, validation, and alerting frameworks that handle schema drift, missing data, and inconsistent external feeds, ensuring operator trust in system insights
Outcome
The platform now processes streaming and batch data from over 10 distinct sources, providing Athens operators with unprecedented visibility into connected travel patterns. The medallion architecture ensures data quality and traceability, while the intelligence layers actively monitor for disruptions across all three transportation modes.
More importantly, the platform is built to grow in two directions. Horizontally, it can expand to new cities and additional data sources. Vertically, it supports increasingly advanced AI/ML capabilities and predictive models. This creates a compounding effect: the more data the platform collects, the more use cases become possible, from predictive analytics to automated decision support.
What started as an operational tool is designed to evolve into a comprehensive intelligence platform for European transportation networks.
Before
Data siloed across separate aviation, maritime, and rail systems with no integration
No visibility into connected journeys or passenger connection risks
Reactive disruption management after incidents already impacted passengers
Manual data analysis across disconnected systems
Limited to single-mode operational insights
No external event awareness (strikes, protests, weather) affecting airport access
Platform limited to single location with no expansion framework
After
Unified data mesh platform consolidating all transportation modes in Databricks
Real-time monitoring of multi-modal journeys with risk identification for at-risk connections
Proactive delay, cancellation, and diversion detection enabling preventive action
Automated intelligence layers with near real-time alerts and normalized data
Cross-modal coordination capabilities supporting operators at airports, ports, and rail stations
Integrated external data sources providing comprehensive situational awareness
Scalable architecture designed for easy replication across European cities
Data Mesh Architecture for Diverse Transportation Sources
Transportation data originates from fundamentally different systems with incompatible formats and update frequencies. The platform accepts this reality through a data mesh architecture that ingests diverse inputs - real-time maritime tracking, batch rail schedules, CDC streams from airport systems - and progressively refines them through layered processing. This flexible approach enabled rapid onboarding of new sources without requiring platform restructuring.
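As an illustration of how heterogeneous sources can be admitted without restructuring the platform, the sketch below checks incoming records against a small shared contract before they move on. It is a minimal sketch under stated assumptions: the field names, the allowed modes, and the `validate_record` helper are hypothetical and not taken from the platform's actual contracts.

```python
from dataclasses import dataclass
from datetime import datetime

# Hypothetical contract: field names and allowed modes are illustrative only.
REQUIRED_FIELDS = {"source": str, "mode": str, "entity_id": str, "scheduled_time": str}
ALLOWED_MODES = {"aviation", "maritime", "rail"}


@dataclass
class ContractResult:
    valid: bool
    errors: list


def validate_record(record: dict) -> ContractResult:
    """Check one incoming record against the hypothetical shared contract."""
    errors = []
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in record:
            errors.append(f"missing field: {name}")
        elif not isinstance(record[name], expected_type):
            errors.append(f"wrong type for {name}")

    mode = record.get("mode")
    if isinstance(mode, str) and mode not in ALLOWED_MODES:
        errors.append(f"unknown mode: {mode}")

    scheduled = record.get("scheduled_time")
    if isinstance(scheduled, str):
        try:
            datetime.fromisoformat(scheduled)
        except ValueError:
            errors.append("scheduled_time is not ISO 8601")

    return ContractResult(valid=not errors, errors=errors)
```

A new source then only needs a connector that emits records in this shape; records that fail the check can be routed to a quarantine area instead of blocking the feed.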
Medallion Architecture: Bronze, Silver, Gold
Three distinct data layers form the platform's core structure on Databricks. Bronze captures raw data exactly as received, preserving complete lineage for reprocessing. Silver normalizes and cleanses this data, applying SCD2 historization to track schedule and status changes over time. Gold delivers business-ready aggregations and intelligence layers that power operator dashboards. This layered approach balances data quality with downstream flexibility.
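The SCD2 historization applied in Silver can be pictured with a small, engine-independent sketch. On Databricks this logic would more likely be expressed as a merge into a Delta table; the plain-Python version below only shows the rule itself. The column names (`valid_from`, `valid_to`, `is_current`) and the `apply_scd2` helper are assumptions made for illustration.

```python
from datetime import datetime


def apply_scd2(history: list, update: dict, key: str, tracked: list, now: datetime) -> list:
    """Return a new history list with `update` applied as an SCD Type 2 change."""
    result = [dict(row) for row in history]  # copy so the input is not mutated
    current = next(
        (row for row in result if row[key] == update[key] and row["is_current"]),
        None,
    )
    if current is not None:
        if all(current[col] == update[col] for col in tracked):
            return result  # nothing changed: keep the open version as-is
        # Close the open version instead of overwriting it.
        current["valid_to"] = now
        current["is_current"] = False

    # Open a new version carrying the latest tracked values.
    new_row = {key: update[key], **{col: update[col] for col in tracked}}
    new_row.update({"valid_from": now, "valid_to": None, "is_current": True})
    result.append(new_row)
    return result
```

Because earlier versions are closed rather than overwritten, a query can reconstruct what the schedule or status looked like at any past moment by filtering on `valid_from` and `valid_to`.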
Specialized Intelligence Layers for Disruption Detection
Three purpose-built detection engines analyze disruptions in real-time. The Delay Intelligence Layer monitors schedule deviations across all modes and flags at-risk connections. The Cancellation Intelligence Layer unifies cancellation data from disparate sources into a single, normalized view. The Diversions Intelligence Layer tracks unexpected route changes in aviation and maritime operations. These engines convert raw operational data into immediate, actionable operator alerts.
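To make the delay logic concrete, the following sketch shows one way mode-specific rules and an at-risk connection check could be expressed. The thresholds, function names, and minimum transfer time are hypothetical values chosen for the example; the platform's actual business rules are not described in this case study.

```python
from datetime import datetime, timedelta

# Assumed per-mode thresholds (in minutes) for calling a deviation a "delay".
DELAY_THRESHOLD_MIN = {"aviation": 15, "maritime": 30, "rail": 5}


def delay_minutes(scheduled: datetime, estimated: datetime) -> float:
    """Deviation from schedule in minutes (negative when running early)."""
    return (estimated - scheduled).total_seconds() / 60


def is_delayed(mode: str, scheduled: datetime, estimated: datetime) -> bool:
    """Apply the mode-specific threshold to the schedule deviation."""
    return delay_minutes(scheduled, estimated) >= DELAY_THRESHOLD_MIN[mode]


def connection_at_risk(
    arrival_estimated: datetime,
    departure_scheduled: datetime,
    min_transfer: timedelta,
) -> bool:
    """Flag a connection when the remaining buffer is below the minimum transfer time."""
    return departure_scheduled - arrival_estimated < min_transfer
```

Keeping the thresholds in a per-mode table is one way to reconcile the differing definitions of "delay" across aviation, maritime, and rail while still producing a single normalized flag.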
Databricks as the Unified Development Platform
The entire development lifecycle (ingestion, transformation, testing, and access management) runs within a single Databricks environment, which significantly accelerates development compared to fragmented toolchains. Curated Gold-layer data is transferred to Cosmos DB and exposed via APIs for frontend dashboards and visualizations. The same environment also enables future AI/ML use cases such as natural language data interaction and predictive modeling.
API Layer for Operator Applications
Gold-layer tables are synchronized to Cosmos DB, and a .NET API exposes the data to the Athens operator dashboard. This architecture separates compute-intensive analytical processing (Databricks) from high-frequency operational queries (Cosmos DB), optimizing performance for both workload types.
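A sketch of the shaping step in such a synchronization might look as follows. It only prepares a document and does not call any Cosmos DB client; the actual write path, the `partitionKey` property name, and the `to_document` helper are assumptions made for illustration.

```python
import json
from datetime import datetime, timezone


def to_document(row: dict, key_fields: list, partition_field: str) -> dict:
    """Shape one Gold-layer row as a JSON-serializable document.

    The `id` is derived deterministically from the key fields so that
    re-running the sync can overwrite (upsert) rather than duplicate.
    """
    doc = {}
    for name, value in row.items():
        # datetimes are not JSON-serializable; store them as ISO 8601 strings
        doc[name] = value.isoformat() if isinstance(value, datetime) else value
    doc["id"] = "|".join(str(row[field]) for field in key_fields)
    doc["partitionKey"] = str(row[partition_field])  # assumed property name
    return doc


# Usage with made-up values:
row = {
    "city": "athens",
    "mode": "rail",
    "trip_id": "T1",
    "delay_min": 12,
    "updated_at": datetime(2024, 5, 1, 10, 0, tzinfo=timezone.utc),
}
document = to_document(row, key_fields=["mode", "trip_id"], partition_field="city")
payload = json.dumps(document)
```

Deriving the `id` from business keys makes repeated sync runs idempotent, which suits a pipeline that refreshes the same Gold rows many times a day.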
Designed for Dual-Direction Growth
Two expansion paths drive platform evolution. Horizontal growth incorporates new cities and data sources through standardized contracts and modular ingestion patterns. Vertical growth layers advanced analytics, machine learning predictions, and autonomous decision support onto the existing foundation. Both directions leverage the same core architecture, ensuring genuine scalability.
Automated Data Quality Management
External feeds introduced significant challenges: missing data, format inconsistencies, and unexpected schema changes. Automated monitoring with real-time alerting catches quality issues immediately. For particularly volatile sources like the Port Authority's Excel-based schedules, resilient parsing logic with fallback strategies maintains data flow despite format changes. This quality framework ensures operators can confidently act on system insights.
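The idea of parsing with fallback strategies can be sketched as below. The rows are modelled as plain lists, as if already read from a spreadsheet; the header aliases, date formats, and function names are hypothetical and stand in for whatever variants the real files contain.

```python
from datetime import datetime

# Hypothetical header variants seen across file versions.
HEADER_ALIASES = {
    "vessel": {"vessel", "vessel name", "ship"},
    "eta": {"eta", "arrival", "expected arrival"},
}
# Hypothetical timestamp formats, tried in order.
DATE_FORMATS = ["%Y-%m-%d %H:%M", "%d/%m/%Y %H:%M", "%d.%m.%Y %H:%M"]


def resolve_columns(header: list) -> dict:
    """Map canonical names to column indexes, tolerating renamed headers."""
    normalized = [str(h).strip().lower() for h in header]
    mapping = {}
    for canonical, aliases in HEADER_ALIASES.items():
        for idx, name in enumerate(normalized):
            if name in aliases:
                mapping[canonical] = idx
                break
        else:
            raise ValueError(f"no column found for {canonical!r}")
    return mapping


def parse_timestamp(text: str):
    """Try each known format in turn; return None rather than fail the batch."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt)
        except ValueError:
            continue
    return None


def parse_schedule(rows: list):
    """Split rows into parsed records and rejected rows kept for alerting."""
    cols = resolve_columns(rows[0])
    parsed, rejected = [], []
    for row in rows[1:]:
        eta = parse_timestamp(str(row[cols["eta"]]))
        if eta is None:
            rejected.append(row)  # surfaced to monitoring, not silently dropped
        else:
            parsed.append({"vessel": str(row[cols["vessel"]]).strip(), "eta": eta})
    return parsed, rejected
```

Returning rejected rows alongside parsed ones lets the data flow continue while still giving the monitoring layer something concrete to alert on.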
Technology
Databricks
Cosmos DB
Our team
Vadym Mariiechko
Data Engineer
Bartosz Obstawski
Data Engineer
Magdalena Bogdał
Project Manager
Our Team Expert Opinion
The challenge of this project isn’t only technology - it’s also building a functioning ecosystem around it. Negotiating access to live data streams and establishing partnerships with airports, ports, and rail operators across the TravelWise network is a huge part of the work our client has to carry out. Every new partner meaningfully expands the value of the platform, unlocking richer visibility across international journeys and enabling predictive, not just reactive, decision-making. The solid technical foundation is in place; now the focus is on scaling collaboration to unlock the full business potential.
Magdalena Bogdał, Project Manager – Addepto
Integrating aviation, maritime, and rail data is architecturally challenging because each mode uses different standards, update cycles, and even different definitions of basic concepts like ‘delay.’ On Databricks, we addressed this with Unity Catalog for clean environment separation, reusable connector repositories, and a layered intelligence model we call the ‘Brain Layer.’ This allowed us to move beyond isolated, per-mode alerts and build true intermodal intelligence that analyzes entire journeys and detects cascading risks. The platform is designed to handle constant schema changes and inconsistent feeds, creating infrastructure that continuously learns as new data sources come online.
Vadym Mariiechko, Data Engineer – Addepto