How we implemented electronic car parts catalogs

The implementation of the data warehouse, ELT process, Customer Portal, and Reports.

Electronic car parts catalogs implementation

Customer

A supplier of technology solutions in the automotive industry.

Challenges

The client has been acquiring electronic parts catalogs for years, spending $5000-50000$ per month on each catalog.

Our team accepted the challenge to develop a solution that generates and updates the catalogs from five manufacturers based on source data.

  • Each feature was implemented irrespective of the solution as a whole. This happened due to the lack of a thorough approach towards the solution’s architecture;
  • It was impossible to locate a specific feature as a mixed bunch of tools and technologies were used on the project: a mix of both cloud/on-premise environments and languages (Python, Perl, SQL, and Bash);
  • The documentation was missing.

Toolset

Icon white
Microsoft SQL Server for all databases
Flip icon
TeamCity for Continuous Integration
Search icon
SpecFlow for autotests
Code icon
Microsoft SSIS for the ETL process

Technical Details

01
The ETL process runs once a month for each manufacturer, with the processing time of 1-8 hours.
02
TeamCity is configured to run integrity checks, autotests, and build catalogs based on full data sets.
03
APIs response time is 300ms which required a lot of optimization effort.
04
Autotests cover the ETL process and all APIs implemented on SpecFlow.
05
The largest Deliverable Database is 150Gb.

Solution

Each manufacturer has their source data in a specific format - text files, XML, Excel, databases (.MDF), and even EBCDIC - the format used since the 50s on IBM mainframes. The goal was to build ETL pipelines that would transform diverse source data and enable a single set of APIs to query all catalogs.

Stage raw data is imported into MS SQL Database, then cleaned and validated. The stage only contains a set of processed data that is cleared at the beginning of each ETL run. Data Warehouse is a permanent data storage that includes integrated and normalized data sets (not a canonical data warehouse).

Deliverable Database contains data with applied business rules and transformations required to enable access via APIs. In their turn, APIs are implemented as SQL procedures.

Electronic car parts catalogs implementation

Key Achievements

1

Now, the customer saves approximately $100.000 per subscription.

2

High-quality catalogs save resources on maintenance.

3

We enabled full access to catalogs and the ability to update catalogs at any time.

Team Composition

1

Project Manager

6

Developers

2

Quality Assurance

1

Business Analyst

Let's start your project

Get in touch with us. We will carefully guide you through the entire process, from idea generation to product launch.

contact us

Contact us

Leave us your inquiry, and our specialists will get in touch within 1 business day!