@taiwotman
Last active May 11, 2024 23:57
Communicate complex concepts in a clear
Describe a complex data modeling challenge you faced in a previous project. How did you approach the problem, and what factors influenced your decisions regarding the data model?
A complex data modelling challenge I faced was designing a data model for a multinational e-commerce company that wanted to analyze customer behaviour across multiple countries.
The complexity in this scenario arose from the need to handle diverse data sources, different types of data (structured, semi-structured, and unstructured), and multi-language data. The data model also had to scale with the growing volume of data as the company expanded.
My first step was to understand the business requirements and the types of analysis the company wanted to perform. Second, I explored the available data sources to understand the structure, quality, and content of the data. Third, based on the requirements and the nature of the data, I chose a hybrid data model that combines elements of relational and NoSQL models to handle the variety of data types and meet the scalability requirements. Fourth, I planned how data from the various sources would be transformed and cleaned to ensure consistency; this involved handling missing values, removing duplicates, and standardizing the multi-language data. Finally, I implemented the data model in a suitable database management system and tested it with real queries and use cases.
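To make the cleaning step concrete, here is a minimal Python sketch of the three operations mentioned above: handling missing values, removing duplicates, and standardizing multi-language text. The field names (customer_id, product_name, country), the "unknown" default, and the choice of Unicode NFKC normalization with case folding are assumptions for illustration only, not the actual pipeline used in the project.

```python
import unicodedata


def normalize_text(value):
    """Return NFKC-normalized, trimmed, case-folded text, or None if empty."""
    if value is None:
        return None
    cleaned = unicodedata.normalize("NFKC", value).strip().casefold()
    return cleaned or None


def clean_records(records, default_country="unknown"):
    """Clean raw event records (hypothetical schema, for illustration).

    - Missing values: rows without a customer_id are dropped; a missing
      country is replaced with default_country.
    - Multi-language text: strings are Unicode-normalized so that visually
      identical values from different sources compare equal.
    - Duplicates: rows that are identical after normalization are kept once.
    """
    seen = set()
    cleaned = []
    for rec in records:
        customer_id = rec.get("customer_id")
        if customer_id is None:
            continue  # cannot attribute behaviour without a customer id
        name = normalize_text(rec.get("product_name"))
        country = normalize_text(rec.get("country")) or default_country
        key = (customer_id, name, country)
        if key in seen:
            continue  # duplicate after normalization
        seen.add(key)
        cleaned.append(
            {"customer_id": customer_id, "product_name": name, "country": country}
        )
    return cleaned
```

In this sketch, two rows that spell the same product name with different casing or with composed versus decomposed accents collapse into one record, which is the kind of consistency the transformation step was meant to ensure.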
In summary, the factors that influenced my decisions regarding the data model include, but are not limited to, the company’s business requirements, the nature and volume of the data, the required scalability, and the need to support multi-language data. The choice of technology and tools also played a significant role.