Empowering Immoweb’s Data Journey to AWS Cloud

The approach

In our collaboration with Immoweb, we provided technical expertise in data engineering and AWS to support their move to the cloud. Our primary objective was to migrate their on-premises data pipelines to AWS.

Employing industry best practices, we carefully orchestrated the migration process, ensuring a seamless transition and optimal utilization of cloud resources. Our team took a comprehensive approach, considering the unique requirements and challenges of Immoweb’s data ecosystem.

Immoweb

Immoweb is Belgium’s leading digital real estate portal, connecting buyers, sellers, tenants, and landlords for over 25 years. With an extensive range of nearly 150,000 properties listed daily, Immoweb offers a comprehensive selection of real estate, from houses and apartments to commercial and historic properties. Their services include a free estimation tool for property owners and integrated mortgage simulations for buyers. As part of the AVIV Group, a major player in the digital real estate industry, Immoweb benefits from a strong network of renowned brands across Europe.

Generating data insights and leveraging AI

Immoweb’s extensive range of features and services generates a vast amount of valuable data. The platform collects data from property listings, user interactions, market trends, and more. Recognizing the potential of data science and AI techniques to extract insights from this data, Immoweb is committed to capturing it and making it available for analysis efficiently.

The captured data holds immense potential for understanding user behavior, market dynamics, and predicting real estate trends. By leveraging data science and AI, Immoweb aims to enhance user experiences, provide personalized recommendations, and optimize their platform’s performance.

Efficient capture and availability of this data are crucial for the success of Immoweb’s data-driven initiatives. It enables them to uncover hidden patterns, gain a deeper understanding of their user base, and develop innovative solutions to better serve the real estate community.

Seamless data integrations with AWS

Immoweb wanted to leverage cloud technology for scalability and had selected AWS as their primary cloud provider. Infofarm was engaged to migrate their data and data pipelines from on-premises servers to the cloud. To achieve this, we established a standardized and scalable data platform that adheres to the best practices outlined in the AWS Well-Architected Framework.

To optimize the platform’s efficiency and flexibility, we adopted a serverless-first approach, prioritizing the utilization of serverless solutions wherever feasible. This approach offers several advantages in this context, including simplified management, automatic scaling based on demand, reduced operational overhead, and cost optimization through pay-per-use billing models. By embracing serverless architecture, we ensured that Immoweb’s data platform is agile, resilient, and capable of handling varying workloads efficiently.

What follows is a high-level overview of the platform and the data pipelines that Infofarm set up to facilitate seamless data processing and management.

  • Data Lake: Amazon S3 is used to store all the collected data from various systems. The data lake is structured into three layers, each with its own buckets. These buckets are configured with security measures and retention policies specific to each data source, ensuring data integrity and compliance.
  • Data Processing: A combination of AWS Glue ETL Jobs and AWS Lambda is employed for data processing. Glue ETL Jobs are used for larger-scale transformations, while Lambda functions handle micro-ETL tasks, performing transformations on smaller datasets to optimize costs without compromising efficiency.
  • Data Access: Amazon Athena is used for efficient and interactive data access. With Athena, users can run ad-hoc queries directly on the data lake stored in S3, enabling fast and seamless exploration of the data for insights and analysis. Tableau also connects to the data through Athena.
  • Pipeline Orchestration: AWS Step Functions in combination with EventBridge rules are set up for pipeline orchestration. Step Functions provide a reliable and scalable workflow management system, allowing Immoweb to coordinate and automate complex data processing pipelines, ensuring smooth and efficient execution.
  • Data Catalog: AWS Glue Data Catalog acts as a centralized metadata repository. The Glue Data Catalog enables Immoweb to define, manage, and discover metadata about their data assets, making it easier to organize and understand the structure and characteristics of their data.
  • Logging & Monitoring: Immoweb relies on Amazon CloudWatch for logging and monitoring of their data pipelines and infrastructure. CloudWatch allows them to collect and track metrics, and gain insights into the performance and health of their data processes, ensuring smooth operations and prompt issue detection.
  • Security: The security of the AWS environment is a top priority for Immoweb, and it is achieved by adhering to the principle of least privilege using AWS Identity and Access Management (IAM). IAM enables them to define fine-grained access policies and manage user permissions, ensuring that only authorized entities can access and manipulate the data and resources within the AWS infrastructure.
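
To make the processing and orchestration components more concrete, here is a minimal sketch of how a Step Functions workflow could route a dataset to either a Lambda function or a Glue ETL Job depending on its size. It is an illustration under stated assumptions, not the definition used in the Immoweb project: the size threshold, the function and job names, and the input fields `size_mb` and `key` are all hypothetical. The definition is built as a plain Python dictionary in Amazon States Language, so it can be inspected without an AWS account.

```python
import json

# Hypothetical values for illustration; none of these come from the project.
SIZE_THRESHOLD_MB = 100
LAMBDA_NAME = "micro-etl-transform"
GLUE_JOB_NAME = "large-etl-transform"


def build_state_machine_definition(threshold_mb: int = SIZE_THRESHOLD_MB) -> dict:
    """Return an Amazon States Language definition as a Python dict."""
    return {
        "Comment": "Route a dataset to Lambda or Glue based on its size",
        "StartAt": "ChooseProcessor",
        "States": {
            # Choice state: small inputs go to Lambda, everything else to Glue.
            "ChooseProcessor": {
                "Type": "Choice",
                "Choices": [
                    {
                        "Variable": "$.size_mb",
                        "NumericLessThan": threshold_mb,
                        "Next": "MicroEtlLambda",
                    }
                ],
                "Default": "LargeEtlGlueJob",
            },
            # Micro-ETL: invoke a Lambda function with the full input as payload.
            "MicroEtlLambda": {
                "Type": "Task",
                "Resource": "arn:aws:states:::lambda:invoke",
                "Parameters": {"FunctionName": LAMBDA_NAME, "Payload.$": "$"},
                "End": True,
            },
            # Larger transformations: start a Glue job and wait for it to finish.
            "LargeEtlGlueJob": {
                "Type": "Task",
                "Resource": "arn:aws:states:::glue:startJobRun.sync",
                "Parameters": {
                    "JobName": GLUE_JOB_NAME,
                    "Arguments": {"--source_key.$": "$.key"},
                },
                "End": True,
            },
        },
    }


def next_state(definition: dict, event: dict) -> str:
    """Evaluate the Choice state locally to see which branch an input takes."""
    choice = definition["States"][definition["StartAt"]]
    for rule in choice["Choices"]:
        field = rule["Variable"][2:]  # strip the leading "$." of the JSONPath
        if event[field] < rule["NumericLessThan"]:
            return rule["Next"]
    return choice["Default"]


if __name__ == "__main__":
    definition = build_state_machine_definition()
    print(json.dumps(definition, indent=2))
```

The JSON produced by `json.dumps(definition)` is the kind of document that would be supplied as the state machine definition when creating it, and an EventBridge rule could then start executions on a schedule or in response to an event. The local `next_state` helper only mimics the single numeric comparison used here; it is not a general Amazon States Language interpreter.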

Results of the project

By leveraging AWS’s robust suite of services and Infofarm’s expertise in designing and implementing cloud-based data solutions, we successfully migrated Immoweb’s data and established a powerful and flexible data platform.

By implementing the data platform, Infofarm has made data available and accessible for crucial business intelligence (BI) activities, as well as for data science and AI initiatives. The platform serves as a foundation for Immoweb to harness the power of their data, enabling informed decision-making, generating valuable insights, and driving innovation in the real estate industry.

Infofarm’s collaboration with Immoweb has resulted in a future-ready data platform. With it, Immoweb is positioned to stay ahead in a competitive market and deliver exceptional value to their users and stakeholders.