Shashank Mishra

July 5, 2023

Google BigQuery is a serverless, scalable data warehouse on Google Cloud. It supports real-time analytics, machine learning, and GIS capabilities. Its architecture separates storage from compute, and it offers automatic scaling and strong security, making it well suited to data-driven businesses.

Shashank Mishra

June 22, 2023

Edit: July 7, 2023

Snowflake is a cloud-based data warehousing platform that brings a new level of performance, simplicity, and affordability to businesses that require big data processing and analytics.

Shashank Mishra

June 13, 2023

Edit: July 5, 2023

Amazon Redshift is a powerful, scalable data warehousing service within the AWS ecosystem. It excels in handling large datasets with its columnar storage, parallel query execution, and features like Redshift Spectrum and RA3 instances. Redshift's clustered architecture, robust security, and integration with AWS services make it a go-to choice for businesses needing efficient and secure data management solutions.

Tommy Dang

May 31, 2023

Edit: June 12, 2023

Mage pivoted from an AI platform to an open-source data pipeline tool and is making a huge impact on the lives of data engineers around the world.

Shashank Mishra

May 30, 2023

Dive into the implementation of stream data processing with Mage, using Kafka as the source.

Tommy Dang

Thomas Chung

May 24, 2023

Edit: July 5, 2023

Join us for our first-ever in-person data engineering meetup on Tuesday, June 27, 2023 from 6pm to 8pm (PST) in San Francisco, Bay Area! Don't miss out on this fantastic opportunity to learn about the latest technologies and best practices in the data engineering field and network with data professionals!

Shashank Mishra

May 15, 2023

Edit: June 1, 2023

This guide introduces Apache Flink and stream processing, explaining how to set up a Flink environment and create simple applications. Key Flink concepts are covered along with basic troubleshooting and monitoring techniques. It ends with resources for further learning and community support.

Tommy Dang

May 9, 2023

Edit: May 18, 2023

Combine powerful database features with the flexibility of an object storage system by using the Delta Lake framework.

Shashank Mishra

May 6, 2023

Edit: June 1, 2023

Dive into a comprehensive comparison of Apache Flink and Apache Spark, exploring their differences and strengths in data processing, to help you decide which framework best suits your data processing needs.

Shashank Mishra

May 5, 2023

Edit: May 16, 2023

Join us for our first-ever in-person data engineering meetup on Saturday, May 20, 2023 from 11am to 2pm (IST) in Gurugram, India! Don't miss this fantastic opportunity to connect, learn, and celebrate with your fellow data aficionados.