Big Data, Data & Analytics

Enhancing Workflows with Apache Airflow and Docker

In today's world, handling complex tasks and automating them is crucial. Apache Airflow is a powerful tool that helps with this. It's like a conductor for tasks, making everything work smoothly. When we use Airflow with Docker, it becomes even better because it's flexible and can be easily moved around. In this blog, we'll explain what...

by bishal.singh
Tag: BigData
17-Oct-2023

AWS, Big Data

Unlocking the Potential: Kafka Streaming Integration with Apache Spark

In today's fast-paced digital landscape, businesses thrive or falter based on their ability to harness and make sense of data in real time. Apache Kafka, an open-source distributed event streaming platform, has emerged as a pivotal tool for organizations aiming to excel in the world of data-driven decision-making. In this blog post, we'll...

by ashish.gupta
Tag: BigData
12-Oct-2023

Big Data

Amazon Redshift: A Comprehensive Overview

In today's data-centric world, making informed decisions is vital for businesses. To support this, Amazon Web Services (AWS) offers a robust data warehousing solution known as Amazon Redshift. Redshift is designed to help organizations efficiently manage and analyze their data, providing valuable insights for strategic...

by shubham.thakur
Tag: BigData
19-Sep-2023

Big Data, Data & Analytics

Efficient Data Migration from MongoDB to S3 using PySpark

Data migration is a crucial process for modern organizations looking to harness the power of cloud-based storage and processing. This blog examines how to use PySpark to transfer data from MongoDB, a well-known NoSQL database, to Amazon S3, an elastic cloud storage service. Moreover, we will focus on handling...

by bishal.singh
Tag: BigData
18-Sep-2023

Big Data, Data & Analytics

Spark Structured Streaming

In this blog, I will discuss how Spark Structured Streaming works and how we can process data as a continuous stream. Before we discuss this in detail, let’s try to understand stream processing. In layman’s terms, stream processing is the processing of data in motion or computing data directly as it is produced or...

by ravindra.jain
Tag: BigData
31-Aug-2023

Big Data

No Code Data Ingestion Framework Using Apache Flink

The conveyance of data from many sources to a storage medium where it may be accessed, utilized, and analyzed by an organization is known as data ingestion. Typically, the destination is a data warehouse, data mart, database, or document storage. Sources can include RDBMS such as MySQL, Oracle, and Postgres. The data ingestion layer...

by vikas.duvedi
Tag: BigData
27-Jun-2023

Big Data, Product Engineering

5 Considerations For Building Data Driven Applications

Innovation is at the center of application development. Many established companies, as well as startups, are investing heavily in product ideas that have the potential to solve business challenges. While traditional applications are still in place, new-age SaaS companies are developing amazing applications for web and mobile keeping...

by kinshuk jhala
Tag: BigData
22-Feb-2017

AWS, Big Data

What is Amazon Redshift and why you should definitely use it?

So you have spent some years of your software development career and now know many RDBMS implementations inside out. In fact, you already know that an RDBMS is not the only enterprise storage option and, because of the frequent scalability issues you encountered, at some point you found out about Big Data tools. Chances are you were...

by Ajay Sharma
Tag: BigData
26-Sep-2016

Big Data, Technology

DataSafe – A Data Archival Tool

#fame is India's first (and now the biggest) live-streaming app on the iOS and Android platforms. The app allows people to create their own beam and go live immediately, or book a slot for the future. As time passed, the operational databases of #fame kept growing rapidly. As a result, the disk space utilization of the database server...

by Rohan Kalra
Tag: BigData
16-Aug-2016

Big Data

Prediction Analysis using Knime

Prediction Analysis is the practice of extracting information from existing data sets in order to determine patterns and predict future outcomes and trends. There are various analytics and machine learning tools available in the market for predictive analysis. This post includes an introduction to Knime followed by a sample use case of...

by Surendra Pratap Singh
Tag: BigData
05-Feb-2015

Big Data

Realtime Event processing with Esper

In a recent use case, we had to implement complex event processing in real time. Storm was used as the real-time processing engine, but since it doesn't provide batching of events, we turned to Esper to do the required job. Esper can be thought of as a complex event processing (CEP) component generally used for event...

by Mohit Garg
Tag: BigData
21-Jan-2015

Big Data

Spark 101 – Revamping Hadoop

Big Data has witnessed tremendous movement and growth over the last couple of years. As per the top research agencies, Big Data has recently emerged as the most successful “launch pad”, giving rise to the largest number of start-ups ever. As the space evolves further, more and more organizations of varied sizes and...

by Moonesh Kachroo
Tag: BigData
20-Jan-2015