Data ingestion is the transfer of data from many sources to a storage medium where an organization can access, use, and analyze it. Typically, the destination is a data warehouse, data mart, database, or document store. Sources can include RDBMSs such as MySQL, Oracle, and Postgres. The data ingestion layer...
Video is the future of content marketing and reaches a large audience in no time. The digital media environment is evolving rapidly, and the increasing use of smartphones and tablets has changed the consumption habits of audiences across the globe. Not only are social networking sites swamped with videos; even the...
#fame is India's first (and now the biggest) live-streaming app on the iOS and Android platforms. The app allows people to create their own beam and go live immediately, or book a slot for later. Over time, the operational databases of #fame kept growing rapidly. As a result, the disk space utilization of the database server...
Big Data has seen tremendous growth over the last couple of years. According to top research agencies, Big Data has recently emerged as the most successful “launch pad”, giving rise to the largest number of start-ups ever. As the space evolves further, more and more organizations of varied sizes and...
We at IntelliGrape divide Big Data into four major sectors, which we commonly refer to as the 4C's of Big Data: Capture (Data Ingestion), Contain (Data Persistence (NoSQL)), Compute (Data Processing), and Comprehend (Data Analytics and Visualization). In this blog, I'll be focusing on the last point, i.e....
Overview: The big data space has been evolving continuously, and each day more technologies are added to the ecosystem. Hadoop Hive is one of the technologies that has been around for a long time. It provides a SQL wrapper for running queries on Hadoop. It also has some built-in optimization techniques. Through this blog, I thought...
Using groupBy and join is often very challenging. Recently, in one of the POCs of a MEAN project, I used groupBy and join in Apache Spark. I had two datasets in HDFS, one for sales and the other for products. Sales dataset columns: Sales Id, Version, Brand Name, Product Id, No of Item Purchased, Purchased Date Product...
MySQL provides an easy mechanism for writing the results of a SELECT statement into a text file on the server. Using extended options of the INTO OUTFILE clause, it is possible to create a comma-separated values (CSV) file that can be imported into a spreadsheet application such as OpenOffice or Excel, or any other application which...
A Brief History of Hadoop: Hadoop was created by Doug Cutting, the creator of Apache Lucene, the widely used text search library. Hadoop has its origins in Apache Nutch, an open source web search engine, itself a part of the Lucene project. The Origin of the Name “Hadoop”: Hadoop is not an acronym; it’s a made-up name. The...