Neo4j Overview

Neo4j is an open-source graph database implemented in Java. The developers describe Neo4j as an “embedded, disk-based, fully transactional Java persistence engine that stores data structured in graphs rather than in tables”…

Pig UDF

Pig provides extensive support for user-defined functions (UDFs) as a way to specify custom processing. Pig UDFs can currently be implemented in several languages: Java, Python, JavaScript, Ruby and…
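
As a rough illustration of the Python option, the sketch below defines a single scalar UDF. It assumes Pig's Jython engine, where the `outputSchema` decorator is made available to the script without an import; the fallback definition exists only so the file also runs as plain Python. The function name `to_upper` and the schema string are hypothetical, chosen for the example.

```python
# Under Pig's Jython engine, outputSchema is assumed to be provided
# automatically. The fallback below is an illustration-only stand-in so the
# file can also be executed and tested as ordinary Python.
try:
    outputSchema
except NameError:
    def outputSchema(schema):
        def wrap(func):
            # Record the declared schema on the function, nothing more.
            func.outputSchema = schema
            return func
        return wrap


@outputSchema("word:chararray")
def to_upper(word):
    """Hypothetical UDF: upper-case a chararray, passing nulls through."""
    if word is None:
        return None
    return word.upper()
```

In a Pig script, a file like this (say, a hypothetical `udfs.py`) would typically be registered with `REGISTER 'udfs.py' USING jython AS myfuncs;` and then called as `myfuncs.to_upper(name)`.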

Hive Architecture

Command line interface: It is the default and most common way of accessing Hive. HiveServer: Runs Hive as a server exposing a Thrift service, enabling access from a range of…

Pig Overview

Hive Vs Pig

Feature                        Hive               Pig
Language                       SQL-like           PigLatin
Schemas/Types                  Yes (explicit)     Yes (implicit)
Partitions                     Yes                No
Server                         Optional (Thrift)  No
User Defined Functions (UDF)   Yes (Java)         Yes (Java)
Custom…

Hive Complex Data Types with Examples

There are three complex types in Hive. arrays: An ordered collection of elements. The elements in the array must be of the same type. map: An unordered…

Hive Internal & External Table

A Hive table is a logical concept that is physically composed of a number of files in HDFS. Tables can be either internal or external. Internal table: If our data is available in…

Hive Services

CLI: The command line interface to Hive (the shell). This is the default service. HiveServer: Runs Hive as a server exposing a Thrift service, enabling access from a range of…

Hive Shell Runs in Two Modes

The shell is the primary way that we will interact with Hive, by issuing commands in HiveQL. HiveQL is Hive’s query language, a dialect of SQL. It is heavily influenced…

Aggregate Functions in Hive

The following built-in aggregate functions are supported in Hive: count(*), count(expr), count(DISTINCT expr[, expr...]). count(*) – Returns the total number of retrieved rows, including rows containing NULL values; count(expr)…
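
The difference between the three count forms is easiest to see on a small table that contains NULLs. The sketch below does not use Hive. It runs the same standard SQL expressions against an in-memory SQLite database from Python's standard library, on the assumption that NULL handling for these single-expression counts is the same in both. The table `visits` and its columns are hypothetical.

```python
import sqlite3

# Hypothetical sample data; the table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (user_name TEXT, page TEXT)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?)",
    [("ann", "home"), ("bob", None), ("ann", "cart"), (None, "home")],
)

# count(*) counts every row, count(page) skips rows where page is NULL,
# and count(DISTINCT page) counts unique non-NULL page values.
total_rows, non_null_pages, distinct_pages = conn.execute(
    "SELECT count(*), count(page), count(DISTINCT page) FROM visits"
).fetchone()
conn.close()

print(total_rows, non_null_pages, distinct_pages)  # prints: 4 3 2
```

Note that SQLite accepts only a single expression inside count(DISTINCT ...), so the multi-expression form listed above is not exercised here.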

Hive Built-In Functions

Functions in Hive are categorized as below. Numeric and Mathematical Functions: These functions are mainly used to perform mathematical calculations. Date Functions: These functions are used to perform operations on date…
