Introduction to Hive

 Towards Data Science

This article focuses on Hive, its features, use cases, and Hive queries. Since many DML and DDL queries are very similar to SQL, it can act as a foundation or building block for anyone new to…

Getting Started With Hive

 Towards Data Science

The aim of this blog post is to help you get started with Hive using Cloudera Manager. Apache Hive is a data warehouse software project built on top of Apache Hadoop for providing data summarization…

Hive Installation on ubuntu 18.04

 Analytics Vidhya

Note: Prefer Java 8, because in newer versions the system class loader is no longer a URLClassLoader, which Hive requires to run. Now, let’s begin the installation of Hive by downloading the latest stable release from…

Is it still good to learn Apache Hive?

 Analytics Vidhya

As the big data world moves towards Apache Spark, Databricks, or cloud-based data warehouses like Amazon Redshift / Snowflake, the general perception is that Hive is an obsolete technology to learn.

Introducing Hiveplotlib

 Towards Data Science

Introducing hiveplotlib, a new, open-source Python package for generating Hive Plots.

Hive using S3 and Scala

 Analytics Vidhya

In this article, I’m going to share my experience of maintaining a Hive schema. This will be useful to newcomers who want to step into Big Data technologies. Mainly, this will describe how…

Hive — How to install in 5 Steps in Windows 10

 Analytics Vidhya

Once extracted, we get a new file, apache-hive-3.1.2-bin.tar. Now we need to extract this tar file as well. To edit environment variables, go to Control Panel > System and click on the…

The Curious Case of MySQL, PostgreSQL, and Hive

 Towards Data Science

In an era of Big Data where the amount, size, and velocity of data are rapidly growing, knowing SQL is still essential for Data Analysts and Data Scientists. SQL helps us manage…

Apache Hive Optimization Techniques — 1

 Towards Data Science

Apache Hive is a query and analysis engine that is built on top of Apache Hadoop and uses the MapReduce programming model. It provides an abstraction layer to query big data using SQL syntax by…

Planet Beehive

 Towards Data Science

Close your eyes for a second, ignore the rain dripping outside and the beep from a new incoming email, and think deeply about ten global activities that are at the top of your to-do list... Now be…

Apache Hive Optimization Techniques — 2

 Towards Data Science

In the earlier article, we covered how appropriate data modeling using partitioning and bucketing, choosing Tez as the execution engine, and compression could prove to be very big cost-saving…

Sampling Data in Hive

 Analytics Vidhya

Sampling data is something you need in order to feed environments other than production, such as a test environment, or for unit testing. Because the amount of your data is probably so huge…

Apache Hive Hooks and Metastore Listeners: A tale of your metadata

 Towards Data Science

The target audience for this article should have a basic understanding of Hive and the Hadoop ecosystem features. This article is focused on comparing and showing what it takes to write a Hive Hook…

Ultimate Hive Tutorial: Essential Guide to Big Data Management and Querying

 Towards Data Science

Navigating the labyrinth of big data can be a daunting endeavor, especially when the paths are paved with complex terminology and intricate processes. This is particularly true for Apache...

What’s Buzzing with the Bees?

 Towards Data Science

The ‘save the bees’ campaign has been trending for almost a decade and a half now, and we often emphatically hear ‘the bees are dying!’, but how true is that actually? From about 1992 onwards…

A Data Science/Big Data Laboratory — part 3 of 4: Hive and Postgres over Ubuntu in a 3-node cluster

 Towards Data Science

This text can be used to support the installation on any Ubuntu 20.04 server cluster, and this is the beauty of well-designed layered software. Furthermore, if you have more nodes, you can…

What is Partitioning vs Bucketing in Apache Hive? (Partitioning vs Bucketing)

 Python in Plain English

Exploring partitioning vs clustering in Hive tables, and understanding when to use partitioning and when to use clustering. Hey guys, Apache Hive is one of the popular data warehouses in distributed ...

Working with Hive using AWS S3 and Python

 Towards Data Science

In this article, I’m going to share my experience of maintaining a Hive schema. This will be useful to newcomers who want to step into Big Data technologies. Mainly, this will describe how…

Best way to Export Hive table to CSV file

 Analytics Vidhya

This post explains the different options available to export a Hive table (ORC, Parquet, or Text) to a CSV file.

Understanding Apache Hive LLAP

 Towards Data Science

Apache Hive is a complex system when you look at it, but once you go looking for more info, it’s more interesting than complex. There are multiple query engines available for Hive, and then there’s…

Installing Apache Hive 3.1.2 on Windows 10

 Towards Data Science

This is a step-by-step guide to installing Apache Hive 3.1.2 on the Windows 10 operating system.

A practical approach to caching remote data using Hive in Flutter

 Level Up Coding

When developing apps, it is common to have communication with a remote data source. However, relying on the user constantly having a network connection in order to be able to display content can…

How To Create Your Own Hive SerDe — Hive Custom Data Serialize-Deserialize Mechanism

 Analytics Vidhya

As mentioned in my earlier blog post, SerDe is an interface that Hive uses to deserialize data (read data from a table’s HDFS location and convert it to a Java object) and serialize data (convert a Java…

Apache Sqoop

 Towards Data Science

Getting data from RDBMS to HDFS and back
