AI-powered search & chat for Data / Computer Science Students
Introduction to Hive
This article focuses on Hive, its features, use cases, and Hive queries. Since a lot of DML and DDL queries are very similar to SQL, it can act as a foundation or building block for anyone new to…
Read more at Towards Data Science
Getting Started With Hive
The aim of this blog post is to help you get started with Hive using Cloudera Manager. Apache Hive is a data warehouse software project built on top of Apache Hadoop for providing data summarization…
Read more at Towards Data Science
Hive Installation on ubuntu 18.04
Note: Prefer Java 8, because in newer versions the system class loader is no longer a URLClassLoader, which Hive requires to run. Now, let's begin the installation of Hive by downloading the latest stable release from…
Read more at Analytics Vidhya
Is it still good to learn Apache Hive?
As the big data world moves towards Apache Spark, Databricks, or Cloud-based Data Warehouses like Amazon RedShift / Snowflake, the general conception is, Hive is an obsolete technology to learn.
Read more at Analytics Vidhya
Introducing Hiveplotlib
Introducing hiveplotlib, a new, open-source Python package for generating Hive Plots.
Read more at Towards Data Science
Hive using S3 and Scala
In this article, I’m going to share my experience of maintaining a Hive schema. This will be useful to the freshers who are willing to step into Big Data technologies. Mainly this will describe how…
Read more at Analytics Vidhya
Hive — How to install in 5 Steps in Windows 10
Once extracted, we would get a new file, apache-hive-3.1.2-bin.tar. Now, once again, we need to extract this tar file. To edit environment variables, go to Control Panel > System and click on the…
Read more at Analytics Vidhya
The Curious Case of MySQL, PostgreSQL, and Hive
In an era of Big Data where the amount, size, and velocity of data are rapidly growing, knowing SQL is still essential for data analysts and data scientists. SQL helps us to manage…
Read more at Towards Data Science
Apache Hive Optimization Techniques — 1
Apache Hive is a query and analysis engine which is built on top of Apache Hadoop and uses MapReduce Programming Model. It provides an abstraction layer to query big-data using the SQL syntax by…
Read more at Towards Data Science
Planet Beehive
Close your eyes for a second, ignore the rain dripping outside, the beep from a new incoming email and deeply think about ten global activities that are on the top of your To Do list .. Now be…
Read more at Towards Data Science
Apache Hive Optimization Techniques — 2
In the earlier article, we covered how appropriate data modeling using partitioning and bucketing, choosing Tez as execution engine as well as compression could prove to be very big cost-saving…
Read more at Towards Data Science
Sampling Data in Hive
Sampling data is something you need in order to feed environments other than production, such as a test environment, or for unit testing. Because your data is probably huge…
Read more at Analytics Vidhya
Apache Hive Hooks and Metastore Listeners: A tale of your metadata
The target audience for this article should have a basic understanding of Hive and the Hadoop ecosystem features. This article is focused on comparing and showing what it takes to write a Hive Hook…
Read more at Towards Data Science
Ultimate Hive Tutorial: Essential Guide to Big Data Management and Querying
Navigating the labyrinth of big data can be a daunting endeavor, especially when the paths are paved with complex terminology and intricate processes. This is particularly true for Apache...
Read more at Towards Data Science
What’s Buzzing with the Bees?
The ‘save the bees’ campaign has been trending for almost a decade and a half now, and we often emphatically hear ‘the bees are dying!’, but how true is that actually? From about 1992 onwards…
Read more at Towards Data Science
A Data Science/Big Data Laboratory — part 3 of 4: Hive and Postgres over Ubuntu in a 3-node cluster
This text can be used to support the installation in any Ubuntu 20.04 server cluster, and this is the beauty of well-designed layered software. Furthermore, if you have more nodes, you can…
Read more at Towards Data Science
What is Partitioning vs Bucketing in Apache Hive? (Partitioning vs Bucketing)
Exploring partitioning vs clustering in the Hive table, and understanding when to do partitioning and when to do clustering. Hey guys, Apache Hive is one of the popular data warehouses in distributed...
Read more at Python in Plain English
Working with Hive using AWS S3 and Python
In this article, I’m going to share my experience of maintaining a Hive schema. This will be useful to the freshers who are willing to step into Big Data technologies. Mainly this will describe how…
Read more at Towards Data Science
Best way to Export Hive table to CSV file
Best way to Export Hive table to CSV file. This post explains the different options available to export a Hive table (ORC, Parquet or Text) to a CSV file.
Read more at Analytics Vidhya
Understanding Apache Hive LLAP
Apache Hive is a complex system when you look at it, but once you go looking for more info, it’s more interesting than complex. There are multiple query engines available for Hive, and then there’s…
Read more at Towards Data Science
Installing Apache Hive 3.1.2 on Windows 10
This is a step-by-step guide to installing Apache Hive 3.1.2 on the Windows 10 operating system.
Read more at Towards Data Science
A practical approach to caching remote data using Hive in Flutter
When developing apps, it is common to have communication with a remote data source. However, relying on the user constantly having a network connection in order to be able to display content can…
Read more at Level Up Coding
How To Create Your Own Hive SerDe — Hive Custom Data Serialize-Deserialize Mechanism
As mentioned in my earlier blog post, SerDe is an interface that Hive uses to deserialize data (read it from the table’s HDFS location and convert it to a Java object) and serialize data (convert a Java…
Read more at Analytics Vidhya
Apache Sqoop
Getting data from RDBMS to HDFS and back
Read more at Towards Data Science