A collection of 52 posts

Big Data Hive in 2021 (1): Basic Concepts of Hive. In the early hours of the night, my girlfriend asked what a data warehouse was, and my answer surprised her, and then found out...

The most detailed series of Hive articles on the web; bookmarking and following it is strongly recommended! Later articles will include a directory of earlier articles to help you review the key points. Table of Contents: Historical articles; Preface; Hive basic concepts; 1. Introduction to Hive


SQL query statements & injection in practice (hand notes)

Table of Contents: Preface; Condition query; Query order; Limit result; Joint query; Display dislocation; SQL built-in functions; Built-in database and tables; mycli auxiliary commands; SQL injection types; Combat shooting range; Determine whether there is an injection point and its type; See how many fields it has; Database query; Get the table name; Burst field name; Get the password, view the value of the password field


Big data developers must learn to read YARN logs: task fault tolerance mechanism, task speculative execution, counters

Background: Almost every big data developer uses YARN's web interface at some point, for example to investigate failed or slow tasks, check detailed task progress, troubleshoot errors in detail, and debug. However, actual feedback suggests that many big data developers do not have an in-depth understanding of how to view the logs of the


Hive metadata analysis

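Preface: When parsing Hive SQL, tracking the relationship between Hive jobs and YARN applications, or performing data governance on a Hive data warehouse, you need a reasonably clear understanding of Hive metadata. That understanding makes it easier to enforce access control on data when parsing SQL, to attribute resources to their owners in resource management, and to manage data effectively in data lifecycle management. Hive metadata database and tables: Hive metadata is stored in MySQL. With a default installation, it is a database named hive, which contains a series of tables describing tables, partitions, data skew, data storage, compression, etc. version: Store hive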
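To make the excerpt above more concrete, here is a minimal sketch of the kind of join one might run against the metastore tables. It is an illustration under assumptions, not the article's own code: it uses an in-memory SQLite database as a stand-in for MySQL, keeps only a handful of columns from the metastore tables `DBS`, `TBLS`, `SDS`, and `PARTITIONS`, and all sample rows and the helper names `build_demo_metastore` and `table_overview` are hypothetical.

```python
import sqlite3


def build_demo_metastore():
    """Create a tiny in-memory stand-in for a Hive metastore (assumption:
    heavily simplified; a real metastore has many more tables and columns)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE DBS (DB_ID INTEGER PRIMARY KEY, NAME TEXT, DB_LOCATION_URI TEXT);
        CREATE TABLE SDS (SD_ID INTEGER PRIMARY KEY, LOCATION TEXT);
        CREATE TABLE TBLS (TBL_ID INTEGER PRIMARY KEY, DB_ID INTEGER, SD_ID INTEGER,
                           TBL_NAME TEXT, OWNER TEXT, TBL_TYPE TEXT);
        CREATE TABLE PARTITIONS (PART_ID INTEGER PRIMARY KEY, TBL_ID INTEGER, PART_NAME TEXT);

        -- Invented sample rows, for illustration only.
        INSERT INTO DBS VALUES (1, 'demo_db', 'hdfs://namenode/warehouse/demo_db.db');
        INSERT INTO SDS VALUES (1, 'hdfs://namenode/warehouse/demo_db.db/orders');
        INSERT INTO SDS VALUES (2, 'hdfs://namenode/warehouse/demo_db.db/users');
        INSERT INTO TBLS VALUES (1, 1, 1, 'orders', 'etl_user', 'MANAGED_TABLE');
        INSERT INTO TBLS VALUES (2, 1, 2, 'users', 'etl_user', 'EXTERNAL_TABLE');
        INSERT INTO PARTITIONS VALUES (1, 1, 'dt=2021-01-01');
        INSERT INTO PARTITIONS VALUES (2, 1, 'dt=2021-01-02');
        """
    )
    return conn


def table_overview(conn):
    """Return (database, table, owner, location, partition_count) per table."""
    query = """
        SELECT d.NAME, t.TBL_NAME, t.OWNER, s.LOCATION, COUNT(p.PART_ID)
        FROM TBLS t
        JOIN DBS d ON d.DB_ID = t.DB_ID
        JOIN SDS s ON s.SD_ID = t.SD_ID
        LEFT JOIN PARTITIONS p ON p.TBL_ID = t.TBL_ID
        GROUP BY d.NAME, t.TBL_NAME, t.OWNER, s.LOCATION
        ORDER BY d.NAME, t.TBL_NAME
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for row in table_overview(build_demo_metastore()):
        print(row)
```

A join like this is what ties a table name seen in parsed SQL back to an owner and an HDFS location, which is the basis for the access control and resource attribution mentioned in the excerpt.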


Introduction to Hive (8) Optimization Summary

Introduction to Hive (8) Optimization Summary: Property optimization (configuration optimization); Local mode; JVM reuse; Speculative execution; Fetch; Parallel execution; Compression; Vectorized query; Zero copy; Join optimization; CBO optimizer; Small file handling; Index optimization; Predicate pushdown; Map Join; Bucket Join; Task memory; Buffer size; Spill threshold; Merge thread; Reduce fetch parallelism; SQL optimization; Design optimization; Partition table; Bucket table; File storage; Out of memory optimization; Insufficient heap memory; Insufficient physical memory; Insufficient virtual memory; Optimization of data skew; Group by/count


Day14_20180426_Hive metadata configuration, development and use

1. Review
- Functions of Hive
    - Convert SQL into MapReduce programs and submit them to YARN to run
    - Map files on HDFS into tables
- Components of Hive
    - Hadoop
        - Storage: HDFS
        - Computation: MapReduce
    - metastore: all the schema information in the entire database
        - Mapping of tables and HDFS
        - Mapping of files
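The review point that Hive converts SQL into a MapReduce program can be illustrated with a small conceptual sketch. This is not Hive's actual planner or generated code; it only mimics, in plain Python, how a query such as `SELECT dept, COUNT(*) FROM emp GROUP BY dept` could be split into map, shuffle, and reduce steps over the lines of a file. The table `emp`, its `name,dept` layout, and all function names are assumptions made for illustration.

```python
from collections import defaultdict


def map_phase(lines):
    """Map: parse each 'name,dept' line and emit a (dept, 1) pair."""
    for line in lines:
        _name, dept = line.split(",")
        yield dept, 1


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key, as the framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: sum the values for each key, giving COUNT(*) per dept."""
    return {key: sum(values) for key, values in groups.items()}


def run_group_by_count(lines):
    """Chain the three phases for SELECT dept, COUNT(*) ... GROUP BY dept."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


if __name__ == "__main__":
    # Stand-in for the rows of a file on HDFS that the table is mapped to.
    file_lines = ["alice,sales", "bob,eng", "carol,sales"]
    print(run_group_by_count(file_lines))  # {'sales': 2, 'eng': 1}
```

The grouping column becomes the shuffle key, which is also why heavily skewed `GROUP BY` keys can overload a single reducer.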