Big Data SQL Quick Start. Binary Images and Big Data SQL – Part 22

Big Data SQL Quick Start. Binary Images and Big Data SQL – Part 22

Many thanks to Dario Vega, who is the actual author of this content, I’m just publishing it in the Big Data SQL blog.

Create a hive table with a binary field and cast to BLOB type in RDBMS when using big data sql

[oracle@tvpbdaacn13 dvega]$   /usr/bin/avro-tools tojson avro.file.dvega  | more
{"zipcode":{"string":"00720"},
 "lastname":{"string":"ALBERT"},
 "firstname":{"string":"JOSE"},
 "ssn":{"long":253181087},
 "gender":{"string":"male"},
 "license":{"bytes":"S11641384"}}[oracle@tvpbdaacn13 dvega]$   /usr/bin/parquet-tools head  parquet.file.dvega 
zipcode= 00566
lastname= ALEXANDER
firstname= PETER
ssn= 637221663
gender= male
license= UzY4NTkyNTc4

Simulating using Linux tools

On hive:

createtable image_new_test (img binary);

On Oracle:

SQL>CREATETABLE image_new_test
    (
    IMG BLOB
    )
    ORGANIZATION EXTERNAL
    (
      TYPE ORACLE_HIVE
      DEFAULT DIRECTORY DEFAULT_DIR
      ACCESSPARAMETERS
      (
        com.oracle.bigdata.cluster= tvpbdaacluster3
        com.oracle.bigdata.tablename: pmt.image_new_test 
      )
    );

On Linux:

base64 --w 10000000 YourImage.PNG > YourImage.BASE64
#Be sure to have only one line before copy to hadoop. If not fix
wc -l YourImage.BASE64

# you can concat many images on the same BASE64  file - one image by line

hadoop fs -put  Capture.BASE64 hdfs://tvpbdaacluster3-ns/user/hive/warehouse/pmt.db/image_new_test

or use load hive commands

Validate using SQL Developer:

1 Big Data SQL Quick Start. Binary Images and Big Data SQL – Part 22

Compare to the original one:

2 Big Data SQL Quick Start. Binary Images and Big Data SQL – Part 22

Original tables:

SQL> createtable image ( id number, img BLOB);

insert an image using sqldeveloper

REM create an externaltabletocopy the dmp files to hadoop

CREATETABLE image_dmp
   ORGANIZATION EXTERNAL (
     TYPE oracle_datapump
     DEFAULT DIRECTORY DEFAULT_DIR
     LOCATION ('filename1.dmp')
     )
   ASSELECT*FROM image;

Hive Tables:

#copy files to hadoop eg. on/user/dvega/images/filename1.dmp

CREATEEXTERNALTABLE image_hive_dmp
   ROW FORMAT SERDE 'oracle.hadoop.hive.datapump.DPSerDe'
   STORED AS
      INPUTFORMAT  'oracle.hadoop.hive.datapump.DPInputFormat'
      OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'LOCATION'/user/oracle/dvega/images/';

createtable image_hive_text
asselect*from image_hive_dmp
;

Big Data SQL tables:

CREATETABLE IMAGE_HIVE_DMP
(
  ID NUMBER
, IMG BLOB
)
ORGANIZATION EXTERNAL
(
  TYPE ORACLE_HIVE
  DEFAULT DIRECTORY DEFAULT_DIR
  ACCESSPARAMETERS
  (
    com.oracle.bigdata.cluster= tvpbdaacluster3
    com.oracle.bigdata.tablename: pmt.image_hive_dmp
  )
);

CREATETABLE IMAGE_HIVE_TEXT
(
  ID NUMBER
, IMG BLOB
)
ORGANIZATION EXTERNAL
(
  TYPE ORACLE_HIVE
  DEFAULT DIRECTORY DEFAULT_DIR
  ACCESSPARAMETERS
  (
    com.oracle.bigdata.cluster= tvpbdaacluster3
    com.oracle.bigdata.tablename: pmt.image_hive_text
  )
);

Let’s block ads! (Why?)

Oracle Blogs | Oracle The Data Warehouse Insider Blog