mysql - Converting date to timestamp in Hive
I have a table in an RDBMS with a date column in the format '1986-12-01'. I am using Hive 0.8, not 0.12.
When I import the data into Hive, the timestamp column comes out as NULL. Is there an option to populate the data into the table directly from the file (the data is pretty big)? Or do I have to use a staging table with a string column and then use a function to convert the data into a timestamp?
Thanks!
I will answer based on MySQL, because I see that tag as the RDBMS name in your post. You then have three options.
1.- Filtering on the Sqoop query side
I assume here that you import the data using Sqoop. That tool has an option that allows you to import the result of a SQL query. In that query you can use the MySQL function UNIX_TIMESTAMP() (together with STR_TO_DATE() to parse a string date) to transform the date into a Unix timestamp. The Sqoop instruction would be something like this:
sqoop import --connect jdbc:mysql://mysqlhost/mysqldb --username user --password passwd --query "select col_1, ..., unix_timestamp(str_to_date(date_col, '%Y-%m-%d')) from table1 where \$CONDITIONS" -m 1 --target-dir hive_table1_data
Notice that the WHERE \$CONDITIONS clause is mandatory. Furthermore, I have assumed here that the date column is a string. If it is a DATE type, the STR_TO_DATE call is not needed.
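If the date column is already a DATE type in MySQL, the query simplifies to a plain UNIX_TIMESTAMP() call. A minimal sketch under that assumption, reusing the placeholder names from above:
sqoop import --connect jdbc:mysql://mysqlhost/mysqldb --username user --password passwd --query "select col_1, ..., unix_timestamp(date_col) from table1 where \$CONDITIONS" -m 1 --target-dir hive_table1_data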
2.- On the RDBMS side
At this point I assume that you have permission to create views in your database.
Another solution is to create a view that contains the date field transformed into a timestamp. As seen above, this uses UNIX_TIMESTAMP() (the MySQL documentation has more details). In this case you could write something like this:
create view view_to_export as select field1, ..., unix_timestamp(str_to_date(field_date, '%Y-%m-%d')) from table1;
I am assuming the date field is a string data type; if it is a DATE data type, the STR_TO_DATE call is not needed. The Sqoop instruction would then be like this:
sqoop import --connect jdbc:mysql://mysqlhost/mysqldb --username user --password passwd --table view_to_export -m 1 --target-dir hive_table1_data
So then, with that view, you can use Sqoop without problems.
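Once the import finishes, you can put a Hive table on top of the files Sqoop wrote. This is only a sketch: the HDFS path, table name, and column list are placeholders, and it assumes Sqoop's default comma-delimited text output, with the converted date stored as seconds since the epoch:
create external table imported_table1 (
  field1 string,
  field_date_ts bigint -- seconds since epoch, produced by unix_timestamp() in the view
)
row format delimited fields terminated by ','
location '/user/youruser/hive_table1_data';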
3.- After importing the data
In case you use some other kind of import system and you already have the data stored in HDFS, you can create a new table with the field transformed. To do that, you can use something like this:
insert into table my_table select col_1, ..., unix_timestamp(date_col, 'yyyy-MM-dd') from exported_table;
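For reference, a sketch of the target table that query assumes; the name and columns are hypothetical, and exported_table is whatever Hive table already points at your raw imported data with the date still stored as a string:
create table my_table (
  col_1 string,
  -- ... other columns elided ...
  date_ts bigint -- unix_timestamp() returns seconds since epoch as a bigint
);
Since unix_timestamp() returns a bigint rather than a timestamp, you can format the value back for reading with from_unixtime() when you query the table.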
In this case you need more time to process. But it is independent of the way you import the data, and it uses fewer resources on the MySQL server, because you don't have to calculate the transformation from date to timestamp for each row there. You delegate to Hadoop the mission of processing a huge amount of data.