Good afternoon, I am trying to import a very long CSV file into HBase. It comes from Open Food Facts and lists information about food products: ingredients, nutritional information, labels, etc. Most of the data is crowdsourced. The file is distributed through the French open public data platform.
I have seen this command for a csv file with 2 columns:
hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.columns=HBASE_ROW_KEY,cf1:name,cf2:exp bulktable /hbase/bulk_data.tsv
But I have more than 150 columns, and it seems difficult to copy them all into this command. Is there an easier way to import my large CSV file into my HBase database?
I tried the following:
hbase(main):005:0> hadoop jar /usr/lib/hbase/lib/hbase-server-1.0.0-cdh5.4.4.jar
importtsv '-Dimporttsv.separator=,'
-Dimporttsv.bulk.output=output
-Dimporttsv.columns=HBASE_ROW_KEY,code,f: url,f: creator,f: created_t,f: created_datetime, ...
...
f:water-hardness_100g store /bulkload/perftest/bouffe.csv
But the terminal responds with:
SyntaxError: (hbase):4: lbunknown regexp options - lb
The complete command is:
hbase(main):005:0> hadoop jar /usr/lib/hbase/lib/hbase-server-1.0.0-cdh5.4.4.jar importtsv '-Dimporttsv.separator=,' -Dimporttsv.bulk.output=output -Dimporttsv.columns=HBASE_ROW_KEY,code,f: url,f: creator,f: created_t,f: created_datetime,f: last_modified_t,f: last_modified_datetime,f: product_name,f: generic_name,f: quantity,f: packaging,f: packaging_tags,f: brands,f: brands_tags,f: categories,f: categories_tags,f: categories_fr,f: origins,f: origins_tags,f: manufacturing_places,f: manufacturing_places_tags,f: labels,f: labels_tags,f: labels_fr,f: emb_codes,f: emb_codes_tags,f: first_packaging_code_geo,f: cities,f: cities_tags,f: purchase_places,f: stores,f: countries,f: countries_tags,f: countries_fr,f: ingredients_text,f: allergens,f: allergens_fr,f: traces,f: traces_tags,f: traces_fr,f: serving_size,f: no_nutriments,f: additives_n,f: additives,f: additives_tags,f: additives_fr,f: ingredients_from_palm_oil_n,f: ingredients_from_palm_oil,f: ingredients_from_palm_oil_tags,f: ingredients_that_may_be_from_palm_oil_n,f: ingredients_that_may_be_from_palm_oil,f: ingredients_that_may_be_from_palm_oil_tags,f: nutrition_grade_uk,f: nutrition_grade_fr,f: pnns_groups_1,f: pnns_groups_2,f: states,f: states_tags,f: states_fr,f: main_category,f: main_category_fr,f: image_url,f: image_small_url,f: energy_100g,f: energy-from-fat_100g,f: fat_100g,f: saturated-fat_100g,f: butyric-acid_100g,f: caproic-acid_100g,f: caprylic-acid_100g,f: capric-acid_100g,f: lauric-acid_100g,f: myristic-acid_100g,f: palmitic-acid_100g,f: stearic-acid_100g,f: arachidic-acid_100g,f: behenic-acid_100g,f: lignoceric-acid_100g,f: cerotic-acid_100g,f: montanic-acid_100g,f: melissic-acid_100g,f: monounsaturated-fat_100g,f: polyunsaturated-fat_100g,f: omega-3-fat_100g,f: alpha-linolenic-acid_100g,f: eicosapentaenoic-acid_100g,f: docosahexaenoic-acid_100g,f: omega-6-fat_100g,f: linoleic-acid_100g,f: arachidonic-acid_100g,f: gamma-linolenic-acid_100g,f: dihomo-gamma-linolenic-acid_100g,f: omega-9-fat_100g,f: 
oleic-acid_100g,f: elaidic-acid_100g,f: gondoic-acid_100g,f: mead-acid_100g,f: erucic-acid_100g,f: nervonic-acid_100g,f: trans-fat_100g,f: cholesterol_100g,f: carbohydrates_100g,f: sugars_100g,f: sucrose_100g,f: glucose_100g,f: fructose_100g,f: lactose_100g,f: maltose_100g,f: maltodextrins_100g,f: starch_100g,f: polyols_100g,f: fiber_100g,f: proteins_100g,f: casein_100g,f: serum-proteins_100g,f: nucleotides_100g,f: salt_100g,f: sodium_100g,f: alcohol_100g,f: vitamin-a_100g,f: beta-carotene_100g,f: vitamin-d_100g,f: vitamin-e_100g,f: vitamin-k_100g,f: vitamin-c_100g,f: vitamin-b1_100g,f: vitamin-b2_100g,f: vitamin-pp_100g,f: vitamin-b6_100g,f: vitamin-b9_100g,f: folates_100g,f: vitamin-b12_100g,f: biotin_100g,f: pantothenic-acid_100g,f: silica_100g,f: bicarbonate_100g,f: potassium_100g,f: chloride_100g,f: calcium_100g,f: phosphorus_100g,f: iron_100g,f: magnesium_100g,f: zinc_100g,f: copper_100g,f: manganese_100g,f: fluoride_100g,f: selenium_100g,f: chromium_100g,f: molybdenum_100g,f: iodine_100g,f: caffeine_100g,f: taurine_100g,f: ph_100g,f: fruits-vegetables-nuts_100g,f: fruits-vegetables-nuts-estimate_100g,f: collagen-meat-protein-ratio_100g,f: cocoa_100g,f: chlorophyl_100g,f: carbon-footprint_100g,f: nutrition-score-fr_100g,f: nutrition-score-uk_100g,f: glycemic-index_100g,f: water-hardness_100gcode, f:url, f:creator, f:created_t, f:created_datetime, f:last_modified_t, f:last_modified_datetime, f:product_name, f:generic_name, f:quantity, f:packaging, f:packaging_tags, f:brands, f:brands_tags, f:categories, f:categories_tags, f:categories_fr, f:origins, f:origins_tags, f:manufacturing_places, f:manufacturing_places_tags, f:labels, f:labels_tags, f:labels_fr, f:emb_codes, f:emb_codes_tags, f:first_packaging_code_geo, f:cities, f:cities_tags, f:purchase_places, f:stores, f:countries, f:countries_tags, f:countries_fr, f:ingredients_text, f:allergens, f:allergens_fr, f:traces, f:traces_tags, f:traces_fr, f:serving_size, f:no_nutriments, f:additives_n, f:additives, 
f:additives_tags, f:additives_fr, f:ingredients_from_palm_oil_n, f:ingredients_from_palm_oil, f:ingredients_from_palm_oil_tags, f:ingredients_that_may_be_from_palm_oil_n, f:ingredients_that_may_be_from_palm_oil, f:ingredients_that_may_be_from_palm_oil_tags, f:nutrition_grade_uk, f:nutrition_grade_fr, f:pnns_groups_1, f:pnns_groups_2, f:states, f:states_tags, f:states_fr, f:main_category, f:main_category_fr, f:image_url, f:image_small_url, f:energy_100g, f:energy-from-fat_100g, f:fat_100g, f:saturated-fat_100g, f:butyric-acid_100g, f:caproic-acid_100g, f:caprylic-acid_100g, f:capric-acid_100g, f:lauric-acid_100g, f:myristic-acid_100g, f:palmitic-acid_100g, f:stearic-acid_100g, f:arachidic-acid_100g, f:behenic-acid_100g, f:lignoceric-acid_100g, f:cerotic-acid_100g, f:montanic-acid_100g, f:melissic-acid_100g, f:monounsaturated-fat_100g, f:polyunsaturated-fat_100g, f:omega-3-fat_100g, f:alpha-linolenic-acid_100g, f:eicosapentaenoic-acid_100g, f:docosahexaenoic-acid_100g, f:omega-6-fat_100g, f:linoleic-acid_100g, f:arachidonic-acid_100g, f:gamma-linolenic-acid_100g, f:dihomo-gamma-linolenic-acid_100g, f:omega-9-fat_100g, f:oleic-acid_100g, f:elaidic-acid_100g, f:gondoic-acid_100g, f:mead-acid_100g, f:erucic-acid_100g, f:nervonic-acid_100g, f:trans-fat_100g, f:cholesterol_100g, f:carbohydrates_100g, f:sugars_100g, f:sucrose_100g, f:glucose_100g, f:fructose_100g, f:lactose_100g, f:maltose_100g, f:maltodextrins_100g, f:starch_100g, f:polyols_100g, f:fiber_100g, f:proteins_100g, f:casein_100g, f:serum-proteins_100g, f:nucleotides_100g, f:salt_100g, f:sodium_100g, f:alcohol_100g, f:vitamin-a_100g, f:beta-carotene_100g, f:vitamin-d_100g, f:vitamin-e_100g, f:vitamin-k_100g, f:vitamin-c_100g, f:vitamin-b1_100g, f:vitamin-b2_100g, f:vitamin-pp_100g, f:vitamin-b6_100g, f:vitamin-b9_100g, f:folates_100g, f:vitamin-b12_100g, f:biotin_100g, f:pantothenic-acid_100g, f:silica_100g, f:bicarbonate_100g, f:potassium_100g, f:chloride_100g, f:calcium_100g, f:phosphorus_100g, f:iron_100g, 
f:magnesium_100g, f:zinc_100g, f:copper_100g, f:manganese_100g, f:fluoride_100g, f:selenium_100g, f:chromium_100g, f:molybdenum_100g, f:iodine_100g, f:caffeine_100g, f:taurine_100g, f:ph_100g, f:fruits-vegetables-nuts_100g, f:fruits-vegetables-nuts-estimate_100g, f:collagen-meat-protein-ratio_100g, f:cocoa_100g, f:chlorophyl_100g, f:carbon-footprint_100g, f:nutrition-score-fr_100g, f:nutrition-score-uk_100g, f:glycemic-index_100g, f:water-hardness_100g store /bulkload/perftest/bouffe.csv
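With this many columns, the value of -Dimporttsv.columns could be generated from the file's header line rather than copied by hand. The following is only a minimal sketch under several assumptions that the post does not confirm: the first line of the CSV is a header, the fields are comma-separated with no quoted commas, the first column serves as the row key, and every other column goes into a single column family. The function name build_columns and the sample header are hypothetical.

```shell
#!/bin/sh
# Build an importtsv column spec from a CSV header line read on stdin.
# Assumptions (not confirmed by the post): comma-separated header, no quoted
# commas, first column is the row key, all other columns share one family.
build_columns() {
    family="${1:-f}"   # hypothetical column family name, defaults to "f"
    head -n 1 | tr -d '\r' | awk -F',' -v fam="$family" '{
        out = "HBASE_ROW_KEY"              # first column becomes the row key
        for (i = 2; i <= NF; i++)          # remaining columns become family:qualifier
            out = out "," fam ":" $i
        print out
    }'
}

# Demo on a three-column sample header instead of the real file.
cols=$(printf 'code,url,creator\n123,http://example.org,someone\n' | build_columns f)
echo "$cols"   # prints HBASE_ROW_KEY,f:url,f:creator
```

Against the real file, the same function would be fed the file itself, for example `cols=$(build_columns f < bouffe.csv)` on a local copy, and the result passed as `-Dimporttsv.columns="$cols"`. This is an operating-system shell snippet, so it would be run from a regular terminal rather than at the hbase(main) prompt.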