
How to migrate On Prem Hadoop to GCP

Ask Time:2019-02-11T17:42:49         Author:Fr_nkenstien


I am trying to migrate our organization's Hadoop jobs to GCP... I am confused between GCP Dataflow and Dataproc...

I want to reuse the Hadoop jobs we have already created and minimize cluster management as much as possible. We also want to be able to persist data beyond the life of the cluster...

Can anyone suggest

Author: Fr_nkenstien. Reproduced under the CC 4.0 BY-SA copyright license with a link to the original source and this disclaimer.
Link to original article:https://stackoverflow.com/questions/54627640/how-to-migrate-on-prem-hadoop-to-gcp
skjagini :

I would just start with Dataproc, as it is very close to what you have.

Check out Dataproc initialization actions (https://cloud.google.com/dataproc/docs/concepts/configuring-clusters/init-actions), create a simple cluster, and get a feel for it.

Dataflow is completely managed and you don't operate any cluster resources, but at the same time you cannot migrate an on-premises cluster to Dataflow as is; you need to migrate (and sometimes rewrite) your Hive, Pig, Oozie, etc. jobs.

Cost for Dataflow is also calculated differently: there is no upfront cost compared with Dataproc, but every time you run a job on Dataflow you incur some cost associated with it.
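To get a feel for the Dataproc route, here is a minimal sketch in Python that only builds (and prints) the two gcloud commands involved: creating a cluster with one initialization action, then submitting an existing Hadoop jar to it. The cluster name, region, bucket, script, jar, and input/output paths are all hypothetical placeholders, not values from the question. The gs:// paths assume you keep the jar and the data in Cloud Storage rather than in the cluster's HDFS, which is one way to have data outlive the cluster.

```python
import shlex


def cluster_create_cmd(cluster, region, init_action_uri):
    """Build a `gcloud dataproc clusters create` command with one init action."""
    return [
        "gcloud", "dataproc", "clusters", "create", cluster,
        f"--region={region}",
        # Script that runs on each node right after the cluster is set up.
        f"--initialization-actions={init_action_uri}",
    ]


def hadoop_job_submit_cmd(cluster, region, jar_uri, job_args):
    """Build a `gcloud dataproc jobs submit hadoop` command for an existing jar."""
    return [
        "gcloud", "dataproc", "jobs", "submit", "hadoop",
        f"--cluster={cluster}",
        f"--region={region}",
        f"--jar={jar_uri}",
        # Everything after "--" is passed to the Hadoop job itself.
        "--", *job_args,
    ]


if __name__ == "__main__":
    # All names below are hypothetical placeholders.
    create = cluster_create_cmd(
        "migration-test", "us-central1", "gs://my-bucket/init/setup.sh"
    )
    submit = hadoop_job_submit_cmd(
        "migration-test", "us-central1",
        "gs://my-bucket/jars/existing-job.jar",
        ["gs://my-bucket/input/", "gs://my-bucket/output/"],
    )
    # Print the commands for review instead of running them.
    print(shlex.join(create))
    print(shlex.join(submit))
```

The sketch prints the commands rather than executing them, so you can check them before anything is created or billed. To run them, paste the output into a shell where gcloud is installed and authenticated.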
2019-02-12T05:11:48