Suppose I have multiple HDInsight 4.0 clusters and would like to access the Hadoop services running inside them, e.g. the jobhistory server. I get the jobhistory address of each cluster via the Ambari client configuration API. To be precise, I fetch the value of the "mapreduce.jobhistory.address" Hadoop property.
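For illustration, this is roughly how my tooling reads that property. It is only a minimal sketch: the base URL and the "INITIAL" tag are placeholders, the function names are made up for this post, and I assume the usual Ambari response shape in which the configuration appears under "items" with a "properties" map.

```python
from urllib.parse import urlencode

PROPERTY = "mapreduce.jobhistory.address"


def config_url(base_url: str, cluster: str, tag: str,
               config_type: str = "mapred-site") -> str:
    """Build the Ambari configurations URL for one config type and tag."""
    query = urlencode({"type": config_type, "tag": tag})
    return f"{base_url.rstrip('/')}/api/v1/clusters/{cluster}/configurations?{query}"


def jobhistory_address(config_response: dict) -> str:
    """Return the jobhistory address from a parsed Ambari JSON response.

    Assumes the response looks like {"items": [{"properties": {...}}]}.
    """
    for item in config_response.get("items", []):
        value = item.get("properties", {}).get(PROPERTY)
        if value is not None:
            return value
    raise KeyError(PROPERTY)


# Hand-written sample response (not real cluster output).
sample = {
    "items": [
        {
            "type": "mapred-site",
            "tag": "INITIAL",  # placeholder tag
            "properties": {PROPERTY: "headnodehost:10020"},
        }
    ]
}

print(config_url("https://hdi101.azurehdinsight.net", "hdi101", "INITIAL"))
print(jobhistory_address(sample))
```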
Ambari answers with the string "headnodehost:10020". This is fine, one might guess, if I'm on a cluster node, since every node has an /etc/hosts file that knows about the "headnodehost" hostname:
But I'm not on a cluster node! I have a setup that registers the unique hostnames of all nodes of every HDInsight cluster, so I can access them from my node. In other words, I'm on a node that can reach all of my HDInsight clusters (network connectivity is provided).
As you would guess, this is where things get complicated. What should I do with "headnodehost"? I cannot use the returned "headnodehost" hostname to establish TCP/IP connectivity, simply because every one of my HDInsight clusters has such a name, and it resolves to a different internal IP in each cluster. One might be tempted to say that I could just find the unique hostname of that very same node, like "hn0-hdi101.iuyf3i2yrrvetpdqnyswcj2c3b.fx.internal.cloudapp.net" or "hn0-hdi101", and use that for TCP/IP. But my automation (and client libraries) rely on the "mapreduce.jobhistory.address" Hadoop property, as well as on the following properties fetched from the Hadoop cluster via Ambari, so this approach would be a bottomless rabbit hole:
Is it possible to provision the HDInsight cluster so that the jobhistory service is configured with one of the unique hostnames?
Alternatively, is it possible to make the "headnodehost" alias globally unique, for example by prefixing it with the cluster ID ("hdi101.headnodehost") and configuring that name for Hadoop services like the jobhistory server during cluster creation? Additionally, keeping the "headnodehost" entry in the cluster's /etc/hosts would maintain backward compatibility for existing applications.
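To make the proposed naming scheme concrete, here is a minimal sketch of the mapping I have in mind. It only illustrates the naming. The function names are hypothetical, and the resulting name would still have to resolve (via DNS or a hosts entry) wherever it is used. Ideally the cluster itself would publish such values, so that clients never need to rewrite them.

```python
ALIAS = "headnodehost"


def qualify(address: str, cluster_id: str, alias: str = ALIAS) -> str:
    """Prefix the bare alias in 'host[:port]' with the cluster ID.

    Addresses whose host part is not exactly the alias are returned unchanged.
    """
    host, sep, port = address.rpartition(":")
    if not sep:
        # No port present: the whole string is the host.
        host = address
    if host != alias:
        return address
    qualified = f"{cluster_id}.{alias}"
    return f"{qualified}:{port}" if sep else qualified


def qualify_properties(props: dict, cluster_id: str) -> dict:
    """Apply qualify() to every value of a property map."""
    return {key: qualify(value, cluster_id) for key, value in props.items()}


props = {"mapreduce.jobhistory.address": "headnodehost:10020"}
print(qualify_properties(props, "hdi101"))
```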