Configure HDFS Storage for Zeppelin Notebooks on HDInsight Spark Clusters with ESP

Christoph Kiefer 141 Reputation points
2020-08-28T09:55:45.47+00:00

Dear All

We followed this step-by-step tutorial to configure HDFS storage for Zeppelin notebooks on our ESP-enabled HDInsight Spark Cluster (HDI 3.9, Spark 2.3): https://docs.cloudera.com/HDPDocuments/HDP2/HDP-2.6.5/bk_zeppelin-component-guide/content/ch_zeppelin_upgrade_hdfs_storage.html

However, once we restarted the Zeppelin service and navigate to the Zeppelin UI, we realized that no automatic login for the currently logged in Ambari user is performed anymore, and second, that we are not able at all to log in to Zeppelin anymore.

The shiro.ini that get's created looks as follows (I use a dummy name for our AD domain name):

[main]
krbRealm = org.apache.zeppelin.realm.kerberos.KerberosRealm
krbRealm.principal=HTTP/hn0-prdupc.ourdomain.onmicrosoft.com@OURDOMAIN.ONMICROSOFT.COM
krbRealm.keytab=/etc/security/keytabs/spnego.service.keytab
krbRealm.nameRules=DEFAULT
krbRealm.signatureSecretFile=/etc/security/http_secret
krbRealm.tokenValidity=36000
krbRealm.cookieDomain=azurehdinsight.net
krbRealm.cookiePath=/
authc = org.apache.zeppelin.realm.kerberos.KerberosAuthenticationFilter

sessionManager = org.apache.shiro.web.session.mgt.DefaultWebSessionManager
cacheManager = org.apache.shiro.cache.MemoryConstrainedCacheManager
securityManager.cacheManager = $cacheManager
securityManager.sessionManager = $sessionManager
# 86,400,000 milliseconds = 24 hour
securityManager.sessionManager.globalSessionTimeout = 86400000

shiro.loginUrl = /api/login

[roles]
admin = ckiefer

[urls]
# anon means the access is anonymous.
/api/version = anon
/api/interpreter/** = authc, roles[admin]
/api/configurations/** = authc, roles[admin]
/api/credential/** = authc, roles[admin]
/** = authc

Any help is appreciated to debug and solve this issue.

Update: I did not yet think about Zeppelin impersonation and Ranger policies. Is it also necessary to set this up?

BR, Christoph

Azure HDInsight
Azure HDInsight
An Azure managed cluster service for open-source analytics.
199 questions
{count} votes