question

HariGS-9454 asked

HDInsight cluster creates files with a write mask

Hi,

We are using Data Lake Gen1 as the data store for an HDInsight cluster (version 3.6). I wrote a simple Spark job that writes a file using a PySpark save command. The file is created in the Data Lake with a write mask applied, so no other user is able to delete it. I don't have owner access on the Data Lake account, but I have the necessary read/write/execute access through the default file permission settings. However, since the write mask is set by the Service Principal used by the cluster, I am unable to delete the file.

Is this expected behavior? Is there a workaround?

azure-data-lake-storage, azure-hdinsight

Hello HariGS,

We are reaching out to the internal team to get help on this. I will update you once we hear from them.


1 Answer

HimanshuSinhamfst-5269 answered

Hello Hari,

This seems to be the expected behavior: Spark uses the HDFS client library to write files to ADLS Gen1.

The reason Spark issues create calls with 644 permissions for new files, and does not honor the default permissions of the parent folder, is that the "spark.hadoop.fs.permissions.umask-mode" mask (default 022) is applied. You can read more about this setting in the Hadoop configuration documentation.
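
If the goal is to change the mask itself rather than fix permissions after the fact, the same "spark.hadoop.fs.permissions.umask-mode" key can be set in the cluster's Spark configuration or when a session is built. Below is a minimal sketch, assuming you control session creation; the "000" value is only an illustration (it disables the mask entirely), so choose a value that matches your security requirements.

 import org.apache.spark.sql.SparkSession

 // Illustrative sketch: override the default umask (022) at session creation.
 // "000" applies no mask, so new files keep the permissions the writer requests.
 val spark = SparkSession.builder()
   .appName("umask-example")
   .config("spark.hadoop.fs.permissions.umask-mode", "000")
   .getOrCreate()

On a managed cluster (Databricks or HDInsight), where the session already exists when your code runs, the same key would instead go into the cluster's Spark configuration.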


You can also try resetting the permissions on the ADLS folder/file through code or with the Azure CLI.

We tried the following on Databricks to reset permissions on ADLS Gen1 and Gen2, and it worked.



 %scala
 import org.apache.hadoop.fs.Path
 import org.apache.hadoop.fs.permission.FsPermission

 // Sets the given permissions (octal string, e.g. "777") on a single path.
 def setPermissions(stringPath: String, stringPermissions: String): Unit = {
   val conf = spark.sessionState.newHadoopConf()
   val path = new Path(stringPath)
   val fs = path.getFileSystem(conf)
   val perms = new FsPermission(stringPermissions)
   fs.setPermission(path, perms)
 }

 setPermissions("adl://<account>.azuredatalakestore.net/<path-to-file>", "777")



Please let me know how it goes.

Thanks
Himanshu

Please consider clicking "Accept Answer" and "Up-vote" on the post that helped you, as it can be beneficial to other community members.




Hi Himanshu,
Thanks for the clarification. I was able to change the permissions using the hdfs dfs -chmod command, and that worked. But this needs to be run from the Spark code. Also, every time a new file or folder is created (for example via saveAsTable), these masks are set, and we would need additional code to update the permissions, as in the sketch below.

Is there a way to change the default mask setting so the write mask is not applied? That way permissions would be controlled by the default ACLs, which is more transparent.
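
As a hedged illustration of that "additional code": a recursive variant of the setPermissions snippet from the answer could re-apply permissions to everything a job just wrote. The helper name and the path below are placeholders, not from this thread.

 %scala
 import org.apache.hadoop.fs.Path
 import org.apache.hadoop.fs.permission.FsPermission

 // Hypothetical helper: applies the permissions to a path and, if it is a
 // directory, recurses into everything underneath it.
 def setPermissionsRecursively(stringPath: String, stringPermissions: String): Unit = {
   val conf = spark.sessionState.newHadoopConf()
   val fs = new Path(stringPath).getFileSystem(conf)
   val perms = new FsPermission(stringPermissions)

   def walk(p: Path): Unit = {
     fs.setPermission(p, perms)
     if (fs.getFileStatus(p).isDirectory) {
       fs.listStatus(p).foreach(s => walk(s.getPath))
     }
   }

   walk(new Path(stringPath))
 }

 setPermissionsRecursively("adl://<account>.azuredatalakestore.net/<output-folder>", "777")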



