@dasgoll
Last active September 28, 2022 06:23
If you’ve set up Hadoop for development and are wondering why you can’t read or write files or create MapReduce jobs, you’re probably missing a tiny bit of configuration. For most development systems in pseudo-distributed mode it’s easiest to disable permissions altogether. This means that any user, not just the “hdfs” user, can do anything they want to HDFS, so do not do this in production unless you have a very good reason.
If that’s the case and you really want to disable permissions, just add this snippet to your hdfs-site.xml file (located at /etc/hadoop-0.20/conf.empty/hdfs-site.xml on Debian Squeeze), inside the configuration section:
<property>
  <!-- Hadoop 0.20 name; Hadoop 2.x and later call this dfs.permissions.enabled -->
  <name>dfs.permissions</name>
  <value>false</value>
</property>
Then restart Hadoop (su to the “hdfs” user and run bin/stop-all.sh, then bin/start-all.sh) and try putting a file again. You should now be able to read and write with no restrictions.
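If you would rather script the edit than paste the snippet by hand, here is a minimal Python sketch. The helper name set_property is made up for this note and is not part of Hadoop; it only assumes the file has a <configuration> root with <property> children, as in the snippet above. Be aware that ElementTree drops XML comments and the stylesheet line when it rewrites the file, so keep a backup.

```python
import xml.etree.ElementTree as ET


def set_property(conf_path, name, value):
    """Set a <property> in a Hadoop-style config file (hypothetical helper).

    Updates the <value> if a property with this <name> already exists,
    otherwise appends a new <property> element to <configuration>.
    """
    tree = ET.parse(conf_path)
    root = tree.getroot()
    if root.tag != "configuration":
        raise ValueError("expected <configuration> root, got <%s>" % root.tag)

    for prop in root.findall("property"):
        # Tolerate whitespace around the name text.
        if (prop.findtext("name") or "").strip() == name:
            value_el = prop.find("value")
            if value_el is None:
                value_el = ET.SubElement(prop, "value")
            value_el.text = value
            break
    else:
        # No existing property with that name: add one.
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value

    tree.write(conf_path, encoding="utf-8", xml_declaration=True)
```

For the case in this note you would call set_property(path_to_hdfs_site_xml, "dfs.permissions", "false") and then restart Hadoop as described above. Running it twice is harmless, since the second call only rewrites the existing value.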
Superuser status - The user who started the Hadoop process (i.e., the user who actually ran bin/start-all.sh or bin/start-dfs.sh) is treated as the superuser for HDFS.
Supergroup - There is also a special group, named supergroup by default, whose members are also treated as superusers. The name of this group is set by the configuration parameter dfs.permissions.supergroup.
Disabling permissions - By default, permissions are enabled on HDFS. The permission system can be disabled by setting the configuration option dfs.permissions to false. The owner, group, and permissions bits associated with each file and directory will still be preserved, but the HDFS process does not enforce them, except when using permissions-related operations such as -chmod.
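To make the three notes above concrete, here is a rough Python model of how they interact. This is a simplified sketch for illustration, not the NameNode's actual code: the User, Inode and may_access names are invented here, and it leaves out the detail that operations such as -chmod are still checked when permissions are disabled.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    name: str
    groups: set = field(default_factory=set)


@dataclass
class Inode:
    owner: str
    group: str
    mode: int  # POSIX-style permission bits, e.g. 0o640


def may_access(user, inode, want, *, superuser,
               supergroup="supergroup", permissions_enabled=True):
    """Return True if `user` may access `inode`.

    `want` is a 3-bit mask: 4 = read, 2 = write, 1 = execute.
    """
    # dfs.permissions = false: bits are kept but not enforced.
    if not permissions_enabled:
        return True
    # The user who started HDFS, and members of the supergroup, bypass checks.
    if user.name == superuser or supergroup in user.groups:
        return True
    # Otherwise pick owner, group, or other bits, in that order.
    if user.name == inode.owner:
        bits = (inode.mode >> 6) & 7
    elif inode.group in user.groups:
        bits = (inode.mode >> 3) & 7
    else:
        bits = inode.mode & 7
    return (bits & want) == want
```

With a file owned by hdfs with mode 0o640, an unrelated user is refused a read while permissions_enabled is True, and allowed once it is False, which is exactly the behaviour change the dfs.permissions setting produces.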
========