Apache HBase is a powerful NoSQL database, but out of the box it’s a bit like a house with no locks on the doors. Any user who can connect to the cluster can read, write, and delete data, or even drop tables entirely. In any real-world production environment, this is a major security risk. 😱
Fortunately, HBase has a robust, built-in authorization system that allows you to control user access with fine-grained precision. In this guide, we’ll walk through the essential steps to enable authorization and manage user permissions.
🔐 Step 1: Enable Authorization in hbase-site.xml
The first step is to activate the security features by modifying the hbase-site.xml configuration file. These changes must be applied to all HMaster and RegionServer nodes in your cluster. A full cluster restart will be required for them to take effect.
Add the following properties to your hbase-site.xml:
<property>
  <name>hbase.security.authorization</name>
  <value>true</value>
  <description>The master switch to enable HBase's built-in access control.</description>
</property>
<property>
  <name>hbase.coprocessor.master.classes</name>
  <value>org.apache.hadoop.hbase.security.access.AccessController</value>
  <description>Registers the AccessController coprocessor on the HMaster to handle administrative-level access checks (e.g., create/drop table).</description>
</property>
<property>
  <name>hbase.coprocessor.region.classes</name>
  <value>org.apache.hadoop.hbase.security.token.TokenProvider,org.apache.hadoop.hbase.security.access.AccessController</value>
  <description>Registers the AccessController on RegionServers to handle data-level access checks (e.g., read/write). TokenProvider is also included for authentication purposes.</description>
</property>
<property>
  <name>hbase.rpc.engine</name>
  <value>org.apache.hadoop.hbase.ipc.SecureRpcEngine</value>
  <description>Selects the secure RPC engine for communication between clients and servers. This setting dates from older HBase releases (0.94 and earlier); later releases folded secure RPC into the core engine and no longer need it.</description>
</property>
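Since a typo in one of these properties can leave the cluster unprotected, a quick sanity check before restarting may be worthwhile. The following is a minimal Python sketch, not an official HBase tool: the function name missing_settings and the REQUIRED table are illustrative assumptions, and it only checks the three authorization properties discussed in this guide.

```python
import xml.etree.ElementTree as ET

# Properties from the snippet above and the value each one should contain.
# (Illustrative table; extend it to match your own configuration.)
REQUIRED = {
    "hbase.security.authorization": "true",
    "hbase.coprocessor.master.classes":
        "org.apache.hadoop.hbase.security.access.AccessController",
    "hbase.coprocessor.region.classes":
        "org.apache.hadoop.hbase.security.access.AccessController",
}


def missing_settings(xml_text):
    """Return the names of required properties that are absent or wrong."""
    root = ET.fromstring(xml_text)
    found = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        found[name] = value

    problems = []
    for name, expected in REQUIRED.items():
        # Coprocessor lists are comma-separated, so test for membership
        # rather than equality.
        values = [v.strip() for v in found.get(name, "").split(",")]
        if expected not in values:
            problems.append(name)
    return problems
```

To use it, read the contents of your hbase-site.xml into a string and pass it to missing_settings; an empty list means all three properties were found with the expected values.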
What Do These Properties Do?
- hbase.security.authorization: This is the main toggle. Setting it to true turns on the authorization engine.
- Coprocessors: Think of HBase Coprocessors like triggers in a traditional database. They intercept events and execute custom code. The AccessController is the specific coprocessor that checks if a user has the required permissions before allowing an action to proceed. We need to load it on both the Master (for cluster-level actions) and the RegionServers (for data-level actions).
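To make the trigger analogy concrete, here is a toy Python model of a hook that runs before each operation. It is only a sketch of the idea: ToyAccessController, ToyTable, and PermissionDenied are made-up names, and none of this is the real HBase coprocessor API.

```python
class PermissionDenied(Exception):
    """Raised when the hook rejects an operation."""


class ToyAccessController:
    """Stand-in for the idea of a coprocessor hook; not the HBase API."""

    def __init__(self, grants):
        # grants maps a user name to the set of actions that user may perform.
        self.grants = grants

    def pre_operation(self, user, action):
        # Runs before the operation, much like a "before" trigger.
        if action not in self.grants.get(user, set()):
            raise PermissionDenied(f"{user} lacks {action}")


class ToyTable:
    """A dictionary-backed table that consults the hook on every call."""

    def __init__(self, hook):
        self.hook = hook
        self.rows = {}

    def put(self, user, key, value):
        self.hook.pre_operation(user, "WRITE")  # check first
        self.rows[key] = value                  # then do the work

    def get(self, user, key):
        self.hook.pre_operation(user, "READ")
        return self.rows.get(key)
```

The point of the design is that the table code never decides who is allowed to do what; it simply asks the hook, which is why the same AccessController can guard both Master and RegionServer operations.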
Once you’ve saved these changes and restarted your cluster, authorization is active. Be aware: at this point, only the HBase superuser (typically the hbase user) can perform any actions!
🔑 Step 2: Managing Permissions via the HBase Shell
Now that security is enabled, you need to grant permissions to your users. You’ll do this as the hbase superuser from the HBase shell.
First, connect to the shell:
[youruser@master ~]$ sudo -u hbase hbase shell
The grant Command
The grant command is your primary tool for assigning permissions. Its syntax is flexible, allowing for global, table-level, column-family-level, or even column-qualifier-level access.
Syntax:
grant <user | @group>, <permissions> [, <table> [, <column family> [, <column qualifier>]]]
Permissions: Permissions are represented by a string of one or more of these characters:
- R - Read: The ability to read data from cells.
- W - Write: The ability to write data to cells.
- X - Execute: The ability to execute coprocessor endpoints.
- C - Create: The ability to create tables or column families.
- A - Admin: The ability to perform cluster-level administrative actions (e.g., balancing the cluster, dropping tables).
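If you review permission strings in scripts or change requests, it can help to expand them into readable names. Here is a small Python sketch under that assumption; decode_permissions is a hypothetical helper, and the action names simply mirror the list above.

```python
# Letter-to-action mapping taken from the list above.
ACTIONS = {
    "R": "READ",
    "W": "WRITE",
    "X": "EXEC",
    "C": "CREATE",
    "A": "ADMIN",
}


def decode_permissions(spec):
    """Expand a permission string such as 'RW' into action names.

    Letters are matched exactly as listed; anything else is rejected.
    Repeated letters are reported once.
    """
    names = []
    for letter in spec:
        if letter not in ACTIONS:
            raise ValueError(f"unknown permission letter: {letter!r}")
        name = ACTIONS[letter]
        if name not in names:
            names.append(name)
    return names


print(decode_permissions("RW"))  # prints ['READ', 'WRITE']
```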
Practical Examples ✍️
Let’s look at some common scenarios.
1. Granting Global Admin Access:
To give a user named data_admin full administrative rights over the entire cluster:
hbase(main):001:0> grant 'data_admin', 'RWXCA'
0 row(s) in 1.2340 seconds
2. Granting Read/Write Access to a Specific Table:
To allow the user app_user to read from and write to the users table:
hbase(main):002:0> grant 'app_user', 'RW', 'users'
0 row(s) in 0.5670 seconds
3. Granting Read-Only Access to a Column Family:
To give a user analyst read-only access to just the metrics column family within the sensor_data table:
hbase(main):003:0> grant 'analyst', 'R', 'sensor_data', 'metrics'
0 row(s) in 0.4890 seconds
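When there are many users to set up, typing each grant by hand gets error-prone. The sketch below, in Python, builds the command text for the three scopes shown above so it can be reviewed before being pasted into the shell. The helper names quote and grant_command are assumptions made for illustration; the script only produces strings and never talks to HBase itself.

```python
VALID_LETTERS = set("RWXCA")


def quote(value):
    """Wrap a value in single quotes, as in the shell examples above."""
    if "'" in value or "\\" in value:
        # Keep the sketch simple: refuse names that would need escaping.
        raise ValueError(f"unsupported character in {value!r}")
    return f"'{value}'"


def grant_command(user, permissions, table=None, family=None, qualifier=None):
    """Build a grant command string for a global, table, family, or qualifier scope."""
    if not permissions or not set(permissions) <= VALID_LETTERS:
        raise ValueError(f"invalid permission string: {permissions!r}")
    if family is not None and table is None:
        raise ValueError("a column family requires a table")
    if qualifier is not None and family is None:
        raise ValueError("a column qualifier requires a column family")

    args = [user, permissions]
    for scope in (table, family, qualifier):
        if scope is None:
            break  # narrower scopes are only valid after broader ones
        args.append(scope)
    return "grant " + ", ".join(quote(arg) for arg in args)


print(grant_command("analyst", "R", "sensor_data", "metrics"))
# prints: grant 'analyst', 'R', 'sensor_data', 'metrics'
```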
👀 Step 3: Verifying and Revoking Permissions
Managing security isn’t just about granting access; you also need to be able to check and revoke it.
Verifying Permissions (user_permission)
To see the permissions for a specific table, use the user_permission command. This is incredibly useful for auditing.
hbase(main):004:0> user_permission 'users'
User Namespace,Table,Family,Qualifier:Permission
app_user default,users,,: [Action: READ, WRITE]
data_admin default,users,,: [Action: READ, WRITE, EXEC, CREATE, ADMIN]
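For auditing across many tables, you may want this output in a structured form. The Python sketch below parses lines in exactly the layout shown above; that layout is an assumption, since the shell's formatting can differ between HBase versions, and parse_user_permission is a hypothetical helper name.

```python
import re

# Matches lines such as:
#   app_user default,users,,: [Action: READ, WRITE]
LINE = re.compile(
    r"^\s*(?P<user>\S+)\s+"
    r"(?P<namespace>[^,\s]*),(?P<table>[^,]*),(?P<family>[^,]*),(?P<qualifier>[^:]*):"
    r"\s*\[Action:\s*(?P<actions>[^\]]*)\]\s*$"
)


def parse_user_permission(output):
    """Turn user_permission output into a list of dictionaries."""
    records = []
    for line in output.splitlines():
        match = LINE.match(line)
        if match is None:
            continue  # skip the header, blank lines, and anything unexpected
        record = match.groupdict()
        record["actions"] = [
            action.strip()
            for action in record["actions"].split(",")
            if action.strip()
        ]
        records.append(record)
    return records
```

Each record has the keys user, namespace, table, family, qualifier, and actions; an empty string for family or qualifier means the grant applies to the whole table.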
Revoking Permissions (revoke)
To remove a user’s permissions, use the revoke command. Its syntax mirrors grant, minus the permissions string.
1. Revoke All Permissions from a User on a Table:
hbase(main):005:0> revoke 'app_user', 'users'
0 row(s) in 0.8760 seconds
2. Revoke Global Admin Rights:
hbase(main):006:0> revoke 'data_admin'
0 row(s) in 0.9120 seconds
After running these commands, remember to quit the shell.
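When someone leaves a team, you typically want to revoke everything they were granted. As a rough Python sketch, the function below turns a list of known grants into the matching revoke commands for one user. Both the function name revoke_commands_for and the shape of the entries (dictionaries with user, table, family, and qualifier keys, where an empty string means "not set") are assumptions made for this example; the output is plain text for you to review and run in the shell yourself.

```python
def revoke_commands_for(user, entries):
    """Build one revoke command per distinct scope granted to `user`."""
    commands = []
    for entry in entries:
        if entry["user"] != user:
            continue
        args = [user]
        for key in ("table", "family", "qualifier"):
            value = entry.get(key) or ""
            if not value:
                break  # an empty table means a global grant, and so on
            args.append(value)
        if any("'" in arg for arg in args):
            # Keep the sketch simple: refuse names that would need escaping.
            raise ValueError(f"unsupported quote character in {args!r}")
        command = "revoke " + ", ".join(f"'{arg}'" for arg in args)
        if command not in commands:
            commands.append(command)
    return commands


grants = [
    {"user": "app_user", "table": "users", "family": "", "qualifier": ""},
    {"user": "data_admin", "table": "", "family": "", "qualifier": ""},
]
print(revoke_commands_for("app_user", grants))
# prints: ["revoke 'app_user', 'users'"]
```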
Useful Links
For more in-depth information, the official Apache HBase documentation is the best resource: