Dynamic library permissions with ArcticDB on AWS S3
Goal
One of the advantages of ArcticDB is how easy it is to set up and use as a personal database. But how can we extend this pattern to an organisation? How can we keep it trivial to use as an individual, while allowing secure sharing of data with teammates and groups across the organisation? We also want this to be easy to maintain, a key challenge for permissions generally.
Here we model a small two-team organisation, 'Acme', in AWS and create some flexible permissions that allow users and teams to create and use private and shared data without any per-library setup.
- Data Team
  - Jane
  - Samir
- Quant Team
  - Alan
  - Diana
Each user should be able to:
- list all ArcticDB libraries (but not their content)
- create personal libraries that only they can read and write to
- create team libraries that only those in their team can read and write to
Users will follow an ArcticDB library naming convention. The library name should be one of <USERNAME>/<LIBRARY> or <TEAM>/<LIBRARY>, so if Jane wants to create a personal library for weather data, they would use arctic.create_library('jane@acme/weather') (assuming the AWS S3 setup below).
We can do this with path-prefix permissions in AWS S3. Other backends, such as MinIO, also support path-based permissions.
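The naming convention can be captured in a small helper so that names are built the same way everywhere. This is only a sketch: library_name is a hypothetical function, not part of ArcticDB, and the rule that neither part may contain a slash is an assumption made here to keep the prefix unambiguous.

```python
def library_name(owner: str, library: str) -> str:
    """Build '<owner>/<library>', where owner is an IAM username or a team tag.

    Hypothetical helper, not part of ArcticDB. Rejecting '/' inside either
    part is an assumption of this sketch, not an ArcticDB rule.
    """
    for part in (owner, library):
        if not part or "/" in part:
            raise ValueError(f"invalid name component: {part!r}")
    return f"{owner}/{library}"


print(library_name("jane@acme", "weather"))  # jane@acme/weather
print(library_name("data", "forecast"))      # data/forecast
```

The resulting string is what would be passed to arctic.create_library or arctic.get_library.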
AWS S3
You should already have a number of users set up in AWS IAM, along with a group containing all the users. Follow the IAM docs for help with that.
For Acme we've set up four users, each tagged with the team they are a member of:
aws:username | aws:PrincipalTag/team |
---|---|
jane@acme | data |
samir@acme | data |
alan@acme | quant |
diana@acme | quant |
We've also created a user group, acme, containing all four users.
Let's create an S3 bucket for Acme, acme-arcticdb, using CloudShell.
aws s3 mb s3://acme-arcticdb
Setting up the permissions policy
In general, for read access our users will need s3:ListBucket and s3:GetObject permissions, and for write access they will additionally need s3:PutObject and s3:DeleteObject.
Then set up the following access policy. We will save this snippet to policy.json.
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ListObjects",
            "Effect": "Allow",
            "Action": "s3:ListBucket",
            "Resource": "arn:aws:s3:::acme-arcticdb",
            "Condition": {
                "StringLike": {
                    "s3:prefix": [
                        "_arctic_cfg/*",
                        "${aws:username}/*",
                        "${aws:PrincipalTag/team}/*"
                    ]
                }
            }
        },
        {
            "Sid": "PutGetDeleteObjects",
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:GetObject",
                "s3:DeleteObject"
            ],
            "Resource": [
                "arn:aws:s3:::acme-arcticdb/${aws:username}/*",
                "arn:aws:s3:::acme-arcticdb/${aws:PrincipalTag/team}/*",
                "arn:aws:s3:::acme-arcticdb/_arctic_cfg/cref/?sUt?${aws:username}/*",
                "arn:aws:s3:::acme-arcticdb/_arctic_cfg/cref/?sUt?${aws:PrincipalTag/team}/*"
            ]
        }
    ]
}
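If your bucket has a different name, the same policy can be generated rather than edited by hand. The sketch below rebuilds the policy document shown above from a bucket name. build_policy is a hypothetical helper introduced here for illustration, and the output is meant to be redirected to policy.json.

```python
import json


def build_policy(bucket: str) -> dict:
    """Return the example access policy for the given bucket name.

    Hypothetical helper: it reproduces the policy above with only the
    bucket name swapped in. The ${...} strings are IAM policy variables
    and are deliberately left unresolved for AWS to fill in.
    """
    arn = f"arn:aws:s3:::{bucket}"
    user = "${aws:username}"
    team = "${aws:PrincipalTag/team}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ListObjects",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": arn,
                "Condition": {
                    "StringLike": {
                        "s3:prefix": ["_arctic_cfg/*", f"{user}/*", f"{team}/*"]
                    }
                },
            },
            {
                "Sid": "PutGetDeleteObjects",
                "Effect": "Allow",
                "Action": ["s3:PutObject", "s3:GetObject", "s3:DeleteObject"],
                "Resource": [
                    f"{arn}/{user}/*",
                    f"{arn}/{team}/*",
                    f"{arn}/_arctic_cfg/cref/?sUt?{user}/*",
                    f"{arn}/_arctic_cfg/cref/?sUt?{team}/*",
                ],
            },
        ],
    }


policy = build_policy("acme-arcticdb")
print(json.dumps(policy, indent=4))
```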
Create the policy in AWS.
aws iam create-policy --policy-name acme-arcticdb-access --policy-document file://policy.json
Take note of the Arn in the output of the last command, as you'll need it to attach the policy to a group containing all your users. For this example the group is acme.
aws iam attach-group-policy --policy-arn <ARN> --group-name acme
If you intend to adapt that example policy to your own situation then please note that:
- acme-arcticdb is the name of the bucket and will need to be replaced everywhere.
- s3:ListBucket is used to permission ListObjectsV2 and needs its own section, as it applies to the bucket as a whole. We control access to paths by checking the s3:prefix argument that's part of the ListObjectsV2 request.
- Put, Get and Delete can be specified for object paths in the second section.
- _arctic_cfg/cref/* is where the ArcticDB library configuration is stored, and the data for each library is stored in the root of the bucket with a path that starts with the library name.
- By using ${aws:username} and ${aws:PrincipalTag/team} we've restricted library access to those with a matching AWS IAM username or a matching user 'team' tag.
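To build intuition for how the object-level resource patterns behave, the matching can be imitated locally. This is a rough sketch, not how AWS evaluates policies: it only substitutes the two policy variables and applies the '*' and '?' wildcards. The helper names and the object keys in the example are made up to exercise the patterns and are not a statement about ArcticDB's real key layout.

```python
import re

# Resource patterns from the second policy statement, relative to the bucket.
OBJECT_PATTERNS = [
    "${aws:username}/*",
    "${aws:PrincipalTag/team}/*",
    "_arctic_cfg/cref/?sUt?${aws:username}/*",
    "_arctic_cfg/cref/?sUt?${aws:PrincipalTag/team}/*",
]


def iam_like(pattern: str, value: str) -> bool:
    """Match with IAM-style wildcards: '*' is any run of characters, '?' exactly one."""
    regex = "".join(
        ".*" if c == "*" else "." if c == "?" else re.escape(c) for c in pattern
    )
    return re.fullmatch(regex, value, flags=re.DOTALL) is not None


def can_access_object(username: str, team: str, key: str) -> bool:
    """Return True if any pattern matches the key for this user and team.

    Assumes usernames and team tags contain no '*' or '?' characters.
    """
    for pattern in OBJECT_PATTERNS:
        resolved = pattern.replace("${aws:username}", username)
        resolved = resolved.replace("${aws:PrincipalTag/team}", team)
        if iam_like(resolved, key):
            return True
    return False


# Made-up keys for Jane (username jane@acme, team data).
print(can_access_object("jane@acme", "data", "jane@acme/weather/some-object"))            # True
print(can_access_object("jane@acme", "data", "data/forecast/some-object"))                # True
print(can_access_object("jane@acme", "data", "_arctic_cfg/cref/XsUtXjane@acme/weather"))  # True
print(can_access_object("jane@acme", "data", "alan@acme/bonds/some-object"))              # False
```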
Security note
Because ${aws:username}, ${aws:PrincipalTag/team} and _arctic_cfg... are at the beginning of the path, it's important they don't contain values that can overlap. For example, if you have a team called 'data' and a username for an application called 'data', they will have the same permissions, or if a username can be created that starts with _arctic_cfg... then that user will be able to modify all library configs.
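A simple pre-flight check can catch both kinds of clash before users or teams are created. The sketch below is illustrative only: find_prefix_conflicts is a hypothetical helper, and the username and team lists are assumed to come from your own IAM inventory.

```python
RESERVED_PREFIX = "_arctic_cfg"


def find_prefix_conflicts(usernames, teams):
    """Report names that would share a path prefix (hypothetical helper)."""
    problems = []
    # A username equal to a team name would get that team's permissions.
    for name in sorted(set(usernames) & set(teams)):
        problems.append(f"'{name}' is both a username and a team")
    # A name starting with the config prefix could reach library configs.
    for name in sorted(set(usernames) | set(teams)):
        if name.startswith(RESERVED_PREFIX):
            problems.append(f"'{name}' starts with reserved prefix '{RESERVED_PREFIX}'")
    return problems


users = ["jane@acme", "samir@acme", "alan@acme", "diana@acme"]
teams = ["data", "quant"]
print(find_prefix_conflicts(users, teams))             # []
print(find_prefix_conflicts(users + ["data"], teams))  # ["'data' is both a username and a team"]
```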
Usage
Jane can now list all libraries, and read and write to their own libraries and team libraries, but not to those of others in Acme.
import numpy as np
import arcticdb as adb
# jane@acme team=data
access = '<REDACTED>'
secret = '<REDACTED>'
bucket='acme-arcticdb'
region='eu-west-2'
arctic = adb.Arctic(f's3://s3.{region}.amazonaws.com:{bucket}?access={access}&secret={secret}')
# Create library as me
arctic.create_library('jane@acme/weather')
lib = arctic.get_library('jane@acme/weather')
lib.write('test', np.arange(100))
# Create library as data team
arctic.create_library('data/forecast')
lib = arctic.get_library('data/forecast')
lib.write('test', np.arange(100))
# See all libraries
arctic.list_libraries()
# ['alan@acme/bonds', 'data/forecast', 'jane@acme/weather', 'quant/stocks']
# Can't use or delete Alan or Quant team data
arctic.get_library('alan@acme/bonds')
arctic.get_library('quant/stocks')
arctic.delete_library('alan@acme/bonds')
# All raise:
# PermissionException: E_PERMISSION Permission error: S3Error#15 : No response body.