Cloudian S3 - Guide

Getting started

Requesting object storage

Our Cloudian system is approved for data up to and including red (sensitive) classification; however, self-service via our GUI is only available to UiO users with green and/or yellow data.

Note that it is the responsibility of the user to ensure that the proper security attributes are enforced for the different data classes. Therefore, please refer to our data storage guide before requesting storage to determine what level of security your use requires, and request accordingly.

Storage is ordered via our form: https://nettskjema.no/a/189898
For green/yellow buckets you will also be asked to request a system user, which is used to log in to the GUI.

    Setting up access keys

    A key pair is required to communicate with your bucket(s).
    This section contains information on how to retrieve your access keys, how to secure them while in use, and how to configure IAM with a new set of keys providing specific access for your buckets.

    Retrieving the keys

    Red users will simply receive a pair of IAM access and secret keys which grant access to the bucket(s), and only from the specific IPs ordered. Green and yellow users can retrieve the keys themselves from the user profile in the GUI.

    Once retrieved, bucket access can be confirmed by placing the keys in
    ~/.aws/credentials with the following syntax, and using your favorite request method:

    [default] 
    region = oslo 
    aws_access_key_id = AKIA0123456787EXAMPLE
    aws_secret_access_key = wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
    
    [some_profile]
    region = oslo
    aws_access_key_id = ANOTHER123456787EXAMPLE
    aws_secret_access_key = wjAnotherI/K7MDENG/bPxRfiCYEXAMPLEKEY

    Most users will only have one set of keys, which can be defined beneath "default", but if you have several you can split them into different profiles to easily switch between key pairs, e.g. by adding the --profile <name> option to your requests.

    Numerous tools and programming languages support S3 requests using the AWS credentials file, with the AWS CLI being the simplest to use from a terminal. Here's a simple command to check if you have access:

    # Using default keys
    aws s3api list-buckets --endpoint https://s3-oslo.educloud.no
    
    {
        "Buckets": [
            {
                "Name": "1003-green-some-test",
                "CreationDate": "2023-03-15T10:40:04.953Z"
            },
            {
                "Name": "1003-green-some-prod",
                "CreationDate": "2023-02-20T09:35:55.596Z"
            }
        ],
        "Owner": {
            "DisplayName": "",
            "ID": "53f0lots32of934characters2ed4245"
        }
    }
    
    # To use a specific profile
    aws s3api list-buckets --profile foo --endpoint https://s3-oslo.educloud.no

    Guides on request methods are linked in the quick reference at the bottom of the page (TODO). But first, please read ahead on how to secure your keys.

     

    Securing keys on the client

    It is important to note that the S3 credentials should be treated like usernames and passwords. Preferably they should be rotated on a yearly basis, but unfortunately it is challenging to automate this process. For users with access to the GUI, rotation can be done by simply creating a new pair and deleting the former on the profile page.

    More importantly, the active keys should not be stored in clear text as in the example above, which unfortunately is also the default behaviour when setting up credentials with e.g. the AWS CLI or PowerShell.

    AWS has a built-in parameter which allows the credentials to be retrieved via a script, used in the ~/.aws/credentials file like this:

    [default] 
    region = oslo 
    credential_process = /path/to/script
    

    The credential_process field expects a script producing the following JSON output:

    {
      "Version": 1, 
      "AccessKeyId": "AKIA0123456787EXAMPLE",
      "SecretAccessKey": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
    }

    Below are a couple of procedures for setting this up: using an environment variable to decrypt an encrypted key pair, or retrieving the keys from a password vault, so that S3 requests are authorized automatically.

     

    1. Encryption + decryption script
    This method only requires an encryption tool such as OpenSSL, and a few lines of shell code. Therefore it is readily available on just about any Unix system.

    Start by creating a JSON file containing your credentials in the expected format, and encrypt it with the following openssl options:

    openssl enc -in s3-creds.json -out s3-creds.enc \
                -aes-256-cbc -md sha512 -pbkdf2 -iter 100000 -salt 

    You will be asked to provide a password, which should be stored in a vault like your usual shared passwords.

    Side note: additional steps for Mac users
    By default macOS uses LibreSSL for encryption, which doesn't support pbkdf2.
    So if you're getting errors, check the version, and if it refers to LibreSSL, install proper OpenSSL from Homebrew and add it to your PATH:

    # openssl version
    LibreSSL 3.3.6
    
    # brew update
    # brew install openssl
    # echo 'export PATH="/usr/local/opt/openssl@3/bin:$PATH"' >> ~/.bash_profile
    
    # openssl version
    OpenSSL 3.1.2 1 Aug 2023 (Library: OpenSSL 3.1.2 1 Aug 2023) 
    

     

    Now we'll create a decryption script to be provided in credential_process, so that the JSON file is decrypted on demand. In its simplest form it looks like this:

    #!/bin/sh
    
    # Decrypt the credentials file given as the first argument, using the
    # password in the S3_PW environment variable, and print the JSON to stdout.
    openssl enc -d -in "$1" -pass env:S3_PW \
                -aes-256-cbc -md sha512 -pbkdf2 -iter 100000 -salt
    

    The decrypt.sh script above expects the path to the encrypted file as its first argument ($1), and an environment variable, S3_PW, containing the encryption password, to be set beforehand. The variable can be set without revealing your input in the terminal like this:

    read -s -r S3_PW && export S3_PW

    Note that once you close the shell, the variable will be automatically removed, and needs to be re-set in the next shell session you'll be using for S3 requests.

    Finally, update ~/.aws/credentials to use your decryption script:

    [default]
    region = oslo
    credential_process = /path/to/decrypt.sh /path/to/s3-creds.enc
    

    When you now make an S3 request towards the storage with the configured profile, the file is automatically decrypted and the request is authorized with the associated keys.
    Once you have confirmed that it works, remove the unencrypted key file.
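
    A plain rm can leave the file contents recoverable on some filesystems. If shred from GNU coreutils is available (an assumption; on macOS it can be installed via Homebrew's coreutils, where it is named gshred), you can overwrite the file before removing it:

    # Overwrite the plaintext credentials file, then unlink it
    shred -u s3-creds.json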

     

    2. Using Vault for encryption and retrieval of S3 credentials

    To use this method, you need to have the HashiCorp Vault client installed. Under your preferred secret path, create the credential like this, where "system_user_testbucket" is the name of the field under which you store the JSON data containing your access key and secret key.

    vault kv put /secret/engine/path/to/S3_credentials system_user_testbucket='{
      "Version": 1,
      "AccessKeyId": "AKIA0123456787EXAMPLE",
      "SecretAccessKey": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
    }'

    Next we create a shell script that retrieves the JSON data so that the AWS CLI and the PowerShell module can use it for credentials.

    #!/bin/bash
    
    # Print the JSON credentials stored under the field name given as the first
    # argument. Adjust the secret path to match the one used with "vault kv put".
    vault kv get -field="$1" /secret/engine/path/to/S3_credentials
    

    If you have more than one key stored in that Vault secret, you can now give the retrieve script the name of the field, and it will try to request the secret from Vault:

    $ ./retrieve_S3_vault.sh my_adminuser
    {
      "Version": 1,
      "AccessKeyId": "AKIA0123456787EXAMPLE",
      "SecretAccessKey": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
    }

    Once the script is tested and working, you can add it to the ~/.aws/config file:

    [default]
    region = oslo
    credential_process = /path/to/retrieve_S3_vault.sh system_user_testbucket

    Each time you now want to connect to your buckets, you need to make sure that you have logged in first with the Vault CLI and have a valid token.
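
    How you log in depends on the auth method configured on your Vault server; the commands below are a minimal sketch assuming the OIDC method, plus a token check that works regardless of method:

    # Log in (the auth method, here oidc, depends on your Vault setup)
    vault login -method=oidc
    
    # Check that the current token is still valid before running S3 requests
    vault token lookup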

    Configuring IAM for fine-grained access

    The key pair of a system/GUI user can be considered the "root key", granting full access to all your buckets from whichever machine has it configured. While practical, this poses a security risk should the key pair be leaked.
    IAM, short for Identity and Access Management, is a way to create key pairs combined with an ACL (access policy). The policy details which buckets the key pair can access, with which requests, and from which machines, drastically reducing the risk of a compromised client.

    When requesting buckets for red data, users will only receive IAM keys with a tailored policy, while the admins retain the bucket ownership. Therefore red users don't need to set up IAM themselves, but it can be useful to know the structure of a policy.
     

    Creating IAM users or groups

    You can create one or more IAM users and groups under your system user, which can be granted access to selected buckets owned by the same system user. This lets specific applications use the object storage service without being given the S3 credentials of your system user, and thus without the full range of permissions your system user has. Splitting permissions over several users can be useful if you want separate buckets for development, test, and production, with a separate user for each bucket; a sketch of creating such a user follows below.
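
    A minimal sketch using the AWS CLI, assuming your system user's keys are configured as the default profile. The IAM endpoint URL is a placeholder (it is not listed in this guide), and the user name matches the example in the next section:

    # Create an IAM user under your system user (the IAM endpoint is a placeholder)
    aws iam create-user --user-name testenviroment_iam_user1 \
        --endpoint-url https://<iam-endpoint>
    
    # Create a key pair for the new IAM user
    aws iam create-access-key --user-name testenviroment_iam_user1 \
        --endpoint-url https://<iam-endpoint>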

     

    Creating an IAM policy

    By default, an IAM user has zero permissions. If you create a bucket called "test_s3_bucket", that bucket is private by default and only available to its owner, i.e. its system user. Thus, if you want to let the IAM user "testenviroment_iam_user1" access the bucket "test_s3_bucket", you need to create an IAM policy.

    (TODO: example w/figures and explanations)
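
    In the meantime, here is a minimal sketch of an inline policy granting the IAM user read/write access to a single bucket, attached with the AWS CLI. The IAM endpoint URL is a placeholder, and the listed actions are just one reasonable example:

    aws iam put-user-policy --user-name testenviroment_iam_user1 \
        --policy-name test-bucket-rw \
        --endpoint-url https://<iam-endpoint> \
        --policy-document '{
          "Version": "2012-10-17",
          "Statement": [
            {
              "Effect": "Allow",
              "Action": ["s3:ListBucket", "s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
              "Resource": [
                "arn:aws:s3:::test_s3_bucket",
                "arn:aws:s3:::test_s3_bucket/*"
              ]
            }
          ]
        }'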

     

    Other kinds of policies

    Bucket Policy
    Used when you want to control who can access the bucket, whether certain users, all users (i.e. public non-authenticated access), or certain IPs, or to filter based on request headers. Only root users, i.e. non-IAM users, can be referenced in a bucket policy, and they are also the only ones that can apply, change and delete bucket policies. An extensive guide on bucket policy options can be found on AWS' website.

    For example, the following bucket policy makes sure that S3 actions on the bucket are only available from the IPs defined in the values for the key "aws:SourceIp".

    Write-S3BucketPolicy -BucketName test_s3_bucket `
                         -EndpointUrl https://s3-oslo.educloud.no `
    -Policy '{
       "Id": "OnlyAllowedIps",
       "Statement": [
         {
           "Sid": "OnlyAllowedIps",
           "Effect": "Deny",
           "Principal": "*",
           "Action": "s3:*",
           "Resource": [
             "arn:aws:s3:::test_s3_bucket",
             "arn:aws:s3:::test_s3_bucket/*"
           ],
           "Condition": {
             "NotIpAddress": {
               "aws:SourceIp": [
                 "129.240.2.0/27",
                 "129.240.6.149/32",
                 "193.156.42.128/25"
               ]
             }
           }
         }
       ]
     }'

    Note that single IPs should still include the subnet suffix (CIDR). If you only want to allow "129.240.6.149", it should still have "/32" appended.

    In most scenarios though, a bucket needs several kinds of permissions, for example that the bucket is only accessible from certain IPs and by certain users. Since we require that all projects use IAM users, you will have to use IAM policies to secure your buckets. As such, bucket policies are mainly useful for bucket-wide restrictions like the IP filtering above, while bucket features such as object lifecycles are configured separately; see the sketch below.
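
    Object lifecycle rules, for instance, are set with their own S3 API call rather than inside the bucket policy document. A minimal sketch with the AWS CLI, assuming the service supports the standard S3 lifecycle API, expiring objects under a hypothetical tmp/ prefix after 30 days:

    aws s3api put-bucket-lifecycle-configuration --bucket test_s3_bucket \
        --endpoint-url https://s3-oslo.educloud.no \
        --lifecycle-configuration '{
          "Rules": [
            {
              "ID": "expire-tmp",
              "Filter": { "Prefix": "tmp/" },
              "Status": "Enabled",
              "Expiration": { "Days": 30 }
            }
          ]
        }'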

    S3 ACL
    When you use the CMC GUI and set permissions on a bucket through Buckets & Objects → Buckets → Properties → Bucket Permissions, you are in fact setting up an S3 ACL for the bucket. However, this is a legacy way of controlling access to a bucket and should be avoided. Instead, use bucket policies or IAM policies, as both are more readable and enable fine-grained policies.
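
    To check whether a bucket already has legacy ACL grants beyond the owner, you can inspect it with the AWS CLI (a minimal sketch using the example bucket name from above):

    aws s3api get-bucket-acl --bucket test_s3_bucket \
        --endpoint-url https://s3-oslo.educloud.no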

    Additional tips & tricks

    Server-side encryption

    There are several ways of encrypting S3 objects. The UiO S3 solution currently supports SSE-C (Server-Side Encryption with Customer key). Data is encrypted at rest and will not accidentally be made available through a public URL, providing a great additional level of security. When using SSE-C, the object and an encryption key are uploaded, and the encryption key is then discarded by the server.
    Therefore, if the encryption key is lost, so is the object!

    Our S3 solution only supports HTTPS, ensuring that the key is never transmitted in plain text.

    For now we refer to Amazon's guide on SSE-C with PowerShell, but an example using the AWS CLI will be written.
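
    Until then, here is a minimal sketch with the AWS CLI. The file names and the locally generated key are just examples; aws s3 cp expects the raw (not base64-encoded) key, which must be exactly 32 bytes for AES256:

    # Generate a 32-character (32-byte) key and keep it somewhere safe
    openssl rand -hex 16 > sse-c.key
    
    # Upload with SSE-C; the same key is required to read the object later
    aws s3 cp ./myfile.txt s3://1003-green-some-test/myfile.txt \
        --endpoint-url https://s3-oslo.educloud.no \
        --sse-c AES256 --sse-c-key "$(cat sse-c.key)"
    
    # Download again, providing the same key
    aws s3 cp s3://1003-green-some-test/myfile.txt ./myfile-copy.txt \
        --endpoint-url https://s3-oslo.educloud.no \
        --sse-c AES256 --sse-c-key "$(cat sse-c.key)"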

     

    Quick reference

    GUI link: https://cmc.educloud.no  (UiO network only)
    
    S3 region: oslo
    
    S3 endpoint: https://s3-oslo.educloud.no:443
    

    Further usage guides

    Tags: Storage, S3
    By Markus Sørensen
    Published Sep. 12, 2023 12:13 PM - Last modified Apr. 5, 2024 4:05 PM