Configuring Multi-Session Transfers

Enterprise Server and Connect Server can achieve significant performance improvements by using multi-session transfers (also known as parallel transfers and multi-part transfers) on multi-node and multi-core systems.

Enabling Multi-Session Transfers

To enable multi-session transfers, run ascp with the option -C nid:ncount, where nid is the ID assigned to the session (from 1 to ncount) and ncount is the number of nodes or cores (that is, the total number of sessions). Assign each session (or invocation) its own UDP port. See the following section for examples.

Multi-session transfers are supported to AWS S3 storage, but transfers to clusters must use access key or assumed role authentication. IAM role authentication for multi-session transfers to S3 clusters is not supported.

Enabling File Splitting

Individual files can be split between multiple sessions by using --multi-session-threshold=threshold. The threshold value specifies, in bytes, the smallest-size file that can be split. Files greater than or equal to the threshold are split, while those smaller than the threshold are not.

A default value for the threshold can be specified in aspera.conf by setting <multi_session_threshold_default> in the <default> section. The command-line setting overrides the aspera.conf setting. If the client's aspera.conf does not specify a default value for the threshold, the server's setting is used (if specified). If neither the client nor the server sets a multi-session threshold, no files are split.

To set the default value in aspera.conf, run the following command, replacing threshold with a value in bytes:

> asconfigurator -x "set_node_data;transfer_multi_session_threshold_default,threshold"

The value of the multi-session threshold depends on the target rates that a single ascp transfer can achieve on your system for files of a given size, as well as the typical distribution of file sizes in the transfer list. If the multi-session threshold is set to 0 (zero), files are not split.
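
For reference, running the asconfigurator command above with a threshold of 100000000 bytes produces an entry in aspera.conf similar to the following (surrounding sections omitted):

    <default>
        <transfer>
            <multi_session_threshold_default>100000000</multi_session_threshold_default>
        </transfer>
    </default>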

Using Multi-Session to Transfer Between Nodes

The following example shows a multi-session transfer on a dual-core system. Together, the two sessions can transfer at up to 2 Gbps, and each session uses a different UDP port. To run simultaneous ascp transfers, you can run each command from its own terminal window, run both commands from a script in a single terminal, or background the processes with the shell (a script sketch is shown after this example). Since no multi-session threshold is specified on the command line or in aspera.conf, no file splitting occurs.

> ascp -C 1:2 -O 33001 -l 1000m /dir01 10.0.0.2:/remote_dir
> ascp -C 2:2 -O 33002 -l 1000m /dir01 10.0.0.2:/remote_dir

If dir01 contains multiple files, ascp distributes the files between the two sessions to get the most efficient throughput. If dir01 contains only one file, only one of the sessions transfers it.
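
Both sessions can also be launched from a single script by backgrounding each ascp process. The following is a minimal sketch using the same example values; it assumes SSH key (passwordless) authentication is configured, since backgrounded sessions cannot prompt for a password:

    #!/bin/bash
    # Launch both example sessions in the background, each on its own UDP port.
    ascp -C 1:2 -O 33001 -l 1000m /dir01 10.0.0.2:/remote_dir &
    ascp -C 2:2 -O 33002 -l 1000m /dir01 10.0.0.2:/remote_dir &
    # Wait for both sessions to finish before exiting.
    wait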

In the following example, the multi-session threshold is used to enable file splitting:

> ascp -C 1:2 -O 33001 -l 100m --multi-session-threshold=5242880 /dir01 10.0.0.2:/remote_dir
> ascp -C 2:2 -O 33002 -l 100m --multi-session-threshold=5242880 /dir01 10.0.0.2:/remote_dir

In this case, if dir01 contains multiple files, all files less than 5 MB are distributed between sessions, while all files 5 MB or larger are split and then distributed between sessions. If dir01 contains only one file and that file is 5 MB or larger, then the file is split, otherwise the file is transferred by one session.

Using Multi-Session to Transfer to Object Storage

Files that are transferred to object storage are sent in chunks, and the chunk size is specified by <chunk_size> in aspera.conf. File-splitting transfers must respect a minimum split size, which for object storage is a part, and each ascp session must deliver full parts. Thus, the chunk size must be set equal to the object-storage part size. If the file size is greater than the multi-session threshold but smaller than the chunk size, then the file is not split. Set the chunk size and part size on the Aspera server in the object storage as follows:

  1. Set the chunk size to a value greater than 5 MB (the minimum part size).

    For example, to set the chunk size to 64 MB, run the following command:

    $ asconfigurator -x "set_node_data;transfer_protocol_options_chunk_size,67108864"

    This edits aspera.conf as follows:

    <central_server>
        . . .
    </central_server>
    <default>
        <transfer>
            <protocol_options>
                <chunk_size>67108864</chunk_size>   <!-- 64 MB -->
            </protocol_options>
        </transfer>
    </default>
  2. Open /opt/aspera/etc/trapd/s3.properties, set the upload part size (default 64 MB) to the same value as the chunk size, and set a ONE_TO_ONE gathering policy:
    aspera.transfer.upload.part-size=64MB
    aspera.transfer.gathering-policy=ONE_TO_ONE
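
    To confirm both settings after editing, you can check the values directly. The following is a quick sketch, assuming a default Linux installation where aspera.conf is located at /opt/aspera/etc/aspera.conf:

    # grep chunk_size /opt/aspera/etc/aspera.conf
    # grep -E 'part-size|gathering-policy' /opt/aspera/etc/trapd/s3.properties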

Using Multi-Session to Transfer to an Aspera Transfer Cluster

Note: For cloud transfers, file-splitting is currently only supported for AWS S3.

To run a multi-session transfer to an Aspera Transfer Cluster (ATC), configure the ATC and the client. The transfer request is initiated by the client through the Node API using a curl command to HTTP POST a JSON file with the transfer specifications.

Configuring the Aspera Transfer Cluster

  1. Confirm that <chunk_size> is set to a value equal to or greater than the minimum part size.

    For transfers to cloud storage, files are split into chunks, and the chunk size is specified by <chunk_size> in aspera.conf. File-splitting must respect a minimum split size for cloud storage, known as a part. If a file is larger than the multi-session threshold but smaller than the chunk/part size, it is not split.

    To view the cluster's transfer configuration, log into the Aspera Transfer Cluster Manager (ATCM). Select the cluster (in this case jmoore-se-demo-cluster), click the Action button, and click Edit Transfer Configuration.



    Confirm that the text string <transfer><protocol_options><chunk_size>67108864</chunk_size></protocol_options></transfer> is present.



    If chunk size is not set, SSH to the instance (for instructions, see the Aspera Transfer Cluster Manager Deployment Guide: Customizing the Cluster Manager or Cluster Node Images) and run the following command:

    # asconfigurator -x "set_node_data;transfer_protocol_options_chunk_size,67108864"

    Confirm that chunk size equals part size by opening the file /opt/aspera/etc/trapd/s3.properties on the ATC and looking for the following lines:

    aspera.transfer.upload.part-size=64MB
    aspera.transfer.gathering-policy=ONE_TO_ONE
  2. Set the scaling policy.

    From the Action drop-down menu, select Edit Auto Scaling Policy. In this example for a 20-node cluster, set Max Nodes and Min Available Nodes to 20. Also ensure that Max Start Frequency Count is greater than or equal to the values for Max Nodes and Min Available Nodes.



Configuring the Aspera Client Transfer System

  1. Configure multi-session threshold and URI docroot restriction on the Aspera Enterprise Server or Connect Server.

    To set the multi-session threshold, run the following command. In this example, the value is set to 100000000 bytes:

    # asconfigurator -x "set_node_data;transfer_multi_session_threshold_default,100000000"

    To transfer files to and from cloud storage, you must configure a docroot restriction on your cloud-based transfer server instead of an absolute docroot path. A configuration with both an absolute docroot path (docrooted user) and a restriction is not supported. To set a restriction, run the following command. In this example, the value is set to *, which places no restriction on the paths that users can access. For more information on setting docroot restrictions for cloud storage, see the Aspera Enterprise Server Admin Guide for Linux: Docroot Restriction for URI Paths.

    # asconfigurator -x "set_node_data;file_restriction,|*"

    These commands result in an aspera.conf with the following text:

    <?xml version='1.0' encoding='UTF-8'?>
    <CONF version="2">
        <default>
            <file_system>
                <access>
                    <paths>
                        <path>
                            <restrictions>
                                <restriction>*</restriction>
                            </restrictions>
                        </path>
                    </paths>
                </access>
            </file_system>
            <transfer>
                <multi_session_threshold_default>100000000</multi_session_threshold_default>
            </transfer>
        </default>
        <aaa/>
        ...
    </CONF>
  2. Create a JSON transfer request file.

    In the example file below, named ms-send-80g.json, the file 80g-file is being sent to the ATC jmoore-se-demo-cluster.dev.asperacloud.net where the transfer user is xfer, using token authorization and the multi-session upload cookie.

    {
        "transfer": {
            "remote_host": "jmoore-se-demo-cluster.dev.asperacloud.net",
            "remote_user": "xfer",
            "token": "Basic QVVrZ3VobUNsdjBsdjNoYXAxWnk6dXI0VGZXNW5",
            "target_rate_kbps": 700000,
            "destination_root":"/",
            "multipart": 75,
            "paths": [
                {
                    "source": "/80g-file"
                }
            ],
            "ssh_port": 33001,
            "fasp_port": 33001,
            "direction": "send",
            "overwrite" : "always",
            "cookie": "multi-session upload"
        }
    }
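
    Before posting the request, you can optionally confirm that the file parses as valid JSON. The following is a quick check, assuming the jq utility is installed on the client:

    $ jq . ms-send-80g.json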

Running and Monitoring the Transfer

  1. Initiate the transfer through the Node API with an HTTP POST of the JSON transfer request using a curl command.

    In the example below, the JSON file ms-send-80g.json is posted by the node user ak_data using the password aspera, from the node server localhost where the HTTPS port is the default 9092.

    $ curl -k -v -X POST -d @ms-send-80g.json https://ak_data:aspera@localhost:9092/transfers 
  2. Monitor transfer progress, bandwidth utilization, and the distribution of transfers for each cluster node.
    On UNIX/Linux systems, view bandwidth utilization from a terminal by running nload on the client system, as in the following command. In this example, the vertical scale is set to 10 Gbps and the device is ens1f0.
    $ nload -u g -o 10000000 ens1f0
    The nload report shows bandwidth utilization at 9+ Gbps:


    In the ATCM UI, click Action > Monitor Nodes to view the transfer distribution and utilization for each of the 20 nodes in the cluster: