Naming Constraints

Path Names in Hadoop Distributed File System (HDFS)

By default, each component of a path is limited to 255 bytes in UTF-8 encoding. This value can be configured in the Hadoop configuration file (/etc/hadoop/conf/hdfs-default.xml) by changing the value of dfs.namenode.fs-limits.max-component-length. A value of 0 disables the limit but may create incompatibilities with other file systems that do not support long paths.
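
Because the limit counts UTF-8 bytes rather than characters, a component with fewer than 255 characters can still be rejected. The following is a minimal sketch of a client-side pre-check; the function name oversized_components is hypothetical and is not part of any Hadoop API.

```python
# Default value of dfs.namenode.fs-limits.max-component-length, per the text above.
MAX_COMPONENT_BYTES = 255


def oversized_components(path, limit=MAX_COMPONENT_BYTES):
    """Return the path components whose UTF-8 encoding exceeds `limit` bytes.

    A limit of 0 mirrors the "disabled" setting and never reports anything.
    """
    if limit == 0:
        return []
    components = [part for part in path.split("/") if part]
    return [part for part in components if len(part.encode("utf-8")) > limit]


# 255 ASCII characters encode to 255 bytes, which is within the default limit.
print(oversized_components("/data/" + "a" * 255))
# 128 two-byte characters encode to 256 bytes, which exceeds it.
print(oversized_components("/data/" + "\u00e9" * 128))
```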

Bucket and Container Names

Many object storage platforms (including AWS S3, Google Cloud Storage, and Azure) require DNS-compliant bucket names, with additional constraints specific to certain platforms. Container names in object storage that uses the OpenStack Swift API do not need to be DNS-compliant, as described below.

Rules for DNS-compliance:

  • Names must be between three and 63 characters long.
  • Names must be a series of one or more labels, with adjacent labels separated by a period (.).
  • Labels can contain lowercase letters, numbers, and hyphens (-), but must start and end with a lowercase letter or a number, so a name cannot start or end with a period or a hyphen. A period cannot be adjacent to another period or to a hyphen, and a hyphen cannot be adjacent to another hyphen. For example, "..", "--", "-.", and ".-" are not valid.
  • Labels cannot be formatted as IP addresses (for example, 192.00.00.20).
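
The rules above can be sketched as a small validator. This is an illustrative check written only from the four rules listed here; the function name is_dns_compliant is hypothetical, and individual platforms may apply further constraints.

```python
import re

# A label: lowercase letters, digits, and hyphens, starting and ending
# with a lowercase letter or a digit.
LABEL = re.compile(r"[a-z0-9]([a-z0-9-]*[a-z0-9])?")
# Four dot-separated groups of digits, that is, a name that looks like an IP address.
IP_LIKE = re.compile(r"\d+(\.\d+){3}")


def is_dns_compliant(name):
    """Return True if `name` satisfies the DNS-compliance rules listed above."""
    if not 3 <= len(name) <= 63:
        return False
    if IP_LIKE.fullmatch(name):
        return False
    if "--" in name:
        return False
    # An empty label (from "..", or a leading or trailing period) fails the
    # LABEL pattern, as does a label that starts or ends with a hyphen.
    return all(LABEL.fullmatch(label) for label in name.split("."))


for candidate in ["my-bucket.example", "My-Bucket", "a..b", "backup-.2024", "192.168.5.4"]:
    print(candidate, is_dns_compliant(candidate))
```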

Additional Information:

Object Storage Platform Additional Information on Bucket Names
AWS S3
For more information, see:

http://docs.aws.amazon.com/AmazonS3/latest/dev/BucketRestrictions.html

Google Cloud Storage
  • Names containing periods may be up to 222 characters total, but each label must be no more than 63 characters.
  • Names may not begin with "goog", nor contain "google" or close misspellings of "google."
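
As a rough illustration, the two Google Cloud Storage rules above could be layered on top of a general DNS-compliance check. The helper name passes_gcs_extras is hypothetical, and the sketch does not attempt to detect "close misspellings", which cannot be checked mechanically from the rule as stated.

```python
def passes_gcs_extras(name):
    """Check only the Google Cloud Storage rules listed above.

    Assumes `name` has already passed a DNS-compliance check.
    """
    labels = name.split(".")
    # Names with periods may be up to 222 characters; others stay at 63.
    max_total = 222 if len(labels) > 1 else 63
    if len(name) > max_total:
        return False
    if any(len(label) > 63 for label in labels):
        return False
    if name.startswith("goog") or "google" in name:
        return False
    return True


print(passes_gcs_extras("analytics.example.com"))
print(passes_gcs_extras("goog-data"))
```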

For more information, see:

https://cloud.google.com/storage/docs/naming#requirements

Azure
For more information, see:

https://docs.microsoft.com/en-us/rest/api/storageservices/fileservices/Naming-and-Referencing-Containers--Blobs--and-Metadata?redirectedfrom=MSDN

Container Names in OpenStack Swift:

Container names must be unique within each account and consist of one to 256 UTF-8 characters. Names can start with any character and contain any character except forward slash (/). For more information, see:

https://developer.openstack.org/api-ref/object-storage/
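
The Swift container rule is much looser than DNS compliance, which the following sketch illustrates. The function name is hypothetical. It counts characters (code points), as the text above states; if a deployment measures the limit in encoded bytes instead, len(name.encode("utf-8")) would be the quantity to compare.

```python
def is_valid_swift_container(name):
    """Return True if `name` has 1 to 256 characters and no forward slash."""
    return 1 <= len(name) <= 256 and "/" not in name


print(is_valid_swift_container("Quarterly Reports (2024)"))  # spaces and capitals are fine
print(is_valid_swift_container("reports/2024"))              # contains a forward slash
```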

Object Names, Key Names, and Blob Names

In general, object names, key names, and blob names must be a sequence of Unicode characters whose UTF-8 encoding is one to 1024 bytes long. This format applies to AWS S3, Google Cloud Storage, and Azure. Object storage using the OpenStack Swift API has no restrictions on object names.

The following character sets are generally safe:

  • Alphanumeric characters: 0-9, a-z, A-Z
  • ! - _ . * ' ( )

The following characters may require special handling, such as URL encoding or referencing by their hexadecimal code:

  • & $ @ = ; : + , ?
  • spaces
  • ASCII character ranges 00-1F hex (0-31 decimal) and 7F (127 decimal).

Avoid the following characters:

  • \ { } ^ % ` [ ] " < ~ # |
  • Non-printable ASCII characters (128-255 decimal characters)
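
The guidance above can be approximated in code as follows. This is a sketch, not a platform API: check_key and encode_key are hypothetical names, and treating the forward slash as safe (so that it can act as a path delimiter in keys) is an assumption that the lists above do not state.

```python
from urllib.parse import quote

# Punctuation listed above as generally safe.
SAFE_PUNCTUATION = "!-_.*'()"
# Characters listed above as ones to avoid.
AVOID = set('\\{}^%`[]"<~#|')


def check_key(key):
    """Return (length_ok, flagged), where `flagged` lists characters to avoid."""
    length_ok = 1 <= len(key.encode("utf-8")) <= 1024
    flagged = sorted(set(key) & AVOID)
    return length_ok, flagged


def encode_key(key):
    """Percent-encode characters that need special handling in a URL."""
    # quote() leaves letters, digits, and the characters in `safe` untouched.
    # Keeping "/" unencoded is an assumption made for this sketch.
    return quote(key, safe="/" + SAFE_PUNCTUATION)


print(check_key("logs/{draft}.txt"))
print(encode_key("reports/2024 Q1&Q2.csv"))
```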

Object Storage Platform Additional Information on Object Name, Key Name, or Blob Name Requirements
AWS S3
For more information, see:

http://docs.aws.amazon.com/AmazonS3/latest/dev/UsingMetadata.html#object-keys

Google Cloud Storage
  • Names cannot contain carriage return or line feed characters.
  • Avoid control characters that are illegal in XML 1.0 (#x7F-#x84 and #x86-#x9F).

For more information, see:

https://cloud.google.com/storage/docs/naming#objectnames

Azure
  • Blob names are case sensitive.
  • Avoid blob names that end with a period, a forward slash (/), or a sequence of the two.
  • Blob names cannot contain more than 254 path segments, where a path segment is the string between delimiter characters (such as the forward slash) that correspond to the name of a virtual directory.
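
A minimal sketch of how the last two Azure rules might be checked before upload. The function name blob_name_warnings is hypothetical, and the forward slash is assumed to be the delimiter in use.

```python
def blob_name_warnings(name, delimiter="/"):
    """Return a list of warnings for the Azure blob naming rules above."""
    warnings = []
    # Covers names ending in a period, a forward slash, or a sequence of the two.
    if name.endswith((".", "/")):
        warnings.append("ends with a period or forward slash")
    # A path segment is the text between delimiter characters.
    segments = [segment for segment in name.split(delimiter) if segment]
    if len(segments) > 254:
        warnings.append("more than 254 path segments")
    return warnings


print(blob_name_warnings("logs/2024/app.log"))
print(blob_name_warnings("logs/2024/./"))
```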

For more information, see:

https://docs.microsoft.com/en-us/rest/api/storageservices/fileservices/naming-and-referencing-containers--blobs--and-metadata

Object Metadata Names (Keys) and Values

Object metadata is a set of name-value pairs. Users can often add customized metadata names, within the constraints of the object storage platform.

Object Storage Platform Object Metadata Name Requirements
AWS S3
  • Name-value pairs must conform to US-ASCII when using REST, and to UTF-8 when using SOAP or browser-based uploads (POST requests).
  • When using the REST API, user-defined metadata names must begin with "x-amz-meta-".
  • PUT request headers are limited to 8 KB, of which 2 KB may be user-defined metadata. The size of user-defined metadata is calculated as the total number of bytes in the UTF-8 encoding of each name and value.
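
The size calculation and the REST prefix can be sketched as follows. The helper names are hypothetical and this is not an AWS SDK call. Whether the "x-amz-meta-" prefix itself counts toward the 2 KB budget is not stated above, so this sketch measures the names as supplied, without the prefix; treat that as an assumption.

```python
S3_USER_METADATA_LIMIT = 2 * 1024  # 2 KB, per the limit described above
PREFIX = "x-amz-meta-"


def user_metadata_bytes(metadata):
    """Sum the UTF-8 encoded sizes of every name and value."""
    return sum(
        len(name.encode("utf-8")) + len(value.encode("utf-8"))
        for name, value in metadata.items()
    )


def to_rest_headers(metadata):
    """Build REST headers, rejecting non-ASCII or oversized metadata."""
    for name, value in metadata.items():
        if not (name.isascii() and value.isascii()):
            raise ValueError(f"metadata for REST must be US-ASCII: {name!r}")
    size = user_metadata_bytes(metadata)
    if size > S3_USER_METADATA_LIMIT:
        raise ValueError(f"user-defined metadata is {size} bytes; limit is 2048")
    return {PREFIX + name: value for name, value in metadata.items()}


print(to_rest_headers({"owner": "data-team"}))
```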

For more information, see:

http://docs.aws.amazon.com/AmazonS3/latest/dev/UsingMetadata.html#object-metadata

Google Cloud Storage
  • Custom metadata names must begin with "x-goog-meta-".
  • Each individual metadata entry is limited to 32768 bytes, and the metadata server is limited to 512 KB in total.

For more information, see:

https://cloud.google.com/compute/docs/storing-retrieving-metadata

OpenStack Swift
  • Metadata names are case-insensitive.
  • Names may contain ASCII 7-bit characters that are not control (0-31) characters, DEL, or a separator character. Underscores (_) are silently converted to hyphens (-).
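
Because names are case-insensitive and underscores become hyphens, two names that look different can refer to the same metadata item. The sketch below illustrates this with a canonical form used only for comparison; it does not reproduce Swift's actual stored representation. The function names are hypothetical, and the separator set is taken from the HTTP/1.1 header token definition (RFC 2616), which is an assumption because the text above does not list the separator characters.

```python
# Separator characters from the HTTP/1.1 token definition (assumed, see lead-in).
SEPARATORS = set('()<>@,;:\\"/[]?={} \t')


def is_valid_metadata_name(name):
    """Allow 7-bit ASCII that is not a control character, DEL, or a separator."""
    return bool(name) and all(
        32 <= ord(char) < 127 and char not in SEPARATORS for char in name
    )


def canonical_name(name):
    """Canonical form for comparison: underscores become hyphens, case is ignored."""
    return name.replace("_", "-").lower()


print(canonical_name("Book_Title") == canonical_name("book-title"))
print(is_valid_metadata_name("book title"))
```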

For more information, see:

http://developer.openstack.org/api-ref/object-storage/index.html

Azure
  • Metadata names must follow the naming rules for C# identifiers.
  • The combined size of the name-value pair may not exceed 8 KB.
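
A conservative pre-check for the two Azure rules might look like the following. The function name is hypothetical. The pattern accepts only ASCII letters, digits, and underscores, which is a deliberately narrow subset of what C# allows in identifiers, so some names that Azure would accept are rejected here. Measuring the combined size in UTF-8 bytes is also an assumption.

```python
import re

# ASCII-only subset of C# identifier syntax (assumption: stricter than the real rules).
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
AZURE_PAIR_LIMIT = 8 * 1024  # 8 KB, per the limit described above


def azure_metadata_ok(name, value):
    """Return True if the pair passes the conservative checks in this sketch."""
    if not IDENTIFIER.fullmatch(name):
        return False
    size = len(name.encode("utf-8")) + len(value.encode("utf-8"))
    return size <= AZURE_PAIR_LIMIT


print(azure_metadata_ok("owner_team", "data"))
print(azure_metadata_ok("owner-team", "data"))  # a hyphen is not allowed in an identifier
```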

For more information, see:

https://docs.microsoft.com/en-us/rest/api/storageservices/fileservices/naming-and-referencing-containers--blobs--and-metadata