The HDFS connector is used to communicate with a Hadoop Distributed File System (HDFS). The HDFS site-specific information files (core-site.xml and hdfs-site.xml) are imported into the connector configuration through the HDFS properties tab. You should obtain these files from your Hadoop administrator.
The HDFS connector supports both simple and Kerberos authentication. When using simple authentication, only the HDFS User property is required. When using Kerberos, you must import a keytab file into the connector configuration through the HDFS properties tab. You should obtain the keytab file from your Hadoop administrator.
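Before importing the site files, it can help to check what they contain. The following Python sketch is an illustration only, not part of the connector. It reads a site file and reports the default file system and the authentication mode, which tells you whether a keytab will be needed. The sample XML, the host name `namenode.example.com`, and the helper names `read_site_properties` and `keytab_required` are assumptions made for this example. `fs.defaultFS` and `hadoop.security.authentication` are standard Hadoop property names.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; a real core-site.xml comes from your Hadoop administrator.
SAMPLE_CORE_SITE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode.example.com:8020</value>
  </property>
  <property>
    <name>hadoop.security.authentication</name>
    <value>kerberos</value>
  </property>
</configuration>
"""


def read_site_properties(xml_text):
    """Return the name/value pairs of a Hadoop *-site.xml document as a dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def keytab_required(props):
    """Kerberos needs a keytab; Hadoop treats a missing setting as 'simple'."""
    mode = props.get("hadoop.security.authentication", "simple")
    return mode.lower() == "kerberos"


props = read_site_properties(SAMPLE_CORE_SITE)
print("Default file system:", props["fs.defaultFS"])
print("Keytab required:", keytab_required(props))
```

To inspect a real file, read its text and pass it to `read_site_properties` in place of the sample string.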
The HDFS connector also supports transparent encryption. While no additional configuration is needed in the connector, your Hadoop administrator will need to configure it on the server and enable encryption zones from which you will send and receive files. See the Apache Hadoop website for a further description of transparent encryption in HDFS.
Notes:
- The ATTR command currently supports a subset of all HDFS attributes.
- The Cleo HDFS connector uses the Apache Hadoop API. This API defines a "wire compatibility" policy: a set of rules for compatibility between the client (the HDFS connector) and the server (your HDFS repository). In summary, the policy states that compatibility is maintained as long as the client and server are running the same major release. To find the current client release level, go to Help > About, where the Hadoop API release numbers are listed.
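The same-major-release rule summarized above can be expressed as a small check. This Python sketch is illustrative only: the function names and the version strings are assumptions, and it models just the summary given here, not the full Apache Hadoop compatibility policy.

```python
def major_release(version):
    """Return the major release number from a version string such as '3.3.6'."""
    return int(version.strip().split(".")[0])


def same_major_release(client_version, server_version):
    """True when the client and server share the same major release."""
    return major_release(client_version) == major_release(server_version)


# Hypothetical release numbers: the client value would come from Help > About,
# the server value from your Hadoop administrator.
print(same_major_release("3.3.6", "3.1.0"))  # True
print(same_major_release("2.7.3", "3.1.0"))  # False
```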
HDFS Connector Properties
Each instance of the HDFS Connector can be configured using the following settings:
Property | Description | Required |
---|---|---|
HDFS User | The user to use for HDFS access. | Yes |
core-site.xml | Import core-site.xml file. | Yes |
hdfs-site.xml | Import hdfs-site.xml file. | Yes |
Keytab | Import keytab for Kerberos authentication. | |
Command Retries | The number of times the command should be retried when an error or exception occurs. Valid range: [0-5]. | Yes |
Command Retry Delay (seconds) | The number of seconds to wait between retries. Valid range: [0-120]. | Yes |
Do Not Send Zero Length Files | For PUT, a switch that controls whether to send a file if it is zero-length. | |
Delete Received Zero Length Files | For GET, a switch that controls whether to remove a received file that is zero-length. | |
Retrieve Directory Sort | For PUT, the sorting options for the list of outbound files. | |
Enable Debug | A switch that controls whether to perform debug logging. | |
System Scheme Name | The URI scheme name used as a shortcut to this host. Valid pattern: | |
System Public | A switch that indicates whether the connector is public. | |
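The Command Retries and Command Retry Delay settings describe a retry-with-delay pattern. How the connector implements it internally is not documented here, so the following Python sketch shows only one plausible reading: an initial attempt followed by up to the configured number of retries, with a fixed wait between them. The function `run_with_retries`, its parameters, and the `flaky_command` stand-in are all hypothetical. The range checks mirror the valid ranges in the table.

```python
import time


def run_with_retries(command, retries, delay_seconds, sleep=time.sleep):
    """Call command(); on an exception, retry up to `retries` more times."""
    if not 0 <= retries <= 5:
        raise ValueError("Command Retries must be in the range 0-5")
    if not 0 <= delay_seconds <= 120:
        raise ValueError("Command Retry Delay must be in the range 0-120")
    attempt = 0
    while True:
        try:
            return command()
        except Exception:
            if attempt >= retries:
                raise  # retries exhausted: surface the last error
            attempt += 1
            sleep(delay_seconds)


# Hypothetical command that fails twice before succeeding.
calls = []


def flaky_command():
    calls.append(1)
    if len(calls) < 3:
        raise IOError("transient failure")
    return "ok"


# A no-op sleep keeps the demonstration fast.
result = run_with_retries(flaky_command, retries=2, delay_seconds=1,
                          sleep=lambda seconds: None)
print(result, "after", len(calls), "attempts")
```

With `retries=2`, the command runs at most three times. Setting `retries=0` disables retrying, so the first error is raised immediately.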