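Adding Resources
Before you can run jobs, you need to add external resources to SKIL. Before adding a resource, store the relevant credentials file on one of the nodes in the SKIL cluster.
Storing credentials
The format for storing credentials for each supported resource type is shown below.
Note
HDFS and YARN do not need credentials, because they are set up locally. You must set the SPARK_HOME environment variable and point it to the Spark root folder used for YARN.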
{
"accessKey": "<access_key>",
"secretKey": "<secret_key>"
}
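As a minimal sketch of the storing step, the Python snippet below writes a credentials file in the accessKey/secretKey format shown above and returns a file:// URI for it. The helper name write_credentials, the example path, and the decision to restrict file permissions are illustrative assumptions, not part of SKIL.

```python
import json
import os
from pathlib import Path


def write_credentials(path, access_key, secret_key):
    """Write an accessKey/secretKey credentials file and return its file:// URI.

    Hypothetical helper for illustration only; SKIL just needs the file to
    exist on a node of the cluster.
    """
    target = Path(path).resolve()
    target.parent.mkdir(parents=True, exist_ok=True)
    payload = {"accessKey": access_key, "secretKey": secret_key}
    target.write_text(json.dumps(payload, indent=2), encoding="utf-8")
    # The file holds secrets, so limit access to the owner (effective on POSIX).
    os.chmod(target, 0o600)
    return target.as_uri()
```

The returned URI has the same file:///path/to/credentials.json shape used for the credentials URI when a resource is added.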
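Where can I find the credentials?
Follow the links below to obtain security credentials for the resource you need:
- AWS S3 and EMR
- Azure Storage and HDInsight
- Google Storage and Cloud DataProc: save this information in a file and give the file's path in the serviceaccountfile key, as described in the snippet above.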
Adding resources
Once the credentials are stored, you can add the corresponding resource in any of the following ways:
- CLI
- REST endpoint
- UI
1. CLI
The skil resources command manages resources from the CLI. The following snippets show how to add each type of resource:
AWS S3
skil resources create-s3 --name <resource_name> --credentialUri <credentials_uri> --bucketId <bucket_id> --region <region>
AWS EMR
skil resources create-emr --name <resource_name> --credentialUri <credentials_uri> --clusterId <cluster_id> --region <region>
Google Storage
skil resources create-google-storage --name <resource_name> --credentialUri <credentials_uri> --projectId <project_id> --bucketName <bucket_name>
Google Cloud DataProc
skil resources create-dataproc --name <resource_name> --credentialUri <credentials_uri> --projectId <project_id> --sparkClusterName <spark_cluster_name> --region <region>
Azure Storage
skil resources create-azure-storage --name <resource_name> --credentialUri <credentials_uri> --containerName <container_name>
Azure HDInsight
skil resources create-hdinsight --name <resource_name> --credentialUri <credentials_uri> --subscriptionId <subscription_id> --resourceGroupName <resource_group_name> --clusterName <cluster_name>
HDFS
skil resources create-hdfs --name <resource_name> --credentialUri <credentials_uri> --nameNodeHost <name_node_host> --nameNodePort <name_node_port>
YARN
skil resources create-yarn --name <resource_name> --credentialUri <credentials_uri> --localSparkHome <local_spark_home>
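If you script these commands, it can help to assemble the arguments programmatically. The sketch below builds the create-s3 invocation using only the flags listed above; the function name build_create_s3_command and all values are placeholders chosen for illustration, and the command is printed rather than executed.

```python
import shlex


def build_create_s3_command(name, credential_uri, bucket_id, region):
    """Return the argument list for `skil resources create-s3`."""
    return [
        "skil", "resources", "create-s3",
        "--name", name,
        "--credentialUri", credential_uri,
        "--bucketId", bucket_id,
        "--region", region,
    ]


args = build_create_s3_command(
    name="my-s3",  # placeholder values, replace with your own
    credential_uri="file:///path/to/credentials.json",
    bucket_id="my-bucket",
    region="us-east-1",
)
# Shell-quoted form for review; pass `args` to subprocess.run to execute it.
print(shlex.join(args))
```

Keeping the arguments as a list avoids quoting problems when a value contains spaces or shell metacharacters.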
2. REST endpoint
Using a tool such as curl, you can add a resource by sending a POST request to the http://host:port/resource endpoint. The general format for adding a resource through the REST endpoint is:
curl -d '<resource_request_data>' -H "Authorization: Bearer <auth_token>" -H "Content-Type: application/json" -X POST http://host:port/resource
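The same request can be prepared from Python with the standard library. This is a sketch under assumptions: the function name build_add_resource_request is hypothetical, the base URL is a placeholder for your SKIL host and port, and the request is only constructed here, not sent.

```python
import json
import urllib.request


def build_add_resource_request(base_url, auth_token, resource_request_data):
    """Build (but do not send) a POST request for the /resource endpoint."""
    body = json.dumps(resource_request_data).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/resource",
        data=body,
        headers={
            "Authorization": "Bearer " + auth_token,
            "Content-Type": "application/json",
        },
        method="POST",
    )


# To send it against a running server:
#   with urllib.request.urlopen(request) as response:
#       print(response.read().decode("utf-8"))
```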
Note
You can obtain the <auth_token> by running the following curl request:
curl -d '{"userId":"<userId>", "password":"<password>"}' -H "Content-Type: application/json" -X POST http://localhost:9008/login
where <userId> and <password> are your SKIL login credentials.
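For scripted use, the login call can be prepared in Python as well. The sketch below mirrors the curl request above; the function name build_login_request is hypothetical, and because the shape of the login response is not described here, the code stops at building the request and leaves reading the token out of the response to you.

```python
import json
import urllib.request


def build_login_request(user_id, password, base_url="http://localhost:9008"):
    """Build (but do not send) the POST request for the /login endpoint."""
    body = json.dumps({"userId": user_id, "password": password}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/login",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```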
For each type of resource, <resource_request_data> has the following format:
AWS S3
{
"resourceName":"<resource_name>",
"resourceDetails": {
"@class":"io.skymind.resource.model.subtypes.storage.AzureStorageResourceDetails",
"containerName":"<container_name>"
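},
"credentialUri":"<credentials_uri>", // usually looks like "file:///path/to/credentials.json"
"type":"STORAGE",
"subType":"AzureStorage",
"credentialId":<credentials_id> // an integer
}
// You only need to provide either credentialUri or credentialId
// Note: the @class, containerName, and subType values in this example are identical to the Azure Storage example; an S3 resource most likely needs S3-specific values instead (create-s3 takes a bucket ID and a region)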
AWS EMR
{
"resourceName":"<resource_name>",
"resourceDetails": {
"@class":"io.skymind.resource.model.subtypes.compute.EMRResourceDetails",
"clusterId":"<cluster_id>",
"region":"<region>"
},
"credentialUri":"<credentials_uri>", // usually looks like "file:///path/to/credentials.json"
"type":"COMPUTE",
"subType":"EMR",
"credentialId":<credentials_id> // an integer
}
// You only need to provide either credentialUri or credentialId
Google Storage
{
"resourceName":"<resource_name>",
"resourceDetails": {
"@class":"io.skymind.resource.model.subtypes.storage.GoogleStorageResourceDetails",
"projectId":"<project_id>",
"bucketName":"<bucket_name>"
},
"credentialUri":"<credentials_uri>", // usually looks like "file:///path/to/credentials.json"
"type":"STORAGE",
"subType":"GoogleStorage",
"credentialId":<credentials_id> // an integer
}
// You only need to provide either credentialUri or credentialId
Google Cloud DataProc
{
"resourceName":"<resource_name>",
"resourceDetails": {
"@class":"io.skymind.resource.model.subtypes.compute.DataProcResourceDetails",
"projectId":"<project_id>",
"region":"<region>",
"sparkClusterName":"<spark_cluster_name>"
},
"credentialUri":"<credentials_uri>", // usually looks like "file:///path/to/credentials.json"
"type":"COMPUTE",
"subType":"DataProc",
"credentialId":<credentials_id> // an integer
}
// You only need to provide either credentialUri or credentialId
Azure Storage
{
"resourceName":"<resource_name>",
"resourceDetails": {
"@class":"io.skymind.resource.model.subtypes.storage.AzureStorageResourceDetails",
"containerName":"<container_name>"
},
"credentialUri":"<credentials_uri>", // usually looks like "file:///path/to/credentials.json"
"type":"STORAGE",
"subType":"AzureStorage",
"credentialId":<credentials_id> // an integer
}
// You only need to provide either credentialUri or credentialId
Azure HDInsight
{
"resourceName":"<resource_name>",
"resourceDetails": {
"@class":"io.skymind.resource.model.subtypes.compute.HDInsightResourceDetails",
"subscriptionId":"<subscription_id>",
"resourceGroupName":"<resource_group_name>",
"clusterName":"<cluster_name>"
},
"credentialUri":"<credentials_uri>", // usually looks like "file:///path/to/credentials.json"
"type":"COMPUTE",
"subType":"HDInsight",
"credentialId":<credentials_id> // an integer
}
// You only need to provide either credentialUri or credentialId
HDFS
{
"resourceName":"<resource_name>",
"resourceDetails": {
"@class":"io.skymind.resource.model.subtypes.storage.HDFSResourceDetails",
"nameNodeHost":"<name_node_host>",
"nameNodePort":"<name_node_port>"
},
"credentialUri":"<credentials_uri>", // usually looks like "file:///path/to/credentials.json"
"type":"STORAGE",
"subType":"HDFS",
"credentialId":<credentials_id> // an integer
}
// You only need to provide either credentialUri or credentialId
YARN
{
"resourceName":"<resource_name>",
"resourceDetails": {
"@class":"io.skymind.resource.model.subtypes.compute.YARNResourceDetails",
"localSparkHome":"<local_spark_home>"
},
"credentialUri":"<credentials_uri>", // usually looks like "file:///path/to/credentials.json"
"type":"COMPUTE",
"subType":"YARN",
"credentialId":<credentials_id> // an integer
}
// You only need to provide either credentialUri or credentialId
Note
If you provide credentialUri, you can omit credentialId from the request, and vice versa.
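The payloads above all share one shape, so they can be generated from a small helper. The sketch below is an illustration only: build_resource_request is a hypothetical name, the EMR values are placeholders, and the helper simply leaves out whichever credential field you do not supply.

```python
import json


def build_resource_request(resource_name, resource_type, sub_type,
                           details_class, details,
                           credential_uri=None, credential_id=None):
    """Assemble a <resource_request_data> dictionary in the format shown above."""
    request = {
        "resourceName": resource_name,
        "resourceDetails": {"@class": details_class, **details},
        "type": resource_type,
        "subType": sub_type,
    }
    # Include only the credential field(s) that were actually given.
    if credential_uri is not None:
        request["credentialUri"] = credential_uri
    if credential_id is not None:
        request["credentialId"] = int(credential_id)
    return request


emr_request = build_resource_request(
    resource_name="my-emr",  # placeholder values, replace with your own
    resource_type="COMPUTE",
    sub_type="EMR",
    details_class="io.skymind.resource.model.subtypes.compute.EMRResourceDetails",
    details={"clusterId": "<cluster_id>", "region": "<region>"},
    credential_uri="file:///path/to/credentials.json",
)
print(json.dumps(emr_request, indent=2))
```

Because the output is produced by json.dumps, it contains no // comments and can be passed directly as the request body.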
3. UI
You can reach the user interface for adding resources by clicking the gear icon in the top-right corner of the SKIL dashboard and then going to "Resources":
Click "Add Resource" to add a resource:
Select the type of resource you want to add:
Now fill in the details and, finally, click "Add…" to add the resource: