Watson API: Natural Language Classifier

Author: teitiyuu | Published: 2017-07-25 12:17

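    Overview

    The Natural Language Classifier (NLC) service uses machine learning algorithms to return the best-matching predefined classes for short text input. You create and train a classifier by associating predefined classes with example texts, so that the service can then apply those classes to new input.

    Authentication

    The service uses HTTP Basic Authentication, that is, a username and a password.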
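
    For HTTP clients that have no equivalent of curl's -u option, the same credentials can be sent as an Authorization header. The following is a minimal Python sketch of how HTTP Basic Authentication encodes them; the function name basic_auth_header is hypothetical, and USERNAME / PASSWORD are the same placeholders used in the curl commands of this article.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Return the value of the Authorization header for HTTP Basic auth."""
    # Basic auth joins the credentials with ":", encodes them as UTF-8,
    # and then Base64-encodes the result.
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return f"Basic {token}"


# Placeholder credentials, as in the curl commands of this article.
print(basic_auth_header("USERNAME", "PASSWORD"))
```

    Base64 is an encoding, not encryption, so the header is only as safe as the HTTPS connection that carries it.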

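    Creating a classifier

    curl command (the ^ at the end of each line is the Windows Command Prompt line-continuation character)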

    curl -u "USERNAME":"PASSWORD"  ^
    -F training_data=@weather_data_train.csv  ^
    -F training_metadata="{\"language\":\"en\",\"name\":\"atp-weather\"}"  ^
    "https://gateway.watsonplatform.net/natural-language-classifier/api/v1/classifiers"
    

    Response

    {
      "classifier_id" : "359f3fx202-nlc-223328",
      "name" : "atp-weather",
      "language" : "en",
      "created" : "2017-07-25T03:20:16.451Z",
      "url" : "https://gateway.watsonplatform.net/natural-language-classifier/api/v1/classifiers/359f3fx202-nlc-223328",
      "status" : "Training",
      "status_description" : "The classifier instance is in its training phase, not yet ready to accept classify requests"
    }
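
    The escaped quotes in the training_metadata field of the curl command above are easy to get wrong by hand. As a sketch, the same JSON string can be generated in Python; training_metadata() here is a hypothetical helper, not part of any Watson SDK, and it assumes only the two fields (language and name) that appear in the command above.

```python
import json


def training_metadata(name: str, language: str = "en") -> str:
    """Serialize classifier metadata for the training_metadata form field."""
    # json.dumps takes care of quoting and escaping the values.
    return json.dumps({"language": language, "name": name})


print(training_metadata("atp-weather"))
```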
    

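    **Note: at this point the classifier's status is Training, so it cannot be used yet. Its status can be checked with the commands in the following sections.**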
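
    Because training takes some time, a script usually has to wait before it can classify. Below is a hedged sketch of such a polling loop. The function wait_until_available and its parameters are hypothetical; get_status stands for any callable that fetches the classifier information and returns its "status" field, and the default interval and attempt count are arbitrary assumptions.

```python
import time
from typing import Callable


def wait_until_available(
    get_status: Callable[[], str],
    interval_seconds: float = 10.0,
    max_attempts: int = 60,
) -> bool:
    """Poll get_status() until the classifier is usable.

    Returns True once the status is "Available", and False if the status
    becomes "Failed" or the attempts run out.
    """
    for attempt in range(max_attempts):
        status = get_status()
        if status == "Available":
            return True
        if status == "Failed":
            return False
        # Still "Training" (or another non-final state): wait, then retry.
        if attempt < max_attempts - 1:
            time.sleep(interval_seconds)
    return False
```

    Passing the status fetcher in as a parameter keeps the loop independent of the HTTP client and makes it easy to try out with a stub.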

    Listing classifiers

    curl command

    curl -u "USERNAME":"PASSWORD" ^
    "https://gateway.watsonplatform.net/natural-language-classifier/api/v1/classifiers"
    

    Response

    {
      "classifiers" : [ {
        "classifier_id" : "359f3fx202-nlc-223328",
        "url" : "https://gateway.watsonplatform.net/natural-language-classifier/api/v1/classifiers/359f3fx202-nlc-223328",
        "name" : "atp-weather",
        "language" : "en",
        "created" : "2017-07-25T03:20:16.451Z"
      } ]
    }
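
    The list response is convenient for looking up a classifier_id by name. A small Python sketch, using the response above as sample data; ids_by_name is a hypothetical helper and assumes that classifier names are unique within the account.

```python
import json
from typing import Dict

# The list response shown above, used here as sample input.
LIST_RESPONSE = """
{
  "classifiers" : [ {
    "classifier_id" : "359f3fx202-nlc-223328",
    "url" : "https://gateway.watsonplatform.net/natural-language-classifier/api/v1/classifiers/359f3fx202-nlc-223328",
    "name" : "atp-weather",
    "language" : "en",
    "created" : "2017-07-25T03:20:16.451Z"
  } ]
}
"""


def ids_by_name(response_text: str) -> Dict[str, str]:
    """Map each classifier name to its classifier_id."""
    classifiers = json.loads(response_text)["classifiers"]
    return {c["name"]: c["classifier_id"] for c in classifiers}


print(ids_by_name(LIST_RESPONSE))
```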
    

    Getting classifier information

    curl command

    curl -u "USERNAME":"PASSWORD"  ^
    "https://gateway.watsonplatform.net/natural-language-classifier/api/v1/classifiers/359f3fx202-nlc-223328"
    

    Response

    {
      "classifier_id" : "359f3fx202-nlc-223328",
      "name" : "atp-weather",
      "language" : "en",
      "created" : "2017-07-25T03:20:16.451Z",
      "url" : "https://gateway.watsonplatform.net/natural-language-classifier/api/v1/classifiers/359f3fx202-nlc-223328",
      "status" : "Available",
      "status_description" : "The classifier instance is now available and is ready to take classifier requests."
    }
    

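    A classifier can be in one of the following five states:

    • 1 Non Existent: the classifier does not exist
    • 2 Training: the classifier is being trained
    • 3 Failed: training failed
    • 4 Available: the classifier is ready to classify
    • 5 Unavailable: the classifier cannot be used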
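
    In client code the five states can be modelled as an enumeration, so that a misspelled status is caught early. This is only a sketch: ClassifierStatus and can_classify are hypothetical names, and the string values are copied from the list above; only "Training" and "Available" are confirmed by the responses shown in this article.

```python
from enum import Enum


class ClassifierStatus(Enum):
    """Classifier states, spelled as in the list above."""

    NON_EXISTENT = "Non Existent"
    TRAINING = "Training"
    FAILED = "Failed"
    AVAILABLE = "Available"
    UNAVAILABLE = "Unavailable"


def can_classify(status_text: str) -> bool:
    """Return True only when the status means the classifier is usable.

    Raises ValueError for a status string that is not one of the five states.
    """
    return ClassifierStatus(status_text) is ClassifierStatus.AVAILABLE


print(can_classify("Training"), can_classify("Available"))
```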

    Classifying text with the classifier

    curl command

    • Classify with the GET method: How hot will it be today?
    curl -G -u "USERNAME":"PASSWORD" ^
    "https://gateway.watsonplatform.net/natural-language-classifier/api/v1/classifiers/359f3fx202-nlc-223328/classify?text=How%20hot%20will%20it%20be%20today%3F"
    
    • Classify with the POST method: How hot will it be today?
    curl -X POST -u "USERNAME":"PASSWORD" ^
    -H "Content-Type:application/json" ^
    -d "{\"text\":\"How hot will it be today?\"}" ^
    "https://gateway.watsonplatform.net/natural-language-classifier/api/v1/classifiers/359f3fx202-nlc-223328/classify"
    
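
    The two commands differ mainly in how the text is encoded: the GET variant percent-encodes it into the query string, while the POST variant sends it as a JSON body. A Python sketch of both encodings follows; classify_get_url and classify_post_body are hypothetical helper names, and the base URL is the one used in the curl commands of this article.

```python
import json
from urllib.parse import quote

BASE_URL = "https://gateway.watsonplatform.net/natural-language-classifier/api/v1/classifiers"


def classify_get_url(classifier_id: str, text: str) -> str:
    """Build the GET classify URL with the text percent-encoded."""
    # safe="" makes quote() encode every reserved character, including "?" and "/".
    return f"{BASE_URL}/{classifier_id}/classify?text={quote(text, safe='')}"


def classify_post_body(text: str) -> bytes:
    """Build the JSON body for the POST classify request."""
    return json.dumps({"text": text}).encode("utf-8")


print(classify_get_url("359f3fx202-nlc-223328", "How hot will it be today?"))
print(classify_post_body("How hot will it be today?"))
```

    The POST form avoids manual percent-encoding, which tends to make it the easier choice for text containing punctuation or non-ASCII characters.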

    Response

    {
      "classifier_id" : "359f3fx202-nlc-223328",
      "url" : "https://gateway.watsonplatform.net/natural-language-classifier/api/v1/classifiers/359f3fx202-nlc-223328",
      "text" : "How hot will it be today?",
      "top_class" : "temperature",
      "classes" : [ {
        "class_name" : "temperature",
        "confidence" : 0.9929586035651006
      }, {
        "class_name" : "conditions",
        "confidence" : 0.007041396434899482
      } ]
    }
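
    To use this response in a program, the fields of interest are top_class and the per-class confidence values. A minimal Python sketch that parses the response above and prints the confidences as percentages; summarize is a hypothetical helper name.

```python
import json

# The classify response shown above (url field omitted for brevity).
CLASSIFY_RESPONSE = """
{
  "classifier_id" : "359f3fx202-nlc-223328",
  "text" : "How hot will it be today?",
  "top_class" : "temperature",
  "classes" : [ {
    "class_name" : "temperature",
    "confidence" : 0.9929586035651006
  }, {
    "class_name" : "conditions",
    "confidence" : 0.007041396434899482
  } ]
}
"""


def summarize(response_text: str) -> str:
    """Return the top class followed by each class's confidence in percent."""
    result = json.loads(response_text)
    parts = [f"{c['class_name']}: {c['confidence']:.1%}" for c in result["classes"]]
    return f"{result['top_class']} ({', '.join(parts)})"


print(summarize(CLASSIFY_RESPONSE))
# prints: temperature (temperature: 99.3%, conditions: 0.7%)
```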
    

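    Next, classify a question containing a word that does not appear in the training data ("sleet").
    The question deliberately reuses the sentence pattern of the temperature class ("How xxx will it be today?").
    Even so, the classifier correctly assigns it to the conditions class.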

    curl -X POST -u "USERNAME":"PASSWORD" ^
    -H "Content-Type:application/json" ^
    -d "{\"text\":\"How sleet will it be today?\"}" ^
    "https://gateway.watsonplatform.net/natural-language-classifier/api/v1/classifiers/359f3fx202-nlc-223328/classify"
    

    Response

    {
      "classifier_id" : "359f3fx202-nlc-223328",
      "url" : "https://gateway.watsonplatform.net/natural-language-classifier/api/v1/classifiers/359f3fx202-nlc-223328",
      "text" : "How sleet will it be today?",
      "top_class" : "conditions",
      "classes" : [ {
        "class_name" : "conditions",
        "confidence" : 0.89688785244637
      }, {
        "class_name" : "temperature",
        "confidence" : 0.10311214755363002
      } ]
    }
    

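    Finally, classify input that has nothing to do with the classifier's classes: it is atp's notebook?
    The result is far from ideal: the temperature class still receives a confidence of about 83% (0.8255).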

    curl -X POST -u "USERNAME":"PASSWORD" ^
    -H "Content-Type:application/json" ^
    -d "{\"text\":\"it is atp's notebook?\"}" ^
    "https://gateway.watsonplatform.net/natural-language-classifier/api/v1/classifiers/359f3fx202-nlc-223328/classify"
    

    Response

    {
      "classifier_id" : "359f3fx202-nlc-223328",
      "url" : "https://gateway.watsonplatform.net/natural-language-classifier/api/v1/classifiers/359f3fx202-nlc-223328",
      "text" : "it is atp's notebook?",
      "top_class" : "temperature",
      "classes" : [ {
        "class_name" : "temperature",
        "confidence" : 0.8255246180698945
      }, {
        "class_name" : "conditions",
        "confidence" : 0.1744753819301055
      } ]
    }
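
    One simple way for a client to guard against such off-topic input is to ignore results whose top confidence falls below a cut-off. The sketch below is an illustration, not a recommendation from the service: accepted_class is a hypothetical helper, and the cut-off of 0.85 is an arbitrary assumption that merely happens to separate the samples in this article.

```python
from typing import Optional

MIN_CONFIDENCE = 0.85  # hypothetical cut-off, chosen only for this illustration


def accepted_class(result: dict, min_confidence: float = MIN_CONFIDENCE) -> Optional[str]:
    """Return top_class, or None when its confidence is below the cut-off."""
    confidence_by_class = {c["class_name"]: c["confidence"] for c in result["classes"]}
    top = result["top_class"]
    return top if confidence_by_class[top] >= min_confidence else None


# Values copied from two of the classify responses in this article.
sleet = {
    "top_class": "conditions",
    "classes": [
        {"class_name": "conditions", "confidence": 0.89688785244637},
        {"class_name": "temperature", "confidence": 0.10311214755363002},
    ],
}
notebook = {
    "top_class": "temperature",
    "classes": [
        {"class_name": "temperature", "confidence": 0.8255246180698945},
        {"class_name": "conditions", "confidence": 0.1744753819301055},
    ],
}

print(accepted_class(sleet))     # conditions
print(accepted_class(notebook))  # None
```

    A fixed cut-off is a blunt tool: the gap between 0.83 and 0.90 here is small, so other inputs could easily land on the wrong side of it.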
    

    Deleting a classifier

    curl command

    curl -X DELETE -u "USERNAME":"PASSWORD" ^
    "https://gateway.watsonplatform.net/natural-language-classifier/api/v1/classifiers/10D41B-nlc-1"
    
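
    For comparison, here is a sketch of the same DELETE request built with Python's standard library. The request object is only constructed, not sent; build_delete_request is a hypothetical helper name, and the URL and classifier ID are the ones from the curl command above.

```python
import base64
import urllib.request

BASE_URL = "https://gateway.watsonplatform.net/natural-language-classifier/api/v1/classifiers"


def build_delete_request(classifier_id: str, username: str, password: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated DELETE request."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(f"{BASE_URL}/{classifier_id}", method="DELETE")
    request.add_header("Authorization", f"Basic {token}")
    return request


request = build_delete_request("10D41B-nlc-1", "USERNAME", "PASSWORD")
print(request.get_method(), request.full_url)
# Sending it would be: urllib.request.urlopen(request)
```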

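    Key points

    • Confidence is returned as a decimal between 0 and 1 and can be read as a percentage (0.99 is 99%); the larger the value, the higher the confidence. A response contains at most 10 classes.
    • If the training data contains fewer than 10 classes, the confidence values of all returned classes sum to 100%. For example, if only two classes are defined, only two classes can be returned.
    • One of the sample questions contains a word ("foggy") that the classifier was not trained on. No extra work is needed to handle such "missing" words; the classifier still scores well on them. Try other questions containing words that are not in the training data, such as "sleet" or "storm".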
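
    The sum-to-100% property can be checked against the three classify responses in this article. A small Python sketch; confidences_sum_to_one is a hypothetical helper, and the tolerance only absorbs floating-point rounding.

```python
import math
from typing import Dict, List

# Confidence values copied from the three classify responses in this article.
SAMPLES: Dict[str, List[float]] = {
    "How hot will it be today?": [0.9929586035651006, 0.007041396434899482],
    "How sleet will it be today?": [0.89688785244637, 0.10311214755363002],
    "it is atp's notebook?": [0.8255246180698945, 0.1744753819301055],
}


def confidences_sum_to_one(confidences: List[float]) -> bool:
    """True if at most 10 values are present and they add up to 1 (100%)."""
    return len(confidences) <= 10 and math.isclose(sum(confidences), 1.0, abs_tol=1e-9)


for text, confidences in SAMPLES.items():
    print(text, confidences_sum_to_one(confidences))
```

    This also explains the off-topic result: with only two classes, the full 100% is always divided between temperature and conditions, whatever the input.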

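    Open questions

    • 1 Which languages are supported besides en?
    • 2 Must the training data be a CSV file? Is the layout of the CSV also fixed?
    • 3 Can training data be added after a classifier has been created?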
