TS Information Technology and Talent Development: chatbot


9/13/2017

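Chatbots - What Do Humans Talk to Her About? (Part-2)

A pure chatbot with no specific purpose often ends up making people angry. Even though AI is advancing quickly, in an open-ended setting its ability to interpret sentences the way humans do without conscious effort still falls far short. As of September 2017, the chatbot Xiaoshan (小姍) has a little over 4,000 friends. The accumulated conversations exceed one million sentences, so we can start doing some basic analysis of what people chat about.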

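Task-specific chatbots

Task-specific chatbots are developing very quickly. For example, "niki" can help hail a taxi, and for anything taxi-related her responses and actions are very accurate. Customer service bots, such as flowxo, make up the bulk of the chatbot market. Some even claim that chatbots can cut customer service costs by 30%, with data analysis benefits far beyond those of traditional phone support.

When a conversation is something the bot cannot understand or falls outside its service scope, a task-specific chatbot usually shows a standard error response. Because people already know its scope, they are not disappointed. Sometimes, if a task-specific chatbot has a fun extra response, it even feels like finding an Easter egg.

In the coming months, task-specific chatbots can be expected to grow quickly and rapidly replace highly repetitive work.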

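Non-task-specific chatbots

AI Xiaoshan is a non-task-specific chatbot. She imitates how real people chat as closely as possible, so no buttons appear asking you to choose "yes/no", and no menus with options A/B/C. Real people do post URLs or photos when chatting, however, so AI Xiaoshan does too. Sometimes she also comments on and analyzes photos that people show her (Note 1).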

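Add Xiaoshan as a friend

A non-task-specific chatbot does not necessarily lack specific functions. In Xiaoshan's case, certain utterances trigger a specific function. For example, "please draw a lot for me" triggers the lot-drawing function.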
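A minimal sketch of how an utterance could trigger a specific function inside a general chat engine. The post does not show Xiaoshan's actual routing, so the trigger phrases, `draw_lot`, `small_talk`, and `route` are all hypothetical names used only for illustration.

```python
import random

FORTUNES = ["great luck", "good luck", "bad luck"]  # made-up fortunes for this sketch

def draw_lot(_text):
    """Hypothetical lot-drawing feature: pick one fortune at random."""
    return random.choice(FORTUNES)

def small_talk(_text):
    """Placeholder for the general chat engine, which the post does not show."""
    return "Tell me more."

# Trigger phrases mapped to features; the phrases are assumptions.
TRIGGERS = [
    (("draw a lot", "抽個籤", "抽籤"), draw_lot),
]

def route(text):
    """Reply with the first feature whose trigger phrase appears in the text."""
    lowered = text.lower()
    for phrases, feature in TRIGGERS:
        if any(phrase in lowered for phrase in phrases):
            return feature(text)
    return small_talk(text)
```

Keeping the trigger table separate from the features means a new function can be added without touching the general chat path.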

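Expectations of general-purpose bots are high

LINE users have "very high" expectations of non-task-specific chatbots. If the first 10 sentences of the conversation fail to satisfy the user's expectations and curiosity, the chance that they stop using the bot is high. Ten sentences seems to be a threshold: about 30% of people lose interest within the first 10.

However, of the remaining 70% who keep chatting past 10 sentences, more than 90% go on to chat for more than 50 sentences (that is, 63% of all users).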
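The 63% figure is just the product of the two rates (0.70 × 0.90 = 0.63). A small sketch of that arithmetic follows; the function name is made up, and 4,000 is used as a round figure because the post only says "a little over 4,000" friends.

```python
def retention_funnel(drop_within_10=0.30, stay_past_50=0.90):
    """Return the share of all users remaining at each stage."""
    past_10 = 1.0 - drop_within_10      # users still chatting after 10 sentences
    past_50 = past_10 * stay_past_50    # of those, users who go beyond 50 sentences
    return {"past_10": past_10, "past_50": past_50}

shares = retention_funnel()

# Approximate head counts, assuming a round 4,000 friends.
counts = {stage: round(share * 4000) for stage, share in shares.items()}
```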

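Still, whenever the bot gives an answer that falls short of expectations, users are quickly disappointed. This is very different from what people expect of task-specific bots, and it is why general-purpose chatbots are extremely hard to implement. But because it is hard, it is interesting.

Vulgar language

Among these 4,000 users, more than 45% have sworn at her at some point, with words such as 「幹」, 「幹林娘」, 「他馬的」, or "Fuck". Worse still, because of the anonymity LINE affords, more than 500 underage users have sent messages such as 「約砲」 ("let's hook up"), 「來愛愛」 ("let's have sex"), or 「強姦你」 ("rape you"). The vast majority of users use such terrible words simply for fun, out of curiosity, or out of boredom, but this is exactly why "learning from conversations with users" risks making a chatbot use offensive language and causing further problems. Microsoft's chatbot Tay was taken offline precisely because it learned discriminatory language.

On LINE, a large share of this kind of language comes from teenagers. Interestingly, more than 60% of these teenagers talk about topics related to 聖結石 (Note 2).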


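A healthier emotional outlet

More than 500 users treat the chatbot as an outlet when they have nowhere else to vent, with messages such as "I've been feeling down lately", "She dumped me", "I have no motivation in life, what should I do", "I want to die", "I'm a loner", and "Work stress keeps me awake".

Technically, AI Xiaoshan cannot yet provide truly professional counseling. However, a chatbot has several advantages that counseling lacks:
(1) Given LINE's very high market share, it is safe to say that more than 90% of people in Taiwan have LINE and can easily use a LINE chatbot.
(2) The chatbot Xiaoshan is available 24 hours a day, all year round. Many extreme emotional problems arise late at night.
(3) In many cases, people just need an outlet. To a human, a bot is a safe way to vent that will not leak secrets.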

This post is entered in a PIXNET (痞客邦) event because it uses PIXNET data.

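Next stage?

(a) Given the needs of current users, general-purpose chat will move toward psychological counseling.

(b) Use the experience gained from building a general-purpose chatbot to build task-specific chatbots.

References
(1) How to build a chatbot
(2) Simple learning-based AI

Note 1: Photo analysis is very costly, however, so its use has to be limited to users who buy stickers.

Note 2: This was also an eye-opener for the development team (who are too old); they had no idea who 聖結石 was.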

6/20/2017

Serverless design for LINE AI Chatbot

Chatbots are one of the more interesting applications in AI: they give enterprises a way to serve customers at very low cost, or even to generate new revenue.
In the past few years, the major instant messaging providers have allowed developers to hook into their services. This means that as long as you have a simple system that processes messages and responds, you can quickly interact with all kinds of messaging channels.

Normally, a software developer starts by building a system on a server box, whether Linux or Windows. More recently, the server might be a VM in a public cloud, whether AWS, Azure, Linode or DigitalOcean. However, a serverless design model might be a better choice.

Why Serverless?

First, a serverless system is easy to scale in and out. That doesn't mean you can't scale in and out with traditional VMs in a public cloud or in your own datacenter. It just means that a Lambda function, no matter which provider runs it, is decoupled from its development environment. In principle, you can go from one instance of a Lambda function to a few thousand instances of the same function without considering the "traditional questions", for example: should I shut down VMs outside peak hours? Should I write a script to check whether the current VMs are close to overloading?

Second, a serverless system is easy to plug in: during the design phase, the developer is forced to decouple functions into small modules (bricks). The developer is also forced NOT to rely on a specific environment. Docker is one solution, but pure Lambda functions give a much more environment-free structure.

Furthermore, it helps define the boundaries of subsystems, which helps future maintenance.

The Design Concerns

(1) IM independent

LINE holds a huge share of the market in Taiwan: more than 90% of mobile users have a LINE account. The most incredible thing is that many elderly people who had never touched the Internet before have LINE accounts! However, this design doesn't use any LINE-specific methods. We've tried the same engine with Yahoo Messenger and it also works.

(2) AWS Lambda

-- (2.1) try NOT to use context

AWS Lambda handlers take standard invocation parameters (event, context). The event is the input passed in when the Lambda function is invoked. The context is what a developer might need in order to understand the 'environment context'. The major design concern here is to avoid using context where possible, because relying on it makes it hard to move your Lambda to another public cloud environment. If you really need the ARN or identity, try to confine that environment dependency to just one Lambda.
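One way to keep a handler free of context is to put all logic in a function that only sees the event, and make the Lambda entry point a thin wrapper. This is a sketch under assumptions: the `text` field and the `other_cloud_entry` function are hypothetical, not part of the original engine.

```python
import json

def handle_message(event):
    """Business logic: depends only on the event dict, never on context."""
    text = event.get("text", "")  # "text" is an assumed field name
    return {"reply": "You said: " + text}

def lambda_handler(event, context):
    """AWS entry point. `context` is accepted but deliberately unused."""
    return handle_message(event)

def other_cloud_entry(request_json):
    """Hypothetical entry point for a provider that passes a JSON string."""
    return json.dumps(handle_message(json.loads(request_json)))
```

Because `handle_message` never touches AWS-specific objects, moving to another provider only means writing another thin wrapper.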

-- (2.2) async invoke

AWS Lambda can be invoked with 3 invocation types: Event, RequestResponse, and DryRun. "Event" is the asynchronous call. Any Lambda that receives IM messages should be kept as simple as possible so that it can answer the IM webhook quickly; hand everything else off to other Lambdas via "Event" invocations, because most IM providers (LINE, fb) impose a very short timeout on their webhooks. DO NOT handle the HTTP webhook and the response to the IM in one synchronous call stack.

For details, see the AWS documentation.
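A minimal sketch of the receiver pattern, assuming boto3 is the SDK in use. `invoke` with `InvocationType="Event"` is the real boto3 call; the worker function name `chat-worker` and the injectable `lambda_client` argument are assumptions added so the handler can be tried locally without AWS.

```python
import json

def forward_async(lambda_client, function_name, payload):
    """Hand `payload` to another Lambda without waiting for its result."""
    response = lambda_client.invoke(
        FunctionName=function_name,
        InvocationType="Event",  # asynchronous: returns once the event is queued
        Payload=json.dumps(payload).encode("utf-8"),
    )
    return response["StatusCode"]  # 202 means the event was accepted

def webhook_handler(event, context, lambda_client=None, worker_name="chat-worker"):
    """Receiver Lambda: queue the work, then answer the IM provider at once."""
    if lambda_client is None:
        import boto3  # imported lazily so local tests need no AWS SDK
        lambda_client = boto3.client("lambda")
    forward_async(lambda_client, worker_name, event)
    return {"statusCode": 200, "body": "ok"}
```

The receiver does no language processing at all, so its running time stays well inside the webhook timeout regardless of how slow the worker is.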

-- (2.3) timeout/memory

AWS Lambda lets you configure the timeout and memory size, and AWS CloudWatch shows how much of each a Lambda consumes. It is fine to use more memory or set a longer running time, but the developer should know WHY.

-- (2.4) quick testing

It is necessary to have your own development server for testing your Lambda functions, plus a deployment script that uploads them to AWS. If you don't actually use "context", it is very simple to add a quick test to every Lambda handler.

# At the end of your Lambda Python script.
if __name__ == '__main__':
    # Sample event for a quick local run; context is unused, so pass None.
    event = {'param1': 'test'}
    lambda_handler(event, None)

Of course, developers still need a proper testing framework (unittest) as well.
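As a sketch of what a unittest-based check could look like, the following tests a stand-in handler. The handler body here is hypothetical; in practice you would import the real `lambda_handler` from your Lambda module instead.

```python
import unittest

def lambda_handler(event, context):
    """Stand-in handler for this sketch; replace with the real one."""
    return {"echo": event.get("param1")}

class LambdaHandlerTest(unittest.TestCase):
    def test_echoes_param1(self):
        self.assertEqual(lambda_handler({"param1": "test"}, None), {"echo": "test"})

    def test_missing_param(self):
        self.assertEqual(lambda_handler({}, None), {"echo": None})

def run_tests():
    """Run the test case programmatically and return the unittest result."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(LambdaHandlerTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Passing `None` as the context works only because the handler never reads it, which is one more practical payoff of avoiding context.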

-- (2.5) deployment

As always, a developer should have a semi-automatic way to deploy. This is a very simple deployment script that (a) zips the Python files, (b) uploads the zip to S3, (c) creates the Lambda function, and (d) updates the function code from the S3 zip file.

(a) zip lottery.zip -r lambda_lottery.py lottery60.py
(b) aws --profile ailine s3 cp lottery.zip s3://bucket/
(c) aws --profile ailine lambda create-function --function-name lottery --runtime python3.6 --role "arn:aws:" --handler lambda_lottery.lambda_handler --timeout 10 --code "S3Bucket=bucket,S3Key=lottery.zip"
(d) aws --profile ailine lambda update-function-code --function-name lottery --s3-bucket bucket --s3-key lottery.zip

-- (2.6) scheduled (cron) Lambda

A chatbot might need to run scheduled tasks that message users, for example sending a regular morning call. Triggering a scheduled Lambda might be one of the major cloud-provider-dependent things in this chatbot design.
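A sketch of how the provider-dependent part could be kept small: one function recognizes the scheduled trigger, and the morning-call logic itself stays provider-neutral. It assumes the schedule comes from CloudWatch Events (EventBridge), whose scheduled events carry `"source": "aws.events"` and `"detail-type": "Scheduled Event"`; `morning_call`, `send`, and `user_ids` are hypothetical names.

```python
def is_scheduled_event(event):
    """True for the event shape sent by a CloudWatch Events / EventBridge schedule."""
    return (event.get("source") == "aws.events"
            and event.get("detail-type") == "Scheduled Event")

def morning_call(send, user_ids):
    """Provider-neutral task: greet every user through the given send function."""
    for user_id in user_ids:
        send(user_id, "Good morning!")
    return len(user_ids)

def lambda_handler(event, context, send=None, user_ids=()):
    """Run the morning call only when triggered by the schedule."""
    if not is_scheduled_event(event):
        return {"sent": 0}
    if send is None:
        # Fallback for local runs; a real bot would push through the IM API here.
        send = lambda user_id, text: print(user_id, text)
    return {"sent": morning_call(send, list(user_ids))}
```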

(3) AWS API Gateway

AWS API Gateway is another major cloud-provider-dependent component; however, it is not hard to use another provider or to set up our own lab testing environment. The major concerns with API Gateway are (a) converting the IM provider's HTTP request into a given format, which becomes the Lambda input, and (b) security: how to make sure only the IM provider's systems can access this API Gateway.
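For concern (b), one common approach is to verify the provider's request signature rather than rely on network rules. The sketch below assumes LINE's Messaging API scheme, in which the `X-Line-Signature` header carries a base64-encoded HMAC-SHA256 of the raw request body keyed with the channel secret. Whether the original bot verified requests this way is not stated in the post, and the function names are made up.

```python
import base64
import hashlib
import hmac

def sign(channel_secret, body):
    """Compute the base64 HMAC-SHA256 signature of the raw request body (bytes)."""
    digest = hmac.new(channel_secret.encode("utf-8"), body, hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii")

def is_from_line(channel_secret, body, signature_header):
    """Compare the expected signature with the header value in constant time."""
    expected = sign(channel_secret, body).encode("utf-8")
    received = (signature_header or "").encode("utf-8")
    return hmac.compare_digest(expected, received)
```

The check must run on the exact bytes received, so the API Gateway mapping has to pass the raw body through unchanged rather than a re-serialized copy.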

(4) AWS dynamodb

The chatbot uses DynamoDB to store user information and the message log. It is also pretty easy to substitute a local JSON-format NoSQL store.
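A minimal sketch of what such a local stand-in might look like for lab testing. The class name and its `put_item`/`get_item` methods are hypothetical and only loosely mirror a key/value table; this is not the DynamoDB API.

```python
import json
import os

class LocalJsonStore:
    """Tiny file-backed key/value store for lab use, keyed by user id."""

    def __init__(self, path):
        self.path = path

    def _load(self):
        if not os.path.exists(self.path):
            return {}
        with open(self.path, encoding="utf-8") as f:
            return json.load(f)

    def put_item(self, user_id, item):
        """Insert or overwrite the record for one user."""
        data = self._load()
        data[user_id] = item
        with open(self.path, "w", encoding="utf-8") as f:
            json.dump(data, f, ensure_ascii=False)

    def get_item(self, user_id):
        """Return the stored record, or None if the user is unknown."""
        return self._load().get(user_id)
```

If the chatbot code only talks to the store through these two methods, swapping in a cloud-backed implementation later does not touch the chat logic.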

(5) AWS elasticsearch

The chatbot uses AWS Elasticsearch to store its knowledge base. It is easy to set up a developer's Elasticsearch server for lab testing before deployment. The real concern in a public cloud might be the future budget :)
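As an illustration of a knowledge-base lookup, the sketch below builds a standard Elasticsearch `match` query and picks the top hit from a search response. The field names `question` and `answer` are assumptions; the post does not describe the actual index schema, and sending the query to a server is left out.

```python
def build_kb_query(user_text, field="question", size=3):
    """Build a full-text match query body for the knowledge-base index."""
    return {"size": size, "query": {"match": {field: user_text}}}

def best_answer(search_response, answer_field="answer"):
    """Return the answer from the top-scoring hit, or None when nothing matched."""
    hits = search_response.get("hits", {}).get("hits", [])
    if not hits:
        return None
    return hits[0]["_source"].get(answer_field)
```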

(6) AWS S3

The chatbot still needs some static content (HTML or JS), and S3 is the easiest way to serve public static content. It is also the place to upload the latest Lambda code.

The Implementation

See: github repository

Take a look?

This chatbot understands and speaks only Traditional Chinese, since she is a Taiwanese robot :). You need a LINE account to chat with her.

LINE QR code for the chatbot Xiaoshan

Add Xiaoshan as a friend

1/19/2017

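Chatbots - What Do Humans Talk to Her About?

Over the past few months, we built a learning chatbot and also offered a way to build a personalized chatbot for free.

See:

Free chatbot
Learning-based AI

More than 550 people have now chatted with her. The number is not large yet, but it is worth doing some simple statistics.

The most interesting question, of course, is what messages people send when they know the other party is a bot.

The vast majority open with "Hi", "Hello", "Hey", and the like.

After that, the three most common kinds of messages are:

(1) Swearing, and teaching her to swear

Perhaps life is just too stressful. At least 60% of people swear at the bot, for example 「幹林娘」, 「操你媽」, "Fuck"...

Once the chatbot's simple learning mechanism was turned on, more than 50% of people also tried to teach her swear words. This even forced us to suspend the bot for a day to add a mechanism for excluding "bad friends".

The bot being sworn at is a minor matter, but if she picks up bad habits, it could affect her later conversations with other people.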
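The post does not describe how the "bad friends" mechanism works. The following is only one possible sketch: refuse to learn anything containing a blocklisted word, and stop learning altogether from users who keep trying. The blocklist, the strike limit, and the `Learner` class are all hypothetical.

```python
BLOCKLIST = {"fuck", "幹"}  # tiny illustrative list, not the real one

def is_offensive(text):
    """True if the text contains any blocklisted word (case-insensitive)."""
    lowered = text.lower()
    return any(word in lowered for word in BLOCKLIST)

class Learner:
    """Learns prompt/reply pairs, except from users flagged as bad friends."""

    def __init__(self, max_strikes=3):
        self.max_strikes = max_strikes
        self.strikes = {}   # user_id -> number of offensive teaching attempts
        self.learned = []   # accepted (prompt, reply) pairs

    def is_bad_friend(self, user_id):
        return self.strikes.get(user_id, 0) >= self.max_strikes

    def teach(self, user_id, prompt, reply):
        """Return True if the pair was learned, False if it was rejected."""
        if is_offensive(prompt) or is_offensive(reply):
            self.strikes[user_id] = self.strikes.get(user_id, 0) + 1
            return False
        if self.is_bad_friend(user_id):
            return False
        self.learned.append((prompt, reply))
        return True
```

A substring blocklist is crude and easy to evade, so a real filter would need more than this; the point is only that learning is gated per user as well as per message.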

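(2) Anthropomorphic messages

People treat the bot as a real person and ask for information only a human would have, even though they already know it is a bot.

About half of the users probe with anthropomorphic questions.

For example: "Are you pretty?", "What are your measurements?", "Where do you live?", "What do you like to eat?", "Are you in a good mood today?"

They then go on to ask about personal values.

For example: "Are you blue or green?" (Taiwan's two main political camps), "Do you support marriage equality?"

(3) Information lookup

She is a robot, after all, so people may assume that, as in the movies, her knowledge base holds a vast amount of data. More than half of the users therefore try asking her to look something up.

For example: "Are there any good restaurants nearby?", "Today's weather", "Recommend a weight-loss meal", "Where am I?"

Perhaps under the influence of Star Trek, once information lookup fails to meet expectations a few times, people fly into a rage and start swearing :~

LINE QR code for the chatbot Xiaoshan

Add Xiaoshan as a friend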

9/05/2016

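A Simple Application of AI - Building a Facebook or LINE Chatbot

Since ChatGPT appeared in 2022, the original fb chatbot has been discontinued. The new chatbot uses ChatGPT; anyone who would like to try it can get in touch by email: consultant.3rd@gmail.com

Chatbots are nothing new. Back in 1950, when Turing (Note 1) proposed a way to judge artificial intelligence, he described using text messages (so that human and machine are kept apart) to chat with two parties, one a human and the other a computer. If an ordinary person cannot tell during the conversation which is the computer and which is the human, the computer program can be judged to truly possess intelligence.

Achieving this is extremely hard. According to the wiki, only one chat program so far, Eugene Goostman in 2014, is listed as having passed this test.

The AI algorithms and related technologies currently available have not seen any particularly dramatic breakthroughs. However, because data on the Internet is ever easier to obtain and computers are ever faster, you need neither amazing technical skill nor deep thoughts about movie plots in which AI enslaves humanity to build meaningful applications.

The Facebook chatbot is one of them.

Businesses, enterprises, and even some individuals have FB pages (fan pages), and starting in 2015, chat programs with fixed reactions or message replies began to appear on Facebook. Non-task-specific chatbots seem to be rare in Taiwan, however, so we built one on a fan page. See the figure below:

https://www.facebook.com/sandy4ai/ A fan page with AI chat

This chatbot replies to your messages based on her built-in knowledge base and basic semantic analysis. She also gradually learns from conversations. The chat program needs no extra Facebook permissions, so she does not have many extra features. The figure below shows a sample of a live chat:

Chatting with the AI chat fan page

LINE also opened its bot API this year (2016), and the approach is almost the same as Facebook's. Although both are webhooks, the content their APIs actually deliver is of course completely different.

Possible applications of Facebook/LINE chatbots:

1. Basic customer questions:

Businesses and organizations are most often "looked up" online: opening hours, phone numbers, services offered, and so on. In Taiwan, doing business through Facebook has become increasingly common in recent years, and a chatbot can provide a 24x7 basic question answering service.

2. Promotions:

With certain permissions, a chatbot can proactively send messages or post to Facebook users. This differs somewhat from ordinary ad posts, because after the post users can keep interacting with the poster, namely the chatbot.

3. Routine customer service:

Bookings and reservations, appointment reminders, service surveys, birthday greeting posts, and so on.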

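How to build a Facebook chatbot:

1. Summary of steps

(1) Open an AWS account. Development uses services such as Lambda, API Gateway, Elasticsearch, S3, and CloudWatch.

(2) Create a Facebook app on Facebook (fb app for short).

(3) Create and write a basic Lambda that answers the GET verification the fb app webhook will send later.

(4) Create and write a basic Lambda that responds to the fb app message hook.

(5) Add GET/POST methods in API Gateway and map them to the Lambdas.

(6) Point the fb app webhook at the API Gateway URL.

(7) Have the fb app settings page verify the API Gateway (basically an HTTP GET).

(8) Have the fb app settings page subscribe to messages, and remember to use the page token when the Lambda code replies to a message.

(9) At this point you can connect the knowledge base: create Elasticsearch on AWS and import data that has been preprocessed by a program. Here we use data from 台灣e院 as the example.

(10) Modify the Lambda code so that the user's message is used to search Elasticsearch for related information, and the result is sent back to the original sender.

(11) That completes the setup. The result looks roughly as shown in the figure.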
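Steps (3) and (7) can be sketched as follows. Facebook's verification GET carries `hub.mode`, `hub.verify_token`, and `hub.challenge` query parameters, and the endpoint must echo the challenge back when the token matches. The token value and the assumption that API Gateway passes query parameters as `event["queryStringParameters"]` (Lambda proxy integration) are illustrative, not taken from the original setup.

```python
VERIFY_TOKEN = "my-verify-token"  # placeholder; use the token entered in the fb app settings

def verify_webhook(query_params, verify_token):
    """Return (status_code, body) for Facebook's webhook verification GET."""
    if (query_params.get("hub.mode") == "subscribe"
            and query_params.get("hub.verify_token") == verify_token):
        return 200, query_params.get("hub.challenge", "")
    return 403, "verification failed"

def lambda_handler(event, context):
    """Lambda behind the API Gateway GET method; context is unused."""
    params = event.get("queryStringParameters") or {}
    status, body = verify_webhook(params, VERIFY_TOKEN)
    return {"statusCode": status, "body": body}
```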

2. Detailed steps:

....<to be continued>...

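How to build a LINE chatbot:

1. Summary of steps

(1) Open an AWS account. Development uses services such as Lambda, API Gateway, Elasticsearch, S3, and CloudWatch.

(2) Create an account and a channel at LINE developer.

(3) Create and write a basic Lambda and API Gateway endpoint that answer the verification performed later when linking on the LINE channel. LINE's API basically uses only POST.

(4) Add your own LINE account to the channel you just created, that is, add it as a friend, so that you can test.

(5) Put the API key, secret, and token required by the LINE channel where the Lambda sends its reply back to the LINE bot API.

(6) At this point you can connect the knowledge base: create Elasticsearch on AWS and import data that has been preprocessed by a program. Here we use data from 台灣e院 as the example.

(7) Modify the Lambda code so that the user's message is used to search Elasticsearch for related information, and the result is sent back to the original sender.

(8) That completes the setup. The result looks roughly as shown in the figure.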
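A rough sketch of the message-handling part of step (7). It assumes the JSON layout of the current LINE Messaging API webhook (`events`, `replyToken`, `message.type`, `message.text`), which may differ from the 2016 bot API the post was written against. The function names are made up, and the HTTP call that actually sends the reply is left out.

```python
import json

def text_messages(webhook_body):
    """Yield (reply_token, text) for each text message event in the webhook body."""
    for event in json.loads(webhook_body).get("events", []):
        message = event.get("message", {})
        if event.get("type") == "message" and message.get("type") == "text":
            yield event["replyToken"], message["text"]

def build_reply(reply_token, text):
    """Build the JSON payload for replying with a single text message."""
    return {"replyToken": reply_token, "messages": [{"type": "text", "text": text}]}
```

Non-text events such as follows or stickers are simply skipped here, so the knowledge-base lookup only ever sees plain text.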

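2. Detailed steps:

....<to be continued>...

Reference: https://www.facebook.com/sandy4ai/

Note 1: Turing is the protagonist of the film The Imitation Game.
Note 2: There has been no time to write the detailed steps yet... not that anyone necessarily needs them :)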
