Clova Speech Recognition (CSR)
Converts human speech into text so that it can be used in a wide range of speech recognition services.
Speech Recognition API with the Highest Speech Recognition Rate for the Korean Language
The Speech Recognition API service can be used to develop services such as voice-operated personal assistant applications, chatbots, or speech memos. Speech data is sent to the Clova Speech Recognition (CSR) engine through the API, where it is recognized and converted into text.
- High Recognition Rate and Multi-language Support
- CSR offers the highest speech recognition rate for the Korean language in Korea, the result of NAVER's years of research and development. English, Japanese, and Chinese (Simplified) are also supported. You get the same level of service as the speech recognition technology used in NAVER's Clova service.
- Performance Improvement through Learning
- Recognition quality keeps improving through various types of machine learning. Clova Speech Recognition (CSR) grows more capable with the rich data obtained through NAVER's services, and you can use it to enhance your own services.
- Safe Personal Information Management
- To safely protect the information entered for speech recognition, Clova Speech Recognition (CSR) complies with Korea's privacy laws, such as the Personal Information Protection Act and the Act on Promotion of Information and Communications Network Utilization and Information Protection, Etc.
- Easy Use
- You can handle management tasks such as service registration (register, change, and delete), setting usage limits (daily/monthly), and checking usage statistics from the web-based console. In addition, because both a mobile SDK and a REST API are provided, you can offer the service in a range of environments, such as server and mobile.
CSR is a high-quality speech recognition service that learns from NAVER's abundant data.
When a human voice is sent through the CSR mobile SDK or the API, the server returns the recognition result as text.
Two interfaces are provided: an HTTP-based REST API and a mobile SDK that implements its own streaming protocol.
|Method of Use||Supported Platforms||Supported Languages||Maximum Recognition Time||Supported Audio File Formats||Data Sent to CSR Engine|
|Mobile SDK||Android 10 or later, iOS 8 or later||Korean, English, Japanese, Chinese||60 secs||-||Voice input with microphone|
|REST API||-||Korean, English, Japanese, Chinese||60 secs||mp3, aac, ac3, ogg, flac, wav||Sound recording file|
Applications that use the CSR mobile SDK authenticate with the client ID together with the package name of the Android application, while the REST API authenticates with the client ID and the client secret key. Encryption can be applied to every stage of data transfer, from authentication through API use.
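As a rough illustration of the REST API authentication described above, the following sketch builds (but does not send) an HTTPS request that carries the client ID and client secret in request headers. The endpoint URL, the header names, and the language codes are assumptions that are not stated in this document; check them against the official CSR API reference before relying on them. `build_csr_request` is a hypothetical helper name.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed values: the endpoint, header names, and language codes below are
# not given in this document and must be confirmed in the official API reference.
CSR_ENDPOINT = "https://naveropenapi.apigw.ntruss.com/recog/v1/stt"
LANG_CODES = {"Korean": "Kor", "English": "Eng", "Japanese": "Jpn", "Chinese": "Chn"}


def build_csr_request(audio: bytes, client_id: str, client_secret: str,
                      language: str = "Korean") -> Request:
    """Build (without sending) a POST request that carries raw audio bytes."""
    query = urlencode({"lang": LANG_CODES[language]})
    headers = {
        "X-NCP-APIGW-API-KEY-ID": client_id,    # client ID (assumed header name)
        "X-NCP-APIGW-API-KEY": client_secret,   # client secret (assumed header name)
        "Content-Type": "application/octet-stream",
    }
    # The https:// scheme means the credentials and audio travel encrypted.
    return Request(f"{CSR_ENDPOINT}?{query}", data=audio,
                   headers=headers, method="POST")
```

A caller would typically pass the returned object to `urllib.request.urlopen` and read the recognized text from the response body; the response format is not covered by this document, so it is left out of the sketch.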
Services that could previously be operated only through text input or physical controls can now be controlled by voice.
CSR can be applied to home appliances, home network control in apartments, delivery orders, financial services, and more.
Using the mobile SDK, various personal assistant services can be implemented in smartphone applications. You can also use the REST API to convert accumulated audio files into text.
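When converting accumulated audio files through the REST API, each call is limited to 60 seconds and only certain file formats are accepted. The sketch below shows one possible way to pre-check a file by extension and to plan how a longer recording could be divided into pieces of at most 60 seconds. The helper names `is_supported` and `plan_segments` are hypothetical, and the sketch only computes time ranges; it does not cut the audio itself.

```python
from pathlib import Path

# Formats and per-call limit as listed for the REST API in this document.
SUPPORTED_FORMATS = {"mp3", "aac", "ac3", "ogg", "flac", "wav"}
MAX_SECONDS_PER_CALL = 60


def is_supported(path: str) -> bool:
    """Check the file extension against the supported audio formats."""
    return Path(path).suffix.lower().lstrip(".") in SUPPORTED_FORMATS


def plan_segments(total_seconds: float, max_len: float = MAX_SECONDS_PER_CALL):
    """Return (start, end) pairs that cover the recording in pieces of at most max_len seconds."""
    segments = []
    start = 0.0
    while start < total_seconds:
        end = min(start + max_len, total_seconds)
        segments.append((start, end))
        start = end
    return segments


print(is_supported("meeting.WAV"))   # True
print(plan_segments(150))            # [(0.0, 60.0), (60.0, 120.0), (120.0, 150)]
```

Cutting on fixed 60-second boundaries can split a word in two, so a real pipeline may prefer to cut at silent intervals instead.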
Service Usage Limit
- The usage limit of the Clova Speech Recognition API can be adjusted directly from the service console.
- You can set the usage limit of the Clova Speech Recognition API yourself, choosing from options that range from 300,000 secs a month (10,000 secs a day) to 30 million secs a month (10 million secs a day) per application.
- Please note that, for billing, service usage is rounded up to the next 15-second unit.
- The last call that takes usage over the limit is still allowed. (The usage fee is calculated based on the usage actually incurred.)
E.g., suppose you have set the monthly usage limit to 1,000 seconds. After using 990 seconds, you can still make a call of 15 seconds. Your total usage is then 1,005 seconds, and you are billed for 1,005 seconds.
- If you need to increase the service limit, please contact Customer Support.
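The limit behavior in the example above can be modeled with a few lines of Python. This is only a sketch of the rule as described here, not the service's actual implementation; in particular, it assumes that a call is admitted whenever usage is still below the limit, and the function names are hypothetical.

```python
import math

BILLING_UNIT_SECONDS = 15


def may_start_call(used_seconds: int, limit_seconds: int) -> bool:
    # Assumption: a call is admitted while usage is still below the limit,
    # even if the call itself pushes usage past it.
    return used_seconds < limit_seconds


def usage_after_call(used_seconds: int, call_seconds: float) -> int:
    # Usage is counted in whole 15-second units, rounded up.
    units = math.ceil(call_seconds / BILLING_UNIT_SECONDS)
    return used_seconds + units * BILLING_UNIT_SECONDS


limit = 1000
used = 990
if may_start_call(used, limit):
    used = usage_after_call(used, 15)
print(used)                         # 1005
print(may_start_call(used, limit))  # False
```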
The CSR usage fee is charged in 15-second units, based on the speech recognition usage time.
|Charge (Monthly)||Unit||Usage Fee (in KRW)|
|Speech recognition usage time (speech recognition request - termination)||Usage time (per 15 seconds)||4 KRW per 15 seconds|
- Usage time is rounded up to the next 15-second unit.
- A single API call can recognize at most 60 seconds of speech.
- Usage time is measured from the speech recognition request until its termination. Periods without speech, such as silent intervals in the middle of a file or of a streaming recognition request, are therefore included in the usage time.
- The same rate applies to the mobile SDK and the REST API.
Case 1) One API call that uses 10 seconds: 1 unit, 4 KRW
Case 2) One API call that uses 32 seconds: 3 units, 12 KRW
Case 3) Using CSR for 65 seconds in one call: only the first 60 seconds (the time limit) are recognized, so 4 units, 16 KRW, are charged.
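The three cases follow directly from the rounding rule and the 60-second cap. A minimal sketch of the calculation, using the rate given in this document (the function name `call_fee_krw` is hypothetical):

```python
import math

UNIT_SECONDS = 15          # billing unit
FEE_PER_UNIT_KRW = 4       # 4 KRW per 15 seconds
MAX_SECONDS_PER_CALL = 60  # recognition limit per API call


def call_fee_krw(duration_seconds: float) -> int:
    """Fee for one API call: cap at 60 seconds, then round up to 15-second units."""
    billable = min(duration_seconds, MAX_SECONDS_PER_CALL)
    units = math.ceil(billable / UNIT_SECONDS)
    return units * FEE_PER_UNIT_KRW


print(call_fee_krw(10))  # 4
print(call_fee_krw(32))  # 12
print(call_fee_krw(65))  # 16
```

Because silent intervals count toward usage time, `duration_seconds` here is the full time from request to termination, not only the time in which speech is present.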