Question-2: You have developed a translation website that lets a user translate anywhere from 1 word up to 1 lakh (100,000) words at a time. Which of the following can you use to provide the translation solution?
- You would use the Amazon Translate service with the Python programming interface.
- You would use Amazon Translate and AWS Lambda.
- You would use Amazon Translate, AWS Lambda, and S3 to store the uploaded document.
- You would use Amazon Translate and NLTK for Python.
Exp: Read the question carefully: you need a translation service that can handle anything from 1 word to 1 lakh words. However, a single Amazon Translate request accepts at most 5000 bytes of text (bytes, not words), and if you send more than that you get the following error:
An error occurred (TextSizeLimitExceededException) Input text size exceeds limit. Max length of request text allowed is 5000 bytes while in this request the text size is 5074 bytes
To solve this issue, you need to split the input into chunks of at most 5000 bytes each. NLTK (Natural Language Toolkit) for Python is a library that helps you do this intelligently: it finds sentence boundaries, so you can split the text without breaking words in the middle and without introducing grammatical errors. It provides a convenient way to split text into sentences for many different languages. You need to select the tokenizer specific to your language; for English, use the English tokenizer. This is why the option combining Amazon Translate with NLTK for Python fits the requirement.
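The splitting approach described above can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the function names `pack_sentences` and `translate_long_text` are hypothetical, the 5000-byte limit is taken from the error message above, and the target language `es` is only an example. It assumes that `boto3` and `nltk` are installed, that AWS credentials are configured, and that the NLTK Punkt tokenizer data has been downloaded.

```python
def pack_sentences(sentences, max_bytes=5000):
    """Group whole sentences into chunks whose UTF-8 size stays within max_bytes."""
    chunks, current = [], ""
    for sentence in sentences:
        if len(sentence.encode("utf-8")) > max_bytes:
            # A single oversized sentence cannot be packed without cutting it.
            raise ValueError("a single sentence exceeds the byte limit")
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate.encode("utf-8")) <= max_bytes:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


def translate_long_text(text, source="en", target="es", max_bytes=5000):
    """Hypothetical helper: translate text of any length chunk by chunk."""
    # Imported lazily so pack_sentences works without these packages.
    import boto3
    from nltk.tokenize import sent_tokenize  # requires the Punkt tokenizer data

    client = boto3.client("translate")
    sentences = sent_tokenize(text, language="english")
    translated = []
    for chunk in pack_sentences(sentences, max_bytes):
        response = client.translate_text(
            Text=chunk,
            SourceLanguageCode=source,
            TargetLanguageCode=target,
        )
        translated.append(response["TranslatedText"])
    return " ".join(translated)
```

The size check is done on the UTF-8 encoded text because the limit in the error message is expressed in bytes, so non-ASCII characters count for more than one unit each. Joining chunks with a single space is a simplification that does not preserve the original paragraph breaks.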