If you are on the AWS free tier, you can analyze 50K units a month for free. A unit is 100 characters. In my example, every tweet is ~2 units. The scheduled job analyzes 10K tweets at once (about 20K units), so the free tier runs out pretty fast, and after that it's $1 per 10K units. Be sure to check pricing before you proceed: https://aws.amazon.com/comprehend/pricing/
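To make the arithmetic above concrete, here is a rough cost estimator. It is only a sketch built on the figures quoted in this README (100 characters per unit, 50K free units a month, $1 per 10K units); the function names are made up for illustration, and the pricing page remains the authority on rounding, minimum charges, and current rates.

```python
import math

# Figures taken from the paragraph above; verify them on the pricing page.
UNIT_CHARS = 100               # one unit = 100 characters
FREE_UNITS_PER_MONTH = 50_000  # free tier allowance
UNITS_PER_DOLLAR = 10_000      # $1 per 10K units after the free tier


def units_for_text(text):
    """Approximate number of units for one document (simple round-up)."""
    return math.ceil(len(text) / UNIT_CHARS)


def monthly_cost(units_per_job, jobs_per_month, free_units=FREE_UNITS_PER_MONTH):
    """Estimated monthly charge in dollars for repeated jobs of the same size."""
    total_units = units_per_job * jobs_per_month
    billable_units = max(0, total_units - free_units)
    return billable_units / UNITS_PER_DOLLAR


# 10K tweets per job at ~2 units per tweet, as in the example above.
units_per_job = 10_000 * 2

for jobs in (1, 2, 3, 30):
    print(f"{jobs} job(s) a month: ${monthly_cost(units_per_job, jobs):.2f}")
```

Under these assumptions, two jobs a month still fit in the free tier, a third costs about $1, and a daily schedule (30 jobs, 600K units) comes to about $55 a month.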
- Create an AWS account
- Create an S3 bucket for your files
My blog post with screenshots: https://zyabkina.com/no-code-natural-language-processing-with-amazon-comprehend/
- Prepare your data. I used the Twitter API to create .csv files for specific queries: https://github.com/tanyazyabkina/AmazonComprehend/blob/master/get_tweets_for_terms.ipynb
- Upload your data into your S3 bucket. Create a separate folder for the results.
- Create an analysis job for your data.
- Once the job finishes running, download the output file.
- Process the JSON output to understand the results. https://github.com/tanyazyabkina/AmazonComprehend/blob/master/AWS_Comprehend_JSON_to_CSV.ipynb
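The last step can be sketched in a few lines of Python. This is not the code from the notebook linked above, just a minimal illustration under some assumptions: the job was a sentiment analysis, the downloaded archive is a gzipped tar containing a single file named `output`, and that file holds one JSON object per line with `File`, `Line`, `Sentiment`, and `SentimentScore` fields. Other job types (entities, key phrases) produce different fields, so `flatten_sentiment` would need adjusting. The function names are hypothetical.

```python
import csv
import io
import json
import tarfile


def read_comprehend_output(fileobj, member_name="output"):
    """Return one dict per JSON line found in the job's .tar.gz archive."""
    rows = []
    with tarfile.open(fileobj=fileobj, mode="r:gz") as archive:
        member = archive.extractfile(member_name)
        if member is None:
            raise ValueError(f"{member_name!r} is not a regular file in the archive")
        for raw_line in io.TextIOWrapper(member, encoding="utf-8"):
            line = raw_line.strip()
            if line:
                rows.append(json.loads(line))
    return rows


def flatten_sentiment(record):
    """Turn the nested SentimentScore dict into flat Score_* columns."""
    flat = {
        "File": record.get("File"),
        "Line": record.get("Line"),
        "Sentiment": record.get("Sentiment"),
    }
    for label, score in record.get("SentimentScore", {}).items():
        flat[f"Score_{label}"] = score
    return flat


def write_csv(rows, text_file):
    """Write flat dicts as CSV, using the union of their keys as the header."""
    fieldnames = sorted({key for row in rows for key in row})
    writer = csv.DictWriter(text_file, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)


# Example usage (file names are placeholders):
# with open("output.tar.gz", "rb") as archive_file:
#     rows = [flatten_sentiment(r) for r in read_comprehend_output(archive_file)]
# with open("sentiment.csv", "w", newline="") as csv_file:
#     write_csv(rows, csv_file)
```

The flattened rows are plain dictionaries, so they can also be passed straight to `pandas.DataFrame(rows)` if you prefer to continue the analysis in pandas.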
- Create an AWS account
- Create an admin user in IAM and get the access keys
- Install the AWS Command Line Interface (AWS CLI) and configure it to use your access keys
- Create an S3 bucket for your files
- Install the Python packages boto3 and pandas (json and tarfile are part of the Python standard library and need no installation)
- Prepare your data. I used the Twitter API to create .csv files for specific queries: https://github.com/tanyazyabkina/AmazonComprehend/blob/master/get_tweets_for_terms.ipynb
- Upload your data into the S3 bucket.
- Create analysis job.
- Run analysis job.
- Download results from S3 bucket.
- Process the JSON output to understand the results.
Code for 2-6: https://github.com/tanyazyabkina/AmazonComprehend/blob/master/Use_Amazon_Comprehend_API_GitHub_.ipynb
JSON to CSV conversion for additional outputs: https://github.com/tanyazyabkina/AmazonComprehend/blob/master/AWS_Comprehend_JSON_to_CSV.ipynb
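For orientation, the upload, create, run, and download steps of the API workflow can be sketched with boto3 roughly as follows. This is not the notebook's code. It assumes a sentiment detection job with one document per line, and it assumes you have an IAM role that Comprehend can use to read and write your bucket (`DataAccessRoleArn` is required by the API). The bucket name, keys, role ARN, and job name in `main()` are placeholders, and `run_sentiment_job`, `wait_for_job`, and `split_s3_uri` are hypothetical helper names.

```python
import time
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/some/key' into ('bucket', 'some/key')."""
    parsed = urlparse(uri)
    return parsed.netloc, parsed.path.lstrip("/")


def wait_for_job(comprehend, job_id, poll_seconds=60):
    """Poll the job until it reaches a terminal status, then return its properties."""
    while True:
        response = comprehend.describe_sentiment_detection_job(JobId=job_id)
        props = response["SentimentDetectionJobProperties"]
        if props["JobStatus"] in ("COMPLETED", "FAILED", "STOPPED"):
            return props
        time.sleep(poll_seconds)


def run_sentiment_job(s3, comprehend, local_file, bucket, input_key, output_prefix,
                      role_arn, job_name, local_output="output.tar.gz", poll_seconds=60):
    """Upload the data, run a sentiment job, and download the result archive."""
    s3.upload_file(local_file, bucket, input_key)

    job = comprehend.start_sentiment_detection_job(
        InputDataConfig={
            "S3Uri": f"s3://{bucket}/{input_key}",
            "InputFormat": "ONE_DOC_PER_LINE",  # assumption: one tweet per line
        },
        OutputDataConfig={"S3Uri": f"s3://{bucket}/{output_prefix}"},
        DataAccessRoleArn=role_arn,
        JobName=job_name,
        LanguageCode="en",
    )

    props = wait_for_job(comprehend, job["JobId"], poll_seconds)
    if props["JobStatus"] != "COMPLETED":
        raise RuntimeError(f"Job ended with status {props['JobStatus']}")

    # Comprehend reports the full location of the archive it wrote.
    out_bucket, out_key = split_s3_uri(props["OutputDataConfig"]["S3Uri"])
    s3.download_file(out_bucket, out_key, local_output)
    return local_output


def main():
    import boto3  # imported here so the helpers above load without AWS set up

    run_sentiment_job(
        s3=boto3.client("s3"),
        comprehend=boto3.client("comprehend"),
        local_file="tweets.csv",                    # placeholder
        bucket="my-comprehend-bucket",              # placeholder
        input_key="input/tweets.csv",               # placeholder
        output_prefix="results/",                   # placeholder
        role_arn="arn:aws:iam::123456789012:role/ComprehendDataAccess",  # placeholder
        job_name="tweet-sentiment",                 # placeholder
    )
```

Calling `main()` runs the whole sequence against your account, so check the pricing section first. Passing the clients in as arguments keeps the helpers easy to try out with stand-in objects before spending any units.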
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/comprehend.html
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html