Update README.md
aidinkhosrowshahi committed Jul 29, 2024
1 parent 56cc428 commit 102c77f
Showing 1 changed file with 21 additions and 21 deletions.
42 changes: 21 additions & 21 deletions apigw-log-analytic/README.md
@@ -1,10 +1,10 @@
# Logging API Gateway access logs, analyzing and visualizing them using Amazon QuickSight

[Amazon API Gateway](https://aws.amazon.com/api-gateway/) is a fully managed service that makes it easy for developers to create, publish, maintain, monitor, and secure APIs at any scale. APIs act as the 'front door' for applications to access data, business logic, or functionality from your backend services. API Gateway produces detailed access logs to show who has accessed the API. When using usage plans, a customer identifier is included in the log records. API developers uses these access logs to identify popular routes, errors, authentication methods, users reaching quota limits, and gain deeper insights into API usage.
[Amazon API Gateway](https://aws.amazon.com/api-gateway/) is a fully managed service that makes it easy for developers to create, publish, maintain, monitor, and secure APIs at any scale. APIs act as the 'front door' for applications to access data, business logic, or functionality from your backend services. API Gateway produces detailed access logs to show who has accessed the API. When using usage plans, logs include a customer identifier. API developers rely on these access logs to analyze popular routes, errors, authentication methods, users exceeding quota limits, and gain deeper insights into API usage.

You can use these logs to populate a business intelligence service, such as [Amazon QuickSight](https://aws.amazon.com/quicksight/), to analyze and report on usage patterns across your APIs and customers.
Use these logs to populate a business intelligence service, such as [Amazon QuickSight](https://aws.amazon.com/quicksight/), to analyze and report on usage patterns across your APIs and customers.

In this repo, I will show how to visualize and analyze API Gateway access logs using an [Amazon QuickSight dashboard](https://docs.aws.amazon.com/quicksight/latest/user/example-create-a-dashboard.html). This pre-built dashboard enables you to analyze API usage by visualizing the following components:
In this repo, I will show how to visualize and analyze API Gateway access logs using an [Amazon QuickSight dashboard](https://docs.aws.amazon.com/quicksight/latest/user/example-create-a-dashboard.html). With this pre-built dashboard, you can analyze API usage by visualizing these components:

* 30 days of API usage by domain
* API routes showing popular API paths
@@ -32,25 +32,25 @@ As we are using QuickSight for the visualization part, you have the flexibility
The integration works by forwarding API Gateway access logs from your API Gateway to Amazon S3 bucket via [Amazon Data Firehose](https://www.google.com/search?client=firefox-b-1-d&q=Amazon+Kinesis+Data+Firehose). This solution uses the following AWS services to provide near real-time logging analytics:

* Amazon S3 bucket ensures durable and secure storage.
* Amazon Data Firehose to deliver logs into an S3 bucket.
* Amazon Data Firehose is used to deliver logs into an S3 bucket.
* AWS Lambda function for log enrichment.
* AWS Glue crawler to provide fresher data to QuickSight.
* QuickSight for analytics and visualization.

![Architecture diagram](./assets/apigw-log-analytic-solution-overview.jpg)

## Streamlining API Access Logs
API access logs are streamed in near real-time from API Gateway to Firehose. Firehose buffers these records, enriching them with information from the API usage plans. It then writes batches of enhanced records to an Amazon S3 bucket, ensuring durable and secure storage. To enrich the access logs, an AWS Lambda function is used. The Lambda function retrieves API Gateway usage plan details and loads them into memory. During each invocation, it processes each access log record from Firehose stream by decoding it from a base64-encoded binary. The record is then enriched with the usage plan name and customer name before being re-encoded to base64 binary and returned to the Firehose stream.
API Gateway streams API access logs in near real-time to Firehose. Firehose buffers these records and enriches them with information from the API usage plans. It then writes batches of enhanced records to an Amazon S3 bucket, ensuring durable and secure storage. To enrich the access logs, I use an AWS Lambda function. The Lambda function retrieves API Gateway usage plan details and loads them into memory. During each invocation, it processes each access log record from the Firehose stream by decoding it from a base64-encoded binary. The record is then enriched with the usage plan name and customer name before being re-encoded to base64 binary and returned to the Firehose stream.
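
The decode, enrich, and re-encode loop described above can be sketched as follows. This is a minimal illustration, not the function shipped in this repo: the `USAGE_PLANS` table, the `apiKeyId` field, and the `usage_plan_name` and `customer_name` keys are hypothetical names chosen for the example. The `recordId`, `result`, and `data` fields follow the standard Firehose data transformation contract.

```python
import base64
import json

# Hypothetical in-memory lookup: API key ID -> usage plan and customer names.
# A real function would load this once, outside the handler, from API Gateway.
USAGE_PLANS = {
    "abc123": {"usage_plan_name": "Gold", "customer_name": "ExampleCo"},
}


def enrich(record: dict) -> dict:
    """Add usage plan and customer names to one access log record."""
    # "apiKeyId" is an assumed field name for this sketch.
    plan = USAGE_PLANS.get(record.get("apiKeyId"), {})
    record["usage_plan_name"] = plan.get("usage_plan_name", "unknown")
    record["customer_name"] = plan.get("customer_name", "unknown")
    return record


def lambda_handler(event, context):
    """Firehose transformation handler: decode, enrich, re-encode each record."""
    output = []
    for rec in event["records"]:
        # Firehose delivers each record payload as a base64-encoded string.
        payload = json.loads(base64.b64decode(rec["data"]))
        # Newline-delimit the JSON so records stay separable in S3.
        enriched = json.dumps(enrich(payload)) + "\n"
        output.append({
            "recordId": rec["recordId"],  # must echo the incoming ID
            "result": "Ok",
            "data": base64.b64encode(enriched.encode("utf-8")).decode("utf-8"),
        })
    return {"records": output}
```

Keeping the lookup table at module level means it is built once per execution environment and reused across invocations, which is the "loads them into memory" behavior described above.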

## Indexing Access Logs
Metadata for the API access logs is stored in an [AWS Glue Data Catalog](https://docs.aws.amazon.com/glue/latest/dg/catalog-and-crawler.html), which QuickSight uses for querying. An AWS Glue crawler identifies and indexes newly written access logs. You can adjust the frequency of the crawler to ensure fresher data is available in QuickSight by updating the DataRefreshFrequency parameter of the SAM template when deploying the solution. The default data refresh frequency is every 10 minutes (cron(0/10 * * * ? *)).
I store metadata for the API access logs in an [AWS Glue Data Catalog](https://docs.aws.amazon.com/glue/latest/dg/catalog-and-crawler.html), which QuickSight uses for querying. An AWS Glue crawler identifies and indexes newly written access logs. Adjust the frequency of the crawler to ensure fresher data is available in QuickSight by updating the DataRefreshFrequency parameter of the SAM template when deploying the solution. The default data refresh frequency is every 10 minutes (cron(0/10 * * * ? *)).
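
As a small aid for choosing a DataRefreshFrequency value, the sketch below builds a schedule string in the same six-field AWS cron format as the default. The helper name `every_n_minutes` is hypothetical and is not part of the SAM template.

```python
def every_n_minutes(n: int) -> str:
    """Build an AWS six-field cron expression that fires every n minutes."""
    if not 1 <= n <= 59:
        raise ValueError("n must be between 1 and 59")
    # Fields: minutes hours day-of-month month day-of-week year
    return f"cron(0/{n} * * * ? *)"


print(every_n_minutes(10))  # prints cron(0/10 * * * ? *)
```

The `0/n` minute field restarts at minute 0 of every hour, so values that divide 60 evenly (5, 10, 15, 20, 30) give evenly spaced runs.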

## Visualizing Data
QuickSight is configured to use the S3 location of the access logs as a data source, enabling comprehensive analysis and visualization.
QuickSight uses the S3 location of the access logs as a data source, enabling comprehensive analysis and visualization.

## Pre-requisites

If you have not activated QuickSight in your AWS account, follow the steps below. Otherwise, you can skip to step 2.
If you have not activated QuickSight in your AWS account, follow the steps below. Otherwise, skip to step 2.

1. Create a QuickSight Account.
1. Navigate to QuickSight service from the AWS Management console.
@@ -63,7 +63,7 @@ If you have not activated QuickSight in your AWS account, follow the steps below
3. On the left menu, select “Manage Groups”.
4. Select the “NEW GROUP” button and name the group in the format “<projectName>-Admins” (It is case sensitive). Select “CREATE.” For example, apiaccesslogs-Admins.
5. Add yourself as an administrator to the dashboard by selecting the newly created group name, then click “ADD USER.”
6. Copy the project name without '-Admins' as this is required for the project name parameters in the SAM template. It needs to be the exact same name for deployment.
6. Copy the project name without '-Admins' as this is required for the project name parameters in the SAM template. It needs to be the same name for deployment.

![Pre-requisites1](./assets/apigw-log-analytic-prerequisite2.jpg)

@@ -73,7 +73,7 @@ Note: This solution supports the [REST API Gateway](https://docs.aws.amazon.com/

This solution won’t create an API Gateway in your AWS account and assumes you have an API endpoint. If you do not, follow [this tutorial](https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-create-api-from-example.html) to create one.

Deploy an AWS SAM template into your account. This template creates all components of the analytics pipeline, including an Amazon S3 bucket, Firehose, Lambda functions, AWS Glue, QuickSight dashboards and visuals, using Infrastructure as Code (IaC).
Deploy an AWS SAM template into your account. This template creates components of the analytics pipeline, including an Amazon S3 bucket, Firehose, Lambda functions, AWS Glue, QuickSight dashboards and visuals, using Infrastructure as Code (IaC).

Once deployment is complete, configure your existing API Gateway to deliver access logs to the deployed Firehose stream.

@@ -90,21 +90,21 @@ sam deploy -g

Enter the following parameters for deployment:

- Stack Name: Use if for stack name. For example apgwaccesslogs.
- ProjectName: Use the project name without '-Admins'. Ensure it matches the one created in the Prerequisites section 6.
- DataRefreshFrequency: You can leave as default (every 10 minutes) or customize it based on your requirement.
- Stack Name: Enter a name for the stack. For example, apgwaccesslogs.
- ProjectName: Use the project name without '-Admins'. Ensure it matches the one created in step 6 of the Pre-requisites section.
- DataRefreshFrequency: Leave as default (every 10 minutes) or customize it based on your requirement.

![deployment](./assets/apigw-log-analytic-deployment1.jpg)

![deployment](./assets/apigw-log-analytic-deployment2.jpg)

Wait a couple of minutes for the deployment to complete. Once the stack has been successfully created, copy the Amazon Resource Names (ARNs) of the Firehose stream and S3 bucket from the output. These resource ARNs are required for the next steps in configuring API Gateway access logging.
Wait for a couple of minutes for the deployment to complete. After creating the stack, copy the ARNs of the Firehose stream and S3 bucket from the output. These resource ARNs are required for the next steps in configuring API Gateway access logging.
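
If you prefer to read the stack outputs programmatically instead of copying them from the console, a sketch along these lines may help. It assumes boto3 is installed and AWS credentials are configured. The stack name is whatever you entered during deployment, and the output key names depend on the template, so the example does not assume any.

```python
def outputs_to_dict(stack: dict) -> dict:
    """Flatten a CloudFormation stack description into {OutputKey: OutputValue}."""
    return {o["OutputKey"]: o["OutputValue"] for o in stack.get("Outputs", [])}


def fetch_stack_outputs(stack_name: str) -> dict:
    """Look up the outputs of a deployed stack by name."""
    import boto3  # imported lazily so the helper above works without boto3

    cfn = boto3.client("cloudformation")
    response = cfn.describe_stacks(StackName=stack_name)
    return outputs_to_dict(response["Stacks"][0])
```

For example, `fetch_stack_outputs("apgwaccesslogs")` would return a dictionary in which you can look for the Firehose stream and S3 bucket ARNs.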

![deployment](./assets/apigw-log-analytic-deployment3.jpg)

## Authorize Amazon QuickSight to access your Amazon S3 bucket

1. Followthe instructions to authorize Amazon QuickSight to access API Gateway S3 access logs.
1. Follow the instructions to allow Amazon QuickSight to access API Gateway S3 access logs.
2. Select the S3 access logs bucket that was created, then choose Finish and Save.

![Pre-requisites1](./assets/apigw-log-analytic-qs-security.jpg)
@@ -114,7 +114,7 @@ Wait a couple of minutes for the deployment to complete. Once the stack has been

## Configure API Gateway to stream access logs to this Data Firehose stream

1. You can now configure API Gateway to stream access logs to this Firehose delivery stream. [Follow these instructions](https://docs.aws.amazon.com/apigateway/latest/developerguide/apigateway-logging-to-kinesis.html#set-up-kinesis-access-logging-using-console) to turn on access logging on your API stages using the ARN of the Firehose delivery stream created via SAM template.
1. You can now configure API Gateway to stream access logs to this Firehose delivery stream. [Follow these instructions](https://docs.aws.amazon.com/apigateway/latest/developerguide/apigateway-logging-to-kinesis.html#set-up-kinesis-access-logging-using-console) to enable access logging on your API stages using the ARN of the Firehose delivery stream created by the SAM template.
2. Under Log Format, choose the fields to include in the access logs in JSON format. Find [examples in the API Gateway documentation](https://docs.aws.amazon.com/apigateway/latest/developerguide/set-up-logging.html#apigateway-cloudwatch-log-formats) as well as the [full set of available fields in the $context variable](https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-mapping-template-reference.html#context-variable-reference). The below fields and mapped names are required for the analysis.

```
@@ -125,29 +125,29 @@ Wait a couple of minutes for the deployment to complete. Once the stack has been

## Data Visualization

Once you configure API Gateway access logs, it takes a few minutes for the logs to appear in the QuickSight dashboard. This pre-built dashboard allows you to analyze API usage by visualizing components such as 30 days of API usage by domain, popular API paths, 400 errors (quota exceeded, unauthorized, and invalid-signature) requests, users hitting quota limits, and Cognito-based and IAM authorization. Additionally, you can filter visualizations and reports by date, customer, status, usage plan, IP, and users (IAM or Cognito users). You can also customize these visuals or create new ones for your business use case, as additional context fields have been included in the API access logs.
Once you configure API Gateway access logs, it takes a few minutes for the logs to appear in the QuickSight dashboard. Analyze API usage using this pre-built dashboard, which visualizes components like 30-day API usage by domain, popular API paths, 400 errors, users hitting quota limits, and Cognito-based and IAM authorization. Filter visualizations and reports by date, customer, status, usage plan, IP, and users (IAM or Cognito users). Customize these visuals or create new ones for your business use case, as we have included additional context fields in the API access logs.

**Dashboard overview**

![dashboard overview](./assets/apigw-log-analytic-dashboard-overview.jpg)

This pre-built dashboard allows you to analyze API usage by providing visualizations of the following components:

**Congnito based access control**
**Cognito based access control**

If you are using [Amazon Cognito](https://aws.amazon.com/pm/cognito/?gclid=CjwKCAjw74e1BhBnEiwAbqOAjNsHPd0SoVxdGF33x27NozfK-9PWWQ2_1x62VdHIAajb2B9XUcjCzhoC8hYQAvD_BwE&trk=f5fef02c-2926-48d3-898a-b4d668742a20&sc_channel=ps&ef_id=CjwKCAjw74e1BhBnEiwAbqOAjNsHPd0SoVxdGF33x27NozfK-9PWWQ2_1x62VdHIAajb2B9XUcjCzhoC8hYQAvD_BwE:G:s&s_kwcid=AL!4422!3!651737511575!e!!g!!amazon%20cognito!19845796024!146736269189) to control access to REST APIs, this visual can help you understand which users are interacting with your APIs.

![dashboard overview](./assets/apigw-log-analytic-cognito.jpg)

**400 errors (quota exceeded, unauthorized, and invalid-signature) requests**

This visual provides insights into 4xx errors in your API requests, focusing on quota exceeded, unauthorized, and invalid signature errors. You can identify trends and patterns affecting your API's performance and reliability. This breakdown helps identify root causes, facilitating targeted improvements, stronger security measures, and optimized quota management for a seamless and secure API experience.
This visual provides insights into 4xx errors in your API requests, focusing on quota exceeded, unauthorized, and invalid signature errors. Identify trends and patterns affecting your API's performance and reliability. This breakdown helps pinpoint root causes, enabling targeted improvements, enhanced security measures, and optimized quota management for a seamless and secure API experience.

![dashboard overview](./assets/apigw-log-analytic-400-errors.jpg)

**API by usage plan**

[A usage plan](https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-api-usage-plans.html#api-gateway-api-usage-plans-overview) specifies who can access one or more deployed API stages and methods. This visual provides an overview of API usage by usage plan, domain, and path, highlighting request counts and their utilization across different plans.
[A usage plan](https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-api-usage-plans.html#api-gateway-api-usage-plans-overview) specifies who can access one or more deployed API stages and methods. This visual summarizes API usage by usage plan, domain, and path, highlighting request counts and their utilization across different plans.

![dashboard overview](./assets/apigw-log-analytic-usageplan.jpg)

@@ -179,4 +179,4 @@ sam delete

In this repo, I walked you through how API Gateway's detailed access logs provide insights into API usage. These logs include customer identifiers with usage plans, helping developers identify popular routes, errors, authentication methods, and users reaching quota limits. By using these logs, you can populate QuickSight to analyze and report on API usage patterns across domains and customer segments.

You can customize this dashboard by creating additional visuals and tables based on your business use case. Additionally, you can customize the Lambda enrichment logic, for example, to add geo information for IP addresses and create geographic dashboards within QuickSight to identify geographically separated users.
Customize this dashboard by creating additional visuals and tables based on your business use case. You can also customize the Lambda function enrichment logic, for example, to add geo information for IP addresses and create geographic dashboards within QuickSight to identify geographically separated users.
