DynamoDB Key Diagnostics Library for Java

The DynamoDB Key Diagnostics Library is a wrapper around the DynamoDB Java SDK that can be used to log key usage information for DynamoDB into a Kinesis stream, which can then be used to track hot spots (keys).

License	Amazon Software License
Categories	AWS Container PaaS Providers KeY Data Data Formats Formal Verification
GroupId	com.amazonaws
ArtifactId	dynamodb-key-diagnostics-library
Last Version	1.1
Release Date	Feb 19, 2019
Type	jar
Project URL	https://aws.amazon.com/dynamodb
Source Code Management	https://github.com/awslabs/dynamodb-key-diagnostics-library.git

Download dynamodb-key-diagnostics-library

Filename	Size
dynamodb-key-diagnostics-library-1.1.pom
dynamodb-key-diagnostics-library-1.1.jar	25 KB
dynamodb-key-diagnostics-library-1.1-sources.jar	8 KB
dynamodb-key-diagnostics-library-1.1-javadoc.jar	33 KB

How to add to project

Apache Maven

<!-- https://jarcasting.com/artifacts/com.amazonaws/dynamodb-key-diagnostics-library/ -->
<dependency>
    <groupId>com.amazonaws</groupId>
    <artifactId>dynamodb-key-diagnostics-library</artifactId>
    <version>1.1</version>
</dependency>

Gradle Groovy

// https://jarcasting.com/artifacts/com.amazonaws/dynamodb-key-diagnostics-library/
implementation 'com.amazonaws:dynamodb-key-diagnostics-library:1.1'

Gradle Kotlin

// https://jarcasting.com/artifacts/com.amazonaws/dynamodb-key-diagnostics-library/
implementation ("com.amazonaws:dynamodb-key-diagnostics-library:1.1")

Apache Buildr

'com.amazonaws:dynamodb-key-diagnostics-library:jar:1.1'

Apache Ivy

<dependency org="com.amazonaws" name="dynamodb-key-diagnostics-library" rev="1.1">
  <artifact name="dynamodb-key-diagnostics-library" type="jar" />
</dependency>

Groovy Grape

@Grapes(
@Grab(group='com.amazonaws', module='dynamodb-key-diagnostics-library', version='1.1')
)

Scala SBT

libraryDependencies += "com.amazonaws" % "dynamodb-key-diagnostics-library" % "1.1"

Leiningen

[com.amazonaws/dynamodb-key-diagnostics-library "1.1"]

Dependencies

compile (9)

Group / Artifact	Type	Version
com.amazonaws : aws-java-sdk-dynamodb	jar	1.11.459
com.amazonaws : aws-java-sdk-kinesis	jar	1.11.459
com.amazonaws : amazon-kinesis-client	jar	1.9.2
org.apache.commons : commons-math3	jar	3.2
com.google.guava : guava	jar	26.0-jre
com.google.collections : google-collections	jar	1.0-rc2
com.google.code.gson : gson	jar	2.8.5
org.slf4j : slf4j-log4j12	jar	1.7.25
commons-logging : commons-logging	jar	1.2

provided (2)

Group / Artifact	Type	Version
org.projectlombok : lombok	jar	1.16.16
com.google.code.findbugs : annotations	jar	3.0.1

test (4)

Group / Artifact	Type	Version
org.apache.commons : commons-lang3	jar	3.3.2
org.mockito : mockito-core	jar	2.19.0
junit : junit	jar	4.12
org.hamcrest : hamcrest-all	jar	1.3

Project Modules

There are no modules declared in this project.

DynamoDB Key Diagnostics Library 🌡️

DynamoDB Key Diagnostics Library is a Java DynamoDB client wrapper that automatically logs your key usage information to Kinesis as your application reads/writes data from/to DynamoDB. You can then use Kinesis Data Analytics to feed into CloudWatch to monitor and alarm if any single key gets too hot, and to feed into S3/Athena/QuickSight to report on your detailed key usage and have heatmaps to help diagnose your application.

.
├── README.md                                                   <-- This instructions file
├── LICENSE.txt                                                 <-- Apache Software License 2.0
├── NOTICE.txt                                                  <-- Copyright notices
├── checkstyle.xml                                              <-- Checkstyle for the Key Diagnostics client
├── pom.xml                                                     <-- Java dependencies, Docker integration test orchestration
├── resources
│   ├── python                                                  <-- Contains the AWS Lambda function for emitting hot key metrics and logs
│   │   ├── src
│   │   │   └── diagnostics
│   │   │       └── hot_key_logger_lambda.py                    <-- The actual Lambda function
│   │   └── tests
│   │       └── diagnostics
│   │           └── test_hot_key_logger_lambda.py               <-- Unit test for the Lambda function
│   └── DynamoDB_Key_Diagnostics_Library.yml                    <-- CloudFormation template, see "Packaging and Deployment" section below
├── src
│   ├── main
│   │   └── java
│   │       └── com.amazonaws.services.dynamodb.diagnostics     <-- Main package for Key Diagnostics Client
│   │           ├── DynamoDBKeyDiagnosticsClient.java           <-- Contains inject methods for handler entrypoints
│   │           ├── DynamoDBKeyDiagnosticsClientAsync.java      <-- Similar to DynamoDBKeyDiagnosticsClient, but supports async DynamoDB APIs
│   │           ├── DynamoDBKeyDiagnosticsClientBuilder.java    <-- Provides dependencies like the DynamoDB client for injection
│   │           └── KinesisStreamReporter                       <-- Reporter class that asynchronously sends key usage information to a Kinesis stream
│   └── test                                                    <-- Unit and integration tests
│       └── java
│           └── com.amazonaws.services.dynamodb.diagnostics     <-- Contains integration tests and unit tests.
│
└── samples                                                     <-- Contains the Movies demo application that uses the Key Diagnostics client
     └── movies
         ├── main
         │   ├── java
         │   │   └── com.amazonaws.services.dynamodb.diagnostics.demo
         │   │       └── MoviesApplication.java                 <-- Simulates hot key scenario with certain "hit movies"
         │   │
         │   └── resources
         │       └── log4j.properties                           <-- Log4j configuration for the demo application
         ├── checkstyle.xml                                     <-- Checkstyle for the Movies demo application
         └── pom.xml                                            <-- Java and the Key Diagnostics client dependencies

Prerequisites

To use the DynamoDB Key Diagnostics Library or run the demo, you must have the following:

Java 1.8 or later
Maven 3 or later
An AWS account

Setup process

Note: At this time, the library aggregates key metrics at minute and second granularity. Depending on your business requirements, you may choose to modify the client to aggregate data differently. Currently, you can deploy the provided CloudFormation template in the following AWS Regions: US East (N. Virginia), US West (Oregon), EU (Ireland), and EU (Frankfurt).
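
The per-second and per-minute aggregation mentioned in the note can be pictured with a small sketch. This is not the library's code: the class and method names (KeyUsageAggregator, record, totalFor) are hypothetical, and the sketch assumes each access arrives with an epoch-millisecond timestamp and the I/O units it consumed.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not part of the Key Diagnostics Library.
final class KeyUsageAggregator {
    private final ChronoUnit granularity; // ChronoUnit.SECONDS or ChronoUnit.MINUTES
    // bucket start -> (key value -> total I/O units consumed in that bucket)
    private final Map<Instant, Map<String, Double>> buckets = new HashMap<>();

    KeyUsageAggregator(ChronoUnit granularity) {
        this.granularity = granularity;
    }

    // Adds one access to the bucket that contains its timestamp.
    void record(long epochMillis, String keyValue, double ioUnits) {
        Instant bucket = Instant.ofEpochMilli(epochMillis).truncatedTo(granularity);
        buckets.computeIfAbsent(bucket, b -> new HashMap<>())
               .merge(keyValue, ioUnits, Double::sum);
    }

    // Returns the total I/O recorded for a key in the bucket that contains the timestamp.
    double totalFor(long epochMillis, String keyValue) {
        Instant bucket = Instant.ofEpochMilli(epochMillis).truncatedTo(granularity);
        return buckets.getOrDefault(bucket, Collections.emptyMap())
                      .getOrDefault(keyValue, 0.0);
    }
}
```

Aggregating at a different granularity would then amount to constructing the aggregator with another unit, for example ChronoUnit.HOURS.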

With the following setup, the DynamoDB Key Diagnostics Library will log the values of the partition key, sort key, or any other attributes for the selected DynamoDB table. The key usage information will be stored in Amazon S3, and specific hot keys will be logged and displayed through Amazon CloudWatch and Amazon QuickSight.

Step 1: Install the Key Diagnostics Library

To install the Key Diagnostics Library, run the following command:

mvn install

Step 2: Configure your AWS credentials

Configure your AWS CLI credentials, if you haven't already. The AWS resources in the following steps will be created under the configured account.

aws configure

Make sure the configured credentials have permissions for Amazon S3, AWS Lambda, Amazon Kinesis, Amazon CloudWatch, and AWS CloudFormation.

Step 3: Create and deploy the required AWS resources by using the CloudFormation template

You will now deploy a Lambda function for reporting and monitoring metrics. To do this, first upload the provided Lambda function to Amazon S3. If you don't have an Amazon S3 bucket already, create one. (Throughout the following instructions, replace the placeholder names with your own names.)

export BUCKET_NAME=my-cool-new-bucket
aws s3 mb s3://$BUCKET_NAME

Then, package the provided Hot Key Lambda function and upload it to the Amazon S3 bucket.

aws cloudformation package \
    --template-file resources/DynamoDB_Key_Diagnostics_Library.yaml \
    --s3-bucket $BUCKET_NAME \
    --output-template-file packaged.yaml

You can then create the rest of the necessary AWS resources (such as the Amazon Kinesis Data Streams stream, Amazon Kinesis Data Analytics application, and CloudWatch alarm). Also, provide a CloudFormation stack name.

STACK_NAME=KeyDiagnosticsStack

aws cloudformation deploy \
    --template-file packaged.yaml \
    --stack-name $STACK_NAME \
    --capabilities CAPABILITY_IAM

Customizing your Kinesis Data Stream according to your DynamoDB Table

Depending on the provisioned capacity of your DynamoDB table, you may want to change the shard count of the Kinesis data stream used to process your requests. The default and minimum is 4 shards. To override the shard count, run the deploy command with a parameter override instead:

SHARD_COUNT=10

aws cloudformation deploy \
    --template-file packaged.yaml \
    --stack-name $STACK_NAME \
    --capabilities CAPABILITY_IAM \
    --parameter-overrides KinesisSourceStreamShardCount=$SHARD_COUNT
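
One way to pick a shard count is to work back from the expected request rate. The sketch below is a rough estimate under stated assumptions, not guidance from the library: it assumes one Kinesis record per DynamoDB request (the client may batch or aggregate, which would lower the real number) and uses the per-shard write limits of Kinesis Data Streams, about 1,000 records or 1 MB per second. The class and method names are hypothetical.

```java
// Hypothetical sizing helper; not part of the Key Diagnostics Library.
final class ShardEstimator {
    static final int MIN_SHARDS = 4;                          // default and minimum stated above
    static final long RECORDS_PER_SHARD_PER_SECOND = 1_000L;  // Kinesis per-shard write limit
    static final long BYTES_PER_SHARD_PER_SECOND = 1_000_000L; // about 1 MB/s per shard

    // Assumes one Kinesis record per DynamoDB request (an assumption, see the text above).
    static int estimateShards(long requestsPerSecond, long avgRecordBytes) {
        long byCount = ceilDiv(requestsPerSecond, RECORDS_PER_SHARD_PER_SECOND);
        long byBytes = ceilDiv(requestsPerSecond * avgRecordBytes, BYTES_PER_SHARD_PER_SECOND);
        return (int) Math.max(MIN_SHARDS, Math.max(byCount, byBytes));
    }

    // Integer division rounded up, for non-negative operands.
    private static long ceilDiv(long a, long b) {
        return (a + b - 1) / b;
    }
}
```

For example, ShardEstimator.estimateShards(10_000, 200) is limited by record count rather than bytes, so the result could be used as the SHARD_COUNT value.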

CloudFormation does not automatically start the Kinesis Data Analytics application, so to start the application, navigate to the Amazon Kinesis console or run the following commands.

# Find the Kinesis Data Analytics application name in the Kinesis console or with `aws kinesisanalytics list-applications`
KINESIS_ANALYTICS_APP_NAME="Put your application name here"

# Then, find the input ID (--output text avoids a JSON-quoted value)
INPUT_ID=$(aws kinesisanalytics describe-application \
    --application-name "$KINESIS_ANALYTICS_APP_NAME" \
    --query 'ApplicationDetail.InputDescriptions[0].InputId' \
    --output text)

# Start the Kinesis Data Analytics app
aws kinesisanalytics start-application \
    --application-name "$KINESIS_ANALYTICS_APP_NAME" \
    --input-configurations "Id=$INPUT_ID,InputStartingPositionConfiguration={InputStartingPosition=NOW}"

You are now ready to run the demo Movies example application in the repository (Step 3.1) or change your code to use the Key Diagnostics Library (Step 3.2).

Step 3.1: Running the Demo

This demo uses the IMDb metadata dataset to create an application that rates movies by putting items into DynamoDB. Certain movies will be "trending", which creates an uneven load on certain hash keys.

After you have installed the Key Diagnostics Library dependencies and set up all the AWS resources, navigate to the samples/movies/ directory.

Then, execute the demo by running:

KINESIS_STREAM_NAME="Put your Kinesis Data Stream name here"
REGION="Put the region where the Kinesis Data Stream and DynamoDB table are set up"

mvn package exec:java@movies -Dexec.args="trend $KINESIS_STREAM_NAME $REGION"

Step 3.2: Integrating with your existing DynamoDB code

To use the Key Diagnostics client, first create a Kinesis stream for it to log to. Then pass that stream name, a Kinesis client, and the DynamoDB client you are wrapping to create the Key Diagnostics client. You also need to specify which key attributes in which tables to monitor. The easiest way is the factory method that monitors all the key attributes for all the tables and global secondary indexes in your account:

DynamoDBKeyDiagnosticsClient ddbClient = DynamoDBKeyDiagnosticsClient.monitorAllPartitionKeys(
    dynamoDB,
    kinesisClient,
    kinesisStreamName
);

If you do need to specify your own attributes to monitor (e.g. if you are considering creating a new global secondary index on a new attribute and are wondering if it has hot values) then you can create it as follows:

DynamoDBKeyDiagnosticsClient ddbClient = new DynamoDBKeyDiagnosticsClient(
    dynamoDB,
    kinesisClient,
    kinesisStreamName,
    ImmutableMap.of("MyTable", ImmutableList.of("MyAttribute"))
);
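
To make the table-to-attributes map concrete, here is a sketch of how a wrapper could pick the monitored values out of an item. It is an illustration under assumptions, not the library's implementation: the class MonitoredAttributes and its extract method are hypothetical, and item values are simplified to plain strings instead of the SDK's attribute value type.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the library's implementation.
final class MonitoredAttributes {
    // table name -> names of the attributes to monitor in that table
    private final Map<String, List<String>> attributesByTable;

    MonitoredAttributes(Map<String, List<String>> attributesByTable) {
        this.attributesByTable = attributesByTable;
    }

    // Returns the monitored attribute names and values present in the item,
    // or an empty map if the table is not monitored.
    Map<String, String> extract(String tableName, Map<String, String> item) {
        Map<String, String> monitored = new LinkedHashMap<>();
        List<String> names =
                attributesByTable.getOrDefault(tableName, Collections.emptyList());
        for (String name : names) {
            String value = item.get(name);
            if (value != null) {
                monitored.put(name, value);
            }
        }
        return monitored;
    }
}
```

With the configuration from the example above, an item written to "MyTable" would contribute only its "MyAttribute" value, and items in other tables would contribute nothing.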

After you have created the diagnostics client, you can use it everywhere you would have used the regular AmazonDynamoDB client (it implements the AmazonDynamoDB interface). The diagnostics client creates a thread pool to asynchronously log the key usage information to Kinesis, so when you are done with it you should close() it so that it can shut down those threads.
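
The reason close() matters can be seen in a minimal sketch of the general pattern: a logger that hands records to a thread pool and drains it on close. This is not the library's code. AsyncKeyUsageLogger is a hypothetical name, the Consumer stands in for "send to Kinesis", and the pool size and timeout are arbitrary choices for the example.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical illustration only; not the library's implementation.
final class AsyncKeyUsageLogger implements AutoCloseable {
    private final ExecutorService executor = Executors.newFixedThreadPool(2);
    private final Consumer<String> sink; // stands in for "send to Kinesis"

    AsyncKeyUsageLogger(Consumer<String> sink) {
        this.sink = sink;
    }

    // Queues the record and returns immediately, so the caller is not blocked by logging.
    void log(String keyUsageRecord) {
        executor.submit(() -> sink.accept(keyUsageRecord));
    }

    // Stops accepting work, then waits briefly for queued records to be delivered.
    @Override
    public void close() {
        executor.shutdown();
        try {
            if (!executor.awaitTermination(5, TimeUnit.SECONDS)) {
                executor.shutdownNow();
            }
        } catch (InterruptedException e) {
            executor.shutdownNow();
            Thread.currentThread().interrupt();
        }
    }
}
```

Without the call to close(), the pool's non-daemon threads would keep the JVM alive after the application has finished its work.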

Step 4: Visualization through Amazon Athena and QuickSight

If you are interested in creating dashboards or querying the key usage information, or wish to understand the access patterns of certain attributes, we highly recommend setting up Athena and QuickSight.

First, go to the Athena Console, and put in the following under New query 1, then click Run Query. This will create an Athena database for the key usage information stored on S3:

CREATE DATABASE IF NOT EXISTS dynamodbkeydiagnosticslibrary
COMMENT 'Athena database for DynamoDB Key Diagnostics Library';

Then, create the Athena table. Following the demo app, we will use movies as the table name. If you created the AWS resources with the provided CloudFormation template in Step 3, the S3 location should be something similar to: s3://keydiagnosticsstack-aggregatedresultbucket-ejkhrnvyw8ku/keydiagnostics/

CREATE EXTERNAL TABLE `movies`(
    `second` timestamp COMMENT 'Second aggregated results',
    `tablename` string COMMENT 'DynamoDB table name',
    `hashkey` string COMMENT 'The partition key attribute name',
    `hashkeyvalue` string COMMENT 'The partition key attribute value',
    `operation` string COMMENT 'DynamoDB operation',
    `totalio` float COMMENT 'Total IO consumed')
ROW FORMAT SERDE
    'org.openx.data.jsonserde.JsonSerDe'
STORED AS INPUTFORMAT
    'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
    'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
    's3://keydiagnosticsstack-aggregatedresultbucket-ejkhrnvyw8ku/keydiagnostics/'

After setting up the Athena table, you can use QuickSight to visualize the key usage pattern of your application:

Go to the QuickSight console and click Manage data on the upper right.
Click New data set, choose Athena and pick a data source name. Then, you should be able to select the Athena database and table created in the previous section.
Choose Import to SPICE for quicker analytics, then click Visualize!
Now you should be able to create graphs by filtering on table name, time range, partition key, operation, and so on. For example, a heat map can show which movies are popular over a certain time range.

Testing

Running unit tests

We use JUnit for testing our code. Unit tests mock out the AmazonDynamoDBClient and do not require connectivity to a DynamoDB endpoint. You can run unit tests with the following command:

mvn test

Running integration tests

Integration tests do not mock out the AmazonDynamoDBClient and require connectivity to a DynamoDB endpoint. As such, the POM starts DynamoDB Local from the Docker Hub image for integration tests.

mvn verify -P integration-tests

Amazon Web Services - Labs

AWS Labs

Versions

Version
1.1 Feb 19, 2019
1.0 Dec 14, 2018
