Table of contents
- 1. Introduction
- 2. Technology Stack Used
- 3. Architecture Overview
- 4. Implementation Steps
- Step 1: Preparing the Dataset
- Step 2: Configuring Amazon Kendra
- Step 3: Implementing AWS Lambda for Data Processing
- Step 4: Setting Up Amazon QuickSight for Visualization
- 5. Results & Insights
- 6. Challenges & Solutions
- 7. Future Enhancements
- 8. Conclusion
1. Introduction
- Brief overview of the project and its objectives
With the increasing volume of unstructured data, such as blog posts, documents, and articles, retrieving relevant information efficiently has become a challenge. Traditional search methods often fail to provide accurate results, and without proper analytics, understanding user behavior and search trends remains difficult.
This case study focuses on building a searchable knowledge base using Amazon Kendra and analyzing user insights and search analytics with Amazon QuickSight. The key objectives include:
Storing blog datasets (.md files) in Amazon S3
Indexing and enabling AI-powered search using Amazon Kendra
Capturing user search behavior and storing insights as structured data in Amazon S3
Visualizing search analytics, trends, and related metrics in Amazon QuickSight
This integration provides an end-to-end solution for intelligent knowledge retrieval and data-driven decision-making through search analytics.
- Importance of AI-powered search (Kendra) and visualization (QuickSight)
Amazon Kendra: AI-Powered Search & Knowledge Base
Traditional keyword-based search methods often return irrelevant results due to a lack of contextual understanding. Amazon Kendra enhances search capabilities by leveraging:
Machine learning & natural language processing (NLP) for context-aware search
Relevance ranking to prioritize the most accurate answers
User query tracking to understand frequently searched topics
Seamless integration with multiple data sources (S3, databases, etc.)
With Amazon Kendra, users can quickly retrieve relevant, intelligent, and context-aware search results, improving information accessibility within the knowledge base.
Amazon QuickSight: Search Analytics & User Insights
Once search data is collected from Amazon Kendra, Amazon QuickSight enables:
Visualization of search trends, user behavior, and query patterns
Analysis of popular search terms, failed searches, and engagement metrics
Creation of interactive dashboards for monitoring search performance and knowledge base usage
- Use case: Analyzing a blog dataset stored in S3
For this case study, a dataset of Markdown (.md) blog posts was stored in Amazon S3, and a searchable knowledge base was created using Amazon Kendra. The key steps include:
Enabling intelligent search to retrieve relevant blog content from Amazon Kendra
Capturing user queries and search activity logs using AWS Lambda
Converting user behavior data into structured CSV format stored in Amazon S3
Visualizing search analytics in Amazon QuickSight, including user engagement metrics
2. Technology Stack Used
Amazon S3 – Storage for the Markdown (.md) blog dataset
Amazon Kendra – AI-powered search service for text-based content
AWS Lambda – Serverless processing that fetches Kendra search metrics and stores them as a CSV file in an S3 bucket
Amazon QuickSight – Visualization of Kendra user metrics
AWS IAM – Managing permissions for AWS services
3. Architecture Overview
- Flow diagram of data processing
How each component interacts:
Data is uploaded to S3
Kendra indexes the dataset and provides search capabilities
Lambda fetches data from Kendra, processes it, and stores it in CSV format in S3
QuickSight connects to the CSV data in S3 to create dashboards (a manual process)
4. Implementation Steps
Step 1: Preparing the Dataset
Creating an S3 bucket
Uploading the Markdown files to the S3 bucket
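The upload can also be scripted instead of done through the console. The following is a minimal sketch, assuming the posts live in a local folder and should land under a `blogs/` key prefix. The function names, the prefix, and the content type are illustrative choices, not part of the original setup; `s3_client` is expected to be a boto3 S3 client.

```python
from pathlib import Path


def plan_uploads(local_dir, key_prefix="blogs/"):
    """Map every .md file under local_dir to the S3 key it would be stored at."""
    root = Path(local_dir)
    plan = []
    for path in sorted(root.rglob("*.md")):
        # Keep the folder structure below local_dir in the object key.
        relative = path.relative_to(root).as_posix()
        plan.append((path, f"{key_prefix}{relative}"))
    return plan


def upload_markdown(s3_client, bucket, local_dir, key_prefix="blogs/"):
    """Upload each planned file and return the keys that were written."""
    uploaded = []
    for path, key in plan_uploads(local_dir, key_prefix):
        s3_client.upload_file(
            str(path), bucket, key,
            ExtraArgs={"ContentType": "text/markdown"},  # assumed content type
        )
        uploaded.append(key)
    return uploaded
```

A call would look like `upload_markdown(boto3.client("s3"), "BUCKET_NAME", "./posts")`. Passing the client in keeps the key-planning logic easy to check locally before anything is sent to AWS.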
Step 2: Configuring Amazon Kendra
- Setting up an Amazon Kendra index
An index is the place where you add your data sources to make them searchable in Amazon Kendra.
- Configuring a data source (S3 as storage)
Amazon S3 is an object storage service that stores data as objects within buckets. You can use Amazon Kendra to index your Amazon S3 bucket repository of documents. Add and sync your data from S3, SharePoint, and other data sources to your index.
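For readers who prefer scripting over the console, the S3 data source can be registered with boto3. This is a sketch under assumptions: the data source name, the `blogs/` prefix, and the helper names are hypothetical, and the IAM role ARN must already grant Kendra read access to the bucket.

```python
def s3_data_source_request(index_id, bucket, role_arn, prefix="blogs/"):
    """Build the keyword arguments for Kendra's create_data_source call."""
    return {
        "IndexId": index_id,
        "Name": "markdown-blogs",  # hypothetical data source name
        "Type": "S3",
        "RoleArn": role_arn,
        "Configuration": {
            "S3Configuration": {
                "BucketName": bucket,
                # Only index objects under this prefix (assumed layout).
                "InclusionPrefixes": [prefix],
            }
        },
    }


def create_and_sync(kendra_client, index_id, bucket, role_arn):
    """Register the bucket as a data source, then request a sync."""
    request = s3_data_source_request(index_id, bucket, role_arn)
    data_source_id = kendra_client.create_data_source(**request)["Id"]
    # Creation is asynchronous; in practice the sync request may need to
    # wait until the data source is reported as active.
    kendra_client.start_data_source_sync_job(Id=data_source_id, IndexId=index_id)
    return data_source_id
```

Here `kendra_client` would be `boto3.client("kendra")`. Separating the request builder from the API calls makes it possible to review the configuration before creating anything.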
- Design, tune and share your search experience
Quickly create and customize your search application using Experience Builder.
Amazon Kendra Experience Builder provides an intuitive visual workflow for quickly building and customizing Kendra-powered search applications in the cloud. Start with a ready-to-use template and easily customize it by dragging and dropping components like filters and sorting options. You can invite collaborators for testing and feedback, then deploy the application when ready. The tool also integrates with AWS IAM Identity Center (the successor to AWS Single Sign-On) for secure access management.
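Outside Experience Builder, the same index can be queried programmatically, which is handy for spot-checking relevance after a sync. The sketch below assumes a boto3 Kendra client and reduces the Query response to a few fields; the helper names and the result limit are illustrative.

```python
def summarize_results(response, limit=5):
    """Reduce a Kendra Query response to (title, uri, excerpt) tuples."""
    summary = []
    for item in response.get("ResultItems", [])[:limit]:
        title = item.get("DocumentTitle", {}).get("Text", "")
        excerpt = item.get("DocumentExcerpt", {}).get("Text", "")
        summary.append((title, item.get("DocumentURI", ""), excerpt.strip()))
    return summary


def search(kendra_client, index_id, text, limit=5):
    """Run one natural-language query against the index."""
    response = kendra_client.query(IndexId=index_id, QueryText=text)
    return summarize_results(response, limit)
```

For example, `search(boto3.client("kendra"), "INDEX_ID", "how do I configure a data source?")` would return up to five matching blog posts with their S3 URIs.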
Step 3: Implementing AWS Lambda for Data Processing
Lambda functions are used to upload search metrics to an Amazon Simple Storage Service (Amazon S3) bucket. This ensures efficient storage and easy retrieval of the metrics for further analysis, helping to monitor and optimize the search application's performance.
Lambda function to:
Extract relevant metrics
Format the extracted data into CSV
Store the CSV file back in S3
- Sample Lambda code snippet (Python)
import csv
import io

import boto3
from botocore.exceptions import BotoCoreError, ClientError


def lambda_handler(event, context):
    kendra_client = boto3.client('kendra', region_name='us-east-1')
    s3_client = boto3.client('s3')

    # Placeholders: replace with the real index ID and reporting window.
    index_id = 'INDEX_ID'
    interval = 'THIS_MONTH'
    metric_types = ['QUERIES_BY_COUNT', 'QUERIES_BY_ZERO_RESULT_RATE', 'AGG_QUERY_DOC_METRICS']
    output_data = {}

    try:
        # Fetch one snapshot per metric type from Kendra search analytics.
        for metric in metric_types:
            response = kendra_client.get_snapshots(IndexId=index_id, MetricType=metric, Interval=interval)
            output_data[metric] = response['SnapshotsData']

        # Write all metrics into a single in-memory CSV.
        csv_buffer = io.StringIO()
        writer = csv.writer(csv_buffer)
        writer.writerow(['Metric Type', 'Query Content', 'Metric 1', 'Metric 2', 'Metric 3'])
        for metric_type, rows in output_data.items():
            for row in rows:
                # Each snapshot row is a list of strings; prefix it with its metric type.
                writer.writerow([metric_type] + row)
        csv_buffer.seek(0)

        # Upload the CSV to the prefix that QuickSight reads from.
        bucket_name = 'BUCKET_NAME'
        s3_file_key = 'kendra_response_csv/kendra_responses.csv'
        s3_client.put_object(Bucket=bucket_name, Key=s3_file_key, Body=csv_buffer.getvalue())

        return {
            'statusCode': 200,
            'body': f'Successfully uploaded Kendra analytics CSV data to S3 bucket {bucket_name} at {s3_file_key}.'
        }
    except (BotoCoreError, ClientError) as e:
        return {
            'statusCode': 500,
            'body': f'Error: {str(e)}'
        }


# Allows a local test run outside Lambda.
if __name__ == '__main__':
    print(lambda_handler({}, {}))
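The handler above reads a single page per metric and writes a fixed header row. One possible refinement, sketched here as an assumption rather than as part of the original function, is to follow `NextToken` until all pages are read and to take the column names from the `SnapshotsDataHeader` field that `get_snapshots` returns. The helper names are hypothetical.

```python
import csv
import io


def fetch_snapshot(kendra_client, index_id, metric_type, interval="THIS_MONTH"):
    """Collect every page of one metric; returns (header, rows)."""
    header, rows, token = [], [], None
    while True:
        kwargs = {"IndexId": index_id, "MetricType": metric_type, "Interval": interval}
        if token:
            kwargs["NextToken"] = token
        page = kendra_client.get_snapshots(**kwargs)
        header = page.get("SnapshotsDataHeader", header)
        rows.extend(page.get("SnapshotsData", []))
        token = page.get("NextToken")
        if not token:
            return header, rows


def snapshot_to_csv(header, rows):
    """Serialize one metric's header and rows as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()
```

Because each metric type can return a different set of columns, writing one CSV file per metric type (for example under separate keys in the same prefix) may be easier to model in QuickSight than a single combined file.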
Step 4: Setting Up Amazon QuickSight for Visualization
- Connecting QuickSight to S3
Amazon QuickSight is connected to the S3 bucket where the processed CSV file is stored. This connection is crucial because it allows QuickSight to directly access and analyze the search metrics data in a centralized location without manual uploads. The data in the CSV format includes key search analytics such as user queries, engagement metrics, and search performance.
QuickSight requires a manifest.json file for this connection:
{
  "fileLocations": [
    {
      "URIPrefixes": [
        "s3://BUCKET_NAME/kendra_response_csv/"
      ]
    }
  ],
  "globalUploadSettings": {
    "format": "CSV"
  }
}
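The manifest can also be generated rather than written by hand, which helps when the bucket or prefix changes between environments. This is a small sketch: the function name is hypothetical, and the `delimiter` and `containsHeader` settings are optional additions that were not in the manifest above.

```python
import json


def build_manifest(bucket, prefix, contains_header=True):
    """Return a QuickSight S3 manifest for all CSV files under a prefix."""
    return {
        "fileLocations": [{"URIPrefixes": [f"s3://{bucket}/{prefix}"]}],
        "globalUploadSettings": {
            "format": "CSV",
            "delimiter": ",",
            # QuickSight expects this flag as a string, not a JSON boolean.
            "containsHeader": "true" if contains_header else "false",
        },
    }


manifest_text = json.dumps(build_manifest("BUCKET_NAME", "kendra_response_csv/"), indent=2)
```

The resulting `manifest_text` can be saved as manifest.json and uploaded in the QuickSight console when creating the S3 data source.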
- Creating a dataset from the CSV file
Datasets serve as the foundation for all visualizations in QuickSight. Properly organizing and formatting data ensures that it can be easily analyzed and visualized. It also simplifies the process of setting up multiple visualizations from the same dataset without having to repeatedly configure the data source.
QuickSight automatically structures the data into rows and columns, making it ready for analysis.
- Designing a dashboard
The data from the dataset is used to create visualizations such as graphs, tables, and filters. Dashboards can be designed to show key metrics, including:
Search volume: The number of searches performed.
Click-through rates: How often users click on search results.
Zero-result queries: How many queries returned no relevant results.
Dashboards provide an interactive and visual representation of key metrics that help in identifying patterns, trends, and areas for improvement. For example, a high number of zero-result queries might indicate gaps in the knowledge base or issues with search relevance.
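To sanity-check a dashboard, the same figures can be computed directly from the CSV. The sketch below is illustrative only: the column names `query`, `count`, and `zero_result_rate` are hypothetical and need to be adjusted to the header of the exported file, and the 0.5 threshold is an arbitrary example.

```python
import csv
import io


def summarize_queries(csv_text, threshold=0.5):
    """Compute total search volume and list queries that often return nothing."""
    reader = csv.DictReader(io.StringIO(csv_text))
    total, gaps = 0, []
    for row in reader:
        total += int(row["count"])
        # Queries at or above the threshold hint at missing content.
        if float(row["zero_result_rate"]) >= threshold:
            gaps.append(row["query"])
    return {"search_volume": total, "likely_gaps": gaps}
```

Queries that end up in `likely_gaps` are candidates for new blog posts or for relevance tuning.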
5. Results & Insights
Screenshot of Kendra search results
Screenshot of Lambda execution logs & CSV in S3
- Screenshot of QuickSight dashboard
6. Challenges & Solutions
Handling Markdown processing & text extraction
Managing IAM permissions for secure access
Optimizing Lambda execution & API response times
Handling QuickSight dataset refresh
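On the IAM side, one way to keep access tight is to grant the Lambda execution role only the two actions the function needs. The following sketch builds such a policy document; the function name is hypothetical, and the action names are assumed to mirror the API calls used in Step 3 (`get_snapshots` and `put_object`).

```python
import json


def lambda_policy(index_arn, bucket, prefix="kendra_response_csv/"):
    """Least-privilege policy sketch for the metrics-export Lambda."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            # Read search analytics snapshots from one index only.
            {"Effect": "Allow", "Action": "kendra:GetSnapshots", "Resource": index_arn},
            # Write CSV output only under the export prefix.
            {"Effect": "Allow", "Action": "s3:PutObject",
             "Resource": f"arn:aws:s3:::{bucket}/{prefix}*"},
        ],
    }


policy_json = json.dumps(lambda_policy("INDEX_ARN", "BUCKET_NAME"), indent=2)
```

The role also needs the usual CloudWatch Logs permissions for Lambda logging, which are omitted here.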
7. Future Enhancements
Automating the entire workflow, including QuickSight dataset updates
Implementing real-time analytics with QuickSight
Expanding the Kendra index with multiple data sources
Enhancing search capabilities with custom relevance tuning
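As a starting point for the first item, a dataset refresh can be requested through the QuickSight API once the Lambda has written a new CSV. This is a sketch under assumptions: it applies to datasets imported into SPICE, the account ID and dataset ID are placeholders, and the function name and ingestion ID format are hypothetical.

```python
import uuid


def refresh_dataset(quicksight_client, account_id, dataset_id):
    """Request a new ingestion (refresh) for a QuickSight dataset."""
    # Each ingestion needs its own ID; a random suffix avoids collisions.
    ingestion_id = f"kendra-refresh-{uuid.uuid4().hex}"
    response = quicksight_client.create_ingestion(
        AwsAccountId=account_id,
        DataSetId=dataset_id,
        IngestionId=ingestion_id,
    )
    return ingestion_id, response.get("IngestionStatus")
```

Called as `refresh_dataset(boto3.client("quicksight"), "ACCOUNT_ID", "DATASET_ID")` at the end of the export, this would remove the manual refresh step from the current workflow.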
8. Conclusion
- Summary of benefits from Kendra-QuickSight integration
Provide users with context-aware, accurate search results using Amazon Kendra’s machine learning and natural language processing features.
Capture valuable user insights and search behavior metrics, which are then stored in Amazon S3 for analysis.
Transform raw search data into meaningful visualizations, trends, and reports with Amazon QuickSight, empowering businesses to monitor and optimize search performance.
- Key takeaways and business impact
Actionable Insights: QuickSight visualizations enable businesses to track key metrics such as search volume, user engagement, and zero-result queries, allowing for continuous improvement in the knowledge base and search functionality.
Data-Driven Decisions: The integration empowers organizations to make informed, data-driven decisions based on search analytics and trends, ultimately driving better customer satisfaction, operational efficiency, and competitive advantage.
Scalable Solution: The combination of Kendra and QuickSight allows businesses to scale their search capabilities and analytics as their data grows, ensuring that they can maintain high performance and deep insights as their needs evolve.