Purpose of the article: Amazon Redshift external schema references to the external database in the external data catalog.
Intended Audience: Useful for the people working on AWS Redshift Glue service
Tools and Technology: AWS
Keywords: Redshift External Schemas
Amazon Redshift external schema references to the external database in the external data catalog. The external database can be created in Redshift, in AWS Glue Data Catalog, in Athena, or in an Apache Hive meta store. Such as Amazon EMR. To create an external database in Redshift, the database should reside in Athena Data Catalog. For creating the database in Hive metastore, you need to create the database in the Hive application.
Amazon Redshift requires authorizations/permissions to access the Data Catalog in glue and the files in Amazon S3. To provide permission to create an AWS Identity and Access Management (IAM) role.
Follow the below steps to create an external table using glue:
Step 1: Create an IAM Role for Amazon Redshift.
- Open the IAM console.
- In the navigation pane, choose Roles.
- Choose to Create a role.
- AWS service opens and then choose Redshift.
Choose Redshift – Customizable, under Select your use case, and then choose Next: Permissions.
The Attach permissions Policy page appears. Attach policies AmazonS3ReadOnlyAccess, AWSGlueConsoleFullAccess, and create a new policy of the JSON script, which grants access to the Data Catalog but denies the administrator permissions for Lake Formation.
In the Lake Formation database, grant SELECT permissions on the table to query
- Open lake formation console
- In Table and column permissions, choose Select.
- Choose Grant.
- Attach the policy create
- Enter the Database name and save
Step 2: Associate the IAM role with your cluster
- Log in to the AWS Management Console, and in services, open the Amazon Redshift.
- CLUSTERS select, On the navigation menu, then choose the name of the cluster that you want to update.
- For Actions, choose Manage IAM roles. The IAM roles page appears.
- Select Enter ARN and then enter ARN or IAM role or select IAM role from the list. To add it to the list of Attached IAM roles, select Add IAM role.
- In order to complete the change, the cluster is modified.
- Select completed associating the IAM role with the cluster.
Step 3: Create an external table and an external schema
- Open the editor in Redshift and create a schema and table. Mention the role of ARN in the code to create the external schema.
- Now create an external table and give the reference to the s3 location where the file is present.
Step 4: Query your data in Amazon Redshift
After your external tables are created, you can query them using the SELECT statements to fetch records.
Output: