https://cuchujr23.substack.com/p/guide-to-integrate-aws-s3-redshift
- Access S3:
- Log in to the AWS console.
- Navigate to S3 from the services panel.
- Create a New Bucket:
- Click on "Create bucket".
- Assign a unique name to the Bucket and select a region.
- Configure the access options and policies according to your needs.
- Complete the Bucket creation.
- Create an IAM Group:
- Access IAM from the AWS services panel.
- Create a new group called S3.
- Assign the AmazonS3FullAccess policy to this group.
- Create an IAM User:
- Create a user named PowerBI.
- Assign the user to the S3 group.
- Generate an Access Key and Secret Access Key for this user.
- Download the credentials and store them securely.
- Install Boto3:
- If not already installed, install Boto3 by running pip install boto3.
- Python Code to Upload Files to S3:
import boto3
from botocore.exceptions import NoCredentialsError
def upload_to_aws(local_file, bucket, s3_file):
s3 = boto3.client(service_name='s3',
region_name= REGION,
aws_access_key_id= ACCESS_KEY,
aws_secret_access_key=SECRET_KEY)
try:
s3.upload_file(local_file, bucket, s3_file)
print("Upload Successful")
return True
except FileNotFoundError:
print("The file was not found")
return False
except NoCredentialsError:
print("Credentials not available")
return False
bucket_name = 'XXXXXXXXXXXXXXXX'
upload_to_aws('path/to/your/file', bucket_name)
- Create a Redshift Cluster:
- Access Redshift from the services panel.
- Create a new cluster and select the most economical node type and the minimum number of nodes.
- Ensure that "Publicly accessible" is enabled in the network and security configuration.
- Load Data from S3 to Redshift:
- Use the Redshift SQL interface or third-party tools to copy data from S3 to Redshift.
- Configure Security Rules in VPC:
- Access the VPC Dashboard in AWS.
- Select "Security Groups".
- Find and select the security group used by Redshift.
- Edit the inbound rules to allow connections:
- Type: Redshift
- Protocol: TCP
- Port: 5439
- Source: My IP (your IP address) or IPv4 and IPv6
- Description: (optional)
- Save the rules.
- Connect Power BI:
- Open Power BI Desktop.
- Select "Get Data" and choose "Amazon Redshift".
- Enter the Redshift cluster information and your credentials.
- Once connected, you can visualize and analyze the loaded data.