A Python CLI application that randomly selects and uploads JSON files from a specified folder to an AWS S3 bucket at regular intervals.
- Python 3.11
- AWS account with S3 access
- Required Python packages (see requirements.txt)
- Log into AWS Management Console
- Navigate to S3 service
- Make sure you're in the correct AWS Region
- Click "Create bucket"
- Configure bucket settings:
  - Choose "General purpose" bucket type
  - Choose a globally unique bucket name (this will be your S3_BUCKET_NAME in .env)
  - Leave most settings as default
  - Click "Create bucket"
- Go to IAM service in AWS Console
- Click "Users" → "Create user"
- Give your user a name (e.g., "s3-uploader")
- Do NOT check the box next to "Provide user access to the AWS Management Console"
- Click "Next: Permissions"
- Click "Attach policies directly"
- Create a new policy (use the button in the policy section)
- On the next page, choose JSON in the Policy Editor
- Copy and paste the following (replace YOUR-BUCKET-NAME with your actual bucket name):

      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Action": ["s3:PutObject", "s3:GetObject", "s3:ListBucket"],
            "Resource": [
              "arn:aws:s3:::YOUR-BUCKET-NAME",
              "arn:aws:s3:::YOUR-BUCKET-NAME/*"
            ]
          }
        ]
      }

- Give the policy a name (e.g., "S3UploadAccess")
- Attach this policy to your user
- Complete the user creation
- IMPORTANT: Save the Access Key ID and Secret Access Key - these are your credentials for the .env file
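If you prefer to generate the policy document rather than editing the bucket name by hand, the following is a minimal sketch. The helper name build_upload_policy and the bucket name "my-example-bucket" are illustrative assumptions and not part of this repository; the actions and resource ARNs mirror the policy shown in the IAM steps.

```python
import json


def build_upload_policy(bucket_name: str) -> dict:
    """Return the IAM policy from the steps above, scoped to one bucket.

    build_upload_policy is a hypothetical helper, not part of this project.
    """
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:PutObject", "s3:GetObject", "s3:ListBucket"],
                # The bucket ARN covers ListBucket; the /* ARN covers the objects.
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
            }
        ],
    }


if __name__ == "__main__":
    # "my-example-bucket" is a placeholder; substitute your own bucket name.
    print(json.dumps(build_upload_policy("my-example-bucket"), indent=2))
```

The printed JSON can be pasted into the Policy Editor in place of the hand-edited version.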
- Clone this repository
- Install dependencies:
pip install -r requirements.txt
- Copy .env.example to .env:

      cp .env.example .env

- Edit the .env file with your AWS credentials:

      AWS_ACCESS_KEY_ID=your_access_key_here
      AWS_SECRET_ACCESS_KEY=your_secret_key_here
      AWS_REGION=your_aws_region
      S3_BUCKET_NAME=your_bucket_name

- Update the configuration variables in src/s3_uploader.py:
  - DATA_FOLDER: Path to your JSON files
  - UPLOAD_INTERVAL: Time between uploads in seconds
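Before running the uploader, it can help to confirm that the .env file contains all four keys listed above. The sketch below uses only the standard library and a deliberately simple KEY=value parser. The function names parse_env_file and missing_keys are hypothetical, and this is not necessarily how src/s3_uploader.py itself reads the file.

```python
from pathlib import Path

# The four keys the setup steps ask you to fill in.
REQUIRED_KEYS = (
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_REGION",
    "S3_BUCKET_NAME",
)


def parse_env_file(path) -> dict:
    """Parse simple KEY=value lines; blank lines and # comments are skipped."""
    values = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: dict) -> list:
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]


if __name__ == "__main__":
    env_path = Path(".env")
    if env_path.exists():
        print("Missing keys:", missing_keys(parse_env_file(env_path)) or "none")
    else:
        print("No .env file found; run: cp .env.example .env")
```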
Run the script:

    python src/s3_uploader.py

The script will:
- Load AWS credentials from the .env file
- Connect to your S3 bucket
- Randomly select a JSON file from the specified folder
- Upload it to the S3 bucket
- Wait for the specified interval
- Repeat the process
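The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual contents of src/s3_uploader.py: the function names, the max_uploads parameter, and the choice to stop when the folder has no JSON files are all hypothetical. The only boto3 calls used are boto3.client("s3", ...) and the client's upload_file method, and boto3 is imported lazily so the selection and loop logic can be tried without it.

```python
import random
import time
from pathlib import Path
from typing import Callable, Optional


def pick_random_json(folder) -> Optional[Path]:
    """Return a randomly chosen *.json file from folder, or None if there are none."""
    files = sorted(Path(folder).glob("*.json"))
    return random.choice(files) if files else None


def make_s3_uploader(bucket_name: str, region: str) -> Callable[[Path], None]:
    """Build an upload function backed by boto3 (assumed to be installed).

    boto3 reads AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY from the process
    environment, so the .env values must already be loaded into it.
    """
    import boto3  # third-party; imported here so the rest runs without it

    client = boto3.client("s3", region_name=region)

    def upload(path: Path) -> None:
        # Use the file name as the object key (an assumption of this sketch).
        client.upload_file(str(path), bucket_name, path.name)

    return upload


def run_uploader(folder, upload, interval_seconds, max_uploads=None, sleep=time.sleep) -> int:
    """Select, upload, wait, repeat. Returns the number of uploads performed.

    max_uploads=None loops until interrupted; a number bounds the loop.
    """
    uploaded = 0
    while max_uploads is None or uploaded < max_uploads:
        path = pick_random_json(folder)
        if path is None:
            break  # nothing to upload; this sketch stops instead of retrying
        upload(path)
        uploaded += 1
        if max_uploads is None or uploaded < max_uploads:
            sleep(interval_seconds)
    return uploaded
```

A call such as run_uploader("data-news-articles", make_s3_uploader("your_bucket_name", "your_aws_region"), interval_seconds=60) would then loop until interrupted; the 60-second interval is only an example value.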
.
├── data-news-articles/ # Folder containing JSON files to upload
├── src/
│ └── s3_uploader.py # Main script
├── .env # AWS credentials (not in version control)
├── .env.example # Template for .env file
├── requirements.txt # Python dependencies
└── README.md # This file
- Never commit your .env file to version control
- Keep your AWS credentials secure
- Grant only the IAM permissions the uploader needs for S3 access (the policy above is scoped to a single bucket)