Redshift Pipelines [draft]
Reference Mark Smallcombe: 15 Examples of Data Pipelines Built with Amazon Redshift
Reference AWS Big Data Blog: Best practices for resizing and automatic scaling in Amazon EMR
S3 -> COPY -> Redshift Staging Database -> Redshift Database

Reference Data Engineering in S3 and Redshift with Python
Reference Amazon Redshift: bulk insert vs COPYing from S3
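The staging pattern above can be driven from Python. Below is a minimal sketch, assuming a psycopg2 connection to the cluster; the table names, bucket, and IAM role are hypothetical placeholders, not taken from the referenced posts:

```python
import psycopg2

# Hypothetical connection details and object names -- replace with your own.
conn = psycopg2.connect(
    host="<YOUR_CLUSTER_ENDPOINT>",
    port=5439,
    dbname="<YOUR_DATABASE>",
    user="<YOUR_USER>",
    password="<YOUR_PASSWORD>",
)

copy_sql = """
    COPY staging_events
    FROM 's3://<YOUR_BUCKET>/<YOUR_KEY_PREFIX>/'
    IAM_ROLE '<YOUR_REDSHIFT_IAM_ROLE_ARN>'
    FORMAT AS CSV
    IGNOREHEADER 1;
"""

with conn, conn.cursor() as cur:
    # Bulk-load the S3 objects into the staging table via COPY.
    cur.execute(copy_sql)
    # Promote the staged rows into the target table, then clear staging.
    cur.execute("INSERT INTO events SELECT * FROM staging_events;")
    cur.execute("TRUNCATE staging_events;")
conn.close()
```

COPY is the bulk-load path Redshift is optimized for; issuing row-by-row INSERTs from the client is far slower, which is the point of the "bulk insert vs COPYing from S3" reference above.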
Uploading local files to AWS S3 with boto3 is quite straightforward. You can install the AWS Python SDK boto3 via

```
pip install boto3
```

Before any implementation, please make sure you have enough permissions to interact with S3. To upload a file to S3, you can do something like the following:

```python
import boto3

# Create an S3 resource bound to the target region.
s3 = boto3.resource("s3", region_name="us-east-1")
s3.meta.client.upload_file("<LOCAL_FILE_PATH>", "<YOUR_BUCKET>", "<YOUR_KEY>")
```

For example, you can use the above snippet like...
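A minimal usage sketch of the snippet above; the file path, bucket, and key below are hypothetical placeholders chosen for illustration:

```python
import boto3

s3 = boto3.resource("s3", region_name="us-east-1")
s3.meta.client.upload_file(
    "data/events.csv",        # local file to upload (hypothetical)
    "my-data-bucket",         # destination S3 bucket (hypothetical)
    "raw/events/events.csv",  # object key within the bucket (hypothetical)
)
```

The key becomes the object's path inside the bucket, so a prefix like raw/events/ gives the Redshift COPY step above a single S3 prefix to load from.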