Redshift Pipelines  [draft]

Reference: Mark Smallcombe, 15 Examples of Data Pipelines Built with Amazon Redshift

July 25, 2018 · 1 min · 12 words · Eric

Best practices of AWS EMR  [draft]

Reference: AWS Big Data Blog, Best practices for resizing and automatic scaling in Amazon EMR

July 3, 2018 · 1 min · 15 words · Eric

Redshift Data Ingestion from S3  [draft]

S3 -> COPY -> Redshift Staging Database -> Redshift Database

Reference: Data Engineering in S3 and Redshift with Python; Amazon redshift: bulk insert vs COPYing from s3
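The flow above can be sketched as a pair of SQL builders: one for the COPY from S3 into staging, one for moving staged rows into the final table. This is only a rough illustration, not the post's actual implementation. It assumes the staging step is a table, that the files are CSV with a header row, and that a delete-then-insert merge on a single key column is acceptable. The function names, table names, and the IAM role ARN are all hypothetical.

```python
def build_copy_sql(staging_table, s3_uri, iam_role_arn):
    """Build a Redshift COPY statement that loads a CSV file from S3.

    Assumes CSV input with one header row; adjust the format options
    to match the real files.
    """
    return (
        f"COPY {staging_table}\n"
        f"FROM '{s3_uri}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        "FORMAT AS CSV\n"
        "IGNOREHEADER 1;"
    )


def build_merge_sql(staging_table, target_table, key_column):
    """Build the statements that move staged rows into the target table.

    Uses a delete-then-insert merge inside one transaction: rows in the
    target that share a key with a staged row are replaced.
    """
    return [
        "BEGIN;",
        f"DELETE FROM {target_table} USING {staging_table} "
        f"WHERE {target_table}.{key_column} = {staging_table}.{key_column};",
        f"INSERT INTO {target_table} SELECT * FROM {staging_table};",
        "COMMIT;",
    ]


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    print(build_copy_sql(
        "staging_events",
        "s3://<YOUR_BUCKET>/<YOUR_KEY>",
        "<YOUR_IAM_ROLE_ARN>",
    ))
    for statement in build_merge_sql("staging_events", "events", "event_id"):
        print(statement)
```

The generated statements would then be run through whatever Redshift connection the pipeline already uses. Identifiers are interpolated directly into the SQL, so they should come from trusted configuration rather than user input.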

May 25, 2018 · 1 min · 27 words · Eric

Upload Files to S3 Bucket

Uploading local files to AWS S3 with boto3 is quite straightforward. You can install boto3, the AWS SDK for Python, via

    pip install boto3

Before any implementation, please make sure you have sufficient permissions to interact with S3. To upload a file to S3, you can do something like the following

    import boto3

    s3 = boto3.resource("s3", region_name="us-east-1")
    s3.meta.client.upload_file("<LOCAL_FILE_PATH>", "<YOUR_BUCKET>", "<YOUR_KEY>")

For example, you can use the above snippet like...
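One possible way to wrap the upload call in a reusable helper is sketched below. It is a sketch under assumptions rather than the post's own code: the function name `upload_to_s3`, the default of using the local file name as the object key, and the optional `client` parameter are all hypothetical choices made for illustration. Only `boto3.client("s3")` and its `upload_file(Filename, Bucket, Key)` method come from boto3 itself.

```python
import os


def upload_to_s3(local_path, bucket, key=None, client=None):
    """Upload a local file to S3 and return its s3:// URI.

    key:    object key; defaults to the local file name (an assumption
            made for this sketch).
    client: any object with an upload_file(Filename, Bucket, Key)
            method; a boto3 S3 client is created when omitted.
    """
    if key is None:
        key = os.path.basename(local_path)
    if client is None:
        # Imported lazily so the helper can be exercised with a stub
        # client on machines where boto3 is not installed.
        import boto3
        client = boto3.client("s3")
    client.upload_file(local_path, bucket, key)
    return f"s3://{bucket}/{key}"
```

Passing the client in makes the helper easy to test with a stub, and lets callers reuse one client across many uploads.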

May 23, 2018 · 1 min · 191 words · Eric

Understand AWS Redshift Query Execution Plan  [draft]

May 9, 2018 · 0 min · 0 words · Eric