
Daily Sales Report (DSR)

This document is a guide to the DSR process. Follow the steps below to ensure the DSR view is updated correctly.

Points to remember:

  1. Whenever we upload a file to S3, we must create a date partition, preferably the current date, in YYYY-MM-DD format. This is because the EL job picks up the latest partition and loads its data into Redshift.
  2. We have 2 environments - staging and production. Each has its own AWS account, and hence its own S3 bucket. We have to upload the files to both buckets.
    • Staging S3 bucket: s3://gobblecube-stage-cms/
    • Production S3 bucket: s3://gobblecube-prod-cms/
    • Note: If you want to avoid uploading to 2 different buckets, add the path in this. It copies all the files at that path from the production bucket to the staging bucket.
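
The upload convention above can be sketched with the AWS CLI. The bucket names are the ones listed above; the file name and S3 prefix are placeholders, so substitute the real report path. The commands are echoed for review rather than executed; drop the `echo` to actually upload.

```shell
# Sketch: upload a file under a current-date partition to both buckets.
# FILE and PREFIX are placeholders; substitute the real file and S3 path.
FILE=report.xlsx
PREFIX=wholetruth/bronze/example
DT=$(date +%F)   # current date in YYYY-MM-DD
for BUCKET in gobblecube-stage-cms gobblecube-prod-cms; do
  # 'echo' prints the command for review; remove it to actually upload
  echo aws s3 cp "$FILE" "s3://${BUCKET}/${PREFIX}/${DT}/${FILE}"
done
```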

Data Ingestion

There are essentially 3 sources that feed the final DSR view:

  1. Unicommerce Tally GST report data - This is an automated pipeline, which runs every day at 5:30 AM and dumps the data into Redshift.
    • Unicommerce Report Name: Tally GST Report
    • Job Name: el_unicommerce_job

  2. Amazon Order Reports - Currently, this is a manual process. The data is downloaded from Amazon Seller Central and uploaded to S3. After this, we run an EL dagster job to land the data in Redshift.
    • S3 Path: wholetruth/bronze/unicommerce/tallygstreport
    • Job Name: el_cms_internal_amazon_job

  3. Reconciliation data - This is a manual process. Every week, we receive an Excel file in an agreed-upon format from the client. We manually upload it into a new folder in S3 and then run an EL dagster job to land the data in Redshift.
    • S3 Path: wholetruth/bronze/unicommerce/tallygstreport-recon
    • Job Name: el_cms_internal_unicommerce_job

    SELECT *
    FROM bronze.unicommerce_tallygstreport_recon
    LIMIT 10;

    If you want to validate the file received, have a look at these dbt tests.
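
For reference, the two manual EL jobs above can also be launched from the dagster CLI instead of the UI. This is a sketch: it assumes a locally configured dagster workspace, and the exact CLI flags may vary by dagster version. The commands are echoed for review; drop the `echo` to actually launch the runs.

```shell
# Print the launch commands for the two manual EL jobs; drop 'echo' to run them.
# Assumes a configured dagster workspace; CLI flags may vary by dagster version.
for JOB in el_cms_internal_amazon_job el_cms_internal_unicommerce_job; do
  echo dagster job launch -j "$JOB"
done
```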


Data Modelling

We use dbt to model the data. The [dbt model](https://github.com/gobblecube/dbt-models/blob/master/wholetruth/models/aggregate/daily_sales_report.sql) runs every day at 8:00 AM IST from dagster.

Steps to run the dbt model:

  1. Go to the dagster job daily_sales_report_job.
  2. Click the "Materialize all" button in the top right corner of the page.
  3. Select "latest" in the pop-up window, then click the "Launch 1 run" button.
  4. A blue pop-up will appear in the middle of the page; click the View button.
  5. You will be redirected to the dagster run page, where you can track the progress of the dbt model run.
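
The UI steps above materialize the model from dagster. For local debugging, the same model can typically be run with the dbt CLI, using dbt's node selection to run only this model (a configured dbt profile for the warehouse is assumed; the model name comes from the repo path above). The command is echoed here; drop the `echo` to execute it.

```shell
# Print the dbt command that runs only the daily_sales_report model;
# drop 'echo' to execute (requires a configured dbt profile).
echo dbt run --select daily_sales_report
```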

Gobblecube UI

Due to performance issues with Redshift, we decided to move the aggregate views to Postgres. Hence, we run this job to copy the tables from Redshift to Postgres.