A sales organization wants to give incentive to their sales persons according their sold. They give incentive those sales persons who have sold maximum on every month in a year.
Data Engineering
PyCharm IDE, Python3.10, PySpark, AWS S3, PostgreSQL.
Step-1: You should have any IDE and python3.10+.
Step-2: Apache spark should also be installed.
Step-3: Create a virtual environment.
Step-4: Go to dependencies ---> dev directory and install all required libraries by running below command:
pip install -r dev-dependencies.txt
Step-5: Then set AWS credentials (access key and secret key) in config file.
Step-6: When all gets installed run main.py file inside src/main/transformations/jobs/ directory.
de-project
├── dependencies
│ ├── dev
│ │ ├── config.py
│ │ ├── dev-dependencies.txt
│ └── prod
│ ├── config.py
│ └── prod-dependencies.txt
├── src
│ ├── __init__.py
│ ├── main
│ │ ├── copy
│ │ │ ├── copy_invalid_files.py
│ │ ├── delete
│ │ │ └── delete_local_files.py
│ │ ├── download
│ │ │ └── s3_files_download.py
│ │ ├── __init__.py
│ │ ├── move
│ │ │ └── move_files_s3_to_s3.py
│ │ ├── read
│ │ │ ├── database_reader.py
│ │ │ └── s3_read.py
│ │ ├── transformations
│ │ │ ├── __init__.py
│ │ │ └── jobs
│ │ │ ├── customer_total_cost_calculation.py
│ │ │ ├── dimension_tables_join.py
│ │ │ ├── __init__.py
│ │ │ ├── main.py
│ │ │ └── sales_team_total_sales_calculation.py
│ │ ├── upload
│ │ │ └── s3_upload.py
│ │ └── write
│ │ ├── database_writer.py
│ │ ├── file_writer.py
│ ├── sql_scripts
│ │ └── scripts.sql
│ └── utility
│ ├── db_session.py
│ ├── encrypt_decrypt.py
│ ├── __init__.py
│ ├── s3_client_session.py
│ └── spark_session.py
└── test
├── dev_tests
│ ├── test.py
└── prod_tests
└── test.py