Robotic Palletizing Problem using RL

📝 Table of Contents

About
Considerations
Theory
Getting Started
Usage
Future Steps
Authors

🧐 About

This project models the robotic palletizing problem as a partially observable Markov decision process (POMDP) and solves it with the Soft Actor-Critic (SAC) method.

Considerations

Boxes arrive online, one at a time, in a particular sequence
Boxes can be rotated before placement
Boxes are placed top-down
A box that cannot be placed successfully is discarded
All 6 orientations of a box are considered stable
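
The top-down placement and discard rules above can be sketched on a grid of cell heights. This is a minimal illustration, not the repository's `env.py`: the function name `place_box`, the `max_height` limit, and the rule that a box rests on the tallest cell under its footprint are all assumptions.

```python
def place_box(height_map, dims, x, y, max_height):
    """Try to drop a box of dims (l, w, h) with its corner at cell (x, y).

    Returns True and updates height_map in place on success.
    Returns False (box discarded, map untouched) otherwise.
    All names here are hypothetical, for illustration only.
    """
    l, w, h = dims
    rows, cols = len(height_map), len(height_map[0])
    # Discard if the footprint would overhang the pallet.
    if x < 0 or y < 0 or x + l > rows or y + w > cols:
        return False
    # Top-down drop (assumption): the box rests on the tallest cell beneath it.
    base = max(height_map[i][j]
               for i in range(x, x + l)
               for j in range(y, y + w))
    top = base + h
    # Discard if the stack would exceed the assumed height limit.
    if top > max_height:
        return False
    for i in range(x, x + l):
        for j in range(y, y + w):
            height_map[i][j] = top
    return True
```

A failed call leaves the height map unchanged, which matches the idea that a discarded box has no effect on the pallet.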

Theory

State

The state has two components, $s_p$ and $s_b$.

The pallet state $s_p$ is the height state (height map) of the pallet. It is of size $L \times W$.
The box state $s_b$ captures the state of the boxes: $s_b = \{b_i\}$, $i \in \{1, \dots, n\}$, where $n$ is the number of boxes in the sequence. Each $b_i = (l_i, w_i, h_i, P_i)$, where $P_i = 0$ if the box is yet to be loaded, $1$ if it was loaded successfully, and $-1$ if it was discarded.
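
To make the two state components concrete, here is a small sketch of one possible in-memory layout. The names `Box` and `initial_state` and the use of plain lists are assumptions for illustration; the repository may store the state differently (for example as arrays).

```python
from dataclasses import dataclass

# Status codes P_i as defined in the text.
YET_TO_LOAD, LOADED, DISCARDED = 0, 1, -1


@dataclass
class Box:
    """One entry b_i = (l_i, w_i, h_i, P_i) of the box state s_b."""
    l: int
    w: int
    h: int
    status: int = YET_TO_LOAD


def initial_state(pallet_l, pallet_w, box_dims):
    """Build (s_p, s_b) for an empty pallet and an unloaded box sequence."""
    # s_p: an L x W grid of heights, all zero for an empty pallet.
    s_p = [[0] * pallet_w for _ in range(pallet_l)]
    # s_b: one Box per item in the sequence, all yet to be loaded.
    s_b = [Box(*dims) for dims in box_dims]
    return s_p, s_b
```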

Action

Since the boxes are placed in sequence, an action only needs to specify the orientation of the current box and the location where it is placed. Hence $a = (a_o, a_l)$, where for box $i$ the orientation is $a_o \in \{(l_i, w_i, h_i), (l_i, h_i, w_i), (w_i, l_i, h_i), (w_i, h_i, l_i), (h_i, l_i, w_i), (h_i, w_i, l_i)\}$ and $a_l$ is the coordinate at which the front, lower, left corner of the box is placed on the pallet.
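
The six orientations are simply the permutations of a box's dimensions, so an orientation index can be decoded as follows. This is a sketch: `orientations` and `decode_action` are hypothetical helper names, and treating $a_o$ as an integer index in 0..5 is an assumption.

```python
from itertools import permutations


def orientations(l, w, h):
    """Return all 6 orderings of (l, w, h), in the order listed in the text.

    Duplicates are kept on purpose (e.g. for a cube) so that the
    orientation index always ranges over 0..5.
    """
    return list(permutations((l, w, h)))


def decode_action(box_dims, a_o, a_l):
    """Map an action (a_o, a_l) to the oriented dimensions and the corner cell."""
    oriented = orientations(*box_dims)[a_o]
    x, y = a_l
    return oriented, (x, y)
```

For example, `decode_action((1, 2, 3), 3, (0, 4))` selects the orientation `(2, 3, 1)` and the corner cell `(0, 4)`.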

Observation

At each timestep, only a certain segment of the box sequence is visible to the agent, which is what makes the problem partially observable.
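
One way to picture the partial observation is a fixed-length window over the box sequence. The sketch below assumes the visible segment is the current box plus the next few boxes, padded with zero-sized boxes near the end of the sequence. The function name `observe` and the `lookahead` parameter are hypothetical, and the repository may choose the visible segment differently.

```python
def observe(height_map, boxes, t, lookahead):
    """Return the pallet height map plus the `lookahead` boxes visible at step t.

    `boxes` is a list of (l, w, h) tuples. Assumption: the window starts at
    the current box and is padded with (0, 0, 0) so its length is constant.
    """
    visible = list(boxes[t:t + lookahead])
    padding = [(0, 0, 0)] * (lookahead - len(visible))
    return height_map, visible + padding
```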

Reward

The goal is to maximise the volume packed onto the pallet. Hence the reward for a placed box is its volume as a fraction of the total pallet volume. The reward is 0 if the box is discarded. Additionally, there is an instability penalty, measured with a local differencing method: take the RMSE of the convolution with the filter [[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]], normalized by $L \times W$.
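
The reward described above can be sketched in plain Python. Several details are assumptions rather than facts from the repository: that the filter is applied to the pallet height map, that edge cells are replicated at the border, that "RMSE normalized by $L \times W$" means the root of the mean squared filter response over all cells, that the pallet volume is $L \times W$ times a height limit `max_height`, and that the penalty enters with a weight `penalty_weight`.

```python
import math

# Local differencing filter from the text (symmetric, sums to zero).
KERNEL = [[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]]


def instability_penalty(height_map):
    """Root mean square of the filter response over the L x W height map."""
    rows, cols = len(height_map), len(height_map[0])
    total = 0.0
    for i in range(rows):
        for j in range(cols):
            acc = 0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    # Assumption: replicate edge cells instead of zero-padding,
                    # so a perfectly flat pallet has zero penalty.
                    ii = min(max(i + di, 0), rows - 1)
                    jj = min(max(j + dj, 0), cols - 1)
                    acc += KERNEL[di + 1][dj + 1] * height_map[ii][jj]
            total += acc * acc
    return math.sqrt(total / (rows * cols))


def reward(box_dims, placed, height_map, max_height, penalty_weight=1.0):
    """Volume fraction of the placed box minus a weighted instability penalty."""
    if not placed:
        # Discarded boxes earn zero reward (assumed to include no penalty).
        return 0.0
    l, w, h = box_dims
    rows, cols = len(height_map), len(height_map[0])
    volume_fraction = (l * w * h) / (rows * cols * max_height)
    return volume_fraction - penalty_weight * instability_penalty(height_map)
```

Because the filter sums to zero, flat regions contribute nothing and only height discontinuities are penalized.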

Agent

I used a Soft Actor-Critic (SAC) network with a double Q-network for the critic.
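
For readers unfamiliar with the double Q-network trick, the standard SAC critic target takes the minimum of the two Q estimates and adds an entropy bonus. The scalar sketch below shows that general formula only; it is not taken from `sac.py`, and the default values of `gamma` and `alpha` are illustrative assumptions.

```python
def soft_q_target(reward, done, q1_next, q2_next, log_prob_next,
                  gamma=0.99, alpha=0.2):
    """Clipped double-Q soft Bellman target used to train both critics.

    q1_next, q2_next: the two target critics' estimates for (s', a').
    log_prob_next:    log pi(a' | s') for the action sampled at s'.
    """
    # Taking the minimum of the two critics curbs Q-value overestimation.
    min_q = min(q1_next, q2_next)
    # The entropy term -alpha * log pi(a'|s') rewards stochastic policies.
    soft_value = min_q - alpha * log_prob_next
    # No bootstrapping past the end of an episode.
    return reward + gamma * (1.0 - float(done)) * soft_value
```

In a real implementation the same expression is evaluated on batched tensors rather than on scalars.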

🏁 Getting Started

Installing

Create the environment

conda env create -f environment.yml

🎈 Usage

Activate the environment

conda activate PalletizerEnv

Run the main script

python main.py

Future Steps

Consider using a PCT for the POMDP structure
Incorporate a transformer-based SAC network

✍️ Authors

@saksham36

saksham36/RL_Robot_3D_Box_Palletizing

Folders and files

Latest commit

History

Repository files navigation

Robotic Palletizing Problem using RL

📝 Table of Contents

🧐 About

Considerations

Theory

State

Action

Observation

Reward

Agent

🏁 Getting Started

Installing

🎈 Usage

Future Steps

✍️ Authors

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages