Commit

Merge pull request #6284 from osamahammad21/drt-distributed-cleanup
DRT: fix distributed routing and cleanup code
maliberty authored Jan 4, 2025
2 parents 2d06ae7 + d060c10 commit 0ab6013
Showing 6 changed files with 44 additions and 79 deletions.
13 changes: 11 additions & 2 deletions src/drt/doc/Distributed.md
@@ -40,7 +40,7 @@ The Diagram above shows four main components to run distributed routing:

  * Under the “default-pool” section from the leftside menu select “Nodes”, then:
  * You can configure the type of the machine that your Kuberenets pods would be running on. It is worth noting that a pod (worker/balancer) can acquire a number of CPUs <= the number of cpus available in the machine selected at this stage. This also applies for the memory.
- * This is all what we need, now click “Create” at the bottom of the page.
+ * This is all we need, now click “Create” at the bottom of the page.
  * As the cluster is being created, we move on to setting up a shared folder on a NFS.


@@ -56,6 +56,15 @@ A shared folder on a NFS is used to share routing updates between the leader and

  [https://www.digitalocean.com/community/tutorials/how-to-set-up-an-nfs-mount-on-ubuntu-20-04](https://www.digitalocean.com/community/tutorials/how-to-set-up-an-nfs-mount-on-ubuntu-20-04)

+ The main steps on the NFS server machine are:
+
+ 1. run ` sudo apt install nfs-kernel-server ` and ` sudo apt install nfs-common `
+ 1. In the etc/exports file, add the following line replacing \${PATH} with the actual shared directory path you choose and the 10.128 portion with the actual subnet portion of the network you set up: ` ${PATH} 10.128.0.0/255.255.0.0(rw,no_subtree_check,no_root_squash) `
+ 2. Run ` sudo systemctl restart nfs-kernel-server ` on the server macine.
+
+ Now your NFS server machine is ready and the shared directory can be accessed.
+
+
  **N.B:** Since we are using google cloud, we set up our shared folder on a VM instance and record its internal IP to be used in later steps.


@@ -88,7 +97,7 @@ In the first step, we created a cluster on Google cloud. In this step we connect
  1. The value of “replicas” under “spec” represents the number of workers that will be created in the cluster.
  2. The value of “serviceName” under “spec” must match the value of the service name in the first section.
  3. Under “spec” / “template” / “spec” / “containers”:
- 1. “image” must have the value of openroad docker image directory on docker hub.
+ 1. “image” must have the value of openroad docker image on docker hub.
  2. Under “command”, you can find two commands, the first runs openroad. The second runs the tcl file in the shared directory. It’s necessary to change the directory to match your shared folder directory.
  3. Under “volumeMounts”, the value of “mountPath” must match the path of the shared directory.
  4. Under “env” for the “value” under the name: “MY_POD_CPU” determines the thread count that openroad will be using.
18 changes: 9 additions & 9 deletions src/drt/src/TritonRoute.cpp
@@ -217,9 +217,9 @@ std::string TritonRoute::runDRWorker(const std::string& workerStr,
  = on && FlexDRGraphics::guiActive() ? std::make_unique<FlexDRGraphics>(
  debug_.get(), design_.get(), db_, logger_)
  : nullptr;
- auto worker
- = FlexDRWorker::load(workerStr, logger_, design_.get(), graphics_.get());
- worker->setViaData(viaData);
+ auto worker = FlexDRWorker::load(
+ workerStr, viaData, design_.get(), logger_, router_cfg_.get());
+ worker->setGraphics(graphics_.get());
  worker->setSharedVolume(shared_volume_);
  worker->setDebugSettings(debug_.get());
  if (graphics_) {
@@ -243,7 +243,7 @@ void TritonRoute::debugSingleWorker(const std::string& dumpDir,
  frIArchive ar(viaDataFile);
  ar >> viaData;

- std::unique_ptr<FlexDRGraphics> graphics_
+ std::unique_ptr<FlexDRGraphics> graphics
  = on && FlexDRGraphics::guiActive() ? std::make_unique<FlexDRGraphics>(
  debug_.get(), design_.get(), db_, logger_)
  : nullptr;
@@ -252,8 +252,9 @@ void TritonRoute::debugSingleWorker(const std::string& dumpDir,
  std::string workerStr((std::istreambuf_iterator<char>(workerFile)),
  std::istreambuf_iterator<char>());
  workerFile.close();
- auto worker
- = FlexDRWorker::load(workerStr, logger_, design_.get(), graphics_.get());
+ auto worker = FlexDRWorker::load(
+ workerStr, &viaData, design_.get(), logger_, router_cfg_.get());
+ worker->setGraphics(graphics.get());
  if (debug_->mazeEndIter != -1) {
  worker->setMazeEndIter(debug_->mazeEndIter);
  }
@@ -277,9 +278,8 @@ void TritonRoute::debugSingleWorker(const std::string& dumpDir,
  }
  worker->setSharedVolume(shared_volume_);
  worker->setDebugSettings(debug_.get());
- worker->setViaData(&viaData);
- if (graphics_) {
- graphics_->startIter(worker->getDRIter(), router_cfg_.get());
+ if (graphics) {
+ graphics->startIter(worker->getDRIter(), router_cfg_.get());
  }
  std::string result = worker->reloadedMain();
  bool updated = worker->end(design_.get());
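
Review note on the `graphics_` to `graphics` rename in `debugSingleWorker`: this code uses a trailing underscore for members (`debug_`, `design_`, `logger_`), so a local named `graphics_` reads like a member and, if the class also has a member of that name, hides it inside the function. The sketch below illustrates that hazard with hypothetical stand-in types (`Graphics`, `Router`); it is not the TritonRoute code, and it assumes a class member named `graphics_` exists.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for a graphics helper; not FlexDRGraphics.
struct Graphics
{
  int iter = -1;
};

class Router
{
 public:
  void debugShadowed()
  {
    // The local hides the member of the same name for the rest of this
    // scope, so the member is never touched.
    std::unique_ptr<Graphics> graphics_ = std::make_unique<Graphics>();
    graphics_->iter = 1;
  }

  void debugLocal()
  {
    // Same behavior, but the name makes it obvious this is a local.
    std::unique_ptr<Graphics> graphics = std::make_unique<Graphics>();
    graphics->iter = 1;
  }

  bool hasMemberGraphics() const { return graphics_ != nullptr; }

 private:
  std::unique_ptr<Graphics> graphics_;  // member, left null in both paths
};
```

Both functions behave identically; only the second makes the ownership obvious to a reader, which is presumably the point of the rename.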
21 changes: 8 additions & 13 deletions src/drt/src/dr/FlexDR.cpp
@@ -2002,8 +2002,6 @@ void FlexDRWorker::serialize(Archive& ar, const unsigned int version)
  (ar) & bestMarkers_;
  (ar) & isCongested_;
  if (is_loading(ar)) {
- gridGraph_.setTech(design_->getTech());
- gridGraph_.setWorker(this);
  // boundaryPin_
  int sz = 0;
  (ar) & sz;
@@ -2031,19 +2029,16 @@ void FlexDRWorker::serialize(Archive& ar, const unsigned int version)
  }
  }

- std::unique_ptr<FlexDRWorker> FlexDRWorker::load(const std::string& workerStr,
- utl::Logger* logger,
- frDesign* design,
- FlexDRGraphics* graphics)
+ std::unique_ptr<FlexDRWorker> FlexDRWorker::load(
+ const std::string& workerStr,
+ FlexDRViaData* via_data,
+ frDesign* design,
+ utl::Logger* logger,
+ RouterConfiguration* router_cfg)
  {
- auto worker = std::make_unique<FlexDRWorker>();
+ auto worker
+ = std::make_unique<FlexDRWorker>(via_data, design, logger, router_cfg);
  deserializeWorker(worker.get(), design, workerStr);
-
- // We need to fix up the fields we want from the current run rather
- // than the stored ones.
- worker->setLogger(logger);
- worker->setGraphics(graphics);
-
  return worker;
  }
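
Review note on the new `load` signature: the change moves from "default-construct, deserialize, then patch run-time pointers with setters" to "construct with the run-time dependencies, then deserialize only the persisted state". A minimal sketch of that pattern follows. `Logger`, `Config`, and `Worker` are hypothetical stand-ins and the string payload is a toy format, not the project's archive format.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-ins for per-run dependencies; not the OpenROAD types.
struct Logger
{
};
struct Config
{
  int threads = 1;
};

class Worker
{
 public:
  // Run-time dependencies are fixed at construction, so a loaded worker
  // cannot exist without them and no default constructor is needed.
  Worker(Logger* logger, Config* cfg) : logger_(logger), cfg_(cfg) {}

  void setIter(int iter) { iter_ = iter; }
  int iter() const { return iter_; }
  Logger* logger() const { return logger_; }
  Config* config() const { return cfg_; }

  // Only persisted state goes into the payload (toy format: one integer).
  std::string save() const { return std::to_string(iter_); }

  // Factory: build with the current run's dependencies, then restore the
  // stored state on top. No post-load fix-up of pointers is required.
  static std::unique_ptr<Worker> load(const std::string& data,
                                      Logger* logger,
                                      Config* cfg)
  {
    auto worker = std::make_unique<Worker>(logger, cfg);
    worker->iter_ = std::stoi(data);
    return worker;
  }

 private:
  Logger* logger_;
  Config* cfg_;
  int iter_ = 0;
};
```

With this shape, pointers that belong to the current process never pass through the serialized form, which is consistent with the removal of the `setLogger` fix-up after `deserializeWorker` in this hunk.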

11 changes: 4 additions & 7 deletions src/drt/src/dr/FlexDR.h
@@ -295,11 +295,6 @@ class FlexDRWorker
  rq_(this)
  {
  }
- FlexDRWorker()
- : // for serialization
- rq_(this)
- {
- }
  // setters
  void setDebugSettings(frDebugSettings* settings)
  {
@@ -449,11 +444,13 @@ class FlexDRWorker
  logger_ = logger;
  gridGraph_.setLogger(logger);
  }
+ void setRouterCfg(RouterConfiguration* in) { router_cfg_ = in; }

  static std::unique_ptr<FlexDRWorker> load(const std::string& workerStr,
- utl::Logger* logger,
+ FlexDRViaData* via_data,
  frDesign* design,
- FlexDRGraphics* graphics);
+ utl::Logger* logger,
+ RouterConfiguration* router_cfg);

  // distributed
  void setDistributed(dst::Distributed* dist,
42 changes: 2 additions & 40 deletions src/drt/src/dr/FlexGridGraph.h
@@ -63,9 +63,9 @@ class FlexGridGraph
  : tech_(techIn),
  logger_(loggerIn),
  drWorker_(workerIn),
- router_cfg_(router_cfg)
+ router_cfg_(router_cfg),
+ ap_locs_(techIn->getTopLayerNum() + 1)
  {
- ap_locs_.resize(tech_->getTopLayerNum() + 1);
  }
  // getters
  frTechObject* getTech() const { return tech_; }
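
Review note on the constructor change: sizing `ap_locs_` in the member initializer list constructs the vector once at its final size instead of default-constructing it and resizing in the body. Reading the size from the parameter (`techIn`) rather than the member (`tech_`) also keeps the initializer independent of member declaration order, since members are initialized in declaration order regardless of how the list is written. A minimal sketch with hypothetical stand-in types (`Tech`, `Grid`), not the FlexGridGraph class:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the technology object.
struct Tech
{
  int top_layer = 0;
  int getTopLayerNum() const { return top_layer; }
};

class Grid
{
 public:
  explicit Grid(const Tech* tech)
      : tech_(tech),
        // Uses the constructor parameter, not tech_, so this stays correct
        // even if the member declarations below are later reordered.
        ap_locs_(tech->getTopLayerNum() + 1)
  {
  }

  std::size_t layerSlots() const { return ap_locs_.size(); }

 private:
  const Tech* tech_;
  std::vector<std::vector<int>> ap_locs_;  // one slot per layer index
};
```

For a `Tech` whose top layer number is 4, `Grid` holds 5 slots, matching the `getTopLayerNum() + 1` sizing in the hunk.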
@@ -425,14 +425,7 @@ class FlexGridGraph
  return sol;
  }
  // setters
- void setTech(frTechObject* techIn)
- {
- tech_ = techIn;
- ap_locs_.clear();
- ap_locs_.resize(tech_->getTopLayerNum() + 1);
- }
  void setLogger(Logger* loggerIn) { logger_ = loggerIn; }
- void setWorker(FlexDRWorker* workerIn) { drWorker_ = workerIn; }
  bool addEdge(frMIdx x,
  frMIdx y,
  frMIdx z,
@@ -1126,8 +1119,6 @@ class FlexGridGraph
  bool debug_{false};
  frUInt4 curr_id_{1};

- FlexGridGraph() = default;
-
  void printExpansion(const FlexWavefrontGrid& currGrid,
  const std::string& keyword);
  // unsafe access, no idx check
@@ -1341,35 +1332,6 @@ class FlexGridGraph
  bool hasOutOfDieViol(frMIdx x, frMIdx y, frMIdx z);
  bool isWorkerBorder(frMIdx v, bool isVert);

- template <class Archive>
- void serialize(Archive& ar, const unsigned int version)
- {
- // The wavefront should always be empty here so we don't need to
- // serialize it.
- if (!wavefront_.empty()) {
- throw std::logic_error("don't serialize non-empty wavefront");
- }
- if (is_loading(ar)) {
- tech_ = ar.getDesign()->getTech();
- }
- (ar) & drWorker_;
- (ar) & nodes_;
- (ar) & prevDirs_;
- (ar) & srcs_;
- (ar) & dsts_;
- (ar) & guides_;
- (ar) & xCoords_;
- (ar) & yCoords_;
- (ar) & zCoords_;
- (ar) & zHeights_;
- (ar) & layerRouteDirections_;
- (ar) & dieBox_;
- (ar) & ggDRCCost_;
- (ar) & ggMarkerCost_;
- (ar) & halfViaEncArea_;
- (ar) & ap_locs_;
- }
- friend class boost::serialization::access;
  friend class FlexDRWorker;
  };

18 changes: 10 additions & 8 deletions src/drt/test/Distributed/k8s-drt.yaml
@@ -35,12 +35,13 @@ spec:
  hostNetwork: true
  containers:
  - name: openroad
- image: openroad/centos-binary:latest
+ image: openroad/orfs:v3.0-1871-g4e29c449
  imagePullPolicy: Always
- command: ["/bin/sh", "-c"]
+ command: ["/bin/bash", "-c"]
  args:
- - echo -e 'set_thread_count $::env(MY_POD_CPU)\nrun_worker -host $::env(MY_POD_IP) -port 1111' > /home/worker.tcl ;
- /usr/bin/openroad /home/worker.tcl ;
+ - source env.sh;
+ printf 'set_thread_count %s\nrun_worker -host %s -port 1111\n' "$MY_POD_CPU" "$MY_POD_IP" > /home/worker.tcl ;
+ openroad /home/worker.tcl ;
  ports:
  - containerPort: 1111
  volumeMounts:
@@ -81,12 +82,13 @@ spec:
  spec:
  containers:
  - name: openroad
- image: openroad/centos-binary:latest
+ image: openroad/orfs:v3.0-1871-g4e29c449
  imagePullPolicy: Always
- command: ["/bin/sh", "-c"]
+ command: ["/bin/bash", "-c"]
  args:
- - echo 'run_load_balancer -host $::env(MY_POD_IP) -port 1111 -workers_domain headless-udp' > /home/balancer.tcl ;
- /usr/bin/openroad /home/balancer.tcl ;
+ - source env.sh;
+ printf 'run_load_balancer -host %s -port 1111 -workers_domain workers' "$MY_POD_IP" > /home/balancer.tcl ;
+ openroad /home/balancer.tcl ;
  ports:
  - containerPort: 1111
  env:
