[Autotuner] - Tunable variables check v2 #2640

Draft
wants to merge 3 commits into
base: master
10 changes: 6 additions & 4 deletions docs/user/FlowVariables.md
@@ -94,6 +94,7 @@ configuration file.
| <a name="FILL_CELLS"></a>FILL_CELLS| Fill cells are used to fill empty sites. If not set or empty, fill cell insertion is skipped.| | |
| <a name="FILL_CONFIG"></a>FILL_CONFIG| JSON rule file for metal fill during chip finishing.| | |
| <a name="FLOORPLAN_DEF"></a>FLOORPLAN_DEF| Use the DEF file to initialize floorplan.| | |
| <a name="GDS_ALLOW_EMPTY"></a>GDS_ALLOW_EMPTY| Regular expression of module names of macros that have no .gds file| | |
| <a name="GDS_FILES"></a>GDS_FILES| Path to platform GDS files.| | |
| <a name="GENERATE_ARTIFACTS_ON_FAILURE"></a>GENERATE_ARTIFACTS_ON_FAILURE| For instance Bazel needs artifacts (.odb and .rpt files) on a failure to allow the user to save hours on re-running the failed step locally, but when working with a Makefile flow, it is more natural to fail the step and leave the user to manually inspect the logs and artifacts directly via the file system. Set to 1 to change the behavior to generate artifacts upon failure to e.g. do a global route. The exit code will still be non-zero on all other failures that aren't covered by the "useful to inspect the artifacts on failure" use-case. Example: just like detailed routing, a global route that fails with congestion is not a build failure (as in non-zero exit code); it is a successful (as in zero exit code) global route that produces reports detailing the problem. Detailed route will not proceed if there is global routing congestion. This allows build systems, such as Bazel, to create artifacts for global and detailed route, even if the operation had problems, without having to know about the semantics between global and detailed route. Considering that global and detailed route can run for a long time and use a lot of memory, this allows inspecting results on a laptop for a build that ran on a server.| 0| |
| <a name="GLOBAL_PLACEMENT_ARGS"></a>GLOBAL_PLACEMENT_ARGS| Use additional tuning parameters during global placement other than default args defined in global_place.tcl.| | |
@@ -102,7 +103,7 @@ configuration file.
| <a name="GPL_ROUTABILITY_DRIVEN"></a>GPL_ROUTABILITY_DRIVEN| Specifies whether the placer should use routability driven placement.| 1| |
| <a name="GPL_TIMING_DRIVEN"></a>GPL_TIMING_DRIVEN| Specifies whether the placer should use timing driven placement.| 1| |
| <a name="GUI_TIMING"></a>GUI_TIMING| Load timing information when opening GUI. For large designs, this can be quite time consuming. Useful to disable when investigating non-timing aspects like floorplan, placement, routing, etc.| 1| |
| <a name="HOLD_SLACK_MARGIN"></a>HOLD_SLACK_MARGIN| Specifies a time margin for the slack when fixing hold violations. This option allows you to overfix or underfix(negative value, terminate retiming before 0 or positive slack). Use min of HOLD_SLACK_MARGIN and 0(default hold slack margin) in floorplan. This avoids overrepair in floorplan for hold by default, but allows skipping hold repair using a negative HOLD_SLACK_MARGIN. Exiting timing repair early is useful in exploration where the .sdc has a fixed clock period at designs target clock period and where HOLD/SETUP_SLACK_MARGIN is used to avoid overrepair(extremely long running times) when exploring different parameter settings.| 0| |
| <a name="HOLD_SLACK_MARGIN"></a>HOLD_SLACK_MARGIN| Specifies a time margin for the slack when fixing hold violations. This option allows you to overfix or underfix (negative value, terminate retiming before 0 or positive slack). floorplan.tcl uses the min of HOLD_SLACK_MARGIN and 0 (default hold slack margin). This avoids overrepair in floorplan for hold by default, but allows skipping hold repair using a negative HOLD_SLACK_MARGIN. Exiting timing repair early is useful in exploration where the .sdc has a fixed clock period at the design's target clock period and where HOLD/SETUP_SLACK_MARGIN is used to avoid overrepair (extremely long running times) when exploring different parameter settings. When an ideal clock is used, that is before CTS, a clock insertion delay of 0 is used in timing paths. This creates a mismatch between macros that have a .lib file from after CTS, when the clock is propagated. To mitigate this, OpenSTA will subtract the clock insertion delay of macros when calculating timing with an ideal clock, provided that min_clock_tree_path and max_clock_tree_path are in the .lib file, which is the case for macros built with OpenROAD. This is less accurate than if OpenROAD had created a placeholder clock tree for timing estimation purposes prior to CTS. There will inevitably be inaccuracies in the timing calculation prior to CTS. Use a slack margin that is low enough, even negative, to avoid overrepair. Inaccuracies in the timing prior to CTS can also lead to underrepair, but there is no obvious and simple way to avoid underrepair in these cases. Overrepair can lead to excessive runtimes in repair or too much buffering being added, which can present itself as congestion of hold cells or buffer cells. Another use of SETUP/HOLD_SLACK_MARGIN is design parameter exploration when trying to find the minimum clock period for a design. The SDC_FILE for a design can be quite complicated and instead of modifying the clock period in the SDC_FILE, which can be non-trivial, the clock period can be fixed at the target frequency and the SETUP/HOLD_SLACK_MARGIN can be swept to find a plausible current minimum clock period.| 0| |
| <a name="IO_CONSTRAINTS"></a>IO_CONSTRAINTS| File path to the IO constraints .tcl file.| | |
| <a name="IO_PLACER_H"></a>IO_PLACER_H| The metal layer on which to place the I/O pins horizontally (top and bottom of the die).| | |
| <a name="IO_PLACER_V"></a>IO_PLACER_V| The metal layer on which to place the I/O pins vertically (sides of the die).| | |
@@ -137,13 +138,13 @@ configuration file.
| <a name="PWR_NETS_VOLTAGES"></a>PWR_NETS_VOLTAGES| Used for IR Drop calculation.| | |
| <a name="RCX_RULES"></a>RCX_RULES| RC Extraction rules file path.| | |
| <a name="RECOVER_POWER"></a>RECOVER_POWER| Specifies how many percent of paths with positive slacks can be slowed for power savings [0-100].| 0| |
| <a name="REMOVE_ABC_BUFFERS"></a>REMOVE_ABC_BUFFERS| Remove abc buffers from the netlist. If timing repair in floorplanning is taking too long, use a SETUP_HOLD_MARGIN to terminate timing repair early instead of using REMOVE_ABC_BUFFERS or set SKIP_LAST_GAST=1.| | yes|
| <a name="REMOVE_ABC_BUFFERS"></a>REMOVE_ABC_BUFFERS| Remove abc buffers from the netlist. If timing repair in floorplanning is taking too long, use a SETUP/HOLD_SLACK_MARGIN to terminate timing repair early instead of using REMOVE_ABC_BUFFERS or set SKIP_LAST_GASP=1.| | yes|
| <a name="REMOVE_CELLS_FOR_EQY"></a>REMOVE_CELLS_FOR_EQY| String patterns directly passed to write_verilog -remove_cells <> for equivalence checks.| | |
| <a name="REPAIR_PDN_VIA_LAYER"></a>REPAIR_PDN_VIA_LAYER| Remove power grid vias which generate DRC violations after detailed routing.| | |
| <a name="REPORT_CLOCK_SKEW"></a>REPORT_CLOCK_SKEW| Report clock skew as part of reporting metrics, starting at CTS, before which there is no clock skew. This metric can be quite time-consuming, so it can be useful to disable.| 1| |
| <a name="RESYNTH_AREA_RECOVER"></a>RESYNTH_AREA_RECOVER| Enable re-synthesis for area reclaim.| 0| |
| <a name="RESYNTH_TIMING_RECOVER"></a>RESYNTH_TIMING_RECOVER| Enable re-synthesis for timing optimization.| 0| |
| <a name="ROUTING_LAYER_ADJUSTMENT"></a>ROUTING_LAYER_ADJUSTMENT| Default routing layer adjustment| 0.5| |
| <a name="ROUTING_LAYER_ADJUSTMENT"></a>ROUTING_LAYER_ADJUSTMENT| Adjusts routing layer capacities to manage congestion and improve detailed routing. High values ease detailed routing but risk excessive detours and long global routing times, while low values reduce global routing failure but can complicate detailed routing. The global routing running time normally reduces dramatically (entirely design specific, but going from hours to minutes has been observed) when the value is low (such as 0.10). Sometimes, global routing will succeed with lower values and fail with higher values. Exploring results with different values can help shed light on the problem. Start with a too low value, such as 0.10, and bisect to a value that works by doing multiple global routing runs. As a last resort, `make global_route_issue` and tools/OpenROAD/etc/deltaDebug.py can be useful to debug global routing errors. If there is something specific that is impossible to route, such as a clock line over a macro, global routing will terminate with DRC errors on routes that could have been routed were it not for the specific impossible routes. deltaDebug.py should weed out the possible routes and leave a minimal failing case that pinpoints the problem.| 0.5| |
| <a name="RTLMP_AREA_WT"></a>RTLMP_AREA_WT| Weight for the area of the current floorplan.| 0.1| |
| <a name="RTLMP_ARGS"></a>RTLMP_ARGS| Overrides all other RTL macro placer arguments.| | |
| <a name="RTLMP_BOUNDARY_WT"></a>RTLMP_BOUNDARY_WT| Weight for the boundary or how far the hard macro clusters are from boundaries.| 50.0| |
@@ -167,7 +168,7 @@ configuration file.
| <a name="SDC_FILE"></a>SDC_FILE| The path to design constraint (SDC) file.| | |
| <a name="SDC_GUT"></a>SDC_GUT| Load design and remove all internal logic before doing synthesis. This is useful when creating a mock .lef abstract that has a smaller area than the amount of logic would allow. bazel-orfs uses this to mock SRAMs, for instance.| | |
| <a name="SEAL_GDS"></a>SEAL_GDS| Seal macro to place around the design.| | |
| <a name="SETUP_SLACK_MARGIN"></a>SETUP_SLACK_MARGIN| Specifies a time margin for the slack when fixing setup violations. This option allows you to overfix or underfix(negative value, terminate retiming before 0 or positive slack).| 0| |
| <a name="SETUP_SLACK_MARGIN"></a>SETUP_SLACK_MARGIN| Specifies a time margin for the slack when fixing setup violations. This option allows you to overfix or underfix(negative value, terminate retiming before 0 or positive slack). See HOLD_SLACK_MARGIN for more details.| 0| |
| <a name="SET_RC_TCL"></a>SET_RC_TCL| Metal & Via RC definition file path.| | |
| <a name="SKIP_CTS_REPAIR_TIMING"></a>SKIP_CTS_REPAIR_TIMING| Skipping CTS repair, which can take a long time, can be useful in architectural exploration or when getting CI up and running.| | |
| <a name="SKIP_GATE_CLONING"></a>SKIP_GATE_CLONING| Do not use gate cloning transform to fix timing violations (default: use gate cloning).| | |
@@ -343,6 +344,7 @@ configuration file.
## final variables

- [ADDITIONAL_GDS](#ADDITIONAL_GDS)
- [GDS_ALLOW_EMPTY](#GDS_ALLOW_EMPTY)
- [GND_NETS_VOLTAGES](#GND_NETS_VOLTAGES)
- [MAX_ROUTING_LAYER](#MAX_ROUTING_LAYER)
- [MIN_ROUTING_LAYER](#MIN_ROUTING_LAYER)
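
The ROUTING_LAYER_ADJUSTMENT description above suggests starting from a low value such as 0.10 and bisecting towards a value that works. A minimal Python sketch of that search follows. It assumes a hypothetical `succeeds(value)` callable that wraps one global routing run and reports whether it passed, and it assumes a single switch point between the low and high bounds, which the description itself warns is design specific.

```python
from typing import Callable


def bisect_adjustment(
    succeeds: Callable[[float], bool],
    low: float = 0.10,
    high: float = 0.50,
    tolerance: float = 0.05,
) -> float:
    """Return the largest tried adjustment for which succeeds() held.

    Assumption (not from the PR): runs pass at `low`, and there is one
    switch point from passing to failing somewhere between low and high.
    """
    if not succeeds(low):
        raise ValueError(f"routing does not succeed even at {low}")
    if succeeds(high):
        return high  # nothing to bisect, the upper bound already works
    # Invariant: succeeds(low) is True and succeeds(high) is False.
    while high - low > tolerance:
        mid = (low + high) / 2
        if succeeds(mid):
            low = mid
        else:
            high = mid
    return low


# Stand-in predicate for illustration only; a real one would launch a
# global routing run with ROUTING_LAYER_ADJUSTMENT set to `value`.
def fake_run(value: float) -> bool:
    return value <= 0.32


best = bisect_adjustment(fake_run)
```

Each call to `succeeds` costs a full global routing run, so a coarse `tolerance` keeps the number of runs small.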
8 changes: 8 additions & 0 deletions flow/scripts/variables.yaml
@@ -104,13 +104,15 @@ CORE_UTILIZATION:
The core utilization percentage (0-100).
stages:
- floorplan
tunable: 1
CORE_AREA:
description: >
The core area specified as a list of lower-left and upper-right corners in
microns
(X1 Y1 X2 Y2).
stages:
- floorplan
tunable: 1
REPORT_CLOCK_SKEW:
description:
Report clock skew as part of reporting metrics, starting at CTS,
@@ -344,6 +346,7 @@ CELL_PAD_IN_SITES_DETAIL_PLACEMENT:
- cts
- grt
default: 0
tunable: 1
PLACE_PINS_ARGS:
description: >
Arguments to place_pins
@@ -362,6 +365,7 @@ PLACE_DENSITY_LB_ADDON:
description: >
Check the lower boundary of the PLACE_DENSITY and add
PLACE_DENSITY_LB_ADDON if it exists.
tunable: 1
REPAIR_PDN_VIA_LAYER:
description: >
Remove power grid vias which generate DRC violations after detailed routing.
@@ -657,13 +661,15 @@ CORE_MARGIN:
is undefined.
stages:
- floorplan
tunable: 1
DIE_AREA:
description: >
The die area specified as a list of lower-left and upper-right corners in
microns
(X1 Y1 X2 Y2).
stages:
- floorplan
tunable: 1
RESYNTH_AREA_RECOVER:
description: >
Enable re-synthesis for area reclaim.
@@ -702,12 +708,14 @@ CTS_CLUSTER_DIAMETER:
default: 20
stages:
- cts
tunable: 1
CTS_CLUSTER_SIZE:
description: >
Maximum number of sinks per cluster.
default: 50
stages:
- cts
tunable: 1
CTS_SNAPSHOT:
description: >
Creates ODB/SDC files prior to clock net and setup/hold repair.
1 change: 1 addition & 0 deletions tools/AutoTuner/requirements.txt
@@ -9,3 +9,4 @@ tensorboard>=2.14.0,<=2.16.2
protobuf==3.20.3
SQLAlchemy==1.4.17
urllib3<=1.26.15
pyyaml==6.0.1
52 changes: 13 additions & 39 deletions tools/AutoTuner/src/autotuner/distributed.py
@@ -32,6 +32,7 @@
import glob
import subprocess
import random
import yaml
from datetime import datetime
from multiprocessing import cpu_count
from subprocess import run
@@ -368,42 +369,18 @@ def read_tune_pbt(name, this):
return config, sdc_file, fr_file


def parse_flow_variables():
def parse_tunable_variables():
"""
Parse the flow variables from source
- Code: Makefile `vars` target output

Parse the tunable variables from variables.yaml
TODO: Tests.

Output:
- flow_variables: set of flow variables
"""
cur_path = os.path.dirname(os.path.realpath(__file__))
vars_path = os.path.join(cur_path, "../../../../flow/scripts/variables.yaml")

# first, generate vars.tcl
makefile_path = os.path.join(cur_path, "../../../../flow/")
initial_path = os.path.abspath(os.getcwd())
os.chdir(makefile_path)
result = subprocess.run(["make", "vars", f"PLATFORM={args.platform}"])
if result.returncode != 0:
print(f"[ERROR TUN-0018] Makefile failed with error code {result.returncode}.")
sys.exit(1)
if not os.path.exists("vars.tcl"):
print(f"[ERROR TUN-0019] Makefile did not generate vars.tcl.")
sys.exit(1)
os.chdir(initial_path)

# for code parsing, you need to parse from both scripts and vars.tcl file.
pattern = r"(?:::)?env\((.*?)\)"
files = glob.glob(os.path.join(cur_path, "../../../../flow/scripts/*.tcl"))
files.append(os.path.join(cur_path, "../../../../flow/vars.tcl"))
variables = set()
for file in files:
with open(file) as fp:
matches = re.findall(pattern, fp.read())
for match in matches:
for variable in match.split("\n"):
variables.add(variable.strip().upper())
# Read from variables.yaml and get variables with tunable = 1
with open(vars_path) as file:
result = yaml.safe_load(file)
variables = {key for key, value in result.items() if value.get("tunable", 0) == 1}
return variables
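
The filtering step in `parse_tunable_variables` can be sketched on its own, separate from file I/O. The version below is an illustration rather than the PR's code: the helper names `tunable_variables` and `load_tunable_variables` and the sample dictionary are hypothetical. It also skips entries whose value is not a mapping, since a YAML key with an empty body loads as `None` and would make `value.get(...)` raise.

```python
from typing import Any, Mapping, Set


def tunable_variables(entries: Mapping[str, Any]) -> Set[str]:
    """Return the names whose metadata carries tunable == 1."""
    return {
        name
        for name, meta in entries.items()
        # Guard against empty or scalar entries, which have no .get().
        if isinstance(meta, Mapping) and meta.get("tunable", 0) == 1
    }


def load_tunable_variables(path: str) -> Set[str]:
    """Load a variables.yaml-style file and return its tunable names."""
    import yaml  # PyYAML; imported lazily so the filter works without it

    with open(path) as fp:
        entries = yaml.safe_load(fp) or {}  # an empty file loads as None
    return tunable_variables(entries)


# Hypothetical sample shaped like the variables.yaml hunks in this PR.
sample = {
    "CORE_UTILIZATION": {"stages": ["floorplan"], "tunable": 1},
    "CTS_CLUSTER_SIZE": {"default": 50, "stages": ["cts"], "tunable": 1},
    "CTS_SNAPSHOT": {"description": "Creates ODB/SDC files."},
    "EMPTY_ENTRY": None,
}
names = tunable_variables(sample)
```

Keeping the filter as a pure function makes it straightforward to cover with the tests that the docstring's TODO asks for.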


@@ -414,7 +391,7 @@ def parse_config(config, path=os.getcwd()):
options = ""
sdc = {}
fast_route = {}
flow_variables = parse_flow_variables()
flow_variables = parse_tunable_variables()
for key, value in config.items():
# Keys that begin with underscore need special handling.
if key.startswith("_"):
@@ -432,15 +409,12 @@ def parse_config(config, path=os.getcwd()):
"[WARNING TUN-0013] Non-flatten the designs are not "
"fully supported, ignoring _SYNTH_FLATTEN parameter."
)
# Default case is VAR=VALUE
else:
# FIXME there is no robust way to get this metainformation from
# ORFS about the variables, so disable this code for now.

# Default case is VAR=VALUE
# Sanity check: ignore all flow variables that are not tunable
# if key not in flow_variables:
# print(f"[ERROR TUN-0017] Variable {key} is not tunable.")
# sys.exit(1)
if key not in flow_variables:
print(f"[ERROR TUN-0017] Variable {key} is not tunable.")
sys.exit(1)
options += f" {key}={value}"
if bool(sdc):
write_sdc(sdc, path)
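
The re-enabled TUN-0017 check exits on the first non-tunable key. As a possible variation (an assumption, not what the PR does), the sketch below collects every offending key first so that a user can fix a config in one pass. The helper `non_tunable_keys` and the example config are hypothetical; keys starting with an underscore are skipped because `parse_config` handles them separately.

```python
from typing import Any, Iterable, List, Mapping


def non_tunable_keys(config: Mapping[str, Any], tunable: Iterable[str]) -> List[str]:
    """Return the sorted config keys that are not in the tunable set."""
    allowed = set(tunable)
    return sorted(
        key
        for key in config
        # Underscore-prefixed keys get special handling elsewhere.
        if not key.startswith("_") and key not in allowed
    )


# Illustrative config; the values are placeholders.
example_config = {
    "_SYNTH_FLATTEN": 1,
    "CORE_UTILIZATION": 40,
    "GUI_TIMING": 0,
}
offending = non_tunable_keys(example_config, {"CORE_UTILIZATION"})
for key in offending:
    print(f"[ERROR TUN-0017] Variable {key} is not tunable.")
```

A caller would exit with a non-zero status after the loop whenever `offending` is non-empty.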