中科鉴芯（北京）科技有限责任公司-aieda_Placement_and

2024 TVLSI

Hierarchical Graph Learning-Based Floorplanning With Dirichlet Boundary Conditions

Authors: Yiting Liu; Hai Zhou; Jia Wang; Fan Yang; Xuan Zeng; Li Shang

Affiliation: School of Microelectronics, State Key Laboratory of Integrated Chips and System, Fudan University, Shanghai, China

Abstract:

Floorplanning is a complex physical design problem that produces initial locations of movable objects, the quality of which has a great impact on downstream tasks such as placement and routing. To improve the efficacy of floorplanning, machine learning techniques have recently been recruited for help. However, the application-specific location constraints (IOs and cells with fixed locations) pose a huge challenge for machine learning. This article presents a novel uniformization approach by Dirichlet boundary conditions, which decomposes floorplanning into two easier-to-solve subproblems, namely a convex quadratic wirelength optimization problem with location constraints and an NP-hard combinatorial problem with homogeneous Dirichlet boundary conditions. The former problem is efficiently solved using quadratic optimization, and the latter is addressed by efficient graph inference using the proposed hierarchical GNN-based model. The proposed floorplanner called DPlanner has been integrated with state-of-the-art mixed-size placers to generate high-quality placement solutions with up to 56% and 41% improvement in placement iterations and runtime. In addition, compared to the state-of-the-art integrated floorplanning-placement flow, DPlanner achieves over a 20% improvement in placement iteration and more than a 21% reduction in total runtime, along with a 2% average reduction in wirelength.
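
As a rough illustration of the first subproblem only (not DPlanner's implementation), the sketch below minimizes quadratic wirelength in one dimension for a netlist of two-pin connections, with fixed cells acting as boundary values. The names `solve_quadratic_placement`, `edges`, and `fixed` are hypothetical, positive connection weights are assumed, and a plain Gauss-Seidel iteration stands in for whatever quadratic solver the paper uses.

```python
def solve_quadratic_placement(num_cells, edges, fixed, iters=500):
    """Minimize sum of w * (x_i - x_j)^2 over the movable cells (1-D).

    edges: list of (i, j, weight) two-pin connections (weights assumed > 0).
    fixed: dict {cell index: coordinate} for IOs and pre-placed cells.
    """
    # Build a weighted adjacency list from the two-pin connections.
    neighbors = [[] for _ in range(num_cells)]
    for i, j, w in edges:
        neighbors[i].append((j, w))
        neighbors[j].append((i, w))

    # Start every movable cell at the mean of the fixed coordinates.
    start = sum(fixed.values()) / len(fixed)
    x = [fixed.get(i, start) for i in range(num_cells)]

    # Gauss-Seidel sweeps: at the optimum each movable cell sits at the
    # weighted average of its neighbors; fixed cells never move.
    for _ in range(iters):
        for i in range(num_cells):
            if i in fixed or not neighbors[i]:
                continue
            total_w = sum(w for _, w in neighbors[i])
            x[i] = sum(w * x[j] for j, w in neighbors[i]) / total_w
    return x


# Chain of four cells: cell 0 fixed at 0.0, cell 3 fixed at 9.0.
positions = solve_quadratic_placement(
    4, [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0)], {0: 0.0, 3: 9.0}
)
```

For this chain the two movable cells settle at evenly spaced positions between the fixed ends, which is the expected minimizer of the quadratic objective.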

2024 NeurIPS

On Joint Learning for Solving Placement and Routing in Chip Design

Author: Ruoyu Cheng, Junchi Yan

Affiliation: Department of Computer Science and Engineering MoE Key Lab of Artificial Intelligence, AI Institute Shanghai Jiao Tong University, Shanghai, China, 200240

Abstract:

For its advantage in GPU acceleration and less dependency on human experts, machine learning has been an emerging tool for solving the placement and routing problems, as two critical steps in modern chip design flow. Being still in its early stage, there are fundamental issues: scalability, reward design, and end-to-end learning paradigm etc. To achieve end-to-end placement learning, we first propose a joint learning method termed DeepPlace for the placement of macros and standard cells, by the integration of reinforcement learning with a gradient based optimization scheme. To further bridge the placement with the subsequent routing task, we also develop a joint learning approach via reinforcement learning to fulfill both macro placement and routing, which is called DeepPR. One key design in our (reinforcement) learning paradigm involves a multi-view embedding model to encode both global graph level and local node level information of the input macros. Moreover, the random network distillation is devised to encourage exploration. Experiments on public chip design benchmarks show that our method can effectively learn from experience and also provides intermediate placement for the post standard cell placement, within few hours for training.

2024 ISPD

Challenges in Floorplanning and Macro Placement for Modern SoCs

Author: I-Lun Tseng

Affiliation: MediaTek Inc., Hsinchu, Taiwan

Abstract:

Modern System-on-Chips (SoCs), such as smartphone microprocessors, are composed of billions of transistors existing in various subsystems. These subsystems can include Central Processing Units (CPUs), Graphics Processing Units (GPUs), Neural Processing Units (NPUs), Image Signal Processors (ISPs), Digital Signal Processors (DSPs), communication modems, memory controllers, and many others. For efficient Electronic Design Automation (EDA) tasks, such as those involving logic synthesis, placement, clock tree synthesis (CTS), and/or routing, these subsystems are typically broken down into smaller, more manageable circuit blocks, or circuit partitions. This subdivision strategy is crucial for keeping design times within reasonable limits. During the top-level floorplanning phase of chip design, the dimensions, interconnect ports, and physical locations of circuit partitions are defined; the physical boundaries of these partitions are commonly designed as rectilinear shapes rather than rectangles. Partitions that are excessively large can lead to inefficient use of chip area, higher power consumption, and higher production costs. Conversely, undersized partitions can hinder subsequent physical design processes, potentially causing delays in the overall chip design schedule. Furthermore, a poor floorplan can lead to longer wire lengths and can increase feedthrough net counts in partitions, adversely affecting power, performance, and area (PPA). In practice, the top-level floorplanning phase of chip design can involve multiple iterations of its processes. An initial iteration typically involves estimating the approximate area of each circuit partition based on various factors, such as the dimensions of macros (including SRAM macros), the number of standard cell instances, and the standard cell utilization rate, which can be projected based on the data from previous designs. 
These preliminary estimates are crucial for defining the initial shapes, dimensions, interconnect ports, and physical locations of the partitions. Subsequently, the downstream design processes can advance either to partition-level physical design (which includes macro placement, standard cell placement, CTS, routing, etc.) or to physical-aware logic synthesis, which uses the defined layout data to more precisely assess layout-induced effects and produce more accurate gate-level netlists. Once the dimensions and interconnect locations of circuit partitions are defined, macro placement, which is usually followed by standard cell placement and routing processes, can be conducted. After performing these processes, PPA results may indicate that certain partitions require size adjustments due to being too small, whereas others may be identified as candidates for area reduction. Such alterations in the circuit partition areas necessitate modifications to the top-level floorplan. Furthermore, in subsequent iterations of floorplanning, certain elements (such as feedthrough nets/ports) may be added into and/or removed from partitions, prompting a reevaluation of the physical implementation feasibility for these partitions; the reevaluation stage may involve additional macro placement, cell placement, and routing activities. Macro placement is crucial in physical design as its outcomes can substantially influence standard cell placement, CTS, routing, circuit timing, and even power consumption. However, at advanced technology nodes, macro placement outcomes produced by commercial EDA tools and reinforcement learning (RL)-based tools often require human modifications prior to practical use, which is in part owing to complex design rules associated with advanced technology nodes, although these tools can rapidly generate results. Additionally, it has been observed that suboptimal macro placement can lead to issues such as IR drop and increased dynamic/static power consumption.
However, these issues, which may be checked more accurately in later stages of a design flow, are frequently not addressed in a typical macro placement process. In modern SoCs, moreover, it is very common that a circuit partition contains multiple power domains. Performing macro placement on this type of circuit partition may require domain floorplanning prior to placing macros and standard cell instances within their respective power domain regions. As described previously, the floorplanning and the macro placement are often interrelated. Early iterations of floorplanning may not achieve the best configurations for partitions in terms of PPA, leading to additional iterations in the design flow. Also, the macro placement process, along with subsequent cell placement and routing tasks, can serve as a critical and potentially fast evaluation step to assess each partition's physical implementation feasibility, thereby driving continuous refinements in the floorplan. This iterative methodology is crucial in achieving a more refined and optimized chip design, which is especially critical at advanced technology nodes where wafer costs are significantly high. In designing modern SoCs, the importance of performing high-quality floorplanning and high-quality macro placement cannot be overemphasized. Specifically, the floorplanning and the macro placement challenges encountered in the industry, and the obstacles preventing complete automation of these processes need to be re-examined. With ongoing advancements in EDA and AI/ML technologies, such as the application of reinforcement learning (RL) in tuning design flow parameters, coupled with enhanced computational power, we anticipate a substantial improvement and/or potential automation in the iterative aspects of these design processes. Such advancements will not only alleviate the workload of engineers but also enhance the overall quality of results (QoR) in chip designs.
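
The first-iteration area estimate mentioned in this abstract can be sketched as simple arithmetic. The formula below is an assumption for illustration, not the author's method: it adds the macro footprints to the standard cell area divided by the target utilization, and it ignores halos, channels, and rectilinear shapes. The function name and the `avg_cell_area` parameter are hypothetical.

```python
def estimate_partition_area(macros, num_std_cells, avg_cell_area, utilization):
    """Rough first-iteration area estimate for one circuit partition.

    macros: list of (width, height) in micrometers, e.g. SRAM macros.
    avg_cell_area: assumed mean standard cell area in square micrometers.
    utilization: target standard cell utilization rate, 0 < utilization <= 1,
                 typically projected from previous designs.
    """
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in (0, 1]")
    macro_area = sum(w * h for w, h in macros)
    # Standard cells need extra room: divide their area by the utilization.
    std_cell_region = num_std_cells * avg_cell_area / utilization
    return macro_area + std_cell_region


# Two macros plus 200k standard cells at 50% utilization (made-up numbers).
area = estimate_partition_area(
    macros=[(100.0, 50.0), (80.0, 40.0)],
    num_std_cells=200_000,
    avg_cell_area=1.0,
    utilization=0.5,
)
```

Lowering the utilization target enlarges the partition, which is the trade-off between area cost and downstream implementation feasibility that the abstract describes.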

2024 DAC

Redistribution Layer Routing with Dynamic Via Insertion Under Irregular Via Structure

Author: Je-Wei Chuang, Zong-Han Wu, Bo-Ying Huang, Yao-Wen Chang

Affiliation: National Taiwan University

Abstract:

In modern advanced packaging, redistribution layers (RDLs) are often used for signal transmission among chips, and vias are used for communication among different layers. Most existing RDL routers perform via planning before routing. However, since vias can be placed at arbitrary locations under the irregular via structure, via planning limits the solution space and reduces layout flexibility. This paper proposes a new flow with a novel routing graph model for 90- and 135-degree routing, which allows dynamic via insertion during routing. The proposed algorithm enlarges the solution space by providing more choices during path-finding, achieving higher routing quality. The experimental results based on commonly used benchmark suites show that our router achieves over 10% better wirelength with over 29X speedup over the state-of-the-art work and even achieves 0.4% better wirelength with 55X speedup over the state-of-the-art any-angle router.

2024 DAC

ChatPattern: Layout Pattern Customization via Natural Language

Author: Zixiao Wang, Yunheng Shen, Xufeng Yao, Wenqian Zhao, Yang Bai, Farzan Farnia, Bei Yu

Affiliation: The Chinese University of Hong Kong; Tsinghua University

Abstract:

Existing works focus on fixed-size layout pattern generation. ChatPattern proposes a novel framework for flexible pattern customization using an LLM agent and layout pattern generator. The LLM agent interprets natural language requirements, while the generator excels in conditional layout generation. ChatPattern aims to synthesize high-quality large-scale patterns through experiments. Contemporary machine-learning-based lithography design applications require extensive layout patterns for training networks. Rule-based methods were used before machine learning for synthesizing layout patterns automatically. Recent learning-based methods can generate diverse layout patterns closely matching dataset distribution. Learning-based methods lack fine-grained modifications and support for pattern edition. Large Language Models (LLMs) have potential in handling complex tasks like layout pattern library building.

2024 DAC

Net Resource Allocation: A Desirable Initial Routing Step

Author: Zhisheng Zeng, Jikang Liu, Zhipeng Huang, Ye Cai, Biwei Xie, Yungang Bao, Xingquan Li

Affiliation: SKLP, Institute of Computing Technology, CAS, Peng Cheng Laboratory; College of Computer Science and Software Engineering, Shenzhen University; State Key Lab of Processors, Institute of Computing Technology, Chinese Academy of Sciences

Abstract:

In modern IC design, routing significantly impacts chip performance, power, area, and design iteration count. Critical challenges in routing include generating rectilinear Steiner minimum tree (RSMT) for each net and handling routing resource among nets. Due to limited resources and net scale, congestion is inevitable in VLSI circuit routing. Most competitive routers address congestion after routing without prior net guidance, leading to difficulty in managing resources among nets. To tackle routing and congestion, we suggest introducing a net resource allocation step as a potentially desirable initial routing stage. Firstly, we introduce the concept of net region probability density (NRPD) to achieve suitable net resource allocation. Using a prior NRPD, we model the resource allocation problem as quadratic programming (QP). We utilize penalty method to solve the QP quickly and obtain a posterior NRPD for each net on each grid. Based on the posterior NRPD and congestion map, we introduce a cost scheme to guide net routing. This cost scheme supports a weighted RSMT construction technique for better topological solutions. Additionally, we propose an iterative method for global routing and track assignment, improving detailed routing quality and optimizing design rule violations. Experimental results show the effectiveness of net resource allocation and demonstrate superior performance of our router over the OpenROAD's router across multiple metrics.

2024 ASP-DAC

iPD: An Open-source Intelligent Physical Design Toolchain

Author: Xingquan Li, Simin Tao, Shijian Chen, Zhisheng Zeng, Zhipeng Huang, Hongxi Wu, Weiguo Li, Zengrong Huang, Liwei Ni, Xueyan Zhao, He Liu, Shuaiying Long, Ruizhi Liu, Xiaoze Lin, Bo Yang, Fuxing Huang, Zonglin Yang, Yihang Qiu, Zheqing Shao, Jikang Liu, Yuyao Liang, Biwei Xie, Yungang Bao, Bei Yu

Affiliation: Peng Cheng Laboratory; Institute of Computing Technology, Chinese Academy of Sciences; Beijing Institute of Open Source Chip; Fuzhou University; Minnan Normal University; Peking University; Shenzhen University; Sun Yat-sen University; University of Science and Technology of China; University of Chinese Academy of Sciences; The Chinese University of Hong Kong

Abstract:

Open-source electronic design automation (EDA) shows promising potential in unleashing EDA innovation and lowering the cost of chip design. The open-source EDA toolchain is a comprehensive set of software tools designed to facilitate the design, analysis, and verification of electronic circuits and systems. We developed a physical design EDA toolchain (named iPD) from netlist to GDS-II, including design, analysis, and verification. iPD now covers the whole flow of physical design (including floorplan, placement, clock tree synthesis, routing, timing optimization etc.), part of the analysis tools (timing analysis and power analysis), and part of the verification tools (design rule check). To better support EDA research and development and chip design, we design a physical design toolchain that emphasizes reliability, extensibility, ease of use, and feature richness. This paper introduces the software structure, functions, and metrics of the iPD toolchain.

2023 TACD

GraphPlanner: Floorplanning with Graph Neural Network

Authors: Yiting Liu, Ziyi Ju, Zhengming Li, Mingzhi Dong, Hai Zhou, Jia Wang, Fan Yang, Xuan Zeng, Li Shang

Affiliation: School of Microelectronics, State Key Laboratory of Integrated Chips and System, Fudan University, Shanghai, China

Abstract:

Chip floorplanning has long been a critical task with high computation complexity in the physical implementation of VLSI chips. Its key objective is to determine the initial locations of large chip modules with minimized wirelength while adhering to the density constraint, which in essence is a process of constructing an optimized mapping from circuit connectivity to physical locations. Proven to be an NP-hard problem, chip floorplanning is difficult to be solved efficiently using algorithmic approaches. This article presents GraphPlanner, a variational graph-convolutional-network-based deep learning technique for chip floorplanning. GraphPlanner is able to learn an optimized and generalized mapping between circuit connectivity and physical wirelength and produce a chip floorplan using efficient model inference. GraphPlanner is further equipped with an efficient clustering method, a unification of hyperedge coarsening with graph spectral clustering, to partition a large-scale netlist into high-quality clusters with minimized inter-cluster weighted connectivity. GraphPlanner has been integrated with two state-of-the-art mixed-size placers. Experimental studies using both academic benchmarks and industrial designs demonstrate that compared to state-of-the-art mixed-size placers alone, GraphPlanner improves placement runtime by 25% with 4% wirelength reduction on average.

2023 ICMLC

A Hybrid Reinforcement Learning and Genetic Algorithm for VLSI Floorplanning

Authors: Ke Liu, Gu Jian, Hao Gu, Ziran Zhu

Affiliation: Southeast University

Abstract:

Floorplanning plays an essential role in very large-scale integration (VLSI) design flow since its solution quality significantly affects the circuit’s power, performance, and area (PPA). Under practical manufacturing circumstances, the fixed boundary of a floorplan needs to be considered, and such a tight fixed-outline makes the floorplanning problem more complicated. In this paper, we develop a hybrid reinforcement learning and genetic algorithm for VLSI floorplanning. On the one hand, the crossover and mutation operations of the genetic algorithm are adopted to filter out individuals which are more likely to transform to the global optimal solution. On the other hand, we propose an off-policy based reinforcement learning method to further optimize the individuals, which trains an agent from scratch and makes the agent learn local search optimization strategy of the floorplanning problem. Experiment results show that the proposed algorithm can obtain a better area and wirelength of a floorplan.

2023 ASICON

OpenPARF: An Open-Source Placement and Routing Framework for Large-Scale Heterogeneous FPGAs with Deep Learning Toolkit

Authors: Jing Mai; Jiarui Wang; Zhixiong Di; Guojie Luo; Yun Liang; Yibo Lin

Affiliation: National Key Laboratory for Multimedia Information Processing, School of Computer Science, and the Center for Energy-Efficient Computing and Applications, Peking University

Abstract:

This paper proposes OpenPARF, an open-source placement and routing framework for large-scale FPGA designs. OpenPARF is implemented with the deep learning toolkit PyTorch and supports massive parallelization on GPU. The framework proposes a novel asymmetric multi-electrostatic field system to solve FPGA placement. It considers fine-grained routing resources inside configurable logic blocks (CLBs) for FPGA routing and supports large-scale irregular routing resource graphs. Experimental results on ISPD 2016 and ISPD 2017 FPGA contest benchmarks and industrial benchmarks demonstrate that OpenPARF can achieve 0.4-12.7% improvement in routed wirelength and more than 2× speedup in placement. We believe that OpenPARF can pave the road for developing FPGA physical design engines and stimulate further research on related topics.

2023 ISPD

AutoDMP: Automated DREAMPlace-based Macro Placement

Author: Anthony Agnesina, Puranjay Rajvanshi, Tian Yang, Geraldo Pradipta, Austin Jiao, Ben Keller, Brucek Khailany, Haoxing Ren

Affiliation: NVIDIA Corporation, Austin, TX, USA

Abstract:

Macro placement is a critical very large-scale integration (VLSI) physical design problem that significantly impacts the design power-performance-area (PPA) metrics. This paper proposes AutoDMP, a methodology that leverages DREAMPlace, a GPU-accelerated placer, to place macros and standard cells concurrently in conjunction with automated parameter tuning using a multi-objective hyperparameter optimization technique. As a result, we can generate high-quality predictable solutions, improving the macro placement quality of academic benchmarks compared to baseline results generated from academic and commercial tools. AutoDMP is also computationally efficient, optimizing a design with 2.7 million cells and 320 macros in 3 hours on a single NVIDIA DGX Station A100. This work demonstrates the promise and potential of combining GPU-accelerated algorithms and ML techniques for VLSI design automation.

2023 ISPD

DREAM-GAN: Advancing DREAMPlace towards Commercial-Quality using Generative Adversarial Learning

Author: Yi-Chen Lu, Haoxing Ren, Hao-Hsiang Hsiao, Sung Kyu Lim

Affiliation: Georgia Institute of Technology, Atlanta, GA, USA; Nvidia, Austin, TX, USA

Abstract:

DREAMPlace is a renowned open-source placer that provides GPU-acceleratable infrastructure for placements of Very-Large-Scale Integration (VLSI) circuits. However, due to its limited focus on wirelength and density, existing placement solutions of DREAMPlace are not applicable to industrial design flows. To improve DREAMPlace towards commercial-quality without knowing the black-boxed algorithms of the tools, in this paper, we present DREAM-GAN, a placement optimization framework that advances DREAMPlace using generative adversarial learning. At each placement iteration, aside from optimizing the wirelength and density objectives of the vanilla DREAMPlace, DREAM-GAN computes and optimizes a differentiable loss that denotes the similarity score between the underlying placement and the tool-generated placements in commercial databases. Experimental results on 5 commercial and OpenCore designs using an industrial design flow implemented by Synopsys ICC2 not only demonstrate that DREAM-GAN significantly improves the vanilla DREAMPlace at the placement stage across each benchmark, but also show that the improvements last firmly to the post-route stage, where we observe improvements by up to 8.3% in wirelength and 7.4% in total power.

2023 ISPD

Assessment of Reinforcement Learning for Macro Placement

Author: Chung-Kuan Cheng, Andrew B. Kahng, Sayak Kundu, Yucheng Wang, Zhiang Wang

Affiliation: University of California, San Diego, La Jolla, CA, USA

Abstract:

We provide open, transparent implementation and assessment of Google Brain's deep reinforcement learning approach to macro placement (Nature) and its Circuit Training (CT) implementation in GitHub. We implement in open-source key "blackbox" elements of CT, and clarify discrepancies between CT and Nature. New testcases on open enablements are developed and released. We assess CT alongside multiple alternative macro placers, with all evaluation flows and related scripts public in GitHub. Our experiments also encompass academic mixed-size placement benchmarks, as well as ablation and stability studies. We comment on the impact of Nature and CT, as well as directions for future research.

2023 DAC

Mitigating Distribution Shift for Congestion Optimization in Global Placement

Author: Su Zheng, Lancheng Zou, Siting Liu, Yibo Lin, Bei Yu, Martin Wong

Affiliation: Chinese University of Hong Kong; Peking University

Abstract:

The placement and routing (PnR) flow plays a critical role in physical design. Poor routing congestion is a possible problem causing severe routing detours, which can lead to deteriorated timing performance or even routing failure. Deep-learning-based congestion prediction model is designed to guide the global placement process in previous work. However, the distribution shift problem in this method limits its performance. In this paper, we mitigate the distribution shift problem with a look-ahead mechanism inspired by optical flow prediction and an invariant feature space learning technique. With the proposed method, we can achieve better congestion prediction performance and less-congested placement results.

2023 DAC

PUFFER: A Routability-Driven Placement Framework via Cell Padding with Multiple Features and Strategy Exploration

Author: Zhijie Cai, Peng Zou, Zhengtao Wu, Xingyu Tong, Jun Yu, Jianli Chen, Yao-Wen Chang

Affiliation: State Key Lab of ASIC & System, Fudan University, Shanghai 200433, China; Shanghai LEDA Technology Co., Ltd., Shanghai 201203, China; Graduate Institute of Electronics Engineering, National Taiwan University, Taipei 10617, Taiwan; Department of Electrical Engineering, National Taiwan University, Taipei 10617, Taiwan

Abstract:

Placement is a critical stage in VLSI physical design, especially for routability optimization. Due to the large scale and high integration introduced by the advanced semiconductor manufacturing technology, there remains a significant challenge in routability in the placement stage, which will affect the subsequent routing process. This paper proposes a placement framework, called PUFFER, to optimize routability by cell padding and strategy exploration. The framework first estimates congestion by imitating the behaviors of routing detours and clustered cell spreading. Then it calculates cell padding based on multiple features inspired by the characteristics of convolutional and graph neural networks. Besides, it applies a Bayesian-based method to explore a better placement strategy. Compared with a commercial tool and the state-of-the-art academic RePlAce placer, experiments on industrial benchmarks show that our framework achieves the best routability on average, with a 2.7× speedup over the commercial tool.

2023 NeurIPS

HubRouter: Learning Global Routing via Hub Generation and Pin-hub Connection

Author: Xingbo Du, Chonghua Wang, Ruizhe Zhong, Junchi Yan

Affiliation: Dept. of Computer Science and Engineering & MoE Key Lab of AI, Shanghai Jiao Tong University

Abstract:

Global Routing (GR) is a core yet time-consuming task in VLSI systems. It recently attracted efforts from the machine learning community, especially generative models, but they suffer from the non-connectivity of generated routes. We argue that the inherent non-connectivity can harm the advantage of its one-shot generation and has to be post-processed by traditional approaches. Thus, we propose a novel definition, called hub, which represents the key point in the route. Equipped with hubs, global routing is transferred from a pin-pin connection problem to a hub-pin connection problem. Specifically, to generate definitely-connected routes, this paper proposes a two-phase learning scheme named HubRouter, which includes 1) hub-generation phase: A condition-guided hub generator using deep generative models; 2) pin-hub-connection phase: An RSMT construction module that connects the hubs and pins using an actor-critic model. In the first phase, we incorporate typical generative models into a multi-task learning framework to perform hub generation and address the impact of sensitive noise points with stripe mask learning. During the second phase, HubRouter employs an actor-critic model to finish the routing, which is efficient and has very slight errors. Experiments on simulated and real-world global routing benchmarks are performed to show our approach's efficiency; in particular, HubRouter outperforms the state-of-the-art generative global routing methods in wirelength, overflow, and running time. Moreover, HubRouter also shows strength in other applications, such as RSMT construction and interactive path replanning.

2023 ICCAD

Towards Timing-Driven Routing: An Efficient Learning Based Geometric Approach

Author: Liying Yang, Guowei Sun, Hu Ding

Affiliation: School of Data Science, University of Science and Technology of China, Anhui, China

Abstract:

With the rapid increase in circuit complexity, it is urgent to develop efficient algorithmic techniques for EDA. In this paper, we consider the routing problem, which is a key part of designing high-quality chips. In particular, we combine both the max path length and the total wirelength to model our optimization objective, since path delay often causes timing issues that can seriously degrade the whole routing efficiency (even if the total wirelength is small). Compared with most previous works, which consider only wirelength, the timing-driven routing objective is much more challenging to optimize. We propose an efficient learning-based approach together with several novel insights in geometry. For moderate-degree nets, our approach can yield a smoother trade-off between the wirelength and max path length than the state-of-the-art methods. For large-degree nets, we propose an elegant and easy-to-implement geometric data structure called "data-dependent polar quadtree" in the space; using this structure, we can successfully plug our learning-based approach into a divide & merge framework, and the optimization quality over the whole instance can be well preserved.
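
To make the two quantities in this objective concrete, the sketch below evaluates a routing tree given as parent pointers, using Manhattan distances. The weighted sum with a parameter `alpha` is an assumption for illustration; the abstract does not state how the paper combines the two terms, and all function names here are hypothetical.

```python
def manhattan(a, b):
    """Manhattan (rectilinear) distance between two (x, y) points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def evaluate_routing_tree(points, parent, source=0):
    """Return (total wirelength, max source-to-node path length).

    points: list of (x, y) pin coordinates.
    parent: parent[i] is the tree parent of node i; parent[source] is None.
    """
    path_len = {source: 0}

    def length_to(i):
        # Path length from the source to node i, memoized in path_len.
        if i not in path_len:
            p = parent[i]
            path_len[i] = length_to(p) + manhattan(points[i], points[p])
        return path_len[i]

    total = 0
    for i in range(len(points)):
        if i != source:
            total += manhattan(points[i], points[parent[i]])
            length_to(i)
    return total, max(path_len.values())


def timing_driven_cost(points, parent, alpha=0.5):
    """Assumed objective: alpha * wirelength + (1 - alpha) * max path length."""
    wirelength, max_path = evaluate_routing_tree(points, parent)
    return alpha * wirelength + (1 - alpha) * max_path


pins = [(0, 0), (4, 0), (4, 3), (1, 3)]
chain = [None, 0, 1, 2]  # source -> 1 -> 2 -> 3
star = [None, 0, 0, 0]   # every sink wired directly to the source
chain_metrics = evaluate_routing_tree(pins, chain)
star_metrics = evaluate_routing_tree(pins, star)
```

On these four pins the chain has the shorter total wirelength while the star has the shorter longest path, which is the tension a timing-driven router has to balance.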

2023 DATE

RL-Legalizer: Reinforcement Learning-based Cell Priority Optimization in Mixed-Height Standard Cell Legalization

Author: Sung-Yun Lee, Seonghyeon Park, Daeyeon Kim, Minjae Kim, Tuyen P. Le, Seokhyeong Kang

Affiliation: Dept. of Electrical Engineering, POSTECH, Rep. of Korea; AgileSoDA Company, Rep. of Korea

Abstract:

Cell legalization order has a substantial effect on the quality of modern VLSI designs, which use mixed-height standard cells. In this paper, we propose a deep reinforcement learning framework to optimize cell priority in the legalization phase of various designs. We extract the selected features of movable cells and their surroundings, then embed them into cell-wise deep neural networks. We then determine cell priority and legalize them in order using a pixel-wise search algorithm. The proposed framework uses a policy gradient algorithm and several training techniques, including grid-cell subepisode, data normalization, reduced-dimensional state, and network optimization. We aim to resolve the suboptimality of existing sequential legalization algorithms with respect to displacement and wirelength. On average, our proposed framework achieved 34% lower legalization costs in various benchmarks compared to that of the state-of-the-art legalization algorithm.

2023 ARXIV

iEDA: An Open-Source Intelligent Physical Implementation Toolkit and Library

Author: Xingquan Li, Simin Tao, Zengrong Huang, Shijian Chen, Zhisheng Zeng, Liwei Ni, Zhipeng Huang, Chunan Zhuang, Hongxi Wu, Weiguo Li, Xueyan Zhao, He Liu, Shuaiying Long, Wei He, Bojun Liu, Sifeng Gan, Zihao Yu, Tong Liu, Yuchi Miao, Zhiyuan Yan, Hao Wang, Jie Zhao, Yifan Li, Ruizhi Liu, Xiaoze Lin, Bo Yang, Zhen Xue, Fuxing Huang, Zonglin Yang, Zhenggang Wu, Jiangkao Li, Yuezuo Liu, Ming Peng, Yihang Qiu, Wenrui Wu, Zheqing Shao, Kai Mo, Jikang Liu, Yuyao Liang, Mingzhe Zhang, Zhuang Ma, Xiang Cong, Daxiang Huang, Guojie Luo, Huawei Li, Haihua Shen, Mingyu Chen, Dongbo Bu, Wenxing Zhu, Ye Cai, Xiaoming Xiong, Ying Jiang, Yi Heng, Peng Zhang, Biwei Xie, B, and Yungang Bao

Affiliation: Peng Cheng Laboratory, Shenzhen, China; State Key Lab of Processors, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China; Beijing Institute of Open Source Chip, Beijing, China; Fuzhou University, Fuzhou, China; Peking University, Beijing, China; University of Science and Technology of China, Hefei, China; Institute of Microelectronics, Chinese Academy of Sciences, Beijing, China; Shenzhen University, Shenzhen, China; Sun Yat-sen University, Guangzhou, China; Guangdong University of Technology, Guangzhou, China; Minnan Normal University, Zhangzhou, China; The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, China; University of Chinese Academy of Sciences, Beijing, China; Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China

Abstract:

Open-source EDA shows promising potential in unleashing EDA innovation and lowering the cost of chip design. This paper presents an open-source EDA project, iEDA, aiming for building a basic infrastructure for EDA technology evolution and closing the industrial-academic gap in the EDA area. iEDA now covers the whole flow of physical design (including Floorplan, Placement, CTS, Routing, Timing Optimization etc.), and part of the analysis tools (Static Timing Analysis and Power Analysis). To demonstrate the effectiveness of iEDA, we implement and tape out three chips of different scales (from 700k to 1.5M gates) on different process nodes (110nm and 28nm) with iEDA. iEDA is publicly available from the project home page http://ieda.oscc.cc.

2022 TCAD

Preplacement Net Length and Timing Estimation by Customized Graph Neural Network

Author: Zhiyao Xie, Rongjian Liang, Xiaoqing Xu, Jiang Hu, Chen-Chia Chang, Jingyu Pan, Yiran Chen

Affiliation: Department of Electronic and Computer Engineering, Hong Kong University of Science and Technology, Hong Kong, SAR; ASIC and VLSI Research Group, Nvidia, Austin, TX, USA; Arm Research, Austin, TX, USA

Abstract:

Net length is a key proxy metric for optimizing timing and power across various stages of a standard digital design flow. However, the bulk of net length information is not available until cell placement, and hence it is a significant challenge to explicitly consider net length optimization in design stages prior to placement, such as logic synthesis. In addition, the absence of net length information makes accurate pre-placement timing estimation extremely difficult. Poor predictability on the timing not only affects timing optimizations but also hampers the accurate evaluation of synthesis solutions. This work addresses these challenges by a pre-placement prediction flow with estimators on both net length and timing. We propose a graph attention network method with customization, called Net2, to estimate individual net length before cell placement. Its accuracy-oriented version Net2a achieves about 15% better accuracy than several previous works in identifying both long nets and long critical paths. Its fast version Net2f is more than 1000× faster than placement while still outperforms previous works and other neural network techniques in terms of various accuracy metrics. Based on net size estimations, we propose the first ML-based pre-placement timing estimator. Compared with the pre-placement timing report from commercial tools, it improves the correlation coefficient in arc delays by 0.08, and reduces the mean absolute error in slack, WNS, and TNS estimations by more than 50%.

2022 TCAD

GoodFloorplan: Graph Convolutional Network and Reinforcement Learning-Based Floorplanning

Author: Qi Xu, Hao Geng, Song Chen, Bo Yuan, Cheng Zhuo, Yi Kang, Xiaoqing Wen

Affiliation: School of Microelectronics, University of Science and Technology of China, Hefei, China; Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong

Abstract:

Electronic design automation (EDA) comprises a series of computationally difficult optimization problems that require substantial specialized knowledge as well as a considerable amount of trial-and-error efforts. However, open challenges, including long simulation runtime and lack of generalization, continue to restrict the applications of the existing EDA tools. Recently, learning-based algorithms, especially reinforcement learning (RL), have been successfully applied to handle various combinatorial optimization problems by automatically acquiring knowledge from the past experience. In this article, we formulate the floorplanning problem, the first stage of the physical design flow, as a Markov decision process (MDP). An end-to-end learning-based floorplanning framework GoodFloorplan is proposed to explore the design space, which combines graph convolutional network (GCN) and RL. Experimental results demonstrate that compared with state-of-the-art heuristic-based floorplanners, the proposed GoodFloorplan can provide better area and wirelength.

2022 NeurIPS

The policy-gradient placement and generative routing neural networks for chip design.

Author: Ruoyu Cheng, Xianglong Lyu, Yang Li, Junjie Ye, Jianye Hao, Junchi Yan

Affiliation: Department of Computer Science and Engineering, Shanghai Jiao Tong University; Huawei Noah’s Ark Lab

Abstract:

Placement and routing are two critical yet time-consuming steps of chip design in modern VLSI systems. Distinct from traditional heuristic solvers, this paper on one hand proposes an RL-based model for mixed-size macro placement, which differs from existing learning-based placers that often consider the macro by coarse grid-based mask. While the standard cells are placed via gradient-based GPU acceleration. On the other hand, a one-shot conditional generative routing model, which is composed of a special-designed input-size-adapting generator and a bi-discriminator, is devised to perform one-shot routing to the pins within each net, and the order of nets to route is adaptively learned. Combining these techniques, we develop a flexible neural pipeline, which to our best knowledge, is the first joint placement and routing network without involving any traditional heuristic solver. Experimental results on chip design benchmarks showcase the effectiveness of our approach. Source code will be made publicly available at: https://github.com/Thinklab-SJTU/EDA-AI

2022 DAC

Floorplanning with graph attention

Authors: Yiting Liu, Ziyi Ju, Zhengming Li, Mingzhi Dong, Hai Zhou, Jia Wang, Fan Yang, Xuan Zeng, and Li Shang

Affiliation: School of Microelectronics, State Key Laboratory of Integrated Chips and System, Fudan University, Shanghai, China

Abstract:

Floorplanning has long been a critical physical design task with high computation complexity. Its key objective is to determine the initial locations of macros and standard cells with optimized wirelength for a given area constraint. This paper presents Flora, a graph attention-based floorplanner to learn an optimized mapping between circuit connectivity and physical wirelength, and produce a chip floorplan using efficient model inference. Flora has been integrated with two state-of-the-art mixed-size placers. Experimental studies using both academic benchmarks and industrial designs demonstrate that compared to state-of-the-art mixed-size placers alone, Flora improves placement runtime by 18%, with 2% wirelength reduction on average.

2022 DATE

RLPlace: Deep RL Guided Heuristics for Detailed Placement Optimization.

Author: Uday Mallappa, Sreedhar Pratty, David Brown

Affiliation: University of California, San Diego; Nvidia Corporation

Abstract:

The solution space of detailed placement becomes intractable with increase in the number of placeable cells and their possible locations. So, the existing works either focus on the sliding window-based optimization or row-based optimization. Though these region-based methods enable us to use linear-programming, pseudo-greedy or dynamic-programming algorithms, locally optimal solutions from these methods are globally sub-optimal with inherent heuristics. The heuristics such as the order in which we choose these local problems or size of each sliding window (runtime vs. optimality tradeoff) account for the degradation of solution quality. Our hypothesis is that learning-based techniques (with their richer representation ability) have shown a great success in problems with huge solution spaces, and can offer an alternative to the existing rudimentary heuristics. We propose a two-stage detailed-placement algorithm RLPlace that uses reinforcement learning (RL) for coarse re-arrangement and Satisfiability Modulo Theories (SMT) for fine-grain refinement. With global placement output of two critical IPs as the start point, RLPlace achieves up to 1.35% HPWL improvement as compared to the commercial tool's detailed-placement result. In addition, RLPlace shows at least 1.2% HPWL improvement over highly optimized detailed-placement variants of the two IPs.
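Several entries in this list report results as HPWL (half-perimeter wirelength) improvement. As a reading aid, the metric can be sketched as follows. This is a minimal illustration rather than code from any of the listed papers: it assumes every cell is a single point (pin offsets are ignored), and the names `net_hpwl`, `total_hpwl`, `improvement_percent` and the toy design are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

Point = Tuple[float, float]


def net_hpwl(pins: Iterable[Point]) -> float:
    """Half-perimeter of the bounding box that encloses all pins of one net."""
    pts = list(pins)
    if len(pts) < 2:
        return 0.0
    xs = [x for x, _ in pts]
    ys = [y for _, y in pts]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def total_hpwl(nets: Dict[str, List[str]], positions: Dict[str, Point]) -> float:
    """Sum of per-net HPWL; each cell is treated as one point (assumption)."""
    return sum(net_hpwl(positions[c] for c in cells) for cells in nets.values())


def improvement_percent(baseline: float, candidate: float) -> float:
    """Relative reduction of the candidate over the baseline, in percent."""
    return 100.0 * (baseline - candidate) / baseline


# Hypothetical toy design: three cells connected by two nets.
positions = {"a": (0.0, 0.0), "b": (4.0, 3.0), "c": (1.0, 5.0)}
nets = {"n1": ["a", "b"], "n2": ["a", "b", "c"]}
baseline = total_hpwl(nets, positions)

# Moving cell "c" closer to the other two shrinks the bounding box of n2.
moved = dict(positions, c=(1.0, 3.0))
candidate = total_hpwl(nets, moved)
print(baseline, candidate, improvement_percent(baseline, candidate))
```

A detailed placer accepts a move like the one above only if the result stays legal (no overlaps, cells on rows), which this sketch does not check.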

2022 ISPD

RTL-MP: Toward Practical, Human-Quality Chip Planning and Macro Placement.

Author: Andrew B. Kahng, Ravi Varadarajan, Zhiang Wang

Affiliation: University of California San Diego La Jolla, CA, USA

Abstract:

In a typical RTL-to-GDSII flow, floorplanning plays an essential role in achieving decent quality of results (QoR). A good floorplan typically requires interaction between the frontend designer, who is responsible for the functionality of the RTL, and the backend physical design engineer. The increasing complexity of macro-dominated designs (especially machine learning accelerators with auto-generated RTL) has made the floorplanning task even more challenging and time-consuming. In this paper, we propose RTL-MP, a novel macro placer which utilizes RTL information and tries to “mimic” the interaction between the frontend RTL designer and the backend physical design engineer to produce human-quality floorplans. By exploiting the logical hierarchy and processing logical modules based on connection signatures, RTL-MP can capture the dataflow inherent in the RTL and use the dataflow information to guide macro placement. We also apply autotuning [37] to optimize hyperparameter settings based on input designs. We have built RTL-MP based on OpenROAD infrastructure [25, 49] and applied RTL-MP to a set of industrial designs. RTL-MP outperforms state-of-the-art commercial macro placers and achieves QoR similar to that of handcrafted floorplans.

2022 ISPD

Congestion and Timing Aware Macro Placement Using Machine Learning Predictions from Different Data Sources: Cross-design Mode Applicability and the Discerning Ensemble.

Author: Xiang Gao, Yi-Min Jiang, Lixin Shao, Pedja Raspopovic, Menno E. Verbeek, Manish Sharma, Vineet Rashingkar, Amit Jalota

Affiliation: Synopsys Inc., Mountain View, CA, USA

Abstract:

Modern very large-scale integration (VLSI) designs typically use a lot of macros (RAM, ROM, IP) that occupy a large portion of the core area. Also, macro placement being an early stage of the physical design flow, followed by standard cell placement, physical synthesis (place-opt), clock tree synthesis and routing, etc., has a big impact on the final quality of result (QoR). There is a need for Electronic Design Automation (EDA) physical design tools to provide predictions for congestion, timing, and power etc., with certainty for different macro placements before running time-consuming flows. However, the diversity of IC designs that commercial EDA tools must support and the limited number of similar designs that can provide training data, make such machine learning (ML) predictions extremely hard. Because of this, ML models usually need to be completely retrained for unseen designs to work properly. However, collecting full flow macro placement ML data is time consuming and impractical. To make things worse, common ML methods, such as regression, support vector machine (SVM), random forest (RF), neural network (NN) in general, lack a good estimation of prediction accuracy or confidence and lack debuggability for cross-design applications. In this paper, we present a novel discerning ensemble technique for cross-design ML prediction for macro placement. We developed our solution based on a large number of designs with different design styles and technology nodes, and tested the solution on 8 leading-edge industry designs and achieved comparable or even better results in a few hours (per design) than manual placement results that take many engineers weeks or even months to achieve. Our method shows great promise for many ML problems in EDA applications, or even in other areas.

2022 ISPD

Scalability and Generalization of Circuit Training for Chip Floorplanning.

Author: Summer Yue, Ebrahim M. Songhori, Joe Wenjie Jiang, Toby Boyd, Anna Goldie, Azalia Mirhoseini, Sergio Guadarrama

Affiliation: Google Research, San Francisco, CA, USA

Abstract:

Chip floorplanning is a complex task within the physical design process, with more than six decades of research dedicated to it. In a recent paper published in Nature (Mirhoseini et al., 2021), a new methodology based on deep reinforcement learning was proposed that solves the floorplanning problem for advanced chip technologies with production quality results. The proposed method enables generalization, which means that the quality of placements improves as the policy is trained on a larger number of chip blocks. In this paper, we describe Circuit Training, an open-source distributed reinforcement learning framework that re-implements the proposed methodology in TensorFlow v2.x. We will explain the framework and discuss ways it can be extended to solve other important problems within physical design and more generally chip design. We also show new experimental results that demonstrate the scaling and generalization performance of Circuit Training.

2022 ISPD

A Reinforcement Learning Agent for Obstacle-Avoiding Rectilinear Steiner Tree Construction.

Author: Po-Yan Chen, Bing-Ting Ke, Tai-Cheng Lee, I-Ching Tsai, Tai-Wei Kung, Li-Yi Lin, En-Cheng Liu, Yun-Chih Chang, Yih-Lang Li, Mango C.-T. Chao

Affiliation: National Yang Ming Chiao Tung University, Hsinchu, Taiwan, ROC; Realtek Semiconductor Corporation, Hsinchu, Taiwan, ROC

Abstract:

This paper presents a router, which tackles a classic algorithm problem in EDA, obstacle-avoiding rectilinear Steiner minimum tree (OARSMT), with the help of an agent trained by our proposed policy-based reinforcement-learning (RL) framework. The job of the policy agent is to select an optimal set of Steiner points that can lead to an optimal OARSMT based on a given layout. Our RL framework can iteratively upgrade the policy agent by applying Monte-Carlo tree search to explore and evaluate various choices of Steiner points on various unseen layouts. As a result, our policy agent can be viewed as a self-designed OARSMT algorithm that can iteratively evolve by itself. The initial version of the agent is a sequential one, which selects one Steiner point at a time. Based on the sequential agent, a concurrent agent can then be derived to predict all required Steiner points with only one model inference. The overall training time can be further reduced by applying geometrically symmetric samples for training. The experimental results on single-layer 15x15 and 30x30 layouts demonstrate that our trained concurrent agent can outperform a state-of-the-art OARSMT router on both wire length and runtime.

2022 ICCAD

On Advancing Physical Design Using Graph Neural Networks.

Author: Yi-Chen Lu, Sung Kyu Lim

Affiliation: Georgia Institute of Technology, Atlanta, Georgia, USA

Abstract:

As modern Physical Design (PD) algorithms and methodologies evolve into the post-Moore era with the aid of machine learning, Graph Neural Networks (GNNs) are becoming increasingly ubiquitous given that netlists are essentially graphs. Recently, their ability to perform effective graph learning has provided significant insights to understand the underlying dynamics during netlist-to-layout transformations. GNNs follow a message-passing scheme, where the goal is to construct meaningful representations either at the entire graph or node-level by recursively aggregating and transforming the initial features. In the realm of PD, the GNN-learned representations have been leveraged to solve the tasks such as cell clustering, quality-of-result prediction, activity simulation, etc., which often overcome the limitations of traditional PD algorithms. In this work, we first revisit recent advancements that GNNs have made in PD. Second, we discuss how GNNs serve as the backbone of novel PD flows. Finally, we present our thoughts on ongoing and future PD challenges that GNNs can tackle and succeed.
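The message-passing scheme mentioned in this abstract can be illustrated with a deliberately small sketch. It is not taken from the paper: a real GNN layer applies learned weight matrices and a nonlinearity, whereas this version only mixes each node's feature vector with the mean of its neighbors' vectors. The function names, the `self_weight` parameter, and the three-cell chain netlist are hypothetical.

```python
from typing import Dict, List

Features = Dict[str, List[float]]


def message_passing_step(features: Features,
                         neighbors: Dict[str, List[str]],
                         self_weight: float = 0.5) -> Features:
    """One round: blend each node's vector with the mean of its neighbors' vectors."""
    updated: Features = {}
    for node, own in features.items():
        nbrs = neighbors.get(node, [])
        if not nbrs:
            updated[node] = list(own)
            continue
        dim = len(own)
        mean = [sum(features[n][k] for n in nbrs) / len(nbrs) for k in range(dim)]
        updated[node] = [self_weight * own[k] + (1.0 - self_weight) * mean[k]
                         for k in range(dim)]
    return updated


def graph_readout(features: Features) -> List[float]:
    """Graph-level representation as the mean of all node vectors."""
    dim = len(next(iter(features.values())))
    return [sum(v[k] for v in features.values()) / len(features) for k in range(dim)]


# Hypothetical netlist graph: a chain a - b - c with one scalar feature per cell.
neighbors = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
h0 = {"a": [1.0], "b": [0.0], "c": [0.0]}
h1 = message_passing_step(h0, neighbors)
h2 = message_passing_step(h1, neighbors)
print(h1, h2, graph_readout(h2))
```

After one round, cell `c` still carries no information from `a`; after two rounds it does, which is why the number of rounds bounds how far information travels across the netlist.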

2021 NATURE

A graph placement methodology for fast chip design

Author: Azalia Mirhoseini, Anna Goldie, Mustafa Yazgan, J. Jiang, Ebrahim M. Songhori, Shen Wang, Young-Joon Lee, Eric Johnson, Omkar Pathak, Azade Nazi, Jiwoo Pak, Andy Tong, Kavya Srinivasa, W. Hang, Emre Tuncer, Quoc V. Le, J. Laudon, Richard Ho, Roger Carpenter, J. Dean

Affiliation: Google

Abstract:

Chip floorplanning is the engineering task of designing the physical layout of a computer chip. Despite five decades of research, chip floorplanning has defied automation, requiring months of intense effort by physical design engineers to produce manufacturable layouts. Here we present a deep reinforcement learning approach to chip floorplanning. In under six hours, our method automatically generates chip floorplans that are superior or comparable to those produced by humans in all key metrics, including power consumption, performance and chip area. To achieve this, we pose chip floorplanning as a reinforcement learning problem, and develop an edge-based graph convolutional neural network architecture capable of learning rich and transferable representations of the chip. As a result, our method utilizes past experience to become better and faster at solving new instances of the problem, allowing chip design to be performed by artificial agents with more experience than any human designer. Our method was used to design the next generation of Google’s artificial intelligence (AI) accelerators, and has the potential to save thousands of hours of human effort for each new generation. Finally, we believe that more powerful AI-designed hardware will fuel advances in AI, creating a symbiotic relationship between the two fields. Machine learning tools are used to greatly accelerate chip layout design, by posing chip floorplanning as a reinforcement learning problem and using neural networks to generate high-performance chip layouts.

2021 DATE

Global Placement with Deep Learning-Enabled Explicit Routability Optimization

Author: Siting Liu, Qi Sun, Peiyu Liao, Yibo Lin, Bei Yu

Affiliation: The Chinese University of Hong Kong; Peking University

Abstract:

Placement and routing (PnR) is the most time-consuming part of the physical design flow. Recognizing the routing performance ahead of time can assist designers and design tools to optimize placement results in advance. In this paper, we propose a fully convolutional network model to predict congestion hotspots and then incorporate this prediction model into a placement engine, DREAMPlace, to get a more route-friendly result. The experimental results on ISPD2015 benchmarks show that with the superior accuracy of the prediction model, our proposed approach can achieve up to 9.05% reduction in congestion rate and 5.30% reduction in routed wirelength compared with the state-of-the-art.

2020 TCAD

DREAMPlace: Deep Learning Toolkit-Enabled GPU Acceleration for Modern VLSI Placement

Author: Yibo Lin, Shounak Dhar, Wuxi Li, Haoxing Ren, Brucek Khailany, David Z. Pan

Affiliation: ECE Department, UT Austin

Abstract:

Placement for very-large-scale integrated (VLSI) circuits is one of the most important steps for design closure. This paper proposes a novel GPU-accelerated placement framework DREAMPlace, by casting the analytical placement problem equivalently to training a neural network. Implemented on top of a widely-adopted deep learning toolkit PyTorch, with customized key kernels for wirelength and density computations, DREAMPlace can achieve over 30× speedup in global placement without quality degradation compared to the state-of-the-art multi-threaded placer RePlAce. We believe this work shall open up new directions for revisiting classical EDA problems with advancement in AI hardware and software.
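Treating placement like neural network training requires a differentiable wirelength objective, because the exact bounding-box wirelength uses max and min. A common smooth surrogate in analytical placement is the log-sum-exp approximation, sketched below in one dimension. This is an illustrative assumption and is not claimed to be the kernel DREAMPlace implements; the function names and the smoothing parameter `gamma` are hypothetical.

```python
import math
from typing import List


def hpwl_1d(xs: List[float]) -> float:
    """Exact extent of a net along one axis (non-differentiable at ties)."""
    return max(xs) - min(xs)


def lse_wirelength_1d(xs: List[float], gamma: float) -> float:
    """Log-sum-exp surrogate: smooth max minus smooth min, shifted for numerical stability."""
    hi, lo = max(xs), min(xs)
    smooth_max = hi + gamma * math.log(sum(math.exp((x - hi) / gamma) for x in xs))
    smooth_min = lo - gamma * math.log(sum(math.exp(-(x - lo) / gamma) for x in xs))
    return smooth_max - smooth_min


def lse_gradient_1d(xs: List[float], gamma: float) -> List[float]:
    """Gradient of the surrogate: softmax(x / gamma) minus softmax(-x / gamma)."""
    hi, lo = max(xs), min(xs)
    up = [math.exp((x - hi) / gamma) for x in xs]
    down = [math.exp(-(x - lo) / gamma) for x in xs]
    sum_up, sum_down = sum(up), sum(down)
    return [u / sum_up - d / sum_down for u, d in zip(up, down)]


xs = [0.0, 2.0, 5.0]  # hypothetical pin x-coordinates of one net
for gamma in (1.0, 0.1, 0.01):
    print(gamma, lse_wirelength_1d(xs, gamma), lse_gradient_1d(xs, gamma))
```

The surrogate never underestimates the exact extent and exceeds it by at most `2 * gamma * log(n)` for a net with `n` pins, so a smaller `gamma` tracks the exact value more closely at the cost of sharper gradients.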

2020 ARXIV

Chip Placement with Deep Reinforcement Learning

Author: Azalia Mirhoseini, Anna Goldie, Mustafa Yazgan, Joe Jiang, Ebrahim Songhori, Shen Wang, Young-Joon Lee

Affiliation: Google

Abstract:

In this work, we present a learning-based approach to chip placement, one of the most complex and time-consuming stages of the chip design process. Unlike prior methods, our approach has the ability to learn from past experience and improve over time. In particular, as we train over a greater number of chip blocks, our method becomes better at rapidly generating optimized placements for previously unseen chip blocks. To achieve these results, we pose placement as a Reinforcement Learning (RL) problem and train an agent to place the nodes of a chip netlist onto a chip canvas. To enable our RL policy to generalize to unseen blocks, we ground representation learning in the supervised task of predicting placement quality. By designing a neural architecture that can accurately predict reward across a wide variety of netlists and their placements, we are able to generate rich feature embeddings of the input netlists. We then use this architecture as the encoder of our policy and value networks to enable transfer learning. Our objective is to minimize PPA (power, performance, and area), and we show that, in under 6 hours, our method can generate placements that are superhuman or comparable on modern accelerator netlists, whereas existing baselines require human experts in the loop and take several weeks.

2019 DATE

Routability-Driven Macro Placement with Embedded CNN-Based Prediction Model

Author: Yu-Hung Huang, Zhiyao Xie, Guan-Qi Fang, Tao-Chun Yu, Haoxing Ren, Shao-Yun Fang, Yiran Chen, Jiang Hu

Affiliation: National Taiwan University of Science and Technology, Taipei, Taiwan; Duke University, Durham, NC, USA; Nvidia Corporation, Austin, TX, USA; Texas A&M University, College Station, TX, USA

Abstract:

With the dramatic shrink of feature size and the advance of semiconductor technology nodes, numerous and complicated design rules need to be followed, and a chip design can only be taped-out after passing design rule check (DRC). The high design complexity seriously deteriorates design routability, which can be measured by the number of DRC violations after the detailed routing stage. In addition, a modern large-scaled design typically consists of many huge macros due to the wide use of intellectual properties (IPs). Empirically, the placement of these macros greatly determines routability, while there exists no effective cost metric to directly evaluate a macro placement because of the extremely high complexity and unpredictability of cell placement and routing. In this paper, we propose the first work of routability-driven macro placement with deep learning. A convolutional neural network (CNN)-based routability prediction model is proposed and embedded into a macro placer such that a good macro placement with minimized DRC violations can be derived through a simulated annealing (SA) optimization process. Experimental results show the accuracy of the predictor and the effectiveness of the macro placer.

2012 DAC

PADE: A high-performance placer with automatic datapath extraction and evaluation through high dimensional data learning.

Author: Samuel Ward; Duo Ding; David Z. Pan

Affiliation: ECE Department, University of Texas, Austin, Austin, TX, USA

Abstract:

This work presents PADE, a new placement flow with automatic datapath extraction and evaluation. PADE applies novel data learning techniques to train, predict, and evaluate potential datapaths using high-dimensional data such as netlist symmetrical structures, initial placement hints and relative area. Extracted datapaths are mapped to bit-stack structures that are aligned and simultaneously placed with the random logic using SAPT [1], a placer built on top of SimPL [2]. Results show at least 7% average total Half-Perimeter Wire Length (HPWL) and 12% Steiner Wire Length (StWL) improvements on industrial hybrid benchmarks and at least 2% average total HPWL and 3% StWL improvements on ISPD 2005 contest benchmarks. To the best of our knowledge, this is the first attempt to link data learning, datapath extraction with evaluation, and placement and has the tremendous potential for pushing placement state-of-the-art for modern circuits which have datapath and random logics.

2010 ESL

Accurate Machine-Learning-Based On-Chip Router Modeling

Author: Kwangok Jeong, Andrew B. Kahng, Bill Lin, Kambiz Samadi

Affiliation: Department of Electrical and Computer Engineering, University of California, San Diego, CA, USA

Abstract:

As industry moves towards multicore chips, networks-on-chip (NoCs) are emerging as the scalable fabric for interconnecting the cores. With power now the first-order design constraint, early-stage estimation of NoC power, performance, and area has become crucially important. In this work, we develop accurate architecture-level on-chip router cost models using machine-learning-based regression techniques. Compared against existing models (e.g., ORION 2.0 and parametric models), our models reduce estimation error by up to 89% on average.

AI+EDA

Placement and routing