feat[venom]: function inliner #4266

Closed
Changes from 68 commits
Commits
83 commits
2560bbf
include additional memory operation to raise to variables
harkal May 21, 2024
0dbd0b7
refactor and apply a threashold on the number of memory ops we are re…
harkal May 21, 2024
93f6c51
optimization
harkal May 21, 2024
05c2d7d
aesthetics
harkal May 21, 2024
679ba7b
ident
harkal May 21, 2024
7020b06
Merge branch 'master' into fix/palloca
harkal May 21, 2024
805299f
fix typo
harkal May 21, 2024
a810d30
remove dead code
harkal May 22, 2024
10d1a5f
Merge branch 'master' into fix/palloca
charles-cooper May 25, 2024
ed2d84f
Merge remote-tracking branch 'origin-vyper/master' into fix/palloca
harkal May 28, 2024
d538e6e
lint
harkal May 28, 2024
df1e011
Merge branch 'master' into fix/palloca
charles-cooper May 29, 2024
2d0096b
Merge remote-tracking branch 'origin-vyper/master' into fix/palloca
harkal May 30, 2024
f013a07
Merge branch 'master' into fix/palloca
charles-cooper May 30, 2024
a8ec72b
Merge branch 'master' into fix/palloca
charles-cooper Sep 29, 2024
09ada41
cleanup
charles-cooper Sep 29, 2024
f7ef8c9
hygiene
charles-cooper Sep 29, 2024
97aa938
fix lint
charles-cooper Sep 29, 2024
d061341
change mem2var variable prefix
charles-cooper Sep 29, 2024
50996c1
change calling convention
charles-cooper Sep 29, 2024
da363a1
Merge branch 'fix/palloca' into feat/venom/inliner
charles-cooper Sep 29, 2024
16fa951
simple function inliner
charles-cooper Sep 29, 2024
62f23a9
add inliner threshold
charles-cooper Sep 29, 2024
575478f
fix stuff
charles-cooper Sep 29, 2024
f000e09
fix stuff
charles-cooper Sep 29, 2024
5dc01e8
add a note
charles-cooper Sep 30, 2024
8e71219
fix lint
charles-cooper Oct 1, 2024
06b8533
remove global symbol copying
charles-cooper Oct 21, 2024
7376501
float allocas
charles-cooper Oct 23, 2024
20a70c0
Merge branch 'master' into feat/venom/inliner
charles-cooper Oct 28, 2024
c5d78c6
fix lint
charles-cooper Oct 28, 2024
89ba4e5
Merge branch 'master' into feat/venom/inliner
charles-cooper Nov 20, 2024
e8bbdb7
Merge branch 'master' into feat/venom/inliner
charles-cooper Nov 20, 2024
0478878
some fixes
charles-cooper Nov 20, 2024
a506708
Merge branch 'master' into fix/alloca
charles-cooper Nov 22, 2024
ebd3895
don't float
charles-cooper Nov 22, 2024
f6687f8
debug
charles-cooper Nov 22, 2024
682bbd3
sanity check
charles-cooper Nov 22, 2024
c2f2945
wip
charles-cooper Nov 22, 2024
ff611ed
wip2
charles-cooper Nov 22, 2024
92d8da0
working(?)
charles-cooper Nov 22, 2024
72d62cb
optimize
charles-cooper Nov 22, 2024
2ca6b80
fix alloca id generator
charles-cooper Nov 22, 2024
be03bb5
delete dead code
charles-cooper Nov 22, 2024
e9d7d74
rename global_symbols to alloca_table
charles-cooper Nov 22, 2024
17246c8
add id to alloca variables for better debugging
charles-cooper Nov 22, 2024
db11269
update readme
charles-cooper Nov 22, 2024
c9f2a98
fix sqrt
charles-cooper Nov 22, 2024
0d22091
fix lint
charles-cooper Nov 22, 2024
280d732
Merge branch 'master' into fix/alloca
charles-cooper Nov 22, 2024
b460c17
add notes
charles-cooper Nov 22, 2024
4f20568
slight cleanup
charles-cooper Nov 22, 2024
e21642a
roll back a change
charles-cooper Nov 22, 2024
e95de4b
remove dead variable
charles-cooper Nov 22, 2024
6f61f53
Merge branch 'fix/alloca' into feat/venom/inliner
charles-cooper Nov 22, 2024
6be1300
wip callocas
charles-cooper Nov 22, 2024
7d83d93
notes
charles-cooper Nov 23, 2024
c75a273
feat[venom]: add calloca instruction
charles-cooper Nov 26, 2024
c647456
save/restore alloca table inside functions
charles-cooper Nov 26, 2024
f660a3c
float calloca
charles-cooper Nov 26, 2024
23d1267
add callsite to calloca
charles-cooper Dec 2, 2024
3275824
update readme
charles-cooper Dec 17, 2024
ca1c09d
Merge branch 'master' into feat/venom/calloca
charles-cooper Dec 18, 2024
a26b503
formatting
charles-cooper Dec 18, 2024
ef3ad9c
Merge branch 'master' into feat/venom/calloca
charles-cooper Dec 19, 2024
e127a2b
Merge branch 'master' into feat/venom/inliner
charles-cooper Dec 20, 2024
d9dbf19
fix a branch
charles-cooper Dec 20, 2024
2ef864e
Merge branch 'feat/venom/calloca' into feat/venom/inliner
charles-cooper Dec 20, 2024
5cef52e
Merge branch 'master' into feat/venom/calloca
charles-cooper Dec 20, 2024
2065480
Merge branch 'master' into feat/venom/calloca
charles-cooper Dec 20, 2024
0924941
fixes
charles-cooper Dec 22, 2024
c16d171
remove callsite from calloca
charles-cooper Dec 22, 2024
cf407ec
Merge branch 'master' into feat/venom/inliner
charles-cooper Dec 22, 2024
be35da5
fix lint
charles-cooper Dec 22, 2024
e45d42d
Merge branch 'feat/venom/calloca' into feat/venom/inliner
charles-cooper Dec 22, 2024
371c1ec
fix more lint
charles-cooper Dec 22, 2024
f462010
remove dead function
charles-cooper Dec 22, 2024
f1e344b
fix garbage vars
charles-cooper Dec 22, 2024
286e969
fix: inline in call graph order
charles-cooper Dec 22, 2024
2ebab45
ditch calloca
charles-cooper Dec 23, 2024
48b6c8c
Merge branch 'master' into feat/venom/inliner
charles-cooper Dec 27, 2024
35c1048
reorder basic blocks in cfg simplification
charles-cooper Dec 27, 2024
594ba80
prune duplicate allocas
charles-cooper Dec 27, 2024
3 changes: 3 additions & 0 deletions vyper/codegen/context.py
@@ -35,6 +35,9 @@ class Alloca:

    _id: int

    # special metadata for calloca. hint for venom to tie calloca to call site.
    _callsite: Optional[str] = None

    def __post_init__(self):
        assert self.typ.memory_bytes_required == self.size

4 changes: 2 additions & 2 deletions vyper/codegen/function_definitions/common.py
@@ -2,7 +2,7 @@
from functools import cached_property
from typing import Optional

from vyper.codegen.context import Constancy, Context
from vyper.codegen.context import Constancy, Context, VariableRecord
from vyper.codegen.ir_node import IRnode
from vyper.codegen.memory_allocator import MemoryAllocator
from vyper.evm.opcodes import version_check
@@ -16,7 +16,7 @@
class FrameInfo:
    frame_start: int
    frame_size: int
    frame_vars: dict[str, tuple[int, VyperType]]
    frame_vars: dict[str, VariableRecord]

    @property
    def mem_used(self):
26 changes: 25 additions & 1 deletion vyper/codegen/self_call.py
@@ -1,5 +1,9 @@
import copy
import dataclasses

from vyper.codegen.core import _freshname, eval_once_check, make_setter
from vyper.codegen.ir_node import IRnode
from vyper.codegen.memory_allocator import MemoryAllocator
from vyper.evm.address_space import MEMORY
from vyper.exceptions import StateAccessViolation
from vyper.semantics.types.subscriptable import TupleT
@@ -66,7 +70,27 @@

    # note: dst_tuple_t != args_tuple_t
    dst_tuple_t = TupleT(tuple(func_t.argument_types))
    args_dst = IRnode(func_t._ir_info.frame_info.frame_start, typ=dst_tuple_t, location=MEMORY)
    if context.settings.experimental_codegen:
        arg_items = ["multi"]
        frame_info = func_t._ir_info.frame_info

        for var in frame_info.frame_vars.values():
            var = copy.copy(var)
            alloca = var.alloca
            assert alloca is not None
            assert isinstance(var.pos, str)  # help mypy
            if not var.pos.startswith("$palloca"):
                continue
            newname = var.pos.replace("$palloca", "$calloca")
            var.pos = newname
            alloca = dataclasses.replace(alloca, _callsite=return_label)
            irnode = var.as_ir_node()
            irnode.passthrough_metadata["alloca"] = alloca
            arg_items.append(irnode)
        args_dst = IRnode.from_list(arg_items, typ=dst_tuple_t)
    else:
        # legacy
        args_dst = IRnode(func_t._ir_info.frame_info.frame_start, typ=dst_tuple_t, location=MEMORY)

    # if one of the arguments is a self call, the argument
    # buffer could get borked. to prevent against that,
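The renaming step in the hunk above can be sketched in isolation. This is a minimal illustration, not Vyper's code: the `Alloca` and `VariableRecord` classes below are trimmed stand-ins (their field sets are assumptions), and the names `$palloca_1`, `$alloca_2` and `label_ret_7` are made up for the example. It shows that each memory-passed parameter gets a call-site copy whose name is switched from `$palloca` to `$calloca` and whose alloca carries the return label, while the callee's own record is left as it was.

```python
import copy
import dataclasses
from typing import Optional


@dataclasses.dataclass(frozen=True)
class Alloca:  # stand-in; the real class has more fields
    _id: int
    size: int
    _callsite: Optional[str] = None


@dataclasses.dataclass
class VariableRecord:  # stand-in; the real class has more fields
    pos: str
    alloca: Alloca


def callsite_vars(frame_vars, return_label):
    """Return call-site copies of the memory-passed parameters."""
    out = []
    for var in frame_vars.values():
        if not var.pos.startswith("$palloca"):
            continue  # not a memory-passed parameter
        var = copy.copy(var)  # leave the callee's record untouched
        var.pos = var.pos.replace("$palloca", "$calloca")
        # dataclasses.replace builds a new Alloca with one field changed
        var.alloca = dataclasses.replace(var.alloca, _callsite=return_label)
        out.append(var)
    return out


frame_vars = {
    "x": VariableRecord("$palloca_1", Alloca(1, 32)),
    "tmp": VariableRecord("$alloca_2", Alloca(2, 32)),
}
copies = callsite_vars(frame_vars, "label_ret_7")
```

Only `x` is copied here; `tmp` is an ordinary local and is skipped, mirroring the `continue` in the hunk.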
5 changes: 5 additions & 0 deletions vyper/venom/README.md
@@ -205,6 +205,11 @@ Assembly can be inspected with `-f asm`, whereas an opcode view of the final byt
    out = palloca size, offset, id
    ```
  - Like the `alloca` instruction but only used for parameters of internal functions which are passed by memory.
- `calloca`
  - ```
    out = calloca size, offset, id, <callsite label>
    ```
  - Similar to the `palloca` instruction, but used at the call site of an internal function call, for arguments which are passed by memory.
- `iload`
  - ```
    out = iload offset
12 changes: 10 additions & 2 deletions vyper/venom/__init__.py
@@ -14,6 +14,7 @@
    AlgebraicOptimizationPass,
    BranchOptimizationPass,
    DFTPass,
    FunctionInlinerPass,
    FloatAllocas,
    MakeSSA,
    Mem2Var,
@@ -46,15 +47,22 @@ def generate_assembly_experimental(
def _run_passes(fn: IRFunction, optimize: OptimizationLevel) -> None:
    # Run passes on Venom IR
    # TODO: Add support for optimization levels
    ac = IRAnalysesCache(fn, optimize)

    ac = IRAnalysesCache(fn)
    FunctionInlinerPass(ac, fn).run_pass()

    FloatAllocas(ac, fn).run_pass()

    SimplifyCFGPass(ac, fn).run_pass()
    MakeSSA(ac, fn).run_pass()

    Mem2Var(ac, fn).run_pass()
    MakeSSA(ac, fn).run_pass()

    # function inliner can insert bad variables, remove them before sccp

    RemoveUnusedVariablesPass(ac, fn).run_pass()

    SCCP(ac, fn).run_pass()
    StoreElimination(ac, fn).run_pass()
    MemMergePass(ac, fn).run_pass()
@@ -77,11 +85,11 @@ def _run_passes(fn: IRFunction, optimize: OptimizationLevel) -> None:
def run_passes_on(ctx: IRContext, optimize: OptimizationLevel):
    for fn in ctx.functions.values():
        _run_passes(fn, optimize)
    ctx.prune_unreachable_functions()


def generate_ir(ir: IRnode, optimize: OptimizationLevel) -> IRContext:
    # Convert "old" IR to "new" IR
    ctx = ir_node_to_venom(ir)
    run_passes_on(ctx, optimize)

    return ctx
8 changes: 7 additions & 1 deletion vyper/venom/analysis/analysis.py
@@ -1,5 +1,6 @@
from typing import Type

from vyper.compiler.settings import OptimizationLevel
from vyper.venom.function import IRFunction


@@ -36,11 +37,16 @@ class IRAnalysesCache:

    function: IRFunction
    analyses_cache: dict[Type[IRAnalysis], IRAnalysis]
    optimize: OptimizationLevel

    def __init__(self, function: IRFunction):
    def __init__(self, function: IRFunction, optimize=None):
        self.analyses_cache = {}
        self.function = function

        if optimize is None:
            optimize = OptimizationLevel.default()
        self.optimize = optimize

    def request_analysis(self, analysis_cls: Type[IRAnalysis], *args, **kwargs):
        """
        Request a specific analysis to be run on the IR. The result is cached and
7 changes: 7 additions & 0 deletions vyper/venom/basicblock.py
@@ -241,6 +241,13 @@ def __init__(
        self.ast_source = None
        self.error_msg = None

    def copy(self):
        cls = self.__class__
        ret = cls.__new__(cls)
        ret.__dict__ = self.__dict__.copy()
        ret.operands = ret.operands.copy()
        return ret

    @property
    def is_volatile(self) -> bool:
        return self.opcode in VOLATILE_INSTRUCTIONS
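The `copy` method above is a shallow copy with one exception: the operand list. A minimal sketch of what that buys, using a hypothetical `Inst` class in place of the real instruction class (the constructor and operand strings are assumptions for illustration):

```python
class Inst:  # hypothetical stand-in for the instruction class in the hunk
    def __init__(self, opcode, operands):
        self.opcode = opcode
        self.operands = operands

    def copy(self):
        cls = self.__class__
        ret = cls.__new__(cls)  # allocate without running __init__
        ret.__dict__ = self.__dict__.copy()  # shallow copy of all attributes
        ret.operands = ret.operands.copy()  # give the copy its own operand list
        return ret


a = Inst("add", ["%1", "%2"])
b = a.copy()
b.operands[0] = "%9"  # rewriting an operand on the copy...
b.opcode = "sub"  # ...or rebinding an attribute leaves `a` alone
```

Without the extra `operands.copy()`, both instructions would share one list, so renaming an operand in an inlined copy would also rename it in the original. Any other mutable attribute is still shared between the two objects.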
22 changes: 22 additions & 0 deletions vyper/venom/context.py
@@ -1,5 +1,6 @@
from typing import Optional

from vyper.utils import OrderedSet
from vyper.venom.basicblock import IRInstruction, IRLabel, IROperand
from vyper.venom.function import IRFunction

@@ -39,6 +40,23 @@ def get_next_label(self, suffix: str = "") -> IRLabel:
        self.last_label += 1
        return IRLabel(f"{self.last_label}{suffix}")

    def prune_unreachable_functions(self):
        entry = next(iter(self.functions.values()))
        to_visit = OrderedSet([entry])
        seen = OrderedSet()
        while to_visit:
            fn = to_visit.pop()
            seen.add(fn)
            for bb in fn.get_basic_blocks():
                for inst in bb.instructions:
                    if inst.opcode == "invoke":
                        label = inst.operands[0]
                        next_fn = self.get_function(label)
                        if next_fn not in seen:
                            to_visit.add(next_fn)

        self.functions = {label: fn for label, fn in self.functions.items() if fn in seen}

    def chain_basic_blocks(self) -> None:
        """
        Chain basic blocks together. This is necessary for the IR to be valid, and is done after
@@ -47,6 +65,10 @@ def chain_basic_blocks(self) -> None:
        for fn in self.functions.values():
            fn.chain_basic_blocks()

    def float_allocas(self) -> None:
        for fn in self.functions.values():
            fn.float_allocas()

    def append_data(self, opcode: str, args: list[IROperand]) -> None:
        """
        Append data
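`prune_unreachable_functions` above is a worklist traversal of the call graph starting from the first function. A minimal sketch of the same idea on plain data, assuming the call graph is given as a dict from function name to the names it invokes (the `calls` dict and the function names are made up for the example; the real method walks `invoke` instructions instead):

```python
def reachable(entry, calls):
    """Return the set of functions reachable from `entry`."""
    to_visit = [entry]
    seen = set()
    while to_visit:
        fn = to_visit.pop()
        if fn in seen:
            continue
        seen.add(fn)
        for callee in calls.get(fn, ()):
            if callee not in seen:
                to_visit.append(callee)
    return seen


calls = {
    "main": ["a", "b"],
    "a": ["b"],
    "b": [],
    "dead": ["a"],  # calls a live function but is never called itself
}
live = reachable("main", calls)
pruned = {name: callees for name, callees in calls.items() if name in live}
```

After inlining, a callee whose every call site was inlined no longer appears as the target of any `invoke`, so a traversal like this is enough to drop it from the output.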
19 changes: 19 additions & 0 deletions vyper/venom/function.py
@@ -179,6 +179,25 @@ def chain_basic_blocks(self) -> None:
                else:
                    bb.append_instruction("stop")

    def float_allocas(self):
        entry_bb = self.entry
        assert entry_bb.is_terminated
        tmp = entry_bb.instructions.pop()

        for bb in self.get_basic_blocks():
            if bb is entry_bb:
                continue

            # "fast" way to strip allocas from each basic block
            def is_alloca(inst):
                return inst.opcode in ("alloca", "palloca")

            bb.instructions.sort(key=is_alloca)
            while len(bb.instructions) > 0 and is_alloca(bb.instructions[-1]):
                entry_bb.insert_instruction(bb.instructions.pop())

        entry_bb.instructions.append(tmp)

    def copy(self):
        new = IRFunction(self.name)
        new._basic_block_dict = self._basic_block_dict.copy()
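The "fast" stripping trick in `float_allocas` relies on two Python facts: `list.sort` is stable, and `False` sorts before `True`. Sorting by `is_alloca` therefore moves every alloca to the tail of the block without reordering the remaining instructions, after which the allocas can be popped off the end. A minimal sketch on bare opcode strings (the opcode list is made up for the example):

```python
def split_allocas(opcodes):
    """Strip alloca opcodes from `opcodes` in place and return them."""

    def is_alloca(op):
        return op in ("alloca", "palloca")

    # stable sort on a boolean key: non-allocas first, relative order kept
    opcodes.sort(key=is_alloca)
    removed = []
    while opcodes and is_alloca(opcodes[-1]):
        removed.append(opcodes.pop())
    return removed


body = ["mstore", "alloca", "add", "palloca", "jmp"]
removed = split_allocas(body)
```

In this sketch the allocas come off the tail in the reverse of their original order; whether that matters depends on where the receiving block inserts them.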
24 changes: 24 additions & 0 deletions vyper/venom/ir_node_to_venom.py
@@ -125,6 +125,13 @@ def ir_node_to_venom(ir: IRnode) -> IRContext:

    ctx.chain_basic_blocks()

    # float allocas to the front of the function. we could probably move
    # them to the immediate dominator of the basic block defining the alloca
    # instead of the entry (which dominates all basic blocks), but this is
    # done for expedience. without this step, sccp fails, possibly because
    # dominators are not guaranteed to be traversed first.
    ctx.float_allocas()

    return ctx


@@ -177,9 +184,14 @@ def _handle_self_call(fn: IRFunction, ir: IRnode, symbols: SymbolTable) -> Optio
def _handle_internal_func(
    fn: IRFunction, ir: IRnode, does_return_data: bool, symbols: SymbolTable
) -> IRFunction:
    global _alloca_table

    fn = fn.ctx.create_function(ir.args[0].args[0].value)
    bb = fn.get_basic_block()

    _saved_alloca_table = _alloca_table
    _alloca_table = {}

    # return buffer
    if does_return_data:
        symbols["return_buffer"] = bb.append_instruction("param")
@@ -191,6 +203,7 @@ def _handle_internal_func(

    _convert_ir_bb(fn, ir.args[0].args[2], symbols)

    _alloca_table = _saved_alloca_table
    return fn


@@ -539,6 +552,17 @@ def emit_body_blocks():
                _alloca_table[alloca._id] = ptr
            return _alloca_table[alloca._id]

        elif ir.value.startswith("$calloca"):
            alloca = ir.passthrough_metadata["alloca"]
            if alloca._id not in _alloca_table:
                assert alloca._callsite is not None
                callsite = IRLabel(alloca._callsite)
                ptr = fn.get_basic_block().append_instruction(
                    "calloca", alloca.offset, alloca.size, alloca._id, callsite
                )
                _alloca_table[alloca._id] = ptr
            return _alloca_table[alloca._id]

        return symbols.get(ir.value)
    elif ir.is_literal:
        return IRLiteral(ir.value)
1 change: 1 addition & 0 deletions vyper/venom/passes/__init__.py
@@ -1,6 +1,7 @@
from .algebraic_optimization import AlgebraicOptimizationPass
from .branch_optimization import BranchOptimizationPass
from .dft import DFTPass
from .function_inliner import FunctionInlinerPass
from .float_allocas import FloatAllocas
from .make_ssa import MakeSSA
from .mem2var import Mem2Var
2 changes: 1 addition & 1 deletion vyper/venom/passes/float_allocas.py
@@ -23,7 +23,7 @@ def run_pass(self):
            # Extract alloca instructions
            non_alloca_instructions = []
            for inst in bb.instructions:
                if inst.opcode in ("alloca", "palloca"):
                if inst.opcode in ("alloca", "palloca", "calloca"):
                    # note: order of allocas impacts bytecode.
                    # TODO: investigate.
                    entry_bb.insert_instruction(inst)