
[PATCH RFC 0/3] Static calls

These patches are related to two similar patch sets from Ard and Steve:

- https://lkml.kernel.org/r/20181005081333.15018-1-ard.biesheuvel@xxxxxxxxxx
- https://lkml.kernel.org/r/20181006015110.653946300@xxxxxxxxxxx

The code is also heavily inspired by the jump label code, as some of the
concepts are very similar.

There are three separate implementations, depending on what the arch
supports:

  1) CONFIG_HAVE_STATIC_CALL_OPTIMIZED: patched call sites - requires
     objtool and a small amount of arch code
  2) CONFIG_HAVE_STATIC_CALL_UNOPTIMIZED: patched trampolines - requires
     a small amount of arch code
  3) If no arch support, fall back to regular function pointers

TODO:

- I'm not sure about the objtool approach.  Objtool is (currently)
  x86-64 only, which means we have to use the "unoptimized" version
  everywhere else.  I may experiment with a GCC plugin instead.

- Does this feature have much value without retpolines?  If not, should
  we make it depend on retpolines somehow?

- Find some actual users of the interfaces (tracepoints? crypto?)

Details (cribbed from comments in include/linux/static_call.h):

Static calls use code patching to hard-code function pointers into
direct branch instructions.  They give the flexibility of function
pointers, but with improved performance.  This is especially important
for cases where retpolines would otherwise be used, as retpolines can
significantly impact performance.

API overview:

 static_call(key, args...);
 static_call_update(key, func);

Usage example:

  # Start with the following functions (with identical prototypes):
  int func_a(int arg1, int arg2);
  int func_b(int arg1, int arg2);

  # Define a 'my_key' reference, associated with func_a() by default
  DEFINE_STATIC_CALL(my_key, func_a);

  # Call func_a()
  static_call(my_key, arg1, arg2);

  # Update 'my_key' to point to func_b()
  static_call_update(my_key, func_b);

  # Call func_b()
  static_call(my_key, arg1, arg2);

Implementation details:

There are three different implementations:

1) Optimized static calls (patched call sites)

   This requires objtool, which detects all the static_call() sites and
   annotates them in the '.static_call_sites' section.  By default, the call
   sites will call into a temporary per-key trampoline which has an indirect
   branch to the current destination function associated with the key.
   During system boot (or module init), all call sites are patched to call
   their destination functions directly.  Updates to a key will patch all
   call sites associated with that key.
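
As a rough illustration of "updates to a key will patch all call sites
associated with that key", here is a small user-space model.  It is only a
sketch: every name in it (model_key, model_site, model_update, model_add,
model_mul) is hypothetical, and each site's 'target' field merely stands in
for the destination of a call instruction that the real code would rewrite
in the kernel text.  The actual layout of '.static_call_sites' is not shown
here.

```c
#include <stddef.h>

typedef int (*call_fn)(int, int);

/* Hypothetical model of a static call key: remembers the current target. */
struct model_key {
    call_fn func;
};

/*
 * Hypothetical model of one entry in a call-site table.  'target' stands
 * in for the destination encoded in the call instruction at that site.
 */
struct model_site {
    struct model_key *key;
    call_fn target;
};

static int model_add(int a, int b) { return a + b; }
static int model_mul(int a, int b) { return a * b; }

/*
 * Point 'key' at 'func' and "patch" every site that belongs to it.
 * Returns the number of sites that were patched.
 */
static size_t model_update(struct model_site *sites, size_t nr,
                           struct model_key *key, call_fn func)
{
    size_t patched = 0;

    key->func = func;
    for (size_t i = 0; i < nr; i++) {
        if (sites[i].key == key) {
            sites[i].target = func;
            patched++;
        }
    }
    return patched;
}
```

In this model, sites belonging to other keys are left untouched, which
mirrors the per-key granularity described above.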

2) Unoptimized static calls (patched trampolines)

   Each static_call() site calls into a permanent trampoline associated with
   the key.  The trampoline has a direct branch to the default function.
   Updates to a key will modify the direct branch in the key's trampoline.
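
To make "modify the direct branch" more concrete, the sketch below computes
the bytes of an x86 'jmp rel32' instruction for a given trampoline address
and destination.  It assumes the trampoline's branch is a 5-byte 'jmp rel32'
(opcode 0xe9 followed by a little-endian 32-bit displacement relative to the
end of the instruction).  encode_jmp_rel32() is a hypothetical helper, not a
function from these patches, and it only builds the bytes; safely writing
them into live kernel text is a separate problem.

```c
#include <stdint.h>

#define JMP_REL32_OPCODE 0xE9
#define JMP_REL32_SIZE   5

/*
 * Hypothetical helper: encode "jmp rel32" for an instruction located at
 * insn_addr that should branch to target.  Returns 0 on success, or -1
 * if the displacement does not fit in a signed 32-bit value.
 */
static int encode_jmp_rel32(uint8_t out[JMP_REL32_SIZE],
                            uint64_t insn_addr, uint64_t target)
{
    /* The displacement is relative to the end of the instruction. */
    int64_t disp = (int64_t)(target - (insn_addr + JMP_REL32_SIZE));
    uint32_t rel;

    if (disp < INT32_MIN || disp > INT32_MAX)
        return -1;

    rel = (uint32_t)(int32_t)disp;
    out[0] = JMP_REL32_OPCODE;
    out[1] = (uint8_t)(rel & 0xff);          /* little-endian rel32 */
    out[2] = (uint8_t)((rel >> 8) & 0xff);
    out[3] = (uint8_t)((rel >> 16) & 0xff);
    out[4] = (uint8_t)((rel >> 24) & 0xff);
    return 0;
}
```

For a trampoline at 0x1000 retargeted to 0x2000, the displacement is
0x2000 - 0x1005 = 0xffb, so the encoded bytes are e9 fb 0f 00 00.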

3) Generic implementation

   This is the default implementation if the architecture hasn't implemented
   CONFIG_HAVE_STATIC_CALL_[UN]OPTIMIZED.  In this case, a basic
   function pointer is used.
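
A minimal user-space sketch of what a function-pointer fallback could look
like follows.  The macro bodies are stand-ins written for this example and
are not copied from include/linux/static_call.h; they rely on the GNU
__typeof__ extension.  func_a(), func_b() and my_key come from the usage
example above, with made-up bodies; demo() is a hypothetical helper.

```c
/*
 * Hypothetical stand-ins for the generic case: the key is simply a
 * function pointer with the same type as the default function.
 */
#define DEFINE_STATIC_CALL(key, func) \
    static __typeof__(func) *key = func
#define static_call(key, ...)         ((key)(__VA_ARGS__))
#define static_call_update(key, func) ((key) = (func))

/* Two functions with identical prototypes (bodies are illustrative). */
static int func_a(int arg1, int arg2) { return arg1 + arg2; }
static int func_b(int arg1, int arg2) { return arg1 * arg2; }

/* 'my_key' is associated with func_a() by default. */
DEFINE_STATIC_CALL(my_key, func_a);

/* Call through the key before and after an update; store both results. */
static void demo(int out[2])
{
    out[0] = static_call(my_key, 3, 4);   /* calls func_a(): 3 + 4 */
    static_call_update(my_key, func_b);
    out[1] = static_call(my_key, 3, 4);   /* calls func_b(): 3 * 4 */
}
```

Every call here is an ordinary indirect call through the pointer, which is
exactly the cost the two patched implementations are meant to avoid.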

Josh Poimboeuf (3):
  static_call: Add static call infrastructure
  x86/static_call: Add x86 unoptimized static call implementation
  x86/static_call: Add optimized static call implementation for 64-bit

 arch/Kconfig                                  |   6 +
 arch/x86/Kconfig                              |   4 +-
 arch/x86/include/asm/static_call.h            |  42 +++
 arch/x86/kernel/Makefile                      |   1 +
 arch/x86/kernel/static_call.c                 |  84 +++++
 include/asm-generic/vmlinux.lds.h             |  11 +
 include/linux/module.h                        |  10 +
 include/linux/static_call.h                   | 186 +++++++++++
 include/linux/static_call_types.h             |  19 ++
 kernel/Makefile                               |   1 +
 kernel/module.c                               |   5 +
 kernel/static_call.c                          | 297 ++++++++++++++++++
 tools/objtool/Makefile                        |   3 +-
 tools/objtool/check.c                         | 126 +++++++-
 tools/objtool/check.h                         |   2 +
 tools/objtool/elf.h                           |   1 +
 .../objtool/include/linux/static_call_types.h |  19 ++
 tools/objtool/sync-check.sh                   |   1 +
 18 files changed, 815 insertions(+), 3 deletions(-)
 create mode 100644 arch/x86/include/asm/static_call.h
 create mode 100644 arch/x86/kernel/static_call.c
 create mode 100644 include/linux/static_call.h
 create mode 100644 include/linux/static_call_types.h
 create mode 100644 kernel/static_call.c
 create mode 100644 tools/objtool/include/linux/static_call_types.h