PXXXXR0: Unified Function Call Syntax (UFCS)

Date: TODAY
Audience: EWG
Reply-to: Shady


Table of Contents

  1. Abstract
  2. Motivation
  3. Design Principles
  4. Proposal
  5. Examples
  6. Technical Specification
  7. Considered Alternatives
  8. Response to P3027
  9. Limitations and Impact
  10. Implementation
  11. References

1. Abstract

This paper proposes extending the . (and ->) member call syntax to fall back to calling non-member functions when no matching member is found. Two forms are introduced:

  • Implicit fallback: obj.f(args...) falls back to f(obj, args...) using a lookup restricted to the namespace of obj's type and the namespaces of its base classes, excluding the namespaces of template arguments.
  • Explicit qualified form: obj.NS::f(args...) is a direct syntactic rewrite to NS::f(obj, args...), equivalent to it in every regard.

This design directly addresses the concerns raised in [P3027R0] by bounding silent dispatch to namespaces the class author owns or has explicitly accepted as dependencies via public inheritance, while preserving the primary benefits of unified call syntax: generic code, IDE discoverability, fluent chaining, and C library interoperability.


2. Motivation

2.1 two incompatible call syntaxes

C++ has two incompatible calling syntaxes:

  • x.f(y) can only call member functions
  • f(x, y) can only call non-member functions

This forces generic code to commit to one syntax, prevents free functions from participating in fluent left-to-right call chains, and harms IDE tooling: when a programmer types x., the IDE cannot suggest free functions that take x as their first argument.

2.2 we already have UFCS but only for operators

C++ already applies unified lookup for operators:

auto add(auto a, auto b) {
    return a + b; // calls member or non-member operator+
}

There is no reason this uniformity should be limited to operators. This proposal extends the same idea to all named functions, with restrictions on where the fallback may look.
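As a minimal sketch of the existing operator behavior, the following compiles today. The types Meters and Seconds are hypothetical and exist only for illustration: one provides operator+ as a member and the other as a non-member, yet the same generic expression a + b works for both.

```cpp
#include <cassert>

// Hypothetical types, used only for illustration.
struct Meters {            // operator+ provided as a member
    int v;
    Meters operator+(Meters o) const { return Meters{v + o.v}; }
};

struct Seconds { int v; }; // operator+ provided as a non-member
inline Seconds operator+(Seconds a, Seconds b) { return Seconds{a.v + b.v}; }

// The same expression `a + b` resolves to either form.
template <class T>
T add(T a, T b) { return a + b; }

inline void usage_example() {
    assert(add(Meters{1}, Meters{2}).v == 3);   // member operator+
    assert(add(Seconds{3}, Seconds{4}).v == 7); // non-member operator+
}
```

Generic code written against a + b never needs to know which form the type author chose, which is the property this paper wants for named functions.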

2.3 massive duplication in the standard lib

A large class of member functions across standard types are pure algorithms that require no access to private state. Because the member-only call syntax makes free functions second-class citizens and harder to discover, these algorithms are duplicated across every type rather than written once generically.

container interface duplication

Member Function     string  array  vector  span  string_view
max_size()               1      1       1     1            1
empty()                  1      1       1     1            1
rbegin()/rend()          2      2       2     2            1
cbegin()/cend()          2      2       2     2            1
crbegin()/crend()        2      2       2     2            1
data()                   2      2       2     1            1
front()                  2      2       2     1            1
back()                   2      2       2     1            1
at()                     2      2       2     1            1
Total                   16     16      16    12            9

string/string_view duplication

Member Function        string  string_view
copy()                      1            1
substr()                    1            1
starts_with()               4            4
ends_with()                 4            4
compare()                   9            9
find()                      4            4
rfind()                     4            4
find_first_of()             4            4
find_last_of()              4            4
find_first_not_of()         4            4
find_last_not_of()          4            4
Total                      43           43

That is over 150 function implementations duplicated across just these types, for what are essentially algorithms already expressible generically. With this proposal, a single generic free function serves all compatible types automatically.

2.4 IDE discoverability

The member call syntax is well suited to IDE tooling because it puts the object first. When you type x., the IDE has a concrete type and can enumerate the valid operations. When you type f(, the set of valid first arguments is potentially unbounded:

std::istream& is = std::cin;
is.  // IDE can now suggest std::getline, and any other
     // free function taking std::istream& as its first argument

2.5 fluent chaining without workarounds

// Today: operator| workaround, extra functor machinery, worse debug codegen
ints | std::views::filter(even) | std::views::transform(square)

// With this proposal: natural dot syntax, same semantics, 0 overhead
ints.std::views::filter(even).std::views::transform(square)

The operator| approach requires special functor infrastructure, increases compile times, and cannot be used with arbitrary free functions the library author did not anticipate. This proposal obsoletes it.

2.6 C library compatibility

C-style object-oriented APIs work naturally:

if (FILE* f = fopen("a.txt", "rb")) {
    f.fseek(9, SEEK_SET);
    long pos = f.ftell();
    int  ch  = f.fgetc();
    f.fprintf("pos=%ld\n", pos);
    f.fclose();
}

2.7 bringing uniformity between enumerations and built-in types and class types

Enums cannot have member functions today, which results in workarounds like this:

struct Enum 
{
    enum Type : uint8_t {
        RED,
        GREEN,
        BLUE
    };
    Enum(Type t) : type(t) {}
    std::string to_string() const;
    Type type;
};

This has several downsides:

  1. an extra indirection just to access a byte value
  2. boilerplate
  3. debug overhead
  4. it isn't an enum
  5. the enumerators can be converted to integral types

With this proposal:

namespace Lib {
    enum class Color { RED, GREEN, BLUE };
    std::string to_string(Color c);
}

auto c = Lib::Color::RED;
std::cout << c.to_string(); // rewrites to Lib::to_string(c)

2.8 less header bloat

Today one must stuff every function inside the class if they want to maintain fluent syntax and better IDE discoverability.

// Rect.hpp
struct Rect 
{
    int x,y;
    int w,h;
};

This is a simple class. It has 0 dependencies and compiles very quickly. But now you want to add more functions, like intersects():

#include <optional> // dependency
struct Rect 
{
    int x,y;
    int w,h;
    std::optional<Rect> intersects(Rect r) const;
};

Although intersects may never be called in 90% of the codebase, every .cpp file that includes this core header still pays the cost of compiling <optional>. This leads to the situation we live with today, where we must choose between:

  1. Good compile times
  2. Good syntax (aka member function)

With this proposal this is no longer a concern: you can add another header, RectUtils.hpp, that holds all the functions that are not integral to Rect.
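A minimal sketch of that split, assuming a free function named intersects placed in the hypothetical RectUtils.hpp (the overlap computation is only one plausible definition, chosen for illustration). It compiles today with the traditional call syntax. Under this proposal the same function would also be reachable as a.intersects(b), because it lives in the namespace of Rect.

```cpp
#include <algorithm>
#include <cassert>
#include <optional>

// Rect.hpp (sketch): the core type stays dependency-free.
struct Rect {
    int x, y;
    int w, h;
};

// RectUtils.hpp (sketch, hypothetical header): only this header needs <optional>.
// Returns the overlapping region, or std::nullopt if the rectangles do not overlap.
inline std::optional<Rect> intersects(Rect a, Rect b) {
    const int left   = std::max(a.x, b.x);
    const int top    = std::max(a.y, b.y);
    const int right  = std::min(a.x + a.w, b.x + b.w);
    const int bottom = std::min(a.y + a.h, b.y + b.h);
    if (right <= left || bottom <= top) return std::nullopt;
    return Rect{left, top, right - left, bottom - top};
}

inline void usage_example() {
    Rect a{0, 0, 10, 10};
    Rect b{5, 5, 10, 10};
    auto hit = intersects(a, b);   // under this proposal: a.intersects(b)
    assert(hit && hit->x == 5 && hit->y == 5 && hit->w == 5 && hit->h == 5);
    assert(!intersects(a, Rect{20, 20, 1, 1}));
}
```

Translation units that only need Rect include Rect.hpp and never see <optional>.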

3. Design Principles

Stolen from Herb's paper [P3021R0]: https://open-std.org/JTC1/SC22/WG21/docs/papers/2023/p3021r0.pdf

Note: These principles apply to all design efforts and aren’t specific to this paper. Please steal and reuse.

The primary design goal is conceptual integrity [Brooks 1975], which means that the design is coherent and reliably does what the user expects it to do. Conceptual integrity’s major supporting principles are:

  • Be consistent: Don’t make similar things different, including in spelling, behavior, or capability. Don’t make different things appear similar when they have different behavior or capability. – For example, enable generic code to call a named function without requiring it to be provided only as a member function or only as a non-member function. Replace the need for current workarounds such as invoking non-member std::begin and providing range/view operator|. Reduce the incentive for future special-purpose language evolution features like operator|>.

  • Be orthogonal: Avoid arbitrary coupling. Let features be used freely in combination. – For example, allow all functions that work on a given type, including non-member non-friends (whether written by the class author themselves for better encapsulation, or by a library user), be used uniformly with objects of that type, without the need for special features like extension methods.

  • Be general: Don’t restrict what is inherent. Don’t arbitrarily restrict a complete set of uses. Avoid special cases and partial features. – For example, today we already have UFCS, but only for overloaded operators; it should be provided for all functions.


4. Proposal

4.1 implicit lookup fallback form

For E1.f(args...) where f is an unqualified name:

  1. Perform member lookup for f in the class of E1.
  2. If a viable member is found, use it. No fallback.
  3. Otherwise rewrite to f(E1, args...) and search:
    • The namespace of E1's type
    • The namespaces of all base classes of E1's type
  4. Perform overload resolution on the overload set from step 3 with the arguments (E1, args...).
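The steps above can be approximated in today's C++ with a detection trait. This is only a sketch under stated assumptions: has_member_draw and call_draw are hypothetical helpers, the fallback branch uses ordinary ADL rather than the restricted lookup proposed here, and a deleted member would be treated as absent by the trait, unlike the rule in section 6.1.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

namespace Graphics {
    struct Point { double x, y; };
    inline int draw(const Point&) { return 2; }   // non-member only
}

struct Widget {
    int draw() const { return 1; }                // member only
};

// Detects whether `t.draw()` is a valid member call.
template <class T, class = void>
struct has_member_draw : std::false_type {};
template <class T>
struct has_member_draw<T, std::void_t<decltype(std::declval<T&>().draw())>>
    : std::true_type {};

// Hypothetical helper emulating `t.draw()` under the proposed rules.
template <class T>
int call_draw(T& t) {
    if constexpr (has_member_draw<T>::value) {
        return t.draw();   // steps 1-2: a viable member wins, no fallback
    } else {
        return draw(t);    // steps 3-4: rewrite to draw(t); today this uses ordinary ADL
    }
}

inline void usage_example() {
    Widget w;
    Graphics::Point p{0.0, 0.0};
    assert(call_draw(w) == 1);   // member
    assert(call_draw(p) == 2);   // non-member found in Graphics
}
```

With the proposal, generic code would write t.draw() directly and would not need a helper like this for each function name.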

4.2 explicit qualified form

For E1.NS::f(args...) where NS is a namespace:

  1. Unconditionally rewrite to NS::f(E1, args...).
  2. Works for all types including builtins and pointers.

The -> operator is covered by the existing rule that E1->f() is equivalent to (*E1).f(), so the same rules apply.
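Because the explicit form is a pure rewrite, its meaning can be read off from calls that are already valid today. The sketch below lists a few such calls with the proposed spelling in comments. The spellings follow the rules in 4.2 and are not something current compilers accept.

```cpp
#include <cassert>
#include <cstdlib>
#include <iterator>
#include <vector>

inline void usage_example() {
    std::vector<int> v = {1, 2, 3};
    int raw[4] = {0, 0, 0, 0};
    int i = -7;
    std::vector<int>* pv = &v;

    assert(std::size(v) == 3);      // proposed spelling: v.std::size()
    assert(std::size(raw) == 4);    // proposed spelling: raw.std::size()
    assert(std::abs(i) == 7);       // proposed spelling: i.std::abs()
    assert(std::size(*pv) == 3);    // proposed spelling: pv->std::size(), i.e. (*pv).std::size()
}
```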

5. Examples

5.1 basic

namespace Math {
    struct Vector { double x, y; };
    Vector normalize(const Vector& v);
    double dot(const Vector& a, const Vector& b);
}

Math::Vector v{1.0, 2.0};
auto n = v.normalize();                   // Math::normalize(v)
auto d = v.dot(Math::Vector{0.0, 1.0});   // Math::dot(v, Math::Vector{0.0,1.0})

5.2 qualified Form

std::vector<int> v = {1, 2, 3};
auto s = v.std::size();    // std::size(v) 
bool e = v.std::empty();   // std::empty(v)

5.3 ranges

#include <ranges>
#include <vector>

constexpr auto even   = [](int i) { return i % 2 == 0; };
constexpr auto square = [](int i) { return i * i; };

std::vector<int> v = {1, 2, 3, 4, 5};

for (int i : v.std::views::filter(even)
              .std::views::transform(square)) {
    // ...
}

5.4 generic code works uniformly

namespace Graphics {
    struct Point { double x, y; };
    void draw(const Point& p);  // draw(Point): Point is first ✓
}

struct Widget {
    void draw() const;
};

template<typename T>
void render(T& t) {
    t.draw(); // Widget::draw()   when T = Widget (member wins)
              // Graphics::draw(t)     when T = Graphics::Point (fallback)
}

5.5 code reuse eliminating standard library duplication

namespace std {
    // Written once works everywhere
    template<class Rng>
    auto sub(const Rng& r, std::size_t pos, std::size_t count) {
        auto it = r.begin() + pos;
        return Rng(it, it + std::min(count, r.size() - pos));
    }

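    template<class Rng, class Rng2>
    bool starts_with(const Rng& obj, const Rng2& other) {
        // explicit qualified form to avoid ADL
        if (obj.size() < other.size()) return false;
        auto head = obj.std::sub(0, other.size());
        // compare element-wise: types such as std::span have no operator==
        return std::equal(head.begin(), head.end(), other.begin(), other.end());
    }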
}

std::string        s  = "hello world";
std::string_view   sv = s;
std::vector<int>   v  = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
std::span<const int> sp = v;

auto p1 = s.sub(6, 5);           // "world"
auto p2 = sv.std::sub(6, 5);     // "world"
auto p3 = v.sub(6, 4);           // {7,8,9,10}
auto p4 = sp.std::sub(2, 3);     // {3,4,5}

s.std::starts_with(sv);          // true
sp.std::starts_with(v);          // true

// left to right reading like English
sv.std::sub(6, 5).std::starts_with(std::string_view{"world"});

auto result = "Python in my {}".std::format("C++");

5.6 enums and builtins

namespace MyLib {
    enum class Color { Red, Green, Blue };
    std::string to_string(Color c); // to_string(Color): Color is first
}

MyLib::Color c = MyLib::Color::Red;
auto s = c.to_string(); // MyLib::to_string(c)
int i = -3;
i = i.std::abs(); // calls std::abs(i)

5.7 C library interoperability

Only functions where FILE* is the first parameter work with implicit UFCS:

// Signatures:
// fseek  (FILE*, long, int)          — FILE* first ✓
// fclose (FILE*)                     — FILE* first ✓
// fprintf(FILE*, const char*, ...)   — FILE* first ✓
// ftell  (FILE*)                     — FILE* first ✓
// feof   (FILE*)                     — FILE* first ✓
// fgetc  (FILE*)                     — FILE* first ✓

FILE* f = fopen("a.txt", "rb");
if (f) {
    f.fseek(9, SEEK_SET);
    long pos = f.ftell();
    int  ch  = f.fgetc();
    f.fprintf("pos=%ld\n", pos);
    f.fclose();
}

5.8 member preference

namespace Foo {
    struct Bar { void f() {}; };
    void f(Bar&);
}

Foo::Bar b;
b.f();      // OK: calls Bar::f
Foo::f(b);  // OK: calls Foo::f(Bar&)
b.Foo::f(); // OK: explicit qualified form calls Foo::f(b)

5.9 base class association

namespace Base {
    class Shape { double area; };
    double perimeter(const Shape& s); 
}

namespace Derived {
    class Circle : public Base::Shape { double radius; };
    double circumference(const Circle& c); 
}

Derived::Circle c;
// Member lookup in Circle finds nothing, so lookup continues in Circle's
// namespace, `Derived`, and finds Derived::circumference
c.circumference(); 
// Member lookup in Circle finds nothing and neither does `Derived`, so lookup
// continues in the base class's namespace, `Base`, and finds Base::perimeter
c.perimeter();     // Base::perimeter(c), found via the namespace of `Shape`
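For comparison, ordinary ADL already associates both namespaces with a Circle argument, so the unqualified free-function calls below compile today. This is a sketch: the function bodies and the member side are placeholders added so the example runs, and public members are used for brevity.

```cpp
#include <cassert>

namespace Base {
    struct Shape { double side = 1.0; };
    // Placeholder body, for illustration only.
    inline double perimeter(const Shape& s) { return 4.0 * s.side; }
}

namespace Derived {
    struct Circle : Base::Shape { double radius = 2.0; };
    // Placeholder body, for illustration only.
    inline double circumference(const Circle& c) { return 2.0 * 3.14159 * c.radius; }
}

inline void usage_example() {
    Derived::Circle c;
    // Unqualified calls: ADL associates Derived (the type's namespace)
    // and Base (its base class's namespace) with the argument.
    double circ = circumference(c);    // proposed spelling: c.circumference()
    double peri = perimeter(c);        // proposed spelling: c.perimeter()
    assert(circ > 12.0 && circ < 13.0);
    assert(peri == 4.0);
}
```

The proposed fallback searches the same two namespaces for these calls, only with the object written first.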

6. Technical Specification

6.1 Implicit Fallback Lookup

For E1.E2(args) where E2 is an unqualified name:

  1. Perform member lookup for E2 in the class of E1 per [class.member.lookup].
  2. If a deleted member is the best viable candidate, the expression is ill-formed. No fallback.
  3. If a viable non-deleted member is found, use it. No fallback.
  4. Otherwise rewrite to E2(E1, args) and perform lookup restricted to:
    • Innermost enclosing namespace of E1's type
    • Innermost enclosing namespaces of all public direct and indirect base classes
    • Global namespace if E1's type is declared there or is a built-in type
    • Not namespaces of template arguments (per [P0934R0])
    • Not namespaces of private or protected base classes
  5. Perform overload resolution with argument list (E1, args).

E1's associated namespaces depend on its type:

  • If E1 has pointer or reference type, the associated namespaces are those of the underlying type.
  • If E1 has array type (of known or unknown bound), the associated namespaces are those of the element type.
  • If E1 has function pointer type, the associated namespaces are those of the parameter types and the return type.
  • If E1 has pointer to member function type, the associated namespaces are those of the parameter types, the return type, and the class.
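To illustrate what the "not namespaces of template arguments" rule in 6.1 excludes, the sketch below shows the behavior of ordinary ADL today. App, Item, and count_items are hypothetical names used only for this example.

```cpp
#include <cassert>
#include <vector>

namespace App {
    struct Item { int id; };

    // A function template in the namespace of the *template argument*,
    // not in the namespace of std::vector.
    template <class Container>
    int count_items(const Container& c) { return static_cast<int>(c.size()); }
}

inline void usage_example() {
    std::vector<App::Item> items = {{1}, {2}, {3}};
    // Ordinary ADL: App is an associated namespace of std::vector<App::Item>
    // because App::Item is a template argument, so this unqualified call compiles.
    assert(count_items(items) == 3);
    // Under the rules in 6.1, items.count_items() would NOT consider App,
    // only std (the namespace of std::vector). The explicit form
    // items.App::count_items() would still work.
}
```

The implicit fallback is therefore narrower than the lookup an unqualified free-function call gets today.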

6.2 Explicit Qualified Lookup

For E1.NS::E2(args) where NS is a namespace:

  1. Unconditionally rewrite to NS::E2(E1, args).
  2. Qualified lookup in NS only. No ADL. Purely syntactic.

7. Considered Alternatives

7.1 Full ADL UFCS (N4174, P0251)

Rewrites x.f(args) to f(x, args) with full ADL across all argument types. Functions from completely unrelated namespaces can be silently found based on the types of the other arguments. This proposal's restricted lookup scheme eliminates that problem.

7.2 Extension Methods (C#-style, N1585)

A simple example of how extension methods can break is adding starts_with to std::string_view:

// not inside the namespace of string_view
// imaginary syntax for extensions
bool starts_with(const this std::string_view& a, std::string_view b)
{
     return a.substr(0,b.size()) == b;
}
std::string_view s;
s.starts_with("Hello");

This is all good until std::string_view itself gains a member named starts_with that takes a const char*, which will then be chosen instead because it is an exact match.

This is fine when both functions do the same thing, but what if they don't? The result is silent breakage, and class authors can no longer guarantee what their API provides. This is why this paper chose a new lookup scheme that ties fallback functions to the type's own namespace: it avoids the problems of unrestricted implicit lookup, and the qualified form keeps the class interface stable and visually separate from extensions. For example, an API may guarantee that all of its const member functions are thread-safe. Extension methods would undermine that guarantee because x.ext() and x.mem() look the same, whereas under this proposal x.Utils::ext() makes the intent explicit: it is clearly not part of the official API, so the guarantee does not necessarily apply.

7.3 operator| Pipeline (std::ranges)

Works today but is a workaround that:

  1. Requires custom functor infrastructure.
  2. Increases already bad compilation times.
  3. Produces terrible debug code.
  4. Cannot be applied to arbitrary existing free functions.

This proposal obsoletes it.

7.4 operator|> (P2011, P2672)

This proposal is against one-off symbols or features, and it makes operator|> unnecessary.

7.5 CRTP mixins / deducing this

A common objection to building this feature into the language is that mostly the same thing can be achieved with the tools we already have:

struct MonadicMixin {
    auto value_or(this auto&& self,auto&& default_)
    {
        return self ? *self : default_;
    }
};

template<typename T>
struct optional : MonadicMixin {};

template<typename T,typename E>
struct expected : MonadicMixin {};

struct Thing : MonadicMixin {
    optional<int> a;
};
// sizeof(Thing) is 8 not 4!
// unnecessary cost

Because the same base class type appears twice (once as the base of Thing and once as the base of its member a), the two base subobjects must have different addresses, which makes the class larger than it needs to be.

The proper "solution" would be to use CRTP so that each type has its own unique base class:

template<typename CRTP>
struct MonadicMixin 
{
    using P = CRTP&;
    auto value_or(auto&& default_) 
    {
        return P(*this) ? *P(*this) : default_;
    }
};

template<typename T>
struct optional : MonadicMixin<optional<T>> {};

template<typename T,typename E>
struct expected : MonadicMixin<expected<T,E>> {};

struct Thing : MonadicMixin<Thing> {
    optional<int> a;
};

// sizeof(Thing) == 4;

However, the CRTP approach introduces additional complexity and is not easily teachable, especially to beginners. It also causes slower compilation, bigger debug symbols, and worse error messages because of the longer symbol names.
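The size effect described in this section can be checked with a reduced sketch that does not need deducing this. All type names below are hypothetical stand-ins: PlainOpt and CrtpOpt play the role of an optional<int> that stores a single int.

```cpp
#include <cassert>

// Non-template mixin: every user shares the same empty base type.
struct PlainMixin {};
struct PlainOpt : PlainMixin { int value; };
struct PlainThing : PlainMixin { PlainOpt a; };

// CRTP mixin: each user gets its own distinct empty base type.
template <class Derived>
struct CrtpMixin {};
struct CrtpOpt : CrtpMixin<CrtpOpt> { int value; };
struct CrtpThing : CrtpMixin<CrtpThing> { CrtpOpt a; };

// Two PlainMixin subobjects (the base of PlainThing and the base of its member `a`)
// must have distinct addresses, so `a` cannot sit at offset 0.
static_assert(sizeof(PlainThing) > sizeof(PlainOpt), "shared empty base costs space");

inline void usage_example() {
    // On common ABIs the empty base optimization applies here, so no padding is added.
    assert(sizeof(CrtpThing) == sizeof(CrtpOpt));
}
```

A free function found through the proposed fallback needs no base class at all, so neither the size cost nor the CRTP boilerplate arises.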


8. Response to P3027

[P3027R0] argues that any UFCS turning a member call into a free function call "breaks the guarantee that code using member function calls will never be subject to the complexity and woes of ADL." This proposal agrees.

8.1 namespace ownership

P3027's breakage scenarios share one assumption: the silently found free function is "arbitrarily far away in code you never wrote, brought in by any of the many associated namespaces." This is valid against full ADL UFCS, but it does not apply here.

With this proposal, implicit fallback can only find functions in:

  • The namespace of the type you are working with
  • The namespaces of types you explicitly chose to inherit from

If you are refactoring a type, you own its namespace. You know every function in it. The "arbitrarily far away code you never wrote" scenario is eliminated by design.

8.2 rename of a member

P3027: Foo::snap() is renamed to Foo::slap(). A call f.snap() silently finds a free function instead of erroring.

Under this proposal, the only free snap that can be found must be in the namespace of Foo, the same namespace you are refactoring, so you should know that it exists. If a fallback is undesired, add void snap() = delete; to block it explicitly.

8.3 removal of a member function

Same analysis. A removed member can only fall back to a free function in the same namespace or in a base class's namespace. The = delete escape hatch provides a hard opt-out.
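A sketch of the = delete opt-out, using today's language rules to confirm that a call to a deleted member is ill-formed. The namespace Lib and the trait can_call_member_snap are hypothetical. That the call stays an error rather than falling back to Lib::snap is what section 6.1, step 2 specifies, not something current compilers can demonstrate.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

namespace Lib {
    struct Foo {
        void slap() {}
        void snap() = delete;   // blocks Foo::snap and, under the proposal, the fallback too
    };
    inline void snap(Foo&) {}   // free function in the same namespace
}

// Detects whether `t.snap()` is well-formed as a member call today.
template <class T, class = void>
struct can_call_member_snap : std::false_type {};
template <class T>
struct can_call_member_snap<T, std::void_t<decltype(std::declval<T&>().snap())>>
    : std::true_type {};

static_assert(!can_call_member_snap<Lib::Foo>::value,
              "calling a deleted member is ill-formed");

inline void usage_example() {
    Lib::Foo f;
    f.slap();       // the renamed member
    Lib::snap(f);   // the free function remains reachable when named explicitly
    // f.snap();    // error today, and still an error under 6.1 step 2
}
```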

8.4 changing a parameter type

P3027 raises: change Kraken& to Kraken*, forget to change k.free() to k->free(), now ::free(k) gets called.

An identical hazard already exists today without UFCS:

struct Kraken { void reset(); };

void release(std::unique_ptr<Kraken>& k) {
    k.reset(); // intended k->reset(), but calls unique_ptr::reset() instead:
               // already silently wrong today, no UFCS involved
}

The pointer/reference confusion bug class predates this proposal. UFCS does not introduce it, although it may make it more common. The fix belongs in tooling, via warnings.

8.5 deleted member should not fall back

P3027 argues correctly that a deleted member should remain a hard error. This proposal agrees.

8.6 inheritance from third party namespaces

Inheriting from a type in a namespace you do not own means that namespace is searched on fallback. This is consistent: public inheritance is an explicit declaration that this type IS-A base type. The programmer has already accepted fragility with respect to the base's member interface: calling a base member function you didn't write already works today, and nobody objects to it. Extending that same accepted dependency to free functions in the base's namespace is consistent behavior.


9. Limitations and Impact

9.1 first arg only

UFCS only applies when the object is the first argument of the free function. C functions where the primary object is not the first parameter cannot use either form and must be called traditionally:

// fputs(const char* str, FILE* stream) — FILE* is second
// file.fputs("str") rewrites to fputs(file, "str"), which does not compile
fputs("Hello\n", f);  // must remain traditional
"Hello\n".::fputs(f); // also works but quirky

I think that this problem is impossible to solve and I would prefer if this stays an error.

9.2 No Breaking Changes

This is an extension. All existing well-formed code remains well-formed with identical meaning. Fallback only applies to expressions that are currently ill-formed.

9.3 future on the standard library

This proposal enables the standard library to replace over 150 duplicated member functions with single generic free functions, and to deprecate the operator| infrastructure in std::ranges. Existing members remain as the member priority path for backward compatibility.

The standard would also no longer guarantee that, in the expression a.f(), f is a member of a.

10. Implementation

The author has implemented this proposal in a Clang fork resolving CWG1089.


11. References

  • [N4174] B. Stroustrup. "Call syntax: x.f(y) vs. f(x,y)" (WG21, October 2014)
  • [N4165] H. Sutter. "Unified call syntax" (WG21, October 2014)
  • [P3021R0] H. Sutter. "Unified function call syntax" (WG21, October 2023)
  • [P3027R0] V. Voutilainen et al. "UFCS is a breaking change, of the absolutely worst kind" (WG21, October 2023)
  • [P0934R0] H. Sutter. "A Modest Proposal: Fixing ADL" (WG21, February 2018)
  • [P2011R0] B. Revzin. "A pipeline-rewrite operator" (WG21, January 2020)
  • [P2672R0] B. Revzin. "Exploring the design space for a pipeline operator" (WG21, October 2022)
  • [N1585] F. Glassborow. "Extension Methods" (WG21, 2004)
  • [CWG1089] "Template parameters in member selections" CWG active issues
  • [UFCS History] B. Revzin. "UFCS History" April 2019