Proposal: Introduce f sigils for string interpolation

Hi everyone,

I’d like to introduce a proposal to simplify string interpolation in Erlang by extending triple-quoted strings with the f sigil. This builds on the foundation of EEP 64 (Triple-Quoted Strings) and addresses the verbosity of dynamic content insertion.

Motivation

Currently, interpolating values into triple-quoted strings requires manual concatenation:

% Tedious today:
render(Bindings) ->
    <<
    "<html>\n"
    "  <head>\n"
    "    <title>", (maps:get(title, Bindings))/binary, "</title>\n"
    "  </head>\n"
    "</html>"
    >>.

AFAIK, it’s not possible to do the above using Triple-Quoted Strings.

This PR proposes a cleaner syntax using ~f"""...""", allowing embedded Erlang expressions enclosed in {}:

% Proposed:
render(Bindings) ->
    ~f"""
    <html>
      <head>
        <title>{maps:get(title, Bindings)}</title>
      </head>
    </html>
    """.

Example output:

1> render(#{title => ~"Example"}).
<<"<html>\n  <head>\n    <title>Example</title>\n  </head>\n</html>">>

Key Features

  1. f Sigil Syntax: ~f"""...""" evaluates to a binary with interpolated values.
  2. Expression Handling: Values inside {} must be binaries (supports nesting and escaping \{).
  3. Alignment with EEP 64: Complements triple-quoted strings without breaking existing code.

Note that EEP 62 also proposes string interpolation, but with a different syntax.

Feedback Welcome!

This is a draft implementation targeting OTP-28. I’d appreciate thoughts on:

  • The f sigil vs. alternatives (e.g., i)
  • Syntax choices ({} vs. other delimiters)
  • Edge cases or potential ambiguities

Please review the PR and share your opinions here. Let’s discuss!


PR Link:


Example 1: Dynamic Email Template Generation

Use Case: Generating HTML/text email templates for password resets with embedded variables.

Problem Without Interpolation

Constructing multi-format emails requires manual concatenation, leading to error-prone and visually noisy code:

send_reset_email(User, ResetToken) ->
  HtmlBody = <<
    "<html>\n",
    "<body>\n",
    "  <h1>Password Reset for ", (User#user.name)/binary, "</h1>\n",
    "  <p>Click <a href=\"https://example.com/reset?token=", 
    (ResetToken)/binary, "\">here</a> to reset.</p>\n",
    "</body>\n",
    "</html>"
  >>,
  TextBody = <<"Password Reset for ", (User#user.name)/binary, "\n\n",
               "Reset link: https://example.com/reset?token=", (ResetToken)/binary>>,
  send_email(User#user.email, "Password Reset", HtmlBody, TextBody).

Solution With ~f Interpolation

Cleaner syntax with proper structure preservation:

send_reset_email(User, ResetToken) ->
    HtmlBody = ~f"""
    <html>
      <body>
        <h1>Password Reset for {User#user.name}</h1>
        <p>Click <a href="https://example.com/reset?token={ResetToken}">here</a> to reset.</p>
      </body>
    </html>
    """,
    TextBody = ~f"""
    Password Reset for {User#user.name}
    
    Reset link: https://example.com/reset?token={ResetToken}
    """,
    send_email(User#user.email, "Password Reset", HtmlBody, TextBody).

Discussion:

  • Variables (User#user.name, ResetToken) are injected directly into the template.
  • Maintains visual alignment of HTML/Text content.
  • No risk of missing <<...>> operators or commas.

Example 2: Structured Error Logging

Solution With ~f Interpolation

Direct interpolation simplifies context inclusion:

log_error(Req, Error) ->
    logger:error(~f"""
    Request {Req#request.id} failed: {Error}. Path: {Req#request.path}
    """).

Example Output

1> Req = #request{id = ~"req_2fg8", path = ~"/user/123/profile"}.
#request{id = <<"req_2fg8">>, path = <<"/user/123/profile">>}
2> log_error(Req, ~"Permission denied").
=ERROR REPORT==== 1-Mar-2025::06:31:01.618403 ===
Request req_2fg8 failed: Permission denied. Path: /user/123/profile
ok

Discussion:

  • Dynamic values (id, path, Error) are embedded naturally.
  • No need for io_lib:format/2 or positional placeholders (~s).
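
For comparison, here is a minimal sketch of the same message built with io_lib:format/2 as it works today. The module name log_sketch and the helper format_error/3 are hypothetical, introduced only for illustration, and the sketch assumes all three arguments are ASCII binaries.

```erlang
-module(log_sketch).
-export([format_error/3]).

%% Hypothetical helper: builds the same text as the ~f example,
%% using positional ~s placeholders instead of inline expressions.
format_error(Id, Error, Path) ->
    iolist_to_binary(
        io_lib:format("Request ~s failed: ~s. Path: ~s", [Id, Error, Path])).
```

The placeholders and the argument list have to be kept in sync by hand, which is the bookkeeping the interpolated form avoids.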

Example 3: OTP code

Take this function of the otp_man_index module:

module_table() ->
    % ...suppressed code
    ["|  Module name | Description | Application |\n",
     "|--------------|-------------|-------------|\n",
     [["|  `m:", M, "` | ",ModuleDoc," | [", App,"-",Vsn,"](`e:",App,":index.html`) |\n"] ||
         {M, {App,Vsn}, ModuleDoc} <- Modules],
     "\n\n"].

List-based concatenation is hard to read and maintain.

Solution With ~f Interpolation

module_table() ->
    % ...suppressed code
    ~f""""
    |  Module name | Description | Application |
    |--------------|-------------|-------------|
    {<< ~f""" 
    |  `m:{M}` | {ModuleDoc} | [{App}"-"{Vsn}](`e:{App}:index.html`) |

    """ || {M, {App,Vsn}, ModuleDoc} <- Modules >>}
    """".

Example Output

This example uses a modified version of the function that accepts Modules as a parameter.

1> io:format("~s", [module_table([
       {~"foo", {~"kernel", ~"8.2"}, ~"Core OTP functionality"},
       {~"bar", {~"stdlib", ~"5.1"}, ~"Standard library utilities"}
   ])]).
|  Module name | Description | Application |
|--------------|-------------|-------------|
|  `m:foo` | Core OTP functionality | [kernel"-"8.2](`e:kernel:index.html`) |
|  `m:bar` | Standard library utilities | [stdlib"-"5.1](`e:stdlib:index.html`) |
ok

How the f Sigil Works

The ~f sigil parses the string, splits it into static text and dynamic expressions,
and generates a binary by concatenating these parts. Each interpolated expression ({...})
is evaluated and inserted as a /binary segment, so it must itself yield a binary.

Example

1> FirstName = ~"Bob",
   Age = 27,
   ~f"""
   First Name: {FirstName}
   Age: {integer_to_binary(Age)}
   """.
<<"First Name: Bob\nAge: 27">>

Transformation Steps

  1. Parsing:
    The ~f sigil splits the string into:

    • Static text: "First Name: ", "\nAge: "
    • Dynamic expressions: FirstName, integer_to_binary(Age)
  2. Code Generation:
    Each segment becomes a binary part:

    <<
      "First Name: "/utf8,                      % Static text (UTF-8 binary)
      (begin FirstName end)/binary,             % Expression 1 (binary)
      "\nAge: "/utf8,                           % Static text (UTF-8 binary)
      (begin integer_to_binary(Age) end)/binary % Expression 2 (binary)
    >>
    
    

Equivalent Manual Code

1> FirstName = ~"Bob",
   Age = 27,
   <<
     "First Name: "/utf8, FirstName/binary, "\n" 
     "Age: "/utf8, (integer_to_binary(Age))/binary
   >>.
<<"First Name: Bob\nAge: 27">>

Key Details

  • Static Text: Preserved as UTF-8 binaries (e.g., "First Name: "/utf8).

  • Dynamic Expressions:

    • Wrapped in begin ... end to isolate evaluation.
    • Must resolve to binaries (e.g., integer_to_binary/1).
  • Transparent: No hidden magic – the transformation is predictable.

  • Erlang-Friendly: Works seamlessly with existing binary/string conventions.
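
The parsing step can be approximated at runtime with a small sketch. This is not the PR's implementation: the sigil works on Erlang expressions at compile time, whereas this hypothetical f_sketch module only looks up names in a map. It assumes no nested braces inside {...} and treats \{ as an escaped literal brace.

```erlang
-module(f_sketch).
-export([split/1, render/2]).

%% Split a template into {text, Binary} and {expr, Binary} parts.
split(Template) when is_binary(Template) ->
    split(Template, <<>>, []).

split(<<>>, Text, Acc) ->
    lists:reverse(push_text(Text, Acc));
split(<<"\\{", Rest/binary>>, Text, Acc) ->
    %% Escaped brace: keep a literal "{" in the static text.
    split(Rest, <<Text/binary, "{">>, Acc);
split(<<"{", Rest/binary>>, Text, Acc) ->
    %% Simplification: the expression ends at the first "}".
    %% An unterminated "{" crashes with badmatch.
    [Expr, Rest1] = binary:split(Rest, <<"}">>),
    split(Rest1, <<>>, [{expr, Expr} | push_text(Text, Acc)]);
split(<<C, Rest/binary>>, Text, Acc) ->
    split(Rest, <<Text/binary, C>>, Acc).

%% Skip empty static segments.
push_text(<<>>, Acc) -> Acc;
push_text(Text, Acc) -> [{text, Text} | Acc].

%% Stand-in for evaluation: each "expression" is a key in Bindings
%% whose value must already be a binary.
render(Template, Bindings) ->
    iolist_to_binary([part(P, Bindings) || P <- split(Template)]).

part({text, T}, _Bindings) -> T;
part({expr, Name}, Bindings) -> maps:get(Name, Bindings).
```

Calling f_sketch:split(<<"First Name: {FirstName}\nAge: {Age}">>) yields the same static and dynamic segments listed under Transformation Steps, with the expressions kept as unevaluated binaries.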


I’m not sure this is helpful and I do believe what’s being constructed in the example is indeed a payload. There’s also a lot of value in constructing and passing bin refs around for memory efficiency purposes, which in turn should lead to less gc pressure.


Following discussions about balancing efficiency and usability with the ~f sigil, I’ve introduced s and S modifiers to enable iolist generation alongside the existing binary output. This provides more flexibility for scenarios where avoiding unnecessary concatenation is critical.

What’s New?

  • ~f, ~fb, or ~fB: Returns a flattened binary (original behavior).
  • ~fs or ~fS: Returns an iolist (nested list of binaries/strings), preserving structure for efficient I/O operations.

Example

Picking up my first example, the output using ~fs (or ~fS) is:

1> render(Bindings) ->
       ~fs"""
       <html>
         <head>
           <title>{maps:get(title, Bindings)}</title>
         </head>
       </html>
       """.
ok
2> render(#{title => ~"Example"}).
[<<"<html>\n  <head>\n    <title>">>,<<"Example">>,
 <<"</title>\n  </head>\n</html>">>]

This change is reflected in the updated proposal PR.
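
To show how the two result shapes relate, here is a small sketch using only standard BIFs. The module name iolist_sketch is hypothetical; the iolist is copied from the render/1 output above.

```erlang
-module(iolist_sketch).
-export([demo/0]).

demo() ->
    %% The iolist as returned by the ~fs version of render/1.
    IoList = [<<"<html>\n  <head>\n    <title>">>, <<"Example">>,
              <<"</title>\n  </head>\n</html>">>],
    %% Flattening copies all parts into one new binary.
    Flat = iolist_to_binary(IoList),
    %% The total size is available without flattening.
    Size = iolist_size(IoList),
    {Flat, Size, Size =:= byte_size(Flat)}.
```

The first element of the result equals what plain ~f returns for the same template. Functions that accept iodata, such as file:write/2 or gen_tcp:send/2, can take the iolist directly and skip the flattening copy.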
Feedback and use cases are welcome :smiley:


The example is definitely NOT a payload in the sense I was using the term. In fact none of the examples I have now seen in this thread can be considered as good cases for using a string, however constructed.

Q: when should I use a string?

Thank you for engaging deeply with this proposal.

Your perspective on structured data is valid and aligns with best practices for handling formats like XML/HTML/JSON.

Let me clarify the intent of the ~f sigil and address your concerns directly.

Structured Data vs. Payloads

You’re absolutely right that structured data (e.g., XML trees, JSON objects) should be represented and manipulated as native Erlang terms (tuples, maps, records) during processing. The ~f sigil is not meant to replace this approach. Instead, it targets the final serialization step where validated data is converted into a string/binary for use as a:

  • Network payload (e.g., HTTP response, MQTT message).
  • Storage payload (e.g., writing to a file or database).
  • Human-readable output (e.g., logs, CLI tools).

This aligns with your definition of strings as “payload data.”

Language Complexity

The ~f sigil is:

  • Opt-in: Teams can adopt it only where payload generation benefits outweigh manual concatenation.
  • Complementary: It doesn’t preclude using trees for structured data.
  • Minimal: Builds on EEP 64’s triple-quoted strings with a small syntax addition.

Addressing Your Example Critique

You mentioned:

“The motivating example is a truly terrible way to construct XML in Erlang.”

I agree—building XML via interpolation during manipulation would be error-prone.
However, if the XML is the final output of a validated tree, interpolation simplifies serialization:

% Assume `Doc` is a validated XML tree (e.g., via `xmerl`)  
serialize_xml(Doc) ->  
  ~f"<?xml version='1.0'?>{render_xml(Doc)}".

The ~f sigil targets payload generation, not structured data manipulation.
It aims to reduce boilerplate and errors in scenarios where strings are the endpoint, not the intermediate format.

Would love to hear your thoughts on these use cases or suggestions for refining the proposal o/

Thank you for your time to make this! I like the proposed syntax and your examples showcase great improvements in readability. The iolist generation support is also smart!

What I’m a bit confused about is the overlap with TD5’s PR “feature: String interpolation” (erlang/otp pull request #7343). This change seems to be specific to triple-quoted strings, but your later example (serialize_xml/1) uses a regular single-line string. Does it support both? In general I’m confused by the competing approaches with TD5’s PR, but I’m not too involved with how the EEP process works.

One suggestion I have is to maybe consider using {{ and }} as escaped delimiters. The reason is that if - hypothetically - we were to introduce a “raw” string sigil (which passes through all characters including \, directly) and combine it with the format string sigil, it would no longer be possible to escape the interpolation, since \{ should, I think, result in literally \{.

Hi @jchrist, thank you for your thoughtful feedback!
Let me clarify the scope of this proposal and address your points:

  1. Scope: Triple-Quoted vs. All Strings

    • Initial Title Misleading: Apologies for the confusion! The PR title “Introduce f sigils for string interpolation” reflects the broader scope. I cannot edit the title anymore. Maybe @AstonJ can fix it :slight_smile:

    • Supports All Strings: The ~f sigil works with both triple-quoted and single-line strings:

      % Triple-quoted  
      ~f"""  
      Multi-line  
      {Value}  
      """  
      
      % Single-line  
      ~f"Single-line {Value}"
      
  2. Escaping Delimiters ({{ vs. \{)

    Your suggestion to use {{/}} is interesting! Here’s why I opted for {/} with \ escaping:

    • Consistency: Matches common interpolation syntax (e.g., Elixir’s #{}, Python’s f"{}").

    • Escape Simplicity: A single \ before { avoids conflicts:

      ~f"Escaped: \{NotInterpolated}" % <<"Escaped: {NotInterpolated}">>
      
  3. Relationship with @TD5’s EEP

    • Complementary, Not Competing:

      • TD5’s EEP: Focuses on interpolation via io_lib-style formatting (e.g., ~p, ~s).
        See examples here.
      • This Proposal: Provides minimal, direct interpolation without formatting logic.
    • Why Both Can Coexist:

      • TD5’s EEP: for debug/auto-formatting.
      • This Proposal: for explicit, controlled interpolation (e.g., payloads).

Please let me know if this clarifies the proposal’s scope and tradeoffs.

Thank you for the extensive reply and clarification!

Sorry, I think I didn’t write this clearly enough. I don’t mean that {{Hello}} should be used for interpolation, but for escaping interpolation. So {Hello} would work as before for interpolation, and {{Hello}} would literally be "{Hello}". With “raw” strings, should such a thing be added, ~rf"\{Hello}" would be ambiguous (is it "\{Hello}" or "\World"?). With the doubled delimiters you would write ~rf"{{Hello}}" instead, and there would be no confusion about how the backslash ends up.

Anyways, I’m talking about a theoretical feature here, who knows if we might have that problem in the first place :slight_smile:

Instead, it targets the final serialization step where validated data is converted into a string/binary for use as a:

  • Network payload (e.g., HTTP response, MQTT message).

As someone who works with MQTT, I find parts of this claim dubious. MQTT is a terse binary protocol that is mostly used to carry binary payloads; there’s nothing you can express as a string literal in the first place. And why would I want to allocate an extra binary instead of working with iodata?


This example is a bit of a strawman, now you can do something like


render(Bindings) ->
  [<<"""
     <html>
       <head>
         <title>
   """>>,
   maps:get(title, Bindings),
   <<"""
   </title>
      </head>
   </html>
   """>>
  ].

But the core of the issue is this: do we really want to make string interpolation “ergonomic as in modern languages”, and invite little Bobby Tables to Erlang? PHP got its infamy in large part thanks to that great feature. In Erlang, the ugliness of such constructions draws necessary attention to them, which I see as a feature.

P.S.

  • Storage payload (e.g., writing to a file or database).

Please don’t! Sane SQL database clients provide API for prepared statements, they should be used for security reasons.


Ok, so maybe my examples are not that good. Sorry about that :slight_smile:

Clarifying

The proposal provides both options:

  • ~f"...": Returns a UTF-8 binary.
  • ~fs"...": Returns an iolist (list of binaries/strings).

It is opt-in and developers can choose based on their specific needs.

Examples (not that good, as always)

Using the f sigil returns a UTF-8 binary:

1> FirstName = ~"Bob",
   Age = 27,
   ~f"""
   First Name: {FirstName}
   Age: {integer_to_binary(Age)}
   """.
<<"First Name: Bob\nAge: 27">>

Using the fs returns an iolist:

1> FirstName = ~"Bob",
   Age = 27,
   ~fs"""
   First Name: {FirstName}
   Age: {integer_to_binary(Age)}
   """.
[<<"First Name: ">>,<<"Bob">>,<<"\nAge: ">>,<<"27">>]

Both work in single-line strings:

1> Name = ~"World".
2> ~f"Hello, {Name}!".
<<"Hello, World!">>
3> ~fs"Hello, {Name}!".
[<<"Hello, ">>,<<"World">>,<<"!">>]

I have no idea about the PHP syntax or the problems with it. Could you please elaborate on that part? What are the cons?

They are perfect for illustrating my point, though. Almost every problem that calls for string interpolation has potential security implications, be it HTML rendering or querying a relational database, and there’s almost always a better alternative.

PHP boasted ease of mixing HTML with the code, which quickly led to it becoming almost synonymous with malicious payload injections of all kinds.


TBH, “PHP! :scream:” was the first thing that came to my mind when I read “String Interpolation”, too…

Thanks for sharing your thoughts! IMHO, this seems less about the feature itself and more about common pitfalls in implementation practices.

Example Approaches

Based on the epgsql docs example:

epgsql:squery(C, "insert into account (name) values ('alice'), ('bob')").

  1. Using io_lib:format/2

    insert(C, Alice, Bob) ->  
        epgsql:squery(C, io_lib:format("insert into account (name) values ('~s'), ('~s')", [Alice, Bob])).
    
  2. IOLists

    insert(C, Alice, Bob) ->
        epgsql:squery(C, ["insert into account (name) values ('", Alice, "'), ('", Bob, "')"]).
    
    % Or with string concatenation
    insert(C, Alice, Bob) ->
        epgsql:squery(C, "insert into account (name) values ('" ++ Alice ++ "'), ('" ++ Bob ++ "')").
    
  3. Binaries

    insert(C, Alice, Bob) ->  
        epgsql:squery(C, <<"insert into account (name) values ('", Alice/binary, "'), ('", Bob/binary, "')">>).
    
  4. f Sigil (String Interpolation)

    insert(C, Alice, Bob) ->
        epgsql:squery(C, ~f"""
        insert into account (name) values ('{Alice}'), ('{Bob}')
        """).
    

Serious Questions

Isn’t that security concern off-topic for the feature itself?
Is string interpolation inherently risky here, or does the responsibility lie in how we educate about security (e.g., SQL injection risks)?

This discussion was triggered by your very own motivational examples.

Yes, I would say so. String interpolation encourages bad practices, both security- and performance-wise. Lack of ergonomics in this area serves as a speed bump.


Sorry, but using epgsql as an example is bad. Even leaving this example up for future readers to find may encourage bad patterns.

epgsql, like every PostgreSQL client, provides a parameterized query API.

You should never, ever, be creating SQL by hand where a parameter value is required unless you really want to fall foul of a SQL injection attack.

This is the job of the database engine.
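
A minimal sketch of the difference, using the insert statement from the examples above. The module name sql_sketch and both function names are hypothetical; safe_insert/2 assumes C is an already open epgsql connection and uses epgsql:equery/3 with a $1 placeholder.

```erlang
-module(sql_sketch).
-export([unsafe_query/1, safe_insert/2]).

%% Anti-pattern: the value is spliced straight into the SQL text,
%% so a quote inside Name changes the structure of the statement.
unsafe_query(Name) ->
    iolist_to_binary(["insert into account (name) values ('", Name, "')"]).

%% Parameterized alternative: the value travels separately from the
%% SQL text and is never parsed as SQL.
safe_insert(C, Name) ->
    epgsql:equery(C, "insert into account (name) values ($1)", [Name]).
```

With Name set to <<"x'); drop table account; --">>, unsafe_query/1 produces text that contains a second statement, regardless of whether it was assembled with io_lib:format/2, an iolist, a binary, or interpolation.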


Hi, @LeonardB!

Sorry, I was not encouraging bad patterns; I was just trying to exemplify that people have plenty of ways of doing bad things that are not related to that feature.

I admit I miserably failed in my examples. I will try to do better :slight_smile:
