Compiler warning for infinite loops?

eproxus · March 21, 2023, 2:33pm

I just had a bug of the form:

get(Path) -> get(Path).

get(Path, Headers) -> request(<<"GET">>, Path, Headers).

This took a looong while to find, because the app didn’t start properly but no errors showed up in the logs, just a bunch of timeouts down the line (because things that should have happened after calling get(Path) didn’t happen).

Would it make sense to have a compiler warning for infinite loops that directly call its own function? I can not figure out a real use case for ever writing such a function (apart from consuming CPU) so a warning should be fine?

elbrujohalcon · March 22, 2023, 10:02am

Of course, it should be smart enough not to emit a warning in the following situation, right?

loop(State) ->
  NewState = receive
    Msg -> handle(Msg)
  end,
  loop(NewState).

I mean… Eternal loops are the basis of most Erlang processes. But if there is a way to distinguish one kind of eternal loop from the other, maybe a warning would be good.

eproxus · March 22, 2023, 11:09am

I want a warning for the very obvious, basic case of a function directly calling itself without any local or global side effects. At least that’s a start, if more advanced cases can be found that’s of course also good

bradhanks · March 22, 2023, 11:18pm

I don’t think you get OTP without infinite loops.

eproxus · March 23, 2023, 9:01am

Perhaps I didn’t explain clearly enough. I’m specifically looking for a warning that checks if a function calls itself directly. E.g. the following assembler code should trigger it:

{function, my_func, 1, 2}.
  {label,1}.
    {func_info,{atom,my_module},{atom,my_func},1}.
  {label,2}.
    {call_only,1,{f,2}}.

As seen when compiling this module:

-module(my_module).
-export([my_func/1]).

my_func(Foo) -> my_func(Foo).

jimdigriz · March 23, 2023, 1:13pm

Maybe look to process_info(Pid, [reductions,current_function]), dbg tracing and etop to solve this next time?

Using both of these would have highlighted the infinite loop pretty much instantly; fast increasing reduction count and never leaving that function. They also help catching a whole bunch of the problems the compiler will never be able to highlight (which I think as others have pointed out include this one).

Personally, I think we all have been where you were in the original bug hunt but use that frustration as an excuse to learn other tools.

eproxus · March 23, 2023, 1:32pm

I mean, we found the bug eventually and I know how to use all those tools. I just think the compiler could have saved me a lot of time getting there, because I don’t agree about the “pretty much instantly” part

Not sure what everyone has against the idea about the compiler warning…

asabil · March 23, 2023, 1:33pm

I think having such a compiler warning would have saved me countless hours of debugging. I have been bitten by this exact same issue so many times…

eproxus · March 23, 2023, 1:37pm

Agree. I’d rather spend my time building cool stuff than “re-learning” how to use the debugger again.

That being said, I think the proposed warning would be of similar value as the this clause cannot match because a previous clause at line X always matches and I personally still can’t find a reason not to have it.

franbrau · April 6, 2023, 3:20am

OCaml does that

nzok · April 6, 2023, 7:19am

There has actually been a lot of work over the years on termination analysis.
If memory serves me correctly, in order to define a function at all in the
Boyer-Moore theorem prover (back in the 1970s) it had to be (made) obvious
to the verifier that the function would terminate. Typically, there had to
be a measure (by default, size) on the arguments such that any recursive call
was ‘simpler’.

An obvious violation of this is when the arguments of a recursive call are
identical to the arguments in the head.

It’s true that you don’t get OTP without infinite loops, but
loop(State) → … loop(NewState)
normally doesn’t pass the same state in the recursive call.

It’s good to get to know a wide range of debugging tools, but the sooner any
bug is found, the better for everyone.

The following AWK script might be a useful approximation of this bug check:
/^[a-zA-Z.@][a-zA-Z0-9.@][(].[)] ->/ {
head = $0
sub(/ ->.$/, “”, head)
gsub(/ /, “”, head)
defn = FNR
}
match($0, /[^a-zA-Z0-9.@][a-zA-Z.@][a-zA-Z0-9.@_][(].[)][,;. ](%.)?$/) {
x = substr($0, RSTART+1, RLENGTH-1)
sub(/[,;. ](%.)?$/, “”, x)
gsub(/ /, “”, x)
if (x == head) {
print FILENAME “: possible infinite loop”, defn, FNR
}
}

It’s not perfect, but it was good enough to experiment with and look for
false positives. server_loop/2 in zip.erl of stdlib has a nice set of
false positives. And of course it is confused by ‘when’.

Here’s an interesting case.

stat_loop(M, Old) →
timer:sleep(2000),
case statistics(run_queue) of
Old →
stat_loop(M, Old);
NewLoad →
M ! {node(), load, NewLoad}, %% async
stat_loop(M, NewLoad)
end.

This is an infinite recursion with an unchanged state.
It’s basically spinning until the run queue changes.
What I find interesting here is that a quick debugging hack
has drawn my attention to code that is dubious for another
reason: a magic delay. Why 2000? Why not 200 or 20000?
Is this a good pick in the majority of cases? It surely
deserves a comment.

A slightly more sophisticated version of this – based on
the abstract syntax after macro processing – looks as though
it might be useful.

jimdigriz · April 6, 2023, 8:17am

I would not be surprised if north of 75% of loops do maintain the same state for OTP applications.

Thinking here gen_{server,statem,...} where you field requests for resources the process maintains.

Thinking of even a slightly complex example.

You have a credential cache process[1] (eg. access and refresh tokens via OAuth2) and it receives requests to use those credentials in some manner.

You are stashing this in a process as maybe you want to treat them as sensitive and not make them available to everything (ie. poormans secure enclave) as you would if you used ets or something similar.

For the first request, you get a fresh pair of credentials that are good for an hour, and then every request afterwards, there is no need to change the state.

Infinite loops be here.

Maybe it is just the way I write my process code but I tend to not have much churn on the actual state. No particular reason I can think of for why I do that, just that is how I build them out.