Transforming `erl_parse:abstract_form()` to `erl_syntax:syntaxTree()`

duncanatt · February 19, 2024, 6:33pm

I would like to transform an erl_parse:abstract_form() representation into an erl_syntax:syntaxTree(). Is there a straightforward way to achieve this?

My main reason for wanting this transformation is to annotate AST nodes with typing information. It seems that using functions such as erl_syntax:set_ann/2, erl_syntax:get_ann/1, etc., is the cleanest way to write/read custom annotations to/from syntaxTree() nodes. Is my understanding correct?

One worry I have is this. The docs for erl_syntax state that developers should assume nothing about the internal representation of syntaxTree(), as it could change without notice. Would this mean that pattern matching against tree nodes (like in parse transformations) is discouraged? Using the functions provided by erl_syntax to create AST nodes is sensible in my head. Yet, adopting erl_syntax:type/1, erl_syntax:function_name/1, etc., for controlling the program flow or reading node information seems tedious and inelegant by comparison to pattern matching. Is there a better solution that I have missed?

Lastly, I’ve tried adding custom annotations to abstract_form() trees using the erl_anno module in lieu of using syntaxTree() since the latter representation is very verbose. This approach is hacky at best, however. I gather from the Erlang docs and implementation of erl_anno that this is not a correct way of decorating abstract_form() nodes with custom annotations. The only annotations allowed by erl_anno are column, location, text, line, file, generated, and record. Is this a correct interpretation?

mmin · February 20, 2024, 8:41am

There is a tradeoff: pattern matching will probably result in less code, but that code will be more fragile. Using functions from erl_syntax will produce more code, but it will be more stable in the future. Use pattern matching if you want to prototype something fast, but if you know exactly what you want to do and the result has long-term value, go with the second approach.
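To make the tradeoff concrete, here is a minimal sketch that reads a function's name and arity both ways. The module name `form_access` and the helper names are made up for illustration; the erl_syntax calls are applied directly to the form returned by erl_parse, since erl_syntax accepts erl_parse trees as syntaxTree() values.

```erlang
-module(form_access).
-export([name_arity_match/1, name_arity_syntax/1, demo/0]).

%% Variant 1: pattern match on the erl_parse tuple. Short, but tied to
%% the concrete shape of the abstract format.
name_arity_match({function, _Anno, Name, Arity, _Clauses}) ->
    {Name, Arity}.

%% Variant 2: go through erl_syntax accessors only. More calls, but no
%% assumption about how the node is represented.
name_arity_syntax(Form) ->
    function = erl_syntax:type(Form),
    Name = erl_syntax:atom_value(erl_syntax:function_name(Form)),
    {Name, erl_syntax:function_arity(Form)}.

%% Parse a tiny function definition and run both variants on it.
demo() ->
    {ok, Tokens, _End} = erl_scan:string("id(X) -> X."),
    {ok, Form} = erl_parse:parse_form(Tokens),
    {name_arity_match(Form), name_arity_syntax(Form)}.
```

Both variants should agree on a form that comes straight out of erl_parse; only the second keeps working if the node has been wrapped or rebuilt by erl_syntax.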

erl_anno is restrictive here, but if you can hack it to work, why not? An anno is either just a location or a proplist, and you can insert additional props into it without breaking getters like erl_anno:text/1. I don’t know if that breaks other stuff that uses annotations from the AST. As far as I can see, erl_syntax allows custom annotations.
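For the erl_syntax route, a rough sketch of attaching a custom annotation could look like the following. The module name `ann_demo` and the `{type, integer}` term are placeholders I picked for the example, not anything erl_syntax prescribes. No explicit conversion step is used, because erl_syntax functions take erl_parse trees as they are.

```erlang
-module(ann_demo).
-export([demo/0]).

demo() ->
    %% Parse a single expression into its erl_parse representation.
    {ok, Tokens, _End} = erl_scan:string("X + 1."),
    {ok, [Expr]} = erl_parse:parse_exprs(Tokens),
    %% A plain erl_parse node carries no user annotations.
    [] = erl_syntax:get_ann(Expr),
    %% Attach a custom annotation; the result is a syntaxTree() node.
    Typed = erl_syntax:add_ann({type, integer}, Expr),
    %% The node type is still reported through the erl_syntax API.
    infix_expr = erl_syntax:type(Typed),
    %% Read the annotation list back.
    erl_syntax:get_ann(Typed).
```

The annotated node is no longer a bare erl_parse tuple, so code that touches it afterwards should go through erl_syntax accessors rather than pattern matching.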

duncanatt · February 20, 2024, 12:34pm

Thanks for your suggestion. I think what you say regarding prototyping vs. long-term value makes sense. I will probably follow this approach.

I agree with you that erl_anno is a location or a list of key-value tuples. In fact, I’ve inserted such tuples into annotations without problems. I can confirm that it does not disrupt getters such as erl_anno:text/1. I do not know whether it breaks other things in modules that rely on annotations in the AST. I’m hoping that such modules manipulate annotations exclusively via erl_anno, which should not cause things to break.

Now, what I meant by hacking is this. For an arbitrary list of key-value tuples, erl_anno:is_anno/1 returns false. It only returns true when every key is one of the annotations I mentioned in the original post. This makes my current approach not purely ‘correct’. That being said, I’m using this workaround at the moment.
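A small sketch of the workaround and of the is_anno/1 behaviour described above. The module name `anno_hack` and the `my_type` key are invented for the example. Note that prepending to the annotation as if it were a list relies on erl_anno's current internal representation, which is exactly the hacky part.

```erlang
-module(anno_hack).
-export([demo/0]).

demo() ->
    %% Build a regular annotation through the erl_anno API. Setting a
    %% file on a bare line number yields the list form of the anno.
    Anno = erl_anno:set_file("demo.erl", erl_anno:new(7)),
    %% The hack: prepend a custom key-value tuple, treating the anno
    %% as a plain list. This depends on erl_anno internals.
    Custom = [{my_type, integer} | Anno],
    #{file => erl_anno:file(Custom),          % getter still works
      line => erl_anno:line(Custom),          % getter still works
      plain_ok => erl_anno:is_anno(Anno),     % well-formed anno
      custom_ok => erl_anno:is_anno(Custom)}. % rejected: unknown key
```

In this sketch the getters keep returning the file and line, while is_anno/1 accepts the plain annotation and rejects the one carrying the custom key.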

mmin · February 20, 2024, 1:00pm

FWIW, outside tests, erl_anno:is_anno/1 is used only 3 times in the whole of OTP (in merl_transform, erl_eval and erl2html2), and in none of them does a false result lead to an error.

duncanatt · February 20, 2024, 5:02pm

Thanks for pointing that out. I appreciate your point of view and will follow the pragmatic approach you suggest.