Releases: raj-open/phpytex

Bugfix + Improved Python Tokenisation

09 Jan 23:06
e0b32ab

Fixed a bug caused by a non-robust use of the python tokeniser.
In the process, we significantly improved how we use the python tokeniser and added unit and case tests to back up the new behaviour.

Explanation of bug

As part of the transpilation process, interrupted code blocks such as the one in the following example are parsed:

% file: root.tex
Begin list

<<< python
for thing, count in [
    ("ant", 3),
    ("cat", 2),
    (None, 5),
    ("elephant", 4),
]:
    if thing is not None:
>>>
- <<< thing; >>> x <<< count; >>>
<<< escape_once; >>>
<<< escape_once; >>>

End of list

When doing so, it is important to compute the final indentation level of each python block, so that the interrupted parts are indented correctly within the python code.

Before the bugfix, the above would result in

# file: phpytex_transpiled.py

# generate content from file 'root.tex'
def ____phpytex_generate_file_0():
    global __ROOT__
    global __DIR__
    global __FNAME__
    global __ANON__
    global __HIDE__
    global __IGNORE__
    global __MARGIN__
    __ROOT__ = '.'
    __DIR__ = '.'
    __FNAME__ = 'root.tex'
    __IGNORE__ = False
    __ANON__ = False
    __HIDE__ = False
    __MARGIN__ = ''
    # Save current state locally. Use to restore state after importing subfiles.
    __STATE__ = (__ROOT__, __DIR__, __FNAME__, __ANON__, __HIDE__, __IGNORE__)
    for thing, count in [
        ("ant", 3),
        ("cat", 2),
        (None, 5),
        ("elephant", 4),
    ]:
        if thing is not None:
        __MARGIN__ = ''
        ____print('''- {subst_0} x {subst_1}'''.format(
            subst_0 = thing,
            subst_1 = count,
        ), anon=False, hide=False, align=True)
    pass
    pass
    ____print(''''''.format(), anon=False, hide=False, align=True)
    ____print('', anon=False, hide=False, align=True)
    return

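which is badly indented: the lines following `if thing is not None:` sit at the same level as the `if` itself, instead of one level deeper.

This in turn was caused by the final indentation level being computed incorrectly.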

Explanation of Fix

Our previous use of the native python tokeniser was not robust: it relied on the contents of the tokens, rather than their types, to guess indentation levels.

By studying the tokeniser, one can observe the following behaviour:

  • it groups lines of code in sections, marked off by an ENDMARKER token.
  • within a section, before this token is reached, a final NEWLINE or NL token is generated,
    followed by a series of DEDENT tokens and nothing else.

I.e. the sequence of tokens in a section is always of the form:

Tok1-Tok2-...-TokN-NEWLINE/NL-DEDENT-DEDENT-...-DEDENT-ENDMARKER

The number of DEDENT tokens in this "gene sequence" reliably reflects the final indentation of the last line (even if comments are used). We further add 1 if the last non-comment token before the NEWLINE/NL token is an OP token with value `:`.

This provides a reliable and robust means to compute final indentations (and in the process fixes the above bug).
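The counting rule described above can be sketched with the standard library `tokenize` module. This is only a minimal illustration, not the implementation used in phpytex: the function name `final_indent_level` is hypothetical, and the sketch assumes the source tokenises without error (e.g. no unclosed brackets or strings).

```python
import io
import tokenize


def final_indent_level(source: str) -> int:
    """Hypothetical helper: indentation level expected after `source`."""
    tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
    i = len(tokens) - 1

    # Step over the closing ENDMARKER token.
    if i >= 0 and tokens[i].type == tokenize.ENDMARKER:
        i -= 1

    # Count the trailing DEDENT tokens: this is the level of the last line.
    level = 0
    while i >= 0 and tokens[i].type == tokenize.DEDENT:
        level += 1
        i -= 1

    # Skip the final NEWLINE/NL and any comments to find the last real token.
    skipped = {tokenize.NEWLINE, tokenize.NL, tokenize.COMMENT}
    while i >= 0 and tokens[i].type in skipped:
        i -= 1

    # A trailing ':' opens a new block, so the next line sits one level deeper.
    if i >= 0 and tokens[i].type == tokenize.OP and tokens[i].string == ":":
        level += 1
    return level


example = (
    "for thing, count in [\n"
    '    ("ant", 3),\n'
    "    (None, 5),\n"
    "]:\n"
    "    if thing is not None:\n"
)
print(final_indent_level(example))
```

The decision is based on token types only (DEDENT, NEWLINE/NL, COMMENT, OP), which is the point of the fix: the text of the tokens is inspected solely to check for the `:` operator.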

Upgraded package manager + linter

09 Jan 00:26
3402139

In this version we upgraded the handling of dependencies in the repository:

  • instead of poetry as package manager, we now use uv;
  • instead of black as linter, we now use ruff (from the same developers as uv).

These tools are used by widely used projects such as polars, pandas and pytorch, and significantly speed up build times.

In this release, the code itself has also been refactored in various ways: cleaner internal models, more use of pydantic, better handling of byte-streams, etc.

Note

A full refactoring is still in progress.

Bugfix - Empty arrays

29 Jul 10:52
e3a149f

Corrected the (un)parser for empty arrays, tuples and dicts.
Previously the output was

x = (,) # for empty tuples
x = [,] # for empty arrays
x = {,} # for empty dicts

now

x = () # for empty tuples
x = [] # for empty arrays
x = {} # for empty dicts

A trailing comma is now only emitted if the number of elements is ≥ 1.
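As an illustration of the trailing-comma rule, here is a minimal sketch of an unparser for nested containers. The function names `unparse` and `join_items` are hypothetical and this is not the code used in phpytex; scalars are simply rendered with `repr`.

```python
from typing import Any


def join_items(parts: list[str]) -> str:
    """Join items with ', ' and add a trailing comma only if non-empty."""
    return (", ".join(parts) + ",") if parts else ""


def unparse(value: Any) -> str:
    """Hypothetical helper: render a python value as python source text."""
    if isinstance(value, dict):
        parts = [f"{unparse(k)}: {unparse(v)}" for k, v in value.items()]
        return "{" + join_items(parts) + "}"
    if isinstance(value, tuple):
        return "(" + join_items([unparse(v) for v in value]) + ")"
    if isinstance(value, list):
        return "[" + join_items([unparse(v) for v in value]) + "]"
    # Assumption: everything else has a repr that is valid python source.
    return repr(value)


print(unparse(()))
print(unparse([]))
print(unparse({}))
print(unparse((1,)))
```

Emitting the trailing comma whenever there is at least one element keeps single-element tuples such as `(1,)` valid without a special case.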

Allow `kebab-case` in user config -> convert to `snake_case` in code

28 Apr 20:52
ccf65a3

Users can now use hyphenated keys in the user config (.phpytex.yaml). When parsed, these keys are converted to snake_case so that they can be used as identifiers in code. E.g.

# .phpytex.yaml
...
parameters:
  file: src.params
  overwrite: ...
  options:
    ...
    author-name: "Maximilian Ulysses"
    author-surname: "Mustermann"
    page_language-list:
      - en-GB
      - de-DE
      - fr-FR
    ...
...

transpiles to

# src/params.py
...
author_name = "Maximilian Ulysses"
author_surname = "Mustermann"
page_language_list = [ "en-GB", "de-DE", "fr-FR" ]
...
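The key conversion can be pictured with the following sketch. It assumes the config has already been loaded into plain dicts and lists; the function name `to_snake_keys` is hypothetical and does not come from the phpytex code base. Only keys are rewritten, so values such as "en-GB" keep their hyphens.

```python
from typing import Any


def to_snake_keys(value: Any) -> Any:
    """Hypothetical helper: recursively replace '-' by '_' in dict keys."""
    if isinstance(value, dict):
        return {
            str(key).replace("-", "_"): to_snake_keys(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [to_snake_keys(item) for item in value]
    # Scalars (including hyphenated strings) are left untouched.
    return value


options = {
    "author-name": "Maximilian Ulysses",
    "author-surname": "Mustermann",
    "page_language-list": ["en-GB", "de-DE", "fr-FR"],
}
cleaned = to_snake_keys(options)
print(cleaned)
```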

Bugfixes + Custom Python Path

28 Apr 16:09
f8e709e

Various bugfixes:

  • corrected unparse method (python values to text for python code)

  • users can now choose a custom path to the python executable that runs the transpiled code. This can either be set in the compile > options section of .phpytex.yaml:

    compile:
      options:
        root: root.tex
        output: main.tex
        ...
        python-path: "Path/To/My Project/.venv/bin/python3"
        ...

    or on-the-fly via the CLI option --compile, which takes a JSON string (note the escaped inner quotes):

    phpytex run TRANSPILE --compile "{\"python-path\": \"Path/To/My Project/.venv/bin/python3\"}"
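Escaping JSON by hand inside shell quotes is error-prone. The sketch below builds the same command line with the standard library, so that quoting is handled automatically. The helper `build_command` is hypothetical and not part of phpytex; it only assembles the command string shown above and does not run it.

```python
import json
import shlex


def build_command(python_path: str) -> str:
    """Hypothetical helper: assemble the phpytex call with a quoted JSON option."""
    options = {"python-path": python_path}
    argv = ["phpytex", "run", "TRANSPILE", "--compile", json.dumps(options)]
    # shlex.join quotes each argument so that a POSIX shell sees it unchanged.
    return shlex.join(argv)


command = build_command("Path/To/My Project/.venv/bin/python3")
print(command)
```

The printed string can be pasted into a POSIX shell; the space in the path survives because the whole JSON argument is quoted as one unit.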

We also removed support for binary artefacts

  • The binary artefact is not a self-contained application. It uses the external python distribution as opposed to a venv (for practicality, otherwise the binary would be too large).

  • Therefore we shall stop supporting this option and instead aim towards a proper package (python module) solution in future.

To use the tool, run just setup, set the .env file, then run

just deploy

which creates an open source copy of the repo under the current version and links a bash script command to the execution commands found in the justfile.

Initial Release of Refactored Repo

28 Apr 14:12
72fcd43

V2 of the repository became quite entangled, hence a V3 of the repository (the present one) has been established. Much of the functionality is as in V2.