Releases: raj-open/phpytex
Bugfix + Improved Python Tokenisation
This release fixes a bug caused by a non-robust use of the python tokeniser.
Along the way, we significantly improved our use of the tokeniser and added unit and case tests to back up this behaviour.
Explanation of bug
As part of the transpilation process, interrupted code blocks such as the one in the following example are parsed:
% file: root.tex
Begin list
<<< python
for thing, count in [
    ("ant", 3),
    ("cat", 2),
    (None, 5),
    ("elephant", 4),
]:
    if thing is not None:
>>>
- <<< thing; >>> x <<< count; >>>
<<< escape_once; >>>
<<< escape_once; >>>
End of list
When doing so, it is important to compute the final indentation level so that the interrupted parts are indented correctly within the python code.
Before the bugfix, the above would result in
# file: phpytex_transpiled.py
# generate content from file 'root.tex'
def ____phpytex_generate_file_0():
global __ROOT__
global __DIR__
global __FNAME__
global __ANON__
global __HIDE__
global __IGNORE__
global __MARGIN__
__ROOT__ = '.'
__DIR__ = '.'
__FNAME__ = 'root.tex'
__IGNORE__ = False
__ANON__ = False
__HIDE__ = False
__MARGIN__ = ''
# Save current state locally. Use to restore state after importing subfiles.
__STATE__ = (__ROOT__, __DIR__, __FNAME__, __ANON__, __HIDE__, __IGNORE__)
for thing, count in [
("ant", 3),
("cat", 2),
(None, 5),
("elephant", 4),
]:
if thing is not None:
__MARGIN__ = ''
____print('''- {subst_0} x {subst_1}'''.format(
subst_0 = thing,
subst_1 = count,
), anon=False, hide=False, align=True)
pass
pass
____print(''''''.format(), anon=False, hide=False, align=True)
____print('', anon=False, hide=False, align=True)
return
which was obviously badly indented.
This in turn was down to the final indentation level being incorrectly computed.
Explanation of Fix
Our previous use of the native python tokeniser was non-robust and relied flimsily on the contents of the tokens, rather than their types, in order to guess indentation levels.
By studying the tokeniser, one can observe the following behaviour:
- it groups lines of code in sections, marked off by an ENDMARKER token.
- within a section, before this token is reached, a final NEWLINE or NL token is generated, followed by a series of DEDENT tokens and nothing else.
I.e. the sequence of tokens in a section is always of the form:
Tok1-Tok2-...-TokN-NEWLINE/NL-DEDENT-DEDENT-...-DEDENT-ENDMARKER
The number of DEDENT in this "gene sequence" always reliably reflects the final indentation of the last line (even if comments are used). We further add 1, if the last non-comment token before the NEWLINE/NL token is an OP-token of type :.
This provides a reliable and robust means to compute final indentations (and in the process fixes the above bug).
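The counting rule above can be sketched with the standard library's tokenize module. This is a minimal illustration under stated assumptions, not the code used in phpytex: the function name final_indentation is hypothetical, and the input is assumed to contain no unclosed brackets (the tokeniser raises an error on those).

```python
import io
import tokenize


def final_indentation(code: str) -> int:
    """Hypothetical helper: indentation level expected after `code`."""
    # Make sure the last line is terminated, so that a final NEWLINE/NL exists.
    if not code.endswith("\n"):
        code += "\n"
    tokens = list(tokenize.generate_tokens(io.StringIO(code).readline))

    # Drop the ENDMARKER, then count the trailing run of DEDENT tokens.
    body = [tok for tok in tokens if tok.type != tokenize.ENDMARKER]
    level = 0
    while body and body[-1].type == tokenize.DEDENT:
        body.pop()
        level += 1

    # Skip line endings and comments to reach the last "real" token.
    skipped = (tokenize.NEWLINE, tokenize.NL, tokenize.COMMENT)
    while body and body[-1].type in skipped:
        body.pop()

    # A trailing colon opens a new block, so the next line sits one level deeper.
    if body and body[-1].type == tokenize.OP and body[-1].string == ":":
        level += 1
    return level


interrupted = "for thing in things:\n    if thing is not None:\n"
print(final_indentation(interrupted))
```

Only token types are inspected here, apart from the single check for the colon, which mirrors the idea of relying on types rather than on token contents.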
Upgraded package manager + linter
In this version we upgraded the handling of dependencies in the repository:
- instead of poetry as package manager, we now use uv;
- instead of black as linter, we now use ruff (from the same developers as uv).
These tools are used by widely used projects, such as polars, pandas, pytorch, etc. and significantly speed up build times.
In this release, the code itself has been refactored in various ways via the use of cleaner internal models, more use of pydantic, better handling of byte-streams, etc.
Note
A full refactorisation is still in the works.
Bugfix - Empty arrays
Corrected (un)parser for empty array.
Previously this was
x = (,) # for empty tuples
x = [,] # for empty arrays
x = {,} # for empty dicts
now
x = () # for empty tuples
x = [] # for empty arrays
x = {} # for empty dicts
A trailing comma is only used if the number of elements is ≥ 1.
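As an illustration of the trailing-comma rule, here is a minimal sketch of such an unparser. The function name unparse_collection is hypothetical and the formatting details are assumptions; the actual unparse method in the repository may differ.

```python
def unparse_collection(values) -> str:
    """Hypothetical helper: render a tuple, list or dict as python source text."""
    if isinstance(values, dict):
        left, right = "{", "}"
        items = [f"{key!r}: {value!r}" for key, value in values.items()]
    elif isinstance(values, tuple):
        left, right = "(", ")"
        items = [repr(value) for value in values]
    elif isinstance(values, list):
        left, right = "[", "]"
        items = [repr(value) for value in values]
    else:
        raise TypeError(f"unsupported type: {type(values).__name__}")

    # Empty collections get no comma at all, which avoids invalid code like "(,)".
    if not items:
        return left + right
    # With at least one element a trailing comma is always safe,
    # and for one-element tuples it is required.
    return left + ", ".join(items) + "," + right


print(unparse_collection(()))
print(unparse_collection((1,)))
print(unparse_collection({"a": 1}))
```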
Allow `kebab-case` in user config -> convert to `snake_case` in code
Users can now use hyphenated keys in the user config (.phpytex.yaml). When parsed, these keys are converted to snake_case so that they can be used in code. E.g.
# .phpytex.yaml
...
parameters:
file: src.params
overwrite: ...
options:
...
author-name: "Maximilian Ulysses"
author-surname: "Mustermann"
page_language-list:
- en-GB
- de-DE
- fr-FR
...
...
transpiles to
# src/params.py
...
author_name = "Maximilian Ulysses"
author_surname = "Mustermann"
page_language_list = [ "en-GB", "de-DE", "fr-FR" ]
...
Bugfixes + Custom Python Path
Various bugfixes:
- corrected unparse method (python values to text for python code)
- allow user to now choose a custom path to python to execute the transpiled code. This can either be set in the compile > options section of .phpytex.yaml:
  compile:
    options:
      root: root.tex
      output: main.tex
      ...
      python-path: "Path/To/My Project/.venv/bin/python3"
      ...
  or on-the-fly via the CLI option --compile:
  phpytex run TRANSPILE --compile "{\"python-path\": \"Path/To/My Project/.venv/bin/python3\"}"
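Because the value of --compile is a JSON string that itself contains quotes and a path with a space, hand-written escaping is easy to get wrong. The following sketch builds the command from Python instead. It assumes only the command shape shown above; the helper name build_command is hypothetical.

```python
import json
import shlex


def build_command(python_path: str) -> list:
    """Hypothetical helper: assemble the phpytex call with a custom python path."""
    # json.dumps takes care of quoting inside the JSON payload.
    options = {"python-path": python_path}
    return ["phpytex", "run", "TRANSPILE", "--compile", json.dumps(options)]


cmd = build_command("Path/To/My Project/.venv/bin/python3")
# shlex.join quotes each argument for a POSIX shell, so the line can be pasted as-is.
print(shlex.join(cmd))
```

Passing the list directly to subprocess.run avoids shell quoting altogether.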
We also removed support for binary artefacts
- The binary artefact is not a self-contained application. It uses the external python distribution as opposed to a venv (for practicality, otherwise the binary would be too large).
- Therefore we shall stop supporting this option and instead aim towards a proper package (python module) solution in future.
To use the tool, run just setup, set the .env file, then run
just deploy
which creates an open source copy of the repo under the current version and links a bash script command to the execution commands found in the justfile.
Initial Release of Refactored Repo
V2 of the repository became quite entangled, hence a V3 of the repository (the present one) has been established. Much of the functionality is as in V2.