-
I have installed
The Python I have in this environment is:
Now, I have a collection of Pandoc Lua filter scripts, which I want to keep in a subfolder; I also have a
Here is a test setup:
I use
Note that, to begin with, the call to the Python script is commented out. So now, if I do:
So, without the Python script, building pdf using
So, let's remove the comment character from the
... and try the build again:
So, somehow, from the
Not surprising, then, that Python reports the error "can't open file ... No such file or directory". I recall that in an earlier version of pandoc (I think it was 2.13), I had tried entering an absolute path for the Python script ("mixed" mode, with Windows drive letters but forward slashes), as found via:
... in the .yaml:
... but if I try it in the same example as above, I get again:
... so it does not work in the most recent version. But even if it did, I would not want to use it, because it then breaks my entire approach of:
Finally, if I just use the MSYS2 Unix path, that is:
... in the .yaml file, then I get a different error:
... which apparently means that the
So - how can I have a relative path reference to a Python filter script for Pandoc, in a Pandoc .yaml file?
Replies: 7 comments 1 reply
-
Do you get the same problem if you are producing html rather than pdf? That would help rule out Text.Pandoc.PDF as a culprit.
-
Thanks @jgm :
If I recall correctly, all I need to do is to change the output filename extension from
... I still get the same error for .html output:
-
It looks like the directory of the defaults file is computed this way:

takeDirectory <$> canonicalizePath defaultsFilePath

Maybe this behaves in unexpected ways on Windows -- I'm sorry, I don't have a Windows box set up to test on.
-
If you have ghc installed, you could try running ghci and then
where, for defaultsFilePath, you put (in double quotes) the path to the defaults file, just as specified on the command line.
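For readers without ghc at hand, a rough Python analogue of the Haskell expression `takeDirectory <$> canonicalizePath defaultsFilePath` may help show what is being computed. This is only a sketch: `os.path.realpath` and `os.path.dirname` stand in for `canonicalizePath` and `takeDirectory`, they are not what pandoc itself runs, and the function name `defaults_dir` and the sample path are hypothetical.

```python
import os


def defaults_dir(defaults_file_path):
    """Approximate `takeDirectory <$> canonicalizePath defaultsFilePath`.

    realpath makes the path absolute and resolves symlinks and ".."
    segments; dirname then drops the final file name component.
    """
    return os.path.dirname(os.path.realpath(defaults_file_path))


if __name__ == "__main__":
    # Hypothetical relative path, as it might be given on the command line.
    print(defaults_dir("../defaults.yaml"))
```

Running this with the same interpreter and from the same shell as the failing build would show which path style (Unix or Windows) that environment produces for the defaults file directory.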
-
Thanks @jgm :
No problem, thanks for taking a look! However, I do not think this is solely
Sure! I did not have ghc installed, but after a bit of a mess ( https://stackoverflow.com/questions/77800125/completely-avoid-use-of-appdata-when-installing-haskell-stack-on-windows ) managed to install it; here is what I got, running from the
Looks reasonable, since I'm starting to suspect that if the MSYS2 version of Python sees an argument which does not look like a Unix absolute path (i.e. it does not start with a / slash, and has a colon), then it prepends the Unix path of the current working directory to that path/argument; I'll see if I can confirm that ...
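The suspicion above can be imitated with the standard `posixpath` module, which applies POSIX path rules on any platform. This is a simulation of the suspected behavior, not a trace of what the MSYS2 Python actually does internally; the working directory below and the helper name `resolve_like_posix` are assumptions made for illustration.

```python
import posixpath


def resolve_like_posix(cwd, script_arg):
    """Resolve script_arg the way a POSIX-style path join would.

    If script_arg does not start with "/", it is treated as relative
    and appended to cwd, even when it begins with a drive letter.
    """
    if posixpath.isabs(script_arg):
        return script_arg
    return posixpath.join(cwd, script_arg)


# Hypothetical working directory, written as an MSYS2 path.
cwd = "/tmp/pandoc_test/docs/test_example"
mixed = "C:/msys64/tmp/pandoc_test/scripts/pandoc/python-test.py"

print(posixpath.isabs(mixed))          # False: no leading "/"
print(resolve_like_posix(cwd, mixed))  # cwd gets prepended to the mixed path
```

Under POSIX rules a path such as `C:/...` is just a relative path whose first component happens to be named `C:`, which would explain a "No such file or directory" error for a file that does exist.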
-
Right - I think I can confirm now, that this:
... was indeed the culprit.
Note that in MSYS2, you might end up with multiple Python installations, depending on the shell you use:
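One way to check which interpreter a given shell actually picks up is to let Python describe itself. The sketch below uses only standard-library attributes; the function name `describe_interpreter` is hypothetical. My assumption, not verified here, is that a Python built for the MSYS runtime reports POSIX-style values (`os.name == "posix"`, `posixpath`), while a MINGW64 build reports Windows-style ones (`os.name == "nt"`, `ntpath`).

```python
import os
import sys
import sysconfig


def describe_interpreter():
    """Collect a few facts that tell Python installations apart."""
    return {
        "executable": sys.executable,          # path of the running interpreter
        "os_name": os.name,                    # "posix" or "nt"
        "sys_platform": sys.platform,
        "build_platform": sysconfig.get_platform(),
        "path_module": os.path.__name__,       # "posixpath" or "ntpath"
    }


if __name__ == "__main__":
    # Write to stderr so the output is also safe inside a pandoc filter.
    for key, value in describe_interpreter().items():
        sys.stderr.write("{}: {}\n".format(key, value))
```

Running the same file from an MSYS2 shell and from a MINGW64 shell and comparing the two reports should make it clear whether both shells resolve `python` to the same installation.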
Also, my Python script in the OP was very wrong; here is the corrected script:

#! /usr/bin/env python
import subprocess
import os
import sys
import shutil
from pandocfilters import toJSONFilter, Str, Para, Image, Span, attributes

def do_process(key, value, fmt, meta):
    #sys.stderr.write(f"{key=}, {value=}, {fmt=}, {meta=}\n")
    if key == 'Span':
        sys.stderr.write("Hey, python got a span, key: {} value: {} fmt: {} meta: {}\n".format(key, value, fmt, meta))
        # see also: [Pandoc filter to convert Critic Markup to Spans (which will be converted to Tracked Changes in Docx).](https://gist.github.com/HeirOfNorton/7dbcaa7c22297fe1b303)
        retval = Span(value[0], value[1])
        #sys.stderr.write("retval {}\n".format(retval))
        return retval

if __name__ == "__main__":
    #print("main python (print)") # no printout; AND causes "Error in $: Failed reading: not a valid json value at 'mainpython(print)"!!
    sys.stderr.write("main python (stderr)\n") # works
    toJSONFilter(do_process)

So - if I use this:

pdf-engine: xelatex
# NOTE https://github.com/jgm/pandoc/issues/6152 - resource path should be colon separated?
resource-path: [".:../scripts/pandoc:${.}/../scripts/pandoc"]
filters:
- ${.}/../scripts/pandoc/span-test.lua
- ${.}/../scripts/pandoc/python-test.py

... then, if I run the
... however, if I run the
Also, I re-checked the results while specifying absolute paths for the Python script (even if it does not work for my use case):

# ...
#- ${.}/../scripts/pandoc/python-test.py
#- /c/msys64/tmp/pandoc_test/scripts/pandoc/python-test.py # works in neither MSYS2 nor MINGW64
- C:/msys64/tmp/pandoc_test/scripts/pandoc/python-test.py # works in MINGW64, does not work in MSYS2

... or in other words:
So, the general workaround for now is: if you want to run a Python filter script (specified in .yaml file via |
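The comment in the script about `print()` causing "not a valid json value" follows from how JSON filters talk to pandoc: the document arrives as JSON on stdin and must leave as JSON on stdout, so anything else printed to stdout corrupts it. Below is a minimal standard-library sketch of that contract. It is a simplified stand-in for what `pandocfilters` does, not its actual implementation; `walk`, `log_spans`, `run_filter`, and the hand-written sample document are all hypothetical.

```python
import json
import sys


def walk(node, action):
    """Recursively visit a pandoc JSON AST; swap a node when action returns a dict."""
    if isinstance(node, list):
        return [walk(item, action) for item in node]
    if isinstance(node, dict):
        if "t" in node:
            replacement = action(node["t"], node.get("c"))
            if replacement is not None:
                node = replacement
        return {key: walk(value, action) for key, value in node.items()}
    return node


def log_spans(key, value):
    """Report Span elements on stderr and leave the document unchanged."""
    if key == "Span":
        sys.stderr.write("got a Span with attributes {}\n".format(value[0]))
    return None


def run_filter(json_text):
    """Take the document as JSON text and return the filtered JSON text."""
    doc = json.loads(json_text)
    doc["blocks"] = walk(doc["blocks"], log_spans)
    return json.dumps(doc)


if __name__ == "__main__":
    # Hand-written sample document: one paragraph holding one Span.
    sample = json.dumps({
        "pandoc-api-version": [1, 23, 1],
        "meta": {},
        "blocks": [{"t": "Para", "c": [
            {"t": "Span", "c": [["", ["x"], []], [{"t": "Str", "c": "hi"}]]}
        ]}],
    })
    # Only JSON goes to stdout; diagnostics go to stderr.
    sys.stdout.write(run_filter(sample))
```

In a real filter, the sample would be replaced by `sys.stdin.read()`; the point of the sketch is that stdout carries nothing but the serialized document.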
-
Since this problem was irritating, I also found a workaround that makes a Python Pandoc filter script work in both MSYS2 and MINGW64 shells. In brief: we create a "passthrough" Lua script, which calculates the correct Python filter script path depending on the calling shell, and calls python with the right script path to process the entire Pandoc document. Since our Python script was called

function dump(o)
  if type(o) == 'table' then
    local s = '{ '
    for k,v in pairs(o) do
      if type(k) ~= 'number' then k = '"'..k..'"' end
      s = s .. '['..type(k)..'] = ' .. dump(v) .. ',\n'
    end
    return s .. '} '
  else
    return tostring(o) .. "//" .. type(o)
  end
end

function getPath(str)
  return str:match("(.*[/\\])")
end
-- https://stackoverflow.com/questions/49702708/using-a-lua-filter-how-do-you-convert-a-table-into-json-or-native-text
local utils = require 'pandoc.utils'
local current_dir=io.popen"cd":read'*l' -- SO:6032268, works in MSYS2/Windows; the cwd (where we call the pandoc command from)!
local fullpath = debug.getinfo(1,"S").source:sub(2) -- http://lua-users.org/lists/lua-l/2020-01/msg00345.html ; full path to this .lua script
print("current_dir", current_dir)
print("fullpath", fullpath)
--current_dir C:\msys64\tmp\pandoc_test\docs\test_example
--fullpath C:\msys64\tmp\pandoc_test\docs/../scripts/pandoc/python-test.lua
-- print(debug.getinfo(1,"S")) -- table: 0000018cf08bc050
--print(dump(debug.getinfo(1,"S")))
--
--{ [string] = @C:\msys64\tmp\pandoc_test\docs/../scripts/pandoc/python-test.lua//string,
--[string] = 0//number,
--[string] = 0//number,
--[string] = main//string,
--[string] = ...4\tmp\pandoc_test\docs/../scripts/pandoc/python-test.lua//string,
--}
-- https://www.tutorialspoint.com/io-popen-function-in-lua-programming
-- https://www.gammon.com.au/scripts/doc.php?lua=io.popen
-- https://www.gammon.com.au/scripts/doc.php?lua=f:read
-- apparently, cmdline gets executed by cmd.exe - even if we run pandoc from a MSYS2 (or MINGW64) bash shell!
-- so, we first want to execute cd to target directory, then just run `cd` without arguments to print out current path (in Linux, `cd` without arguments does not print anything, and instead changes current working directory to the user home directory) - and since cmd.exe executes this, we use & as command separator
local cmdline = "cd " .. getPath(fullpath) .. " & cd"
print("cmdline", cmdline)
local handle = io.popen(cmdline)
-- use "*l" to get a single line; if we use "*a" for all, we also get extra (CR)LF!
local result_winpath_scriptdir = handle:read("*l")
handle:close()
print("result_winpath_scriptdir", result_winpath_scriptdir)
-- cmdline cd C:\msys64\tmp\pandoc_test\docs/../scripts/pandoc/ & cd
-- result_winpath_scriptdir C:\msys64\tmp\pandoc_test\scripts\pandoc
-- so result_winpath_scriptdir is currently a Windows path;
-- convert it to Unix/MSYS2 one with string processing:
-- 1. replace backslash with forward slash:
local mixed_scriptdir = result_winpath_scriptdir:gsub("\\","/")
--print("unixpath", unixpath) -- " C:/msys64/tmp/pandoc_test/scripts/pandoc"
-- 2. Replace the starting "DRIVELETTER:" with "/DRIVELETTER"
-- seemingly, msys can resolve drive letters with either uppercase (e.g. `/C/`) or lowercase (e.g. `/c/`) letter; so here we can just remove the colon first:
local unixpath = mixed_scriptdir:gsub(":","")
-- ... and then just prepend a forward slash:
unixpath = "/" .. unixpath
print("unixpath", unixpath) -- "/C/msys64/tmp/pandoc_test/scripts/pandoc"
-- calculate the unix path to Python script
local unixpypath = unixpath .. "/" .. "python-test.py"
print("unixpypath", unixpypath) -- "/C/msys64/tmp/pandoc_test/scripts/pandoc/python-test.py"
local mixedpypath = mixed_scriptdir .. "/" .. "python-test.py"
print("mixedpypath", mixedpypath) -- "C:/msys64/tmp/pandoc_test/scripts/pandoc/python-test.py"
-- NOTE: the "/C/msys64/tmp/pandoc_test/scripts/pandoc/python-test.py" path works in MSYS2, but MINGW64 fails with: "python: can't open file 'C:/C/msys64/tmp/pandoc_test/scripts/pandoc/python-test.py': [Errno 2] No such file or directory";
-- so MINGW64 ends up prepending DRIVELETTER: to the Unix path;
-- therefore, determine the calling shell, and supply the python script path accordingly
-- https://stackoverflow.com/questions/55559604/how-can-i-find-out-the-version-of-msys2-from-its-bash-shell
-- `cat /proc/version` might produce:
-- MSYS_NT-10.0-19045 version 3.4.10.x86_64 (runneradmin@fv-az1495-832) (gcc version 13.2.0 (GCC) ) 2023-12-22 10:06 UTC
-- MINGW64_NT-10.0-19045 version 3.4.10.x86_64 (runneradmin@fv-az1495-832) (gcc version 13.2.0 (GCC) ) 2023-12-22 10:06 UTC
handle = io.popen("cat /proc/version")
local shell_version = handle:read("*l")
handle:close()
print("shell_version", shell_version)
-- finally, allow for calling the Python filter script on the entire Pandoc document
-- https://pandoc.org/lua-filters.html#pandoc.utils.run_json_filter
function Pandoc(blocks)
  --local proc_doc = utils.run_json_filter(blocks, unixpypath) -- Could not find executable /C/msys64/tmp/pandoc_test/scripts/pandoc/python-test.py
  --local proc_doc = utils.run_json_filter(blocks, "python " .. unixpypath) -- Could not find executable python /C/msys64/tmp/pandoc_test/scripts/pandoc/python-test.py
  local pypathargs = {unixpypath} -- for MSYS2
  if shell_version:match "^MINGW64" then -- SO:30150096
    pypathargs = {mixedpypath}
  end
  print("pypathargs", pypathargs) -- table: 000001d008ff16f0
  print( dump(pypathargs) )
  local proc_doc = utils.run_json_filter(blocks, "python", pypathargs)
  --return proc_doc.blocks -- Error running filter Pandoc expected, got Blocks
  return proc_doc
end

So, with this script, the structure in this example looks like:
Also, replace the Python filter entry in the .yaml file with the Lua one:

pdf-engine: xelatex
# NOTE https://github.com/jgm/pandoc/issues/6152 - resource path should be colon separated?
resource-path: [".:../scripts/pandoc:${.}/../scripts/pandoc"]
filters:
- ${.}/../scripts/pandoc/span-test.lua
#- ${.}/../scripts/pandoc/python-test.py
#- /c/msys64/tmp/pandoc_test/scripts/pandoc/python-test.py # works in neither MSYS2 nor MINGW64
#- C:/msys64/tmp/pandoc_test/scripts/pandoc/python-test.py # works in MINGW64, does not work in MSYS2
- ${.}/../scripts/pandoc/python-test.lua

Now, we can call
The only difference in the above output, when
Hope that was it with MSYS2/MINGW64 Python filter scripts for |
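For reference, the path juggling that the Lua passthrough script performs can be condensed into a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, and unlike the Lua version (which removes every colon) it rewrites only the leading drive letter. The sample values are the ones shown in the Lua script's comments.

```python
import re


def windows_to_mixed(win_path):
    # Backslashes become forward slashes; the drive letter is kept.
    return win_path.replace("\\", "/")


def mixed_to_unix(mixed_path):
    # A leading "X:" becomes "/X", giving an MSYS2-style path.
    return re.sub(r"^([A-Za-z]):", r"/\1", mixed_path)


def script_path_for_shell(proc_version, script_dir_win, script_name):
    """Pick the path form from the first line of `cat /proc/version`."""
    mixed = windows_to_mixed(script_dir_win) + "/" + script_name
    if proc_version.startswith("MINGW64"):
        return mixed               # MINGW64 accepted the mixed form
    return mixed_to_unix(mixed)    # MSYS2 accepted the Unix form


if __name__ == "__main__":
    script_dir = r"C:\msys64\tmp\pandoc_test\scripts\pandoc"
    for version in ("MSYS_NT-10.0-19045 version 3.4.10.x86_64",
                    "MINGW64_NT-10.0-19045 version 3.4.10.x86_64"):
        print(script_path_for_shell(version, script_dir, "python-test.py"))
```

Keeping the shell detection in one small function makes it easy to adjust if other environments turn out to need one form or the other.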