Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OMJulia not responding with long debug message #59

Open
AnHeuermann opened this issue Jan 6, 2022 · 7 comments
Open

OMJulia not responding with long debug message #59

AnHeuermann opened this issue Jan 6, 2022 · 7 comments
Assignees

Comments

@AnHeuermann
Copy link
Member

When I run the following script

loadModel(Modelica, {"4.0.0"});
getErrorString();

setCommandLineOptions("-d=newinst,backenddaeinfo,dumpSimCode");
getErrorString();

simulate(Modelica.Mechanics.MultiBody.Examples.Systems.RobotR3.FullRobot);
getErrorString();

with omc the total time is around 8 seconds.

When I convert the script to be used in OMJulia it will not finish (at least in an amount I'm willing to wait).

using OMJulia
omc = OMJulia.OMCSession("path/to/omc");
OMJulia.sendExpression(omc, "setCommandLineOptions(\"-d=newInst,dumpSimCode\")")
OMJulia.sendExpression(omc, "loadModel(Modelica)")
msg = OMJulia.sendExpression(omc, "simulate(Modelica.Mechanics.MultiBody.Examples.Systems.RobotR3.FullRobot)");

When I cancle the last command I get

julia> msg = OMJulia.sendExpression(omc, "simulate(Modelica.Mechanics.MultiBody.Examples.Systems.RobotR3.FullRobot)");
^CERROR: InterruptException:
Stacktrace:
 [1] poptask(W::Base.InvasiveLinkedListSynchronized{Task})
   @ Base ./task.jl:827
 [2] wait()
   @ Base ./task.jl:836
 [3] wait(c::Base.GenericCondition{Base.Threads.SpinLock})
   @ Base ./condition.jl:123
 [4] wait(fdw::FileWatching._FDWatcher; readable::Bool, writable::Bool)
   @ FileWatching /opt/julia/julia-1.7.1/share/julia/stdlib/v1.7/FileWatching/src/FileWatching.jl:533
 [5] wait
   @ ~/.julia/packages/ZMQ/R3wSD/src/socket.jl:52 [inlined]
 [6] _recv!(socket::ZMQ.Socket, zmsg::ZMQ.Message)
   @ ZMQ ~/.julia/packages/ZMQ/R3wSD/src/comm.jl:75
 [7] recv
   @ ~/.julia/packages/ZMQ/R3wSD/src/comm.jl:94 [inlined]
 [8] sendExpression(omc::OMJulia.OMCSession, expr::String)
   @ OMJulia ~/.julia/packages/OMJulia/ZLXEs/src/OMJulia.jl:1014
 [9] top-level scope
   @ REPL[7]:1

When I remove dumpSimCode OMJulia simulates fine in around 7 seconds.

My guess is that the debug log is to large to be handled.

@arun3688
Copy link
Contributor

arun3688 commented Jan 7, 2022

@AnHeuermann I could also reproduce the problem and when i remove the dumpSimCode it works fine, As you said the debug log is too large, when i tried with small example it worked fine

julia> OMJulia.sendExpression(omc, "setCommandLineOptions(\"-d=-newInst,dumpSimCode\")")
true
julia> OMJulia.sendExpression(omc, "model a parameter Real x =1; end a;")

julia> OMJulia.sendExpression(omc, "simulate(a)")
Dict{String, Any} with 10 entries:
  "timeCompile"       => 7.54703
  "simulationOptions" => "startTime = 0.0, stopTime = 1.0, numberOfIntervals = 500, tolerance = 1e-06, method = 'dassl', fileNamePrefix = 'a', options = '', outputFormat = 'mat', variableFil…
  "messages"          => "LOG_SUCCESS       | info    | The initialization finished successfully without homotopy method.\nLOG_SUCCESS       | info    | The simulation finished successfully.…
  "timeFrontend"      => 0.0074089
  "timeTotal"         => 7.72618
  "timeTemplates"     => 0.028673
  "timeSimulation"    => 0.135537
  "resultFile"        => "C:/Users/arupa54/a_res.mat"
  "timeSimCode"       => 0.0007035
  "timeBackend"       => 0.0063886

but i could not see the debug log generated from dumpSimCode

@AnHeuermann
Copy link
Member Author

The message should be in "messages" of the Dict. With smaller debug flags this does work, for example backenddaeinfo worked.

@casella
Copy link
Contributor

casella commented Jan 7, 2022

My guess is that the debug log is to large to be handled.

As a general remark, once a simulation tool has been refined enough to be actually usable for real-life work, small problems work fine, while trouble can arise with large, complicated models, which inevitably generate humongous log files. Sizes of serveral hundreds MB, or even over the GB size, are not unheard of, I had to handle some of them in the past when I had to turn on LOG_NLS_V to figure out what was going wrong in serious simulation problems that were failing for some reason.

The most frustrating situation for an end-user (I say this out of personal experience) is that you can't get this information when you need it most, i.e. when you are trying to get difficult real-life problems to work.

When you have such large log files, there are two potential issues:

  • if you keep this information in memory, you may run out of memory and crash the entire OS, or freeze it almost to death by disk swapping, which is not nice
  • if this information is then parsed by GUIs or other structured environments (e.g. OMEdit), they may not be able to scale well to the required size, which is not nice either, if your precious simulation that took 30 minutes to run eventually causes OMEdit or OMJulia to hang or crash, leaving you without a result and without a clue.

As I understand, the only safe scalable way to handle log files is to write them directly to mass storage, and then, only when this is possible (e.g. if the size is below a certain threshold), to read them back into memory for further fancy post-processing. I understand this is what OMEdit does with the C runtime: if the log exceeds 10 MB size, OMEdit doesn't even try to load it in the simulation output window, which is simply not designed to handle such large files effectively. Then one can use other tools, for example I use jEdit (which has very handy grep-like search features) for file size up to a few hundred MB, and PilotEdit, which does not load the entire file in memory, for larger files. One could also write small python or julia scripts to look for specific features into those files, grep them, or whatever.

Bottom line: the debug log handling has to be designed with large files in mind from the start! This is not an advanced feature, it is a basic fundamental one. Implementing a log file handling mechanism that relies on data structures in RAM, string arrays, or stuff like that, only works for toy models, but is inevitably bound to fail when it is needed most, so it is not a sound design and should be avoided.

@JKRT, what do you think?

@casella
Copy link
Contributor

casella commented Jan 7, 2022

The message should be in "messages" of the Dict. With smaller debug flags this does work, for example backenddaeinfo worked.

That's exactly what I meant as a questionable design 😃. Guaranteed not to work when you really need it.

I guess this can easily be improved by only using this data structure for sizes up to a certain limit, adding a reference to a file on mass storage if that size is exceeded.

@JKRT
Copy link
Member

JKRT commented Jan 7, 2022

@casella

Some bullets. You are right that you might sometimes need to write to file, since in some cases available memory is not enough, and swapping might result in the simulation taking an even longer time to complete. However, even writing to file is not 100% bulletproof. I remember one instance where I managed to overwrite my hard drive when experimenting with the logfiles in the scaleable test suite (For such extreme cases a cloud solution might be needed, or storing the log file in some compressed format.)

Julia has builtin support for logging
I guess some heuristic could be implemented to log messages of a certain size say 10 >=MiB or tie it to the size of the model somehow.

Concerning this particular issue, I guess the issue is twofold. One being with the receive method the other concerns that this particular flag does not save any message at all.

function sendExpression(omc, expr; parsed=true)
    if (process_running(omc.omcprocess))
        ZMQ.send(omc.socket, expr)
        message = ZMQ.recv(omc.socket)
        if parsed
            return Parser.parseOM(unsafe_string(message))
        else
            return unsafe_string(message)
        end
    else
        return "Process Exited, No connection with OMC. Create a new instance of OMCSession"
    end
end

The send(socket, x) and recv(socket, SomeType) functions make an extra copy of the data when converting between ZMQ and Julia. Alternatively, for large data sets (e.g. very large arrays or long strings), it can be preferable to share data, with send(socket, Message(x)) and msg = recv(Message), where the msg::Message object acts like an array of bytes; this involves some overhead so it may not be optimal for short messages.

I guess this is what we do here, but I do not remember to 100% how this works and that we get the result as a very large string (If we get anything at all). Is this an issue in OMMatlab @arun3688 ?

@arun3688
Copy link
Contributor

arun3688 commented Jan 7, 2022

@JKRT, @AnHeuermann This is not an issue in OMMatlab or OMPython , The simulation does not hangs and it works, We get the following message from OMPython and same for OMMatlab, I think the issue is only related to OMJulia, and may be ZMQ package in OMJulia might have some issues i guess. But in the message field i don't see any dumps from dumpSimCode but it says end condition reached

>>> from OMPython import OMCSessionZMQ
>>> omc = OMCSessionZMQ()
>>> omc.sendExpression("getVersion()")
'OpenModelica v1.19.0-dev-497-g4c87b752b8 (64-bit)'
>>> omc.sendExpression("setCommandLineOptions(\"-d=newInst,dumpSimCode\")")
True
>>> omc.sendExpression("loadModel(Modelica)")
True
>>> omc.sendExpression("simulate(Modelica.Mechanics.MultiBody.Examples.Systems.RobotR3.FullRobot)")
{'resultFile': 'c:/OPENMODELICAGIT/OpenModelica/testsuite/fmpy/Modelica.Mechanics.MultiBody.Examples.Systems.RobotR3.FullRobot_res.mat', 'simulationOptions': "startTime = 0.0, stopTime = 2.0, numberOfIntervals = 500, tolerance = 1e-06, method = 'dassl', fileNamePrefix = 'Modelica.Mechanics.MultiBody.Examples.Systems.RobotR3.FullRobot', options = '', outputFormat = 'mat', variableFilter = '.*', cflags = '', simflags = ''", 'messages': 'LOG_SUCCESS       | info    | The initialization finished successfully without homotopy method.\n[C:/OPENMODELICAGIT/OpenModelica/build/lib/omlibrary/Modelica 4.0.0/Blocks/Logical.mo:942:7-942:33:writable]\nstdout            | info    | Simulation call terminate() at time 1.856045\n|                 | |       | Message : ... End condition reached\nLOG_SUCCESS       | info    | The simulation finished successfully.\n', 'timeFrontend': 1.3734987, 'timeBackend': 4.7949319, 'timeSimCode': 1.1752712, 'timeTemplates': 0.6500264, 'timeCompile': 13.595949, 'timeSimulation': 1.5467495, 'timeTotal': 23.136971}
>>>

@casella
Copy link
Contributor

casella commented Jan 7, 2022

You are right that you might sometimes need to write to file, since in some cases available memory is not enough, and swapping might result in the simulation taking an even longer time to complete. However, even writing to file is not 100% bulletproof.

Of course. But at least you have 1-2 more orders of magnitude of headroom. Setting an upper limit to file size as well seems a good strategy.

I remember one instance where I managed to overwrite my hard drive when experimenting with the logfiles in the scaleable test suite (For such extreme cases a cloud solution might be needed, or storing the log file in some compressed format.)

A full hard disk is in fact even worse than a full RAM, in the sense that the problem doesn't go away when you reboot...

Julia has builtin support for logging I guess some heuristic could be implemented to log messages of a certain size say 10 >=MiB or tie it to the size of the model somehow.

And maybe also do that for the file size, just in case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants