Missing output files from batch on Mac

Missing output files from batch on Mac

Jonas Andersen
Hi,

We're running some batch simulations on Mac servers. From time to time, the output files in the instance directories are missing. Has anyone else experienced something similar?

The simulation seems to run fine, and at the end the Repast (Eclipse) console reports that the simulation has finished:

   Polled /....../simphony_model_1477633174684 for DONE:  yes

After a few more lines, the following warning appears:

  No model output found matching glob:{**/,}Statistics*.batch_param_map.csv in /....../simphony_model_1477633174684/instance_1

The line is written for three different output files (defined as Text Sinks in the model) and repeated for each instance directory (we have five).

Finally there are the "Aggregating output", "Get Output Time" and "Finished" lines.

It appears that the output files are never generated. When checking the instance directories, these are all empty except for the symbolic link to the "data" directory. The usual files such as the debug and instance logs as well as the generated CSV files are all missing. This also appears to be the case during the simulation (not just after it has completed).
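
The check described above can be scripted. The following is a minimal Python sketch, not part of Repast: the function name `audit_instances` is hypothetical, and the default pattern is taken from the warning message with the `glob:{**/,}` prefix dropped (that prefix appears to be Java glob syntax, which Python does not share). For each `instance_*` directory it lists the entries that are not symbolic links and any files matching the output pattern.

```python
from pathlib import Path


def audit_instances(model_dir, pattern="Statistics*.batch_param_map.csv"):
    """Report, per instance_* directory, its real entries and pattern matches."""
    report = {}
    for inst in sorted(Path(model_dir).glob("instance_*")):
        if not inst.is_dir():
            continue
        # Direct entries that are not symbolic links (the "data" link is skipped).
        entries = sorted(p.name for p in inst.iterdir() if not p.is_symlink())
        # rglob looks in the instance directory and in its subdirectories.
        matches = sorted(str(p.relative_to(inst)) for p in inst.rglob(pattern))
        report[inst.name] = {"entries": entries, "matches": matches}
    return report
```

Calling `audit_instances` with the path of the `simphony_model_...` directory returns a dictionary keyed by instance name. An instance with an empty `entries` list and an empty `matches` list corresponds to the situation described here.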

What could cause these files not to be created properly? There is plenty of disk space, and we have not observed any errors.

We are using Repast 2.3.1. The Repast environment is installed on the server, which is accessed through Remote Desktop / Screen Sharing. The simulations are launched as local instances.

Best regards,

Jonas Andersen




------------------------------------------------------------------------------
Developer Access Program for Intel Xeon Phi Processors
Access to Intel Xeon Phi processor-based developer platforms.
With one year of Intel Parallel Studio XE.
Training and support from Colfax.
Order your platform today. http://sdm.link/xeonphi
_______________________________________________
Repast-interest mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/repast-interest

Re: Missing output files from batch on Mac

srcnick
Jonas,

Just to clarify, these runs are launched as local batch runs on a server where you log in through Remote Desktop?  The problem only occurs intermittently? Is there any info in the log file in the top directory, i.e. the one above the instance directories?

I think that the files must be successfully written somewhere, assuming the runs do actually execute; otherwise the runs would fail before completion with some sort of IO-related exception.  So, either they are getting deleted or they are not being written where they are supposed to be. What’s the full path to the instance directory, that is, what’s the parent path of the simphony_model_xxx directory? Is it possible that there could be some permission issue or a periodic job that cleans up files from those directories? That seems unlikely given that the directories remain, but I’m at a loss to know what’s going on.
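
One way to test the "written somewhere else" hypothesis is to search a wider directory tree for recently modified files with the expected names. The sketch below is a generic Python illustration, not a Repast tool: the function name `find_recent_outputs`, the search root, and the time cutoff are all assumptions that would need to be chosen for the machine in question.

```python
import fnmatch
import os


def find_recent_outputs(search_root, pattern, since_epoch_seconds):
    """Return files under search_root whose name matches pattern and whose
    modification time is at or after since_epoch_seconds."""
    hits = []
    # os.walk does not follow symbolic links to directories by default.
    for dirpath, _dirnames, filenames in os.walk(search_root):
        for name in fnmatch.filter(filenames, pattern):
            path = os.path.join(dirpath, name)
            try:
                mtime = os.path.getmtime(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if mtime >= since_epoch_seconds:
                hits.append(path)
    return sorted(hits)
```

For example, `find_recent_outputs(some_root, "Statistics*.batch_param_map.csv", time.time() - 3600)` would list matching files modified during the last hour under `some_root`, where `some_root` stands for whatever directory is suspected (a home directory or the Eclipse workspace, say).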

Nick

> On Nov 3, 2016, at 2:04 AM, Jonas Andersen <[hidden email]> wrote:
> [...]



Re: Missing output files from batch on Mac

Jonas Andersen
Hi Nick,

Thanks for the reply. The path I've checked is the one written to the console in Eclipse (Repast). This was the polling location of instance_1 and I simply checked based on the parent directory.

The only copy of the directory I currently have was taken after completion. There is a DONE file, and status_output.properties states OK for all five instances. I'm certain that the instance directories were also empty (except for the data symlink) during the execution of the simulation.
I'll keep an eye on future simulations and make a copy of the in-progress simulation directory if I notice that the output files are missing again.
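
Taking such a copy could look roughly like the following Python sketch. The function name `snapshot_model_dir` and the timestamped naming scheme are hypothetical choices, and nothing here is Repast-specific. Symbolic links are copied as links, so the "data" link is preserved rather than followed.

```python
import shutil
import time
from pathlib import Path


def snapshot_model_dir(model_dir, dest_root):
    """Copy model_dir into dest_root under a timestamped name; return the copy's path."""
    model_dir = Path(model_dir)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = Path(dest_root) / f"{model_dir.name}_{stamp}"
    # symlinks=True copies links as links instead of duplicating their targets.
    shutil.copytree(model_dir, target, symlinks=True)
    return target
```

Because the run is still in progress while the copy is made, files may appear or disappear mid-copy; in that case `shutil.copytree` can raise `shutil.Error` after copying what it could, which is itself useful information.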

This has happened a couple of times and the cause is none the clearer. I'll keep analyzing the issue.

Best regards,

Jonas


On 3 November 2016 at 14:17, Nick Collier <[hidden email]> wrote:
> [...]


