<div dir="ltr">Hi Sorina,<div><br></div><div>I have never had to deal with the SHADOW_ALLOW_UNSAFE_REMOTE_EXEC property, but I was using Condor 7.4.4, and you are using a newer version. </div><div><br></div><div>I Googled your error message and found a couple of posts that might help. </div>
<div><br></div><div>Also, looking at the top of your attached condor.sched.log file I see (but haven't Googled for this) </div><div><br></div><div><span style="color:rgb(0,0,0);font-family:'Courier New',Courier,monospace;font-size:14.399999618530273px">3/03/14 18:09:56 authenticate_self_gss: acquiring self credentials failed. Please check your Condor configuration file if this is a server process. Or the user environment variable if this is a user process.</span><br>
</div><div><br></div><div><span style="color:rgb(0,0,0);font-family:'Courier New',Courier,monospace;font-size:14.399999618530273px"><br></span></div><div>And there may be more helpful hints below that message in the file. Together these suggest an authentication configuration problem. Looking at the posts below and checking the Condor reference for this configuration might help.</div>
<div><br></div><div><a href="https://www-auth.cs.wisc.edu/lists/htcondor-users/2013-February/msg00129.shtml">https://www-auth.cs.wisc.edu/lists/htcondor-users/2013-February/msg00129.shtml</a><br></div><div><br></div><div>
<a href="http://comments.gmane.org/gmane.comp.distributed.condor.user/27728">http://comments.gmane.org/gmane.comp.distributed.condor.user/27728</a><br></div><div><br></div><div><br></div><div>You can also email the Condor mailing list; I have done so in the past and the community has been quite helpful. Before you do this, I suggest you reduce the problem to the simplest possible example, as all of the Midas/BatchMake stuff may just add confusion. Can you try a very simple example where you take a single PHP file and run condor_submit_dag from it in the same way that the challenge module does? If you can reproduce the problem with that test case, and the same submission succeeds when run as the apache user from a console, you will have an easier problem to describe and get help with.</div>
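<div><br></div><div>As a starting point for such a reduced test case, here is a minimal sketch, written in Python rather than PHP. The file names test.dagjob and test.0.dagjob and the helper names are made up for illustration; the submit description mirrors the challenge.0.dagjob quoted further down in this thread. It assumes condor_submit_dag is on the PATH and passes it only the DAG file name, so treat it as a rough outline and not as the way the challenge module submits jobs.</div>

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

# Submit description modelled on the challenge.0.dagjob shown in this thread.
# The output/log file names are hypothetical.
SUBMIT_TEMPLATE = """\
Universe = vanilla
Executable = {executable}
Arguments = "'--version'"
Output = test.out.txt
Error = test.error.txt
Log = test.log.txt
Notification = NEVER
Queue 1
"""


def write_minimal_dag(workdir, executable="/usr/bin/php"):
    """Write a one-node DAG and its submit file; return the DAG file path."""
    workdir = Path(workdir)
    submit_file = workdir / "test.0.dagjob"
    dag_file = workdir / "test.dagjob"
    submit_file.write_text(SUBMIT_TEMPLATE.format(executable=executable))
    # A DAG with a single node that points at the submit file above.
    dag_file.write_text("JOB job0 {}\n".format(submit_file.name))
    return dag_file


def submit_dag(dag_file):
    """Run condor_submit_dag in the DAG's directory, if the tool is installed."""
    if shutil.which("condor_submit_dag") is None:
        return None  # Condor tools not found on PATH; nothing is submitted
    return subprocess.run(
        ["condor_submit_dag", dag_file.name],
        cwd=str(dag_file.parent),
        capture_output=True,
        text=True,
    )


if __name__ == "__main__":
    dag = write_minimal_dag(tempfile.mkdtemp(prefix="dagtest_"))
    result = submit_dag(dag)
    print("DAG written to", dag)
    if result is not None:
        print("exit code:", result.returncode)
        print(result.stdout)
        print(result.stderr)
```

<div>Running this once from a console as the apache user and once from a script triggered through the web server should show whether the failure depends on how the process is launched, independent of Midas and BatchMake.</div>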
<div><br></div><div>Let us know how it goes, and good luck!</div><div><br></div><div>Thanks,</div><div>Mike</div><div><span style="color:rgb(0,0,0);font-family:'Courier New',Courier,monospace;font-size:14.399999618530273px"><br>
</span></div><div><span style="color:rgb(0,0,0);font-family:'Courier New',Courier,monospace;font-size:14.399999618530273px"><br></span></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Tue, Mar 4, 2014 at 9:46 AM, Sorina Camarasu Pop <span dir="ltr"><<a href="mailto:sorina.pop@creatis.insa-lyon.fr" target="_blank">sorina.pop@creatis.insa-lyon.fr</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF">
<div><br>
Hello again,<br>
<br>
I've discovered an interesting config option within condor: the
SHADOW_ALLOW_UNSAFE_REMOTE_EXEC seems to allow shell calls via the
libc 'system()' function. Is it something any of you have already
used in order to allow calls with the Midas executor->exec ?<br>
<br>
      I tried to turn it on and use the Condor shadow daemon, but I get
      an error saying "Assertion ERROR on (job_ad_file)" at line 166 in
      file shadow_v61_main.cpp ...<br>
<br>
So before going into the trouble of trying to solve this error, I
was wondering if you know about this "shadow" config option and if
you can confirm it is necessary.<br>
<br>
Best regards,<br>
Sorina<br>
<br>
      On 03/03/2014 18:44, Sorina Camarasu Pop wrote:<br>
</div><div><div class="h5">
<blockquote type="cite">
<div><br>
<br>
          On 03/03/2014 18:06, Michael Grauer wrote:<br>
</div>
<blockquote type="cite">
          <div dir="ltr">Where did you see the message: <span style="font-family:arial,sans-serif;font-size:12.800000190734863px">"DC_AUTHENTICATE:
              authentication of <xxx.xxx.xxx.xxx:59888> did not
              result in a valid mapped user name, which is required for
              this command (1112 QMGMT_WRITE_CMD), so aborting."</span>?</div>
</blockquote>
<br>
In the condor log : /home/condor/localcondor/log/SchedLog<br>
<br>
<blockquote type="cite">
<div dir="ltr">
<div><font face="arial, sans-serif">Was there any other output
included there?</font></div>
</div>
</blockquote>
<br>
        I copied parts of the log file into the attached file, which contains the
        output printed both when using batchmake and when running condor
        commands directly.<br>
<br>
<blockquote type="cite">
<div dir="ltr">
<div><font face="arial, sans-serif">Do you have a "condor"
user on your VM? </font></div>
</div>
</blockquote>
<br>
Yes.<br>
<br>
<blockquote type="cite">
<div dir="ltr">
<div><font face="arial, sans-serif"> When you successfully run
jobs by doing "condor_submit_dag" from the command line as
the apache user, </font></div>
</div>
</blockquote>
<br>
apache 22773 29731 0 18:18 ? 00:00:00
condor_scheduniv_exec.26.0 -f -l . -Lockfile challenge.dagjob.lock
-AutoRescue 1 -DoRescueFrom 0 -Dag challenge.dagjob -CsdVersion
$CondorVersion: 7.9.1 Aug 24 2012 PRE-RELEASE-UWCS $ -Force
-Dagman /bin/condor_dagman<br>
<br>
<br>
<blockquote type="cite">
<div dir="ltr">
<div><font face="arial, sans-serif">when you watch your job
run with ps or top, which user runs the actual execution
process (whatever job batchmake will run for you) ? <br>
</font></div>
</div>
</blockquote>
<br>
        When launching it with <font face="arial, sans-serif">batchmake
          (through the web interface) I cannot find the
          corresponding condor process... I only see an httpd process run
          by apache....</font><br>
<br>
<blockquote type="cite">
<div dir="ltr">
<div><font face="arial, sans-serif"><br>
</font></div>
<div><font face="arial, sans-serif">Can you include your
"challenge.bms" script in an email? <br>
</font></div>
</div>
</blockquote>
<br>
Of course, here it is attached.<br>
<br>
<blockquote type="cite">
<div dir="ltr">
<div><font face="arial, sans-serif"><br>
</font></div>
<div>Can you show me the output of "ls" from a directory where
the submit failed and then again from one where the submit
succeeded, at the end of the job processing run?</div>
</div>
</blockquote>
<br>
Failed :<br>
ls -la 52/<br>
total 56<br>
drwxrwxr-x 3 apache apache 4096 3 mars 18:25 .<br>
drwxr-xr-x 35 apache apache 4096 3 mars 18:25 ..<br>
-rw-r--r-- 1 apache apache 140 3 mars 18:25 adminconfig.cfg<br>
-rw-r--r-- 1 apache apache 355 3 mars 18:25 challenge.0.dagjob<br>
-rw-r--r-- 1 apache apache 332 3 mars 18:25 challenge.1.dagjob<br>
-rw-r--r-- 1 apache apache 564 3 mars 18:25 challenge.2.dagjob<br>
-rw-r--r-- 1 apache apache 355 3 mars 18:25 challenge.3.dagjob<br>
lrwxrwxrwx 1 apache apache 56 3 mars 18:25 challenge.bms
-> /var/www/miccai4/modules/challenge/library/challenge.bms<br>
-rw-r--r-- 1 apache apache 1473 3 mars 18:25
challenge.config.bms<br>
-rw-r--r-- 1 apache apache 1593 3 mars 18:25 challenge.dagjob<br>
-rw-r--r-- 1 apache apache 1043 3 mars 18:25
challenge.dagjob.condor.sub<br>
lrwxrwxrwx 1 apache apache 70 3 mars 18:25
challenge_validator_app.bms ->
/var/www/miccai4/modules/challenge/library/challenge_validator_app.bms<br>
drwxrwxr-x 4 apache apache 4096 3 mars 18:25 data<br>
lrwxrwxrwx 1 apache apache 50 3 mars 18:25 PHP.bmm ->
/var/www/miccai4/modules/challenge/library/PHP.bmm<br>
-rw-r--r-- 1 apache apache 138 3 mars 18:25 userconfig.cfg<br>
lrwxrwxrwx 1 apache apache 67 3 mars 18:25
ValidateImageAveDist.bmm ->
/var/www/miccai4/modules/challenge/library/ValidateImageAveDist.bmm<br>
<br>
<br>
        OK (created by batchmake and relaunched by hand):<br>
-bash-4.2$ ls -la 48<br>
total 104<br>
drwxrwxr-x 3 apache apache 4096 3 mars 18:18 .<br>
drwxr-xr-x 35 apache apache 4096 3 mars 18:25 ..<br>
-rw-r--r-- 1 apache apache 140 3 mars 18:09 adminconfig.cfg<br>
-rw-r--r-- 1 apache apache 0 3 mars 18:13
bmGrid.0.error.txt<br>
-rw-r--r-- 1 apache apache 1968 3 mars 18:18 bmGrid.0.log.txt<br>
-rw-r--r-- 1 apache apache 148 3 mars 18:18 bmGrid.0.out.txt<br>
-rw-r--r-- 1 apache apache 355 3 mars 18:09
challenge.0.dagjob<br>
-rw-r--r-- 1 apache apache 332 3 mars 18:09
challenge.1.dagjob<br>
-rw-r--r-- 1 apache apache 564 3 mars 18:09
challenge.2.dagjob<br>
-rw-r--r-- 1 apache apache 355 3 mars 18:09
challenge.3.dagjob<br>
lrwxrwxrwx 1 apache apache 56 3 mars 18:09 challenge.bms
-> /var/www/miccai4/modules/challenge/library/challenge.bms<br>
-rw-r--r-- 1 apache apache 1473 3 mars 18:09
challenge.config.bms<br>
-rw-r--r-- 1 apache apache 1593 3 mars 18:09 challenge.dagjob<br>
-rw-r--r-- 1 apache apache 1042 3 mars 18:18
challenge.dagjob.condor.sub<br>
-rw-r--r-- 1 apache apache 610 3 mars 18:18
challenge.dagjob.dagman.log<br>
-rw-r--r-- 1 apache apache 16074 3 mars 18:18
challenge.dagjob.dagman.out<br>
-rw-r--r-- 1 apache apache 256 3 mars 18:18
challenge.dagjob.dot<br>
-rw-r--r-- 1 apache apache 0 3 mars 18:18
challenge.dagjob.lib.err<br>
-rw-r--r-- 1 apache apache 29 3 mars 18:18
challenge.dagjob.lib.out<br>
-rw-r--r-- 1 apache apache 970 3 mars 18:18
challenge.dagjob.nodes.log<br>
-rw-r--r-- 1 apache apache 243 3 mars 18:18
challenge.dagjob.rescue001<br>
-rw-r--r-- 1 apache apache 243 3 mars 18:13
challenge.dagjob.rescue001.old<br>
lrwxrwxrwx 1 apache apache 70 3 mars 18:09
challenge_validator_app.bms ->
/var/www/miccai4/modules/challenge/library/challenge_validator_app.bms<br>
drwxrwxr-x 4 apache apache 4096 3 mars 18:09 data<br>
lrwxrwxrwx 1 apache apache 50 3 mars 18:09 PHP.bmm ->
/var/www/miccai4/modules/challenge/library/PHP.bmm<br>
-rw-r--r-- 1 apache apache 138 3 mars 18:09 userconfig.cfg<br>
lrwxrwxrwx 1 apache apache 67 3 mars 18:09
ValidateImageAveDist.bmm ->
/var/www/miccai4/modules/challenge/library/ValidateImageAveDist.bmm<br>
<br>
<blockquote type="cite">
<div dir="ltr">I'm not sure what is going on, just trying to get
more context...
<div><br>
</div>
<div>I recall I ran into a problem where one machine was the
submitter, and there was a midas user there, with uid 100,
and a midas user on another machine (the execution node)
with a uid 200, and I got what sounded like a similar
message--I had to make sure their uids were the same across
machines to deal with permissions across an NFS mount on
both machines. This sounds nothing like your problem, but I
wanted to include it in case it gives you any ideas.</div>
</div>
</blockquote>
<br>
Thank you for the hint. <br>
        My problem seems to be similar, in the sense that it looks like a
        user problem. However, I cannot find the difference
        between the two potential users: apache and who else?<br>
        <br>
        I noticed the following line in the condor log (the one attached):<br>
        03/03/14 18:39:34 ATTEMPT_ACCESS: Switching to user uid: 48 gid:
        48.<br>
        uid 48 does correspond to apache. What surprises me is that the
        log prints out "Switching to user uid: 48". Does that mean that until
        that moment it was running as some other user?<br>
<br>
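        One way to check this might be to record the identity and a few
        environment variables of the process in both situations (launched
        through the web server and from a console) and compare them. The
        sketch below is only an assumption about what may matter: the list
        of variables (CONDOR_CONFIG, X509_USER_PROXY and so on) and the
        function names are illustrative, not taken from Midas or Condor.<br>

```python
import json
import os
import pwd

# Variables that might influence how Condor authenticates the submitter.
# This selection is a guess for illustration, not an authoritative list.
ENV_KEYS = ("USER", "HOME", "PATH", "CONDOR_CONFIG", "X509_USER_PROXY")


def describe_process_identity(environ=None):
    """Return the uid/gid of this process and selected environment values."""
    environ = os.environ if environ is None else environ
    euid = os.geteuid()
    try:
        user_name = pwd.getpwuid(euid).pw_name
    except KeyError:
        user_name = None  # no passwd entry for this uid
    return {
        "uid": os.getuid(),
        "euid": euid,
        "gid": os.getgid(),
        "egid": os.getegid(),
        "user": user_name,
        "env": {key: environ.get(key) for key in ENV_KEYS},
    }


def _flatten(info):
    """Turn the nested 'env' dictionary into 'env.NAME' keys."""
    flat = {key: value for key, value in info.items() if key != "env"}
    for key, value in info["env"].items():
        flat["env." + key] = value
    return flat


def diff_identities(first, second):
    """Return {key: (first_value, second_value)} for every differing key."""
    a, b = _flatten(first), _flatten(second)
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


if __name__ == "__main__":
    # Print a JSON snapshot; save one per context and compare them later.
    print(json.dumps(describe_process_identity(), indent=2, sort_keys=True))
```

        Saving the JSON output once from the web-launched process and once
        from the console, then passing both to diff_identities, would list
        only the values that differ between the two contexts.<br>
        <br>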
<blockquote type="cite">
<div dir="ltr">
<div><br>
</div>
<div>Can you explain more about the library issues you ran
into earlier that prevented you from running jobs?</div>
</div>
</blockquote>
<br>
I don't remember exactly, but I spent quite some time on that one
too.<br>
In that case, jobs were submitted, but stayed idle : if I remember
correctly, there was some library preventing one of the condor
daemons from launching/executing correctly. I really don't think
this could be connected...<br>
<br>
Thank you,<br>
Sorina<br>
<br>
<blockquote type="cite">
<div dir="ltr">
<div><br>
</div>
<div><br>
</div>
<div><br>
</div>
<div>Thanks,</div>
<div>Mike</div>
<div><font face="arial, sans-serif"><br>
</font></div>
<div><font face="arial, sans-serif"><br>
</font></div>
<div><font face="arial, sans-serif"><br>
</font></div>
</div>
<div class="gmail_extra"><br>
<br>
<div class="gmail_quote">On Mon, Mar 3, 2014 at 11:50 AM,
Sorina Camarasu Pop <span dir="ltr"><<a href="mailto:sorina.pop@creatis.insa-lyon.fr" target="_blank">sorina.pop@creatis.insa-lyon.fr</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF">
<div>Hi Mike,<br>
<br>
Thank you for your prompt reply.<br>
<br>
                  On 03/03/2014 17:27, Michael Grauer wrote:<br>
</div>
<div>
<blockquote type="cite">
<div dir="ltr">Hi Sorina,
<div><br>
</div>
<div>These are tough to track down. <br>
</div>
</div>
</blockquote>
<br>
</div>
I know, I've spent my afternoon on it...
<div><br>
<br>
<blockquote type="cite">
<div dir="ltr">
<div>Can you tell me more about your environment?
Specifically, the 3 machines (possibly all the
same machine) that are your condor submit,
condor manager, and condor execute nodes? <br>
</div>
</div>
</blockquote>
<br>
</div>
I use the same machine (virtual machine configured as a
dual core) for my condor submit, condor manager, and
condor execute nodes.
<div><br>
<br>
<blockquote type="cite">
<div dir="ltr">
<div>What operating system is your web server, and
what version of Condor are you using? </div>
</div>
</blockquote>
<br>
</div>
Fedora 18.<br>
                For Condor, I had compiled the latest version available,
                but had some library problems preventing me from
                launching any job. I finally got it working with the
                version available through yum install:<br>
condor_version<br>
$CondorVersion: 7.9.1 Aug 24 2012 PRE-RELEASE-UWCS $<br>
$CondorPlatform: X86_64-Fedora_18 $
<div><br>
<br>
<br>
<blockquote type="cite">
<div dir="ltr">
<div> Is your condor submit node the same as your
web server (most likely yes)?</div>
</div>
</blockquote>
<br>
</div>
yes.
<div><br>
<br>
<blockquote type="cite">
<div dir="ltr">
<div>Are you running your web server as the apache
user (most likely yes), </div>
</div>
</blockquote>
<br>
</div>
Yes, I even printed out "whoami" to check that it really
runs as apache.
<div><br>
<br>
<blockquote type="cite">
<div dir="ltr">
                      <div>and is it your web server that is calling the
                        php code that results in condor_submit_dag (most
                        likely yes, again) ? </div>
</div>
</blockquote>
<br>
</div>
Yes. <br>
I use the "standard" batchmake config, i.e. the
condorSubmitDag function from KWBatchmakeComponent.php
<div><br>
<br>
<blockquote type="cite">
<div dir="ltr">
                      <div>Can you show the permissions and ownership of
                        the temporary work directory where the
                        condor_submit_dag command is executed?</div>
</div>
</blockquote>
<br>
</div>
ls -la<br>
...<br>
drwxrwxr-x 3 apache apache 4096 3 mars 16:53 45<br>
drwxrwxr-x 3 apache apache 4096 3 mars 17:41 46<br>
<br>
-bash-4.2$ cd 46<br>
-bash-4.2$ ls -la<br>
total 92<br>
drwxrwxr-x 3 apache apache 4096 3 mars 17:41 .<br>
drwxr-xr-x 29 apache apache 4096 3 mars 17:40 ..<br>
-rw-r--r-- 1 apache apache 140 3 mars 17:40
adminconfig.cfg<br>
-rw-r--r-- 1 apache apache 0 3 mars 17:41
bmGrid.0.error.txt<br>
lrwxrwxrwx 1 apache apache 56 3 mars 17:40
challenge.bms ->
/var/www/miccai4/modules/challenge/library/challenge.bms<br>
...
<div><br>
<br>
<br>
<blockquote type="cite">
<div dir="ltr">
<div>When you tested as the apache user, did you
do this test from the same temporary work
directory that Midas/apache would have tried
this from?</div>
</div>
</blockquote>
<br>
</div>
Yes, from folder
/var/www/miccai4/tmp/misc/batchmake/tmp/SSP/7/46
(drwxrwxr-x , owned by apache)
<div><br>
<br>
<blockquote type="cite">
<div dir="ltr">
<div>Is there any more information in the logs or
error logs generated by Condor in the temp work
directory that you could share?</div>
</div>
</blockquote>
<br>
</div>
tail -f challenge.dagjob.condor.sub<br>
# Note: default on_exit_remove expression:<br>
# ( ExitSignal =?= 11 || (ExitCode =!= UNDEFINED
&& ExitCode >=0 && ExitCode <= 2))<br>
# attempts to ensure that DAGMan is automatically<br>
# requeued by the schedd if it exits abnormally or<br>
# is killed (e.g., during a reboot).<br>
on_exit_remove = ( ExitSignal =?= 11 || (ExitCode =!=
UNDEFINED && ExitCode >=0 && ExitCode
<= 2))<br>
copy_to_spool = False<br>
arguments = "-f -l . -Lockfile
challenge.dagjob.lock -AutoRescue 1 -DoRescueFrom 0 -Dag
challenge.dagjob -CsdVersion $CondorVersion:' '7.9.1'
'Aug' '24' '2012' 'PRE-RELEASE-UWCS' '$ -Dagman
/usr/bin/condor_dagman"<br>
environment =
_CONDOR_DAGMAN_LOG=challenge.dagjob.dagman.out;_CONDOR_MAX_DAGMAN_LOG=0<br>
queue<br>
<br>
tail -f challenge.0.dagjob<br>
# More information at: <a href="http://www.batchmake.org" target="_blank">http://www.batchmake.org</a><br>
Universe = vanilla<br>
Output = bmGrid.0.out.txt<br>
Error = bmGrid.0.error.txt<br>
Log = bmGrid.0.log.txt<br>
Notification = NEVER<br>
Executable = /usr/bin/php<br>
Arguments = "'--version'"<br>
Queue 1<br>
<br>
I hope this can help with debugging the problem...<br>
<br>
Thank you,<br>
Sorina
<div><br>
<br>
<blockquote type="cite">
<div dir="ltr">
<div>Thanks,</div>
<div>Mike</div>
<div class="gmail_extra"><br>
<br>
<div class="gmail_quote">On Mon, Mar 3, 2014 at
11:16 AM, Sorina Camarasu Pop <span dir="ltr"><<a href="mailto:sorina.pop@creatis.insa-lyon.fr" target="_blank">sorina.pop@creatis.insa-lyon.fr</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Dear Midas
users and developers,<br>
<br>
I am trying to configure Midas with the
Challenge and BatchMake modules, but I
encounter problems when executing the
condor_submit_dag command.<br>
<br>
The error printed by Condor when executing
the condor_submit_dag command using the
Batchmake module looks like this :
"DC_AUTHENTICATE: authentication of
<xxx.xxx.xxx.xxx:59888> did not result
in a valid mapped user name, which is
required for this command (1112
QMGMT_WRITE_CMD), so aborting."<br>
<br>
                          Nevertheless, if I execute exactly the same
                          command line as apache in a console,
                          everything works fine. I do not
                          understand where the difference comes from.<br>
<br>
Do you know if there's any special
configuration for Condor to work with the
Batchmake module ?<br>
<br>
Thank you for your help,<br>
Sorina<span><font color="#888888"><br>
<br>
-- <br>
Sorina Pop, PhD<br>
CNRS Research Engineer<br>
CREATIS<br>
Tel : <a href="tel:%2B33%20%280%294%2072%2043%2072%2099" value="+33472437299" target="_blank">+33
(0)4 72 43 72 99</a><br>
<br>
_______________________________________________<br>
Midas mailing list<br>
<a href="mailto:Midas@public.kitware.com" target="_blank">Midas@public.kitware.com</a><br>
<a href="http://public.kitware.com/cgi-bin/mailman/listinfo/midas" target="_blank">http://public.kitware.com/cgi-bin/mailman/listinfo/midas</a><br>
</font></span></blockquote>
</div>
<br>
<br clear="all">
<div><br>
</div>
<br>
<br>
</div>
</div>
</blockquote>
<br>
<br>
</div>
</div>
</blockquote>
</div>
<br>
<br clear="all">
<div><br>
</div>
-- <br>
Thanks,<br>
Michael Grauer<br>
R & D Engineer<br>
Kitware, Inc.<br>
<a href="tel:919%20969%206990%20x322" value="+19199696990" target="_blank">919 969 6990 x322</a><br>
<br>
<br>
</div>
</blockquote>
<br>
<br>
</blockquote>
<br>
<br>
</div></div></div>
<br></blockquote></div><br><br clear="all"><div><br></div><br><br>
</div></div>