Running ABAQUS Explicit Analysis using Clusters

Submitted by parimal_maity on

 

Dear All,

I am running a long explicit analysis on a cluster. The simulation may need about 400 hrs to complete, but the maximum walltime I can request on the cluster is 150 hrs. After 150 hrs the analysis is stopped because the allotted cluster time has run out. I would like to continue the explicit analysis from the point where it actually stopped at 150 hrs. How can I continue the simulation?

I tried running the same explicit analysis from the command prompt using the suspend and resume commands, and that works fine. I also tried suspending the job on one node and resuming it from another node, but that does not work. I found in the Abaqus documentation that the recover option is generally used for restarting the same job, but somehow this does not work for me. I have attached the script files I use to run the Abaqus job.

start.qsub script file

#!/bin/sh -login
#PBS -l nodes=1:ppn=1,walltime=00:10:00
#PBS -j oe
#PBS -W x=gres:explicit:5%abaqus:5

cd $PBS_O_WORKDIR

inputfile="Job-Dynamic-Model"

# Automatically calculate the number of processors
np=$(cat $PBS_NODEFILE | wc -l)

module unload mvapich
module load abaqus_parallel

# Make a temporary scratch space (this should be on /mnt/scratch)
scratch=/mnt/scratch/${USER}/${PBS_JOBID}
export TMPDIR=$scratch
mkdir -p $scratch

# Change to the working directory
cd ${PBS_O_WORKDIR}

# Run abaqus
abaqus job=$inputfile recover cpus=$np interactive &
PID=$!
sleep 600

# Remove scratch space
rm -rf $scratch



 

restart.qsub script file

#!/bin/sh -login
#PBS -l nodes=1:ppn=1,walltime=00:10:00
#PBS -j oe
#PBS -W x=gres:explicit:5%abaqus:5

cd $PBS_O_WORKDIR

inputfile="Job-Dynamic-Model"

# Automatically calculate the number of processors
np=$(cat $PBS_NODEFILE | wc -l)

module unload mvapich
module load abaqus_parallel

# Make a temporary scratch space (this should be on /mnt/scratch)
scratch=/mnt/scratch/${USER}/${PBS_JOBID}
export TMPDIR=$scratch
mkdir -p $scratch

# Change to the working directory
cd ${PBS_O_WORKDIR}

# Run abaqus
# abaqus restartjoin  originalodb=odb-file-name
#                     restartodb=odb-file-name
#                     [copyoriginal] [history] [compressresult]

abaqus job=${inputfile} recover
echo "sleeping"
sleep 600
echo "done sleeping"
#abaqus terminate job=$inputfile

#qsub restart.qsub

# Remove scratch space
rm -rf $scratch
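
For reference, here is a minimal sketch of the chained submission that the commented-out `qsub restart.qsub` line hints at, using the 400 hr and 150 hr figures from above. The helper names `segments_needed` and `print_chain` are hypothetical, and the sketch assumes the scheduler accepts `qsub -W depend=afterany:<job id>`. It only prints the commands and does not submit anything. It covers the scheduling side only; whether `recover` picks the analysis up correctly on a different node is still the open question.

```shell
#!/bin/sh
# Hypothetical helper: number of walltime-limited submissions needed.
# Both arguments are whole hours.
segments_needed() {
    total_hours=$1
    max_walltime_hours=$2
    # Integer ceiling division: (a + b - 1) / b
    echo $(( (total_hours + max_walltime_hours - 1) / max_walltime_hours ))
}

# Hypothetical helper: print (do not submit) one start job followed by
# restart jobs, each held until the previous job has ended.
print_chain() {
    n=$(segments_needed "$1" "$2")
    echo "qsub start.qsub"
    i=2
    while [ "$i" -le "$n" ]; do
        # <previous job id> is a placeholder for the id printed by the
        # preceding qsub call; it is not filled in here.
        echo "qsub -W depend=afterany:<previous job id> restart.qsub"
        i=$((i + 1))
    done
}

print_chain 400 150
```

With 400 hrs of analysis and a 150 hr limit this gives three submissions: one start job and two restart jobs.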

Any suggestion is greatly appreciated. Thank you in advance.

Thanks

Dr. Parimal Maity

Michigan State University

ME