Gaussian on the Chem Cluster


	Using the Gaussian Suite J.T. Frey, 04/25/2001

This document gives an overview of the availability of the Gaussian quantum chemical program suite on the Chemistry Cluster, how to get started using the program, and frequently encountered problems. A basic knowledge of the UNIX operating system is essential for doing anything with Gaussian on the cluster. This page is not a manual on how to use Gaussian itself; that manual can be found online at http://www.gaussian.com/techinfo.htm.

  (1) Availability of Gaussian
  (2) Disk Space and your Account
  (3) Setting Up the Gaussian Environment
  (4) All About Scratch Space
  (5) Creating an Input File
  (6) Which Machine Do I Use?
  (7) Submitting a Job
  (8) Frequently Encountered Problems

Program Suite Availability
Gaussian is available to any student or faculty member who has an account on the Chemistry Cluster. The current version is Gaussian98, revision A9. Older versions of Gaussian are available on the cluster, but in most cases you should use the default version.

For calculations associated with a course offered by the Department of Chemistry and Biochemistry, please limit your usage to the following SGI's:

Eugene (eugene.duch.udel.edu) (home of the usr7 drive)

Alfred (alfred.duch.udel.edu) (home of the usr6 drive)

Zeppo (zeppo.duch.udel.edu)

Groucho (groucho.duch.udel.edu)

Other computing resources are available on a shared basis for graduate research purposes; consult your advisor or the systems administrator for information on whether or not you may use any of them for your research.

Disk Space and your Account
If you have your own account on the Chemistry Cluster, then you have your own directory in which you can store your files. When you log in to an SGI on the cluster, you are automatically in your home directory; you may eventually need to know on which SGI your home directory resides. You can determine this by typing the command `pwd` and looking at the drive -- if `pwd` produced `/usr7/people/joe_blow` then you are on the `usr7` drive.
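As a small illustration, the drive name can be pulled out of the `pwd` output automatically. The sketch below assumes a Bourne-compatible shell (`sh`, `ksh`, or `bash`); the function name `home_drive` is hypothetical and is not part of the cluster's software.

```shell
# home_drive: print the first component of a directory path,
# which on the cluster is the drive (e.g. usr7).
# With no argument it uses the current directory.
home_drive() {
    dir=${1:-$(pwd)}
    printf '%s\n' "$dir" | cut -d/ -f2
}

home_drive /usr7/people/joe_blow    # prints: usr7
```

Run `home_drive` with no argument right after logging in to see the drive that holds your own home directory.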

INFO: If you are a member of Dr. Doren's research group your home directory will most likely not be on usr6 or usr7.

A good way to organize yourself and your usage of Gaussian is to keep each project on which you work in a separate directory. Keeping each project in its own directory not only yields a well-organized hierarchy of your work, but it keeps you from making a serious mistake like overwriting a file for one calculation with one from another.

NOTE: How you organize your home directory is entirely up to you; the following is merely one way to do so and reflects the author's opinion given his past experience with UNIX and computer usage.

When you first log in, create a directory to contain all of your work with Gaussian; you can use any name you'd like, perhaps `Research` or `GaussianWork`. To make a new directory, log in to your cluster account and from the command line type

  mkdir GaussianWork

and move into the directory by typing

  cd GaussianWork

You'll only have to use the `mkdir` command once; from then on that directory will be present. Once you're ready to do a calculation, you can create a new directory within your Gaussian work directory to hold all of the files for that calculation. So if I wanted to do a calculation on hydrogen, I might create a directory within my `GaussianWork` directory to hold the hydrogen calculation(s):

  Log in
      Either with Telnet or by sitting down at one of the SGI's.

  cd GaussianWork
      This moves you into your Gaussian work directory; if you used a different name, substitute that for `GaussianWork`.

  mkdir H2
      This creates the directory `H2` within `GaussianWork`. If the directory exists, you will receive an error message; just use a different name.

  cd H2
      This moves you into the newly-created directory. You can now use GaussView to build the H₂ molecule and set up the calculation, etc.

In essence, you can categorize your work as deeply as you'd like, using a directory to hold related items at each level of your categorization. For example, let's say that I'm taking CHEM-671 and CHEM-672 and I'm also doing some research on several aldehydes. In that case, I may set up the following directories:

  GaussianWork
      CHEM671
      CHEM672
      Aldehydes
          Acetaldehyde
          Formaldehyde

Keep in mind that the listing above is merely illustrative; UNIX will not display your directories this way. Organizing your Gaussian work is important not only for preserving the information you produce, but also for preserving your sanity when you're trying to find a particular calculation.
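The example hierarchy can be created in one step instead of with repeated `mkdir` and `cd` commands. This is only a sketch, assuming a Bourne-compatible shell and a `mkdir` that supports the `-p` option; the function name `make_gaussian_tree` is made up for this example.

```shell
# make_gaussian_tree: create the example directory hierarchy under a
# base directory (default: your home directory). mkdir -p creates any
# missing parent directories and does not complain if one already exists.
make_gaussian_tree() {
    base=${1:-$HOME}
    mkdir -p "$base/GaussianWork/CHEM671" \
             "$base/GaussianWork/CHEM672" \
             "$base/GaussianWork/Aldehydes/Acetaldehyde" \
             "$base/GaussianWork/Aldehydes/Formaldehyde"
}
```

Typing `make_gaussian_tree` with no argument builds the tree in your home directory; substitute your own course and project names as needed.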

Setting Up the Gaussian Environment
Once you log in to your account, you must set up several environment variables which tell Gaussian where to get its basis sets, where to put temporary files, etc. As in algebra, a variable in UNIX is something which "stands in" for an actual value. All you need to know, though, is that when you type `setGauss`, the computer sets up these environment variables for you. No muss, no fuss.

TIP: Before you can use Gaussian, you must type the `setGauss` command.

Each SGI on the cluster may have more than one version of the Gaussian suite available for use. Unless you really need a different version, type `setGauss`, which always sets up your environment for the newest version of Gaussian. Users who need older versions of Gaussian should type `setGauss info` to see which versions of the suite are available on the SGI they are logged in to, along with the command used to set up each version.
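If you are unsure whether the environment has been set up in your current session, a quick check along the following lines may help. It is a sketch only: it assumes a Bourne-compatible shell and that `setGauss` both defines `GAUSS_SCRDIR` and makes the `g98` command findable; the function name `gaussian_env_ok` is hypothetical.

```shell
# gaussian_env_ok: succeed only if the Gaussian environment looks ready,
# i.e. GAUSS_SCRDIR is set and the g98 command can be found.
gaussian_env_ok() {
    if [ -z "${GAUSS_SCRDIR:-}" ]; then
        echo "GAUSS_SCRDIR is not set; type setGauss" >&2
        return 1
    fi
    if ! command -v g98 >/dev/null 2>&1; then
        echo "g98 command not found; type setGauss" >&2
        return 1
    fi
    return 0
}
```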

All About Scratch Space
When Gaussian performs a calculation, it may create several temporary files, or temp files. As their name suggests, these files are used only for the duration of a Gaussian calculation. Gaussian temp files can be rather large, and as a result use a lot of disk space. Each of the Silicon Graphics computers on the Chemistry Cluster has some disk space which is set aside specifically for Gaussian temp files. After you've logged in and run `setGauss`, you can see where your temp files will be stored by looking at the `GAUSS_SCRDIR` environment variable.

TIP: To see where your scratch files are stored when you are logged in to an SGI on the cluster, type the `printenv GAUSS_SCRDIR` or the `echo $GAUSS_SCRDIR` command.

Sometimes if your Gaussian job does not complete successfully -- typically we say that your job has crashed -- the scratch files may not be deleted automatically. Since these files can be quite large, it is important for you to occasionally view the contents of your scratch directory on any SGI on the cluster on which you use Gaussian. If you have no jobs running on a particular SGI, and your scratch directory contains files, then you should delete them.

TIP: List the contents of your scratch directory with `ls $GAUSS_SCRDIR`. Delete files within your scratch directory with `rm $GAUSS_SCRDIR/*`.

Remember, only do this when you have no jobs running on the SGI to which you've logged in. Otherwise, you'll cause your job(s) running on that SGI to crash!
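The "check first, then delete" rule can be wrapped in a small helper so that the scratch directory is never emptied while one of your jobs is running. This is a sketch under several assumptions: a Bourne-compatible shell, a `ps` that accepts the POSIX `-u` and `-o comm=` options, and Gaussian processes that appear as `g98` or as link names such as `l0.exe`. The function names are hypothetical.

```shell
# gaussian_running: read command names on standard input and succeed
# if any of them looks like a Gaussian process (g98 or a link, lNNN.exe).
gaussian_running() {
    grep -E -q '(^|/)(g98|l[0-9]+\.exe)$'
}

# clean_scratch: list and then empty $GAUSS_SCRDIR, but only when none
# of your own processes on this machine looks like a Gaussian job.
clean_scratch() {
    if [ -z "${GAUSS_SCRDIR:-}" ]; then
        echo "GAUSS_SCRDIR is not set; type setGauss first" >&2
        return 1
    fi
    if ps -u "$(id -un)" -o comm= | gaussian_running; then
        echo "Gaussian jobs are still running; leaving $GAUSS_SCRDIR alone" >&2
        return 1
    fi
    ls -l "$GAUSS_SCRDIR"
    rm -f "$GAUSS_SCRDIR"/*
}
```

The check only sees jobs on the SGI you are logged in to, which matches the rule above: scratch space is per machine, so repeat it on each SGI you use.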

Creating an Input File
There are two ways to create an input file for a Gaussian job. Beginners will want to use Gaussian's molecule builder, GaussView. GaussView is a standard component of the Gaussian suite, and is available on any of the SGI's on which you can use Gaussian. The exact usage of GaussView will not be reviewed here; for extensive information concerning the use of GaussView consult the online manual. While you start GaussView from the command line, the program itself has a graphical user interface and thus requires that you be sitting at one of the SGI's on the cluster or that the computer you're using has XWindows capabilities (see the UNIX tutorial for information on XWindows).

TIP: To start GaussView, type the `gv` command.

Within GaussView you can create your molecule and even set up the parameters of the calculation you wish to perform. Gaussian input files normally have names which end with `.com`. A com file, as these are called, specifies the options Gaussian needs to perform a calculation.

WARNING: GaussView will allow you to specify the calculation parameters within its Calculation Setup Window: basis set, method, charge, and spin multiplicity (to name a few). You may also start a calculation from this window; behind the scenes, GaussView will submit the job at the appropriate priority. See the section on Submitting Jobs for more information on how to start a Gaussian job without the help of GaussView.

More experienced users may edit or even create input files with a text editor like PICO or VI. This takes more skill, since Gaussian tends to be very picky about how an input file is formatted. For information on the appropriate layout of an input file, consult the Gaussian online manual.
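To give a feel for what a hand-written input file looks like, the sketch below writes a minimal one for H₂ from the command line. The method and basis set (`HF/STO-3G`), the title, and the 0.74 Å bond length are illustrative choices made for this example, not recommendations; the function name `write_h2_input` is hypothetical. The blank lines are part of the format, which is one reason Gaussian seems picky.

```shell
# write_h2_input: write a minimal Gaussian input file for H2.
# Layout: route line, blank line, title, blank line,
# charge and spin multiplicity, one line per atom, trailing blank line.
write_h2_input() {
    cat > "${1:-H2.com}" <<'EOF'
# HF/STO-3G

Hydrogen molecule single-point energy

0 1
H  0.0  0.0  0.00
H  0.0  0.0  0.74

EOF
}
```

Typing `write_h2_input H2.com` inside your `H2` directory produces a file you can inspect or modify with a text editor; the Gaussian manual remains the authority on the layout.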

Which Machine Do I Use?
Each time you wish to do a calculation with Gaussian, you must figure out which of the available SGI's can best handle the addition of your calculation on top of what it is already doing.

On the Chemistry Cluster, you are expected to share resources with the other users. Loading up one SGI with many jobs will slow down everyone's calculations, including your own. While we do operate on the honor system, if you repeatedly ignore this guideline your account may be suspended.

Typing the `highDemandJobs` command on any SGI in the cluster displays a list of the processes running on that machine which qualify as "high-demand." In general, we try to keep to two, or at most four, high-demand jobs per processor. All of the general access SGI's (zeppo, chico, groucho, harpo, alfred, eugene) have a single processor, so on those machines the list generated by `highDemandJobs` should never get longer than four, and ideally should stay at two.

For the More Experienced User

Figuring out which SGI to use is not very complicated: it is an iterative process which has two criteria associated with it. First, pick one of the SGI's on the cluster on which you can use Gaussian. Log in to that machine. Once logged in, you can use the `top` command to view a list of processes -- tasks on which the computer is working -- which is sorted by how much of the computer's processor(s) is used by each. Based on how many processes are using a significant percentage of the computer's processing power, you can determine whether or not you could submit your job to the machine.

CRITERION 1: Type the `top` command. Since the list of processes is sorted by CPU usage, you're looking for all the jobs with over 10% listed in the CPU% column. You should definitely not submit a job on the system in question if the processes you see are getting under 30%. In general, the best policy is to allow two jobs per processor in a machine. The number of processors is listed at the top of the display printed by the `top` command. Allowing two jobs per processor gives each job approximately 50% of a CPU.

In the image below, the five jobs which are in the light blue box are all Gaussian jobs (this coloring is for illustrative purposes only and will not appear on the SGI's). The SGI in question has two processors; each of the five jobs is getting above 30%; and the two-jobs-per-processor guideline dictates that four jobs is the ideal. The percentages are all fairly close to 30% and five jobs is already over the ideal limit, so you probably want to look for another machine on which to run your job.

(Also of importance is the fact that the TIME column will show how long a job has been running and the COMMAND column will show you which link -- or sub-program of Gaussian -- is being executed. More on what these "links" are in a bit.)
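The job-count guideline (ideally two high-demand jobs per processor, never more than four) amounts to simple arithmetic, sketched here as a shell function. It assumes a POSIX shell with `$(( ))` arithmetic, such as `ksh` or `bash`; the function name `load_verdict` and its three answers are invented for this example. You supply the processor count and the number of high-demand jobs you counted yourself.

```shell
# load_verdict NPROC NJOBS: judge whether one more job fits on a machine
# with NPROC processors that already runs NJOBS high-demand jobs.
#   ok      - your job keeps the machine within two jobs per processor
#   crowded - already at or past the ideal; look for another machine
#   full    - at the four-per-processor maximum; do not submit here
load_verdict() {
    nproc=$1
    njobs=$2
    ideal=$((nproc * 2))
    max=$((nproc * 4))
    if [ "$njobs" -lt "$ideal" ]; then
        echo ok
    elif [ "$njobs" -lt "$max" ]; then
        echo crowded
    else
        echo full
    fi
}

load_verdict 2 5    # two processors, five jobs; prints: crowded
```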

The second criterion you must take into account when submitting a job is how much memory it will require. Each of the SGI's on the cluster has a fixed amount of memory in which to hold all of the jobs which are running. If it appears as though you could submit a job based on the first criterion, you should then be certain that you will not put the computer in question into paging. Paging occurs when a UNIX computer runs out of memory and begins using the hard disk as memory space. A hard disk is many times slower than the RAM in the computer, so typically the computer will slow to a crawl when it begins paging -- you want to avoid this at all costs.

CRITERION 2: Looking again at the display from the `top` command, find the fourth line, which displays the memory profile of the computer. The first number indicates the maximum amount of RAM in the computer (1408 MB, where MB is megabytes, in this case). The second number subtracts the amount of memory used by the operating system; memory used by the operating system is not available to users. The third number from the left is labelled "free" and shows how much memory is not currently in use by processes on that SGI. There really isn't a good way to determine exactly how much memory your own Gaussian job will require, but keep in mind that memory requirements grow with the basis set size, the number of atoms in the chemical system, and the type of calculation you'll be performing. Typically the only time you'll need to worry about memory requirements is if you use post-Hartree-Fock methods (CI, MP2, MP3, etc.) or a frequency calculation. By default, the minimum amount of memory Gaussian will use is around 18 MB; the amount listed for "free" memory should never fall below 50 MB to assure smooth user interaction.
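The memory rule can be checked the same way: subtract what your job is expected to need from the "free" figure and make sure at least 50 MB remains. The sketch below assumes a POSIX shell with `$(( ))` arithmetic; the function name `memory_ok` is hypothetical, and the job's memory need is your own estimate, with 18 MB used as the default minimum.

```shell
# memory_ok FREE_MB [JOB_MB]: succeed if starting a job that needs JOB_MB
# megabytes would still leave at least 50 MB free. JOB_MB defaults to 18,
# the approximate minimum amount of memory Gaussian uses.
memory_ok() {
    free_mb=$1
    job_mb=${2:-18}
    [ $((free_mb - job_mb)) -ge 50 ]
}

if memory_ok 100; then
    echo "enough free memory"
else
    echo "the machine would be pushed toward paging"
fi
```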

Submitting a Job Manually
Outside of GaussView, you must use a special command to run Gaussian calculations because it allows the computer to give each calculation an equal share of the processor: no one calculation which is running on the computer can grab more than its share of CPU power. The name of the program which is used to start a job is npri. Using an option of `-w` with npri specifies that the job should be weightless -- in other words, it has no say in how much of the CPU it gets. This keeps the computer responsive to user actions like logging in, viewing Gaussian output files, etc.

TIP: When logged in to the SGI on which you wish to submit your job, you should be in the directory which contains the input file you're to use. From there, type the command `npri -w g98 FILENAME &` where FILENAME is the name of your input file. The & forces the computer to run the job in the background, i.e. you can continue to type commands, etc.

Another benefit of using npri to run your Gaussian jobs is the fact that you can log off of the computer and your Gaussian job will continue running!

To check on your job -- whether it is still running, what link it is in, etc. -- just use the `top` command, described above. When your job completes, you should make sure it completed successfully. Each time you run a Gaussian job, a log file is created in the same directory in which your input file resides. The filename will be the same as your input file, except that the .com is replaced by .log. At the end of the log file you will find any errors reported during the calculation which may have caused it to not complete successfully.

TIP: To view the tail end of a log file to determine if there were any errors in your Gaussian job, use the UNIX command `tail FILENAME`. You can monitor the progress of your job as it is running by adding the `-f` option to the tail command. For example, to watch Gaussian work on my H2.com input file, I would submit the job via the `npri -w g98 H2.com &` command and then immediately type `tail -f H2.log` on the command line.
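The submit-then-watch routine can be collected into a small helper. This is a sketch only, assuming a Bourne-compatible shell and that `setGauss` has already been typed; the function names `log_name` and `submit_job` are made up for this example, while `npri -w g98` is the command described above.

```shell
# log_name: print the log file name that goes with an input file,
# i.e. the same name with .com replaced by .log.
log_name() {
    printf '%s\n' "${1%.com}.log"
}

# submit_job: start a Gaussian job in the background at weightless
# priority, after checking that the input file is readable.
submit_job() {
    input=$1
    if [ ! -r "$input" ]; then
        echo "cannot read input file: $input" >&2
        return 1
    fi
    npri -w g98 "$input" &
    echo "submitted; follow progress with: tail -f $(log_name "$input")"
}
```

For example, `submit_job H2.com` starts the hydrogen calculation and reminds you that its progress can be followed in `H2.log`.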

When a Gaussian job completes successfully, you will find a random literary quotation at the end of the log file. You can use this as an indication of whether or not the job actually worked!
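If you would rather not read the quotation yourself, the check can be automated. The sketch below relies on a detail not stated above: Gaussian log files normally also close with a line containing the words "Normal termination". Treat that as an assumption and confirm it against one of your own successful log files. The function name `job_ok` is hypothetical, and a `tail` that accepts the `-n` option is assumed.

```shell
# job_ok LOGFILE: succeed if the last few lines of the log file contain
# the "Normal termination" message (assumed here to mark a clean finish).
job_ok() {
    tail -n 3 "$1" | grep -q 'Normal termination'
}
```

Used as `job_ok H2.log && echo finished`, it prints `finished` only for a job that ended cleanly.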

The links which have been mentioned several times already refer to what sub-program Gaussian is executing. Gaussian is split up into many small pieces. Each piece performs a specific task, and each one of these pieces is called a link. For instance, link zero scans your input file and determines what other links will run. Link zero would show up in a top listing as l0.exe. The exact task performed by each link can be found in the Gaussian manual.

Frequently Encountered Problems
A program as complex as Gaussian can be both difficult to use and difficult to understand when it doesn't work. What follows are some commonly-encountered problems that Gaussian users on the Chemistry Computing Cluster may experience.

When I type `gv` or try to submit a job with `npri -w g98 FILENAME` I get an error message that the command is undefined.

You neglected to set up the Gaussian environment; type `setGauss`.

I run an optimization and all appears to go well, but I don't get a quote at the end of my `log` file. Instead, Gaussian says something about the number of iterations.

By default, Gaussian restricts optimization to 20 iterations. If the convergence criteria for the optimization are not met after 20 iterations, Gaussian will terminate as though there was an error. Actually, if this is the case you can easily restart the optimization and give it another 20 iterations by adding the `GUESS=READ GEOM=READ` tags to the route line of your input file!
