xpbsmon - Man Page

GUI for displaying and monitoring the nodes/execution hosts under PBS

Synopsis

xpbsmon

Description

The xpbsmon command provides a way to graphically display the various nodes that run jobs. A node or execution host may or may not be running a pbs_mom daemon. In the latter case, it could simply be a node name that appears in a nodes file managed by a main pbs_server running on another host. This utility also provides the ability to monitor the values of certain system resources by posting queries to the pbs_mom of a node. With this utility, you can see which job is running on which node, who owns the job, how many nodes are assigned to a job, the status of each node (color-coded, with user-modifiable colors), and how many nodes are available, free, down, reserved, offline, of unknown status, in use running multiple jobs, or executing only 1 job. Please see the sections below for a tour and tutorials of xpbsmon. Also, every dialog box contains a Help button for assistance.

Getting Started

Running xpbsmon will initialize the X resource database from various sources in the following order:

  1. The RESOURCE_MANAGER property on the root window (updated via xrdb) with settings usually defined in the .Xdefaults file
  2. Preference settings defined by the system administrator in the global xpbsmonrc file
  3. User's ~/.xpbsmonrc file - this file defines various X resources like fonts, colors, the list of colors used to represent the various node statuses, the list of PBS sites to query, the list of server hosts on each site, the list of nodes/execution hosts on each server host, the list of system resource queries to send to the nodes' pbs_mom, and various view states. See the Preferences section below for a list of resources that can be set.

Running Xpbsmon

xpbsmon can be run either as a regular user or as superuser. If you run it with less privilege, you may not be able to see all the information for a node. When it is executed as a regular user, you should still be able to see what jobs are running on what nodes, and possibly state and properties, since this information is obtained by xpbsmon talking directly to the specified server. Obtaining other system resource values may require special privilege, since xpbsmon has to talk directly to the pbs_mom of a node. In addition, the host where xpbsmon is running must have been given explicit access permission by the mom (unless the GUI is running on the same host as the mom). This is done by updating the $clienthost and/or the $restricted parameter in the mom's configuration file.

To run xpbsmon, type:

   setenv DISPLAY <display_host>:0
   xpbsmon

If you are running the GUI and are only interested in jobs data, be sure to set the type of every node to NOMOM in the Pref dialog box.

The Xpbsmon Display

This section describes the main parts of the xpbsmon display. The main window is composed of 3 distinct areas (subwindows) arranged vertically (one on top of another) in the following order:

   1) Menu
   2) Site Information
   3) Info

Menu. The Menu area is composed of a row of command buttons that signal some action with a click of the left mouse button. The buttons are:

Site..

displays a popup menu containing the list of PBS sites that have been added using the Sites Preferences window. Drag the mouse to the name of the site whose servers/nodes information you would like to see, and release.

Pref..

brings up various dialog boxes for specifying the list of sites, servers on each site, nodes that are known to a server, and the system resource queries to be sent to a node's pbs_mom daemon.

Auto Update..

brings up another window for specifying whether or not to do auto updates of nodes information, and for specifying the interval, in minutes, between updates.

Help

contains some help information.

About

tells who the author is and where to send comments, bug reports, and suggestions.

Close

for exiting xpbsmon and saving the current setup information (if anything has changed) in the user's $HOME/.xpbsmonrc file. The information saved includes the specified list of sites, the servers on each site, the nodes known to each server, and the system resource queries to send to each node's pbs_mom.

Minimize Button

shows the iconized view of Site Information where nodes are represented as tiny boxes, where each box is colored according to status. In order to get more information about a node, you need to double click on the colored box.

Maximize button

shows the full view of Site Information where nodes are represented in bigger boxes, still colored depending on the status, and some information on it is displayed.

Site Information. Only one site at a time can be displayed. This area (shown as one huge box referred to as the site box) can be further sub-divided into 3 areas: the site name label at the top, server boxes in the middle, and the color status bar at the bottom. The site name label shows the name of the site as specified in the Pref.. window. The middle of the site box shows a row of big boxes housing smaller boxes.

The big box is an abstraction of a server host (called a server box), showing its server display label at the top of the box, a grid of smaller boxes representing the nodes that the server knows about (where jobs are run), and summary status for the nodes under the server. Status information will show counters for the number of nodes used, available, reserved, offline, or of unknown status and even # of cpus assigned.  For a cleaner display, some counters with a value of zero are not displayed. The server boxes are placed in a grid, with a new row being started when either *siteBoxMaxNumServerBoxesPerRow or *siteBoxMaxWidth limit has been reached.
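The summary status line described above can be pictured with a small sketch. It is not taken from xpbsmon itself: the state labels are borrowed from the *nodeColor resource names, and hiding every zero counter is an assumption (the text only says that some zero counters are hidden).

```python
from collections import Counter

# Hypothetical state labels, modeled on the *nodeColor... resource names.
STATE_ORDER = ["FREE", "INUSE", "RSVD", "OFFL", "DOWN", "NOINFO"]

def status_summary(node_states):
    """Count nodes per state and leave out counters whose value is zero."""
    counts = Counter(node_states)
    shown = [f"{state}: {counts[state]}" for state in STATE_ORDER if counts[state]]
    return ", ".join(shown)

print(status_summary(["FREE", "FREE", "DOWN", "INUSE"]))
```

Here the RSVD, OFFL, and NOINFO counters are omitted because no node is in those states.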

The smaller boxes represent the nodes/execution hosts where jobs are run (referred to as node boxes). Each node box shows the node name at the top and a sub-box (a smaller square) that is colored according to the status of the node that it represents. If the view type is FULL, the sub-box also displays some node information according to the system resource queries specified on the Pref.. window. Clicking on the sub-box will show a much bigger box (called the MIRROR view) with bigger fonts containing the node's information. Another view is called ICON, and this shows a tiny box with a colored area. The node boxes are arranged in a grid, where a new row is created if either the *serverBoxMaxNumNodeBoxesPerRow or *serverBoxMaxWidth limit has been reached. The ICON view of the node boxes is constrained by the *nodeBoxIconMaxHeight and *nodeBoxIconMaxWidth pixel values; the FULL view is bounded by *nodeBoxFullMaxWidth and *nodeBoxFullMaxHeight; and the MIRROR view is bounded by *nodeBoxMirrorMaxWidth and *nodeBoxMirrorMaxHeight.
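The row-wrapping rule used for both server boxes and node boxes (a new row starts once either the per-row count or the width limit is reached) can be sketched as follows. How xpbsmon actually measures widths is not documented here, so the function and its parameters are illustrative assumptions only.

```python
def layout_rows(box_widths, max_per_row, max_width):
    """Group box widths into rows, wrapping when either limit is reached."""
    rows, current, used = [], [], 0
    for width in box_widths:
        too_many = len(current) >= max_per_row
        too_wide = used + width > max_width
        # Wrap on either limit, but never emit an empty row when a
        # single box is wider than max_width by itself.
        if current and (too_many or too_wide):
            rows.append(current)
            current, used = [], 0
        current.append(width)
        used += width
    if current:
        rows.append(current)
    return rows

# Five boxes of 100 pixels, at most 3 per row and 250 pixels per row.
print(layout_rows([100] * 5, max_per_row=3, max_width=250))
```

In this example the width limit takes effect before the count limit, so each full row holds two boxes.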

Horizontal and vertical scrollbars for the site box, server box, and node box will be displayed as needed.

Finally, the color bar information shows a color chart displaying what the various colors mean in terms of node status. The color-to-status mapping can be modified by setting the X resources: *nodeColorNOINFO, *nodeColorFREE, *nodeColorINUSEshared, *nodeColorINUSEexclusive, *nodeColorDOWN, *nodeColorRSVD, *nodeColorOFFL.

Info. The Info Area shows the progress of some of the background actions performed by xpbsmon. Look into this box for errors.

Widgets Used in Xpbsmon

Some of the widgets used in xpbsmon and how they are manipulated are described in the following:

  1. listbox - the ones found in this GUI are only single-selectable (one entry can be highlighted/selected at a time via a mouse click).
  2. scrollbar - usually appears either vertically or horizontally and contains 5 distinct areas that are mouse clicked to achieve different effects:

    top arrow

    Causes the view in the associated widget to shift up by one unit (i.e. the object appears to move down one unit in its window). If the button is held down the action will auto-repeat.

    top gap

    Causes the view in the associated window to shift up by one less than the number of units in the window (i.e. the portion of the object that used to appear at the very top of the window will now appear at the very bottom). If the button is held down the action will auto-repeat.

    slider

    Pressing button 1 in this area has no immediate effect except to cause the slider to appear sunken rather than raised. However, if the mouse is moved with the button down then the slider will be dragged, adjusting the view as the mouse is moved.

    bottom gap

    Causes the view in the associated window to shift down by one less than the number of units in the window (i.e. the portion of the object that used to appear at the very bottom of the window will now appear at the very top). If the button is held down the action will auto-repeat.

    bottom arrow

    Causes the view in the associated window to shift down by one unit (i.e. the object appears to move up one unit in its window). If the button is held down the action will auto-repeat.

  3. entry - brought into focus with a click of the left mouse button.  To manipulate this widget, simply type in the text value. Use of arrow keys, mouse selection of text for deletion or overwrite, copying and pasting with sole use of mouse buttons are permitted. This widget is usually accompanied by a scrollbar for horizontally scanning a long text entry string.
  4. box - made up of 1 or more listboxes displayed adjacent to each other, giving the effect of a "matrix". Each row from the listboxes makes up an element of the box. To add items to the box, fill in the accompanying entry widgets, one for each listbox, and then click the add button. To remove an item from the box, select the element and then click delete.
  5. spinbox - a combination of an entry widget and a horizontal scrollbar.  The entry widget will only accept values that fall within a defined list of valid values, and incrementing through the valid values is done by clicking on the up/down arrows.
  6. button - a rectangular region appearing either raised or pressed that invokes an action when clicked with the left mouse button.  When the button appears pressed, then hitting the <RETURN> key will automatically select the button.

Updating Preferences

CASE 1: Time Sharing

Suppose you have a time-sharing environment where the front-end is called bower and you have 4 nodes: bower1, bower2, bower3, bower4. bower is the host that runs the server; jobs are submitted to host bower, where they are enqueued for future execution. Also, a pbs_mom daemon is running on each of the execution hosts. If the server bower also maintains a nodes list containing information like state and properties for the 4 nodes, then this will also be reported. To set up xpbsmon, do the following:

  1. Click the Pref.. button on the Menu section.
  2. On the Sites Preference dialog, enter any arbitrary site name, for example "Local". Then click the add button.
  3. On the Server_Host entry box, enter "bower", and on the DisplayLabel entry box, put an arbitrary label (as it would appear on the header of the server box) like "Bower", and then click add.
  4. Click the nodes.. button that is accompanying the Servers box. This would bring up the Server Preference dialog.
  5. Now add the entries "bower1", "bower2", "bower3", "bower4" specifying type MOM for each on the Nodes box.
  6. If you need to monitor certain system resource parameters for each of the nodes, you need to specify query expressions containing resource queries to be sent to the individual PBS moms. For example, if you want to obtain memory usage, then select a node from the Nodes list, click on the query.. button that accompanies the Nodes list, and this would bring up the Query Table dialog. Specify the following input:

    Query_Expr:    (availmem/totmem) * 100
    Display_Info:  Memory Usage:
    Display_Type:  SCALE

    The above says to display the result of the "Query_Expr" in a scale widget calibrated over 100. The queries "availmem" and "totmem" will be sent to the PBS mom, and the expression is evaluated upon receiving all results from the mom. If you want to display the result of another query, say "loadave", directly, then specify the following:

    Query_Expr:    loadave
    Display_Info:  Load Average:
    Display_Type:  TEXT

    NOTE: For a list of queries that can be sent to a pbs_mom, please click on the Help button on the Query table window.
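The way a Query_Expr is handled (each resource name in the expression is sent to the mom, and the expression is evaluated once all replies have arrived) can be illustrated with a minimal sketch. xpbsmon does not expose this as an API; the helper names and the reply values below are made up for illustration, and only the four basic arithmetic operators are supported.

```python
import ast
import operator
import re

# Arithmetic operators this sketch accepts in an expression.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}

def query_names(expr):
    """Return the resource names that would have to be requested."""
    return sorted(set(re.findall(r"[A-Za-z_]\w*", expr)))

def evaluate(expr, values):
    """Evaluate expr once a value is known for every resource name."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.Name):
            return values[node.id]
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        raise ValueError("unsupported expression element")
    return walk(ast.parse(expr, mode="eval"))

expr = "(availmem/totmem) * 100"
print(query_names(expr))
# Made-up replies standing in for what a pbs_mom might report.
replies = {"availmem": 2048, "totmem": 8192}
print(evaluate(expr, replies))
```

With these sample replies the expression yields 25.0, which a SCALE display would show as a quarter of a scale calibrated over 100.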

CASE 2: Jobs Exclusive Environment

Suppose you have a "space non-sharing" environment where the server maintains a list of nodes on which it runs jobs exclusively (one job at a time outstanding per node). Let's call this server b1. Simply update the Preferences information as follows:

  1. Click the Pref.. button on the Menu section.
  2. On the Sites Preference dialog, enter a site name, for example "B System". Then click the add button.
  3. On the Server_Host entry box, enter "b1"; in the DisplayLabel entry box, type "B1" (or whatever label you would like to appear on the header of the server box); and then click add.

CASE 3: Hybrid Time Sharing/Space Sharing Environment

A cluster of heterogeneous machines, time-sharing or jobs exclusive, could easily be represented in xpbsmon by combining steps in CASE 1 and CASE 2.

Leaving Xpbsmon

Click on the Close button located in the Menu bar to leave xpbsmon. If anything has changed, it will bring up a dialog box asking for confirmation before saving preferences information: the list of sites, their view types, the list of servers on each site, the list of nodes known to each server, and the list of queries to be sent to the pbs_mom of each node. The information is saved in the ~/.xpbsmonrc file.

Preferences

The resources that can be set in the X resources file, ~/.xpbsmonrc, are described in the following:

Node Box Properties

Resource names beginning with "*small" or "*node" apply to the properties of the node boxes. A node box is made of an outer frame where the node label sits on top, the canvas (smaller box) is in the middle, and possibly some horizontal/vertical scrollbars.

*nodeColorNOINFO

color of node box when information for the node it represents could not be obtained.

*nodeColorFREE

color of canvas when node it represents is up.

*nodeColorINUSEshared

color when node it represents has more than 1 job running on it, or when node has been marked by the server that manages it as "job-sharing".

*nodeColorINUSEexclusive

list of colors to assign to a node box when host it represents is running only 1 job, or when node has been marked by the server that manages it as "time-sharing". xpbsmon will use this list to assign 1 distinct color per job unless all the colors have been exhausted, in which case, colors will start getting assigned more than once in a round-robin fashion.
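The round-robin use of the color list can be sketched as below. This is an illustration of the behavior described above rather than xpbsmon's own code; the color names and job identifiers are hypothetical.

```python
def assign_colors(job_ids, colors):
    """Give each distinct job a color, cycling once the list is exhausted."""
    assignment = {}
    for job in job_ids:
        if job not in assignment:
            assignment[job] = colors[len(assignment) % len(colors)]
    return assignment

# Hypothetical color list and job identifiers, for illustration only.
palette = ["yellow", "blue", "green"]
jobs = ["101.bower", "102.bower", "101.bower", "103.bower", "104.bower"]
print(assign_colors(jobs, palette))
```

With three colors and four distinct jobs, the fourth job reuses the first color.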

*nodeColorDOWN

color when node it represents is down.

*nodeColorRSVD

color when node it represents is reserved.

*nodeColorOFFL

color when node it represents is offline.

*smallForeground

applies to the color of text inside the canvas.

*smallBackground

applies to the color of the frame.

*smallBorderWidth

distance (in pixels) from other node boxes.

*smallRelief

how node box will visually appear (style).

*smallScrollBorderWidth

significant only in FULL mode, this is the distance of the horizontal/vertical scrollbars from the canvas and lower edge of the frame.

*smallScrollBackground

background color of the scrollbars

*smallScrollRelief

how scrollbars would visually appear (style).

*smallCanvasBackground

color of the canvas (later overridden depending on status of the node it represents)

*smallCanvasBorderWidth

distance of the canvas from the frame and possibly the scrollbars.

*smallCanvasRelief

how the canvas is visually represented (style).

*smallLabelBorderWidth

the distance of the node label from the canvas and the topmost edge of the frame.

*smallLabelBackground

the background of the area of the node label that is not filled.

*smallLabelRelief

how the label would appear visually (style).

*smallLabelForeground

the color of node label text.

*smallLabelFont

the font to use for the node label text.

*smallLabelFontWidth

font width (in pixels) of *smallLabelFont

*smallLabelFontHeight

font height (in pixels) of *smallLabelFont

*smallTextFont

font to use for the text that appears inside a canvas.

*smallTextFontWidth

font width (in pixels) of *smallTextFont.

*smallTextFontHeight

font height (in pixels) of *smallTextFont.

*nodeColorTrough

color of trough part (the /100 portion) of a canvas scale item.

*nodeColorSlider

color of slider part (value portion) of a canvas scale item.

*nodeColorExtendedTrough

color of extended trough (over 100 portion when value exceeds max) of a canvas scale item.

*nodeScaleFactor

tells how much bigger you want the scale item on the canvas to appear. (1 means to keep size as is)

*nodeBoxFullMaxWidth
*nodeBoxFullMaxHeight

maximum width and height (in pixels) of a node box in FULL mode.

*nodeBoxIconMaxWidth
*nodeBoxIconMaxHeight

maximum width and height (in pixels) of a node box in ICON mode.

*nodeBoxMirrorMaxWidth
*nodeBoxMirrorMaxHeight

maximum width and height (in pixels) of a node box displayed on a separate window (after it has been clicked with the mouse to obtain a bigger view)

*nodeBoxMirrorScaleFactor

tells how much bigger you want the scale item on the canvas to appear while the node box is displayed on a separate window (1 means to keep size as is)

Server Box Properties

Resource names beginning with "*medium" apply to the properties of the server boxes. A server box is made of an outer frame where the server display label sits on top, a canvas filled with node boxes is on the middle, possibly some horizontal/vertical scrollbars, and a status label at the bottom.

*mediumLabelForeground

color of text applied to the server display label and status label.

*mediumLabelBackground

background color of the unfilled portions of the server display label and status label.

*mediumLabelBorderWidth

distance of the server display label and status label from other parts of the server box.

*mediumLabelRelief

how the server display label and status label appear visually (style).

*mediumLabelFont

font used for the text of the server display label and status label.

*mediumLabelFontWidth

font width (in pixels) of *mediumLabelFont.

*mediumLabelFontHeight

font height (in pixels) of *mediumLabelFont.

*mediumCanvasBorderWidth

the distance of the server box's canvas from the label widgets.

*mediumCanvasBackground

the background color of the canvas.

*mediumCanvasRelief

how the canvas appears visually (style).

*mediumScrollBorderWidth

distance of the scrollbars from the other parts of the server box.

*mediumScrollBackground

the background color of the scrollbars

*mediumScrollRelief

how the scrollbars appear visually.

*mediumBackground

the color of the server box frame.

*mediumBorderWidth

the distance of the server box from other boxes.

*mediumRelief

how the server box appears visually (style).

*serverBoxMaxWidth
*serverBoxMaxHeight

maximum width and height (in pixels) of a server box.

*serverBoxMaxNumNodeBoxesPerRow

maximum # of node boxes to appear in a row within a canvas.

Miscellaneous Properties

Resource names beginning with "*big" apply to the properties of a site box, as well as to widgets found outside of the server box and node box. This includes the dialog boxes that appear when the menu buttons of the main window are manipulated. The site box is the one that appears on the main region of xpbsmon.

*bigBackground

background color of the outer layer of the main window.

*bigForeground

color applied to regular text that appears outside of the node box and server box.

*bigBorderWidth

distance of the site box from the menu area and the color information area.

*bigRelief

how the site box is visually represented (style)

*bigActiveColor

the color applied to the background of a selection, a selected command button, or a selected scroll bar handle.

*bigShadingColor

a color shading applied to some of the frames to emphasize focus as well as decoration.

*bigSelectorColor

the color applied to the selector box of a radiobutton or checkbutton.

*bigDisabledColor

color applied to a disabled widget.

*bigLabelBackground

color applied to the unfilled portions of label widgets.

*bigLabelBorderWidth

distance from other widgets of a label widget.

*bigLabelRelief

how label widgets appear visually (style)

*bigLabelFont

font to use for labels.

*bigLabelFontWidth

font width (in pixels) of *bigLabelFont.

*bigLabelFontHeight

font height (in pixels) of *bigLabelFont.

*bigLabelForeground

color applied to text that functions as a label.

*bigCanvasBackground

the color of the main region.

*bigCanvasRelief

how the main region appears visually (style)

*bigCanvasBorderWidth

distance of the main region from the menu and info regions.

*bigScrollBorderWidth

if the main region has a scrollbar, this is its distance from other widgets appearing on the region.

*bigScrollBackground

background color of the scrollbar appearing outside a server box and node box.

*bigScrollRelief

how the scrollbar that appears outside a server box and node box appears visually (style)

*bigTextFontWidth

the font width (in pixels) of *bigTextFont

*bigTextFontHeight

the font height (in pixels) of *bigTextFont

*siteBoxMaxWidth

maximum width (in pixels) of the site box.

*siteBoxMaxHeight

maximum height (in pixels) of the site box.

*siteBoxMaxNumServerBoxesPerRow

maximum number of server boxes to appear in a row inside the site box.

*autoUpdate

if set to true, then information about nodes is periodically gathered.

*autoUpdateMins

the # of minutes between polling for data regarding nodes when *autoUpdate is set.

*siteInView

the name of the site that should be in view

*rcSiteInfoDelimeterChar

the separator character for each input within a curly-bracketed line of input of *sitesInfo.

*sitesInfo

{<site1name><sep><site1-display-type><sep><server-name><sep><server-display-label><sep><nodename><sep><nodetype><sep><node-query-expr>}
. . .
{<site2name><sep><site2-display-type><sep><server-name><sep><server-display-label><sep><nodename><sep><nodetype><sep><node-query-expr>}

information about a site where <site1-display-type> can be either {FULL,ICON}, <nodetype> can be {MOM, NOMOM}, and <node-query-expr> has the format:

{ {<expr>} {<expr-label>} <output-format>}

where <output-format> could be {TEXT, SCALE}. It's probably better to use the Pref dialog boxes in order to specify a value for this.

Example:

*rcSiteInfoDelimeterChar: ;
*sitesInfo:     {Mars;ICON;newton;Newton;newton3;NOMOM;} {Langley;FULL;db;DB;db.OpenPBS.org;MOM;{{ ( availmem / totmem ) * 100} {Memory Usage:} SCALE} {{ ( loadave / ncpus ) * 100} {Cpu Usage:} SCALE} {ncpus {Number of Cpus:} TEXT} {physmem {Physical Memory:} TEXT} {idletime {Idle Time (s):} TEXT} {loadave {Load Avg:} TEXT}} {Mars;ICON;newton;Newton;newton4;NOMOM;} {Mars;ICON;newton;Newton;newton1;NOMOM;} {Mars;ICON;newton;Newton;newton2;NOMOM;} {Mars;ICON;b0101;DB;aspasia.OpenPBS.org;MOM;{{ ( availmem / totmem ) * 100} {Memory Usage:} SCALE} {{ ( loadave / ncpus ) * 100} {Cpu Usage:} SCALE} {ncpus {Number of Cpus:} TEXT} {physmem {Physical Memory:} TEXT} {idletime {Idle Time (s):} TEXT} {loadave {Load Avg:} TEXT}} {Mars;ICON;newton;Newton;newton7;NOMOM;}
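To make the layout of a *sitesInfo value easier to read, here is a rough sketch that breaks a shortened version of the example above into its fields. It is not how xpbsmon reads the file; it assumes braces are balanced, that the separator is ";", and that the separator never occurs inside the first six fields. The function names are invented for this illustration.

```python
def split_top_level(text):
    """Split a string into the contents of its top-level {...} groups."""
    groups, depth, start = [], 0, None
    for i, ch in enumerate(text):
        if ch == "{":
            if depth == 0:
                start = i + 1
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                groups.append(text[start:i])
    return groups

def parse_entry(entry, sep=";"):
    """Split one entry into its six leading fields and its query list."""
    site, view, server, label, node, node_type, queries = entry.split(sep, 6)
    return {
        "site": site, "view": view, "server": server, "label": label,
        "node": node, "type": node_type,
        "queries": split_top_level(queries),
    }

# Shortened version of the example value shown above.
value = ("{Mars;ICON;newton;Newton;newton3;NOMOM;} "
         "{Langley;FULL;db;DB;db.OpenPBS.org;MOM;"
         "{{ ( availmem / totmem ) * 100} {Memory Usage:} SCALE} "
         "{loadave {Load Avg:} TEXT}}")
for entry in split_top_level(value):
    print(parse_entry(entry))
```

Each printed dictionary corresponds to one node; a NOMOM node such as newton3 ends up with an empty query list.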

Exit Status

Upon successful processing, the xpbsmon exit status will be a value of zero.

If the xpbsmon command fails, the command exits with a value greater than zero.

If xpbsmon is querying a host running a server with an incompatible version, you may see the following message:

Internal error: pbsstatnode: End of File (15031)

The above message can be safely ignored.

See Also

pbs_sched_tcl(8B) and pbs_mom(8B).

Info

Local PBS