xpbsmon - Man Page
GUI for displaying and monitoring the nodes/execution hosts under PBS
Synopsis
xpbsmon
Description
The xpbsmon command provides a way to graphically display the various nodes that run jobs. A node or execution host may or may not be running a pbs_mom daemon. In the latter case, it could simply be a node name that appears in a nodes file managed by a main pbs_server running on another host. This utility also provides the ability to monitor the values of certain system resources by posting queries to the pbs_mom of a node. With this utility, you can see what job is running on what node, who owns the job, how many nodes are assigned to a job, the status of each node (color-coded, with user-modifiable colors), and how many nodes are available, free, down, reserved, offline, of unknown status, in use running multiple jobs, or executing only 1 job. Please see the sections below for a tour and tutorials of xpbsmon. In addition, every dialog box has a Help button for assistance.
Getting Started
Running xpbsmon will initialize the X resource database from various sources in the following order:
- The RESOURCE_MANAGER property on the root window (updated via xrdb) with settings usually defined in the .Xdefaults file
- Preference settings defined by the system administrator in the global xpbsmonrc file
- User's ~/.xpbsmonrc file - this file defines various X resources like fonts, colors, list of colors to use to represent the various statuses of the nodes, list of PBS sites to query, list of server hosts on each site, list of nodes/execution hosts on each server host, list of system resource queries to send to the nodes' pbs_mom, and various view states. See the Preferences section below for a list of resources that can be set.
Running Xpbsmon
xpbsmon can be run either as a regular user or as superuser. If you run it with less privilege, you may not be able to see all the information for a node. As a regular user, you should still be able to see what jobs are running on what nodes, and possibly node state and properties, since this information is obtained by xpbsmon talking directly to the specified server. Obtaining other system resource values may require special privilege, since xpbsmon has to talk directly to the pbs_mom of a node. In addition, the host where xpbsmon is running must have been given explicit access permission by the mom (unless the GUI is running on the same host as the mom). This is done by updating the $clienthost and/or the $restricted parameter in the mom's configuration file.
To run xpbsmon, type:
setenv DISPLAY <display_host>:0
xpbsmon
If you are running the GUI and are only interested in job data, be sure to set the type of all nodes to NOMOM in the Pref dialog box.
The Xpbsmon Display
This section describes the main parts of the xpbsmon display. The main window is composed of 3 distinct areas (subwindows) arranged vertically (one on top of another) in the following order:
1) Menu
2) Site Information
3) Info
Menu. The Menu area is composed of a row of command buttons that signal some action with a click of the left mouse button. The buttons are:
- Site..
displays a popup menu containing the list of PBS sites that have been added using the Sites Preferences window. Simply drag the mouse to the name of the site whose server/node information you would like to see, and release.
- Pref..
brings up various dialog boxes for specifying the list of sites, servers on each site, nodes that are known to a server, and the system resource queries to be sent to a node's pbs_mom daemon.
- Auto Update..
brings up another window for specifying whether or not to do auto updates of nodes information, and also for specifying the interval number of minutes between updates.
- Help
contains some help information.
- About
tells who the author is and who to send comments, bugs, suggestions to.
- Close
for exiting xpbsmon and saving the current setup information (if anything has changed) in the user's $HOME/.xpbsmonrc file. The information saved includes the specified list of sites, servers on each site, nodes known to each server, and system resource queries to send to each node's pbs_mom.
- Minimize Button
shows the iconized view of Site Information where nodes are represented as tiny boxes, where each box is colored according to status. In order to get more information about a node, you need to double click on the colored box.
- Maximize button
shows the full view of Site Information where nodes are represented in bigger boxes, still colored depending on the status, and some information on it is displayed.
Site Information. Only one site at a time can be displayed. This area (shown as one huge box referred to as the site box) can be further subdivided into 3 areas: the site name label at the top, server boxes in the middle, and the color status bar at the bottom. The site name label shows the name of the site as specified in the Pref.. window. The middle of the site box shows a row of big boxes housing smaller boxes.
The big box is an abstraction of a server host (called a server box), showing its server display label at the top of the box, a grid of smaller boxes representing the nodes that the server knows about (where jobs are run), and summary status for the nodes under the server. Status information will show counters for the number of nodes used, available, reserved, offline, or of unknown status and even # of cpus assigned. For a cleaner display, some counters with a value of zero are not displayed. The server boxes are placed in a grid, with a new row being started when either *siteBoxMaxNumServerBoxesPerRow or *siteBoxMaxWidth limit has been reached.
The smaller boxes represent the nodes/execution hosts where jobs are run (referred to as node boxes). Each node box shows the node name at the top and a sub-box (a smaller square) that is colored according to the status of the node it represents. If the view type is FULL, the sub-box also displays node information according to the system resource queries specified in the Pref.. window. Clicking on the sub-box shows a much bigger box (called the MIRROR view) with bigger fonts containing the node's information. Another view, called ICON, shows a tiny box with a colored area. The node boxes are arranged in a grid, where a new row is created when either the *serverBoxMaxNumNodeBoxesPerRow or the *serverBoxMaxWidth limit has been reached. The ICON view of the node boxes is constrained by the *nodeBoxIconMaxWidth and *nodeBoxIconMaxHeight pixel values; the FULL view is bounded by *nodeBoxFullMaxWidth and *nodeBoxFullMaxHeight; and the MIRROR view is sized by *nodeBoxMirrorMaxWidth and *nodeBoxMirrorMaxHeight.
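The row-wrapping rule described above for server boxes and node boxes can be illustrated with a small sketch. This is not xpbsmon's own code (xpbsmon is not written in Python); the function name layout_rows and the way widths are summed are assumptions made for illustration, and the real program may also account for borders and padding.

```python
def layout_rows(box_widths, max_per_row, max_width):
    """Group box widths (pixels) into rows.

    A new row is started when the current row already holds
    max_per_row boxes, or when adding the next box would push the
    row past max_width. Both limits are hypothetical stand-ins for
    resources such as *serverBoxMaxNumNodeBoxesPerRow and
    *serverBoxMaxWidth.
    """
    rows, current, current_width = [], [], 0
    for width in box_widths:
        row_full = len(current) >= max_per_row
        too_wide = current_width + width > max_width
        # Never wrap on an empty row, so an oversized box still gets placed.
        if current and (row_full or too_wide):
            rows.append(current)
            current, current_width = [], 0
        current.append(width)
        current_width += width
    if current:
        rows.append(current)
    return rows


if __name__ == "__main__":
    # Five 50-pixel boxes, at most 3 per row, at most 120 pixels wide.
    print(layout_rows([50, 50, 50, 50, 50], max_per_row=3, max_width=120))
```

In this example the width limit is reached before the per-row count limit, so each full row holds two boxes.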
Horizontal and vertical scrollbars for the site box, server box, and node box will be displayed as needed.
Finally, the color bar information shows a color chart displaying what the various colors mean in terms of node status. The color-to-status mapping can be modified by setting the X resources: *nodeColorNOINFO, *nodeColorFREE, *nodeColorINUSEshared, *nodeColorINUSEexclusive, *nodeColorDOWN, *nodeColorRSVD, *nodeColorOFFL.
Info. The Info Area shows the progress of some of the background actions performed by xpbsmon. Look into this box for errors.
Widgets Used in Xpbsmon
Some of the widgets used in xpbsmon and how they are manipulated are described in the following:
- listbox - the ones found in this GUI are only single-selectable (one entry can be highlighted/selected at a time via a mouse click).
- scrollbar - usually appears either vertically or horizontally and contains 5 distinct areas that are mouse clicked to achieve different effects:
- top arrow
Causes the view in the associated widget to shift up by one unit (i.e. the object appears to move down one unit in its window). If the button is held down the action will auto-repeat.
- top gap
Causes the view in the associated window to shift up by one less than the number of units in the window (i.e. the portion of the object that used to appear at the very top of the window will now appear at the very bottom). If the button is held down the action will auto-repeat.
- slider
Pressing button 1 in this area has no immediate effect except to cause the slider to appear sunken rather than raised. However, if the mouse is moved with the button down then the slider will be dragged, adjusting the view as the mouse is moved.
- bottom gap
Causes the view in the associated window to shift down by one less than the number of units in the window (i.e. the portion of the object that used to appear at the very bottom of the window will now appear at the very top). If the button is held down the action will auto-repeat.
- bottom arrow
Causes the view in the associated window to shift down by one unit (i.e. the object appears to move up one unit in its window). If the button is held down the action will auto-repeat.
- entry - brought into focus with a click of the left mouse button. To manipulate this widget, simply type in the text value. Use of arrow keys, mouse selection of text for deletion or overwrite, copying and pasting with sole use of mouse buttons are permitted. This widget is usually accompanied by a scrollbar for horizontally scanning a long text entry string.
- box - made up of 1 or more listboxes displayed adjacent to each other, giving the effect of a "matrix". Each row from the listboxes makes up an element of the box. To add items to the box, fill in the accompanying entry widgets, one for each listbox, and then click the add button. To remove an item from the box, select an element and then click delete.
- spinbox - a combination of an entry widget and a horizontal scrollbar. The entry widget will only accept values that fall within a defined list of valid values, and incrementing through the valid values is done by clicking on the up/down arrows.
- button - a rectangular region appearing either raised or pressed that invokes an action when clicked with the left mouse button. When the button appears pressed, then hitting the <RETURN> key will automatically select the button.
Updating Preferences
- CASE 1: Time Sharing
Suppose you have a time-sharing environment where the front-end is called bower and you have 4 nodes: bower1, bower2, bower3, bower4. bower is the host that runs the server; jobs are submitted to host bower, where they are enqueued for future execution. Also, a pbs_mom daemon is running on each of the execution hosts. If the server bower also maintains a nodes list containing information like state and properties for the 4 nodes, then this will also be reported. To set up xpbsmon, do the following:
- Click the Pref.. button on the Menu section.
- On the Sites Preference dialog, enter any arbitrary site name, for example "Local". Then click the add button.
- On the Server_Host entry box, enter "bower", and on the DisplayLabel entry box, put an arbitrary label (as it would appear on the header of the server box) like "Bower", and then click add.
- Click the nodes.. button that accompanies the Servers box. This brings up the Server Preference dialog.
- Now add the entries "bower1", "bower2", "bower3", "bower4" specifying type MOM for each on the Nodes box.
If you need to monitor certain system resource parameters for each of the nodes, you need to specify query expressions containing resource queries to be sent to the individual PBS moms. For example, if you want to obtain memory usage, then select a node from the Nodes list, click on the query.. button that accompanies the Nodes list, and this would bring up the Query Table dialog. Specify the following input:
Query_Expr: (availmem/totmem) * 100
Display_Info: Memory Usage:
Display_Type: SCALE
The above says to display the result of the "Query_Expr" in a scale widget calibrated over 100. The queries "availmem" and "totmem" will be sent to the PBS mom, and the expression is evaluated upon receiving all results from the mom. If you want to display the result of another query, say "loadave", directly, then specify the following:
Query_Expr: loadave
Display_Info: Load Average:
Display_Type: TEXT
NOTE: For a list of queries that can be sent to a pbs_mom, please click on the Help button on the Query table window.
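To make the evaluation of a Query_Expr concrete, the following is a minimal sketch of the idea: collect the resource names in the expression, obtain a value for each, then compute the arithmetic result. It is not how xpbsmon itself is implemented. The helper names query_names and evaluate are hypothetical, and the sketch assumes the mom replies are plain numbers, which this page does not state.

```python
import ast
import operator

# Arithmetic operators allowed in the sketch.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def query_names(expr):
    """Return the sorted resource names that an expression refers to."""
    tree = ast.parse(expr, mode="eval")
    return sorted({n.id for n in ast.walk(tree) if isinstance(n, ast.Name)})


def evaluate(expr, replies):
    """Evaluate expr using a dict of (assumed numeric) mom replies."""

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.Name):
            return float(replies[node.id])
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        raise ValueError("unsupported element in expression")

    return walk(ast.parse(expr, mode="eval"))


if __name__ == "__main__":
    expr = "(availmem/totmem) * 100"
    print(query_names(expr))  # names that would be sent as queries
    # Hypothetical reply values, for illustration only.
    print(evaluate(expr, {"availmem": 2048, "totmem": 8192}))
```

With the hypothetical replies shown, the scale would be drawn at 25 out of 100.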
- CASE 2: Jobs Exclusive Environment
Suppose you have a "space non-sharing" environment where the server maintains a list of nodes on which it runs jobs exclusively (one outstanding job at a time per node). Let's call this server b1. Simply update the Preferences information as follows:
- Click the Pref.. button on the Menu section.
- On the Sites Preference dialog, enter a site name, for example "B System". Then click the add button.
- On the Server_Host entry box, enter "b1", DisplayLabel entry box type "B1" (or whatever label that you would like to appear on the header of the server box), and then click add.
- CASE 3: Hybrid Time Sharing/Space Sharing Environment
A cluster of heterogeneous machines, time-sharing or jobs exclusive, could easily be represented in xpbsmon by combining steps in CASE 1 and CASE 2.
Leaving Xpbsmon
Click on the Close button located in the Menu bar to leave xpbsmon. If anything has changed, a dialog box appears asking you to confirm saving the preferences information: the list of sites, their view types, the list of servers on each site, the list of nodes known to each server, and the list of queries to be sent to the pbs_mom of each node. The information is saved in the ~/.xpbsmonrc file.
Preferences
The resources that can be set in the X resources file, ~/.xpbsmonrc, are described in the following:
Node Box Properties
Resource names beginning with "*small" or "*node" apply to the properties of the node boxes. A node box is made of an outer frame where the node label sits on top, the canvas (smaller box) is in the middle, and possibly some horizontal/vertical scrollbars.
- *nodeColorNOINFO
color of node box when information for the node it represents could not be obtained.
- *nodeColorFREE
color of canvas when node it represents is up.
- *nodeColorINUSEshared
color when node it represents has more than 1 job running on it, or when node has been marked by the server that manages it as "job-sharing".
- *nodeColorINUSEexclusive
list of colors to assign to a node box when the host it represents is running only 1 job, or when the node has been marked by the server that manages it as "job-exclusive". xpbsmon will use this list to assign 1 distinct color per job unless all the colors have been exhausted, in which case colors will be assigned more than once in a round-robin fashion.
- *nodeColorDOWN
color when node it represents is down.
- *nodeColorRSVD
color when node it represents is reserved.
- *nodeColorOFFL
color when node it represents is offline.
- *smallForeground
applies to the color of text inside the canvas.
- *smallBackground
applies to the color of the frame.
- *smallBorderWidth
distance (in pixels) from other node boxes.
- *smallRelief
how node box will visually appear (style).
- *smallScrollBorderWidth
significant only in FULL mode, this is the distance of the horizontal/vertical scrollbars from the canvas and lower edge of the frame.
- *smallScrollBackground
background color of the scrollbars
- *smallScrollRelief
how scrollbars would visually appear (style).
- *smallCanvasBackground
color of the canvas (later overridden depending on status of the node it represents)
- *smallCanvasBorderWidth
distance of the canvas from the frame and possibly the scrollbars.
- *smallCanvasRelief
how the canvas is visually represented (style).
- *smallLabelBorderWidth
the distance of the node label from the canvas and the topmost edge of the frame.
- *smallLabelBackground
the background of the area of the node label that is not filled.
- *smallLabelRelief
how the label would appear visually (style).
- *smallLabelForeground
the color of node label text.
- *smallLabelFont
the font to use for the node label text.
- *smallLabelFontWidth
font width (in pixels) of *smallLabelFont
- *smallLabelFontHeight
font height (in pixels) of *smallLabelFont
- *smallTextFont
font to use for the text that appears inside a canvas.
- *smallTextFontWidth
font width (in pixels) of *smallTextFont.
- *smallTextFontHeight
font height (in pixels) of *smallTextFont.
- *nodeColorTrough
color of trough part (the /100 portion) of a canvas scale item.
- *nodeColorSlider
color of slider part (value portion) of a canvas scale item.
- *nodeColorExtendedTrough
color of extended trough (over 100 portion when value exceeds max) of a canvas scale item.
- *nodeScaleFactor
tells how much bigger you want the scale item on the canvas to appear. (1 means to keep size as is)
- *nodeBoxFullMaxWidth
- *nodeBoxFullMaxHeight
maximum width and height (in pixels) of a node box in FULL mode.
- *nodeBoxIconMaxWidth
- *nodeBoxIconMaxHeight
maximum width and height (in pixels) of a node box in ICON mode.
- *nodeBoxMirrorMaxWidth
- *nodeBoxMirrorMaxHeight
maximum width and height (in pixels) of a node box displayed on a separate window (after it has been clicked with the mouse to obtain a bigger view)
- *nodeBoxMirrorScaleFactor
tells how much bigger you want the scale item on the canvas to appear while the node box is displayed on a separate window (1 means to keep size as is)
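The round-robin color assignment described under *nodeColorINUSEexclusive can be sketched as follows. This is an illustration only, not xpbsmon's code; the function name assign_job_colors, the job identifiers, and the color names are all hypothetical.

```python
from itertools import cycle


def assign_job_colors(job_ids, colors):
    """Give each distinct job one color from the list.

    Colors are handed out in list order. Once every color has been
    used, assignment wraps around to the start of the list, so some
    jobs end up sharing a color.
    """
    palette = cycle(colors)
    assignment = {}
    for job in job_ids:
        if job not in assignment:
            assignment[job] = next(palette)
    return assignment


if __name__ == "__main__":
    jobs = ["1.b1", "2.b1", "3.b1"]   # hypothetical job identifiers
    colors = ["yellow", "blue"]       # hypothetical color list
    print(assign_job_colors(jobs, colors))
```

With two colors and three jobs, the third job reuses the first color, which is why a longer color list makes jobs easier to tell apart on the display.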
Server Box Properties
Resource names beginning with "*medium" apply to the properties of the server boxes. A server box is made of an outer frame where the server display label sits on top, a canvas filled with node boxes is in the middle, possibly some horizontal/vertical scrollbars, and a status label at the bottom.
- *mediumLabelForeground
color of text applied to the server display label and status label.
- *mediumLabelBackground
background color of the unfilled portions of the server display label and status label.
- *mediumLabelBorderWidth
distance of the server display label and status label from other parts of the server box.
- *mediumLabelRelief
how the server display label and status label appear visually (style).
- *mediumLabelFont
font used for the text of the server display label and status label.
- *mediumLabelFontWidth
font width (in pixels) of *mediumLabelFont.
- *mediumLabelFontHeight
font height (in pixels) of *mediumLabelFont.
- *mediumCanvasBorderWidth
the distance of the server box's canvas from the label widgets.
- *mediumCanvasBackground
the background color of the canvas.
- *mediumCanvasRelief
how the canvas appears visually (style).
- *mediumScrollBorderWidth
distance of the scrollbars from the other parts of the server box.
- *mediumScrollBackground
the background color of the scrollbars
- *mediumScrollRelief
how the scrollbars appear visually.
- *mediumBackground
the color of the server box frame.
- *mediumBorderWidth
the distance of the server box from other boxes.
- *mediumRelief
how the server box appears visually (style).
- *serverBoxMaxWidth
- *serverBoxMaxHeight
maximum width and height (in pixels) of a server box.
- *serverBoxMaxNumNodeBoxesPerRow
maximum # of node boxes to appear in a row within a canvas.
Miscellaneous Properties
Resource names beginning with "*big" apply to the properties of a site box, as well as to widgets found outside of the server box and node box. This includes the dialog boxes that appear when the menu buttons of the main window are manipulated. The site box is the one that appears on the main region of xpbsmon.
- *bigBackground
background color of the outer layer of the main window.
- *bigForeground
color applied to regular text that appears outside of the node box and server box.
- *bigBorderWidth
distance of the site box from the menu area and the color information area.
- *bigRelief
how the site box is visually represented (style)
- *bigActiveColor
the color applied to the background of a selection, a selected command button, or a selected scroll bar handle.
- *bigShadingColor
a color shading applied to some of the frames to emphasize focus as well as decoration.
- *bigSelectorColor
the color applied to the selector box of a radiobutton or checkbutton.
- *bigDisabledColor
color applied to a disabled widget.
- *bigLabelBackground
color applied to the unfilled portions of label widgets.
- *bigLabelBorderWidth
distance from other widgets of a label widget.
- *bigLabelRelief
how label widgets appear visually (style)
- *bigLabelFont
font to use for labels.
- *bigLabelFontWidth
font width (in pixels) of *bigLabelFont.
- *bigLabelFontHeight
font height (in pixels) of *bigLabelFont.
- *bigLabelForeground
color applied to text that functions as a label.
- *bigCanvasBackground
the color of the main region.
- *bigCanvasRelief
how the main region looks visually (style)
- *bigCanvasBorderWidth
distance of the main region from the menu and info regions.
- *bigScrollBorderWidth
if the main region has a scrollbar, this is its distance from other widgets appearing on the region.
- *bigScrollBackground
background color of the scrollbar appearing outside a server box and node box.
- *bigScrollRelief
how the scrollbar that appears outside a server box and node box looks visually (style)
- *bigTextFontWidth
the font width (in pixels) of *bigTextFont
- *bigTextFontHeight
the font height (in pixels) of *bigTextFont
- *siteBoxMaxWidth
maximum width (in pixels) of the site box.
- *siteBoxMaxHeight
maximum height (in pixels) of the site box.
- *siteBoxMaxNumServerBoxesPerRow
maximum number of server boxes to appear in a row inside the site box.
- *autoUpdate
if set to true, then information about nodes is periodically gathered.
- *autoUpdateMins
the # of minutes between polling for data regarding nodes when *autoUpdate is set.
- *siteInView
the name of the site that should be in view
- *rcSiteInfoDelimeterChar
the separator character for each input within a curly-bracketed line of input of *sitesInfo.
- *sitesInfo
{<site1name><sep><site1-display-type><sep><server-name><sep><server-display-label><sep><nodename><sep><nodetype><sep><node-query-expr>}
. . .
{<site2name><sep><site2-display-type><sep><server-name><sep><server-display-label><sep><nodename><sep><nodetype><sep><node-query-expr>}
information about a site where <site1-display-type> can be either {FULL,ICON}, <nodetype> can be {MOM, NOMOM}, and <node-query-expr> has the format:
{ {<expr>} {expr-label} <output-format>}
where <output-format> could be {TEXT, SCALE}. It's probably better to use the Pref dialog boxes in order to specify a value for this.
Example:
*rcSiteInfoDelimeterChar: ;
*sitesInfo: {Mars;ICON;newton;Newton;newton3;NOMOM;} {Langley;FULL;db;DB;db.OpenPBS.org;MOM;{{ ( availmem / totmem ) * 100} {Memory Usage:} SCALE} {{ ( loadave / ncpus ) * 100} {Cpu Usage:} SCALE} {ncpus {Number of Cpus:} TEXT} {physmem {Physical Memory:} TEXT} {idletime {Idle Time (s):} TEXT} {loadave {Load Avg:} TEXT}} {Mars;ICON;newton;Newton;newton4;NOMOM;} {Mars;ICON;newton;Newton;newton1;NOMOM;} {Mars;ICON;newton;Newton;newton2;NOMOM;} {Mars;ICON;b0101;DB;aspasia.OpenPBS.org;MOM;{{ ( availmem / totmem ) * 100} {Memory Usage:} SCALE} {{ ( loadave / ncpus ) * 100} {Cpu Usage:} SCALE} {ncpus {Number of Cpus:} TEXT} {physmem {Physical Memory:} TEXT} {idletime {Idle Time (s):} TEXT} {loadave {Load Avg:} TEXT}} {Mars;ICON;newton;Newton;newton7;NOMOM;}
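As a rough illustration of the entry layout above, the sketch below splits one curly-bracketed *sitesInfo entry into its seven fields. It does not reproduce xpbsmon's own parser. The function name parse_site_entry and the dictionary keys are hypothetical, and the sketch assumes the separator never occurs inside the first six fields.

```python
def parse_site_entry(entry, sep=";"):
    """Split one *sitesInfo entry into named fields.

    Only the first six separators are treated as field boundaries,
    so the trailing query expression is kept intact even if it
    happens to contain the separator character.
    """
    body = entry.strip()
    if not (body.startswith("{") and body.endswith("}")):
        raise ValueError("entry must be wrapped in curly brackets")
    fields = body[1:-1].split(sep, 6)
    if len(fields) != 7:
        raise ValueError("expected 7 separator-delimited fields")
    keys = (
        "site", "display_type", "server", "server_label",
        "node", "node_type", "query_expr",
    )
    return dict(zip(keys, fields))


if __name__ == "__main__":
    simple = "{Mars;ICON;newton;Newton;newton3;NOMOM;}"
    print(parse_site_entry(simple))

    with_query = "{Langley;FULL;db;DB;db.OpenPBS.org;MOM;{loadave {Load Avg:} TEXT}}"
    print(parse_site_entry(with_query)["query_expr"])
```

A NOMOM node typically carries an empty query expression, which appears here as an empty string in the last field.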
Exit Status
Upon successful processing, the xpbsmon exit status will be a value of zero.
If the xpbsmon command fails, the command exits with a value greater than zero.
If xpbsmon is querying a host running a server with an incompatible version, you may see the following message:
Internal error: pbsstatnode: End of File (15031)
The above message can be safely ignored.
See Also
pbs_sched_tcl(8B) and pbs_mom(8B).