Pegasus news feed > Pegasus @ SC16

  Are you going to attend the SC16 conference in Salt Lake City, Utah on November 13-18, 2016? We will be presenting two research papers in the workshops below. Please join us in the workshops and let’s have some coffee and interesting discussions. DataCloud 2016: The Seventh … Read More

News and Announcements from OSG Operations > Announcing OSG CA Certificate Package Update

We are pleased to announce our first data-only release for the OSG Software Stack. Data releases will not contain any software changes. A typical data release would only have CA certificate and/or VO package changes.

This release contains updated CA Certificates based on IGTF 1.78:
* Removed superseded INFN-CA-2006 CA (IT)
* Updated Debian packaging to support APT security improvements
* Updated namespaces and signing_policy files for CILogon Basic CA to permit DNs without "/C=US" (US)
* Added G2 series (sha-2) QuoVadis Root 2 and Grid ICA G2 (BM)
* Removed discontinued UniandesCA (CO)

Release notes and pointers to more documentation can be found at:

Need help? Let us know:

We welcome feedback on this release!

News and Announcements from OSG Operations > End of OSG GRAM Support - 8 November 2016

As part of the OSG transition from GRAM to HTCondor-CE technology, OSG Operations and Software teams will stop supporting GRAM CE installations on November 8, 2016. A site that requests support for GRAM or GRAM-based software will be asked to upgrade to HTCondor-CE instead. However, GRAM packages will remain available for sites that accept full responsibility for the operation of GRAM.

Other milestones that will likely occur in early 2017:

* Ending support for GRAM CEs at the OSG pilot factories, which will leave many VOs unable to run at GRAM sites

* Dropping GRAM software from the OSG software stack (starting in OSG 3.4.0)

* Removing GRAM-based code from OSG tools like osg-configure


An OSG CE is the entry point for the OSG to your local resources. At the heart of the CE is the job gateway software, which accepts incoming jobs, authorizes them, and delegates them to your batch system for execution. These days, OSG jobs come from the OSG factories and are really pilot jobs, which in turn run actual end-user jobs. Since OSG started, the Globus GRAM gatekeeper has managed grid jobs, but starting in May 2014, the OSG added another job gateway option - HTCondor-CE - that is built on core HTCondor technology. For many reasons, OSG has decided to migrate all CEs to HTCondor-CE and is today in the midst of this change.

Since August 2014, new CE installs have included both HTCondor-CE and GRAM, and since December 2014, HTCondor-CE has been the default job gateway software for a new site. Today, HTCondor-CE is on 60 CEs and is the sole job gateway software on many of them; the list of sites that have already migrated includes some of the largest OSG sites, as well as medium and small ones.

This is a significant technical change for OSG, but the good news is that it is well underway and has been going very smoothly for most sites. If you are still running GRAM, please consider migrating to HTCondor-CE as soon as possible (you can run both for a while, if you like). Then, once your HTCondor-CE is running well, decommission the GRAM software.

News and Announcements from OSG Operations > Planned Retirement of OSG BDII

OSG Collaborators,

OSG Operations and Technology are planning to retire the BDII information service on March 31, 2017. We have been working with WLCG, ATLAS, and CMS to remove dependencies or to replace the functionality within our HTCondor Collector service. This work is still ongoing. This message is to alert you to the upcoming deprecation date and to gather feedback on any other existing dependencies on the BDII.

If you are dependent in any way on the OSG BDII, or on information the OSG BDII supplies to the WLCG or EGI BDIIs, please contact us at

Pegasus news feed > Pegasus 4.7.0 Released

We are happy to announce the release of Pegasus 4.7.0. Pegasus 4.7.0 is a major release of Pegasus and includes all the bug fixes and improvements in the 4.6.2 release.
New features and improvements in 4.7.0 include:
  • Automatic submit directory organization
  • Improved directory management on staging site in nonsharedfs mode
  • pegasus-analyzer reports information about held jobs
  • Check for cyclic dependencies in DAG


New Features

  1. [PM-833] – Pegasus should organize submit files of workflows in hierarchical data structure
    • Pegasus now automatically distributes the files in the HTCondor submit directory for all workflows into a two-level directory structure. This is done to prevent having too many workflow and Condor submit files in one directory for a large workflow. The behavior of submit directory organization can be controlled by the following properties:
      • pegasus.dir.submit.mapper.hashed.levels – the number of directory levels used to accommodate the files. Defaults to 2.
      • pegasus.dir.submit.mapper.hashed.multiplier – the number of files associated with a job in the submit directory. Defaults to 5.
    • Note that this is enabled by default. If you want the pre-4.7.0 behavior, set the property below (see the sample properties snippet at the end of these release notes):
      • pegasus.dir.submit.mapper Flat
    • Submit mapper properties are documented in the user guide.
  2. [PM-833] – Pegasus should manage directory structure on the staging site
    For nonsharedfs mode, Pegasus now automatically manages the directory structure on the staging site in a hierarchical directory structure via the use of staging mappers. The staging mappers determine what subdirectory on the staging site a job will be associated with. Before the introduction of staging mappers, all files associated with the jobs scheduled for a particular site landed in the same directory on the staging site. As a result, for large workflows this could degrade filesystem performance on the staging servers. More information can be found in the documentation.
  3. [PM-1036] – R DAX API
    Pegasus now includes an R API for generating DAXes of complex and large workflows in R environments. The API follows Google’s R style guide, and all objects and methods are defined using the S3 OOP system. The source package can be obtained by running pegasus-config --r or from the Pegasus downloads page. A tutorial workflow can be generated using pegasus-init, and an example workflow is provided in the examples folder. More information can be found in the documentation.

Related JIRA item: [PM-1074] – Add R example to pegasus-init

  • [PM-1126] – pegasus-analyzer should report information about held jobs
    The Pegasus monitoring daemon now populates the reason for held jobs in its database. Both pegasus-analyzer and the dashboard were updated to show this information.
    Related JIRA items:
    • [PM-1121] – Store reasons for workflow failure in stampede database
    • [PM-1122] – update dashboard to display reasons for workflow state and jobstate
  • [PM-1058] – Create homebrew tap for pegasus
    Pegasus is now also available via Homebrew on macOS through a tap repository that contains formulas for pegasus and htcondor. Users can do:
    $ brew tap pegasus-isi/tools
    $ brew install pegasus htcondor
    $ brew tap homebrew/services
    $ brew services start htcondor
  • [PM-928] – pegasus-exitcode should write its output to a log file
    pegasus-exitcode is now set to write to a workflow-global log file ending in exitcode.log that captures pegasus-exitcode stdout and stderr as JSON messages. This allows users to check pegasus-exitcode messages, which otherwise would have been sent to /dev/null by Condor DAGMan.
  • [PM-1115] – Pegasus to check for cyclic dependencies in the DAG
    Pegasus now explicitly checks for cyclic dependencies and reports one of the edges making up the cycle.
  • [PM-1054] – Add option to ignore files in libinterpose
    kickstart now has support for the environment variables KICKSTART_TRACE_MATCH and KICKSTART_TRACE_IGNORE, which determine what file accesses are captured via libinterpose. The MATCH version only traces files that match the patterns, and the IGNORE version does NOT trace files that match the patterns. Only one of the two can be specified.
  • [PM-915] – modify kickstart to collect and aggregate runtime metadata
  • [PM-1004] – update metrics server ui to display extra planner configuration metrics
  • [PM-1111] – pegasus planner and APIs should have support for ppc64 as architecture type
  • [PM-1117] – Support for tutorial via pegasus-init on bluewaters
    Improvements

    1. [PM-1125] – Disable builds for older platforms
    2. [PM-1116] – pass task resource requirements as environment variables for job wrappers to pick up
    3. [PM-1112] – enable variable expansion for regex based replica catalog
    4. [PM-1105] – Mirror job priorities to DAGMan node priorities
      • HTCondor ticket 5749. DAG priorities can be assigned only if the detected Condor version is greater than 8.5.
    5. [PM-1094] – pegasus-dashboard file browser should load on demand
    6. [PM-1079] – pegasus-statistics should be able to skip over failures when generating particular type of stats
    7. [PM-1073] – condor_q changes in 8.5.x will affect pegasus-status
    8. [PM-1023] – Kickstart stdout/stderr as CDATA
    9. [PM-749] – Store job held reasons in stampede database
    10. [PM-1088] – Move to relative paths in dagman and condor submit files
    11. [PM-900] – site catalog for XSEDE
    12. [PM-901] – site for OSG

    Bugs Fixed

    1. [PM-1061] – pegasus-analyzer should detect and report on failed job submissions
    2. [PM-1118] – database changes to jobstate and workflow state tables
    3. [PM-1124] – Hashed Output Mappers throw unable to instantiate error
    4. [PM-1127] – Wf adds worker package staging even though it has already placed a worker package in place
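
    For reference, a properties file snippet pulling together the submit-mapper settings mentioned above might look like the following. This is a sketch only; the hashed values shown are simply the documented defaults, and the Flat mapper line restores the pre-4.7.0 layout.

      # restore the pre-4.7.0 flat submit directory layout
      # pegasus.dir.submit.mapper Flat

      # or tune the hashed organization that is enabled by default
      pegasus.dir.submit.mapper.hashed.levels 2
      pegasus.dir.submit.mapper.hashed.multiplier 5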




    Pegasus news feed > Pegasus at ASHG 2016

    Are you going to the ASHG’16 conference in Vancouver, Canada and want to learn more about Pegasus? Check out the poster on “PgmNr 1877: Automated Genotypic Imputation of PAGE II Data using Scientific Workflows” in the Posters area. The poster will be presented on Thursday, Oct 20th, 2:00pm – 3:00pm in Exhibit Hall B, West Building.

    We will also be available to meet with users individually during the conference. Send email to pegasus AT isi dot edu to set up a meeting.

    We hope to see you there!



    News and Announcements from OSG Operations > Announcing OSG Software version 3.3.17

    We are pleased to announce OSG Software version 3.3.17.

    Changes to OSG 3.3.17 include:
    * HTCondor 8.4.9: Job Router prompts schedd reschedule, other bug fixes
    * HTCondor-CE 2.0.10: handles unbounded accounting directory, other bug fixes
    * Update gratia probe to work with more recent versions of Slurm
    * Frontier-squid 2.7.STABLE9-27: fix unbounded growth of swap.state
    * CVMFS 2.3.2: secure access to data in repositories supported
    * XRootD 4.4.0
    * Tarballs contain RPM package version list
    * Add fallback default in gratia probe for HTCondor-CE history folder
    * Several configuration updates to better mesh with EL7 and systemd
    * HTCondor 8.5.7 in upcoming: schedd can perform job ClassAd transformations
    * VO Package v69: Added: miniclean VO; Removed: LNBE, CDF INFN

    Release notes and pointers to more documentation can be found at:

    Need help? Let us know:

    We welcome feedback on this release!

    News and Announcements from OSG Operations > GOC Service Update - Tuesday, October 11th at 13:00 UTC

    The GOC will upgrade the following services beginning Tuesday, October 11th at 13:00 UTC.
    The GOC reserves 8 hours in the unlikely event unexpected problems are encountered.

    - Reverting the previous temporary fix for CILogon DNS issues, as the issue has been fixed at the DNS level

    Condor Project News > HTCondor 8.5.7 released! ( September 29, 2016 )

    The HTCondor team is pleased to announce the release of HTCondor 8.5.7. This development series release contains new features that are under development. This release contains all of the bug fixes from the 8.4.9 stable release. Highlights of the release are: The schedd can perform job ClassAd transformations; Specifying dependencies between DAGMan splices is much more flexible; The second argument of the ClassAd ? : operator may be omitted; Many usability improvements in condor_q and condor_status; condor_q and condor_status can produce JSON, XML, and new ClassAd output; To prepare for a 64-bit Windows release, HTCondor identifies itself as X86; Automatically detect Daemon Core daemons and pass localname to them. Further details can be found in the Development Version History and the Stable Version History. HTCondor 8.5.7 binaries and source code are available from our Downloads page.

    Condor Project News > HTCondor 8.4.9 released! ( September 29, 2016 )

    The HTCondor team is pleased to announce the release of HTCondor 8.4.9. A stable series release contains significant bug fixes. Highlights of this release are: The condor_startd removes orphaned Docker containers on restart; Job Router and HTCondor-C job submission prompts schedd reschedule; Fixed bugs in the Job Router's hooks; Improved systemd integration on Enterprise Linux 7; Upped default number of Chirp attributes to 100, and made it configurable; Fixed a bug where variables starting with STARTD. or STARTER. were ignored. Further details can be found in the Version History. HTCondor 8.4.9 binaries and source code are available from our Downloads page.

    Pegasus news feed > Pegasus Workshop at USC, September 30th

    Time: 2:30pm-4:30pm
    Date:  Friday, September 30th, 2016
    Location: VPD 106 (Verna and Peter Dauterive Hall, UPC).

    Instructor:  The USC HPC and Pegasus team
    Course Material:


    The USC HPC and Pegasus Team is hosting a half-day workshop on September 30th, 2016 at the USC main campus. This workshop includes a hands-on component that requires an active HPC account. If you don’t have a USC HPC account and want to attend the workshop, the HPC team can now offer temporary HPC accounts for workshop attendees. To be eligible, you must have a USC NetID, and you must register via the Registration link below. This is a great way to check out HPC and learn about workflows if you do not have an HPC account.

    Scientific Workflows via The Pegasus Workflow Management System on the HPC Cluster

    Workflows are a key technology for enabling complex scientific applications. They capture the interdependencies between processing steps in data analysis and simulation pipelines, as well as the mechanisms to execute those steps reliably and efficiently in a distributed computing environment. They also enable scientists to capture complex processes to promote sharing and reuse, and provide provenance information necessary for the verification of scientific results and scientific reproducibility.

    In this workshop, we will focus on how to model scientific analysis as a workflow that can be executed on the USC HPC cluster using Pegasus WMS. Pegasus allows users to design workflows at a high level of abstraction that is independent of the resources available to execute them and of the location of data and executables. It compiles these abstract workflows to executable workflows that can be deployed onto distributed resources such as local campus clusters, computational clouds, and grids such as XSEDE and the Open Science Grid. During the compilation process, Pegasus WMS does data discovery, whereby it determines the locations of input data files and executables. Data transfer tasks are added to the executable workflow that are responsible for staging in the input files to the cluster and staging the generated output files back to a user-specified location. In addition to the data transfer tasks, data cleanup tasks (which clean up data that is no longer required) and data registration tasks (which catalog the output files) are added to the pipeline.

    Through hands-on exercises, we will cover issues of workflow composition, how to design a workflow in a portable way, workflow execution and how to run the workflow efficiently and reliably on the USC HPC cluster. An important component of the tutorial will be how to monitor, debug and analyze workflows using Pegasus-provided tools. The workshop will also cover how to execute MPI application codes as part of a workflow.

    This workshop is intended for both new and existing HPC users. It is highly recommended that you take the Introduction to Linux/Unix workshop if you haven’t worked in the Linux environment before. Participants are expected to bring their own laptops with the following software installed: an SSH client, a web browser, and a PDF reader. If you have any questions about either of these workshops, please send us an email. We look forward to seeing you there!



    Derek's Blog > Running R at HCC

    The full presentation.

    There are many methods to run R applications at HCC. I can break these down into:

    1. Creating a traditional Slurm submit file that runs an R script. The vast majority of R users do this.
    2. Using a program, such as GridR, that will create the submission files for you from within R.

    In this post, I will discuss and lay out the different methods of submitting jobs to HCC and the OSG. Further, these methods lie on a spectrum of difficulty.

    Difficulty spectrum: each step is more and more difficult. Running R on your laptop is much easier than running R on a cluster, and running R on a cluster is less difficult than running it on the grid. But there are techniques to bring these closer together.

    Creating Slurm submit files

    Creating Slurm submit files and writing R scripts is the most common method for R users at HCC. The steps in this workflow are:

    1. Create a Slurm submit file
    2. Write an R script that will read in your data and output it
    3. Copy data onto cluster from the laptop
    4. Submit Slurm submit file
    5. Wait for completion (you can ask to get an email)

    More on the Slurm configuration is available at HCC Documentation.

    A submit file for Slurm is below:

    #!/bin/bash
    #SBATCH --time=00:30:00
    #SBATCH --mem-per-cpu=1024
    #SBATCH --job-name=TestJob
    #SBATCH --error=TestJob.stderr
    #SBATCH --output=TestJob.stdout
    module load R/3.3
    R CMD BATCH Rcode.R

    This submit file describes a job that will run for 30 minutes and require 1024 MB of RAM. Below the #SBATCH lines is the actual script that will run on the worker node. The module command loads the newest version of R on HCC’s clusters, and the next command runs an R script named Rcode.R.
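
    As a usage sketch (assuming the submit file above is saved as, say, submit_r.slurm; that file name is only an illustrative placeholder), the job is handed to Slurm and checked with the standard commands:

    $ sbatch submit_r.slurm    # submit the job to the scheduler
    $ squeue -u $USER          # check on the job's status
    $ cat Rcode.Rout           # R CMD BATCH writes the script's output here once the job finishes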

    A parallel submission is:

    #!/bin/bash
    #SBATCH --ntasks-per-node=16
    #SBATCH --nodes=1
    #SBATCH --time=00:30:00
    #SBATCH --mem-per-cpu=1024
    #SBATCH --job-name=TestJob
    #SBATCH --error=TestJob.stderr
    #SBATCH --output=TestJob.stdout
    module load R/3.3
    R CMD BATCH Rcode.R

    This submit file adds --ntasks-per-node and --nodes=1, which describe the parallel job. --ntasks-per-node specifies how many cores on a remote worker node are required for the job, and --nodes describes the number of physical nodes that this job should span. All other lines are very similar to the previous single-core submission file.

    The R code looks a bit different though. Here is an example:

    library(parallel)  # provides mclapply
    a <- function(s) { return (2*s) }
    mclapply(c(1:20), a, mc.cores = 16)

    This will run mclapply, which applies the made-up function a across the list specified by c(1:20), using 16 cores.

    Using GridR to submit processing

    GridR is another method for farming processing out to a remote cluster. GridR is able to submit to HTCondor clusters; therefore, it is able to submit to the OSG through HTCondor.

    The GridR package is hosted on GitHub. The wiki is very useful, with examples and tutorials on how to use GridR.

    Below is a working example script of using GridR on HCC’s Crane cluster.

    # Load and initialize the GridR library for submissions
    library(GridR)
    grid.init(service="condor.local", localTmpDir="tmp", bootstrap=TRUE, remoteRPath="/util/opt/R/3.3/gcc/4.4/bin/R", Rurl="")
    # Create a quick function to run remotely
    a <- function(s) { return (2*s) }
    # Run the apply function, much like lapply.  In this case, with only 1 attribute to apply
    grid.apply("x", a, 13, wait = TRUE)
    # Output the results (grid.apply stores them in the variable named by its first argument)
    print(x)

    This R script submits jobs to the OSG from the Crane cluster. It will run the simple function a on remote worker nodes on the OSG.
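
    As a usage sketch (the file name gridr_osg.R is only an illustrative placeholder), the script can be run from a Crane login node like any other R script:

    $ module load R/3.3     # the same R module used in the Slurm examples above
    $ Rscript gridr_osg.R   # GridR performs the HTCondor submission from within R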

    The jobs can run anywhere on the OSG:

    OSG Running Jobs

    Jobs submitted to the OSG can run on multiple sites around the U.S. They will execute and return the results.


    There are many methods for submitting R processing to clusters and the grid. One has to choose the one that best suits their needs.

    The GridR method is easy for experienced R programmers, but it lacks the flexibility of the Slurm submit file method. The Slurm submit method requires learning some Linux and Slurm syntax, but it offers the flexibility to specify multiple cores per R script or more memory per job.

    News and Announcements from OSG Operations > GOC Service Update - Tuesday, September 27th at 13:00 UTC

    The GOC will upgrade the following services beginning Tuesday, September 27th at 13:00 UTC. The GOC reserves 8 hours in the unlikely event unexpected problems are encountered.

      * Changes to wording for instructions of command line host cert issuance
      * Changes to DNS query for host cert issuance via CILogon. We will try to resolve the DNS address more than once on failure.

      * Configuration changes to extend 24-hour recovery limitations

    Ticket Exchange
      * Configuration changes
      * GGUS synchronization format update
      * FNAL synchronization format update

    OSG Website
      * Routine Wordpress version and plugin updates

    All Services
      * Operating system updates; reboots will be required. The usual HA mechanisms will be used, but some services will experience brief outages. Additionally, the primary and backup LDAP and DNS servers used internally will be exchanged to allow maintenance on the original primary server.

    Pegasus news feed > $1M NSF award for Data Integrity Project


    The three-year project, Scientific Workflow Integrity with Pegasus, is funded by a $1 million grant from the National Science Foundation (NSF) as part of its Cybersecurity Innovation for Cyberinfrastructure (CICI) program. Von Welch, director of IU’s Center for Applied Cybersecurity Research (CACR), is the project’s principal investigator.

    The Pegasus Workflow Management System is popular among the research community for its ability to easily structure and execute large-scale data analyses. It benefits a wide range of scientific applications, including LIGO (the Laser Interferometer Gravitational-Wave Observatory), which announced the first direct detection of gravitational waves earlier this year, proving that Einstein’s theory was right.

    IU will receive nearly half of the grant, $479,855, to increase cybersecurity within Pegasus’s computational science and give scientists added peace of mind by providing the means to validate their data. The remaining half has been awarded to the project’s collaborators: the Renaissance Computing Institute (RENCI) at the University of North Carolina ($230k) and the Information Sciences Institute (ISI) at the University of Southern California ($290k).

    By digitally signing the data that is run through Pegasus, these improvements will strengthen consistency in results from multiple workflows. They’ll also allow users to see whether their data has changed since the last time a workflow was completed.

    “Scientific data is a key part of scientific workflows and, ultimately, the science project,” said Welch. “By integrating support for data integrity into the popular workflow management tool Pegasus, we increase our trust in computational science in a manner that will be easy for scientists to use.”

    Welch and Steven Myers, associate professor at IU’s School of Informatics and Computing, will lead the project team, which includes experts in cybersecurity and virtualization, alongside the Pegasus development team.

    One of the challenges of the new project will be to make sure that the cryptography used for ensuring data integrity, such as the digital signatures, will scale appropriately to handle the increasingly large scientific datasets. Myers, an expert in cryptography, will guide the selection, implementation and deployment of the cryptographic systems, making sure they are efficient, and likely to maintain their security over the lengthy time periods scientific data is referenced and used.

    “Cryptography can provide strong assurances of data integrity and records of its origin and modifications over the long periods of time that much scientific data is used and must be maintained,” said Myers. “Given the experimental costs of some of this data, having strong assurances is critical, as some groups have definite motive to modify the data, and the experiments are incredibly costly to reproduce if the data’s integrity is questioned.”

    Scientists from a variety of disciplines, including astronomy, bioinformatics, earthquake science, gravitational wave physics, ocean science and neuroscience, have used Pegasus to run over 700,000 workflows over the last three years. However, Welch’s team aims to achieve solutions that will be generic enough to apply to other workflow systems and applications and help an even broader scope of researchers.

    “I am very excited to work with the IU and RENCI teams to include new and critical data integrity solutions into Pegasus,” said Ewa Deelman, research professor and director at ISI. “The results of this work will benefit a number of science disciplines and will help scientists to have a higher degree of trust in their results and the results shared by their colleagues.”





    News and Announcements from OSG Operations > Announcing OSG Software version 3.3.16

    We are pleased to announce OSG Software version 3.3.16.

    Changes to OSG 3.3.16 include:
    * Updated most Globus Packages to latest available from EPEL
    * Note: Now Globus Toolkit strictly checks host names against certificates
    * BLAHP 1.18.25: Additional features supported for SGE, PBS Pro, and Slurm
    * Update to GlideinWMS 3.2.15
    * Fixed major scalability problem in GUMS on EL7
    * HTCondor-CE 2.0.8: Support for TERENA eScience, minor bug fixes
    * The MyProxy server now produces RFC compliant proxies
    * Fixed load-balancing in Globus GridFTP when using IPv6 addresses
    * Added the HTCondor CREAM GAHP for EL7 platforms
    * Completed porting components of OSG Software Stack to EL7
    * Added RSV GlideinWMS Tester for VO Front-ends to test site support
    * Updated lcas-lcmaps-gt4-interface to version 0.3.1
    * VO Package v68: Added project8 VO

    Release notes and pointers to more documentation can be found at:

    Need help? Let us know:

    We welcome feedback on this release!

    News and Announcements from OSG Operations > Scheduled FermiLab Power Outage

    This weekend, Fermilab will have a scheduled power outage in the Feynman Computer Center to repair an automatic power transfer switch. The transfer switch ensures that the lab’s computing services have redundant power. This scheduled outage will cause many services at the laboratory to be unavailable.

    The outage date is Saturday, Sept. 17, the same day as a scheduled Wilson Hall cooling outage, and the outage is expected to last more than 8 hours. Services may be affected starting Friday, September 16 at 4:00 PM and are estimated to be restored by Saturday at 6:00 PM, though the outage could last longer. (All times are US Central.) At this time we expect Fermilab email, listserv, and analog telephones to remain operational.

    The Open Science Grid services expected to be affected include the OSG VOMS, Gratia, Indico, and DocDB. One of the three OASIS replicas will be out of service and temporarily replaced by another elsewhere. Details provided by FNAL can be found here:

    Updates will be provided via Twitter throughout the outage, so follow the Service Desk to stay informed.