News and Announcements from OSG Operations > GOC Service Update - Tuesday, September 26th at 14:00 UTC

The GOC will upgrade the following services beginning Tuesday, September 26th at 14:00 UTC. The GOC reserves 8 hours in the unlikely event that unexpected problems are encountered.

Oasis
* Update oasis-login to CentOS 7
* Change OSG software installation to come from the GOC yum repository
* Update oasis, oasis-replica, and oasis-login to the latest OSG software releases
* Move the master signing key to a Yubikey device on oasis
* Enable incoming IPv6 on oasis, oasis-replica, and oasis-itb

MyOSG
* Modifications to the format of the OSG sites map used by display.opensciencegrid.org

Event
* Enable STOMP plugin

RSV1
* Remove data older than 3 years
* Delete unused previous versions of the virtual machine

All Services
* Operating system updates; reboots will be required. The usual HA mechanisms will be used, but some services will experience brief outages.

News and Announcements from OSG Operations > TWiki goes read-only October 3, 2017

Colleagues,
   
As has been discussed in several OSG channels, the OSG Operations team will be transitioning the TWiki to a read-only state as of October 3, 2017. All current documentation is being moved to a GitHub location. We will also help as much as possible by placing pointers in the TWiki that let people know where the new documentation sets will be.
   
All documentation on the TWiki will still be available to read, but not editable. This is the initial step of retiring the TWiki completely, which will occur at the start of the calendar year 2018. All TWiki documents will be archived in case they are needed.
   
Please let us know if you have any questions about this service change.


Pegasus news feed > Pegasus receives continued support from the National Science Foundation

The Pegasus team is pleased to announce that it has received a new grant from the National Science Foundation to support new development and maintenance of the Pegasus Workflow Management System. The award will support Pegasus for the next 5 years and help address the needs of our diverse user community.

Since 2001, the Pegasus Workflow Management System has been designed, implemented and supported to provide abstractions that enable scientists to focus on structuring their computations without worrying about the details of the target cyberinfrastructure. To support these workflow abstractions Pegasus provides automation capabilities that seamlessly map workflows onto target resources, sparing scientists the overhead of managing the data flow, job scheduling, fault recovery and adaptation of their applications. Automation enables the delivery of services that consider criteria such as time-to-solution, while also taking into account efficient use of resources, the throughput of tasks, and data transfer requests. Pegasus allows scientists to easily monitor and debug their scientific workflows, providing a suite of command line tools and a web-based workflow dashboard. These capabilities allow scientists to do production-grade science at scale using Pegasus. The power of these abstractions was demonstrated in 2015 when Pegasus was used by an international collaboration to harness a diverse set of resources and to manage the compute- and data-intensive workflows that confirmed the existence of gravitational waves, as predicted by Einstein’s theory of relativity.

Scientists are using Pegasus to model new materials, assess the effects of seismic activity, infer human demographic history, and develop a better soybean, among other applications.

Experience from working with these diverse scientific domains has helped us uncover opportunities for further automation of scientific workflows. The new effort will address these opportunities through innovation in the following areas: automation methods that include resource provisioning ahead of and during workflow execution, data-aware job scheduling algorithms, and data sharing mechanisms in high-throughput environments. Near-term capabilities to be released in the 4.8 software release include the items below (a small workflow sketch using the Python DAX API follows the list):

  • Integration with Jupyter Notebook;
  • Support for application container technologies: both Docker and Singularity.
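
For context, Pegasus workflows are declared programmatically; the sketch below uses the Python DAX3 API that the release notes mention. It is a minimal illustration only: the workflow, file, and transformation names are invented, and details may differ between Pegasus versions.

    from Pegasus.DAX3 import ADAG, File, Job, Link

    # Declare an abstract workflow (DAX) with a single job.
    dax = ADAG("hello-workflow")

    # Files are declared abstractly; Pegasus maps them onto
    # concrete storage at planning time.
    fin = File("input.txt")
    fout = File("output.txt")

    # One job of the (hypothetical) transformation "process".
    job = Job(name="process")
    job.addArguments("-i", fin, "-o", fout)
    job.uses(fin, link=Link.INPUT)
    job.uses(fout, link=Link.OUTPUT, transfer=True)
    dax.addJob(job)

    # Write the DAX out for planning with pegasus-plan.
    with open("hello.dax", "w") as f:
        dax.writeXML(f)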

To support a broader group of “long-tail” scientists, the new grant provides funding towards usability improvements as well as outreach, education, and training activities.

The proposed enhancements will be integrated into Pegasus, and distributed to the user community as part of regular Pegasus software releases. This will facilitate adoption and evaluation of these capabilities in the context of real-life applications and computing environments.  The data-aware focus targets new classes of applications executing in high-throughput and high-performance environments.

The Pegasus team very much looks forward to our continued collaboration with domain and computer scientists, and we also hope to work with new users and communities. Please contact us at pegasus@isi.edu if you would like to discuss your workflow needs and ideas.



News and Announcements from OSG Operations > Announcing OSG Software version 3.4.3

We are pleased to announce OSG Software version 3.4.3.

Changes to OSG 3.4.3 include:
- Updated to CVMFS 2.4.1
- Updated to Singularity 2.3.1
- Updated to BLAHP 1.18.33: improved handling of Slurm and PBS jobs
- Updated to XRootD 4.7.0
- Updated to StashCache 0.8
- Updated Globus packages to latest EPEL versions
- osg-ca-scripts now use HTTPS to download CA certificates
- Added the ability to limit transfer load in globus-gridftp-osg-extensions
- Fixed a few memory management bugs in xrootd-lcmaps
- HTCondor CE 3.0.2 reports an error if JOB_ROUTER_ENTRIES is not defined (see the illustrative fragment below)
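
For context on the last item, JOB_ROUTER_ENTRIES holds the CE's route definitions as a list of ClassAds. A minimal, purely illustrative fragment (the route name and batch system are invented) might look like:

    # Minimal single route; with no entries defined, HTCondor-CE 3.0.2
    # now reports an error instead of silently routing nothing.
    JOB_ROUTER_ENTRIES = \
       [ \
         name = "Local_PBS"; \
         GridResource = "batch pbs"; \
       ]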

Release notes and pointers to more documentation can be found at:

https://www.opensciencegrid.org/bin/view/Documentation/Release3/Release343

Need help? Let us know:

https://www.opensciencegrid.org/bin/view/Documentation/Release3/HelpProcedure

We welcome feedback on this release!

News and Announcements from OSG Operations > Announcing OSG Software version 3.3.28

We are pleased to announce OSG Software version 3.3.28.

Changes to OSG 3.3.28 include:
- Updated to BLAHP 1.18.33: improved handling of Slurm and PBS jobs
- Updated to XRootD 4.7.0
- Updated to StashCache 0.8
- Updated Globus packages to latest EPEL versions
- osg-ca-scripts now use HTTPS to download CA certificates
- Added the ability to limit transfer load in globus-gridftp-osg-extensions
- Fixed a few memory management bugs in xrootd-lcmaps
- Updated to xrootd-hdfs 1.9.2
- HTCondor CE 2.2.3 reports an error if JOB_ROUTER_ENTRIES is not defined
- Updated SELinux profile to allow GUMS to access the MySQL port

Release notes and pointers to more documentation can be found at:

https://www.opensciencegrid.org/bin/view/Documentation/Release3/Release3328

Need help? Let us know:

https://www.opensciencegrid.org/bin/view/Documentation/Release3/HelpProcedure

We welcome feedback on this release!

Condor Project News > HTCondor 8.7.3 released! ( September 12, 2017 )

The HTCondor team is pleased to announce the release of HTCondor 8.7.3. This development series release contains new features that are under development. This release contains all of the bug fixes from the 8.6.6 stable release. Enhancements in the release include: Further updates to the late job materialization technology preview; An improved condor_top tool; Enhanced the AUTO setting for ENABLE_IPV{4,6} to be more selective; Fixed several small memory leaks. Further details can be found in the Development Version History and the Stable Version History. HTCondor 8.7.3 binaries and source code are available from our Downloads page.
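
For reference, the ENABLE_IPV{4,6} settings mentioned above are ordinary configuration knobs; a minimal condor_config fragment (values shown are illustrative) might read:

    # Let HTCondor decide, per interface, which protocols to use;
    # set True or False to force a protocol on or off.
    ENABLE_IPV4 = AUTO
    ENABLE_IPV6 = AUTO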

Condor Project News > HTCondor 8.6.6 released! ( September 12, 2017 )

The HTCondor team is pleased to announce the release of HTCondor 8.6.6. A stable series release contains significant bug fixes. Highlights of this release are: HTCondor daemons no longer crash on reconfig if syslog is used for logging; HTCondor daemons now reliably leave a core file when killed by a signal; Negotiator won't match machines with incompatible IPv{4,6} protocol; On Ubuntu, send systemd alive messages to avoid restarting HTCondor; Fixed a problem parsing old ClassAd string escapes in the Python bindings; Properly parse CPU time for Slurm grid universe jobs; Claims are released when parallel universe jobs are removed while claiming; Starter won't get stuck when a job is removed with JOB_EXIT_HOOK defined; To reduce audit logging, added cgroup rules to SELinux profile. More details about the fixes can be found in the Version History. HTCondor 8.6.6 binaries and source code are available from our Downloads page.

Derek's Blog > Installing SciTokens on a Mac

In case I ever have to install SciTokens again, here are the steps I took to make it work on my Mac. The most difficult part of this is installing the openssl headers for the jwt Python library. I followed the advice in this blog post.

  1. Install Homebrew
  2. Install openssl:

     brew install openssl
    
  3. Download the SciTokens library:

     git clone https://github.com/scitokens/scitokens.git
     cd scitokens
    
  4. Create the virtualenv to install the jwt library

     virtualenv jwt
     . jwt/bin/activate
    
  5. Install jwt pointing to the Homebrew-installed openssl headers (a quick sanity check follows the steps):

     env LDFLAGS="-L$(brew --prefix openssl)/lib" CFLAGS="-I$(brew --prefix openssl)/include" pip install cryptography PyJWT
    
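
To confirm that the cryptography backend built correctly against the Homebrew openssl, a quick PyJWT round trip can be run inside the virtualenv (a minimal sketch; the secret and claims are made up):

     import jwt  # PyJWT, installed above

     # Encode and then decode a token; this exercises the cryptography
     # package that needed the openssl headers.
     token = jwt.encode({"sub": "test"}, "not-a-real-secret", algorithm="HS256")
     claims = jwt.decode(token, "not-a-real-secret", algorithms=["HS256"])
     assert claims["sub"] == "test"
     print("PyJWT works:", claims)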

Pegasus news feed > Pegasus 4.8.0 Released

We are pleased to announce the release of Pegasus 4.8.0.

Pegasus 4.8.0 is a major release of Pegasus and includes improvements and bug fixes over the 4.7.4 release.

The Pegasus 4.8.0 release includes support for:

  • Application Containers – Pegasus now supports containers for user applications; both Docker and Singularity are supported (a catalog sketch follows this list). More details can be found in the documentation at https://pegasus.isi.edu/docs/4.8.0/containers.php
  • Jupyter Support – Pegasus now provides a Python API to declare and manage workflows via Jupyter, which allows workflow creation, execution, and monitoring. The API also provides mechanisms to create Pegasus catalogs (sites, replica, and transformation). More details can be found in the documentation at https://pegasus.isi.edu/docs/4.8.0/jupyter.php
  • Tuning of Transfer and Cleanup jobs – Pegasus now computes the number of transfer and cleanup jobs to add to a workflow at a particular level according to the number of jobs on that level. More details can be found in the documentation at https://pegasus.isi.edu/docs/4.8.0/data_transfers.php
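
As an illustration of the container support, a container might be declared in the text-format Transformation Catalog roughly as below. This is a hypothetical sketch based on the containers documentation linked above; the image, paths, and names are invented, and the authoritative syntax is in the 4.8.0 docs.

    cont centos-app {
        type "docker"
        image "docker:///example/app:latest"
        profile env "APP_HOME" "/opt/app"
    }

    tr example::app {
        site condorpool {
            pfn "/usr/local/bin/app"
            type "INSTALLED"
            container "centos-app"
        }
    }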

The release can be downloaded from
http://pegasus.isi.edu/downloads

An exhaustive list of features, improvements, and bug fixes can be found below.

New Features and Improvements

  • [PM-1159] – Support for containers
  • [PM-1177] – API for running Pegasus workflows via Jupyter
  • [PM-1191] – If available, use GFAL over globus url copy
    • JGlobus is no longer actively supported and is not in compliance with RFC 2818 (https://docs.globus.org/security-bulletins/2015-12-strict-mode). As a result, cleanup jobs using the pegasus-gridftp client would fail against servers enforcing strict mode. We have removed the pegasus-gridftp client and now use the gfal clients; since globus-url-copy does not support removes, when gfal is not available globus-url-copy is used for cleanup by writing out zero-byte files instead of removing them.
  • [PM-1212] – new defaults for number of transfer and inplace jobs created
  • [PM-1134] – Capture the execution site information in pegasus lite
  • [PM-1109] – dashboard to display errors if a job is killed instead of exiting with non zero exitcode
  • [PM-1146] – There doesn’t seem to be a way to get a persistent URL for a workflow in dashboard
  • [PM-1155] – remote cleanup jobs should have file url’s if possible
  • [PM-1158] – Make DAX3 API compatible with Python 2.6+ and Python3+
  • [PM-1161] – Update documentation of large databases for handling mysqldump: Error 2013
  • [PM-1187] – make scheduler type case insensitive for grid gateway in site catalog
  • [PM-1165] – Update Transformation Catalog format to support containers
  • [PM-1166] – pegasus-transfer to support transfers from docker hub
  • [PM-1180] – update monitord to populate checksums
  • [PM-1183] – monitord plumbing for putting hooks for integrity checking
  • [PM-1188] – Add tool to integrity check transferred files
  • [PM-1190] – planner changes to enable integrity checking
  • [PM-1194] – update planner pegasus lite mode to support for docker container wrapper
  • [PM-1195] – update site selection to handle containers
  • [PM-1197] – handle symlinks for input files when launching job via container
  • [PM-1200] – update pegasus lite mode to support Singularity
  • [PM-1201] – Transformation Catalog API should support the container keywords
  • [PM-1202] – Move catalog APIs into Pegasus.catalogs and develop standalone test cases independent from Jupyter
  • [PM-1210] – update pegasus-transfer to support transfers from singularity hub
  • [PM-1214] – Specifying environment for sites and containers
  • [PM-1215] – Document support for containers for 4.8
  • [PM-1178] – kickstart to checksum output files
  • [PM-1220] – default app name for metrics server based on dax label
  • [PM-1186] – pegasus-db-admin should list compatibility with latest pegasus version if no changes to schema

 

Bugs Fixed

  • [PM-1032] – Handle proper error message for non-standard python usage
  • [PM-1162] – Running pegasus-monitord replay created an unreadable database
  • [PM-1171] – Monitord regularly produces empty stderr and stdout files
  • [PM-1172] – pegasus-rc-client deletes all entries for a lfn
  • [PM-1173] – cleanup jobs failing against Titan gridftp server due to RFC 2818 compliance
  • [PM-1174] – monitord should maintain the permissions on ~/.pegasus/workflow.db
  • [PM-1176] – the job notifications on failure and success should have exitcode from kickstart file
  • [PM-1181] – monitord fails to exit if database is locked
  • [PM-1185] – destination in remote file transfers for inter site jobs point’s to directory
  • [PM-1189] – Making X86_64 the default arch in the site catalog
  • [PM-1193] – “pegasus-rc-client list” modifies rc.txt
  • [PM-1196] – pegasus-statistics is not generating jobs.txt for some large workflows
  • [PM-1207] – Investigate error message: Normalizing ‘4.8.0dev’ to ‘4.8.0.dev0’
  • [PM-1208] – Improve database is locked error message
  • [PM-1209] – Analyzer gets confused about retry number in hierarchical workflows
  • [PM-1211] – DAX API should tell which lfn was a dup
  • [PM-1213] – pegasus creates duplicate source URL’s for staged executables
  • [PM-1217] – monitord exits prematurely, when in dagman recovery mode
  • [PM-1218] – monitord replay against mysql with registration jobs 



Pegasus news feed > Pegasus 4.7.5 Released

We are happy to announce the release of Pegasus 4.7.5. Pegasus 4.7.5 is a minor release, which contains minor enhancements and bug fixes. This will most likely be the last release in the 4.7 series; unless you have specific reasons to stay with the 4.7.x series, we recommend upgrading to 4.8.0.

Improvements

  • [PM-1146] – There doesn’t seem to be a way to get a persistent URL for a workflow in dashboard
  • [PM-1186] – pegasus-db-admin should list compatibility with latest pegasus version if no changes to schema
  • [PM-1187] – make scheduler type case insensitive for grid gateway in site catalog
  • [PM-1191] – If available, use GFAL over globus url copy
    • JGlobus is no longer actively supported and is not in compliance with RFC 2818 (https://docs.globus.org/security-bulletins/2015-12-strict-mode). As a result, cleanup jobs using the pegasus-gridftp client would fail against servers enforcing strict mode. We have removed the pegasus-gridftp client and now use the gfal clients; since globus-url-copy does not support removes, when gfal is not available globus-url-copy is used for cleanup by writing out zero-byte files instead of removing them.

Bugs Fixed

  • [PM-1032] – Handle proper error message for non-standard python usage
  • [PM-1171] – Monitord regularly produces empty stderr and stdout files
  • [PM-1172] – pegasus-rc-client deletes all entries for a lfn
  • [PM-1173] – cleanup jobs failing against Titan gridftp server due to RFC 2818 compliance
  • [PM-1176] – the job notifications on failure and success should have exitcode from kickstart file
  • [PM-1181] – monitord fails to exit if database is locked
  • [PM-1185] – destination in remote file transfers for inter site jobs point’s to directory
  • [PM-1193] – “pegasus-rc-client list” modifies rc.txt
  • [PM-1196] – pegasus-statistics is not generating jobs.txt for some large workflows
  • [PM-1207] – Investigate error message: Normalizing ‘4.8.0dev’ to ‘4.8.0.dev0’
  • [PM-1208] – Improve database is locked error message
  • [PM-1209] – Analyzer gets confused about retry number in hierarchical workflows
  • [PM-1211] – DAX API should tell which lfn was a dup
  • [PM-1213] – pegasus creates duplicate source URL’s for staged executables
  • [PM-1217] – monitord exits prematurely, when in dagman recovery mode



News and Announcements from OSG Operations > OSG Certificate Authority Training Sessions

Dear OSG Security Contacts,

The OSG security team is asking all the VO security contacts, Registration Agents (RAs), and Grid Admins (GAs) to attend a 20-minute security training session about how to handle OSG certificate requests. Policies for our OSG Certificate Authority (CA) and guidelines on what to check when approving a certificate request will be discussed in each session. There are 3 available sessions; all times are Central Standard Time (CST). Participants can sign up for a session in advance here: https://doodle.com/poll/bsv9v3p7tye25c4v


Topic: Security Training for OSG VO contacts
Call US: +1 408 638 0968 or +1 646 558 8656 and dial the Meeting ID for the selected time.

1) Time: Sep 14, 2017 10:00 AM Central Standard Time (US and Canada)
- https://fnal.zoom.us/j/942795305
- Meeting ID: 942 795 305

2) Time: Sep 19, 2017 2:00 PM Central Standard Time (US and Canada)
- https://fnal.zoom.us/j/537276995
- Meeting ID: 537 276 995

3) Time: Sep 20, 2017 9:00 AM Central Standard Time (US and Canada)
- https://fnal.zoom.us/j/657145301
- Meeting ID: 657 145 301

Please contact the OSG security team at security@opensciencegrid.org if you have any questions.

News and Announcements from OSG Operations > GOC Service Update - Tuesday, September 12th at 14:00 UTC

The GOC will upgrade the following services beginning Tuesday, September 12th at 14:00 UTC. The GOC reserves 8 hours in the unlikely event that unexpected problems are encountered.

Event
  * Modifications to logging levels and virtual machine memory

GRACC
  * Cleaned the database of extraneous users from the OSG User School 2017 and assigned them to the proper VO, OSG. See https://jira.opensciencegrid.org/browse/GRACC-134
  * Cleaned extraneous facilities (partial completion; see https://jira.opensciencegrid.org/browse/GRACC-130)
  * Deployed Backup Confirmation email (https://github.com/opensciencegrid/gracc-email)

Oasis
  * Update oasis-login to CentOS 7
  * Change OSG software installation to come from the GOC yum repository
  * Update oasis, oasis-replica, and oasis-login to the latest OSG software releases
  * Move the master signing key to a Yubikey device on oasis
  * Enable incoming IPv6 on oasis, oasis-replica, and oasis-itb

RSV
  * Remove data older than 3 years
  * Delete unused previous versions of the virtual machine

Ticket
  * Adding a CC field to a ticket security notification

Pegasus news feed > Pegasus 4.8.0beta3 Released

We are pleased to announce Pegasus 4.8.0beta3, a beta release of the upcoming Pegasus 4.8.0.

Pegasus 4.8.0 will be a major release of Pegasus and will include improvements and bug fixes over the 4.7.4 release.

The Pegasus 4.8.0 release includes support for:

  • Application Containers – Pegasus now supports containers for user applications. Both Docker and Singularity are supported. More details can be found in the documentation at https://pegasus.isi.edu/docs/4.8.0dev/containers.php
  • Jupyter Support –  Pegasus now provides a Python API to declare and manage workflows via Jupyter, which allows workflow creation, execution, and monitoring. The API also provides mechanisms to create Pegasus catalogs (sites, replica, and transformation). More details can be found in the documentation at https://pegasus.isi.edu/docs/4.8.0dev/jupyter.php
  • Tuning of Transfer and Cleanup jobs – Pegasus now computes the number of transfer and cleanup jobs to add to a workflow at a particular level according to the number of jobs on that level. More details can be found in the documentation at https://pegasus.isi.edu/docs/4.8.0dev/data_transfers.php

The beta release can be downloaded from

https://download.pegasus.isi.edu/pegasus/4.8.0beta3/

An exhaustive list of features, improvements, and bug fixes can be found below.

New Features and Improvements

  • [PM-1134] – Capture the execution site information in pegasus lite
  • [PM-1159] – Support for containers
  • [PM-1177] – API for running Pegasus workflows via Jupyter
  • [PM-1109] – dashboard to display errors if a job is killed instead of exiting with non zero exitcode
  • [PM-1146] – There doesn’t seem to be a way to get a persistent URL for a workflow in dashboard
  • [PM-1155] – remote cleanup jobs should have file url’s if possible
  • [PM-1158] – Make DAX3 API compatible with Python 2.6+ and Python3+
  • [PM-1161] – Update documentation of large databases for handling mysqldump: Error 2013
  • [PM-1187] – make scheduler type case insensitive for grid gateway in site catalog
  • [PM-1212] – new defaults for number of transfer and inplace jobs created
  • [PM-1165] – Update Transformation Catalog format to support containers
  • [PM-1166] – pegasus-transfer to support transfers from docker hub
  • [PM-1180] – update monitord to populate checksums
  • [PM-1183] – monitord plumbing for putting hooks for integrity checking
  • [PM-1188] – Add tool to integrity check transferred files
  • [PM-1190] – planner changes to enable integrity checking
  • [PM-1194] – update planner pegasus lite mode to support for docker container wrapper
  • [PM-1195] – update site selection to handle containers
  • [PM-1197] – handle symlinks for input files when launching job via container
  • [PM-1200] – update pegasus lite mode to support Singularity
  • [PM-1201] – Transformation Catalog API should support the container keywords
  • [PM-1202] – Move catalog APIs into Pegasus.catalogs and develop standalone test cases independent from Jupyter
  • [PM-1210] – update pegasus-transfer to support transfers from singularity hub
  • [PM-1214] – Specifying environment for sites and containers
  • [PM-1215] – Document support for containers for 4.8

 

Bugs Fixed

  • [PM-1032] – Handle proper error message for non-standard python usage
  • [PM-1131] – stage in jobs repeat portions of deep LFN’s
  • [PM-1132] – Hashed staging mapper doesn’t work correctly with sub dax generation jobs
  • [PM-1135] – pegasus.transfer.bypass.input.staging breaks symlinking on the local site
  • [PM-1136] – With bypass input staging some URLs are ending up in the wrong site
  • [PM-1141] – The commit to allow symlinks in pegasus-transfer broke PFN fall through
  • [PM-1142] – Do not set LD_LIBRARY_PATH in job env
  • [PM-1144] – pegasus lite prints the wrong hostname for non-glidein jobs
  • [PM-1147] – pegasus-transfer should check that files exist before trying to transfer them
  • [PM-1148] – kickstart should print a more helpful error message if the executable is missing
  • [PM-1151] – pegasus-monitord fails to populate stampede DB correctly when workflow is run on HTCondor 8.5.8
  • [PM-1152] – pegasus-analyzer not showing stdout and stderr of failed transfer jobs
  • [PM-1153] – Pegasus creates extraneous spaces when replacing <file name=”something” />
  • [PM-1154] – regex too narrow for GO names with dashes
  • [PM-1157] – monitord replay should work on submit directories that are moved
  • [PM-1160] – Dashboard is not recording the hostname correctly
  • [PM-1162] – Running pegasus-monitord replay created an unreadable database
  • [PM-1163] – Confusing error message in pegasus-kickstart
  • [PM-1164] – worker package in submit directory gets deleted during workflow run
  • [PM-1171] – Monitord regularly produces empty stderr and stdout files
  • [PM-1172] – pegasus-rc-client deletes all entries for a lfn
  • [PM-1173] – cleanup jobs failing against Titan gridftp server due to RFC 2818 compliance
  • [PM-1174] – monitord should maintain the permissions on ~/.pegasus/workflow.db
  • [PM-1176] – the job notifications on failure and success should have exitcode from kickstart file
  • [PM-1181] – monitord fails to exit if database is locked
  • [PM-1182] – registration jobs fail if a file based RC has variables defined
  • [PM-1185] – destination in remote file transfers for inter site jobs point’s to directory
  • [PM-1189] – Making X86_64 the default arch in the site catalog
  • [PM-1191] – If available, use GFAL over globus-url-copy
  • [PM-1192] – User supplied env setup script for lite
  • [PM-1193] – “pegasus-rc-client list” modifies rc.txt
  • [PM-1196] – pegasus-statistics is not generating jobs.txt for some large workflows
  • [PM-1207] – Investigate error message: Normalizing ‘4.8.0dev’ to ‘4.8.0.dev0’
  • [PM-1208] – Improve database is locked error message
  • [PM-1209] – Analyzer gets confused about retry number in hierarchical workflows
  • [PM-1211] – DAX API should tell which lfn was a dup
  • [PM-1213] – pegasus creates duplicate source URL’s for staged executables
  • [PM-1217] – monitord exits prematurely, when in dagman recovery mode

Technical task

  • [PM-1178] – kickstart to checksum output files

 



Erik Erlandson - Tool Monkey > Rethinking the Concept of Release Versioning

Recently I've been thinking about a few related problems with our current concepts of software release versioning, release dependencies and release building. These problems apply to software releases in all languages and build systems that I've experienced, but in the interest of keeping this post as simple as possible I'm going to limit myself to talking about the Maven ecosystem of release management and build tooling.

Consider how we annotate and refer to release builds for a Scala project: The version of Scala -- 2.10, 2.11, etc -- that was used to build the project is a qualifier for the release. For example, if I am building a project using Scala 2.11, and package P is one of my project dependencies, then the maven build tooling (or sbt, etc) looks for a version of P that was also built using Scala 2.11; the build will fail if no such incarnation of P can be located. This build constraint propagates recursively throughout the entire dependency tree for a project.

Now consider how we treat the version for the package P dependency itself: Our build tooling forces us to specify one exact release version x.y.z for P. This is superficially similar to the constraint for building with Scala 2.11, but unlike the Scala constraint, the knowledge about using P x.y.z is not propagated down the tree.
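
Concretely, in sbt the Scala constraint shows up as the %% convention, while the dep version is pinned by hand (the fragment below is illustrative; the organization and artifact names are invented):

    // build.sbt fragment: %% appends the Scala binary version, so this
    // resolves the artifact "p_2.11" when scalaVersion is 2.11.x --
    // the cross-build constraint propagates, but "1.2.3" does not.
    scalaVersion := "2.11.11"
    libraryDependencies += "org.example" %% "p" % "1.2.3"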

If the dependency for P appears only once in the dependency tree, everything is fine. However, as anybody who has ever worked with a large dependency tree for a project knows, package P might very well appear in multiple locations of the dep-tree, as a transitive dependency of different packages. Worse, these deps may be specified as different versions of P, which may be mutually incompatible.

Transitive dep incompatibilities are a particularly thorny problem to solve, but there are other annoyances related to release versioning. Often a user would like a "major" package dependency built against a particular version of that dep. For example, packages that use Apache Spark may need to work with a particular build version of Spark (2.1, 2.2, etc). If I am the package purveyor, I have no convenient way to build my package against multiple versions of Spark and then annotate those builds in Maven Central. At best I can bake the Spark version into the package name. But what if I want to specify other package dep versions? Do I create package names with increasingly long lists of (package, version) pairs hacked into the name?

Finally, there is simply the annoyance of revving my own package purely for the purpose of building it against the latest versions of my dependencies. None of my code has changed, but I am cutting a new release just to pick up current dependency releases. And then hoping that my package users will want those particular releases, and that these won't break their builds with incompatible transitive deps!

I have been toying with a release and build methodology for avoiding these headaches. What follows is full of vigorous hand-waving, but I believe something like it could be formalized in a useful way.

The key idea is that a release build is defined by a build signature which is the union of all (dep, ver) pairs. This includes:

  1. The actual release version of the package code, e.g. (mypackage, 1.2.3)
  2. The (dep, ver) for all dependencies (taken over all transitive deps, recursively)
  3. The (tool, ver) for all impactful build tooling, e.g. (scala, 2.11), (python, 3.5), etc

For example, if I maintain a package P, whose latest code release is 1.2.3, built with dependencies (A, 0.5), (B, 2.5.1) and (C, 1.7.8), and dependency B built against (Q, 6.7) and (R, 3.3), and C built against (Q, 6.7) and all compiled with (scala, 2.11), then the build signature will be:

{ (P, 1.2.3), (A, 0.5), (B, 2.5.1), (C, 1.7.8), (Q, 6.7), (R, 3.3), (scala, 2.11) }

Identifying a release build in this way makes several interesting things possible. First, it can identify a build with a transitive dependency problem. For example, if C had been built against (Q, 7.0), then the resulting build signature would have two pairs for Q: (Q, 6.7) and (Q, 7.0), which is an immediate red flag for a potential problem.
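
As a concrete sketch of that check (hand-waving in code: the package metadata below is invented, and a real implementation would read it from a registry):

    from typing import Dict, List, Set, Tuple

    # Each release (name, version) lists the (dep, ver) pairs it was
    # built against, including build tooling such as scala.
    DEPS: Dict[Tuple[str, str], List[Tuple[str, str]]] = {
        ("P", "1.2.3"): [("A", "0.5"), ("B", "2.5.1"), ("C", "1.7.8"), ("scala", "2.11")],
        ("A", "0.5"): [("scala", "2.11")],
        ("B", "2.5.1"): [("Q", "6.7"), ("R", "3.3"), ("scala", "2.11")],
        ("C", "1.7.8"): [("Q", "7.0"), ("scala", "2.11")],  # clashes with B's Q
        ("Q", "6.7"): [], ("Q", "7.0"): [], ("R", "3.3"): [], ("scala", "2.11"): [],
    }

    def build_signature(root: Tuple[str, str]) -> Set[Tuple[str, str]]:
        """Union of (dep, ver) pairs over the whole transitive dep tree."""
        sig: Set[Tuple[str, str]] = set()
        stack = [root]
        while stack:
            pair = stack.pop()
            if pair not in sig:
                sig.add(pair)
                stack.extend(DEPS.get(pair, []))
        return sig

    def conflicts(sig: Set[Tuple[str, str]]) -> Dict[str, Set[str]]:
        """Packages appearing in the signature with more than one version."""
        seen: Dict[str, Set[str]] = {}
        for name, ver in sig:
            seen.setdefault(name, set()).add(ver)
        return {n: vs for n, vs in seen.items() if len(vs) > 1}

    print(conflicts(build_signature(("P", "1.2.3"))))  # {'Q': {'6.7', '7.0'}}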

More intriguingly, it could provide a foundation for avoiding builds with incompatible dependencies. Suppose that I redefine my build logic so that I only specify dependency package names, and not specific versions. Whenever I build a project, the build system automatically searches for the most-recent version of each dependency. This already addresses some of the release headaches above. As a project builder, I can get the latest versions of packages when I build. As a package maintainer, I do not have to rev a release just to update my package deps; projects using my package will get the latest by default. Moreover, because the latest package release is always pulled, I never get multiple incompatible dependency releases in a build.

Suppose that for some reason I need a particular release of some dependency. From the example above, imagine that I must use (Q, 6.7). We can imagine augmenting the build specification to allow overriding the default behavior of pulling the most recent release. We might either specify an exact version as we do currently, or possibly specify a range of releases, as systems like brew or Ruby gemfiles allow. In the case where some constraint is placed on releases, this constraint would be propagated down the tree (or possibly up from the leaves), in essentially the same way that the constraint of Scala version already is. In the event that the total set of constraints over the whole dependency tree is not satisfiable, the build will fail.
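
Continuing the hand-waving, the satisfiability check itself is just an intersection of per-package constraints gathered across the tree (versions here are modeled as explicit sets for simplicity; ranges would work the same way):

    from functools import reduce
    from typing import Dict, Set

    # Constraints contributed by different nodes of the dep tree.
    constraints = [
        {"Q": {"6.7", "7.0"}},   # my project: either release is fine
        {"Q": {"6.7"}},          # package B: pinned to 6.7
        {"Q": {"6.7", "7.0"}},   # package C: either
    ]

    def merge(a: Dict[str, Set[str]], b: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
        out = dict(a)
        for pkg, allowed in b.items():
            out[pkg] = (out[pkg] & allowed) if pkg in out else set(allowed)
        return out

    resolved = reduce(merge, constraints, {})
    for pkg, allowed in resolved.items():
        if not allowed:
            raise RuntimeError(f"unsatisfiable constraints for {pkg}")
    print(resolved)  # {'Q': {'6.7'}} -- build with the newest satisfying release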

With a build annotation system like the one I just described, one could imagine a new role for registries like Maven Central, where different builds are automatically cached. The registry could maybe even automatically run CI testing to identify the most-recent versions of package dependencies that satisfy any given package build, or perhaps valid dependency release ranges.

To conclude, I believe that re-thinking how we describe the dependencies used to build and annotate package releases, by generalizing release version to include the release version of all transitive deps (including build tooling as deps), may enable more flexible ways to both build software releases and specify them for pulling.

Happy Computing!


Pegasus news feed > Pegasus 4.8.0beta2 Released

We are pleased to announce Pegasus 4.8.0beta2, a beta release of the upcoming Pegasus 4.8.0.

Pegasus 4.8.0 will be a major release of Pegasus and will include improvements and bug fixes over the 4.7.4 release.

The Pegasus 4.8.0 release includes support for:

  • Application Containers – Pegasus now supports containers for user applications. Both Docker and Singularity are supported. More details can be found in the documentation at https://pegasus.isi.edu/docs/4.8.0dev/containers.php
  • Jupyter Support –  Pegasus now provides a Python API to declare and manage workflows via Jupyter, which allows workflow creation, execution, and monitoring. The API also provides mechanisms to create Pegasus catalogs (sites, replica, and transformation). More details can be found in the documentation at https://pegasus.isi.edu/docs/4.8.0dev/jupyter.php

The beta release can be downloaded from

https://download.pegasus.isi.edu/pegasus/4.8.0beta2/

An exhaustive list of features, improvements, and bug fixes can be found below.

New Features and Improvements

  • [PM-1134] – Capture the execution site information in pegasus lite
  • [PM-1159] – Support for containers
  • [PM-1177] – API for running Pegasus workflows via Jupyter
  • [PM-1109] – dashboard to display errors if a job is killed instead of exiting with non zero exitcode
  • [PM-1146] – There doesn’t seem to be a way to get a persistent URL for a workflow in dashboard
  • [PM-1155] – remote cleanup jobs should have file url’s if possible
  • [PM-1158] – Make DAX3 API compatible with Python 2.6+ and Python3+
  • [PM-1161] – Update documentation of large databases for handling mysqldump: Error 2013
  • [PM-1187] – make scheduler type case insensitive for grid gateway in site catalog
  • [PM-1212] – new defaults for number of transfer and inplace jobs created
  • [PM-1165] – Update Transformation Catalog format to support containers
  • [PM-1166] – pegasus-transfer to support transfers from docker hub
  • [PM-1180] – update monitord to populate checksums
  • [PM-1183] – monitord plumbing for putting hooks for integrity checking
  • [PM-1188] – Add tool to integrity check transferred files
  • [PM-1190] – planner changes to enable integrity checking
  • [PM-1194] – update planner pegasus lite mode to support for docker container wrapper
  • [PM-1195] – update site selection to handle containers
  • [PM-1197] – handle symlinks for input files when launching job via container
  • [PM-1200] – update pegasus lite mode to support Singularity
  • [PM-1201] – Transformation Catalog API should support the container keywords
  • [PM-1202] – Move catalog APIs into Pegasus.catalogs and develop standalone test cases independent from Jupyter
  • [PM-1210] – update pegasus-transfer to support transfers from singularity hub
  • [PM-1214] – Specifying environment for sites and containers
  • [PM-1215] – Document support for containers for 4.8

 

Bugs Fixed

  • [PM-1032] – Handle proper error message for non-standard python usage
  • [PM-1131] – stage in jobs repeat portions of deep LFN’s
  • [PM-1132] – Hashed staging mapper doesn’t work correctly with sub dax generation jobs
  • [PM-1135] – pegasus.transfer.bypass.input.staging breaks symlinking on the local site
  • [PM-1136] – With bypass input staging some URLs are ending up in the wrong site
  • [PM-1141] – The commit to allow symlinks in pegasus-transfer broke PFN fall through
  • [PM-1142] – Do not set LD_LIBRARY_PATH in job env
  • [PM-1144] – pegasus lite prints the wrong hostname for non-glidein jobs
  • [PM-1147] – pegasus-transfer should check that files exist before trying to transfer them
  • [PM-1148] – kickstart should print a more helpful error message if the executable is missing
  • [PM-1151] – pegasus-monitord fails to populate stampede DB correctly when workflow is run on HTCondor 8.5.8
  • [PM-1152] – pegasus-analyzer not showing stdout and stderr of failed transfer jobs
  • [PM-1153] – Pegasus creates extraneous spaces when replacing <file name=”something” />
  • [PM-1154] – regex too narrow for GO names with dashes
  • [PM-1157] – monitord replay should work on submit directories that are moved
  • [PM-1160] – Dashboard is not recording the hostname correctly
  • [PM-1162] – Running pegasus-monitord replay created an unreadable database
  • [PM-1163] – Confusing error message in pegasus-kickstart
  • [PM-1164] – worker package in submit directory gets deleted during workflow run
  • [PM-1171] – Monitord regularly produces empty stderr and stdout files
  • [PM-1172] – pegasus-rc-client deletes all entries for a lfn
  • [PM-1173] – cleanup jobs failing against Titan gridftp server due to RFC 2818 compliance
  • [PM-1174] – monitord should maintain the permissions on ~/.pegasus/workflow.db
  • [PM-1176] – the job notifications on failure and success should have exitcode from kickstart file
  • [PM-1181] – monitord fails to exit if database is locked
  • [PM-1182] – registration jobs fail if a file based RC has variables defined
  • [PM-1185] – destination in remote file transfers for inter site jobs point’s to directory
  • [PM-1189] – Making X86_64 the default arch in the site catalog
  • [PM-1191] – If available, use GFAL over globus-url-copy
  • [PM-1192] – User supplied env setup script for lite
  • [PM-1193] – “pegasus-rc-client list” modifies rc.txt
  • [PM-1196] – pegasus-statistics is not generating jobs.txt for some large workflows
  • [PM-1207] – Investigate error message: Normalizing ‘4.8.0dev’ to ‘4.8.0.dev0’
  • [PM-1208] – Improve database is locked error message
  • [PM-1209] – Analyzer gets confused about retry number in hierarchical workflows
  • [PM-1211] – DAX API should tell which lfn was a dup
  • [PM-1213] – pegasus creates duplicate source URL’s for staged executables
  • [PM-1217] – monitord exits prematurely, when in dagman recovery mode

Technical task

  • [PM-1178] – kickstart to checksum output files

 



News and Announcements from OSG Operations > GOC Service Update - Tuesday, August 22nd at 14:00 UTC

The GOC will upgrade the following services beginning Tuesday, August 22nd at 14:00 UTC.
The GOC reserves 8 hours in the unlikely event that unexpected problems are encountered.

Glidein
- Update GOC factory to use OSG 3.4 and HTCondor 8.6.

Repo
- Update createrepo rpm to properly clean up temporary files after updates.

OIM
- Project name format update to include periods.

Ticket  
- Remove Gratia-Software as a contact option.

All Services 
- Operating system updates; reboots will be required. The usual HA mechanisms will be used, but some services will experience brief outages.

News and Announcements from OSG Operations > Announcing OSG CA Certificate Update

We are pleased to announce a data release for the OSG Software Stack.
Data releases do not contain any software changes.

This release contains updated CA Certificates based on IGTF 1.85:
- Updated URL domain information for CyGrid (CY)

Release notes and pointers to more documentation can be found at:

https://www.opensciencegrid.org/bin/view/Documentation/Release3/Release3422

Need help? Let us know:

https://www.opensciencegrid.org/bin/view/Documentation/Release3/HelpProcedure

We welcome feedback on this release!

News and Announcements from OSG Operations > GUMS incompatibility with RHEL 7.4

Last week, Red Hat released Enterprise Linux 7.4 with a new SELinux policy that negatively affects some Tomcat web applications, including GUMS. We recommend that sites running GUMS refrain from updating their GUMS hosts to CentOS, RHEL, or Scientific Linux 7.4 until we have released a fix.

If you have any questions or concerns, please contact us at goc@opensciencegrid.org.

Condor Project News > Intel and Broad Institute identify HTCondor for Genomics Research ( August 10, 2017 )

Researchers and software engineers at the Intel-Broad Center for Genomic Data Engineering build, optimize, and widely share new tools and infrastructure that will help scientists integrate and process genomic data. The project is optimizing best practices in hardware and software for genome analytics to make it possible to combine and use research data sets that reside on private, public, and hybrid clouds, and has recently identified HTCondor on its web site as an open-source framework well suited to genomics analytics.

News and Announcements from OSG Operations > Announcing OSG Software version 3.3.27

We are pleased to announce OSG Software version 3.3.27.

Changes to OSG 3.3.27 include:
- Updated SELinux profile for HTCondor, which is required on Red Hat 7.4
- HTCondor-CE: don't hold running jobs with expired proxy, other updates
- Default configuration improvements for condor-cron
- Several improvements to osg-configure
- Added blahp configuration option for PBS Pro
- Patched jGlobus so that proxies work with BeStMan
- HTCondor 8.4.12

Release notes and pointers to more documentation can be found at:

https://www.opensciencegrid.org/bin/view/Documentation/Release3/Release3327

Need help? Let us know:

https://www.opensciencegrid.org/bin/view/Documentation/Release3/HelpProcedure

We welcome feedback on this release!

News and Announcements from OSG Operations > Announcing OSG Software version 3.4.2

We are pleased to announce OSG Software version 3.4.2.

Changes to OSG 3.4.2 include:
- Updated SELinux profile for HTCondor, which is required on Red Hat 7.4
- HTCondor 8.6.5: important bug fixes, see release notes
- HTCondor-CE 3.0.1: added pilot payload auditing and other improvements
- Default configuration improvements for condor-cron
- Several improvements to osg-configure
- Added blahp configuration option for PBS Pro
- Reorganize the osg-ce packages

Release notes and pointers to more documentation can be found at:

https://www.opensciencegrid.org/bin/view/Documentation/Release3/Release342

Need help? Let us know:

https://www.opensciencegrid.org/bin/view/Documentation/Release3/HelpProcedure

We welcome feedback on this release!

News and Announcements from OSG Operations > Apply for CODATA-RDA School of Research Data Science, São Paulo, 4-15 December 2017: Deadline 22 September

Applications are invited to participate in the CODATA-RDA School of Research Data Science, which will be held at ICTP-SAIFR, São Paulo, Brazil, 4-15 December 2017.



Further information about the São Paulo edition of the CODATA-RDA School of Research Data Science is available at http://www.ictp-saifr.org/?page_id=15270



To apply, you must use the online form, which is available here: http://ictp-saifr.org/sis/datasci2017step0.php


The deadline for applications is 22 September 2017.



About the CODATA-RDA School of Research Data Science:

What can justly be called the ‘Data Revolution’ offers many opportunities coupled with significant challenges. High among the latter is the need to develop the necessary data professions and data skills. Researchers and research institutions worldwide recognise the need to promote data skills, and we see short courses, continuing professional development, and MOOCs providing training in data science and research data management.



In sum, this is because of the realisation that contemporary research – particularly when addressing the most significant, inter-disciplinary research challenges – cannot effectively be done without a range of skills relating to data. These skills include the principles and practice of Open Science and research data management and curation, the use of a range of data platforms and infrastructures, large scale analysis, statistics, visualisation and modelling techniques, software development and annotation, and so on. The ensemble of these skills we define as ‘Research Data Science’: the science of research data, that is, how to look after and use the data that is core to your research.



The CODATA-RDA School of Research Data Science has developed a short-course, summer-school style curriculum that addresses these training requirements. The course partners Software Carpentry (using the Shell command line and GitHub), Data Carpentry (using R and SQL) and the Digital Curation Centre (research data management and data management plans) and builds on materials developed by these organisations. Also included in the programme are modules on Open Science, ethics, visualisation, machine learning (recommender systems and artificial neural networks) and research computational infrastructures.



The school has been successfully piloted at ICTP in Trieste in 2016 and 2017. The vision of the CODATA-RDA Schools of Research Data Science is to develop into an international network which makes it easy for partner organisations and institutions to run the schools in a variety of locations. The annual event at the ICTP in Trieste will serve as a motor for building the network and building expertise and familiarity with the initiative’s mission and objectives. The core materials are made available for reuse, and the co-chairs and Working Group team will provide guidance to assist partners in organising the school, in identifying instructors and helpers, etc. The first school to expand this initiative will take place at ICTP-SAIFR (South American Institute of Fundamental Research), São Paulo, Brazil in December 2017.



Further information about the CODATA-RDA Schools of Research Data Science is available at http://www.codata.org/working-groups/research-data-science-summer-schools



Short Report on the First CODATA-RDA School of Research Data Science, August 2016 https://doi.org/10.5281/zenodo.835565



Programme for the First CODATA-RDA School of Research Data Science, ICTP, Trieste, August 2016 http://indico.ictp.it/event/7658/other-view?view=ictptimetable



Materials from the First CODATA-RDA School of Research Data Science, ICTP, Trieste, August 2016 https://zenodo.org/communities/codata-rda-research-data-science-summer-school/?page=1&size=20



Programme for the Second CODATA-RDA School of Research Data Science, ICTP, Trieste, July 2017 http://indico.ictp.it/event/7974/other-view?view=ictptimetable

----------------------------------
 
CODATA 2017 International Conference, ‘Global Challenges and Data-Driven Research’, Saint-Petersburg, 8-13 October: http://codata2017.gcras.ru/

What has CODATA delivered recently? See the CODATA Prospectus: Strategy and Achievement, 2015-2016 https://doi.org/10.5281/zenodo.165830

----------------------------------

Dr Simon Hodson | Executive Director CODATA | http://www.codata.org

E-Mail: simon@codata.org | Twitter: @simonhodson99 | Skype: simonhodson99
Tel (Office): +33 1 45 25 04 96 | Tel (Cell): +33 6 86 30 42 59

CODATA (Committee on Data of the International Council for Science), 5 rue Auguste Vacquerie, 75016 Paris, FRANCE

Condor Project News > Updated EL7 RPMs for HTCondor 8.4.12, 8.6.5, and 8.7.2 ( August 7, 2017 )

The HTCondor team has released updated RPMs for HTCondor versions 8.4.12, 8.6.5, and 8.7.2 running on Enterprise Linux 7. In the recent Red Hat 7.4 release, the updated SELinux targeted policy package prevented HTCondor's SELinux policy module from loading. Red Hat Enterprise Linux 7.4 systems running with SELinux enabled will require this update for HTCondor to function properly.

