Skip to content

Commit

Permalink
Experimental fast validator code path, using cwl-utils (#1720)
Browse files Browse the repository at this point in the history
On a very large workflow I was testing with, the validation time went
120 seconds to 20 seconds.

* conformance testing: use CWLTOOL_OPTIONS
* CWLTOOL_OPTIONS: ignore an empty string
* Add loadingContext.skip_resolve_all with note
* conformance tests: report CWLTOOL_OPTIONS
* CI: include extra options in coverge classname
* conformance coverage: normalize paths
* Bump cwl-utils version requirement

Co-authored-by: Michael R. Crusoe <1330696+mr-c@users.noreply.github.com>
  • Loading branch information
2 people authored and GlassOfWhiskey committed Sep 15, 2022
1 parent a3419b8 commit 154486f
Show file tree
Hide file tree
Showing 11 changed files with 234 additions and 32 deletions.
6 changes: 6 additions & 0 deletions .github/workflows/ci-tests.yml
Original file line number Diff line number Diff line change
Expand Up @@ -121,6 +121,11 @@ jobs:
matrix:
cwl-version: [v1.0, v1.1, v1.2]
container: [docker, singularity, podman]
extras: [""]
include:
- cwl-version: v1.2
container: docker
extras: "--fast-parser"

steps:
- uses: actions/checkout@v3
Expand All @@ -141,6 +146,7 @@ jobs:
version: ${{ matrix.cwl-version }}
container: ${{ matrix.container }}
spec_branch: main
CWLTOOL_OPTIONS: ${{ matrix.extras }}
run: ./conformance-test.sh

release_test:
Expand Down
13 changes: 13 additions & 0 deletions README.rst
Original file line number Diff line number Diff line change
Expand Up @@ -667,6 +667,19 @@ given in the following table; all are optional.
+----------------+------------------+----------+------------------------------+


Enabling Fast Parser (experimental)
-----------------------------------

For very large workflows, `cwltool` can spend a lot of time in
initialization, before the first step runs. There is an experimental
flag ``--fast-parser`` which can dramatically reduce the
initialization overhead, however as of this writing it has several limitations:

- Error reporting in general is worse than the standard parser, you will want to use it with workflows that you know are already correct.

- It does not check for dangling links (these will become runtime errors instead of loading errors)

- Several other cases fail, as documented in https://github.com/common-workflow-language/cwltool/pull/1720

===========
Development
Expand Down
21 changes: 11 additions & 10 deletions conformance-test.sh
Original file line number Diff line number Diff line change
Expand Up @@ -56,6 +56,7 @@ pip3 install -U setuptools wheel pip
pip3 uninstall -y cwltool
pip3 install -e .
pip3 install codecov cwltest>=2.1
root_folder=${PWD}
pushd "${repo}-${spec_branch}" || exit 1

# shellcheck disable=SC2043
Expand All @@ -71,6 +72,7 @@ cat > "${COVERAGE_RC}" <<EOF
[run]
branch = True
source_pkgs = cwltool
source = ${root_folder}
[report]
exclude_lines =
Expand All @@ -92,15 +94,15 @@ chmod a+x "${CWLTOOL_WITH_COV}"
unset exclusions
declare -a exclusions

EXTRA="--parallel"
CWLTOOL_OPTIONS+=" --parallel"
# shellcheck disable=SC2154
if [[ "$version" = *dev* ]]
then
EXTRA+=" --enable-dev"
CWLTOOL_OPTIONS+=" --enable-dev"
fi

if [[ "$container" = "singularity" ]]; then
EXTRA+=" --singularity"
CWLTOOL_OPTIONS+=" --singularity"
# This test fails because Singularity and Docker have
# different views on how to deal with this.
exclusions+=(docker_entrypoint)
Expand All @@ -113,13 +115,9 @@ if [[ "$container" = "singularity" ]]; then
exclusions+=(stdin_shorcut)
fi
elif [[ "$container" = "podman" ]]; then
EXTRA+=" --podman"
CWLTOOL_OPTIONS+=" --podman"
fi

if [ -n "$EXTRA" ]
then
EXTRA="EXTRA=${EXTRA}"
fi
if [ "$GIT_BRANCH" = "origin/main" ] && [[ "$version" = "v1.0" ]] && [[ "$container" = "docker" ]]
then
rm -Rf conformance
Expand All @@ -133,6 +131,7 @@ Conformance test of cwltool ${tool_ver} for CWL ${version}
Commit: ${GIT_COMMIT}
Python version: 3
Container: ${container}
Extra options: ${CWLTOOL_OPTIONS}
EOM
)

Expand All @@ -148,11 +147,13 @@ if (( "${#exclusions[*]}" > 0 )); then
else
EXCLUDE=""
fi
export CWLTOOL_OPTIONS
echo CWLTOOL_OPTIONS="${CWLTOOL_OPTIONS}"
# shellcheck disable=SC2086
LC_ALL=C.UTF-8 ./run_test.sh --junit-xml=result3.xml ${EXCLUDE} \
RUNNER=${CWLTOOL_WITH_COV} "-j$(nproc)" ${BADGE} \
${DRAFT} "${EXTRA}" \
"--classname=py3_${container}"
${DRAFT} \
"--classname=py3_${container}_$(echo ${CWLTOOL_OPTIONS} | tr "[:blank:]-" _)"
# LC_ALL=C is to work around junit-xml ASCII only bug

# capture return code of ./run_test.sh
Expand Down
7 changes: 7 additions & 0 deletions cwltool/argparser.py
Original file line number Diff line number Diff line change
Expand Up @@ -576,6 +576,13 @@ def arg_parser() -> argparse.ArgumentParser:
default=True,
help=argparse.SUPPRESS,
)
parser.add_argument(
"--fast-parser",
dest="fast_parser",
action="store_true",
default=False,
help=argparse.SUPPRESS,
)

reggroup = parser.add_mutually_exclusive_group()
reggroup.add_argument(
Expand Down
20 changes: 18 additions & 2 deletions cwltool/context.py
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,18 @@
import shutil
import tempfile
import threading
from typing import IO, Any, Callable, Dict, Iterable, List, Optional, TextIO, Union
from typing import (
IO,
Any,
Callable,
Dict,
Iterable,
List,
Optional,
TextIO,
Tuple,
Union,
)

# move to a regular typing import when Python 3.3-3.6 is no longer supported
from ruamel.yaml.comments import CommentedMap
Expand All @@ -23,6 +34,8 @@
from .utils import DEFAULT_TMP_PREFIX, CWLObjectType, HasReqsHints, ResolverType

if TYPE_CHECKING:
from cwl_utils.parser.cwl_v1_2 import LoadingOptions

from .process import Process
from .provenance import ResearchObject # pylint: disable=unused-import
from .provenance_profile import ProvenanceProfile
Expand Down Expand Up @@ -102,7 +115,10 @@ def __init__(self, kwargs: Optional[Dict[str, Any]] = None) -> None:
self.relax_path_checks = False # type: bool
self.singularity = False # type: bool
self.podman = False # type: bool
self.eval_timeout = 60 # type: float
self.eval_timeout: float = 60
self.codegen_idx: Dict[str, Tuple[Any, "LoadingOptions"]] = {}
self.fast_parser = False
self.skip_resolve_all = False

super().__init__(kwargs)

Expand Down
Loading

0 comments on commit 154486f

Please sign in to comment.