[WIP] Merge idist into master #1045
Merged
54 commits
177fb6f  Improved parallel utils (#1023)  [vfdev-5]
91d8875  [WIP] create from context for XLA  [vfdev-5]
3cfccd4  autopep8 fix
f71043f  Tests for _sync_model for XLA  [vfdev-5]
093ddb1  autopep8 fix
7ad7fcf  More tests and updates  [vfdev-5]
d57b3c9  autopep8 fix
7fcadca  [WIP] create from context for Native Torch Dist  [vfdev-5]
5a6e052  autopep8 fix
1c362fe  Added tests for idist.* created from context for native dist settings  [vfdev-5]
12512cf  [WIP] Fix tests  [vfdev-5]
228fd89  Fixed metric related tests  [vfdev-5]
b09ea05  autopep8 fix
a23da8e  Merge branch 'master' of https://github.com/pytorch/ignite into idist  [vfdev-5]
da72b15  [WIP] idist - Docs & code updates (#1034)  [vfdev-5]
0352bc6  Merge branch 'master' into origin-idist  [vfdev-5]
16256cf  Merge branch 'master' of https://github.com/pytorch/ignite into origi…  [vfdev-5]
914bba9  Tpu metrics (#1042)  [vfdev-5]
feb79b4  Merge branch 'master' into idist  [vfdev-5]
25d38d1  Increased err tol for mse and rmse tests on single TPU  [vfdev-5]
8886948  Fixes #991 (#1047)  [vfdev-5]
add8a4d  Merge branch 'master' into idist  [vfdev-5]
bdae449  add TPU checkpointing to CPU. (#1005)  [erip]
d1cc29d  Updated tests on checkpoint and TPU  [vfdev-5]
977ac8c  Merge branch 'master' into idist  [vfdev-5]
15072ae  Added barrier op in idist (#1050)  [vfdev-5]
ac86d46  Merge branch 'master' into idist  [vfdev-5]
037e7f7  Fixed bug with torch.cuda.set_device  [vfdev-5]
2a01cc3  Fixed cuda device index, added warning if cuda device index != local …  [vfdev-5]
1f54ab5  autopep8 fix
199224a  Merge branch 'master' into idist  [vfdev-5]
888a654  Issue 1011 (#1053)  [vfdev-5]
ae1bdf5  Improved device() method (#1062)  [vfdev-5]
0fa8c61  Merge branch 'master' into idist  [sdesrozis]
537dbd0  Idist kwargs dict (#1064)  [vfdev-5]
727f038  removed badly merged _need_to_sync  [vfdev-5]
530c422  Improved device and setup_common_training_handlers (#1066)  [vfdev-5]
74ddacb  Idist improve2 (#1075)  [vfdev-5]
6735dc0  Merge branch 'master' into idist  [vfdev-5]
b1b5d56  Merge branch 'master' into idist  [vfdev-5]
1e5d7d3  Added support for str input for all gather (#1081)  [vfdev-5]
89e1358  Fix #1055 (#1068)  [sdesrozis]
1c34eda  Merge branch 'master' into idist  [vfdev-5]
d277a25  Fix failing tests on multi-gpus  [vfdev-5]
d9a80c6  Fix failing XLA tests  [vfdev-5]
f617787  Merge branch 'master' into idist  [vfdev-5]
a8f03e8  Merge branch 'master' into idist  [vfdev-5]
b41cf6d  Fixes failing tests on multi-GPUs  [vfdev-5]
222cb60  autopep8 fix
b3b9aff  Remove useless barriers (#1085)  [sdesrozis]
44f4c63  Fixes failing TPU with fork mp  [vfdev-5]
8989e5e  Merge branch 'master' into idist  [vfdev-5]
f4ee4f9  Applied review suggestions  [vfdev-5]
669ef8a  autopep8 fix
@@ -0,0 +1,27 @@
+ignite.distributed
+==================
+
+Helper module to use distributed settings for multiple backends:
+
+- backends from native torch distributed configuration: "nccl", "gloo", "mpi"
+
+- XLA on TPUs via `pytorch/xla <https://github.com/pytorch/xla>`_
+
+This module wraps common methods to fetch information about distributed configuration, initialize/finalize process
+group or spawn multiple processes.
+
+
+Examples:
+
+- Example to spawn `nprocs` processes that run `fn` with `args`: :meth:`~ignite.distributed.spawn`
+
+
+.. currentmodule:: ignite.distributed
+
+.. automodule:: ignite.distributed
+    :members:
+    :imported-members:
+
+.. attribute:: has_xla_support
+
+    True if `torch_xla` package is found
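The `has_xla_support` attribute documented in the diff above is described only as "True if `torch_xla` package is found". A minimal sketch of one way such a flag can be computed with the standard library follows. The helper name `package_available` is hypothetical, and the PR may well compute the flag differently (for example with a `try`/`except ImportError` around the import), so treat this as an illustration of the idea rather than the module's actual code.

```python
import importlib.util


def package_available(name: str) -> bool:
    """Return True if a top-level package called `name` can be located.

    Hypothetical helper for illustration; it looks the package up without
    importing it.
    """
    return importlib.util.find_spec(name) is not None


# Assumed re-creation of the documented flag, not the library's own code.
has_xla_support = package_available("torch_xla")
```

Code that depends on XLA can then branch on the flag instead of importing `torch_xla` unconditionally.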
@@ -78,6 +78,7 @@ of stable version as dependency):
    engine
    handlers
    metrics
+   distributed
    exceptions
    utils

@@ -6,13 +6,14 @@
 import torch

 import ignite
+import ignite.distributed as idist
 from ignite.contrib.handlers.base_logger import (
     BaseLogger,
     BaseOptimizerParamsHandler,
     BaseOutputHandler,
     BaseWeightsScalarHandler,
-    global_step_from_engine,
 )
+from ignite.handlers import global_step_from_engine
 from ignite.handlers.checkpoint import BaseSaveHandler

 __all__ = [
@@ -478,10 +479,7 @@ def __getattr__(self, attr):

         import neptune

-        def wrapper(*args, **kwargs):
-            return getattr(neptune, attr)(*args, **kwargs)
-
-        return wrapper
+        return getattr(neptune, attr)

Review comment: And here.

     def __init__(self, *args, **kwargs):
         try:
@@ -571,14 +569,18 @@ def score_function(engine):

     """

+    @idist.one_rank_only()
     def __init__(self, neptune_logger: NeptuneLogger):
         self._logger = neptune_logger

+    @idist.one_rank_only()
     def __call__(self, checkpoint: Mapping, filename: str, metadata: Optional[Mapping] = None) -> None:
+        # wont work on XLA

         with tempfile.NamedTemporaryFile() as tmp:
             torch.save(checkpoint, tmp.name)
             self._logger.log_artifact(tmp.name, filename)

+    @idist.one_rank_only(with_barrier=True)
     def remove(self, filename: str) -> None:
         self._logger.delete_artifacts(filename)
@@ -3,12 +3,8 @@

 import torch

-from ignite.contrib.handlers.base_logger import (
-    BaseLogger,
-    BaseOptimizerParamsHandler,
-    BaseOutputHandler,
-    global_step_from_engine,
-)
+from ignite.contrib.handlers.base_logger import BaseLogger, BaseOptimizerParamsHandler, BaseOutputHandler
+from ignite.handlers import global_step_from_engine

 __all__ = ["PolyaxonLogger", "OutputHandler", "OptimizerParamsHandler", "global_step_from_engine"]

@@ -271,10 +267,7 @@ def __init__(self, *args, **kwargs):
         self.experiment = Experiment(*args, **kwargs)

     def __getattr__(self, attr):
-        def wrapper(*args, **kwargs):
-            return getattr(self.experiment, attr)(*args, **kwargs)
-
-        return wrapper
+        return getattr(self.experiment, attr)

Review comment: And here.

     def _create_output_handler(self, *args, **kwargs):
         return OutputHandler(*args, **kwargs)
Review comment: Same comment as above; it would be nice to replace this if we can.
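Both logger diffs in this PR replace a `__getattr__` that returned a `wrapper` function with one that returns the delegated attribute directly. The sketch below contrasts the two styles on a toy object. `Experiment`, `WrapperProxy` and `DirectProxy` are hypothetical names made up for the illustration; only the two `__getattr__` bodies mirror the removed and added lines.

```python
class Experiment:
    """Hypothetical stand-in for a third-party client object."""

    name = "demo"

    def log_metric(self, key, value):
        return (key, value)


class WrapperProxy:
    """Old style: every missing attribute comes back wrapped in a function."""

    def __init__(self):
        self.experiment = Experiment()

    def __getattr__(self, attr):
        def wrapper(*args, **kwargs):
            return getattr(self.experiment, attr)(*args, **kwargs)

        return wrapper


class DirectProxy:
    """New style: hand back whatever the wrapped object exposes."""

    def __init__(self):
        self.experiment = Experiment()

    def __getattr__(self, attr):
        return getattr(self.experiment, attr)
```

Method calls behave the same through both proxies. The difference shows up for non-callable attributes and for missing ones: `WrapperProxy().name` is a function rather than the string, and `hasattr(WrapperProxy(), "missing")` is true because the lookup error is deferred until the wrapper is called, whereas `DirectProxy` returns `"demo"` and raises `AttributeError` immediately.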