Skip to content
This repository has been archived by the owner on Sep 11, 2020. It is now read-only.

config: support a configurable, and turn-off-able, pack.window #587

Merged
merged 1 commit into from
Sep 11, 2017

Conversation

strib
Copy link
Contributor

@strib strib commented Sep 9, 2017

One use of go-git is to transfer git data from a non-standard git repo (not stored in a file system, for example) to a "remote" backed by a standard, local .git repo.

In this scenario, delta compression is not needed to reduce transfer time over the "network", because there is no network. The underlying storage layer has already taken care of the data tranfer, and sending the objects to local .git storage doesn't require compression. So this PR gives the user the option to turn off compression when it isn't needed.

Of course, this results in a larger, uncompressed local .git repo, but the user can then run git gc or git repack on that repo if they care about the storage costs.

Turning the pack window to 0 reduces total push time of a 36K repo by 50 seconds (out of a pre-PR total of 3m26s).

@orirawlings
Copy link
Contributor

Is this a feature in the standard git tooling?

@strib
Copy link
Contributor Author

strib commented Sep 9, 2017

Is this a feature in the standard git tooling?

Yes: https://stackoverflow.com/questions/7102053/git-pull-without-remotely-compressing-objects

@mcuadros
Copy link
Contributor

Thanks for your contribution, make a lot of sense for people using memory based storage or others.

I cannot see a config flag exactly for this feature, the core.compression flag in stackoverflow, has a different behavior.

As you can see in the document this kind of configuration it's done at config level. So instead of having it as a flag of the PullOptions, should be implemented as property in the config file.

If this property doesn't exists on the standard git, I am open to merge this PR with a flag call gogit.deltaCompression or something similar.

@orirawlings
Copy link
Contributor

It looks like this is controlled via a gitattributes option on specific file glob patterns. Ideally we would control this behavior via the same gitattributes, but I don't believe that currently the go-git library supports gitattributes. The are a bunch of other behaviors that can be controlled via gitattributes as well.

@mcuadros
Copy link
Contributor

This is handle by the config package current we only support a few config settings, but we already have the Core property.

go-git/config/config.go

Lines 35 to 42 in bb3217c

type Config struct {
Core struct {
// IsBare if true this repository is assumed to be bare and has no
// working directory associated with it.
IsBare bool
// Worktree is the path to the root of the working tree.
Worktree string
}

@strib
Copy link
Contributor Author

strib commented Sep 11, 2017

Thanks for taking a look guys. I don't think this is really equivalent to a core config param. It's more equivalent to setting --window=0 when calling git-pack-objects: https://git-scm.com/docs/git-pack-objects#git-pack-objects---windowltngt . That can also be controlled in the config file with pack.window: https://git-scm.com/docs/git-config#git-config-packwindow

I'm not sure why git doesn't make it a per-remote option, that seems more reasonable to me. But if you guys want to keep it closer to the real git, I can turn this into support for pack.window, where window=0 means delta compression is turned off. What do you think?

@mcuadros
Copy link
Contributor

Yes, I think this is the best solution, since implementing this as pack.window is giving support a feature from git. Also will be nice change the the packfile encoder to accept the argument of the window.

@strib strib force-pushed the strib/skip-compression-gh-master branch from 032196f to db279de Compare September 11, 2017 04:57
@codecov
Copy link

codecov bot commented Sep 11, 2017

Codecov Report

Merging #587 into master will decrease coverage by 0.6%.
The diff coverage is 74.41%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #587      +/-   ##
==========================================
- Coverage   78.32%   77.72%   -0.61%     
==========================================
  Files         130      130              
  Lines       10092    10122      +30     
==========================================
- Hits         7905     7867      -38     
- Misses       1342     1418      +76     
+ Partials      845      837       -8
Impacted Files Coverage Δ
plumbing/transport/server/server.go 65.74% <100%> (ø) ⬆️
plumbing/format/packfile/encoder.go 80.85% <100%> (ø) ⬆️
remote.go 74.28% <40%> (-0.3%) ⬇️
config/config.go 84.05% <61.11%> (-3.45%) ⬇️
plumbing/format/packfile/delta_selector.go 80.12% <93.75%> (+1.17%) ⬆️
plumbing/transport/ssh/common.go 20.54% <0%> (-45.21%) ⬇️
plumbing/transport/ssh/auth_method.go 31.57% <0%> (-22.81%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update bb3217c...52c1f98. Read the comment docs.

@strib strib changed the title remote: option to skip delta compression while encoding packfiles config: support a configurable, and turn-off-able, pack.window Sep 11, 2017
@strib
Copy link
Contributor Author

strib commented Sep 11, 2017

Thanks @mcuadros, done. Please take another look.

Copy link
Contributor

@mcuadros mcuadros left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent work! Just one small comment and ready to merge.

otp, err := dw.objectsToPack(hashes)
func (dw *deltaSelector) ObjectsToPack(
hashes []plumbing.Hash,
packWindow uint,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please update the comment about this new argument.

objects, err := e.selector.ObjectsToPack(hashes)
func (e *Encoder) Encode(
hashes []plumbing.Hash,
packWindow uint,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here.

@erizocosmico
Copy link
Contributor

lgtm once the requests by @mcuadros are added

@bzz
Copy link
Contributor

bzz commented Sep 11, 2017

Looks great.

Probably a nitpick, but could you also please double-check if 0d74736 is intended to be part of these changes?

One use of go-git is to transfer git data from a non-standard git repo
(not stored in a file system, for example) to a "remote" backed by a
standard, local .git repo.

In this scenario, delta compression is not needed to reduce transfer
time over the "network", because there is no network. The underlying
storage layer has already taken care of the data tranfer, and sending
the objects to local .git storage doesn't require compression. So this
PR gives the user the option to turn off compression when it isn't
needed.

Of course, this results in a larger, uncompressed local .git repo, but
the user can then run git gc or git repack on that repo if they care
about the storage costs.

Turning the pack window to 0 on reduces total push time of a 36K repo
by 50 seconds (out of a pre-PR total of 3m26s).
@strib strib force-pushed the strib/skip-compression-gh-master branch from a1681c8 to 52c1f98 Compare September 11, 2017 15:57
@strib
Copy link
Contributor Author

strib commented Sep 11, 2017

Added comments and rebased it correctly. Thanks all!

@mcuadros mcuadros merged commit 032ec28 into src-d:master Sep 11, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants