This repository has been archived by the owner on Sep 11, 2020. It is now read-only.
plumbing/format/idxfile: add new Index and MemoryIndex #896
Closes #893
This PR adds the `Index` and `EntryIter` interfaces to `idxfile` and renames `Idxfile` to `MemoryIndex`, as described in the design document.

The new `MemoryIndex` now works like the JGit index reader: it loads everything as byte slices. The only change I made that diverges from the JGit implementation is a mapping between the fanout index and the slice index, which drastically reduces memory usage.

In JGit they use an array of 256 byte arrays for each of the name, offset and crc32 tables. That is overkill when you don't have a gigantic packfile that uses all the slots of the array. With the mapping, we only use as much memory as we need.
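To illustrate the fanout-to-slice mapping described above, here is a minimal sketch. It is not the code in this PR: `sparseIndex`, `entry`, `build` and `findOffset` are hypothetical names, the crc32 table is left out, and entries are fed from an in-memory slice rather than decoded from an idx file. The point is only that a bucket is allocated the first time a leading hash byte is seen, so unused fanout slots cost one `int` each.

```go
package main

import (
	"bytes"
	"sort"
)

const (
	hashSize  = 20 // SHA-1 object names
	noMapping = -1 // marks a fanout slot with no objects
)

// entry is a hypothetical (hash, packfile offset) pair used to feed the sketch.
type entry struct {
	hash   [hashSize]byte
	offset uint64
}

// sparseIndex is a simplified stand-in for the idea, not the real MemoryIndex.
type sparseIndex struct {
	// fanoutMapping[b] is the position in names/offsets of the bucket holding
	// hashes whose first byte is b, or noMapping if there is none.
	fanoutMapping [256]int
	names         [][]byte   // per used bucket: sorted hashes, concatenated
	offsets       [][]uint64 // per used bucket: offsets, same order as names
}

// build sorts the entries in place by hash and allocates a bucket only for
// the leading bytes that actually occur.
func build(entries []entry) *sparseIndex {
	sort.Slice(entries, func(i, j int) bool {
		return bytes.Compare(entries[i].hash[:], entries[j].hash[:]) < 0
	})

	idx := &sparseIndex{}
	for i := range idx.fanoutMapping {
		idx.fanoutMapping[i] = noMapping
	}

	for _, e := range entries {
		k := e.hash[0]
		if idx.fanoutMapping[k] == noMapping {
			idx.fanoutMapping[k] = len(idx.names)
			idx.names = append(idx.names, nil)
			idx.offsets = append(idx.offsets, nil)
		}
		pos := idx.fanoutMapping[k]
		idx.names[pos] = append(idx.names[pos], e.hash[:]...)
		idx.offsets[pos] = append(idx.offsets[pos], e.offset)
	}
	return idx
}

// findOffset looks up the bucket through the mapping, then binary-searches
// the sorted hashes inside that bucket.
func (idx *sparseIndex) findOffset(h [hashSize]byte) (uint64, bool) {
	pos := idx.fanoutMapping[h[0]]
	if pos == noMapping {
		return 0, false
	}
	names := idx.names[pos]
	n := len(names) / hashSize
	i := sort.Search(n, func(i int) bool {
		return bytes.Compare(names[i*hashSize:(i+1)*hashSize], h[:]) >= 0
	})
	if i < n && bytes.Equal(names[i*hashSize:(i+1)*hashSize], h[:]) {
		return idx.offsets[pos][i], true
	}
	return 0, false
}
```

With three objects whose hashes start with only two distinct bytes, `build` allocates two buckets instead of 256, and `findOffset` returns `false` immediately for any hash whose leading byte has no mapping.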
According to the benchmarks, decoding the idx file is as fast as before, and memory consumption increases slightly. Access to the entries should be way faster now because of how the lookups are done.
Benchmarks have been added for all operations on idxfiles.
Warning
This will leave the repo in a broken state (which is why it's being PR'd to a feature branch), since previously there was a `packfile.Index` that was writable and now there is not. This is a work in progress that will probably need some of the work being done by @jfontan right now to be fully working.

Everything inside the `idxfile` package is expected to be working correctly, though.

Caveats
As mentioned in #893, some way of retrieving hashes by offset may be needed for the packfile decoder. After some discussion with @jfontan we agreed that it's better to wait until all the pieces are ready to see how they play together before adding such a change. That is only needed when the source is not seekable, so another solution might be to just build a map from offset to hash for those cases.
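The fallback mentioned above, a map from offset to hash for non-seekable sources, could look roughly like the sketch below. Everything here is an assumption for illustration: `objectHash`, `objectEntry` and `buildOffsetToHash` are hypothetical names, and in practice the entries would come from iterating the index rather than from a hand-built slice.

```go
package main

// objectHash is a hypothetical 20-byte object name.
type objectHash [20]byte

// objectEntry is a hypothetical (hash, packfile offset) pair, as an index
// iterator might yield it.
type objectEntry struct {
	hash   objectHash
	offset uint64
}

// buildOffsetToHash inverts the index once so that a decoder reading a
// non-seekable stream can resolve an object's hash from its offset.
func buildOffsetToHash(entries []objectEntry) map[uint64]objectHash {
	m := make(map[uint64]objectHash, len(entries))
	for _, e := range entries {
		m[e.offset] = e.hash
	}
	return m
}
```

The map costs one extra pass over the entries plus memory proportional to the number of objects, which is why it would only make sense to build it when the source is not seekable.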