Discussion:
Why no git --depth=1 option for makepkg?
Adam Levy via arch-general
2018-03-03 07:52:47 UTC
Hi

I recently came across this closed feature request for a way to perform a
shallow clone with makepkg.

https://bugs.archlinux.org/task/52957

The closing comment was:
Closed by Andrew Gregory (andrewgregory)
Monday, 13 February 2017, 17:29 GMT-9
Reason for closing: Won't implement
Additional comments about closing: This has been rejected numerous times:
https://wiki.archlinux.org/index.php/Use r:Apg#makepkg:_shallow_git_clones

Which provides a now dead link.

Would anyone in the know be willing to explain why this feature is
considered outside the scope of makepkg?

Clearly if the developers aren't interested in the feature, well then they
just won't want to support it. But if there is some rationale behind this I
am curious about it. On one hand this is pure curiosity, but on the other
hand, I use makepkg and the Arch packaging system multiple times per week
so I like to understand the design and the intent.

Thank you!
Adam Levy (alaskanarcher)
mike lojkovic via arch-general
2018-03-03 08:48:45 UTC
It would be extremely nice to have shallow clone support for some packages.
The Unreal git repo requires pulling down 20 gigabytes for a build, taking
maybe a half hour each time.

Jonathon Fernyhough
2018-03-03 17:50:57 UTC
Post by mike lojkovic via arch-general
It would be extremely nice to have shallow clone support for some packages.
The Unreal git repo requires pulling down 20 gigabytes for a build, taking
maybe a half hour each time.
An effective workaround is to create a shallow clone prior to running
makepkg,

$ cd $SRCDEST
$ git clone --bare --depth=1 https://github.com/cisco/ChezScheme.git ChezScheme
$ cd ChezScheme
$ git config remote.origin.fetch "+refs/*:refs/*"

and away you go.

However.

You can't just use --depth=1 on everything without running into "weird"
problems. For example, any VCS package that relies on tags for its
pkgver will fail to find the last tagged commit, and so the fetch depth
must be increased to extend to the tagged commit.
Eli Schwartz via arch-general
2018-03-04 00:08:21 UTC
Post by Jonathon Fernyhough
Post by mike lojkovic via arch-general
It would be extremely nice to have shallow clone support for some packages.
The Unreal git repo requires pulling down 20 gigabytes for a build, taking
maybe a half hour each time.
An effective workaround is to create a shallow clone prior to running
makepkg,
$ cd $SRCDEST
$ git clone --bare --depth=1 https://github.com/cisco/ChezScheme.git ChezScheme
$ cd ChezScheme
$ git config remote.origin.fetch "+refs/*:refs/*"
and away you go.
However.
You can't just use --depth=1 on everything without running into "weird"
problems. For example, any VCS package that relies on tags for its
pkgver will fail to find the last tagged commit, and so the fetch depth
must be increased to extend to the tagged commit.
Yep -- more or less this. There is no way for git to fetch "all commits
since a given tag", and obviously `git describe` which is used in the
standard pkgver() function cannot describe the remote repository... not
to mention what happens when the repository has *no* tags, and git
rev-list --count HEAD depends on all commits since the repository was
initialized.
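
For the no-tags case, a small sketch of how a revision-count pkgver changes under a shallow clone. Assumptions: the upstream is a throwaway repository with five empty commits created just for this demonstration, and revcount_pkgver is a made-up helper name that mimics the usual r<count>.<hash> style of pkgver().

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)
cd "$work"

# Hypothetical upstream with five commits and no tags at all.
git init -q upstream
cd upstream
git config user.email "demo@example.invalid"
git config user.name "Demo"
for n in 1 2 3 4 5; do
    git commit -q --allow-empty -m "commit $n"
done
cd ..

# Made-up helper: revision-count version string for the repo in $1.
revcount_pkgver() {
    printf 'r%s.%s' "$(git -C "$1" rev-list --count HEAD)" \
                    "$(git -C "$1" rev-parse --short HEAD)"
}

git clone -q "file://$work/upstream" full
git clone -q --depth=1 "file://$work/upstream" shallow

full_ver=$(revcount_pkgver full)
shallow_ver=$(revcount_pkgver shallow)

echo "full clone:    $full_ver"
echo "shallow clone: $shallow_ver"
```

Both clones sit on the same commit, but the shallow one only sees a single revision, so the computed version is lower than the one a full clone produces.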

Then there is the fact that --depth, or even --single-branch (not that
this usually saves much space or time), will break on PKGBUILDs that use
`git cherry-pick` to backport fixes (more commonly seen in non-VCS
packages obviously).

All in all, there is simply no way to support shallow clones
generically. The best you can do is take a given PKGBUILD, predict
what it needs, and perform the clone manually according to handpicked
criteria. makepkg will detect that clone and then simply fetch new
changes, which respects a previous shallow clone designation.
--
Eli Schwartz
Bug Wrangler and Trusted User
Adam Levy via arch-general
2018-03-04 01:12:52 UTC
Post by Tinu Weber
It provides a now dead link because there is a rogue space character
Thank you Tinu Weber. Silly oversight on my part.

After reading the discussions in previous feature requests the answer is
pretty clear. It could break some packages if used incorrectly and the same
functionality can be achieved by manually cloning the repo.

Thanks all for the responses.

ProgAndy
2018-03-04 15:37:18 UTC
Post by Eli Schwartz via arch-general
Yep -- more or less this. There is no way for git to fetch "all commits
since a given tag", and obviously `git describe` which is used in the
standard pkgver() function cannot describe the remote repository... not
to mention what happens when the repository has *no* tags, and git
rev-list --count HEAD depends on all commits since the repository was
initialized.
Then there is the fact that --depth, or even --single-branch (not that
this usually saves much space or time), will break on PKGBUILDs that use
`git cherry-pick` to backport fixes (more commonly seen in non-VCS
packages obviously).
All in all, there is simply no way to support shallow clones
generically. The best you can do is take a given PKGBUILD, predict
what it needs, and perform the clone manually according to handpicked
criteria. makepkg will detect that clone and then simply fetch new
changes, which respects a previous shallow clone designation.
Maybe a working option would be to implement fragment variables for some
git options like depth, shallow-exclude and shallow-since, but that is
likely not trivial.

source=("one::git+https://repo.git#branch=master:shallow-exclude=v4.14"
"two::git+https://repo.git#branch=master:shallow-since=2017-12-30")

--
Andy
Eli Schwartz via arch-general
2018-03-04 15:51:21 UTC
Post by ProgAndy
Maybe a working option would be to implement fragment variables for some
git options like depth, shallow-exclude and shallow-since, but that is
likely not trivial.
source=("one::git+https://repo.git#branch=master:shallow-exclude=v4.14"
"two::git+https://repo.git#branch=master:shallow-since=2017-12-30")
That would require opt-in support for every package to describe which
commits it needs, something which the vast majority of maintainers are
uninterested in and requires successively more query strings for each
branch you want to cherry-pick from.

Also, shallow-exclude would exclude the tag itself; you cannot specify
"v${pkgver}~1" to shallow-exclude.

As you say, not trivial. ;) I've thought about it too...
--
Eli Schwartz
Bug Wrangler and Trusted User
Carsten Mattner via arch-general
2018-03-04 15:58:11 UTC
At least for GitHub remotes, don't they still support checking out
with SVN? If they do, this would be faster and use less space, too,
when we just need a certain revision and no history at all.

Other than that, I'm "pretty sure" that a git depth of 10 commits
will work for most repositories when you clone normally, not
shallow. Should also work for tags. However, it's true that git's
limited depth clone isn't implemented fully. There are many unhandled
cases and surprises.

All that being said, I can report that in CI of personal and company
projects, I haven't yet run into problems with depth=5. It speeds up
checking out the tree, even when it's a fast local network remote.
Eli Schwartz via arch-general
2018-03-04 16:05:25 UTC
Post by Carsten Mattner via arch-general
At least for GitHub remotes, don't they still support checking out
with SVN? If they do, this would be faster and use less space, too,
when we just need a certain revision and no history at all.
Other than that, I'm "pretty sure" that a git depth of 10 commits
will work for most repositories when you clone normally, not
shallow. Should also work for tags. However, it's true that git's
limited depth clone isn't implemented fully. There are many unhandled
cases and surprises.
All that being said, I can report that in CI of personal and company
projects, I haven't yet run into problems with depth=5. It speeds up
checking out the tree, even when it's a fast local network remote.
depth=1 is perfectly okay for most travis cases, as you don't need any
history at all unless your build system looks for it... this is a
bizarre comparison.

The point is that PKGBUILDs do look for history, and make use of it --
figuring out clever ways to avoid pulling history is completely missing
the point that we, well, want history.

depth=10 will only work for tags that are present in the last ten
commits, which unsurprisingly is exactly the opposite of most projects
(which don't have tags at all and therefore require all history without
exception in order to implement the pkgver() function) or even most
projects with tags (which don't release stable releases on basically
every other commit).
--
Eli Schwartz
Bug Wrangler and Trusted User
Uwe Koloska
2018-03-04 17:06:04 UTC
Post by Eli Schwartz via arch-general
The point is that PKGBUILDs do look for history, and make use of it --
figuring out clever ways to avoid pulling history is completely missing
the point that we, well, want history.
But the history is only needed for the default functions, isn't it? And
shallow clones are only needed for special repositories where a full
clone is not feasible.

So in this case it's a far better approach to provide your own functions
that don't need the whole git history, because this keeps all the changes
needed for this special repository inside the recipe, rather than being
something a user has to do.

On the other hand, it certainly doesn't make sense to use shallow copies
in general, because they raise the discussed problems for functions that
should be generally usable.

Uwe
Carsten Mattner via arch-general
2018-03-04 20:27:49 UTC
Post by Eli Schwartz via arch-general
Post by Carsten Mattner via arch-general
At least for GitHub remotes, don't they still support checking out
with SVN? If they do, this would be faster and use less space, too,
when we just need a certain revision and no history at all.
Other than that, I'm "pretty sure" that a git depth of 10 commits
will work for most repositories when you clone normally, not
shallow. Should also work for tags. However, it's true that git's
limited depth clone isn't implemented fully. There are many unhandled
cases and surprises.
All that being said, I can report that in CI of personal and company
projects, I haven't yet run into problems with depth=5. It speeds up
checking out the tree, even when it's a fast local network remote.
depth=1 is perfectly okay for most travis cases, as you don't need any
history at all unless your build system looks for it... this is a
bizarre comparison.
The point is that PKGBUILDs do look for history, and make use of it --
figuring out clever ways to avoid pulling history is completely missing
the point that we, well, want history.
Interesting. What does PKGBUILD do with history of more than 10 revisions?
If we checkout a tag or specific commit (e.g. xf86-video-intel), what
does PKGBUILD need prior revisions for? I'm sure you're correct, I'd
like to know what it is, if you don't mind explaining.
Post by Eli Schwartz via arch-general
depth=10 will only work for tags that are present in the last ten
commits, which unsurprisingly is exactly the opposite of most projects
(which don't have tags at all and therefore require all history without
exception in order to implement the pkgver() function) or even most
projects with tags (which don't release stable releases on basically
every other commit).
Eli, you certainly have more experience, so I'm trusting your word here.
However, I don't understand how depth=10 can fail when trying to checkout
a specific git tag. Wouldn't the tag be the HEAD in that case?


Checking out with SVN is a speedup trick, and I still think it can make
sense if depth limiting git clone is not possible. svn checkout is
basically just copying the tree of that revision (or branch/tag path)
specified.
Eli Schwartz via arch-general
2018-03-04 20:50:31 UTC
Post by Carsten Mattner via arch-general
Interesting. What does PKGBUILD do with history of more than 10 revisions?
If we checkout a tag or specific commit (e.g. xf86-video-intel), what
does PKGBUILD need prior revisions for? I'm sure you're correct, I'd
like to know what it is, if you don't mind explaining.
You cannot clone a tag or commit, you can only clone a branch and check
out the tag or commit. So you need enough revisions on that branch to
reach said tag... and you cannot use shallow-exclude as I mentioned in a
previous email.

This means that PKGBUILDs which checkout a specific revision are
actually worse than the rest, as you cannot even get the source without
knowing how many commits you need (rather than failing afterwards in
pkgver() or something).
Post by Carsten Mattner via arch-general
Post by Eli Schwartz via arch-general
depth=10 will only work for tags that are present in the last ten
commits, which unsurprisingly is exactly the opposite of most projects
(which don't have tags at all and therefore require all history without
exception in order to implement the pkgver() function) or even most
projects with tags (which don't release stable releases on basically
every other commit).
Eli, you certainly have more experience, so I'm trusting your word here.
However, I don't understand how depth=10 can fail when trying to checkout
a specific git tag. Wouldn't the tag be the HEAD in that case?
If that were true, then depth=1 would work. But tags are usually not the
upstream HEAD commit, because development continues afterwards...

So first you clone a branch, and then you try to checkout a tag (and
fail, if you used depth=10 and the tag is not attached to one of those
ten commits).
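
A sketch of that failure, using a smaller depth to keep it short. Assumptions: the upstream is a throwaway local repository, the tag v1.0 sits on its first commit with three commits of development after it, and depth=2 stands in for the depth=10 discussed above.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)
cd "$work"

# Hypothetical upstream: a release tag followed by further development.
git init -q upstream
cd upstream
git config user.email "demo@example.invalid"
git config user.name "Demo"
git commit -q --allow-empty -m "release"
git tag v1.0
for n in 1 2 3; do
    git commit -q --allow-empty -m "post-release work $n"
done
cd ..

# Clone the branch with a limited depth that does not reach the tag.
git clone -q --depth=2 "file://$work/upstream" shallow
cd shallow

# The tag was not fetched, because its commit is not part of the history.
local_tags=$(git tag -l)

# So checking it out fails before any build step even runs.
if git checkout -q v1.0 2>/dev/null; then
    checkout_result=ok
else
    checkout_result=failed
fi

echo "local tags: '${local_tags}'"
echo "checkout v1.0: $checkout_result"
```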
Post by Carsten Mattner via arch-general
Checking out with SVN is a speedup trick, and I still think it can make
sense if depth limiting git clone is not possible. svn checkout is
basically just copying the tree of that revision (or branch/tag path)
specified.
I know how SVN works. :p

I also know how svn doesn't work -- you cannot get tag information, for
example, and svn revision numbers do not necessarily cleanly translate
to git revision numbers let alone commit hashes.

Giving users a mysterious svn revision number they don't know how to
trace, is confusing UI. So I wouldn't recommend this even for projects
without tags at all.
--
Eli Schwartz
Bug Wrangler and Trusted User
Carsten Mattner via arch-general
2018-03-04 23:06:45 UTC
Post by Eli Schwartz via arch-general
Post by Carsten Mattner via arch-general
Interesting. What does PKGBUILD do with history of more than 10 revisions?
If we checkout a tag or specific commit (e.g. xf86-video-intel), what
does PKGBUILD need prior revisions for? I'm sure you're correct, I'd
like to know what it is, if you don't mind explaining.
You cannot clone a tag or commit, you can only clone a branch and check
out the tag or commit. So you need enough revisions on that branch to
reach said tag... and you cannot use shallow-exclude as I mentioned in a
previous email.
This means that PKGBUILDs which checkout a specific revision are
actually worse than the rest, as you cannot even get the source without
knowing how many commits you need (rather than failing afterwards in
pkgver() or something).
Right. I had assumed that git clone -b/--branch did also exist for
tags. Git is like Linux and very evolutionary, with many warts,
only some parts designed before implementation. This means some
features are only implemented partially. I like and use git, but
sometimes it feels like it's a car where there are five doors, but
you're only supposed to use 2.5 of them.
Post by Eli Schwartz via arch-general
Post by Carsten Mattner via arch-general
Post by Eli Schwartz via arch-general
depth=10 will only work for tags that are present in the last ten
commits, which unsurprisingly is exactly the opposite of most projects
(which don't have tags at all and therefore require all history without
exception in order to implement the pkgver() function) or even most
projects with tags (which don't release stable releases on basically
every other commit).
Eli, you certainly have more experience, so I'm trusting your word here.
However, I don't understand how depth=10 can fail when trying to checkout
a specific git tag. Wouldn't the tag be the HEAD in that case?
If that were true, then depth=1 would work. But tags are usually not the
upstream HEAD commit, because development continues afterwards...
So first you clone a branch, and then you try to checkout a tag (and
fail, if you used depth=10 and the tag is not attached to one of those
ten commits).
See above.
Post by Eli Schwartz via arch-general
Post by Carsten Mattner via arch-general
Checking out with SVN is a speedup trick, and I still think it can make
sense if depth limiting git clone is not possible. svn checkout is
basically just copying the tree of that revision (or branch/tag path)
specified.
I know how SVN works. :p
I also know how svn doesn't work -- you cannot get tag information, for
example, and svn revision numbers do not necessarily cleanly translate
to git revision numbers let alone commit hashes.
svn works differently, whereas git is all about the DAG. But let's
not discuss svn's design. The idea was that when you have the ability
to svn checkout a github project or maybe Apache svn repository,
and those have proper tags and branches, then this will be very
quick in comparison. But as you say, this is bound to be problematic
for other reasons.

I believe git devs are working on checking out tags with shallow depth,
not sure how many years it will take.
Post by Eli Schwartz via arch-general
Giving users a mysterious svn revision number they don't know how to
trace, is confusing UI. So I wouldn't recommend this even for projects
without tags at all.
Let's ignore the possibility of svn, but tracking a revision number
is the same for those projects without tags as it is for git. As
in the xf86-video-intel project.
Damjan Georgievski via arch-general
2018-03-05 00:13:28 UTC
Post by Carsten Mattner via arch-general
Post by Eli Schwartz via arch-general
This means that PKGBUILDs which checkout a specific revision are
actually worse than the rest, as you cannot even get the source without
knowing how many commits you need (rather than failing afterwards in
pkgver() or something).
Right. I had assumed that git clone -b/--branch did also exist for
tags.
https://www.kernel.org/pub/software/scm/git/docs/git-clone.html

--branch can also take tags and detaches the HEAD at that commit in
the resulting repository.
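
A quick way to see this for yourself, as a sketch only. Assumptions: the upstream is a throwaway local repository, v1.0 is a made-up tag on its second commit, and the file:// URL is there so that --depth takes effect on a local path.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)
cd "$work"

# Hypothetical upstream: the tag is on the second commit, not on the tip.
git init -q upstream
cd upstream
git config user.email "demo@example.invalid"
git config user.name "Demo"
git commit -q --allow-empty -m "first"
git commit -q --allow-empty -m "second"
git tag v1.0
git commit -q --allow-empty -m "third"
cd ..

# Pass the tag name to --branch and keep the clone one commit deep.
git clone -q --depth=1 --branch v1.0 "file://$work/upstream" at-tag 2>/dev/null
cd at-tag

described=$(git describe --tags)        # name of the tag HEAD sits on
commits=$(git rev-list --count HEAD)    # commits present in the clone
subject=$(git log -1 --format=%s)       # message of the checked-out commit

# symbolic-ref fails when HEAD is detached rather than on a branch.
if git symbolic-ref -q HEAD >/dev/null; then
    head_state=branch
else
    head_state=detached
fi

echo "$described, $commits commit(s), HEAD is $head_state at '$subject'"
```

This only covers checking out a tag. It does not help when the PKGBUILD pins a bare commit hash or needs history behind the tag.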
Eli Schwartz via arch-general
2018-03-05 00:38:08 UTC
Post by Damjan Georgievski via arch-general
Post by Carsten Mattner via arch-general
Post by Eli Schwartz via arch-general
This means that PKGBUILDs which checkout a specific revision are
actually worse than the rest, as you cannot even get the source without
knowing how many commits you need (rather than failing afterwards in
pkgver() or something).
Right. I had assumed that git clone -b/--branch did also exist for
tags.
https://www.kernel.org/pub/software/scm/git/docs/git-clone.html
--branch can also take tags and detaches the HEAD at that commit in
the resulting repository.
... huh, I stand corrected. :D

I did not realize this was possible -- I've looked at clone depth fairly
often but never noticed this... well, you live and learn!

This actually makes it pretty easy to clone what you need in a stable
PKGBUILD that checks out a tag (but not one that checks out a commit).

Although it makes it no easier to also grab commits that are
cherry-picked in prepare() or get the output of `git describe` for an
unpredictable number of commits since and including a tag, which are
also significant blockers. And these cannot be syntactically parsed from
the source=() which means they would require PKGBUILD metadata to either
indicate if it is safe to shallow clone or (manually specify) e.g. a
date or tag-1 to fetch commits since.

Probably still too much effort to implement...

This would in theory be totally feasible if makepkg had a builtin
feature to apply patches (which I think would be considered a "this is
doing too much" feature) in addition to some way to reverse the pkgver()
function to acquire the tag used in pkgver= and then specify git clone
--shallow-since=${tag}~1 but at this point it becomes understandable why
no one has any interest in implementing it. :)
--
Eli Schwartz
Bug Wrangler and Trusted User
Tinu Weber
2018-03-03 10:48:31 UTC
Post by Adam Levy via arch-general
https://wiki.archlinux.org/index.php/Use r:Apg#makepkg:_shallow_git_clones
Which provides a now dead link.
It provides a now dead link because there is a rogue space character
("Use r"). The following link works:

https://wiki.archlinux.org/index.php/User:Apg#makepkg:_shallow_git_clones